Choosing Videos to Fit a Time Slot

For this year’s Hack Week, our team built a TV channel on IGN. IGN has a ton of video content, but you had to find it yourself. There wasn’t a place where you could just tune in and have interesting videos play automatically, so we built it. Scratch your own itch, right?

One of the core pieces of the project was scheduling videos that fit a particular theme into a fixed time slot. For example, we wanted to show 15 minutes of news videos starting at 6pm. Unlike traditional TV where the content is made to fit into pre-determined time slots, we were picking from videos that varied in length. We knew segments would not end exactly on time, but we wanted the overflow to be as small as possible so the next segment would start as close to on-time as possible.

There was also a matter of prioritization. Some videos, like our daily news show, should always make it into our news segment. Fresher content should be prioritized over older content. And in the future, we might want to take the popularity of videos into consideration as well.

Essentially, we had to solve a variation of the knapsack problem. Unfortunately,

The decision problem form of the knapsack problem (“Can a value of at least V be achieved without exceeding the weight W?”) is NP-complete, thus it is expected that no algorithm can be both correct and fast (polynomial-time) on all cases.

Fortunately, for our purposes, we didn’t have to find the correct solution, just a solution that was good enough.
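To make “good enough” concrete, here is a minimal sketch of one possible greedy heuristic. It is not necessarily what we shipped; the Video class, the priority score, and fill_slot are hypothetical names invented for illustration. It takes videos in priority order while they fit, then closes the remaining gap with the one leftover video that lands nearest the slot boundary.

```python
from dataclasses import dataclass


@dataclass
class Video:
    title: str
    seconds: int
    priority: int  # hypothetical score: higher means "schedule this first"


def fill_slot(videos, slot_seconds):
    """Greedy heuristic, not an optimal knapsack solver.

    Returns (chosen videos, overrun in seconds). A negative overrun
    means the segment ends before the slot does.
    """
    chosen, leftovers, total = [], [], 0
    # Pass 1: take videos in priority order as long as they still fit.
    for video in sorted(videos, key=lambda v: v.priority, reverse=True):
        if total + video.seconds <= slot_seconds:
            chosen.append(video)
            total += video.seconds
        else:
            leftovers.append(video)
    # Pass 2: if a gap remains, add the leftover that ends closest to the
    # slot boundary, but only when that beats leaving the gap open.
    gap = slot_seconds - total
    if leftovers and gap > 0:
        best = min(leftovers, key=lambda v: abs(v.seconds - gap))
        if abs(best.seconds - gap) < gap:
            chosen.append(best)
            total += best.seconds
    return chosen, total - slot_seconds


# Made-up sample data for a 15-minute (900 second) news slot.
news = [
    Video("Daily news", 300, 10),
    Video("Console announcement", 420, 5),
    Video("Studio interview", 240, 4),
    Video("Trailer roundup", 200, 3),
    Video("Quick update", 60, 1),
]
segment, overrun = fill_slot(news, 900)
```

With this sample data, the sketch picks “Daily news”, “Console announcement”, “Quick update”, and “Trailer roundup”, and the segment runs 80 seconds past the slot. Freshness or popularity could be folded into the priority score without changing the selection logic.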


githublink.vim Plugin

I use Vim as my primary editor. I also use GitHub for a lot of my projects. On more than one occasion, I have found myself wanting to share the line under my Vim cursor with a collaborator in the form of a github.com link. The manual process went something like this:

  • Go to github.com in a browser
  • Navigate to the correct repository and branch
  • Drill down to the file I was working on
  • Find the line in the file and click on the line number
  • Copy the URL and paste it in the email or instant message

Wouldn’t it be nice if you could just hit a hotkey and have Vim figure out the URL for you? I thought so too, so I started hacking on a Vim plugin to do exactly that.

To use the plugin, install it, then start editing a file that’s part of a GitHub-hosted repository. Press \ g (backslash, followed by g) to display the URL of the current line. You still have to copy and paste the URL, but everything else is done for you. Update: the plugin now copies the URL to the clipboard for you if pbcopy is available.
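The URL the plugin needs follows GitHub’s blob link format with a line anchor. The sketch below shows that construction in Python rather than Vim script. It is not the plugin’s actual code: github_line_url and the alice/project repository are hypothetical, and it assumes you already know the remote URL, the branch, and the file’s path relative to the repository root.

```python
import re


def github_line_url(remote_url, branch, path, line):
    """Build a github.com link to one line of a file.

    Accepts SSH (git@github.com:owner/repo.git) and HTTPS
    (https://github.com/owner/repo.git) style remote URLs.
    """
    match = re.search(
        r"github\.com[:/](?P<owner>[^/]+)/(?P<repo>[^/]+?)(?:\.git)?$",
        remote_url,
    )
    if match is None:
        raise ValueError(f"not a GitHub remote: {remote_url}")
    return (
        f"https://github.com/{match['owner']}/{match['repo']}"
        f"/blob/{branch}/{path}#L{line}"
    )


# Hypothetical repository and file, for illustration only.
url = github_line_url("git@github.com:alice/project.git", "master", "lib/app.rb", 42)
```

Here url ends up as https://github.com/alice/project/blob/master/lib/app.rb#L42, which is the same link you get by clicking a line number in the browser.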

Some of the constructs for the plugin are borrowed from rubytest.vim, so thanks to janx for the rubytest plugin. This is my first foray into Vim script, so if there are better ways of doing things, please share!

Pagination and Sort Order

For the last few months, I’ve been hacking on a Scala REST service at work. While the subject domain and the schema are reasonably complex, the service itself is fairly simple. For the most part, it’s a thin wrapper around a Mongo database. It supports basic CRUD operations and speaks JSON over HTTP. There are some additional features like data validation and support for advanced queries, but those features aren’t important for the purposes of this post.

Most consumers of this service fetch a few records at a time to generate a web page or mobile view. To save bandwidth and processing time, clients can voluntarily limit the maximum number of records to be returned via a count parameter. If a page only shows 10 results at a time, there’s no reason to fetch more than 10 records from the service.

There are a few clients (mostly batch jobs) that need to fetch a large number of records at a time. To save memory and to help prevent abuse, the service limits the maximum number of records a client is allowed to retrieve in one call. But pagination is supported via a startIndex parameter. For example, a client might fetch the first 100 entries of a result set by specifying startIndex=0 with count=100, then the next 100 entries by specifying startIndex=100 with count=100, then the next 100 entries by specifying startIndex=200 with count=100, etc.
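From the client’s side, that paging loop looks roughly like the sketch below. It is written in Python for brevity and is not the real client code. fetch_all and fake_fetch_page are hypothetical, with the fake standing in for an HTTP call to the service. Only the startIndex and count parameters come from the service itself (spelled start_index and count here).

```python
def fetch_all(fetch_page, page_size=100):
    """Collect every record by advancing startIndex one page at a time."""
    records = []
    start_index = 0
    while True:
        page = fetch_page(start_index=start_index, count=page_size)
        records.extend(page)
        # A short (or empty) page means the result set is exhausted.
        if len(page) < page_size:
            break
        start_index += page_size
    return records


# Stand-in for the service: 250 fake records held in a list.
DATA = list(range(250))


def fake_fetch_page(start_index, count):
    return DATA[start_index:start_index + count]


all_records = fetch_all(fake_fetch_page)
```

Against the fake service this takes three calls (startIndex 0, 100, and 200) and returns all 250 records. Note that the loop assumes every call sees the result set in the same order.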

All of this worked fine in development and staging. But in production, some batch jobs would fail intermittently. These jobs used pagination to fetch all of the records of a certain type, but sometimes only a subset of the records was returned. The only difference between staging and production was the number of Tomcat instances: in production we had several nodes behind a load balancer, but in staging we only had one. I was puzzled.


Restarting a Git Branch

I tend to think about my code a lot, even when I’m not in front of a computer. Sometimes I’ll think of a refactor or performance improvement on the train, in the shower, or as I’m lying in bed right before I get up in the morning. When I have one of these ideas, I tend to just start banging out code the moment I sit down. I often forget to pull or start a new branch first. Fortunately, git makes it relatively simple to move your commits around as long as you haven’t pushed your changes yet. (Once you push your commits, you shouldn’t alter them because someone else may have pulled them.)

The easiest case is if you have changes that haven’t been committed yet. Just create your branch before committing your changes.

git checkout -b great-train-idea
# now on branch great-train-idea
git commit -a -m "Great idea I thought of on the train"

The next-easiest case is if you have made several commits in master that you haven’t pushed yet, but you want to move them to a new branch. Since branches are simply pointers to commits in git, you can just create a new branch that points at your new changes, then reset master to the commit before you made those changes.

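# assuming current branch is master and working directory is clean
# create a branch that points at your new commits (you stay on master)
git branch genius-shower-idea
# move master back to where origin/master was at your last fetch
git reset --hard origin/master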

The example above works with branches other than master too. If you accidentally made commits on great-train-idea that you wanted to move to a new genius-shower-idea branch, just replace master with great-train-idea in the example. Also, if you’ve never pushed great-train-idea (the branch you mistakenly committed to), there is no origin/great-train-idea reference to rely on, so use git log to find the commit to reset to instead.

The trickiest case is if you have made several commits in master, but master wasn’t up to date. The solution here is to move the commits to a new branch as before, then update master, then replay the changes in your new branch using the new updated master as a starting point.

# assuming current branch is master and working directory is clean
git branch awesome-morning-idea
git reset --hard origin/master

# update master
git pull

# make it as if awesome-morning-idea was branched from up-to-date master
git checkout awesome-morning-idea
git rebase master

Again, let me emphasize that you shouldn’t do any of this if you’ve already pushed your changes. Hopefully, if you were so wrapped up in your idea that you forgot to pull or committed to the incorrect branch, you were also too distracted to push. 😉