Archived Posts from this Category: technical

How we hash our Javascript for better caching and less breakage on updates

Posted on 01 Sep 2009 | Tagged as: technical

One of the problems we used to see frequently on Green Felt happened when we’d update a Javascript API: We’d add some parameters to a library function and then update other files so that they called the function with the new parameters. But when we pushed the changes to the site we’d end up with a few users who somehow had the old version of one of the files stuck in their cache. Their browser would then have old code calling new code (or new code calling old code) and the site wouldn’t work for them. We’d then have to explain how to reset their cache (and of course every browser has different instructions) and hope that if they didn’t write back, everything went OK.

This fragility annoyed us and so we came up with a solution:

  • We replaced all of our <script> tags with calls to a custom “script” function in our HTML template system (we use Template::Toolkit; [% script("sht.js") %] is what the new calls look like).
  • The “script” function is a native perl function that does a number of things (a rough sketch follows this list):
    1. Reads the Javascript file into memory. While reading, it understands C-style “#include” directives so we can structure the code nicely (though we don’t actually take advantage of that yet)
    2. Uses JavaScript::Minifier::XS to minify the resulting code.
    3. Calculates the SHA hash of the minified code.
    4. Saves the minified code to a cache directory where it is named based on its hash value, which makes the name globally unique (it also keeps its original name as a prefix so debugging is sane).
    5. Keeps track of the original script name, the minified script’s globally unique name, and the dependencies used to build the minified file. This is stored in a hash table and also saved to disk for future runs.
    6. Returns a script tag referring to the globally unique Javascript file back to the template, which ends up going out in the html file. For example, <script src="js/sht-bfe39ec2e457bd091cb6b680873c4a90.js" type="text/javascript"></script>
  • There’s actually a step 0 in there too. If the original Javascript file name is found in the hash table, it quickly stats the saved dependencies to see if any of them is newer than the saved minified file. If the minified file is up to date, steps 1 through 5 are skipped.
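
To make that concrete, here’s a rough sketch of the “script” function. This is not our exact code: the %cache table, $CACHE_DIR, and the helper modules are illustrative, the #include handling and on-disk persistence are omitted, and I’ve used an MD5 digest in the sketch only because it produces 32-character names like the example above.

use Digest::MD5 qw(md5_hex);
use JavaScript::Minifier::XS qw(minify);
use File::Slurp qw(read_file write_file);
use List::Util qw(max);

my $CACHE_DIR = "js";   # served statically by the web server
my %cache;              # original name => { unique => ..., deps => [...] }

sub max_timestamp { max map { (stat $_)[9] || 0 } @_ }   # mtime is field 9

sub script {
    my ($name) = @_;
    my $entry = $cache{$name};

    # Step 0: skip the rebuild if the cached minified file is still fresh.
    unless ($entry
            && -e "$CACHE_DIR/$entry->{unique}"
            && max_timestamp(@{ $entry->{deps} }) <= max_timestamp("$CACHE_DIR/$entry->{unique}")) {
        my $source   = read_file($name);      # step 1 (no #include handling here)
        my $minified = minify($source);       # step 2
        my $hash     = md5_hex($minified);    # step 3
        (my $base = $name) =~ s/\.js$//;
        my $unique = "$base-$hash.js";        # step 4: content-based, globally unique name
        write_file("$CACHE_DIR/$unique", $minified);
        $entry = $cache{$name} = { unique => $unique, deps => [$name] };   # step 5
    }

    # Step 6: hand the script tag back to the template.
    return qq{<script src="$CACHE_DIR/$entry->{unique}" type="text/javascript"></script>};
}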

The advantages of this approach

It solves the original problem.

When the user refreshes the page they will either get the page from their browser cache or they will get it from our site. No matter where it came from, the Javascript files it references are now uniquely named, so it is impossible for the files to be out of date with each other.

That is, if you get the old html file you will reference all the old named Javascript files and everything will be mutually consistent (even though it is out of date). If you get the new html file it guarantees you will have to fetch the latest Javascript files because the new html only references the new hashed names that aren’t going to be in your browser cache.

It’s fast.

Everything is cached, so the minification and hash calculations happen only once per file. We’re running FastCGI, so the in-memory cache persists across http requests. More importantly, the js/ dir is statically served by the web server, so it’s exactly as fast as it was before we did this (when we served the .js files without any preprocessing). All this technique adds is a couple of filesystem stats per page load, which isn’t much.

It’s automatic.

There’s no script to remember to run when we update the site. We just push our changes up to the site using our version control and the script lazily takes care of rebuilding any files that may have gone out of date.

So you might be thinking: isn’t all that dependency stuff hard and error-prone? Well, it’s really only one line of perl code:

use List::Util qw(max);
sub max_timestamp(@) { max map { (stat $_)[9] || 0 } @_ } # Obviously 9 is mtime
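
For example, the step-0 staleness check above reduces to a single comparison (a sketch; the variable names are made up):

# Rebuild if any dependency is newer than the cached minified file.
my $needs_rebuild = max_timestamp(@dependencies) > max_timestamp($minified_file);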

It’s stateless.

It doesn’t rely on incrementing numbers (“js/v10/script.js” or even “js/script-v10.js”). We considered this approach but decided it was actually harder to implement and had no advantages over the way we chose to do it. This may have been colored by our chosen version control system (darcs) where monotonically increasing version numbers have no meaning.

It allows aggressive caching.

Since the files are named by their contents’ hash, you can set the cache expiration time to be practically infinite.
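
For example, assuming an Apache server with mod_headers (an illustration, not necessarily our actual config), the hashed files can be served with a far-future cache lifetime:

# Match the content-hashed names, e.g. sht-bfe39ec2e457bd091cb6b680873c4a90.js
<FilesMatch "-[0-9a-f]{32}\.js$">
    Header set Cache-Control "public, max-age=31536000"
</FilesMatch>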

It’s very simple to understand.

It took less than a page of perl code to implement the whole thing and it worked the first time with no bugs. I believe it’s taken me longer to write this blog post than it took to write the code (granted I’d been thinking about it for a long time before I started coding).

No files are deleted.

The old js files are not automatically deleted (why bother, they are tiny) so people with extremely old html files will not have inconsistent pages when they reload. However:

The js/ dir is volatile.

It’s written so we can rm js/* at any point and it will just recreate what it needs to on the next request. This means there’s nothing to do when you unpack the source into a new directory while developing.

You get a bit of history.

Do a quick ls -lrt of the directory and you can see which scripts have been updated recently and in what order they got built.

What it doesn’t solve

While it does solve the problem of Javascript-to-Javascript API interaction, it does not help with Javascript-to-server API interaction–it doesn’t even attempt to solve that issue. The only way I know to solve that is to carefully craft the new APIs in parallel with the old ones so that there is a period of time where both old and new work while the browser caches slowly catch up with your new world.
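
For what it’s worth, the parallel-API trick looks something like this on the server side (a hypothetical sketch; the parameter names are made up):

use CGI;

my $q = CGI->new;
my $score = $q->param('score');

# Newer Javascript sends an 'elapsed' parameter; old cached Javascript doesn't,
# so fall back to a default until the browser caches catch up.
my $elapsed = defined $q->param('elapsed') ? $q->param('elapsed') : 0;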

And… It seems to work

I’ve seen similar schemes discussed, but I’ve not seen exactly what we ended up with. It’s been working well for us–I don’t think I’ve seen a single user-reported bug in the last couple of months caused by inconsistent browser caching of Javascript files.

We suck. Bear with us!

Posted on 08 Jul 2007 | Tagged as: technical

You’ve probably noticed over the past two or three months that certain parts of the site aren’t working well. Specifically, the high scores and leader board pages are often slow and sometimes completely unresponsive. It’s happening for two reasons. The first reason is that we don’t really know squat about database design. Actually, I should say “I know nothing about database design” instead of “we” and leave Jim out of this. There are a number of, uh, questionable design decisions that were made when I first set up the database, and we think they might be causing the problems.

Or maybe not. See, we run our site at a hosting company (originally it ran off my cable modem at home, but that was too unreliable once lots of people started playing). The problem is that our hosting company runs the database on a server we don’t have access to. So when things are locking up there’s no way for us to go in and figure out what exactly is going wrong. We’ve emailed them and tried to solicit help, but they aren’t very responsive. The most they’ll do is reset the server–they don’t try to help us figure out why the problems are happening in the first place.

So what is the solution? Well, we’re attacking both problems. First we are going to move to a new database that will run on a server that we have access to. Second, we are going to start making some changes to our database that will make it smaller and more streamlined. This should also help make things faster.

So hang in there! Bear with us. We’re working on it and we’ll hopefully have something soonish. Hopefully you won’t notice the transition except that the non-responsiveness will go away (forever, ideally). We’ll probably mention something here–I’m even considering a beta test, but no guarantees.

A bevy of new things

Posted on 21 Aug 2006 | Tagged as: news, technical

We’ve been busy lately. Or, more accurately, we’ve not been busy lately–Green Felt is our “spare time” project. However you look at it, we’ve made a bunch of under-the-hood and user-visible changes lately. I’ll explain them one by one:

  1. We added a new graphic to represent an empty card pile and got rid of the cheesy old-school html table borders. I have to say I’m really happy with the way this turned out. I originally wanted the empty pile to look like a depression in the green felt background, but I was unable to draw it in Illustrator. I’m not much of an artist. I still have hope that my brother Michael will come through with a nice picture. In the meantime I think the dotted outlines look good enough.
  2. We added a little box in the top right that tells you when there are new posts on this blog or in the forum. I don’t like that it takes a long time to load, plus it doesn’t wrap nicely when the window is too small (especially in Internet Explorer for some reason), but I like the functionality. I wish it were quick and built into the static HTML instead of being AJAX.
  3. We added redo support! If you undo too far or decide it was a bad idea to undo, you can redo back to where you were. Control-Shift-Z (or Command-Shift-Z for you Mac heads) is the keyboard shortcut for redo. Note that if you undo and then start moving cards around, your redo history is lost.
  4. We moved the Forum to our hosting provider and updated it to the latest version. This is more of a behind-the-scenes change, but you should notice that the forum is more responsive now that it’s hosted by a provider with a lot of bandwidth and not from my personal cable modem. 🙂 We also added some spam protection which, for the moment, seems to be keeping the spam bots at bay (which may not have been annoying you, but it was making us delete a bunch of stupid spam messages every day).
  5. We also created a new forum just for resolved bugs and moved all the bugs that have been fixed into that forum. That way the bugs forum only lists outstanding/unresolved issues.
  6. It’s not quite done yet, but we’ve done a lot of work on a new game. Sudoku! There are a couple of bugs we have to fix and loose ends to tie up, but you can try it out now if you can guess the address. 🙂

-David

Leader Board Changes

Posted on 16 Aug 2006 | Tagged as: news, technical

We made a change to the way scores are ranked on the leader board, and today I pushed it to the main site. The change is this: if you play a game multiple times and get the same score, only the first game will be shown.

Why did we make this change? We wanted to combat the technique of playing a game multiple times, getting the same score each time but incrementally decreasing the elapsed time. This seems like gaming the system to us. Not only that, but it pushes everyone else off the high score list, which can be annoying. So now only your first game with a particular score gets counted.
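
Mechanically, the rule just keeps the earliest game per player and score. A minimal sketch in perl (the field names are hypothetical):

# Keep only the first game for each (player, score) pair,
# assuming @games is already sorted oldest-first.
my %seen;
my @shown = grep { !$seen{"$_->{player}|$_->{score}"}++ } @games;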

We’ve been calling this the “Kathy Problem” amongst ourselves as Kathy in particular seems to engage in this activity. Don’t worry Kathy, we still love you. 🙂

Think this is a good change? Hate it? Feel slighted because we singled you out and named a problem after you when you’re really not the only person to engage in this activity? Leave a comment and let us know how you feel.

Added a news box

Posted on 13 Aug 2006 | Tagged as: technical

The following post was originally written in August of 2006. For some reason I never posted it to the blog, though reading it now it seems interesting. It’s fairly technical, unlike some of the other posts. Hope you enjoy it:


I wasn’t sure how to best link in our new blog. After thinking about it a while I decided that a little box in the corner of the site could list the last 2 or 3 blog posts. I then figured I should be able to list the last forum posts too. Once I had the idea, I had to figure out how I was going to get the info from two diverse pieces of code into our green felt page template.

Then I noticed that the blog we set up (wordpress) has an RSS feed. This is a little XML data file that always has the latest posts in it. Nice. The forum didn’t have any RSS support though. Luckily, it’s one of the most popular forums out there so there were at least 3 add-ons that would add RSS functionality. It took me 2 tries to get a good one, but in the end I liked the way it worked. I was able to customize it and add in a couple XML fields that they didn’t set by default.

After seeing how slow the RSS feeds were to respond I decided that it would be best for the client to go get the data and transform it onto the screen. Nobody wants to wait 2 extra seconds for their page to load just for some integrated RSS.

I then had to coax the XML from the RSS into a form that could be displayed on the page. I started with the standard javascript DOM interface, but it was a big pain and the code was growing way too big way too fast. I then realized that it would be best to parse it on the server side and send JSON over to the javascript client. So it was time for a quick CGI script.

Let’s hear it for perl! I whipped up something that parsed the XML adequately in about 5 minutes. The key to the whole script is this line here:

print objToJson(XMLin($response->content, ForceArray=>["item", "channel"]));

That uses perl’s XML::Simple to parse the XML into a perl structure. The “ForceArray” part makes sure that all the “item” and “channel” tags in the XML get interpreted as array items. It then outputs that structure as JSON, ready to be interpreted by the javascript side.
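
Fleshed out, the whole CGI script only needs a few more lines. Here’s a sketch: the feed URL is made up, and objToJson is the old JSON.pm-style interface the line above already uses.

#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use XML::Simple qw(XMLin);
use JSON;   # provides objToJson (the old-style interface)

my $ua = LWP::UserAgent->new;
my $response = $ua->get("http://example.com/blog/feed/");   # hypothetical feed URL
die $response->status_line unless $response->is_success;

print "Content-Type: application/json\n\n";
print objToJson(XMLin($response->content, ForceArray => ["item", "channel"]));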

The javascript side is easy since it has everything in a native structure. Want the title of the first item in the first channel? rss.channel[0].item[0].title. Once I had the right data structures, the rest of the code basically wrote itself.

In the end I’m happy with the way the code turned out, but I’m not sure whether an extra little box in the corner of the page is useful enough to balance out the amount of clutter it adds.

-David