Looking at the current production site the average response time is 16ms, with 8ms of database work and 7ms of rendering time. I ran numbers on the beta and saw the same query was averaging around 63ms, with the split 26ms database and .17ms rendering (No clue where the missing milliseconds are). I was very very sad to see things were going close to 4x slower, I thought Rails 3 was suppose to make my world better!
Turns out it can, you just have to work a little bit harder. What I almost forget to mention was that the current Rails-2 production system uses a much smaller dataset, the table with all the shuttle positions is archived and wiped clean every night, so at worst (like 11pm) the queries are hitting a few thousand rows. On the other hand, my research into route identification and arrival prediction requires a historical dataset so I didn’t build any support into the new Rails 3 code to throw that data aside. Maybe my code wasn’t so bad after all, but it was still measurable slower.
I switched the database over to my development server which runs orders of magnitude slower than the production box (all production / beta code is running on the same dedicated shuttle tracking production server). I started by taking a look at the database queries my code was generating and none of them seemed too outrageous. The first query finds all the shuttles that have the enabled flag true, SELECT vehicles.* FROM vehicles WHERE vehicles.enabled = true, and was only taking 1ms, nothing significant at all. The real slow guy is the query, executed one for each shuttle, to grab the latest position SELECT “updates”.* FROM “updates” WHERE (“updates”.vehicle_id = 1) ORDER BY timestamp DESC LIMIT 1. On the development box, running this query for just one shuttle (like it looks previously) was taking 1100ms, multiply that by 8 shuttles and you have >8 seconds of dedicated thinking time. With the update interval of 4 seconds, the development server would probably implode as a result!
I considered rewriting the code to try and generate a different sql query. We actually don’t want to know the latest position, we want to know the latest position if that position is recent (e.g has a timestamp within the last N minutes). To achieve that I’d probably have to write a lamda scope, generating a query like SELECT “updates”.* FROM “updates” WHERE “updates”.vehicle_id = 1 AND “updates”.timestamp > recent_time_here) ORDER BY timestamp DESC LIMIT 1 which isn’t really that intimidating, but I don’t know if it would solve the real problem. Database indexes, besides requiring less typing on my end, seemed like the better way to speed the query up. (Lamda scopes are still intimidating most days)
I figure there are 2 parameters that the database cares about when it’s running the latest position query from above: vehicle_id and timestamp. To figure out the best indexes to add I set out and tested my options, running each index independently, together, and them combined (in both orders).
The first row in represents the indexes added to the table, the vehicle_id + timestamp represents having two independent indexes (combining the first two test) and the comma-separated index represented a combined key.
The data showed, pretty clearly, that the combined key on [vehicle_id, timestamp] was the best index to add to the table. The results came in faster than any other index and (as a nice bonus) the index size wasn’t as large some of those that placed emphasis on the timestamp over the vehicle_id. Given the SQL query being executed, this makes sense. The query first needs to scope what vehicle to look for and then perform the timestamp operation.
I committed code to add the indexes to the updates table and updated the beta appropriately. I posted a new link on Twitter asking people to help out load / stress test the server and it was re-posted on Facebook a bit. I wanted to quickly generate enough data to compare with the previous beta run and the production log to see if the indexes signifnicantly helped everyone’s experience or it just a fluke on my development service.
Below you’ll find the numbers, after expanding out some of request-log-analyzer results, that show how much faster the indexes actually made things.
At first glance I wasn’t super thrilled that the new code, with indexes, was only 4ms faster than the existing code… but I guess another way to frame that is a 25% improvement which is fairly substantial and that same change (closer to 22%) was carried over the upper limit of the 95 percentile range of requests.
I do find myself wishing request-log-analyzer could run it’s computations on the millisecond level, perhaps I’ll look into that change if I’m feeling extra adventurous sometime soon.
While I look forward to having an expanded dataset in the production system for cool things like route identification and estimated arrival times, until those features are public you can look forward to saving around 4ms every time the shuttles move (or don’t move) on your display!