Developing Offline

I found myself on a plane a few days ago and was hoping to do some work on a few of my Ruby on Rails projects, primarily some polishing of the Community Mapping project I’m launching later this week.  Here are a few tips / tricks for developing in Ruby on Rails without internet access:

  1. Clone / Pull / Update the code for your application locally.  I do almost all of my development on remote servers so it’s rare I have the latest of anything on my hard drive.  git clone / git pull is a must to make this happen.  If you don’t have your SCM tool installed (like git, svn, hg, etc.) you need to do this ASAP.
  2. Bundle. Bundler helps maintain the dependencies in your application’s plugins / libraries, but to do that it usually needs to download libraries from the internet unless you have them installed locally.  If you’re short on time (i.e. waiting to board the airplane) I would run bundle install from the app that has the most libraries associated with it.  Bundler will reuse things that are already installed and, if you’re lucky, many of your applications share common libraries.  The more wifi time you have, the more applications you should bundle before you try to do it offline.
  3. Try to find any useful wiki / documentation pages that aren’t generated from source code.  These pages are likely going to include examples of implementations and features not associated with a particular function.  In my case, I know there is a wiki page on Github that describes a “best approach” to the problem I’m having right now, but I can’t get to that in the airplane.  I tend to have the most trouble with jQuery-based documentation… whenever I need to know the syntax for a function like $.ajax I just google it.  Not possible on an airplane.  Instead of waiting til I land to quickly fire up wifi before dashing to my connecting flight (that’s my current plan), I could have been smart and downloaded the documentation first.  http://www.jqapi.com/ or http://docs.jquery.com/Alternative_Resources may be worth exploring.
  4. Don’t worry about the framework / gem documentation, at least not the function-by-function style documentation that is generated automatically.  You can regenerate it on your own if you need to.  The Ruby on Rails documentation can be generated by running `rake doc:rails` from your application directory.  You’ll find the output in your app’s doc/api directory.  If you need documentation for a gem, your system might already have it.  Run `gem server` to start a server with information about your gems.  If the rdoc link isn’t working for the gem you’re interested in, fear not: you can generate it most of the time using `gem rdoc gemname`.  I needed the documentation for CanCan so I ran `gem rdoc cancan` and presto, the server was able to point me to some moderately useful information.
  5. Hack it if you have to.  If you forgot step 3 and step 4 didn’t help, you can probably write some really sloppy code to do what you’re trying to do.  If you can’t (or don’t want to) write some junky code, perhaps you can simulate it.  For example, I don’t know the exact call I need to figure out if I want to give the user access or not, but knowing that it will return true or false lets me very easily simulate what will happen in the rest of my application.
  6. Write lots of comments.  You’re flying in an airplane.  For all you know, the baby crying behind you could be affecting your normal coding practices.  It’s not going to be very easy to get back in the same mindset again, so you should document what you’re doing extensively.  This applies extra if you have to use step 5.
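
To make step 5 a bit more concrete, here is a minimal sketch of simulating a call whose real name I can’t look up offline.  Everything in it (AccessStub, can_access?, page_for) is a made-up placeholder, not the API of CanCan or any other library; the only thing it assumes is that the real check returns true or false.

```ruby
# Hypothetical stand-in for an authorization check whose real API
# can't be looked up offline. It just returns whatever it was told to.
class AccessStub
  def initialize(allowed)
    @allowed = allowed
  end

  # TODO: replace with the real library call once back online.
  def can_access?(_user, _resource)
    @allowed
  end
end

# The rest of the application only cares about the true/false answer.
def page_for(user, resource, checker)
  checker.can_access?(user, resource) ? "show #{resource}" : "access denied"
end

puts page_for("alice", "map", AccessStub.new(true))   # exercises the allowed branch
puts page_for("alice", "map", AccessStub.new(false))  # exercises the denied branch
```

Swapping in the real call later only means changing the body of can_access?; the TODO comment is there so step 6 doesn’t get forgotten.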

Best of luck with your offline development, and safe travels.

Driving Faster

I’ve been beta testing the new Shuttle Tracking system for the past 2 weeks and, after discovering the awesome Request-log-analyzer tool, I started to crunch some numbers on the request for new shuttle positions.  Every 4 seconds the page calls /vehicles/current.js (translating to VehiclesController#current.JS) to ask for the latest shuttle locations.  It is important we answer this query as fast as possible; a slow response here can queue up incoming requests very quickly.  The client JavaScript isn’t very smart right now, so requests keep coming every 4 seconds until you leave the page, which can bring the server to a screeching halt if we don’t answer (been there, seen that).

Looking at the current production site, the average response time is 16ms, with 8ms of database work and 7ms of rendering time.  I ran numbers on the beta and saw the same query was averaging around 63ms, with the split 26ms database and .17ms rendering (no clue where the missing milliseconds are).  I was very sad to see things were going close to 4x slower; I thought Rails 3 was supposed to make my world better!

Turns out it can, you just have to work a little bit harder.  What I almost forgot to mention was that the current Rails 2 production system uses a much smaller dataset: the table with all the shuttle positions is archived and wiped clean every night, so at worst (like 11pm) the queries are hitting a few thousand rows.  On the other hand, my research into route identification and arrival prediction requires a historical dataset, so I didn’t build any support into the new Rails 3 code to throw that data aside.  Maybe my code wasn’t so bad after all, but it was still measurably slower.

I switched the database over to my development server, which runs orders of magnitude slower than the production box (all production / beta code is running on the same dedicated shuttle tracking production server).  I started by taking a look at the database queries my code was generating and none of them seemed too outrageous.  The first query finds all the shuttles that have the enabled flag true, `SELECT vehicles.* FROM vehicles WHERE vehicles.enabled = true`, and was only taking 1ms, nothing significant at all.  The real slow guy is the query, executed once for each shuttle, to grab the latest position: `SELECT "updates".* FROM "updates" WHERE ("updates".vehicle_id = 1) ORDER BY timestamp DESC LIMIT 1`.  On the development box, running this query for just one shuttle (exactly as written above) was taking 1100ms; multiply that by 8 shuttles and you have >8 seconds of dedicated thinking time.  With the update interval of 4 seconds, the development server would probably implode as a result!

I considered rewriting the code to try to generate a different SQL query.  We actually don’t want to know the latest position, we want to know the latest position if that position is recent (e.g., has a timestamp within the last N minutes).  To achieve that I’d probably have to write a lambda scope, generating a query like `SELECT "updates".* FROM "updates" WHERE ("updates".vehicle_id = 1 AND "updates".timestamp > recent_time_here) ORDER BY timestamp DESC LIMIT 1`, which isn’t really that intimidating, but I don’t know if it would solve the real problem.  Database indexes, besides requiring less typing on my end, seemed like the better way to speed the query up.  (Lambda scopes are still intimidating most days.)
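
For what it’s worth, the “latest position, but only if it is recent” rule itself is simple.  Here is a rough sketch of it in plain Ruby over an in-memory array, so it runs without Rails or a database.  This is not the lambda scope and not the application’s real code; the Update struct, the method name and the max_age parameter are all assumptions made up for the illustration.

```ruby
# Hypothetical in-memory stand-in for a row of the updates table.
Update = Struct.new(:vehicle_id, :timestamp)

# Returns the newest update for a vehicle, but only if it is newer than
# (now - max_age) seconds; returns nil when nothing recent exists.
def latest_recent_update(updates, vehicle_id, now:, max_age:)
  cutoff = now - max_age
  updates
    .select { |u| u.vehicle_id == vehicle_id && u.timestamp > cutoff }
    .max_by(&:timestamp)
end

now = Time.at(1_000)
updates = [
  Update.new(1, Time.at(100)),  # too old to count
  Update.new(1, Time.at(900)),
  Update.new(1, Time.at(950)),
  Update.new(2, Time.at(100))   # vehicle 2 has nothing recent
]

recent = latest_recent_update(updates, 1, now: now, max_age: 300)
stale  = latest_recent_update(updates, 2, now: now, max_age: 300)
```

With a 300 second window, vehicle 1 resolves to its update at second 950 and vehicle 2 resolves to nil, which is the “don’t show a shuttle we haven’t heard from lately” behavior described above.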

I figure there are 2 parameters that the database cares about when it’s running the latest position query from above: vehicle_id and timestamp.  To figure out the best indexes to add, I set out and tested my options, running each index independently, together, and then combined (in both orders).

The first row of the table lists the indexes added to the updates table: vehicle_id + timestamp means two independent indexes (combining the first two tests), and a comma-separated entry means a single combined key.

The data showed, pretty clearly, that the combined key on [vehicle_id, timestamp] was the best index to add to the table. The results came in faster than with any other index and (as a nice bonus) the index size wasn’t as large as some of those that placed emphasis on the timestamp over the vehicle_id. Given the SQL query being executed, this makes sense. The query first needs to scope what vehicle to look for and then perform the timestamp operation.
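
One way to picture why the [vehicle_id, timestamp] order works: if the rows are kept sorted by vehicle first and timestamp second, each vehicle’s rows sit together and its newest row is simply the last one in its group.  The toy model below shows that idea in plain Ruby with a binary search.  It is only an analogy under that assumption, not how the database actually stores its index, and the Row and CompositeIndex names are invented for the sketch.

```ruby
# Hypothetical row with just the two indexed columns.
Row = Struct.new(:vehicle_id, :timestamp)

# Toy "combined key": rows kept sorted by [vehicle_id, timestamp].
class CompositeIndex
  def initialize(rows)
    @sorted = rows.sort_by { |r| [r.vehicle_id, r.timestamp] }
  end

  # Binary-search for the first row belonging to a later vehicle; the row
  # just before it is the newest row for vehicle_id, if that vehicle exists.
  def latest_for(vehicle_id)
    upper = @sorted.bsearch_index { |r| r.vehicle_id > vehicle_id } || @sorted.length
    return nil if upper.zero?

    candidate = @sorted[upper - 1]
    candidate.vehicle_id == vehicle_id ? candidate : nil
  end
end

index = CompositeIndex.new([
  Row.new(2, 50), Row.new(1, 10), Row.new(1, 30), Row.new(3, 5), Row.new(1, 20)
])
newest = index.latest_for(1)  # the row with timestamp 30
```

The lookup never has to sort or scan all of a vehicle’s history; it jumps straight to the end of that vehicle’s group, which matches the “scope the vehicle, then handle the timestamp” reading of the query.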

I committed code to add the indexes to the updates table and updated the beta appropriately. I posted a new link on Twitter asking people to help load / stress test the server and it was re-posted on Facebook a bit. I wanted to quickly generate enough data to compare with the previous beta run and the production log to see if the indexes significantly helped everyone’s experience or if it was just a fluke on my development server.

Below you’ll find the numbers, after expanding out some of the request-log-analyzer results, that show how much faster the indexes actually made things.

At first glance I wasn’t super thrilled that the new code, with indexes, was only 4ms faster than the existing code… but I guess another way to frame that is a 25% improvement, which is fairly substantial, and a similar improvement (closer to 22%) carried over to the upper limit of the 95th percentile range of requests.

I do find myself wishing request-log-analyzer could run its computations on the millisecond level; perhaps I’ll look into that change if I’m feeling extra adventurous sometime soon.

While I look forward to having an expanded dataset in the production system for cool things like route identification and estimated arrival times, until those features are public you can look forward to saving around 4ms every time the shuttles move (or don’t move) on your display!

Shuttle Tracking Upgrades

Over the past 6 months or so I’ve been spearheading the re-write of RPI’s Shuttle Tracking system into something less RPI-specific to make it useful to other organizations.  Part of this has been small semantic changes like removing RPI-specific words, location references (like the hard coded map center) and CAS-based authentication, but on a much larger level the application was restructured to do a lot more.

Both old and new systems store the same data (vehicles, vehicle positions, routes, and stops along the routes) but you no longer have to directly manipulate the database to hide a stop from the map and you don’t have to understand how to build a KML file to change the route around anymore.  Additionally, the new system feels much less “hacky”, if that makes any sense: things are where they should be (for the most part) and there are actually some back end pages worth showing off; we’ll be able to iterate and release new features much faster.

I am always impressed when an interface gets polished, but I’m rarely the one to do it (Thanks Reilly!)… what I can take credit for is the switch to Ruby on Rails 3.  Flagship Geo was a primary driver behind this: Rails 3 was necessary to pull in all those resources like the route and stop editors, but Rails 3 should also provide some performance enhancements.

The server has also been upgraded to include Ruby 1.9.2 via RVM because I think that makes it harder to break things.  When the site goes into production we’ll be serving using Passenger 3 to, in theory, speed up our web server end of the pipe.

As for the timeline of this release, the current system is staging in beta at RPI for performance testing / feedback.  After I’m satisfied the new one is performing at least as well as the old one it will be switched into production.  In the meantime, you can follow development on github: https://github.com/wtg/shuttle_tracking

Builder Files

Typically when I go to generate some XML-style documents in Ruby on Rails I manually code the XML syntax and manually escape and substitute the strings where I want them. This technique is pretty sloppy to toss in a view, and relies on your ability to generate well-formed XML off the top of your head. (It’s usually the escaping that becomes an issue.)
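
As a small illustration of how the escaping goes wrong, here is a sketch in plain Ruby.  The stop name is invented, and CGI.escapeHTML from the standard library is only a stand-in for “escape the value somehow”; it is not what the application used.

```ruby
require "cgi/escape"

# Hypothetical stop name containing characters that are special in XML.
stop_name = "Union & 15th <North>"

# Hand-coded substitution: the & and < end up raw inside the element,
# so the result is not well-formed XML.
naive = "<name>#{stop_name}</name>"

# Escaping the value first keeps the markup well-formed.
escaped = "<name>#{CGI.escapeHTML(stop_name)}</name>"

puts naive
puts escaped
```

The second line printed is `<name>Union &amp; 15th &lt;North&gt;</name>`.  Forgetting that one escape call on a single value is all it takes to break the whole document, which is the part a builder library handles for you.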

Thinking back, this is probably the quick and easy technique I picked up from my PHP development projects.  I could hand-code and debug some XML faster than I could find a suitable library, install it, and figure out how to work it.  Sometimes it’s just easier to do things the hard way.

As a rule of thumb, I think the hard way in PHP never translates well into Ruby on Rails.

Yesterday I was struggling to cleanly generate XML in Rails 3 because of the new default output escaping.   Using Builder was easy enough to generate the right XML structure for my KML file, but getting it to output was a challenge.  All of my < tags were getting replaced with &lt; and such, and the raw helper (<%=raw foo %>) wasn’t cooperating.

I ended up discovering that I could rename my file from show.kml.erb to show.kml.builder, and I wouldn’t have to mess around with any escaping or erb syntax at all. You can check out the code I used in this commit.  It might be just me, but I always struggle to find the appropriate documentation for these little nuances in Rails.  There are tons of examples showing how to use Builder to build XML documents, but not one of them mentioned what to name your file.

This technique definitely took me longer than a quick pass through manually plugging in the strings would have, but it’s a lot cleaner.  I don’t have to worry about escaping or generating valid XML, and if performance is an issue I can install a new XML builder to speed things up.

Bumping to Beta 4

I just bumped Flagship Geo to use Rails 3 Beta 4 (commit); luckily everything seems to still be working when I run my test suite.   Installing the new version of Rails is pretty easy, you just have to run `gem install rails --pre`.  You might want to sudo if you keep your gems system wide.

I use Passenger to serve most of my Rails apps on my development server and I had to change around the config a bit to get things working with Beta 4.  Specifically, I had to switch Passenger to treat my application like a Rack application.  I don’t know exactly what this means from an application-architecture standpoint, but I believe it’s related to the initialization procedures used to boot the application.  To make the switch, I had to edit my public/.htaccess file.  You might need to edit your Apache virtual host config if that is where you store your Passenger config info.  The switch is pretty easy, switch every instance of “Rails” to “Rack”:

RackEnv development
RackBaseURI /geo

Also, bundler has been getting on my nerves.  When I upgraded rails, bundler was updated to 0.9.26, which is very confused about its gem locations.  From the command line, everything works great.  I can rake test, ./script/rails console, and do all of that great stuff, but when I load my application up in the browser half of my gems can’t be found.  I needed to sudo gem install them manually to make them available system-wide; doing just `bundle install` would install them locally, which was good enough for CLI work but didn’t cut it for Passenger.  I believe this might be fixed in bundler 0.10… due out ASAP?

Otherwise, everything is working great.  I look forward to the Rails 3 RC upgrade later in the week… hopefully the upgrade won’t be much harder.