Conversion Job Pool

In my previous post I talked about the concepts I use to convert videos to various formats. Yesterday I finished the implementation that allows multiple video conversions to run at the same time, so let me walk you through some of the more technical details to the best of my understanding.

BackgrounDRb does all the hard work, and can handle the idea of a job pool. What happens is this: there is one worker that is started by default. Instead of using this worker to handle the video conversion, I use it as the queue manager. My conversion controller calls something like this to send a new conversion request to the queue:

MiddleMan.worker(:video_worker).enq_queue_convert(:args => {:conversion_id => @conversion.id}, :job_key => @conversion.id)

In my video worker, I have two methods: one for the queue_convert calls, and one that actually handles the conversion.

# Queue up the video for conversion
def queue_convert(args)
  conversion = Conversion.find(args[:conversion_id])
  conversion.update_attributes(:status => "queued")
  thread_pool.defer(:convert, args)
end

# Run the conversion of the video
def convert(args = nil)
  logger.info("Calling convert method for conversion #{args[:conversion_id]}.")
  conversion = Conversion.find(args[:conversion_id])
  # do stuff
  # Mark this job as complete in the job queue
  persistent_job.finish!
end

My understanding is that the thread_pool.defer method puts the job into a persistent queue, so that if everything crashes before the job starts it can still recover. If the maximum number of workers hasn’t been reached, a new one is spawned to handle the request. For my uses, it would be nice if I could write a bit more code to choose when a new worker spawns: free memory and processor usage are much more important than the total number of processes when it comes to video conversion. Time permitting, I might dive into BackgrounDRb and see how easy it would be to change that.

In the convert method, the only important line is the last one, persistent_job.finish!, which marks the job as complete. I suspect this only frees the worker to look for another job or shut down. During my tests, a job that crashed halfway through its run (let’s say FFMPEG segfaulted in the middle of my convert code) was not automatically retried when BackgrounDRb was restarted.
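To make the "spawn based on memory and CPU" idea a bit more concrete, here is a rough sketch in plain Ruby. None of this is BackgrounDRb API: SpawnPolicy, allow_spawn? and the threshold numbers are all made up for illustration, and the load average and free memory would have to come from wherever your system exposes them.

```ruby
# Hypothetical admission check; the names and thresholds are assumptions,
# not anything BackgrounDRb provides.
SpawnPolicy = Struct.new(:max_workers, :max_load_per_core, :min_free_mb) do
  # Returns true when it looks safe to start one more conversion worker.
  def allow_spawn?(running:, load_avg:, cores:, free_mb:)
    return false if running >= max_workers                   # hard cap still applies
    return false if load_avg / cores.to_f > max_load_per_core # CPU already busy
    free_mb >= min_free_mb                                    # enough memory left
  end
end

policy = SpawnPolicy.new(4, 0.75, 512)
puts policy.allow_spawn?(running: 1, load_avg: 1.0, cores: 4, free_mb: 2048) # mostly idle machine
puts policy.allow_spawn?(running: 1, load_avg: 3.8, cores: 4, free_mb: 2048) # CPU-bound machine
```

The hard cap on worker count is kept as a backstop, so the load and memory checks can only make the pool more conservative, never less.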

Since converting videos can take a really long time, I added a conversion model to track the status of a video that is being processed by the video worker. I think I could have implemented something with BackgrounDRb’s result cache, but it struck me as not the most persistent way to track details about a conversion.

If you’re looking for the actual code that *works* for me, you can check it out on GitHub: http://github.com/bamnet/bonsai-video/tree/master

Converting Video

I spent this week working on tools to convert videos to different formats. My main goal was to allow people to specify ffmpeg conversion settings that could be used to render something into a “web-friendly” format like flv, h264, or even ogg. To do this, I have a profile model that stores a command (string) with infile and outfile dummy parameters. You select a video you want to convert, choose the conversion profile, and send it off into the sunset.
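As a minimal sketch of how a stored command with dummy parameters might be filled in: the INFILE and OUTFILE tokens and the build_command helper are assumptions for illustration, not necessarily what the profile model really uses. The file names are shell-escaped so a path with spaces or quotes can’t break the command.

```ruby
require "shellwords"

# Hypothetical placeholder tokens; the real profiles may use different ones.
def build_command(template, infile, outfile)
  # The block form of gsub inserts the escaped path literally, so the
  # backslashes added by Shellwords.escape are not treated as backreferences.
  template
    .gsub("INFILE") { Shellwords.escape(infile) }
    .gsub("OUTFILE") { Shellwords.escape(outfile) }
end

profile = "ffmpeg -i INFILE -ar 22050 -f flv OUTFILE"
puts build_command(profile, "/tmp/my video.mov", "/tmp/out.flv")
```

Escaping at substitution time means the profile author only has to think about ffmpeg options, not about quoting.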

Right now the “sunset” consists of a BackgrounDRb worker. I found it pretty challenging to debug when I was working on it: BackgrounDRb gives you a very limited trace of the error, and it never pinpointed which line in my video worker was making it unhappy. When things work well, the video conversion worker does a great job. Videos convert, they’re created in the system and associated with the parent. No problems at all. The trick comes into play when a video conversion fails. Right now I don’t have any way to tell whether ffmpeg is having a good time or a bad time converting, which would be a really handy feature. Ideally, I’d be able to grab the last line or two from the FFMPEG output that show the status, fps, etc. I might look into this with more time.
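Here is a rough sketch of what grabbing that last status line could look like. It assumes the captured ffmpeg output contains status lines made of key=value pairs such as frame= and fps=; the sample text below is illustrative, and the exact fields vary between ffmpeg versions. The helper names are mine.

```ruby
# Pull "key=value" pairs out of one ffmpeg status line.
def parse_ffmpeg_status(line)
  line.scan(/(\w+)=\s*(\S+)/).to_h
end

# ffmpeg redraws its status line using carriage returns, so split on
# \r as well as \n and keep the last line that mentions a frame count.
def last_status(output)
  line = output.split(/[\r\n]+/).reverse.find { |l| l.include?("frame=") }
  line ? parse_ffmpeg_status(line) : nil
end

# Illustrative sample only; real output depends on the ffmpeg version.
sample = "frame=  100 fps= 25 q=31.0 size=     512kB time=00:00:04.00 bitrate=1048.6kbits/s\r" \
         "frame=  150 fps= 26 q=29.0 size=     768kB time=00:00:06.00 bitrate=1048.6kbits/s\r"

status = last_status(sample)
puts "frame #{status["frame"]} at #{status["fps"]} fps"
```

Returning nil when no status line is found gives the caller a simple way to tell "ffmpeg never got going" apart from "ffmpeg was making progress".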

Additionally, I’m working on some code to support more than one worker running at the same time. Right now I spawn one video worker, which queues up all the requests to convert video… ideally I’d like to let users define how many conversions run at the same time, so faster machines could handle more conversion processes.
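The "user-defined number of simultaneous conversions" idea can be sketched with plain Ruby threads and a queue. This is not how BackgrounDRb does it; run_pool is a made-up helper, and the lambdas stand in for real conversion jobs.

```ruby
require "thread"

# Run every job using at most `size` worker threads.
# Results come back in completion order, not submission order.
def run_pool(jobs, size)
  queue = Queue.new
  jobs.each { |job| queue << job }
  results = Queue.new

  workers = Array.new(size) do
    Thread.new do
      loop do
        job = begin
          queue.pop(true) # non-blocking pop raises ThreadError when empty
        rescue ThreadError
          break           # nothing left to do, let this worker finish
        end
        results << job.call
      end
    end
  end

  workers.each(&:join)
  Array.new(results.size) { results.pop }
end

# Stand-in jobs; a real job would shell out to ffmpeg.
jobs = %w[a.mov b.mov c.mov].map { |file| -> { "converted #{file}" } }
puts run_pool(jobs, 2).sort
```

The pool size is the only knob, which is exactly the setting a faster machine would turn up.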

Progress Updates

I spent some time today polishing up the basic video create/view/update/delete pages so they are a little easier to use. Not being a great CSS designer, I opted to use the Blueprint CSS framework [released under the MIT License] to make things easier… I’ve never been a fan of manually crafting stylesheets for very complicated pages, and knowing that someone else will deal with the various browser-compatibility hacks is a big relief. In addition to Blueprint, I’m using the Silk icon set from FamFamFam. I think I’ve used this icon set in just about every project I’ve done that doesn’t have someone who is a “designer”. I was pleasantly surprised to see they have icons for film.

On the coding front, the simple management (CRUD) of videos and collections is complete. I need to do a bit more work on adding additional videos to a collection, which I hope to have done by Wednesday… then I’m onto thumbnailing and conversion.

Nested Attributes

The regular disclaimer applies to this post: I’m not a rails expert, so I may not be presenting the “best practices” technique, or a technique that will work for everyone all the time. This works for me, in my use case, and I don’t see any obvious reasons why it wouldn’t work for others.

I have two models I’m working with right now in my project, a collection and a video. A collection is essentially a bundle of similar videos that share a title, description, date, etc. This supports the concept of multiple formats or slight variations of the same video being used in the system. It’s a pretty standard setup in my models:

==Collection.rb==

class Collection < ActiveRecord::Base
  has_many :videos
end

==Video.rb==

class Video < ActiveRecord::Base
  belongs_to :collection
end

So every entry in the video table has an id in the collection_id field that enables that join to take place. Now it’s time to get a form that will create a collection and the first video in the collection at the same time… initially I spent time trying to generate a text field (using the text_field form helper) that would be called collection[uploaded_data] (uploaded_data is the field I use on my new video form), but I just couldn’t get that to work. I can’t remember where that naming syntax is used… maybe it’s CakePHP, Concerto, or even a different Rails technique… but it wasn’t working.

Googling around for “nested attributes” seemed to yield some interesting results, and I stumbled onto the accepts_nested_attributes_for concept. I changed my collection model to something like this:

class Collection < ActiveRecord::Base
  has_many :videos
  accepts_nested_attributes_for :videos
end

Now, I had to add a line (@collection.videos.build) to my collections controller so that it will create a new video object:

def new
  @collection = Collection.new
  @collection.videos.build

  respond_to do |format|
    format.html # new.html.erb
    format.xml { render :xml => @collection }
  end
end

Last, but not least, I updated my form with the following:

<% form_for(@collection, :html => { :multipart => true }) do |f| -%>
  <%= f.error_messages %>
  <p>
    <%= f.label :title %>
    <%= f.text_field :title %>
  </p>
  <p>
    <%= f.label :description %>
    <%= f.text_area :description %>
  </p>
  <p>
    <% f.fields_for :videos do |video_fields| %>
      <%= video_fields.label :uploaded_data %>
      <%= video_fields.file_field :uploaded_data %>
    <% end %>
  </p>
  <p>
    <%= f.submit 'Save' %>
  </p>
<% end -%>

Now the save is handled by the exact same save code I used for the regular collection entry, so there’s no need to update that. Whew!
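To show why the same save code keeps working, here is a toy sketch in plain Ruby of the data flow as I understand it. With fields_for on a has_many association, the file field is submitted as collection[videos_attributes][0][uploaded_data], and accepts_nested_attributes_for defines a videos_attributes= writer on the model. The classes below are stand-ins I wrote for illustration, not ActiveRecord, and the sample values are invented.

```ruby
# Toy stand-ins for the ActiveRecord models, only to show the data flow.
class Video
  attr_reader :uploaded_data

  def initialize(attrs = {})
    @uploaded_data = attrs[:uploaded_data]
  end
end

class Collection
  attr_reader :title, :videos
  attr_writer :title

  def initialize(attrs = {})
    @videos = []
    # Mass assignment: each key is sent to the matching writer method.
    attrs.each { |key, value| public_send("#{key}=", value) }
  end

  # Simplified version of the writer that nested attributes add.
  def videos_attributes=(attrs)
    rows = attrs.is_a?(Hash) ? attrs.values : attrs
    rows.each { |row| @videos << Video.new(row) }
  end
end

# Roughly the shape of the params the form submits (values invented).
params = {
  collection: {
    title: "Spring Concert",
    videos_attributes: { "0" => { uploaded_data: "concert.mov" } }
  }
}

collection = Collection.new(params[:collection])
puts collection.videos.first.uploaded_data
```

Because the nested video data arrives as just another key in the collection hash, the controller can keep passing params[:collection] straight to the model.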

Here are some additional resources I found helpful:

acts-as-timecode

In a few places I need to store the length of a video. To keep things simple and fast, I’m using an :integer and storing the length of the video file in seconds. This is all well and good when it comes to indexing and sorting in the database, but when it comes to user interaction it’s really junky. No one wants to work out that a video 421 seconds long is 7 minutes and 1 second long… and I didn’t want people to have to convert 3 hours, 45 minutes, and 10 seconds into a huge integer.

At first I implemented it quick and dirty in the edit code of the controller that needed this conversion. However, as I started to build out a few other items I realized I was going to need the same feature elsewhere, so I moved it to a plugin.

You should note that this is my first time authoring a plugin and my first time trying to make one into a gem. It works for me, so I would hope it can work for you too. It’s pretty straightforward to set up and use. Here’s my model:

class Video < ActiveRecord::Base
  acts_as_timecode :column => :duration
end

Now I can use this cool new timecode field in my views and controllers. For example, in my edit.html.erb I use <%= f.text_field :timecode %>, which generates a textbox with the video duration in it, looking like 00:34:12 or however long the video is. Hitting the save button works as expected, updating the duration field in my database to the correct seconds value (2052 in this case). Because I don’t always like to type leading zeros, the timecode field can take the following formats: HH:MM:SS:FF, HH:MM:SS, MM:SS, SS. The frame implementation isn’t very useful, but it will round your frames to the nearest second value, based on the :fps configuration setting (which defaults to 30).
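For the curious, the conversion itself is simple arithmetic. The sketch below is not the plugin’s actual code, just a standalone illustration of the formats listed above; both helper names are made up, and the frame handling follows my description (frames divided by fps, rounded to the nearest second).

```ruby
# Convert "HH:MM:SS:FF", "HH:MM:SS", "MM:SS" or "SS" into whole seconds.
def timecode_to_seconds(timecode, fps: 30)
  parts = timecode.split(":").map(&:to_i)
  frames = parts.length == 4 ? parts.pop : 0
  # Walk from the seconds field leftwards: seconds * 1, minutes * 60, hours * 3600.
  seconds = parts.reverse.each_with_index.sum { |value, i| value * 60**i }
  seconds + (frames / fps.to_f).round
end

# Format a number of seconds as HH:MM:SS.
def seconds_to_timecode(total)
  format("%02d:%02d:%02d", total / 3600, (total % 3600) / 60, total % 60)
end

puts timecode_to_seconds("00:34:12") # 34 * 60 + 12 = 2052
puts timecode_to_seconds("03:45:10") # 3 * 3600 + 45 * 60 + 10 = 13510
puts seconds_to_timecode(421)        # 00:07:01
```

Storing the integer and converting only at the edges keeps the database sortable while the form stays readable.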

At some point I might expand it, but it will depend on what I need it to do.

You can check it out on GitHub: http://github.com/bamnet/acts-as-timecode/