Posts Tagged: Programming


1
Jun 08

Facebook chat uses Erlang to scale

I started playing with Erlang last year. Mostly that meant reading the Joe Armstrong book, looking at ejabberd and writing a little code. Sadly, I’ve not had the chance to go much beyond the “playing” stage. Anyway, I’ve had a soft spot for functional languages like Erlang ever since my Programming Language Theory class in undergrad, where we used ML. I especially like the way Joe and the rest of the people behind Erlang have built it for concurrency via tiny processes that share nothing, and have provided a framework for building apps that operate correctly in soft real-time. It’s a very different way of thinking about building systems and seems to be remarkably effective.

The Erlang guys have to be feeling pretty good to hear that Facebook has used Erlang as a core component of their new chat service. (High Scalability also has a good writeup.) As Facebook engineer Eugene Letuchy describes it, their implementation uses XHR long polling, which means tons of open HTTP connections. Spread this out over 70 million potential users and it’s not hard to see that Apache would break down pretty quickly. Basically, it sounds like they have tons of Erlang processes servicing these connections and holding messages and presence events for users in memory whenever there’s no open connection to the client.

Eugene mentions the challenge of delivering presence information as being more difficult than real-time messaging. (Something I thought a lot about when building Effusia.) He lays out the issues inherent in broadcasting presence on every state change in the form of a nasty worst-case asymptotic complexity:

The naive implementation of sending a notification to all friends whenever a user comes online or goes offline has a worst case cost of O(average friendlist size * peak users * churn rate) messages/second, where churn rate is the frequency with which users come online and go offline, in events/second.
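To get a feel for how quickly that product grows, here is a back-of-the-envelope calculation in Python. The figures are made up for illustration (they are not Facebook's numbers), and the sketch assumes churn rate is measured per user, which is one reading of the quote.

```python
def naive_presence_rate(avg_friends, peak_users, churn_per_user_per_sec):
    """Messages/second if every state change is pushed to every friend."""
    return avg_friends * peak_users * churn_per_user_per_sec


# Hypothetical inputs, chosen only to show the scale of the problem.
avg_friends = 100            # assumed average friend list size
peak_users = 1_000_000       # assumed users online at peak
churn = 1 / 600              # assumed: each user changes state once every 10 minutes

rate = naive_presence_rate(avg_friends, peak_users, churn)
print(f"{rate:,.0f} presence messages/second")  # about 166,667
```

Even with these modest assumptions the naive broadcast costs well over a hundred thousand messages per second, and it scales linearly with each of the three factors.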

However, he doesn’t really go into any detail on how they solved this problem. I can only assume they used some form of periodic polling on a need-to-know basis and/or coalesced friend presence updates so that they’re only occasionally sent to a user.
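As a rough sketch of what coalescing could look like (this is my guess at the idea, not Facebook's implementation), the Python below keeps only the latest state per friend for each recipient and hands over one batch when the recipient is next serviced. The class and method names are hypothetical.

```python
from collections import defaultdict


class PresenceCoalescer:
    """Buffers presence changes so each recipient gets one batch per flush."""

    def __init__(self):
        # recipient -> {friend: latest known state}
        self._pending = defaultdict(dict)

    def record(self, recipient, friend, state):
        # A newer state for the same friend overwrites the older one,
        # so rapid online/offline flapping collapses to a single entry.
        self._pending[recipient][friend] = state

    def flush(self, recipient):
        # Return everything buffered for this recipient and clear it.
        return self._pending.pop(recipient, {})


coalescer = PresenceCoalescer()
coalescer.record("alice", "bob", "online")
coalescer.record("alice", "bob", "offline")   # replaces bob's earlier state
coalescer.record("alice", "carol", "online")

batch = coalescer.flush("alice")
print(batch)  # {'bob': 'offline', 'carol': 'online'}
```

Three state changes turn into one message with two entries, and the flush can be driven by a timer or by the client's next poll, which is where the "need-to-know" part would come in.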

A few other interesting notes… Apparently they used C++ to do the chat logs as Erlang is not that great at raw I/O. They also apparently use Thrift to glue everything together. (Reminds me I need to look into Thrift in more detail.)


25
May 08

Surprisingly basic Rails performance tips – and the people that don’t love them

Antonio Cangiano offers up 10 Ruby on Rails Performance tips, some of which are really just good practice in any web application and aren’t specific to Rails. These include gems like:

don’t be afraid of using the cool features provided by your database, even if they are not directly supported by Rails and doing so means bypassing ActiveRecord. For example define stored procedures and functions, knowing that you can use them by communicating directly with the database through driver calls, rather than ActiveRecord high level method.

And:

Retrieve only the information that you need. A lot of execution time can be wasted by running selects for data that is not really needed.

Shocking stuff!

What amazes me is the level of irritation evident in the comments from people decrying this as “premature optimization”. I agree that you shouldn’t completely reorganize your code to achieve some speculative performance increase before you really know what parts of your app are going to have issues. However, some things are just common sense. If I can pull back data from the database without doing O(n) queries in a loop, I should do that. If I need to run a report with lots of aggregated data, I should probably consider computing that in the DB via a function or stored procedure. Bottom line, there are things that are guaranteed to cause you problems. Sitting around and smugly saying, “I don’t want to optimize prematurely here” is no excuse for writing dumb code.
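To make the loop-versus-one-query point concrete, here is a minimal sketch in Python with the standard sqlite3 module rather than Rails and ActiveRecord. The schema and function names are hypothetical; the only point is that both functions return the same answer, one with a query per row and one with a single aggregate computed in the database.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory database with a made-up authors/posts schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
        INSERT INTO authors VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
        INSERT INTO posts (author_id, title) VALUES (1, 'a'), (1, 'b'), (2, 'c');
    """)
    return conn


def post_counts_in_loop(conn):
    """The pattern to avoid: one extra query for every author row."""
    counts = {}
    authors = conn.execute("SELECT id, name FROM authors").fetchall()
    for author_id, name in authors:
        (n,) = conn.execute(
            "SELECT COUNT(*) FROM posts WHERE author_id = ?", (author_id,)
        ).fetchone()
        counts[name] = n
    return counts


def post_counts_in_db(conn):
    """Same result from a single query; the database does the aggregation."""
    rows = conn.execute("""
        SELECT a.name, COUNT(p.id)
        FROM authors a
        LEFT JOIN posts p ON p.author_id = a.id
        GROUP BY a.id, a.name
    """)
    return dict(rows.fetchall())


conn = make_db()
print(post_counts_in_loop(conn))  # 1 + N queries
print(post_counts_in_db(conn))    # 1 query
```

With three authors the difference is invisible, but the loop issues one round trip per author, so its cost grows with the table while the single query stays at one.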

(Link to the performance tips via the FiveRuns Blog)