Below is the presentation I gave at Austin MongoDB Day on 3/27/2010. I talked about the design decisions you need to make when deciding whether to introduce a document database like MongoDB into an existing application built on a relational database. I give a few examples of how we’ve made MongoDB and our MySQL database live together in harmony inside CheapTweet.
I gave a quick talk at Austin on Rails last night about my experience using Sphinx and Thinking Sphinx for full-text search on CheapTweet. The evening was a series of short talks, so this is only a 10-minute overview. I hope some of you find it helpful.
I gave a talk last night (3/24/2009) at Austin on Rails about building a new (yet-to-be-released) application called TweetReach using tools that are somewhat off the beaten path. These included Sinatra, Tokyo Cabinet and my new Twitter API library: Grackle (which I’ll talk about in more detail in a future blog post).
I referenced some code during the “Building TweetReach” part of the presentation. If you’re interested, you can download it.
By the way, I was the second speaker of the night. Mike Perham spoke before me about caching with Rails. Check his stuff out as well. Thanks to everyone who came out.
Update 4/2/2009: Launched TweetReach! Go check it out.
Don’t be afraid of using the cool features provided by your database, even if they are not directly supported by Rails and doing so means bypassing ActiveRecord. For example, define stored procedures and functions, knowing that you can use them by communicating directly with the database through driver calls rather than through ActiveRecord’s high-level methods.
Retrieve only the information that you need. A lot of execution time can be wasted by running selects for data that is not really needed.
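As a rough illustration of that second tip, here is a minimal sketch using Python's standard-library `sqlite3` module rather than Rails or ActiveRecord. The `deals` table, its columns, and the `payload_size` helper are made up for the example; the only point is that naming the columns you need keeps a large, unused column from being fetched at all.

```python
import sqlite3

# In-memory database with a hypothetical table that has one bulky column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE deals (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO deals (title, body) VALUES (?, ?)",
    [(f"Deal {i}", "x" * 10_000) for i in range(100)],
)


def payload_size(rows):
    """Rough measure of how much data came back: total characters across all values."""
    return sum(len(str(value)) for row in rows for value in row)


# Fetches every column, including the large body text we never use.
everything = conn.execute("SELECT * FROM deals").fetchall()

# Fetches only what a listing page would need.
titles_only = conn.execute("SELECT id, title FROM deals").fetchall()

print(payload_size(everything), "characters with SELECT *")
print(payload_size(titles_only), "characters with SELECT id, title")
```

Both queries return the same number of rows; the second simply carries far less data per row.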
What amazes me is the level of irritation evident in the comments from people decrying this as “premature optimization”. I agree that you shouldn’t completely reorganize your code to achieve some speculative performance increase before you really know what parts of your app are going to have issues. However, some things are just common sense. If I can pull back data from the database without doing O(n) queries in a loop, I should do that. If I need to run a report with lots of aggregated data, I should probably consider computing that in the DB via a function or stored procedure. Bottom line, there are things that are guaranteed to cause you problems. Sitting around and smugly saying, “I don’t want to optimize prematurely here” is no excuse for writing dumb code.
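To make the "queries in a loop" point concrete, here is a small sketch, again in Python with the standard-library `sqlite3` module rather than Rails. The `users` and `tweets` tables, the sample rows, and the `run` helper that counts statements are all assumptions for illustration. It computes the same per-user totals twice: once with one query per user, and once with a single aggregate query that lets the database do the work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tweets (id INTEGER PRIMARY KEY, user_id INTEGER, retweets INTEGER);
    INSERT INTO users (id, name) VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO tweets (user_id, retweets) VALUES (1, 5), (1, 7), (2, 1), (3, 0), (3, 4);
""")

# Hypothetical helper: runs a statement and counts how many were issued.
counter = {"queries": 0}


def run(sql, params=()):
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()


def totals_with_loop():
    """One query for the users, then one more per user: 1 + n queries."""
    totals = {}
    for user_id, name in run("SELECT id, name FROM users"):
        rows = run("SELECT retweets FROM tweets WHERE user_id = ?", (user_id,))
        totals[name] = sum(row[0] for row in rows)
    return totals


def totals_in_database():
    """A single query: the join and the aggregation both happen in the database."""
    rows = run("""
        SELECT u.name, COALESCE(SUM(t.retweets), 0)
        FROM users u
        LEFT JOIN tweets t ON t.user_id = u.id
        GROUP BY u.id, u.name
    """)
    return dict(rows)


counter["queries"] = 0
loop_totals = totals_with_loop()
loop_queries = counter["queries"]

counter["queries"] = 0
db_totals = totals_in_database()
db_queries = counter["queries"]

print(loop_totals, "using", loop_queries, "queries")
print(db_totals, "using", db_queries, "query")
```

With three users the loop version issues four queries and the aggregate version issues one; the gap grows linearly with the number of users, which is why this is worth getting right up front rather than waiting for a profiler to point it out.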
(Link to the performance tips via the FiveRuns Blog)