Hacker News new | past | comments | ask | show | jobs | submit login
MySQL InnoDB Clustered Indexes and Rails (or How RescueTime Avoided Buying More Hardware) (joehruska.com)
48 points by bfioca on March 10, 2008 | hide | past | favorite | 11 comments



I don't know much about InnoDB's clustered index, but if it is like Oracles IOT's, then normally its more expensive to do inserts (than for a heap table) as it has to put the data into a specific place, but cheaper to find related data when its queried due to all the data being stored together.

In this case, that extra 'expense' was easily off-set by the savings giving a great speed up. It does teach a good lesson though, which is index as little as possible by using composite indexes where you can!

In databases its amazing how 1 bad table/query/index can hose the performance of the either database - and once you find it, you get such a performance boost!


There are a lot of optimization tricks you could do with an app like rescuetime.

1. Buffer inserts on two levels: a. In-memory at the application level (if it crashes and you lose a few seconds' worth of data, it is not the end of the world). You can now do bulk inserts. b. Use MySQL's delayed insert buffer.

2. Split the data into separate tables for each duration interval so that you don't have one massive table.


1) Delayed inserts do not work with InnoDB tables.

2) This is a great idea for time based data. You age out the tables and create a merge table which is a logical view of all the tables together. You can get tricky and use compressed myisam tables for the older, static data since it will never change.


My startup has similar data volumes and reporting needs to rescuetime, and also runs rails on the backend. We found that traditional data warehousing techniques, and specifically running a MOLAP parallel to the SQL database, solved the performance problems so soundly that we could put 10x as many charts on a page and enabled data-slicing possibilities that were previously out of the question with the highly tuned SQL route. We were also able to get rid of all of the extra model code we had written to manage aggregations. See http://en.wikipedia.org/wiki/Online_analytical_processing for more background info if you're interested.


I'd love to talk to you regarding your MOLAP implementation with RoR. I had previously taken a brief look at ActiveWarehouse as a potential solution, but put that on the back burner as we are currently in a very fast iterative development mode - having launched just a couple of months ago. My last 16 years of database experience have been grounded in Oracle (oh Oracle 6 how I don't miss thee...) so I am fairly familiar with the the "Oracle" solution and storage of MOLAP cubes in Oracle Express, but am a complete noob when it comes to the OSS solutions. Any insight into your learning experiences would be most welcome!


email me at the address in my profile and we can talk


You can thank DHH for the dumbing down of Rails' databases. Everything should be in your app, don't you know?


Anyone have any my.conf examples for a good InnoDB setup?


Here is a sample configuration that we use for RescueTime, it's fairly InnoDB centric. http://www.joehruska.com/?p=7


Joe, thank you!


Thanks, Joe. This post rocks.




Consider applying for YC's W25 batch! Applications are open till Nov 12.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: