How Facebook stores its billions of photos (facebook.com)
80 points by anirudh on April 30, 2009 | 15 comments


That was much more exhaustive than I was anticipating. I'm curious how this stacks up against sites like Flickr.


@thehigherlife - I was thinking about the same thing recently (after an announcement about FB becoming the biggest photo app on the planet). I found two great sources of information on this:

- http://highscalability.com/flickr-architecture

A very deep analysis of Flickr's architecture from a couple of years ago, including a link to Cal Henderson's presentation slides. I'm surprised he got canned from Y! yesterday, though!


Not to be wooden, but if you reply to someone's comment, I don't think there's a need to state @[username].


I don't know anything about Cal's departure, but I can say that any large company like Yahoo, Google, MSFT, Apple, etc. is more or less impervious to the loss of one or a few technical individuals. His departure may have symbolic value, but the machine that is chugging there will keep chugging, for better or for worse.


Don't confuse continuing to operate with continuing to innovate.

A few technical individuals can move the needle more than you'd imagine, even at a big co.


No doubt. I am studying this further, for sure.

The info from a public-facing, private company is striking. I'm amazed how much code and knowledge Facebook has already given back to the community given their state. Though perhaps being private and not publicly traded makes it easier. Still, it seems like they have done some things very well and wouldn't want to expose them.


Amazing? Really? There are an awful lot of companies dramatically smaller than Facebook that have released a lot more code. 37signals, with maybe a 10th the valuation (if we're being very generous to 37signals), has given away a lot more code than Facebook.

Facebook may be many things, but "amazing" generosity is not one of their characteristics. Call them "somewhat" generous, if you like. Heck, I'll even let "pretty" generous slide. But let's not be over generous in our assessment of just how nice Facebook is to the community. They're nice guys, and all, but they aren't saints.


Just to clarify, to the best of my knowledge, they are private, not profitable and sharing code/knowledge. This amazes me.

I never said they were amazing; a different distinction. I'm in total agreement with you.


I wonder if they have some mechanism for rebuilding the RAID array periodically to handle the inevitable undetected read errors. With RAID 6 you can actually do this, by comparing both parity blocks with the actual data.
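As a rough illustration of the idea, here is a minimal sketch of checking data against two parity blocks. It assumes the common textbook construction (P is the XOR of the data blocks, Q is a weighted sum in GF(2^8) with polynomial 0x11d and generator 2); the function names are made up for this example, and it is not meant to describe how any particular RAID implementation, or Facebook, does scrubbing.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) using the polynomial 0x11d."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow2(n):
    """Return the generator 2 raised to the n-th power in GF(2^8)."""
    value = 1
    for _ in range(n):
        value = gf_mul(value, 2)
    return value


def parity(blocks):
    """Compute P (plain XOR) and Q (weighted XOR) for equal-length data blocks."""
    length = len(blocks[0])
    p = bytearray(length)
    q = bytearray(length)
    for index, block in enumerate(blocks):
        coeff = gf_pow2(index)
        for pos, byte in enumerate(block):
            p[pos] ^= byte
            q[pos] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def scrub(blocks, stored_p, stored_q):
    """Return None if everything agrees; otherwise "P", "Q", or the index
    of the data block that is inconsistent (assuming only one block is bad)."""
    p, q = parity(blocks)
    for pos in range(len(p)):
        sp = p[pos] ^ stored_p[pos]  # mismatch seen through P
        sq = q[pos] ^ stored_q[pos]  # mismatch seen through Q
        if sp == 0 and sq == 0:
            continue
        if sq == 0:
            return "P"  # data and Q agree, so the stored P is bad
        if sp == 0:
            return "Q"  # data and P agree, so the stored Q is bad
        # A bad data block i satisfies sq == 2**i * sp in GF(2^8).
        for index in range(len(blocks)):
            if gf_mul(gf_pow2(index), sp) == sq:
                return index
        raise ValueError("more than one block appears to be inconsistent")
    return None
```

With a single parity block you could only tell that something is wrong; the second, differently weighted parity is what lets the check point at which block is wrong, under the assumption that only one block in the stripe is bad.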

Hard disks typically have undetected read error rates of 1 bit in 1E15, so assuming they transfer 1PB per day that's about 9 errors per day. Which isn't much I guess, but I wonder if they do anything about it.
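The arithmetic behind that estimate can be written out as a small sketch. The error rate and the 1 PB/day figure are the assumptions stated in the comment above, not measured numbers, and the function name is only illustrative. Whether "PB" means 10^15 or 2^50 bytes is not stated, so both are shown: the decimal reading gives about 8 errors per day and the binary reading about 9.

```python
# Back-of-envelope estimate of undetected read errors per day.
# Assumptions taken from the comment, not measurements:
#   - one undetected bit error per 1e15 bits read
#   - 1 PB transferred per day
BITS_PER_BYTE = 8
UNDETECTED_ERRORS_PER_BIT = 1e-15


def expected_errors_per_day(bytes_per_day):
    """Expected number of undetected bit errors for a given daily transfer volume."""
    return bytes_per_day * BITS_PER_BYTE * UNDETECTED_ERRORS_PER_BIT


DECIMAL_PB = 10 ** 15  # petabyte as 10^15 bytes
BINARY_PB = 2 ** 50    # petabyte as 2^50 bytes (a pebibyte)

print("decimal PB/day:", expected_errors_per_day(DECIMAL_PB))
print("binary PB/day: ", expected_errors_per_day(BINARY_PB))
```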


(Not speaking officially, obviously. I had nothing to do with Haystack anyway).

They're images. So you flip a bit every once in a great while; as long as the bit isn't part of the image metadata, no human user will notice. The next time they load the image (assuming it's been long enough for it to be evicted from the CDN), they see the right data.


Here's a Flowgram presentation from last year, "Needle in a Haystack", which covers these updates. http://www.flowgram.com/f/p.html#2qi3k8eicrfgkv


Why bother putting the haystack in a filesystem? Just use a raw partition.

Haystack effectively IS a filesystem - a log structured filesystem.
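To make "log-structured" concrete, here is a toy sketch of the general pattern: blobs are only ever appended to one large file, and an in-memory index maps each key to the offset of its latest copy. The class name, record header layout, and file name are all invented for this example; it is not Haystack's actual on-disk format.

```python
import os
import struct
import tempfile


class AppendOnlyStore:
    """Toy log-structured blob store: one big file plus an in-memory index."""

    # Hypothetical record header: key (uint64) and payload length (uint32).
    HEADER = struct.Struct("<QI")

    def __init__(self, path):
        # "a+b" means every write lands at the end of the file.
        self._file = open(path, "a+b")
        self._index = {}  # key -> (payload offset, payload length)

    def put(self, key, data):
        """Append a record; an older copy of the same key is left in place."""
        self._file.seek(0, os.SEEK_END)
        offset = self._file.tell()
        self._file.write(self.HEADER.pack(key, len(data)))
        self._file.write(data)
        self._file.flush()
        self._index[key] = (offset + self.HEADER.size, len(data))

    def get(self, key):
        """Read a blob with one seek and one read, using the in-memory index."""
        offset, length = self._index[key]
        self._file.seek(offset)
        return self._file.read(length)

    def close(self):
        self._file.close()


def demo():
    """Store two blobs, overwrite one, and return the latest copy of key 1."""
    path = os.path.join(tempfile.mkdtemp(), "store.dat")
    store = AppendOnlyStore(path)
    store.put(1, b"photo-one")
    store.put(2, b"photo-two")
    store.put(1, b"photo-one-v2")  # appended; the index now points here
    latest = store.get(1)
    store.close()
    return latest
```

Overwrites never touch old bytes, so stale records pile up until something compacts the file. This sketch leaves out compaction and rebuilding the index after a restart.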


I think the Haystack folks made the right decision here. A rule of thumb is that it takes 3-5 years for a novel on-disk format to gel. There are always random bugs that rely on particular sequences of allocations/deallocations, races, etc. Using an off-the-shelf filesystem that allocates the blocks and then stays out of the application's way, like XFS, probably saved these guys at least one year of screwing around.


raw io? facebook better go ipo soon, they're running out of abstraction layers to code through


Even if Facebook ends up not making any money, its contribution to the next generation of software engineering will be important. Solving major scaling issues makes Facebook a full-scale lab for tomorrow. They will set standards for how to approach some unique problems.



