SCALE5x: Talk summary of Admin++, what root never told you
So I'm at SCALE5x, listening to Ron Gorodetzky talk about what he learned doing sysadmin work for Digg and Revision3 (who try to be an "Internet television network"; in effect, they distribute loads of big files). I already knew most of the tools he mentioned, but it was nice to get an independent "hey, I think this is good" review of them. Here's what I took home from his talk:
He really thinks highly of the OSCON 2005 talk LiveJournal's Backend (A history of scaling) (PDF).
Reading between the lines, I understood that Revision3 has outsourced their big bandwidth use -- the CDNs he mentioned by name were Cachefly (the color scheme hurts even my eyes, and real designers think I'm colorblind), BitGravity (caution: hideous Flash site) and of course Akamai.
He stressed the importance of setting up KVMs etc. properly for the data center.
Set up your infrastructure and plan for scaling before you get popular, because you will be too busy to do it afterwards. That's nice, I like building things to be scalable from scratch.
Specific infrastructure management tools:
As usual, I haven't yet seen anything that would actually seem to work in the real world, unless you give up everything you already have (like package management etc), and do things 100% their way.
His suggestion: as the tools are based on very different worldviews, look at everything and try to pick the one that matches your opinions.
One thing he wouldn't skimp on: "Don't skimp on RAM."
At Revision3, they use long-life server hardware and don't upgrade the servers; instead, they go for a full new deployment.