Notes from SCALE12x

(Sidenote: this little blog engine had bitrotted pretty bad.. I reimplemented it with markdown, go & bootstrap, and it's much more pleasant to work with now. Time for new content!)

I spent Saturday & Sunday at the Southern California Linux Expo (SCALE), and here's my very personal report of how I experienced it.

SCALE is not your typical tech conference; it brings in very diverse groups of people. The organizers actively reach out to e.g. kids who are at that "might grow up interested in things" age. Just about every age group, techie background, and personal interest is present -- the only real common theme is Linux (and a few BSD-based vendors trying to sell their gear). Of course this means that SCALE won't ever serve my desires perfectly -- but it serves the community well, and the feel of the conference is very friendly and engaging.


First of all, I was too busy to go on Friday, and the streaming video had some sort of audio codec trouble, so I won't comment on the content of the devops day. What I will say is that I'm impressed by the strength of the devops presence at SCALE. It's becoming a significant backbone of SCALE, year by year. Kudos to the organizers. And they're at it all year long -- the local ops-oriented meetups have a great community going. Heartily recommended, whether you carry a pager or not. Also see hangops.

SCALE also hosted another sub-event on Friday called Infrastructure.Next, @infranext. It looked interesting, though I fear an overpresence of Red Hat and a vendor agenda. I'm still waiting for slides and/or video of How to Assemble A Cutting Edge Cloud Stack With Minimal Bleeding. (The archived live streams for all three days are useless because of audio problems.)

I also missed Greg Farnum's talk on Ceph. I worked at Inktank for almost two years, and this technology is one of a kind, and a good indicator of which direction the future lies in. If you deal with >20 machines, you should definitely take time to look into Ceph.


Saturday started off with a talk about SmartOS vs Linux performance tooling (slides). There wasn't much new there this time around, but Brendan is a good speaker, and SmartOS is probably the most serious server-side alternative to Linux I'd personally consider these days, so it's good to keep tabs on what they've been working on.

My interests drew me next to the talk about Presto (slides). Takeaways:

  • batch and interactive systems have fundamentally different needs, e.g. for monitoring grace periods, how and when maintenance can be performed; they require a different ops culture.
  • Dain shared background on Facebook's internal networking challenges, and how data center power limits forced them to essentially trade off other servers for Presto servers, to avoid network bottlenecks.
  • Presto is integrating the BlinkDB research on approximate queries; e.g. accepting <10% error in exchange for 10-100x faster queries sounds like a very good trade-off.
  • many "big data" stores don't store enough statistics about index hit rates to guide query planning.

I'm sad I missed Beyond the Hypervisor (slides) due to a schedule conflict.

The OpenLDAP talk (slides) was really largely about LMDB, and that's what I came for. LMDB is a library that implements a key-value store, with an on-disk B-tree where read operations happen purely through a read-only mmap. This is a really nice architecture, pretty much as good as a B-tree gets -- that is, it's probably happiest with read-mostly workloads, and probably at its worst with small writes to random keys. Pretty much the opposite of LevelDB, there. I wish the benchmarks were less biased, but that seems to be the unavoidable nature of benchmarking. LMDB has a lot of the kind of mechanical sympathy that may remind you of Varnish: all aspects of caching are offloaded to the kernel, and data can be accessed in zero-copy fashion because the read-only mmap prevents accidents. For Go programmers, Bolt is a reimplementation of the design in pure Go, avoiding the Cgo function call overhead, and offering a much nicer API than the direct wrapper szferi/gomdb. My quick microbenchmarks say that, when used from Go, Bolt can be faster.

Next up was High volume metric collection, visualization and analysis. If I could take back those 20 minutes, I would.

I spent the rest of the day catching up with old friends and making new ones.


Clint Byrum is now at HP and working on TripleO, a project that aims to make OpenStack do bare-metal deploys, and then run a public-facing OpenStack on top of that. His talk was a good status report (slides), but in situations like this I always end up wanting more details.

For the next slot, I bounced between three different talks, not 100% happy with any one of them. First, Hadoop 2 (slides) was an intro to YARN et al that started off like an apologist "I swear Hadoop and Java don't really suck as much as they seem to". Mark me down as unconvinced.

Second, Configuration Management 101 was a good effort from a Chef developer to be party neutral, and talked about the common things you find in all the common CM frameworks. His references to promise theory are pretty much dead on, and in the 3 years since I momentarily fiddled with this, my thoughts have gone more and more toward treating distributed CM as an eventual consistency problem. With Juju-inspired notifications about config changes, using more gossip & vector clock style communication to update peers on e.g. services provided, this might result in something very nice. That one is definitely on the ever-growing itches to scratch list.
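
For anyone who hasn't met vector clocks, here's a minimal sketch in Go of the bookkeeping involved. It isn't taken from any CM tool; the `VectorClock` type, its methods, and the node names `web1` and `db1` are all hypothetical, and a real system would add gossip transport and conflict resolution on top.

```go
package main

import "fmt"

// VectorClock maps a node name to the number of updates seen from it.
type VectorClock map[string]uint64

// Ordering is the causal relationship between two clocks.
type Ordering int

const (
	Equal Ordering = iota
	Before
	After
	Concurrent
)

func (o Ordering) String() string {
	return [...]string{"equal", "before", "after", "concurrent"}[o]
}

// Tick returns a copy of vc with node's counter incremented.
func (vc VectorClock) Tick(node string) VectorClock {
	out := vc.Merge(nil)
	out[node]++
	return out
}

// Merge returns the element-wise maximum of vc and other.
func (vc VectorClock) Merge(other VectorClock) VectorClock {
	out := make(VectorClock, len(vc)+len(other))
	for node, n := range vc {
		out[node] = n
	}
	for node, n := range other {
		if n > out[node] {
			out[node] = n
		}
	}
	return out
}

// Compare reports whether vc happened before, after, or concurrently with
// other. A node missing from a clock counts as zero.
func (vc VectorClock) Compare(other VectorClock) Ordering {
	less, greater := false, false
	for node, n := range vc {
		switch o := other[node]; {
		case n < o:
			less = true
		case n > o:
			greater = true
		}
	}
	for node, o := range other {
		if _, seen := vc[node]; !seen && o > 0 {
			less = true
		}
	}
	switch {
	case less && greater:
		return Concurrent
	case less:
		return Before
	case greater:
		return After
	}
	return Equal
}

func main() {
	// Two nodes change their advertised config without hearing from each other.
	web := VectorClock{}.Tick("web1")
	db := VectorClock{}.Tick("db1")
	fmt.Println("web vs db:", web.Compare(db))

	// After gossip, web1 has seen db1's change and makes another of its own.
	merged := web.Merge(db).Tick("web1")
	fmt.Println("db vs merged:", db.Compare(merged))
}
```

The interesting case is `Concurrent`: that's where two peers changed things independently, and the CM layer has to decide how to converge.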

Third, Seven problems of Linux Containers was an OpenVZ-biased look into remaining problems. Some of it was a bit ridiculous -- who says containers must share a filesystem, just mount one for each container if you want to -- and some of it was just too OpenVZ-specific to be interesting. Still, a good topic, and OpenVZ was groundbreaking work.

For the next slot, I returned from lunch too late to fit in the packed rooms, and enjoyed breathing too much to try harder. I watched three talks, mostly from open doorways. The hotel's AC was not really keeping up anymore at this point, and only the main room was pleasant to be in.

Big Data Visualization left me wishing that 1) it wasn't fashionable to say "big data", 2) he'd shown more visualizations, and 3) he'd talked about the hard parts.

ZFS 101 (slides) is interesting to me mostly to see what people think about & want from storage. Btrfs is really promising in this space, feature-wise; it still has implementation trouble like IO stalls, but the integrated snapshots and RAID are just so much more useful and usable than any combination of hardware RAID, software RAID, and LVM. Snapshots really need to be a first-class concern. So far, my troubles with Btrfs are of a magnitude completely comparable to my troubles with the combination of LVM, LVM snapshots, HW-RAID cards dying, and SW-RAID1 sometimes booting the drive that was meant to be disabled. All in all, I find the "not yet stable" argument a bit boring; there's a whole lot of code and complexity in Btrfs, but it also removes the need for a whole lot of other kinds of code and complexity. If nothing else, the ZFS/Btrfs feature set should be a design template for future efforts; I understand e.g. F2FS has a very specific design goal (think devices rather than full computers), but not supporting snapshots in a new filesystem design is a bummer.

And finally, I spent time in Jordan Sissel's fpm talk. fpm is a tool that converts various package formats into other package formats, a lot like Alien. Jordan's viewpoint on this is a frustrated admin who just wants the damn square peg to fit in the round hole, and fpm is the jigsaw & hammer that'll make that happen. I fundamentally disagree with him about the role of packaging; the whole point of packaging is destroyed if the ecosystem has too many bad packages, and the reason e.g. Debian packaging can be a lot of work is not because cramming files in an archive should be hard, but because making all that software work together and upgrade smoothly actually is a difficult problem. But Jordan is an entertaining speaker, and his point is valid; there are plenty of cases where you don't care about the quality of the resulting package. Just.. please don't distribute them, ok?