Byzantine Reality

Searching for Byzantine failures in the world around us

Big Data 2010 Workshop

Today Raj and I hit up the Computer History Museum for the Big Data Workshop and had a pretty good time. Here’s the lowdown on the sessions I attended:

  • Cassandra Explained: This session was jam-packed and covered how Cassandra is laid out as well as upcoming features. Interesting features that may be coming to Cassandra include vector clocks instead of timestamps (currently of the ‘long’ data type), possibly ditching SuperColumns (the consensus was that they’re very powerful but too confusing for developers), and including “sloppy quorums” (maybe more on this in a later post).
  • Glue: This session had a lot fewer people and focused on the role of middleware in today’s database systems. A lot of the talk was about systems that use multiple databases concurrently (e.g., MySQL and Cassandra together) and the problems that can come up. Apparently the popular solution for problems like this involves sticking datastore requests in a message queue (ActiveMQ and RabbitMQ immediately come to mind). Definitely not a solution that I had in mind (nor had anyone else I talked to who wasn’t in the session), but it gives interesting food for thought.
  • Graph DBs: This session had a lot of people as well, but I felt it was a missed opportunity. It stayed very high-level and didn’t give me any useful implementation info like the Cassandra session had. The primary questions that I ask about a new NoSQL datastore are typically of the form “How do I run it?”, “What is the replication model?”, and “What’s the relationship between nodes in the system?”, but unfortunately, none of these questions were answered. The talk started off nicely with high-level concepts, but devolved into things that were way too specific and not helpful to me as an actual developer.
  • The Babbage Machine: This was actually really cool. We heard a talk about the history involved and got to see it in action. After that, I wandered around the museum a bit and saw a picture of Garry Kasparov v. Deep Blue that was pretty close to this one:

Since I love chess and computers, I can understand just a small amount of the anguish / stress that Garry Kasparov was under during his epic battle with Deep Blue (of course, I’m far from understanding what was actually going on in his head). Still, it made for a cool exhibit to see the whole history of chess and computers.

  • Limitations and Alternatives to MapReduce: This was actually a really small group, and we just chatted about when to use MapReduce and what other technologies are appropriate. This was nice as well but could have been a bit more technical.
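
A quick aside on the vector clocks from the Cassandra session, for anyone wondering why they might beat a single ‘long’ timestamp. This is just my own minimal sketch of the general idea, not Cassandra’s code or API: the function names and the node names (node_a, node_b) are made up for illustration. Each node keeps its own counter, and comparing two clocks tells you whether one write happened before the other or whether they are concurrent, which a single timestamp can’t express.

```python
# Minimal vector clock sketch. All names here are hypothetical and are
# not taken from Cassandra; a clock is a dict of node name -> counter.

def increment(clock, node):
    """Return a copy of `clock` with `node`'s counter bumped by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a, b):
    """Combine two clocks by taking the larger counter for every node."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def compare(a, b):
    """Return 'equal', 'before', 'after', or 'concurrent' for a relative to b."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


# Two replicas start from the same version and update it independently.
base = increment({}, "node_a")
write_1 = increment(base, "node_a")  # second update handled by node_a
write_2 = increment(base, "node_b")  # update handled by node_b

print(compare(base, write_1))     # base is an ancestor of write_1
print(compare(write_1, write_2))  # neither write saw the other
```

With plain timestamps, one of the two conflicting writes would simply win based on whichever clock read later; with vector clocks the conflict is at least detectable, and merge gives the clock for a reconciled version.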

All in all, it was good times, and fun was had by all. I would have loved to have caught Chris Anderson’s talk on going “beyond the cloud”, as the general consensus was that it was great, but seeing the Babbage Machine and wandering through the museum was totally worth it. I picked up some sweet books as well, so check back later for reviews of those.