Byzantine Reality

Searching for Byzantine failures in the world around us

Thoughts on BigTable

Now that we’ve looked at the Google File System and MapReduce, we come to the next paper on our reading list: BigTable. It comes to us two years after the MapReduce paper and presents an interesting new step in databases. Like MapReduce, it is another notable paper and takes a very familiar programming construct and revamps it to new heights. Let’s see if I can whet your appetite with a spoiler:

Maybe a better name for it is Big Hash Table.

Now that I have your interest, let’s take a step back and have a quick history lesson. To this point, pretty much all databases I’ve run into have fallen into one of three categories:

  1. Relational: If you’ve ever used a database, you likely have used a relational database (MySQL and friends). No need to expand much here.
  2. Hierarchical: Certainly not as popular or well-known as relational databases, but hierarchical databases like LDAP provide a great way to fetch and store your data if you can organize them in a tree-like structure.
  3. Object-oriented: Can’t say I know a whole lot about these. I hear they were quite the rage back in the ’90s, since an object-oriented database was supposed to be a perfect fit for an object-oriented language. But for some reason or another it just didn’t work out, and two new technologies supplanted it that are the new hot thing. Rails has made Object-Relational Mapping (ORM) popular for their users, while C# has Language Integrated Query (LINQ).

So from above we can begin to see a trend: each database type seems to be modeled after some type of data structure. In order, we look like we’ve got a two-dimensional array (more or less), a tree, and an object. The notable exception from the data structures you would learn about in a basic data structure class is thus the hashtable, so BigTable’s entry in the field is not as surprising as people seem to give it credit for. Now, let’s see what makes BigTable stand out.

As we’ve hinted before, BigTable is a relatively large hash table. It’s simple to read from and write to, as you are restricted to standard hashtable operations (get, put, delete). You still have rows and columns, and data items also have a timestamp associated with it, so that versioning is automatic. BigTable then garbage collects older versions as specified by the user (e.g., garbage collect everything but the most recent three versions of all items).

One big difference between BigTable and relational databases is how transactions are handled. Row-level transactions are possible in BigTable, but not any other type of transactions. The paper states that this feature may make it confusing for programmers new to BigTable to use, but that apparently this is a minor obstacle to writing much more scalable code.

The unsung hero of the day that really makes BigTable shine, however, is Chubby. Chubby is a lock-providing service that is highly available and allows many difficult design decisions to be abstracted away from BigTable. When a BigTable server (master or slave) comes up, it reserves a lock through Chubby with its name and path, and renews the lock periodically. This makes it trivial for the master to determine which nodes are up and down, as it can just query Chubby to see who has locks (thus these nodes are up). Since the BigTable master also knows the locations of the slaves, it can trivially determine if Chubby is down (it would be unable to see who has locks in Chubby but still be able to contact them). But since Chubby is the next paper on our reading list, let’s not say too much more about it right now.

Another reason BigTable shines so brightly for Google is the infrastructure it’s been built on, namely the last two papers we’ve looked at. By placing BigTable on the Google File System, the database automagically gains three-copy replication and fault tolerance. Furthermore, since the file system is already geared towards processing huge amounts of data, the database now doesn’t suffer from doing so (in fact it performs even better on it). BigTable adds onto this by allowing certain columns in a table to be split up from other columns, speeding up access time on queries for only those columns. The example given in the paper is that if the table contains web pages, maybe we can put meta-data in one column and the pages contents in another column. Then, clients that only need the page’s meta-data are able to save a lot of time versus querying all the columns.

BigTable also gains substantially from being able to use MapReduce. Now complex operations can be performed against this huge repository of data with many many machines and can be done so reliably (that is, it recovers from failures). This seems to be the primary reason why BigTable doesn’t mention any ability to perform joins or any other standard relational database features (as MapReduce makes it simple enough to do so on their own).

So as always, we end up with a common question: how can I get my hands on this to mess around with? And as usual, you can’t get it, but there are some open-source versions you can fiddle around with. Just asHadoop puts out the Hadoop Distributed File System (open source Google File System) and Hadoop Map Reduce (open source Google Map Reduce), they also put out HBase, an open source implementation of BigTable. It also looks to be the most stable and easy to get going of all the open source BigTable implementations. There’s also HyperTable, but we’ve had orders of magnitude more difficulty getting it up and running than HBase, so I’d recommend HBase over it at this point in their development. And as usual, if you’re up to any kind of work in this area, drop a line and let all of us know!