Byzantine Reality


Searching for Byzantine failures in the world around us

Articles tagged with 'app engine'

Two Caching Strategies for App Engine apps

Recently I took on a redesign of an old project of mine: Active Cloud DB. It's a Python App Engine app that exposes a REST API to the Datastore, allowing clients written in any programming language to access Google's scalable key-value datastore. However, the web frontend didn't look too hot, and when I saw what Bootstrap could do, I knew I could use it to do justice to Active Cloud DB. So I did just that, and made a new Active Cloud DB with Go App Engine and Bootstrap, and boy does it look a lot nicer now. Of course it's all open source, so feel free to grab the code we'll be talking about today and follow along at home.

Active Cloud DB provides a minimalist REST API for the Datastore, exposing four of the Datastore API's operations: get, put, delete, and query. To speed up these operations, we also throw in two types of caching via the Memcache API: generational caching and write-through caching. The latter is the more familiar, so let's go over that one first. With write-through caching, write operations (put and delete) hit both Memcache and the Datastore, while read operations (get) read the Memcache version, and only hit the Datastore if the data isn't found in Memcache. Here's what the code for a put looks like:

func put(w http.ResponseWriter, r *http.Request) {
 keyName := r.FormValue("key")
 value := r.FormValue("val")

 c := appengine.NewContext(r)

 key := datastore.NewKey("Entity", keyName, 0, nil)
 entity := new(Entity)
 entity.Value = value

 result := map[string] string {
  "error":"",
 }
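 if _, err := datastore.Put(c, key, entity); err != nil {
  // The Datastore write failed, so don't cache a value that was never stored
  result["error"] = err.Error()
  fmt.Fprintf(w, "%s", mapToJson(result))
  return
 }

 // Set the value to speed up future reads - errors here aren't
 // that bad, so don't worry about them
 item := &memcache.Item{
  Key: keyName,
  Value: []byte(value),
 }
 memcache.Set(c, item)

 // The data changed, so invalidate any cached query results
 bumpGeneration(c)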

 fmt.Fprintf(w, "%s", mapToJson(result))
}
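
Both handlers lean on a mapToJson helper that isn't shown in this post. As a rough sketch only, here is one way such a helper could look. The name mapToJSON, the []byte return type, and the fallback error document are my assumptions, not the project's actual code, and the encoding/json import path is the one used by current Go releases.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mapToJSON is a hypothetical stand-in for the post's mapToJson helper.
// It encodes a string map as a JSON object and falls back to a fixed
// error document if encoding fails.
func mapToJSON(m map[string]string) []byte {
	b, err := json.Marshal(m)
	if err != nil {
		return []byte(`{"error":"could not encode response"}`)
	}
	return b
}

func main() {
	result := map[string]string{"error": ""}
	fmt.Printf("%s\n", mapToJSON(result))
}
```

Returning []byte lets the same value be written to the response and stored directly in a memcache.Item without a conversion.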

And here's what the code for a get looks like:

func get(w http.ResponseWriter, r *http.Request) {
 keyName := r.FormValue("key")

 c := appengine.NewContext(r)

 result := map[string] string {
  keyName:"",
  "error":"",
 }

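 // Serve the value straight from Memcache when it's there
 if item, err := memcache.Get(c, keyName); err == nil {
  // Convert the bytes back to a plain string so cached reads return
  // the same value as Datastore reads (no extra quoting)
  result[keyName] = string(item.Value)
  fmt.Fprintf(w, "%s", mapToJson(result))
  return
 }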

 key := datastore.NewKey("Entity", keyName, 0, nil)
 entity := new(Entity)

 if err := datastore.Get(c, key, entity); err == nil {
  result[keyName] = entity.Value

  // Set the value to speed up future reads - errors here aren't
  // that bad, so don't worry about them
  item := &memcache.Item{
   Key: keyName,
   Value: []byte(entity.Value),
  }
  memcache.Set(c, item)
 } else {
  result["error"] = fmt.Sprintf("%s", err)
 }

 fmt.Fprintf(w, "%s", mapToJson(result))
}
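
To see the write-through pattern from put and get without any App Engine dependencies, here is a minimal in-memory sketch. Everything in it (the Store and WriteThrough types, the Evict method, the misses counter) is hypothetical and only stands in for Memcache and the Datastore; it is not code from Active Cloud DB.

```go
package main

import "fmt"

// Store is a hypothetical stand-in for either Memcache or the Datastore.
type Store map[string]string

// WriteThrough pairs a fast cache with a slower backing store.
type WriteThrough struct {
	cache   Store
	backing Store
	misses  int // reads that had to fall through to the backing store
}

func NewWriteThrough() *WriteThrough {
	return &WriteThrough{cache: Store{}, backing: Store{}}
}

// Put writes to the backing store and then to the cache.
func (w *WriteThrough) Put(key, value string) {
	w.backing[key] = value
	w.cache[key] = value
}

// Delete removes the key from both the backing store and the cache.
func (w *WriteThrough) Delete(key string) {
	delete(w.backing, key)
	delete(w.cache, key)
}

// Get reads the cache first. On a miss it reads the backing store and,
// if the key exists there, repopulates the cache for future reads.
func (w *WriteThrough) Get(key string) (string, bool) {
	if v, ok := w.cache[key]; ok {
		return v, true
	}
	w.misses++
	v, ok := w.backing[key]
	if ok {
		w.cache[key] = v
	}
	return v, ok
}

// Evict simulates the cache dropping an entry on its own.
func (w *WriteThrough) Evict(key string) {
	delete(w.cache, key)
}

func main() {
	wt := NewWriteThrough()
	wt.Put("color", "blue")

	v, _ := wt.Get("color") // served from the cache
	fmt.Println(v, wt.misses)

	wt.Evict("color")
	v, _ = wt.Get("color") // falls through and repopulates the cache
	fmt.Println(v, wt.misses)

	v, _ = wt.Get("color") // served from the cache again
	fmt.Println(v, wt.misses)
}
```

The eviction step matters because a cache like Memcache can drop entries at any time, which is why get always needs the fall-through path.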

The careful reader will have noticed that I haven't talked about the query operation at all. This is because the query operation is a bit more complex than the others. The other operations specifically indicate which key they're operating on, while an arbitrary query can operate on any number of items. So to cache query operations, we employ a generational caching strategy. Essentially, we set a generation value in the Datastore (an integer), and whenever a query is performed, we associate the current generation value with it and store the result in Memcache. So a query for "SELECT * FROM Entity" performed on an initially empty database (with generation value 0) could be stored with the key "SELECT * FROM Entity / 0". Whenever a write is performed (a put or delete), we increment the generation value, which means that the next query will look for "SELECT * FROM Entity / 1" instead. Nothing is cached under that key yet, so the query goes to the Datastore, and every result cached under an older generation is implicitly invalidated, which ensures we don't get any stale query data. In our implementation, we are only concerned with a single query right now, so we simplify how we store that key, but the same scheme works for any number of distinct queries. The code for retrieving queries thus looks as follows:

func query(w http.ResponseWriter, r *http.Request) {
 c := appengine.NewContext(r)

 cacheKey := getCacheKey(c)
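 // Only use the cached item on success: on any error, not just a
 // cache miss, item is nil and must not be dereferenced
 if item, err := memcache.Get(c, cacheKey); err == nil {
  fmt.Fprintf(w, "%s", item.Value)
  return
 }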

 q := datastore.NewQuery("Entity")
 result := map[string] string {}
 for t := q.Run(c); ; {
  var entity Entity
  key, err := t.Next(&entity)
  if err == datastore.Done {
   break
  }
  if err != nil {
   // Record the error and stop, rather than looping on a failing iterator
   result["error"] = err.Error()
   break
  }
  keyString := fmt.Sprintf("%s", key)
  result[keyString] = entity.Value
 }

 jsonResult := mapToJson(result)
 item := &memcache.Item{
  Key: cacheKey,
  Value: jsonResult,
 }
 memcache.Set(c, item)

 fmt.Fprintf(w, "%s", jsonResult)
}
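
The query handler relies on getCacheKey, and put relies on bumpGeneration, neither of which is shown above. Here is a small in-memory sketch of the idea behind them, under stated assumptions: the GenCache type, its fields, and the "query / generation" key format are hypothetical and modeled on the description in this post, not taken from the Active Cloud DB source. The "query" here simply lists every stored pair.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// GenCache is a hypothetical in-memory stand-in for the Datastore plus Memcache.
type GenCache struct {
	generation int               // bumped on every write
	data       map[string]string // plays the role of the Datastore
	cache      map[string]string // plays the role of Memcache
	queriesRun int               // how many times the slow query actually ran
}

func NewGenCache() *GenCache {
	return &GenCache{data: map[string]string{}, cache: map[string]string{}}
}

// CacheKey tags a query string with the current generation.
func (g *GenCache) CacheKey(query string) string {
	return fmt.Sprintf("%s / %d", query, g.generation)
}

// Put stores a value and bumps the generation, so that results cached
// under older generations are never looked up again.
func (g *GenCache) Put(key, value string) {
	g.data[key] = value
	g.generation++
}

// Query returns all key=value pairs in sorted order. The query text is
// used only to build the cache key in this sketch.
func (g *GenCache) Query(query string) string {
	cacheKey := g.CacheKey(query)
	if cached, ok := g.cache[cacheKey]; ok {
		return cached
	}
	g.queriesRun++
	pairs := make([]string, 0, len(g.data))
	for k, v := range g.data {
		pairs = append(pairs, k+"="+v)
	}
	sort.Strings(pairs)
	result := strings.Join(pairs, ",")
	g.cache[cacheKey] = result
	return result
}

func main() {
	const q = "SELECT * FROM Entity"
	g := NewGenCache()
	fmt.Println(g.CacheKey(q))

	g.Query(q) // runs the query and caches it under generation 0
	g.Query(q) // served from the cache

	g.Put("a", "1") // bumps the generation
	fmt.Println(g.CacheKey(q))
	fmt.Println(g.Query(q), g.queriesRun)
}
```

Note that the sketch never deletes entries cached under old generations. With a real cache like Memcache that is usually fine, since unused entries are eventually evicted on their own.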

With that, you now know how to use Memcache to cache Datastore accesses in Go. Of course, see our CloudComp paper for more details on the Python implementation and an evaluation. I hope that piqued your interest in Go and App Engine, so get coding!

Go + AppScale

I just got back from Google I/O last week and after hearing about the cool new Go support in App Engine, I set out to get Go working over our implementation of App Engine, namely AppScale. After a bit of hacking I got it working - here's a screenshot of the go-mandlebrot demo:

Living up to their promise as being super cool and open-source, the Go App Engine code is available for people wanting to hack around with it - of course that's me! But I wasn't able to compile the Go App Engine SDK out of the box. Thankfully, it turned out to be pretty straightforward, so here's what to do if you want to do it (or if you're a future version of me who has forgotten how to do it and needs to remember). Enjoy!

Compiling the Go App Engine SDK in Three Easy Steps, each with sub-steps:

Step 1: Install Go

Step 1.1: Install Go's language prerequisites

sudo apt-get install bison ed gawk gcc libc6-dev make
sudo apt-get install python-setuptools python-dev build-essential

Step 1.2: Install Mercurial (skip if you can run 'hg')

sudo easy_install mercurial

Step 1.3: Actually install Go

hg clone -u release https://go.googlecode.com/hg/ go
cd go/src
./all.bash

Step 1.4: Move Go out of your home directory

cd ../..
sudo mv go /usr/local/
export PATH=$PATH:/usr/local/go/bin
export GOROOT=/usr/local/go

Step 2: Install Go's Protocol Buffer Bindings

Step 2.1: Install Protocol Buffers

wget http://protobuf.googlecode.com/files/protobuf-2.4.1.tar.gz
tar zxvf protobuf-2.4.1.tar.gz
cd protobuf-2.4.1
./configure
make
make check
sudo make install
cd ..
rm -rfv protobuf-2.4.1 protobuf-2.4.1.tar.gz

Step 2.2: Install the Go Protocol Buffer Library and Protocol Compiler Plugin

goinstall goprotobuf.googlecode.com/hg/proto
cd $GOROOT/src/pkg/goprotobuf.googlecode.com/hg/compiler
make install

Step 3: Actually compile the Go App Engine SDK

Step 3.1: Get the Go App Engine SDK Source

hg clone https://appengine-go.googlecode.com/hg/ appengine-go

Step 3.2: Compile it

cd appengine-go
make

And that's it! It should compile without problems, and now you have the ability to mess around with your very own Go App Engine SDK! I'm not exactly sure what you'd want to do with it, but I sure know what I do! Stay tuned for more updates from the wacky world of the cloud, now with 100% more Go!

profile for Chris Bunch at Stack Overflow, Q&A for professional and enthusiast programmers