Monday, February 9, 2009

Google App Engine, or How I Learned to Stop Worrying and Love JavaScript (Part I)

It has been over a week since my last post, so things are not exactly getting off to a rip-roaring start here. Rather than dwell on the past, though, here's an entry on some experimenting I've done recently with Google App Engine.

There is much hype about 'the cloud', and like a lot of hype, most of it is not necessarily worth the paper it is or isn't printed on. However, as a programmer and not a computer guy (to some of you, this will make sense), the idea of abstracting away the server provisioning process is not without appeal. As a tightwad, the idea of 'only paying for what you use' is not without appeal. Finally, as an extreme tightwad, the idea of free (which Google App Engine is, up to 10 apps) sold me. I'm not starting a business here, I'm just getting my feet wet.

As for what to do: I am a huge fan of the website blip.fm (the Twitter-length pitch: 'It's like Twitter, but for music'), and not long ago they released an API. It is currently in private beta, where it has been for a while now. At any rate, I got my keys, rewrote the sample PHP wrapper for the API in Python, and I was ready to go. At least I thought I was.

At this point I would like to praise the Google App Engine Launcher for Mac OS X (and, indirectly, praise Mac OS X). This was put together by John Grabowski at Google in his 20% time, so apparently that is not a Google urban legend like the one about Sergey or Larry sometimes giving an underperformer a brand new Prius on the sole condition that they 'drive away, far away'. In this case, using the Launcher made it quick to get something up and running on the development server. It's very intuitive, and the interactive console is handy.

Not so fast, pal

After a bit of hacking I had something fairly simple up and running which would grab info for a user via the Blip API, then show the user via an intensity map what the global distribution of 'listeners' and 'favorites' is. For this I used the lovely Visualization API from Google. It also showed who you are following that is not following you, which is essential info not easily obtainable via Blip's website.
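The "following but not followed back" check boils down to a set difference. Here is a minimal sketch in plain Python; the function name and the sample usernames are made up for illustration, and it assumes you have already pulled both lists out of the API as simple lists of usernames.

```python
def not_following_back(following, followers):
    """Return the users in `following` who are absent from `followers`, sorted."""
    return sorted(set(following) - set(followers))


# Hypothetical sample data; real lists would come from the API wrapper.
following = ["alice", "bob", "carol"]
followers = ["bob", "dave"]
print(not_following_back(following, followers))
```

Using sets keeps the comparison cheap even for users with thousands of listeners.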

The problem I encountered right off the bat was that queries against the API take time, and there currently isn't a good way to get just the subset of info you want. On the development server, requests could take a while, especially for users with thousands of listeners (such people exist).

Curious as to how I'd fare on the real thing, I deployed (a one-click operation w/ the GAE Launcher) the app to appspot.com. At this point all hell broke loose.

Read the fine print

It turns out GAE has its share of limitations. (For more Beavis and Butt-Head immature laffs, check out the Google App Engine Backup and Restore tool, aka GAEBAR.) Beyond the ones you may have heard about (no background/batch processes, no mischief with sockets, etc.), there are a couple of tight limits I should have looked into before deploying:
  • If a call to urlfetch takes more than 5 seconds, you lose: it times out.
  • If your request takes more than 10 seconds, you lose: 'DeadlineExceededError'.
(It is worth noting here that, according to this release from Google today, some of these limitations will vanish in the next 6 months.)
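One way to live with a hard per-request deadline is to do only as much work as fits, then hand back whatever is left for a later request. The following is a rough sketch in plain Python 3, not App Engine code: the function name, the 2-second safety margin, and the injectable clock are all my own assumptions, and only the 10-second figure comes from the limits above.

```python
import time

REQUEST_BUDGET_SECONDS = 10.0  # the per-request limit quoted above
SAFETY_MARGIN_SECONDS = 2.0    # assumption: headroom left for building the response


def process_within_budget(items, work, budget=REQUEST_BUDGET_SECONDS,
                          margin=SAFETY_MARGIN_SECONDS, clock=time.monotonic):
    """Apply `work` to each item of a list until the time budget is nearly spent.

    Returns (results, remaining) so the caller can resume with `remaining` later.
    """
    start = clock()
    results = []
    for index, item in enumerate(items):
        # Stop before starting another item once we are inside the safety margin.
        if clock() - start >= budget - margin:
            return results, list(items[index:])
        results.append(work(item))
    return results, []


# Fast work finishes well inside the budget, so nothing is left over.
done, left = process_within_budget([1, 2, 3], lambda n: n * 2)
print(done, left)
```

Note that this only checks the clock between items; a single slow item can still blow the deadline, which is why the per-call urlfetch timeout matters too.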

In addition, there is a limit on the CPU that can be consumed by a single request. So even if you can fetch your data quickly, if you do too much crunching per request you are going to violate a quota and start getting errors. (The specifics of this limit: a 'high CPU' request is one consuming 0.84 CPU seconds, with the CPU in this case being a 1.2 GHz Intel x86. You are allowed 2 of these per minute.)

Thus, I needed to factor in that there'd be a lot of retries going on, but I would not be doing those retries within a request. The inescapable conclusion was that I'd have to use the back end as a simple data store, and rely on JavaScript on the front end to handle the retries, put the pieces together, boil down the data, and pump out the results. Also, I'd obviously need a progress indicator to keep the poor end-user updated as things proceeded, rather than leave them hanging.
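To make the shape of that design concrete, here is a small simulation in plain Python 3. It is only a sketch: the real client side would be JavaScript in the browser, and every name here (PageStore, handle_request, client_collect, flaky_fetch) is hypothetical rather than part of App Engine or the Blip API. Each "request" either serves a cached page or attempts exactly one fetch, and the caller owns the retries and the progress reporting.

```python
class PageStore:
    """In-memory stand-in for the back-end data store (hypothetical)."""

    def __init__(self):
        self._pages = {}

    def get(self, key):
        return self._pages.get(key)

    def put(self, key, value):
        self._pages[key] = value


def handle_request(store, fetch_page, user, page):
    """One short request: serve a cached page or try a single fetch."""
    key = (user, page)
    cached = store.get(key)
    if cached is not None:
        return {"status": "ok", "data": cached}
    try:
        data = fetch_page(user, page)
    except TimeoutError:
        # No retrying inside the request; tell the caller to come back.
        return {"status": "retry"}
    store.put(key, data)
    return {"status": "ok", "data": data}


def client_collect(store, fetch_page, user, total_pages, max_attempts=5, report=print):
    """Simulate the front-end loop: retry each page and report progress."""
    collected = []
    for page in range(total_pages):
        for _ in range(max_attempts):
            response = handle_request(store, fetch_page, user, page)
            if response["status"] == "ok":
                collected.extend(response["data"])
                break
        else:
            raise RuntimeError("page %d failed after %d attempts" % (page, max_attempts))
        report("%d/%d pages loaded" % (page + 1, total_pages))
    return collected


calls = {}


def flaky_fetch(user, page):
    """Fake API call that times out on the first attempt for each page."""
    calls[page] = calls.get(page, 0) + 1
    if calls[page] == 1:
        raise TimeoutError("simulated slow API call")
    return ["%s-listener-%d" % (user, page)]


print(client_collect(PageStore(), flaky_fetch, "demo", 3))
```

The point of the split is that no single request ever does more than one slow thing, so each stays under the limits, while the progress messages fall out of the client loop for free.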

At this point I will leave you, the reader, hanging until next time, when we get into the JavaScript side of things in Part II.
