Wednesday, May 20, 2009

Up next: a survey of the 9 bazillion databases in the world

Something's Happening But You Don't Know What It Is, Do you, Mr. Jones?

High on the list of a technology professional's worst fears is becoming out of touch with technology and being left behind. Like death, it's going to happen one day no matter what you do, but as with death, one can 'rage against the dying of the light' and make every effort to stave it off as long as possible.

I have been a database guy for a number of years, either as a developer using the database, or one of the guys designing the database, or even sometimes setting up and acting as an administrator for database servers. Pretty much every database system I've ever dealt with was an RDBMS; in other words, it followed the relational model outlined by E.F. Codd in 1970 and misunderstood by the majority of computer professionals ever since.

Monstrously large companies like Amazon and Google, with monstrously large data sets and extreme scaling needs, have in recent years run up against the 'ceiling' of what RDBMSs can do, and in some cases they are willing to trade off some of the letters in ACID (atomicity, consistency, isolation, durability) to meet other needs. Their work is starting to make its way into the outside world: anybody can read the BigTable paper, Google has opened up Google App Engine (as mentioned in previous entries), and open-source projects like Hadoop and its BigTable-modeled sibling HBase have made BigTable-like storage available to anyone with the intellectual and budgetary resources to take it on.

At the other extreme are the very tiny and lightweight databases like SQLite, for cases where a full-blown database system is overkill but a more-or-less standard interface to your data is still great to have.
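
To make the 'more-or-less standard interface' point concrete, here is a minimal sketch using the sqlite3 module that ships with Python. The servers table, its columns, and the rows are all made up for illustration, and the database lives in memory so nothing touches disk.

```python
import sqlite3


def sticker_report(min_stickers):
    """Return (name, stickers) rows for servers with at least min_stickers stickers."""
    # ":memory:" keeps the whole database in RAM; a filename would persist it instead.
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema, invented for this example.
        conn.execute("CREATE TABLE servers (name TEXT PRIMARY KEY, stickers INTEGER)")
        conn.executemany(
            "INSERT INTO servers (name, stickers) VALUES (?, ?)",
            [("web01", 3), ("db01", 0), ("batch01", 7)],
        )
        conn.commit()
        # Plain SQL with a bound parameter, much as you would write it for a big RDBMS.
        cur = conn.execute(
            "SELECT name, stickers FROM servers WHERE stickers >= ? ORDER BY name",
            (min_stickers,),
        )
        return cur.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(sticker_report(1))
```

No server to install and no accounts to set up, but the queries themselves would look familiar to anyone who has used one of the big databases.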

So You Put 2 and 2 together, and got 5?

In future entries I intend to do more of a deep dive into these alternative options to boring old suit-and-tie, plastic-fantastic-wall-street-scene databases like SQL Server, Oracle, and DB2. But first a few cautionary words.

To predict the death of the relational model is likely premature and ill-advised as well. As I mentioned earlier, and have observed over the past 15 or so years, many (almost all?) IT professionals manage to go from 'hello world' to the executive suite without knowing or understanding much of anything about the databases that, you know, store and access the data: the thing distinguishing your business from every other business in the world (although maybe you also put cool stickers on your servers). A programmer might learn to regurgitate definitions of first, second and third normal form, and know the difference between an inner and outer join, and 'Congratulations! You got the job, kid.' (It worked for me at least once in my more ignorant days.) A lot of times, 'I'm running into limitations in the relational model' can be translated as 'Huh. I didn't know you could do it that way.'
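
For anyone who wants the interview answer in runnable form, here is a small sketch of the inner versus outer join difference, again using Python's built-in sqlite3 module. The customers and orders tables and their contents are hypothetical, invented only for this example.

```python
import sqlite3


def join_demo():
    """Return (inner_rows, left_outer_rows) for two tiny made-up tables."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical data: Ann has one order, Bob has none.
        conn.executescript("""
            CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
            CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
            INSERT INTO customers VALUES (1, 'Ann');
            INSERT INTO customers VALUES (2, 'Bob');
            INSERT INTO orders VALUES (10, 1, 'widget');
        """)
        # Inner join: only customers that have a matching order.
        inner = conn.execute(
            "SELECT c.name, o.item FROM customers c "
            "JOIN orders o ON o.customer_id = c.id ORDER BY c.name"
        ).fetchall()
        # Left outer join: every customer, with NULL (None) where no order matches.
        outer = conn.execute(
            "SELECT c.name, o.item FROM customers c "
            "LEFT OUTER JOIN orders o ON o.customer_id = c.id ORDER BY c.name"
        ).fetchall()
        return inner, outer
    finally:
        conn.close()


if __name__ == "__main__":
    inner_rows, outer_rows = join_demo()
    print(inner_rows)
    print(outer_rows)
```

The inner join drops Bob entirely; the outer join keeps him and fills in the missing order with a NULL.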

Outside the walls of the IT ivory basement, the situation is even worse. The not particularly rigorous but quite prolific writer Robert X. Cringely recently predicted the death of SQL in this column.

There are some cringe-worthy howlers within:
  • It's SQL, not Sequel (but maybe he was just being clever)
  • SQL is the language, not the database (and hard-core relational theorists will talk yr ear off about where SQL diverges from the relational model, if you aren't careful)
  • Given the fumbling, Keystone Kops way a lot of shops handle the 'you install it, turn it on, and it runs itself' SQL Server, do you really want to turn something as game-changing as BigTable loose on them? That'd be like teaching a kindergarten class how to make Greek Fire.

All that said, going outside the playground perimeter of your day job and trying to learn completely new things is a good way to keep your brain from freezing, and this stuff is just interesting (to me) anyway. So next time, more about these new (or not so new) database systems that will preserve our way of life, eliminate the need for people to do tedious work of any kind, and bring about Ray Kurzweil's fabled Singularity.

As @iamdiddy would say: LET'S GOOOOOOOOO!!!1!

Monday, May 4, 2009

I guarantee you: no more music by the suckas.

As mentioned earlier, I am a fan of Blip. Right now I have 364 listeners, which is not shabby but not necessarily fantastic, either (I'm this guy). Blip makes it easy to add people to follow: specifically, if you blip The Gap Band, you're presented with a list of 5 or so people who also blipped The Gap Band, and with one click you can add them all.

This is not necessarily a bad idea, but these people add up fast, and horrendous musical whiplash can result as you find out somebody who blipped Sigur Ros once really really loves an abomination like 80s era Aerosmith. Even worse, a lot of these people won't be bothered to reciprocate: they'll have 5,000 listeners, but only 74 favorites. Perhaps they're discerning, but it just seems rude to me, and annoying.

Anyhow, I had accumulated 500+ favorite DJs, and many of them were deadbeats who weren't reciprocating. So with the handy API, I wiped them all out like in that revenge montage from the first Godfather movie. Very, very minimal Python was required:
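# BlipConnection comes from the Blip API client; its import line is not shown in this listing.
from time import sleep

def getNoRecip(bconn, username, length):
    """Return (favorites who don't listen back, listeners you haven't favorited)."""
    lisses = bconn.user_getListeners(username, 0, length)
    faves = bconn.user_getFavoriteDJs(username, 0, length)
    lisNames = [l["urlName"] for l in lisses]
    favNames = [f["urlName"] for f in faves]
    noRecip = [f for f in favNames if f not in lisNames]
    youNoRecip = [l for l in lisNames if l not in favNames]
    return (noRecip, youNoRecip)

if __name__ == "__main__":
    deaders = open("killed.log", "a")
    deaders.write("Getting ready to wax some chumps.\n")
    # need REAL password
    # (the next two lines are cut off in this listing; the missing arguments are left as-is)
    blipConn = BlipConnection(username = 'SoundSystemSDC',
    (dudes, youDiss) = getNoRecip(blipConn,
    for dude in dudes:
        print "Killin' wack punk:\t%s\n" % dude
        # (the API call that actually removes the favorite is missing from this listing)
        sleep(5)  # pause between requests to be nice to the API
        deaders.write("Killed: %s\n" % dude)
    print "That's %i shifty dudes." % len(dudes)
    deaders.write("killed %s all told.\n" % len(dudes))
    deaders.close()
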
I put the sleep(5) in there to be nice, although it really wasn't that many requests all told.
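
The reciprocity check itself can be tried without touching the API. The sketch below is a variation, not the script I ran: it swaps the list scans for set differences and takes plain lists of records instead of a connection. The "urlName" key is the one used above; the function name and the sample records are made up for illustration.

```python
def split_by_reciprocity(listeners, favorites):
    """Return (favorites who don't listen back, listeners you haven't favorited).

    Both arguments are lists of dicts carrying a "urlName" key.
    """
    lis_names = set(l["urlName"] for l in listeners)
    fav_names = set(f["urlName"] for f in favorites)
    # Set difference replaces the "x not in list" scans; sorting keeps the
    # output order stable, since sets have no order of their own.
    no_recip = sorted(fav_names - lis_names)
    you_no_recip = sorted(lis_names - fav_names)
    return no_recip, you_no_recip


if __name__ == "__main__":
    # Invented sample data standing in for the API responses.
    listeners = [{"urlName": "alice"}, {"urlName": "bob"}]
    favorites = [{"urlName": "bob"}, {"urlName": "carol"}, {"urlName": "dave"}]
    deadbeats, neglected = split_by_reciprocity(listeners, favorites)
    print(deadbeats)
    print(neglected)
```

With 500 or so names the speed difference is irrelevant, but the set version states the intent more directly: favorites minus listeners. Note that it drops duplicates and the original ordering, which the list version preserves.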

It was quick and dirty, and kind of handy. Now it'd be nice to have a Greasemonkey script to filter out Blips from Aerosmith (and other unfavorite artists).