Friday, January 23, 2009

Cloud Connect Conference - Thursday

I wanted to demonstrate my application running on a real Hadoop cluster on EC2, so I woke up early on Thursday to bring up a three-node cluster using the excellent deployment scripts provided with the hadoop-18 distribution.

At the conference I was stuck in Java jar file hell, trying to build a deployable version of my demo, and did not pay close attention to the speakers. By noon I had finally gotten over that roadblock and had a jar file that would run the entire application on Hadoop. After I showed David my application, he challenged me to integrate it with the Google Maps API, so I spent most of the unconference sessions that preceded the demo session attempting that instead of attending. I was able to get one ZIP code to show on a map in a browser, but a more complete solution eluded me. And so it goes with me: I often get sucked into building things when I should be listening to and interacting with others.

At the demonstration session I gave a brief talk titled "Using Hadoop to invert data - or - How to drive a thumbtack with a pile driver". The program used Axis to extract some account data tuples from the demonstration site. It then used 48 mappers and a single reducer to invert these tuples, using much the same map/reduce algorithm that Google and Yahoo! use to invert the Internet for page rank data. My demo worked and was well received, and I won a nice iTouch for my labors. I thought the conference was useful and informative, and I made a couple of new friends in the process. I'd recommend it to others with a Net Promoter Score of 9.
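
For anyone new to Hadoop, here is a minimal sketch of what "inverting tuples" with map/reduce means, written in plain Python with no Hadoop involved. It is not the demo's code: the (account, ZIP code) record shape, the sample values, and the function names `map_invert`, `shuffle`, `reduce_collect`, and `invert` are all assumptions made up for illustration. The map step swaps each tuple so the value becomes the key, the shuffle step groups by the new key (the framework does this between mappers and reducers), and the reduce step collects everything that shares a key.

```python
from collections import defaultdict


def map_invert(record):
    """Map step: swap the tuple so the old value becomes the new key."""
    account_id, zip_code = record  # hypothetical record shape
    yield zip_code, account_id


def shuffle(pairs):
    """Group emitted (key, value) pairs by key, standing in for the framework's shuffle."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_collect(key, values):
    """Reduce step: gather every value that was emitted under one key."""
    return key, sorted(values)


def invert(records):
    """Run map, shuffle, and reduce in memory over a list of tuples."""
    emitted = (pair for record in records for pair in map_invert(record))
    grouped = shuffle(emitted)
    return dict(reduce_collect(key, values) for key, values in sorted(grouped.items()))


if __name__ == "__main__":
    # Made-up sample data: account -> ZIP code becomes ZIP code -> accounts.
    sample = [("acct-1", "94105"), ("acct-2", "10001"), ("acct-3", "94105")]
    for zip_code, accounts in invert(sample).items():
        print(zip_code, accounts)
```

On a real cluster the map calls would be spread across many mappers and the grouped keys handed to one or more reducers; this sketch only shows the data flow on a single machine.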


Hamilton said...

Did they happen to record your demo? It would be good to see for the Hadoop uninitiated.

