Friday, January 23, 2009

Cloud Connect Conference - Thursday

I wanted to demonstrate my application running on a real Hadoop cluster on EC2, so I woke up early on Thursday to bring up a 3-node cluster using the excellent deployment scripts provided by the hadoop-18 distribution.

At the conference I was stuck in Java jar file hell trying to build a deployable version of my demo and did not pay close attention to the speakers. By noon I had finally gotten past that roadblock and had a jar file that would run the entire application on Hadoop. After I showed David my application, he challenged me to integrate it with the Google Maps API, so I spent most of the unconference sessions that preceded the demo session attempting that and missed them as well. I was able to get one zip code to show on a map in a browser, but a more complete solution eluded me. And so it goes with me: I often get sucked into building things when I should be listening to and interacting with others.

At the demonstration session I gave a brief talk titled "Using Hadoop to invert Force.com data - or - How to drive a thumbtack with a pile driver". The program used Axis to extract some account data tuples from the Force.com demonstration site. It then used 48 mappers and a single reducer to invert these tuples using much the same map/reduce algorithm as Google and Yahoo! use to invert the Internet for page rank data. My demo worked and was well received, and I won a nice iTouch for my labors. I thought the conference was useful and informative, and I made a couple of new friends in the process. I'd recommend it to others with a Net Promoter Score of 9.
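
For readers who have not seen this kind of inversion, here is a minimal sketch of the map, shuffle and reduce steps in plain Python. It is not the Java program from the demo: the sample tuples, the function names and the single-process "shuffle" are all assumptions made for illustration, and a real Hadoop job would spread the mappers across the cluster.

```python
from collections import defaultdict

# Hypothetical sample tuples; the real ones came from the Force.com demo site.
ACCOUNTS = [
    ("Acme Corp", "94043"),
    ("Globex", "94105"),
    ("Initech", "94043"),
]


def map_account(account_name, zip_code):
    """Mapper: emit the tuple with key and value swapped."""
    yield zip_code, account_name


def shuffle(pairs):
    """Group mapper output by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_zip(zip_code, account_names):
    """Reducer: collect every account name seen for one zip code."""
    return zip_code, sorted(account_names)


def invert(accounts):
    """Run map, shuffle and reduce in a single process."""
    mapped = (
        pair
        for name, zip_code in accounts
        for pair in map_account(name, zip_code)
    )
    return dict(reduce_zip(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    for zip_code, names in sorted(invert(ACCOUNTS).items()):
        print(zip_code, names)
```

The mapper does nothing more than swap key and value; the grouping between the two phases is what actually performs the inversion, which is why a single reducer is enough for a small data set.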

Cloud Connect Conference - Wednesday

The Wednesday session began with opening remarks by David Berlind followed by a panel discussion moderated by Stephen O'Grady of RedMonk with panelists: Sam Charrington of Appistry, Alistair Croll of Bitcurrent and Bob Sutor of IBM.
  • The ASP -> SaaS -> cloud computing evolution has been under way for over ten years now
  • PaaS is a more recent addition that offers the most open platform for hosting custom and proprietary applications
  • Standards, interoperability, portability and collaboration offer ways to avoid vendor lock-in
  • Companies should experiment with internal and external cloud technologies to gain perspective
  • Challenges in administration, governance, control and ownership of derivative works remain
Some questions from the audience:
  • Acquisitions create myriad application integration issues; how does the cloud help? Coexistence, interoperation and migration offer a range of approaches that are really independent of the cloud. The cloud offers the ability to mash up applications in ways that were not possible before.
  • Larry Ellison and Richard Stallman have been vocal critics of cloud computing. What's their beef? Some vendors thrive on lock-in and others advocate viral open software. The cloud is already here, it is thriving and it will assimilate everything.
  • Where is the cloud in terms of crossing the chasm? Email and web hosting are already on the other side, with SaaS vendors hot on their heels. Companies are cautiously entering the market but most are still on the early adopter side. Multiple layers of services from bare boxes to enterprise solutions offer many ways for companies to cross as they can benefit from the cloud's economies of scale.
The panel was followed by nine brief technology "Solution Provider Speed Geeking" pitches and demonstrations that were given in the exhibit hall. We formed up in small groups and rotated between presentations on the various vendor products to the sound of Dave's loudspeaker siren. These were then followed by more in-depth sessions by the vendors after lunch. I attended the following sessions:
  • Google App Engine - takes care of automatically scaling my web applications written on top of their Python deployment framework. They support all the tools needed to build new dynamic applications involving search, maps, earth, blogs and visualization.
  • Force.com Platform - an extended Java application framework that integrates with the Salesforce.com CRM artifacts. It has a great set of developer tools and rich new applications can be constructed and deployed easily.
  • Amazon EC2 - has released a new administration console that is a huge improvement over its predecessors.
  • Amazon Mechanical Turk - has a huge pool of "artificial artificial intelligence" workers that can be put to work on a fee-for-task basis, doing simple to complicated tasks for a sliding compensation scale from pennies to hundreds of dollars.
  • Google APIs - offer JavaScript libraries for integrating their server side applications in your web applications. Simple yet powerful to use.
Dave threw down the gauntlet to developers by offering some prizes to volunteers who would use some of these technologies to build a demonstration application for the following day. I volunteered and spent some time with a guy from Force.com exploring their quickstart.java package, which uses web services to access some account data that I could then munch with Hadoop on EC2.

It took only a few minutes to customize their quickstart application to obtain and invert some account_name and zip_code tuples in memory. I left the conference and by 11pm had a working Hadoop application that would perform the same inversion on terabytes of similar data using a supercomputer. Ironically, both programs were almost the same size!
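
To give a sense of how small the in-memory version of such an inversion can be, here is a minimal sketch in Python. It is not the customized quickstart code, which was Java; the function name and the sample data in the usage note are assumptions made for illustration.

```python
def invert_in_memory(tuples):
    """Build a zip_code -> [account_name, ...] index from (account_name, zip_code) tuples."""
    index = {}
    for account_name, zip_code in tuples:
        # Create the list on first sight of a zip code, then append to it.
        index.setdefault(zip_code, []).append(account_name)
    return index
```

Calling invert_in_memory([("Acme Corp", "94043"), ("Initech", "94043")]) groups both hypothetical accounts under the one zip code. This approach only works while the whole index fits in memory, which is the limit the Hadoop version removes.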

Cloud Connect Conference - Tuesday

I just got back from the Cloud Connect Conference at the Computer History Museum in Mountain View. The conference was partly an unconference that was sponsored by Google, Amazon, Salesforce and others. David Berlind ran an energetic show that was product and technology focused and very hands-on.

The first session on Tuesday evening brought three short customer "elevator pitch" presentations from Peter Coffee of Salesforce.com, Adam Selipsky of Amazon Web Services and Rajen Sheth of Google to a group of four IT executives: Tim Crawford from Stanford University, Carolyn Lawson of California PUC, Ronald Smith of Cadence Design Systems and Robert Loolley of Utah Technical Services.

The three vendors pitched different cloud computing products but there was a fair amount of overlap in many of their messages: "The benefits of cloud computing are clear, so why delay?"
  • Adam presented the AWS platform-as-a-service offerings, which he likened to the development of the electric power grid in the US. "We make electricity so you don't have to." I have a little experience with EC2 and S3 and would recommend them. I've been running a web server on EC2 for some months and a 5-node Hadoop cloud more recently.
  • Rajen presented Google's code.google.com/apis, which consist of a collection of client-side JavaScript libraries that work in concert with server-side Python services. I don't do either language very well but got some hands-on experience later in the program. This would appeal to developers building calendar, map, search and earth related web applications.
  • Peter talked about desktops burdened with too much state and IT departments benefiting from the improved productivity, scalability and governance provided by the Force.com platform. It consists of a set of developer tools and web services that open up the innards of the Salesforce.com CRM to facilitate integration of custom business applications. Applications are written in a Java dialect with SQL integration that really makes it easy to construct new applications.
The four potential customers asked a number of questions on the following topics, which the presenters fielded:
  • Interactive Applications - Lag is a big impediment to hosting truly interactive applications remotely in the cloud
  • Migration into the Cloud - Custom applications often must be rewritten to move into cloud deployment. Email and public website hosting were offered as no-brainer cloud services already in full production. Customers can leverage the innovation scale of cloud providers to gain business advantage.
  • Migration between Cloud vendors - Vendor lock-in is an issue since some of the platforms rely upon proprietary languages and all proprietary software frameworks discourage migration. Open source and standards were offered as mitigating lock-in but premature standards only help the established early providers.
  • Security - A general uneasiness with allowing private data to be hosted in the cloud was expressed. Vendors responded that their large investments in state-of-the-art security lent economies of scale to the quest for data security.
  • Privacy - Once private data is cloud hosted it needs strict access controls to ensure its integrity. Vendors pointed out that lots of corporate data is lost every year to laptop theft and loss of USB keys and that the cloud offers better governance.
  • Legal Uncertainties - The cloud is so new that many legal issues about data ownership and rights to disclosure are untested in the courts.