Greenplum, MapReduce, and Hadoop

If your job involves processing massive amounts of data, you should familiarize yourself with Greenplum, MapReduce, and Hadoop.

With 6.5 petabytes of data, eBay runs the world’s largest data warehouse on Greenplum. Facebook runs a 2 PB warehouse on Hadoop. Impressive.

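Both Greenplum and Hadoop support MapReduce, the programming model pioneered by Google: a map step turns each input record into key/value pairs, and a reduce step combines all the values that share a key.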
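
To make the MapReduce idea concrete, here is a minimal sketch of the pattern in plain Python, using the classic word count. This is not Hadoop or Greenplum code and it runs on a single machine; the function names (map_phase, shuffle, reduce_phase, word_count) are my own illustrative choices, not part of any framework's API. A real framework would run the map and reduce steps in parallel across many nodes and handle the grouping step for you.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key (a framework does this between map and reduce)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single (key, total) result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data is big", "data warehouses hold data"]
    print(word_count(docs))
```

Because each call to map_phase looks at one document only, and each call to reduce_phase looks at one key only, both steps can be spread over many machines, which is what makes the model attractive for very large data sets.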

You can run Hadoop on Amazon Elastic MapReduce to play around with the technology.

There have also been two Hadoop books published recently. I have ordered both of them and can’t wait to hold them in my hands.

Hadoop: The Definitive Guide

Pro Hadoop

There are no books on Greenplum yet, but the company has some good whitepapers on its website.

2 Comments

  • Bradford on Jun 22, 2009

    Greetings,

    You may find a post I wrote on my blog to be of interest. It breaks down some of the weaknesses of RDBMS for processing “massive amounts of data”, and interesting things you can do with low-latency in other frameworks (including Hadoop and HBase). You can check it out at: http://www.roadtofailure.com/?p=21

  • Steve Wooledge on Jun 18, 2009

    Just to round this out, Aster Data Systems also has a MapReduce implementation within their DBMS which is being used by several companies including Specific Media and ShareThis (largest DW in the EC2 cloud at >10TB now).

    Aster just announced support for .NET (C#), which is an industry first. Lots of great MapReduce content here: http://www.asterdata.com/mapreduce
