Greenplum, MapReduce, and Hadoop
If your job involves processing massive amounts of data, you should familiarize yourself with Greenplum, MapReduce, and Hadoop.
With 6.5 petabytes of data, eBay runs the world’s largest data warehouse on Greenplum. Facebook runs a 2 PB warehouse on Hadoop. Impressive.
Both Greenplum and Hadoop make use of the MapReduce framework pioneered by Google.
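If you have not seen the MapReduce model before, here is a rough single-process sketch of the idea in Python. It is only a toy word count, not how Hadoop or Greenplum actually expose MapReduce, and the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration. A real framework would run the map and reduce steps in parallel across many machines and do the grouping in between for you.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, the step a framework performs between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values collected for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of strings."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    docs = ["Hadoop runs MapReduce", "Greenplum runs MapReduce too"]
    print(word_count(docs))
```

The point of splitting the work this way is that each map call only looks at one document and each reduce call only looks at one key, so both steps can be spread over as many nodes as you have.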
You can run Hadoop on Amazon Elastic MapReduce to play around with the technology.
There have also been two Hadoop books published recently. I have ordered both of them and can’t wait to hold them in my hands.
There are no books on Greenplum yet, but the company has some good whitepapers on its website.

2 Comments
Greetings,
You may find a post I wrote on my blog to be of interest. It breaks down some of the weaknesses of RDBMS for processing “massive amounts of data”, and interesting things you can do with low latency in other frameworks (including Hadoop and HBase). You can check it out at: http://www.roadtofailure.com/?p=21
Just to round this out, Aster Data Systems also has a MapReduce implementation within their DBMS which is being used by several companies including Specific Media and ShareThis (largest DW in the EC2 cloud at >10TB now).
Aster just announced support for .NET (C#), which is an industry first. Lots of great MapReduce content here: http://www.asterdata.com/mapreduce