Wednesday, October 30, 2013

Cloudera: Center of the Universe?

Today GigaOm published an article about Cloudera introducing its enterprise data hub (EDH for short), quoting Tom Reilly, Cloudera's CEO:
 “We believe the EDH is going to become the center of most enterprise’s data architectures.”

I'm not sure whether this is just another gimmick from a big data company, but big data technologies are going to be really 'big' in the near future. So the biggest question for me is what I can do with them.

Here is the Cloudera platform architecture, taken from the article.


[Image: Cloudera Architecture]

You can compare this to the Hortonworks way of doing things.

[Image: Hortonworks partners]

So which platform do you like, or are you thinking of implementing in your organization?

Tuesday, October 29, 2013

Hortonworks is shipping version 2.0

Hortonworks has released Hortonworks Data Platform (HDP) 2.0. In summary, the release has a few highlights, as follows:

Enterprise Ready YARN, the Hadoop Operating System

With Hadoop 2, Apache Hadoop YARN serves as the Hadoop operating system, and takes Hadoop from a single-use data platform for batch processing to a multi-use platform that enables batch, interactive, online and stream processing.

Stinger Phase 2; Interactive SQL Queries at Petabyte Scale

The Stinger Initiative was launched at the beginning of 2013 as a broad community-based effort to enhance the speed, scale and breadth of SQL semantics supported by Apache Hive. Hive 0.12 represents phase 2 of the Stinger Initiative, and HDP 2.0 is a significant step forward for Hive, the de facto standard for SQL access in Hadoop.

Reliable NoSQL in Hadoop with HBase

Apache HBase 0.96 is the culmination of more than a year's worth of effort that has delivered important enterprise features such as Snapshots and improved MTTR (mean time to recovery).

Manage & Monitor YARN and a Hadoop 2 Cluster

Apache Ambari 1.4.1 allows you to provision, manage and monitor a cluster based on the Hadoop 2 stack. This includes YARN, MapReduce 2 and support for enabling native NameNode High Availability (HA).

You can read the whole report on the ZDNet site.