Friday, November 8, 2013

Presto: Facebook's Big Data Query Engine

Facebook has announced its latest open source project, Presto. You can read the full introduction from the engineering team on Facebook :-D

Here's a quick look at how it works.


Presto is written in Java. Here is some information from its introduction:

Presto is 10x better than Hive/MapReduce in terms of CPU efficiency and latency for most queries at Facebook. It currently supports a large subset of ANSI SQL, including joins, left/right outer joins, subqueries, and most of the common aggregate and scalar functions, including approximate distinct counts (using HyperLogLog) and approximate percentiles (based on quantile digest). The main restrictions at this stage are a size limitation on the join tables and cardinality of unique keys/groups. The system also lacks the ability to write output data back to tables (currently query results are streamed to the client).
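To get a feel for what an "approximate distinct count" is, here is a toy HyperLogLog in Python. This is only a sketch and not Presto's actual code (Presto is written in Java). The class name, the SHA-1 hash, and the 4096 registers are my own assumptions, picked just for illustration.

```python
import hashlib
import math


class HyperLogLog:
    """Toy HyperLogLog counter (hypothetical, for illustration only)."""

    def __init__(self, p=12):
        self.p = p                      # number of bits used to pick a register
        self.m = 1 << p                 # number of registers (4096 when p=12)
        self.registers = [0] * self.m
        # Bias-correction constant from the HyperLogLog paper (for m >= 128).
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        # Build a 64-bit hash from the first 8 bytes of SHA-1 (an assumption;
        # any well-mixed 64-bit hash would do).
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                  # top p bits choose the register
        rest = h & ((1 << (64 - self.p)) - 1)     # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        # Harmonic-mean style estimate over all registers.
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            estimate = self.m * math.log(self.m / zeros)
        return int(round(estimate))


if __name__ == "__main__":
    hll = HyperLogLog()
    for i in range(100000):
        hll.add("user-%d" % i)
        hll.add("user-%d" % i)   # duplicates do not change the estimate
    print(hll.count())           # close to 100000, but not exact
```

The whole counter fits in 4096 small integers no matter how many values you feed it, and the price is an error of around 1-2% at this register count. That is the trade that makes approximate counts attractive for interactive queries.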

They have open sourced the project, and it can be found here. Interesting, right? So go ahead and see if you can build something with it.
  
