Friday 17 August 2012

Google's Dremel

Dremel is a way of analyzing information. Running across thousands of servers, it lets you “query” large amounts of data, such as a collection of web documents, a library of digital books, or even the data describing millions of spam messages.

Dremel can handle web-sized amounts of data at blazing fast speed. According to Google’s paper, you can run queries on multiple petabytes — millions of gigabytes — in a matter of seconds.
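To put "multiple petabytes" in perspective, here is a minimal sketch of the unit conversion. It assumes decimal (SI) units, where one petabyte is 10^15 bytes and one gigabyte is 10^9 bytes; the function name is just an illustration and does not come from Google's paper.

```python
# Assumption: decimal (SI) units, not binary (PiB/GiB) units.
BYTES_PER_GB = 10**9
BYTES_PER_PB = 10**15


def petabytes_to_gigabytes(petabytes):
    """Convert a size in petabytes to gigabytes (hypothetical helper)."""
    return petabytes * BYTES_PER_PB / BYTES_PER_GB


if __name__ == "__main__":
    # One petabyte is a million gigabytes, so "multiple petabytes"
    # means millions of gigabytes.
    for pb in (1, 3):
        print(pb, "PB =", petabytes_to_gigabytes(pb), "GB")
```

So even a single petabyte is already a million gigabytes, which is why "multiple petabytes" works out to millions of gigabytes.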

According to Google’s paper, the platform has been used inside Google since 2006, with “thousands” of Googlers using it to analyze everything from the software crash reports for various Google services to the behavior of disks inside the company’s data centers. Sometimes the tool is used with tens of servers, sometimes with thousands.

You can use Dremel today — even if you’re not a Google engineer. Google now offers a Dremel web service it calls BigQuery (https://developers.google.com/bigquery/). You can use the platform via an online API, or application programming interface. Basically, you upload your data to Google, and it lets you run queries on its internal infrastructure.

Source: http://www.wired.com/wiredenterprise/2012/08/google-dremel-versus-hadoop/