Sunday, July 20, 2014

Lambda Architecture Overview

Nathan Marz and his team designed a generic, scalable, and fault-tolerant data processing (#bigdata) architecture called the Lambda Architecture (LA), based on his experience with distributed data processing challenges at BackType and Twitter.

The Lambda Architecture has three design goals: a robust system that tolerates both hardware failures and human errors, the ability to serve a wide range of use cases and workloads with low latency (near real time), and the ability to scale as data and load grow.

The Lambda Architecture has three layers.

1. Batch layer:
It has two functions: managing the master dataset and precomputing the batch views. A typical batch layer uses HDFS to store the master dataset and MapReduce to precompute the batch views.

2. Speed layer: 
This layer is responsible for (near) real-time data processing. Low-latency systems such as Apache Storm belong in this layer and compute views over incoming data with minimal latency.

3. Serving layer: 
This can be any NoSQL database or indexing engine that can index the batch views, merge the output of the batch and speed layers, and answer ad-hoc queries on that data.
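
To make the division of labor between the three layers concrete, here is a minimal, in-memory Python sketch. It is only an illustration under stated assumptions: the event format (timestamp, track id), the sample data, and the function names compute_batch_view, update_realtime_view, and query are all hypothetical and not part of any Lambda Architecture library. A real deployment would use something like HDFS and MapReduce for the batch layer and Storm for the speed layer, as described above.

```python
from collections import Counter

# Hypothetical sample data: (timestamp, track_id) play events.
# master_dataset stands in for the immutable data held by the batch layer;
# recent_events stands in for data that arrived after the last batch run.
master_dataset = [(1, "song_a"), (2, "song_b"), (3, "song_a")]
recent_events = [(4, "song_a"), (5, "song_c")]


def compute_batch_view(events):
    """Batch layer: recompute play counts from the entire master dataset."""
    return Counter(track for _, track in events)


def update_realtime_view(view, event):
    """Speed layer: fold a single new event into the realtime view."""
    _, track = event
    view[track] += 1
    return view


def query(track, batch_view, realtime_view):
    """Serving layer: merge the batch and realtime views at query time."""
    return batch_view.get(track, 0) + realtime_view.get(track, 0)


batch_view = compute_batch_view(master_dataset)

realtime_view = Counter()
for event in recent_events:
    update_realtime_view(realtime_view, event)

print(query("song_a", batch_view, realtime_view))  # prints 3
```

In this sketch the batch view is always rebuilt from scratch, which is what makes it tolerant of bugs and human errors: fixing the code and rerunning the batch job corrects the view. The realtime view only has to cover the events the batch job has not yet absorbed.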

For more details about the Lambda Architecture, visit here

Music Analytics Opportunities

Once upon a time, people's music listening habits were as private as their bedrooms. Music lovers bought CDs, recordings, and other physical copies of music and never shared them publicly. Record companies knew which radio stations played their songs and where their CDs were popular, but that information painted an incomplete picture at best. Who knew what music people were sharing on tapes and CDs burnt in the privacy of their own bedrooms?

Traditional business metrics, such as the number of CDs sold, stopped at the sale: who purchased what, and who could be guided toward what to buy next, remained anonymous. That all changed with the explosion of online music sources. Torrenting, music streaming sites, and social media platforms now play a key role in helping the music industry understand its fans and spot upcoming talent like never before, and anyone's personal music interests are increasingly public. Music analytics is now worth around $24.35 billion per year.

At the same time that the internet is taking power away from record labels, it is also giving them the ability to predict future hits.