Wednesday, July 17, 2013

Hadoop Ecosystem on Windows Azure

Microsoft has become one of the popular vendors in the Big Data Hadoop market with its cloud-based Big Data solution, "Windows Azure HDInsight", which processes and analyzes Big Data and surfaces new business insights using the power of the Apache Hadoop ecosystem. Powered by Apache Hadoop, Windows Azure HDInsight processes and analyzes data, including unstructured data, and helps businesses make real-time decisions.

The HDInsight Service makes Apache Hadoop available as a service in the cloud. It lets you build a Hadoop cluster in minutes and scale it back down once your MapReduce jobs have run. You can interactively choose the cluster size to optimize for time to insight or for cost. HDInsight also supports several programming languages, including Java and .NET technologies.
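To make "MapReduce job" a little more concrete, here is a minimal sketch of the map, shuffle, and reduce steps as a word count, written in plain Python and run locally. It is only an illustration of the programming model, not HDInsight code: the function names (map_phase, shuffle, reduce_phase, word_count) are made up for this example, and a real cluster would distribute these steps across many nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group values by key, as the framework does between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word: str, counts: List[int]) -> Tuple[str, int]:
    """Collapse all the values for one key into a single total."""
    return word, sum(counts)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on in-memory lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(w, c) for w, c in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input, just to exercise the three phases.
    sample = ["big data on azure", "hadoop on azure"]
    print(word_count(sample))
```

On a real cluster the map and reduce functions run in parallel on different nodes, which is why the cluster size you pick affects how long the job takes.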

Reference : http://www.windowsazure.com/en-us/documentation/services/hdinsight/

The diagram above shows the core services, data processing frameworks, Microsoft integration points and value-added services, data movement services, and packages exposed by Windows Azure HDInsight. It makes HDFS and MapReduce, the components of the Hadoop framework, available in a simpler, more scalable, and cost-efficient Windows Azure environment. HDInsight simplifies Hadoop configuration, monitoring, and the post-processing of data analyzed by Hadoop jobs by providing simple JavaScript and Hive consoles. The JavaScript console is unique to HDInsight and handles Pig Latin (used for ETL) as well as JavaScript and HDFS commands. HDInsight also provides a cost-efficient approach to managing and storing data: it uses Windows Azure Blob Storage as a native file system. (A Binary Large Object, or blob, is a file of any type and size that can be stored in Windows Azure.)
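Because Blob Storage acts as the file system, jobs refer to their data by a storage URI rather than a local path. The sketch below builds such a URI, assuming the wasb:// scheme that the HDInsight documentation describes (early preview releases used asv:// instead). The helper name blob_uri and the container and account names are hypothetical and only serve the example.

```python
def blob_uri(container: str, account: str, path: str, secure: bool = False) -> str:
    """Build a Blob Storage URI of the form
    wasb[s]://<container>@<account>.blob.core.windows.net/<path>.

    Assumption: the wasb/wasbs scheme; older previews used asv://.
    """
    scheme = "wasbs" if secure else "wasb"
    # Strip a leading slash so the result never contains "//" before the path.
    return f"{scheme}://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"


if __name__ == "__main__":
    # Hypothetical container and storage account names.
    print(blob_uri("data", "myaccount", "/logs/2013/07.log"))
```

A URI like this is what you would hand to a Hive or MapReduce job as its input or output location, so the data outlives the cluster that processed it.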

A particularly nice thing about HDInsight is its highly interactive JavaScript and Hive consoles for configuring, scheduling, and monitoring jobs.
