Hadoop! It is very popular nowadays for handling Big Data and implementing analytics on it.
But before starting with Hadoop, we should understand why we need it.
- Facebook generates petabytes of data per day.
- An aircraft generates around 10 terabytes of sensor data in just 30 minutes of flight.
- Electricity transmission grids also generate big data.
- Trading and stock exchange systems also produce large amounts of data.
But the drawback is that, previously, only a particular set of data was analyzed: data available in the form of rows and columns.
The Large Hadron Collider, for example, generates petabytes of data from its experiments.
Most data today, such as Facebook's log files, does not come in rows and columns. That is why we call it unstructured data, and this is where we need another kind of tool, like Hadoop.
If we talk about the form of this data, it is usually described along four dimensions:
- Volume
- Velocity
- Variety
- Value
Now, for example, if we want to read 1 TB of data with 1 machine that has 4 I/O channels, and each channel's speed is 100 MB/second, then it will take about 45 minutes to read all the data.
Meanwhile, 10 machines working in parallel take about 4.5 minutes to read the same data, as the sketch below shows.
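To see where those numbers come from, here is a small back-of-the-envelope calculation. It assumes 1 TB = 1024 x 1024 MB, 4 I/O channels per machine at 100 MB/s each, and that the data splits evenly across machines; these are the same assumptions as in the example above.

```java
// Rough read-time estimate for 1 TB of data on 1 vs. 10 machines.
public class ReadTimeEstimate {
    public static void main(String[] args) {
        double totalMb = 1024.0 * 1024.0;        // 1 TB expressed in MB
        double perMachineMbPerSec = 4 * 100.0;   // 4 I/O channels x 100 MB/s each

        for (int machines : new int[] {1, 10}) {
            double seconds = totalMb / (machines * perMachineMbPerSec);
            System.out.printf("%2d machine(s): %.1f minutes%n", machines, seconds / 60.0);
        }
        // Prints roughly 43.7 minutes for 1 machine and 4.4 minutes for 10 machines,
        // i.e. about 45 min vs. 4.5 min as quoted above.
    }
}
```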
Definition of Hadoop:
Apache Hadoop is a framework that allows for the distributed processing of large data sets across clusters (groups) of commodity (normal PC) computers using a simple programming model. Currently, the Aadhaar Card implementation is a burning example of Hadoop use in India. A sketch of that simple programming model follows below.
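The "simple programming model" the definition refers to is MapReduce. Below is a minimal sketch of the classic word-count job using Hadoop's standard MapReduce API; the class name, tokenization logic, and input/output paths are illustrative, not from the post.

```java
// Minimal word-count MapReduce job: the map phase runs in parallel on blocks of
// the input and emits (word, 1); the reduce phase sums the counts for each word.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Map phase: split each line into words and emit (word, 1).
    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    word.set(token);
                    context.write(word, ONE);
                }
            }
        }
    }

    // Reduce phase: receive all counts for one word and sum them.
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // e.g. an HDFS input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // e.g. an HDFS output directory
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

The point of the model is that the framework, not the programmer, handles splitting the input across the cluster, scheduling the map and reduce tasks on the commodity machines, and re-running tasks when a machine fails.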