Apache Hadoop is a fault-tolerant, open-source software framework for the distributed processing of Big Data across clusters of computers.
Hadoop has two core components:
- HDFS - Hadoop Distributed File System (NameNode, DataNode)
- MapReduce (JobTracker, TaskTracker)
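To make the MapReduce component concrete, here is a minimal pure-Python sketch of the programming model using the classic word-count example. This only illustrates the map/shuffle/reduce idea; real Hadoop jobs are written against the Hadoop API and executed across the cluster, so the function names here are illustrative, not part of Hadoop itself.

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit (word, 1) pairs from each input document."""
    for doc in documents:
        for word in doc.split():
            yield (word, 1)

def shuffle(pairs):
    """Shuffle: group emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce: sum the counts for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}

docs = ["big data big clusters", "big data"]
counts = reduce_phase(shuffle(map_phase(docs)))
print(counts)  # {'big': 3, 'data': 2, 'clusters': 1}
```

In Hadoop, the JobTracker coordinates these phases across the cluster while TaskTrackers run the individual map and reduce tasks on worker nodes.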
Installation steps: this tutorial describes how to install a single-node Hadoop cluster on Ubuntu.
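A typical single-node setup on Ubuntu follows the shape below. This is a hedged sketch: the Hadoop release number, download mirror, and install path are assumptions, so substitute the version you actually intend to run.

```shell
# Install Java, which Hadoop requires (OpenJDK 8 shown as an example)
sudo apt-get update
sudo apt-get install -y openjdk-8-jdk

# Download and unpack a Hadoop release (2.10.2 is an example version)
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.10.2/hadoop-2.10.2.tar.gz
tar -xzf hadoop-2.10.2.tar.gz
sudo mv hadoop-2.10.2 /usr/local/hadoop

# Point the environment at Java and Hadoop
export JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:/bin/java::")
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin

# Verify the installation
hadoop version
```

After this, HDFS and MapReduce are configured through the XML files under `$HADOOP_HOME/etc/hadoop` before the daemons are started.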