Course Duration in Hours
30
Hadoop Course Outline
What is Big Data & Why Hadoop?
Big Data Characteristics, Challenges with traditional system
Hadoop Overview & its Ecosystem
Installing and Configuring Hadoop
Setting up a Hadoop lab on VM Workstation
HDFS Architecture: NameNode, DataNodes and Secondary NameNode
Understanding HDFS HA and Federation architecture
YARN Architecture: ResourceManager, NodeManager and ApplicationMaster
Demo Session
MapReduce Anatomy
How MapReduce Works
Writing Mapper, Reducer and Driver using Java APIs
Understanding Hadoop Data Types, Input & Output Formats
Demo Session
Combiner, Partitioner, Counter, Setup and cleanup, Distributed Cache
Passing parameters, Multiple Inputs, Chaining multiple jobs
Handling small files and bad records
Demo Session
Sqoop
Importing and exporting data using Sqoop (one DB to Hive)
Extracting and loading data from a social media network using Flume
Demo Session
Hive
Hive Architecture
Installing and configuring Hive
Loading data into Hive tables from an external source
Demo Session
Pig
Pig Basics, Loading data files
Demo Session
Hadoop Best Practices, Advanced Tips & Techniques
Managing HDFS and YARN
Hadoop Cluster sizing, capacity planning and optimization
Hadoop Deployment options
Participants should have basic knowledge of Java, SQL and Linux. Refreshing these skills beforehand is advised to get the most benefit from this training.
MM TECHNOWORLD, Palavakkam (Chennai), Chennai, IN