Course Duration in Hours
80
Syllabus
Module 1: Big Data
Definition with Real-Time Examples
How Big Data Is Generated, with Real-Time Examples
Uses of Big Data: How Industry Is Utilizing Big Data
Future of Big Data
Module 2: Hadoop
Why Hadoop?
What is Hadoop?
Hadoop vs RDBMS, Hadoop vs Big Data
Brief history of Hadoop
Problems with traditional large-scale systems
Requirements for a new approach
Anatomy of a Hadoop cluster
Module 3: HDFS
Concepts & Architecture
Data Flow (File Read, File Write)
Fault Tolerance
Shell Commands
Java-Based API
Data Flow Archives
Coherency
Data Integrity
Role of Secondary NameNode
Module 4: MapReduce
Theory
Data Flow (Map - Shuffle - Reduce)
MapRed vs MapReduce APIs
Programming (Mapper, Reducer, Combiner, Partitioner)
Module 5: Hive & Pig
Architecture
Installation
Configuration
Hive vs RDBMS
Tables
DDL & DML
Partitioning & Bucketing
Hive Web Interface
Why Pig?
Use Cases of Pig
Pig Components
Data Model
Pig Latin
Module 6: HBase
RDBMS vs NoSQL
HBase Introduction
HBase Components
Scanner & Filter
HBase POC
Anyone who would like to explore the latest big data and Hadoop technology.
Purple Jay Technologies, Bellandur, Bangalore, IN