
BIG DATA ANALYTICS


(Elective – 1)
OBJECTIVES:
• Optimize business decisions and create competitive advantage with Big Data analytics
• Introduce the Java concepts required for developing MapReduce programs
• Derive business benefit from unstructured data
• Impart the architectural concepts of Hadoop and introduce the MapReduce paradigm
• Introduce the programming tools Pig and Hive in the Hadoop ecosystem
UNIT-I
Data structures in Java: Linked List, Stacks, Queues, Sets, Maps; Generics: Generic classes and
Type parameters, Implementing Generic Types, Generic Methods, Wrapper Classes, Concept of
Serialization
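
As an illustration of the Unit-I topics, the following is a minimal sketch of a generic class, a generic method with a bounded type parameter, and wrapper-class autoboxing. The names `Pair`, `GenericsDemo`, and `max` are hypothetical and chosen only for this example.

```java
import java.util.Arrays;
import java.util.List;

// Generic class with two type parameters (hypothetical example type).
class Pair<K, V> {
    private final K key;
    private final V value;

    Pair(K key, V value) {
        this.key = key;
        this.value = value;
    }

    K getKey() { return key; }
    V getValue() { return value; }
}

class GenericsDemo {
    // Generic method: returns the largest element of a non-empty list.
    // The bound "T extends Comparable<T>" guarantees compareTo is available.
    static <T extends Comparable<T>> T max(List<T> items) {
        T best = items.get(0);
        for (T item : items) {
            if (item.compareTo(best) > 0) {
                best = item;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // The int literal 31 is autoboxed into the wrapper class Integer.
        Pair<String, Integer> reading = new Pair<>("temperature", 31);
        System.out.println(reading.getKey() + " = " + reading.getValue());
        System.out.println(max(Arrays.asList(3, 9, 4)));
    }
}
```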
UNIT-II
Working with Big Data: Google File System, Hadoop Distributed File System (HDFS) –
Building blocks of Hadoop (Namenode, Datanode, Secondary Namenode, JobTracker,
TaskTracker), Introducing and Configuring Hadoop cluster (Local, Pseudo-distributed mode,
Fully Distributed mode), Configuring XML files.
UNIT-III
Writing MapReduce Programs: A Weather Dataset, Understanding Hadoop API for
MapReduce Framework (Old and New), Basic programs of Hadoop MapReduce: Driver code,
Mapper code, Reducer code, RecordReader, Combiner, Partitioner
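
The map, group, and reduce steps named in Unit-III can be sketched in plain Java without the Hadoop API. This is only a single-process illustration of the paradigm, not a Hadoop job: the class `MaxTemperatureSketch` and the simplified `year,temperature` input format are assumptions made for this example and do not match the real weather dataset layout.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MaxTemperatureSketch {
    // Map step: parse one "year,temperature" record into a (year, temperature) pair.
    static Map.Entry<String, Integer> map(String line) {
        String[] parts = line.split(",");
        return new AbstractMap.SimpleEntry<>(parts[0].trim(), Integer.parseInt(parts[1].trim()));
    }

    // Reduce step: collapse all temperatures seen for one year into their maximum.
    static int reduce(String year, List<Integer> temperatures) {
        int max = Integer.MIN_VALUE;
        for (int t : temperatures) {
            max = Math.max(max, t);
        }
        return max;
    }

    // Driver: runs map on every record, groups values by key (the "shuffle"),
    // then runs reduce once per key.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            Map.Entry<String, Integer> kv = map(line);
            grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("1949,111", "1950,0", "1949,78", "1950,22");
        System.out.println(run(records)); // prints {1949=111, 1950=22}
    }
}
```

In a real Hadoop job the grouping is done by the framework between the Mapper and the Reducer; here it is written out by hand so that the data flow is visible.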
UNIT-IV
Hadoop I/O: The Writable Interface, WritableComparable and comparators, Writable Classes:
Writable wrappers for Java primitives, Text, BytesWritable, NullWritable, ObjectWritable and
GenericWritable, Writable collections, Implementing a Custom Writable: Implementing a
RawComparator for speed, Custom comparators
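
The idea behind a custom Writable in Unit-IV can be sketched with the standard `java.io` streams. Hadoop's `Writable` interface declares `write(DataOutput)` and `readFields(DataInput)`; the local `SimpleWritable` interface below is a stand-in with the same two methods so that the example compiles without Hadoop on the classpath. `YearTemperature` and `roundTrip` are hypothetical names used only here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Stand-in for Hadoop's Writable (assumption: same two method signatures).
interface SimpleWritable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Custom key type: serializable field by field and comparable for sorting.
class YearTemperature implements SimpleWritable, Comparable<YearTemperature> {
    private int year;
    private int temperature;

    YearTemperature() { }  // no-arg constructor so an empty instance can be filled by readFields

    YearTemperature(int year, int temperature) {
        this.year = year;
        this.temperature = temperature;
    }

    int getYear() { return year; }
    int getTemperature() { return temperature; }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(year);
        out.writeInt(temperature);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // Fields must be read in the same order they were written.
        year = in.readInt();
        temperature = in.readInt();
    }

    @Override
    public int compareTo(YearTemperature other) {
        int byYear = Integer.compare(year, other.year);
        return byYear != 0 ? byYear : Integer.compare(temperature, other.temperature);
    }

    // Serializes a value to bytes and reads it back into a fresh instance.
    static YearTemperature roundTrip(YearTemperature value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            value.write(new DataOutputStream(bytes));
            YearTemperature copy = new YearTemperature();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Against the real Hadoop API, a key type like this would implement `WritableComparable` instead of the stand-in interface; the body of `write`, `readFields`, and `compareTo` would keep the same shape.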
UNIT-V
Pig: Hadoop Programming Made Easier
Admiring the Pig Architecture, Going with the Pig Latin Application Flow, Working through the
ABCs of Pig Latin, Evaluating Local and Distributed Modes of Running Pig Scripts, Checking
out the Pig Script Interfaces, Scripting with Pig Latin
UNIT-VI
Applying Structure to Hadoop Data with Hive:
Saying Hello to Hive, Seeing How the Hive is Put Together, Getting Started with Apache Hive,
Examining the Hive Clients, Working with Hive Data Types, Creating and Managing Databases
and Tables, Seeing How the Hive Data Manipulation Language Works, Querying and Analyzing
Data

IV Year – I Semester
L T P C
4 0 0 3
OUTCOMES:
• Preparing for data summarization, query, and analysis.
• Applying data modeling techniques to large data sets
• Creating applications for Big Data analytics
• Building a complete business data analytic solution
TEXT BOOKS:

  1. Big Java, 4th Edition, by Cay Horstmann, John Wiley & Sons, Inc.
  2. Hadoop: The Definitive Guide, 3rd Edition, by Tom White, O’Reilly
  3. Hadoop in Action by Chuck Lam, Manning Publications
  4. Hadoop for Dummies by Dirk deRoos, Paul C. Zikopoulos, Roman B. Melnyk, Bruce
    Brown, Rafael Coss

REFERENCE BOOKS:

  1. Hadoop in Practice by Alex Holmes, Manning Publications
  2. Hadoop MapReduce Cookbook by Srinath Perera, Thilina Gunarathne

SOFTWARE LINKS:

  1. Hadoop: http://hadoop.apache.org/
  2. Hive: https://cwiki.apache.org/confluence/display/Hive/Home
  3. Pig Latin: http://pig.apache.org/docs/r0.7.0/tutorial.html
