Big Data Analytics 21CS71
Course Code: 21CS71
Credits: 03
CIE Marks: 50
SEE Marks: 50
Total Marks: 100
Exam Hours: 03
Total Hours of Pedagogy: 40H
Teaching Hours/Weeks: [L:T:P:S] 3:0:0:0
Introduction to Big Data Analytics: Big Data, Scalability and Parallel Processing, Designing Data Architecture, Data Sources, Quality, Pre-Processing and Storing, Data Storage and Analysis, Big Data Analytics Applications and Case Studies.
Introduction to Hadoop (T1): Introduction, Hadoop and its Ecosystem, Hadoop Distributed File
System, MapReduce Framework and Programming Model, Hadoop Yarn, Hadoop Ecosystem Tools.
Hadoop Distributed File System Basics (T2): HDFS Design Features, Components, HDFS User
Commands.
Essential Hadoop Tools (T2): Using Apache Pig, Hive, Sqoop, Flume, Oozie, HBase.
NoSQL Big Data Management, MongoDB and Cassandra: Introduction, NoSQL Data Store, NoSQL Data Architecture Patterns, NoSQL to Manage Big Data, Shared-Nothing Architecture for Big Data Tasks, MongoDB, Databases, Cassandra Databases.
MapReduce, Hive and Pig: Introduction, MapReduce Map Tasks, Reduce Tasks and MapReduce Execution, Composing MapReduce for Calculations and Algorithms, Hive, HiveQL, Pig.
Machine Learning Algorithms for Big Data Analytics: Introduction, Estimating the relationships,
Outliers, Variances, Probability Distributions, and Correlations, Regression analysis, Finding Similar
Items, Similarity of Sets and Collaborative Filtering, Frequent Itemsets and Association Rule Mining.
Text, Web Content, Link, and Social Network Analytics: Introduction, Text mining, Web Mining, Web
Content and Web Usage Analytics, Page Rank, Structure of Web and analyzing a Web Graph, Social
Network as Graphs and Social Network Analytics: