Hadoop

Big Data Lab Experiments

📌 Overview

BAD601 Big Data Analytics
This repository contains the implementation and documentation of Big Data Analytics Laboratory Experiments.
The main objective is to understand the fundamentals of Big Data, the Hadoop ecosystem, and related tools through hands-on experiments.


🧪 List of Experiments

| Sl. No | Experiment | Link |
|--------|------------|------|
| 1 | Install Hadoop and implement the following file management tasks in HDFS:<br>• Adding files and directories<br>• Retrieving files<br>• Deleting files and directories<br>Hint: A typical Hadoop workflow creates data files (such as log files) elsewhere and copies them into HDFS using one of the above command line utilities. | Install |
| 2 | Develop a MapReduce program to implement Matrix Multiplication. | MM |
| 3 | Develop a MapReduce program that mines weather data and displays appropriate messages indicating the weather conditions of the day. | WDA |
| 4 | Develop a MapReduce program to find the tags associated with each movie by analyzing MovieLens dataset. | MTA |
| 5 | Implement MongoDB functions: Count, Sort, Limit, Skip, Aggregate. | MDB |
| 6 | Develop Pig Latin scripts to sort, group, join, project, and filter the data. | PIG |
| 7 | Use Hive to create, alter, and drop databases, tables, views, functions, and indexes. | |
| 8 | Implement a Word Count program in Hadoop and Spark. | WCH WCS |
| 9 | Use CDH (Cloudera Distribution for Hadoop) and HUE (Hadoop User Interface) to analyze data and generate reports for sample datasets. | |
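
The file management tasks in Experiment 1 map onto a handful of `hdfs dfs` subcommands. The following is a minimal sketch, not the repository's own script: it assumes a running HDFS, and the function names (`run`, `hdfs_demo`), the directory `/user/lab/input`, and the file `sample.log` are all hypothetical. With `DRY_RUN=1` the commands are only printed, so the sequence can be read through without a cluster.

```bash
#!/usr/bin/env bash
# Sketch of the Experiment 1 tasks. Paths and file names are hypothetical.
# Set DRY_RUN=1 to print the commands instead of running them.

run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

hdfs_demo() {
  local dir="${1:-/user/lab/input}"   # hypothetical HDFS directory
  local file="${2:-sample.log}"       # hypothetical file name in the current local directory

  # Adding files and directories
  run hdfs dfs -mkdir -p "$dir"
  run hdfs dfs -put "$file" "$dir/"

  # Retrieving files
  run hdfs dfs -ls "$dir"
  run hdfs dfs -get "$dir/$file" "./copy-of-$file"

  # Deleting files and directories
  run hdfs dfs -rm "$dir/$file"
  run hdfs dfs -rm -r "$dir"
}
```

To preview the commands, source the script and call `DRY_RUN=1 hdfs_demo`; dropping `DRY_RUN` runs them against the cluster, which removes the demo directory at the end.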


โš™๏ธ Technologies & Tools Used


🚀 How to Run

  1. Install Hadoop and configure it in pseudo-distributed or cluster mode.
  2. Start the HDFS and YARN daemons:

     ```bash
     start-dfs.sh
     start-yarn.sh
     ```
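
Once the daemons have been started, `jps` (shipped with the JDK) lists the running Java processes. On a pseudo-distributed setup it would normally show NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager. The following is a minimal sketch of that check, assuming all five daemons run on one machine; the helper name `check_daemons` is hypothetical, and it reads `jps`-style output from standard input so it can be tried without a cluster.

```bash
#!/usr/bin/env bash
# check_daemons (hypothetical helper): reads `jps` output on stdin and reports
# which of the expected Hadoop daemons are missing.
# Assumes a pseudo-distributed setup where all five daemons run on one machine.

check_daemons() {
  local listing
  local missing=""
  local name
  listing="$(cat)"
  for name in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # jps prints "<pid> <MainClass>", so compare against the second column exactly
    if ! printf '%s\n' "$listing" | awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "all daemons running"
}
```

Typical use would be `jps | check_daemons`; a non-zero exit status means at least one daemon did not come up, and the Hadoop logs are the place to look next.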

1. Hadoop Installation : Install

2. MatrixMultiplication : MM

3. WeatherDataAnalysis : WDA

4. MovieTagAnalysis : MTA

5. MongoDB : MDB

6. PigLatin : PIG

7. Word Count Hadoop : WCH

8. Word Count Spark : WCS