Hadoop

Big Data Lab Experiments

📌 Overview

BAD601 Big Data Analytics
This repository contains the implementation and documentation of the Big Data Analytics laboratory experiments.
The main objective is to understand the fundamentals of Big Data, the Hadoop ecosystem, and related tools through hands-on experiments.

🧪 List of Experiments

| Sl. No | Experiment | Link |
|--------|------------|------|
| 1 | Install Hadoop and implement the following file management tasks in HDFS:<br>• Adding files and directories<br>• Retrieving files<br>• Deleting files and directories<br>Hint: A typical Hadoop workflow creates data files (such as log files) elsewhere and copies them into HDFS using one of the above command line utilities. | Install |
| 2 | Develop a MapReduce program to implement Matrix Multiplication. | MM |
| 3 | Develop a MapReduce program that mines weather data and displays appropriate messages indicating the weather conditions of the day. | WDA |
| 4 | Develop a MapReduce program to find the tags associated with each movie by analyzing MovieLens dataset. | MTA |
| 5 | Implement MongoDB functions: Count, Sort, Limit, Skip, Aggregate. | MDB |
| 6 | Develop Pig Latin scripts to sort, group, join, project, and filter the data. | PIG |
| 7 | Use Hive to create, alter, and drop databases, tables, views, functions, and indexes. | |
| 8 | Implement a Word Count program in Hadoop and Spark. | WCH, WCS |
| 9 | Use CDH (Cloudera Distribution for Hadoop) and HUE (Hadoop User Interface) to analyze data and generate reports for sample datasets. | |
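
To make Experiment 2 more concrete, the following is a minimal local sketch of the usual one-step MapReduce formulation of matrix multiplication, written in plain Python so it runs without a cluster. It is not the code behind the MM link; the function names (`map_phase`, `shuffle`, `reduce_phase`, `multiply`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_phase(a, b):
    """Emit ((i, k), (tag, j, value)) pairs for A (n x m) and B (m x p).

    Every element of A is sent to each output cell in its row, and every
    element of B is sent to each output cell in its column.
    """
    n, m, p = len(a), len(b), len(b[0])
    for i in range(n):
        for j in range(m):
            for k in range(p):
                yield (i, k), ("A", j, a[i][j])
    for j in range(m):
        for k in range(p):
            for i in range(n):
                yield (i, k), ("B", j, b[j][k])


def shuffle(pairs):
    """Group values by key, standing in for Hadoop's shuffle and sort step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Join A and B values on the shared index j and sum the products."""
    a_vals = {j: v for tag, j, v in values if tag == "A"}
    b_vals = {j: v for tag, j, v in values if tag == "B"}
    return key, sum(a_vals[j] * b_vals[j] for j in a_vals if j in b_vals)


def multiply(a, b):
    """Run map, shuffle, and reduce locally and assemble the result matrix."""
    n, p = len(a), len(b[0])
    result = [[0] * p for _ in range(n)]
    for key, values in shuffle(map_phase(a, b)).items():
        (i, k), total = reduce_phase(key, values)
        result[i][k] = total
    return result
```

In a real Hadoop job the mapper would read matrix entries from HDFS input splits and the framework would perform the grouping; here `shuffle` only imitates that step in memory.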

⚙️ Technologies & Tools Used

Hadoop (HDFS, MapReduce, YARN)
MongoDB
Apache Pig
Apache Hive
Apache Spark
Apache Kafka
Python / Java (for coding MapReduce & Spark jobs)
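
As an illustration of the mapper/reducer style used when coding MapReduce jobs in Python, here is a small word count sketch that runs locally. It is an assumption-laden stand-in, not the WCH or WCS implementation: the tokenization rule and the names `mapper`, `reducer`, and `word_count` are hypothetical.

```python
import re
from collections import defaultdict


def mapper(line):
    """Emit (word, 1) for every lowercase word found in the line."""
    for word in re.findall(r"[a-z0-9']+", line.lower()):
        yield word, 1


def reducer(word, counts):
    """Sum all the counts emitted for one word."""
    return word, sum(counts)


def word_count(lines):
    """Group mapper output by word (the shuffle step), then reduce each group."""
    grouped = defaultdict(list)
    for line in lines:
        for word, one in mapper(line):
            grouped[word].append(one)
    return dict(reducer(word, counts) for word, counts in grouped.items())
```

The same split into a per-record map step and a per-key reduce step carries over to Hadoop Streaming and to Spark, where the framework handles the grouping across machines.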

🚀 How to Run

Install Hadoop and configure it in pseudo-distributed/cluster mode.
Start HDFS and YARN daemons:

```bash
start-dfs.sh
start-yarn.sh
```
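
Once the daemons are running, the file management tasks from Experiment 1 map onto a handful of `hdfs dfs` subcommands. The sketch below only builds and prints those command lines, so it is safe to run without a cluster; the helper name `hdfs_commands` and the example paths are hypothetical.

```python
import posixpath
import shlex


def hdfs_commands(local_file, hdfs_dir, download_to):
    """Return the hdfs dfs commands for adding, listing, retrieving, and deleting."""
    # Path the uploaded file will have inside HDFS.
    target = posixpath.join(hdfs_dir, posixpath.basename(local_file))
    return [
        ["hdfs", "dfs", "-mkdir", "-p", hdfs_dir],       # add a directory
        ["hdfs", "dfs", "-put", local_file, hdfs_dir],   # add a file
        ["hdfs", "dfs", "-ls", hdfs_dir],                # list the directory
        ["hdfs", "dfs", "-get", target, download_to],    # retrieve the file
        ["hdfs", "dfs", "-rm", "-r", hdfs_dir],          # delete directory and contents
    ]


if __name__ == "__main__":
    # Example paths are placeholders; replace them with real ones.
    for cmd in hdfs_commands("logs/access.log", "/user/student/lab1", "retrieved.log"):
        print(shlex.join(cmd))
```

To execute the commands instead of printing them, each list could be passed to `subprocess.run(cmd, check=True)` on a machine where the `hdfs` client is on the `PATH`.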

Link labels used in the experiment table:

1. Hadoop Installation : Install
2. MatrixMultiplication : MM
3. WeatherDataAnalysis : WDA
4. MovieTagAnalysis : MTA
5. MongoDB : MDB
6. PigLatin : PIG
7. Word Count Hadoop : WCH
8. Word Count Spark : WCS