Big-Data - Hadoop Multi-Node Cluster over AWS using Ansible

An Apache framework for Big Data computing and storage through a distributed approach. Large files are split into fixed-size blocks and stored on different data storage nodes using the HDFS protocol. The benefit of the tool is that I/O becomes faster…

Created by Akanksha

🤔 What Exactly Is Big Data?

Big data is high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation.

🤔 How to solve this Big Data Management Challenge?

When we store data in a simple array, we have to load the whole array into RAM to traverse it, and working with Big Data on a single machine would need the same. But because of the five key problems mentioned above, we are unable to do that. Hence, the NFS concept is replaced by the HDFS concept. Sounds techie, right!!

Hadoop

Hadoop is an Apache framework for Big Data computing and storage through a distributed approach. Large files are split into fixed-size blocks and stored on different data storage nodes using the HDFS protocol. The benefit of the tool is that I/O becomes faster. Data is reliable because each block is replicated internally across nodes, and using the MapReduce concept we can analyze and process the data easily.
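To make the "split into blocks and replicate" idea concrete, here is a small shell sketch. It is only a toy model and not how HDFS is implemented: the block size, replica count, and the node1..node3 directories are assumptions chosen for the demo (real HDFS defaults are 128 MB blocks and 3 replicas, and placement is decided by the NameNode).

```shell
set -eu

# Toy values for the demo. Real HDFS defaults are 128 MB blocks and 3 replicas.
block_size=4096   # bytes per block
replication=2     # copies kept of each block
node_count=3      # pretend DataNodes, modeled as plain directories

workdir=$(mktemp -d)
mkdir "$workdir/blocks"

# Create a 10,000-byte sample file standing in for a large dataset.
head -c 10000 /dev/urandom > "$workdir/original.bin"

# Split the file into fixed-size blocks: blk_aa, blk_ab, ...
split -b "$block_size" "$workdir/original.bin" "$workdir/blocks/blk_"

# Create the pretend DataNode directories.
n=1
while [ "$n" -le "$node_count" ]; do
  mkdir "$workdir/node$n"
  n=$((n + 1))
done

# Copy every block to $replication different nodes, round-robin.
block_count=0
for blk in "$workdir"/blocks/blk_*; do
  r=0
  while [ "$r" -lt "$replication" ]; do
    node=$(( (block_count + r) % node_count + 1 ))
    cp "$blk" "$workdir/node$node/"
    r=$((r + 1))
  done
  block_count=$((block_count + 1))
done

# Reassemble the blocks in order and confirm nothing was lost.
cat "$workdir"/blocks/blk_* > "$workdir/rebuilt.bin"
cmp -s "$workdir/original.bin" "$workdir/rebuilt.bin" && echo "file rebuilt from $block_count blocks"
```

Because every block lives on two of the three directories, deleting any single node directory still leaves a full copy of the file, which is the reliability idea described above.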

source : IBM

Let me direct you to the practical part now.

We have three instances, one master and two slaves, to create the Hadoop cluster, plus one client system that joins the master node and stores data on the cluster. All of this is done on AWS.
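One way to describe this layout to Ansible is a static inventory with one group per role. This is a sketch under assumptions: the post does not show how its inventory is built, the group names simply mirror the role names used later, and the addresses are placeholders from the 192.0.2.0/24 documentation range.

```shell
set -eu
workdir=$(mktemp -d)

# Placeholder addresses; replace them with the IPs of your own EC2 instances.
cat > "$workdir/inventory" <<'EOF'
[hadoop_master]
192.0.2.10

[hadoop_slave]
192.0.2.11
192.0.2.12

[hadoop_client]
192.0.2.20
EOF

# Count the hosts listed (lines that start with a digit, i.e. the addresses).
host_count=$(grep -c '^[0-9]' "$workdir/inventory")
echo "inventory lists $host_count hosts"
```

If the instances are launched by a playbook, their addresses are only known at run time, so a dynamic inventory may be a better fit than a static file like this one.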

# mkdir hadoop-ws
# cd hadoop-ws
# mkdir roles
ansible.cfg
Credentials file format:
access_key: GUJGWDUYGUEWVVFEWGVFUYV
secret_key: huadub7635897^%&hdfqt57gvhg
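The ansible.cfg contents appeared only as a screenshot, so the following is a guess at a workable configuration and not the author's file. The inventory path, the ec2-user login, the hadoop-key.pem key file, and the cred.yml vault file name are all assumptions for illustration; the setting names themselves are standard Ansible configuration keys.

```shell
set -eu
workdir=$(mktemp -d)

# Write a minimal ansible.cfg. Paths, user, and key file are hypothetical.
cat > "$workdir/ansible.cfg" <<'EOF'
[defaults]
inventory = ./inventory
roles_path = ./roles
remote_user = ec2-user
private_key_file = ./hadoop-key.pem
host_key_checking = False

[privilege_escalation]
become = True
become_method = sudo
become_user = root
become_ask_pass = False
EOF

# The AWS keys should not sit in plain text. A vault file can be created
# interactively (it prompts for a vault password) with, for example:
#   ansible-vault create cred.yml
# and then filled with the access_key / secret_key pair in the format above.

echo "wrote $workdir/ansible.cfg"
```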

Steps:

# cd roles
# ansible-galaxy init ec2
# ansible-galaxy init hadoop_master
# ansible-galaxy init hadoop_slave
# ansible-galaxy init hadoop_client
# cd hadoop_master/tasks
# vim main.yml
task.yml
# cd ../vars
# vim main.yml
var.yml
Role files shown as screenshots in the original post, in order:
task.yml, var.yml, core-site.xml, hdfs.xml
task.yml, var.yml, core-site.xml, hdfs-site.xml
task.yml, var.yml, core-site.xml
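Since the Hadoop configuration files were shown only as screenshots, here is a hedged sketch of what a core-site.xml and a master-side hdfs-site.xml typically contain. The master address, port 9000, and the /nn directory are hypothetical values. The property names are the Hadoop 2.x/3.x ones; Hadoop 1.x uses fs.default.name and dfs.name.dir instead. In an Ansible role these files would normally be templates with the values filled in from the role's variables; here the shell does the substitution.

```shell
set -eu
workdir=$(mktemp -d)

master_ip="192.0.2.10"   # placeholder address of the master (NameNode)
namenode_port=9000       # hypothetical port
name_dir="/nn"           # hypothetical NameNode metadata directory

# core-site.xml: tells every node (master, slaves, client) where the NameNode is.
cat > "$workdir/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://${master_ip}:${namenode_port}</value>
  </property>
</configuration>
EOF

# hdfs-site.xml on the master: where the NameNode keeps its metadata.
# On a slave the property would be dfs.datanode.data.dir instead.
cat > "$workdir/hdfs-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>${name_dir}</value>
  </property>
</configuration>
EOF

echo "wrote core-site.xml and hdfs-site.xml to $workdir"
```

The client only needs core-site.xml, because it stores no blocks or metadata itself and just has to know where the NameNode lives.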
setup.yml
# ansible-playbook setup.yml --ask-vault-pass
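The setup.yml playbook was also shown as a screenshot. A playbook that ties the four roles together could look roughly like the sketch below; the host group names and the cred.yml vault file are assumptions, and the real playbook may differ.

```shell
set -eu
workdir=$(mktemp -d)

# One play per role. Group names and cred.yml are hypothetical.
cat > "$workdir/setup.yml" <<'EOF'
- hosts: localhost
  vars_files:
    - cred.yml
  roles:
    - ec2

- hosts: hadoop_master
  roles:
    - hadoop_master

- hosts: hadoop_slave
  roles:
    - hadoop_slave

- hosts: hadoop_client
  roles:
    - hadoop_client
EOF

# Count the plays in the generated playbook.
play_count=$(grep -c '^- hosts:' "$workdir/setup.yml")
echo "setup.yml defines $play_count plays"

# It would then be run from the workspace with:
#   ansible-playbook setup.yml --ask-vault-pass
```

Note that the later plays can only reach the new instances if the inventory knows about them by then, for example through a dynamic inventory or hosts registered during the first play.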

Thanks for reading. Hope this blog has given you some valuable inputs!!

Technology enhancement takes a journey of learning and exploring!! On my way to achieve and follow my own star!!