Created by Akanksha

Hadoop MapReduce Multi-node Cluster over AWS using Ansible Automation

In a MapReduce (MR) cluster, analysis work is passed back and forth between the Job Tracker node and the Task Tracker nodes, so an analysis program can run on distributed computing resources rather than on a single machine.

Hadoop Distributed Computing Cluster:

Fig 1. Hadoop Distributed Computing Cluster

Working of Job Tracker Node:

Fig 2. Flow Diagram of Hadoop Data Flow

Working of Task Tracker Node

How does Hadoop provide its internal sorting program?

How to set up the Job Tracker?

How to set up the Task Tracker?

Let me direct you to the practical part now.

Fig 3. Practical Setup of our HDFS and MR Cluster
# mkdir hadoop-ws
# cd hadoop-ws
# mkdir roles
Fig 4. ansible.cfg File
Format of the credentials file:
access_key: GUJGWDUYGUEWVVFEWGVFUYV
secret_key: huadub7635897^%&hdfqt57gvhg

Steps:

# mkdir role
# cd role
# ansible-galaxy init ec2
# ansible-galaxy init hadoop_master
# ansible-galaxy init hadoop_slave
# ansible-galaxy init hadoop_client
# ansible-galaxy init hadoop_jobtracker
# ansible-galaxy init hadoop_tasktracker
# cd role/ec2/tasks
# vim main.yml
Fig 5. Task File for EC2 Role
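To give a rough idea of what such a task file could contain, here is a minimal sketch that writes one from the shell. It assumes the amazon.aws collection is installed and uses its ec2_instance module; the original setup may well have used a different module. Every variable name (key_name, instance_type, ami_id, region, security_group, node_names) is hypothetical, and access_key and secret_key are assumed to come from the credentials file.

```shell
# Sketch only: write a hypothetical tasks file for the ec2 role.
mkdir -p role/ec2/tasks
cat > role/ec2/tasks/main.yml <<'EOF'
# All variables are assumptions: they would be defined in
# role/ec2/vars/main.yml or in the credentials file.
- name: Launch one EC2 instance per cluster node
  amazon.aws.ec2_instance:
    name: "{{ item }}"
    key_name: "{{ key_name }}"
    instance_type: "{{ instance_type }}"
    image_id: "{{ ami_id }}"
    region: "{{ region }}"
    security_group: "{{ security_group }}"
    access_key: "{{ access_key }}"
    secret_key: "{{ secret_key }}"
    state: running
    wait: true
  loop: "{{ node_names }}"
EOF
```

Looping over a list of node names keeps the task short while still giving each instance a recognisable Name tag.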
# cd role/ec2/vars
# vim main.yml
Fig 6. Variable File for EC2 Role
# cd role/hadoop_master/tasks
# vim main.yml
Fig 7. Tasks for Master Ansible Role
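As an illustration of the kind of tasks the master role needs, the sketch below writes a small task file. It assumes Hadoop 1.x and a JDK are already installed on the node and that the configuration directory is /etc/hadoop; the namenode_dir variable is hypothetical.

```shell
# Sketch only: write a hypothetical tasks file for the hadoop_master role.
mkdir -p role/hadoop_master/tasks
cat > role/hadoop_master/tasks/main.yml <<'EOF'
# Assumptions: Hadoop 1.x and a JDK are installed, config lives in /etc/hadoop,
# and namenode_dir is defined in role/hadoop_master/vars/main.yml.
- name: Create the NameNode metadata directory
  ansible.builtin.file:
    path: "{{ namenode_dir }}"
    state: directory

- name: Render the HDFS configuration files
  ansible.builtin.template:
    src: "{{ item }}.j2"
    dest: "/etc/hadoop/{{ item }}"
  loop:
    - core-site.xml
    - hdfs-site.xml

- name: Format the NameNode unless it was formatted before
  ansible.builtin.shell: echo Y | hadoop namenode -format
  args:
    creates: "{{ namenode_dir }}/current"

- name: Start the NameNode daemon
  ansible.builtin.command: hadoop-daemon.sh start namenode
EOF
```

The creates argument is there so that a second playbook run does not reformat a NameNode that already holds metadata.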
# cd role/hadoop_master/templates
# vim hdfs-site.xml.j2
Fig 8. hdfs-site.xml.j2 file
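A minimal sketch of such a template, assuming Hadoop 1.x, where dfs.name.dir tells the NameNode where to keep its metadata. The namenode_dir variable is a hypothetical name that would be set in the role's vars file.

```shell
# Sketch only: write a hypothetical hdfs-site.xml.j2 for the master role.
mkdir -p role/hadoop_master/templates
cat > role/hadoop_master/templates/hdfs-site.xml.j2 <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <!-- Hadoop 1.x property: directory holding the NameNode metadata -->
    <name>dfs.name.dir</name>
    <!-- namenode_dir is a hypothetical variable from vars/main.yml -->
    <value>{{ namenode_dir }}</value>
  </property>
</configuration>
EOF
```

On the slave side the same file would typically set dfs.data.dir instead, pointing at the DataNode storage directory.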
# cd role/hadoop_master/vars
# vim main.yml
Fig 9. Variable file for Hadoop-master Role
# cd role/hadoop_master/templates
# vim core-site.xml.j2
Fig 10. core-site.xml.j2 file
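Here is a small sketch of what a core-site template can look like on Hadoop 1.x, where fs.default.name carries the NameNode address. The namenode_host and hdfs_port variables are hypothetical names for illustration.

```shell
# Sketch only: write a hypothetical core-site.xml.j2 for the master role.
mkdir -p role/hadoop_master/templates
cat > role/hadoop_master/templates/core-site.xml.j2 <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <!-- Hadoop 1.x property: address of the NameNode -->
    <name>fs.default.name</name>
    <!-- namenode_host and hdfs_port are hypothetical variables -->
    <value>hdfs://{{ namenode_host }}:{{ hdfs_port }}</value>
  </property>
</configuration>
EOF
```

Keeping the host and port in variables lets the slave, client and job tracker roles reuse the same template with the master's address filled in.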
# cd role/hadoop_slave/tasks
# vim main.yml
Fig 11. Hadoop-Slave Task File
# cd role/hadoop_slave/vars
# vim main.yml
Fig 12. Hadoop-Slave Variable File
# cd role/hadoop_slave/templates
# vim core-site.xml.j2
Fig 13. core-site.xml.j2 File
# cd role/hadoop_slave/templates
# vim hdfs-site.xml.j2
Fig 14. hdfs-site.xml.j2 File
# cd role/hadoop_jobtracker/tasks
# vim main.yml
Fig 15. Task File for Job Tracker Role
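The job tracker role follows the same pattern as the master role: render the configuration, then start the daemon. The sketch below is one possible version, assuming Hadoop 1.x is already installed and that /etc/hadoop is the configuration directory (both assumptions).

```shell
# Sketch only: write a hypothetical tasks file for the hadoop_jobtracker role.
mkdir -p role/hadoop_jobtracker/tasks
cat > role/hadoop_jobtracker/tasks/main.yml <<'EOF'
# Assumptions: Hadoop 1.x is installed and its config lives in /etc/hadoop.
- name: Render the Job Tracker configuration files
  ansible.builtin.template:
    src: "{{ item }}.j2"
    dest: "/etc/hadoop/{{ item }}"
  loop:
    - core-site.xml
    - mapred-site.xml

- name: Start the Job Tracker daemon
  ansible.builtin.command: hadoop-daemon.sh start jobtracker
EOF
```

A task tracker role could look almost identical, with the last task running hadoop-daemon.sh start tasktracker instead.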
# cd role/hadoop_jobtracker/vars
# vim main.yml
Fig 16. Variable File for Job Tracker Role
# cd role/hadoop_jobtracker/templates
# vim core-site.xml.j2
Fig 17. core-site.xml.j2 File
# cd role/hadoop_jobtracker/templates
# vim mapred-site.xml.j2
Fig 18. mapred-site.xml.j2 File
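For reference, a minimal sketch of such a template on Hadoop 1.x, where mapred.job.tracker holds the host and port of the Job Tracker. The jobtracker_host and jobtracker_port variables are hypothetical names that would come from the role's vars file.

```shell
# Sketch only: write a hypothetical mapred-site.xml.j2 for the job tracker role.
mkdir -p role/hadoop_jobtracker/templates
cat > role/hadoop_jobtracker/templates/mapred-site.xml.j2 <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <!-- Hadoop 1.x property: host:port of the Job Tracker -->
    <name>mapred.job.tracker</name>
    <!-- jobtracker_host and jobtracker_port are hypothetical variables -->
    <value>{{ jobtracker_host }}:{{ jobtracker_port }}</value>
  </property>
</configuration>
EOF
```

Task trackers and clients read the same property to find the Job Tracker, so their templates would point it at the Job Tracker's address.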
# cd role/hadoop_tasktracker/tasks
# vim main.yml
Fig 19. Task File for Task Tracker Role
# cd role/hadoop_tasktracker/vars
# vim main.yml
Fig 20. Variable File for Task Tracker Role
# cd role/hadoop_tasktracker/templates
# vim mapred-site.xml.j2
Fig 21. mapred-site.xml.j2 File
# cd role/hadoop_client/tasks
# vim main.yml
Fig 22. Tasks for Hadoop Client Role
# cd role/hadoop_client/vars
# vim main.yml
Fig 23. Variable File for Hadoop Client Role
# cd role/hadoop_client/templates
# vim core-site.xml.j2
Fig 24. core-site.xml.j2 File
# cd role/hadoop_client/templates
# vim mapred-site.xml.j2
Fig 25. mapred-site.xml.j2 File
Fig 26. setup.yml File
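One way to tie the six roles together is a playbook with one play per role. The sketch below is only an assumption about the layout: cred.yml is a hypothetical name for the encrypted credentials file, the inventory group names are made up for illustration, and the groups are assumed to be filled in (for example by a dynamic inventory) once the instances are up.

```shell
# Sketch only: write a hypothetical setup.yml that applies each role to its hosts.
cat > setup.yml <<'EOF'
- hosts: localhost
  vars_files:
    - cred.yml            # hypothetical name of the encrypted credentials file
  roles:
    - ec2

# The group names below are hypothetical inventory groups.
- hosts: hadoop_master
  roles:
    - hadoop_master

- hosts: hadoop_slave
  roles:
    - hadoop_slave

- hosts: hadoop_jobtracker
  roles:
    - hadoop_jobtracker

- hosts: hadoop_tasktracker
  roles:
    - hadoop_tasktracker

- hosts: hadoop_client
  roles:
    - hadoop_client
EOF
```

The order matters here: the instances have to exist first, and the master and job tracker should be up before the slaves and task trackers try to reach them.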
# ansible-playbook setup.yml --ask-vault-pass

Thanks for reading. I hope this blog has given you some valuable input!

Technology enhancement takes a journey of learning and exploring! On my way to achieve and follow my own star!