Lately, several of our security-conscious clients have asked to install or upgrade their Hadoop distribution on cluster nodes that do not have access to the internet. In such cases the installation needs to be performed using local repositories. Since I could not find a step-by-step procedure to accomplish this, I thought I would publish one myself.

The procedure below was carried out with the following configuration and specifications:

Cloudera Manager Node: m3.large EC2 Instance running CentOS 6.5 (CentOS-6.5-GA-03.3-f4325b48-37b0-405a-9847-236c64622e3e-ami-6be4dc02.2 (ami-8997afe0))
Name Node: m3.large EC2 Instance running CentOS 6.5 (CentOS-6.5-GA-03.3-f4325b48-37b0-405a-9847-236c64622e3e-ami-6be4dc02.2 (ami-8997afe0))
Data Nodes (3): m3.large EC2 Instance running CentOS 6.5 (CentOS-6.5-GA-03.3-f4325b48-37b0-405a-9847-236c64622e3e-ami-6be4dc02.2 (ami-8997afe0))
Existing Version of Cloudera Manager: 5.4.3
Existing Version of CDH: 5.4.2
Upgrade to Version of Cloudera Manager: 5.5.0
Upgrade to Version of CDH: 5.5.0

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.6 (Santiago)
# cat /proc/version
Linux version 2.6.32-504.16.2.el6.x86_64 (mockbuild@x86-028.build.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-9) (GCC) ) #1 SMP Tue Mar 10 17:01:00 EDT 2015
Upgrade Steps
We will complete the upgrade of the cluster in two steps. In the first step, only Cloudera Manager will be upgraded to version 5.5.0. Once the cluster has been verified to be functional with Cloudera Manager 5.5.0, we will upgrade CDH to version 5.5.0.
1. Upgrade Cloudera Manager
1. Let's start by creating the local repository for Cloudera Manager. Download the Cloudera Manager 5.5 packages and the repository GPG key on the Local Repository Host:
# wget -r --no-parent --reject "index.html*" "https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5.5/"
# wget "https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/RPM-GPG-KEY-cloudera"
2. Copy the downloaded files to the pub/repos/cloudera-manager directory on the Local Repository Host. Then start a local web server with pub/repos as its root directory. You may use any web server, including Python's SimpleHTTPServer or Apache. The following steps use SimpleHTTPServer:
# cd pub/repos
# nohup python -m SimpleHTTPServer 8000 &
Expected output for http://Local Repository Host:8000/pub/repos/cloudera-manager
Expected output for http://Local Repository Host:8000/pub/repos/cloudera-manager/RPMS/x86_64
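Before pointing yum at the mirror, it can help to sanity-check the directory that the web server exposes. The sketch below is only an illustration: `check_repo_layout` is a hypothetical helper name, and it assumes the copied tree contains `repodata/repomd.xml` (which yum requires in any repository), the `RPM-GPG-KEY-cloudera` file, and package files under `RPMS/x86_64`.

```shell
#!/bin/sh
# Hypothetical helper: check that a directory looks like a usable yum mirror.
# $1 = path to the repository directory (for example pub/repos/cloudera-manager).
check_repo_layout() {
    repo_dir=$1
    # yum reads repository metadata from repodata/repomd.xml.
    if [ ! -f "$repo_dir/repodata/repomd.xml" ]; then
        echo "missing $repo_dir/repodata/repomd.xml" >&2
        return 1
    fi
    # The GPG key is referenced by the gpgkey= line of the repo file.
    if [ ! -f "$repo_dir/RPM-GPG-KEY-cloudera" ]; then
        echo "missing $repo_dir/RPM-GPG-KEY-cloudera" >&2
        return 1
    fi
    # Require at least one package file under RPMS/x86_64.
    if ! find "$repo_dir/RPMS/x86_64" -name '*.rpm' 2>/dev/null | grep -q .; then
        echo "no RPMs found under $repo_dir/RPMS/x86_64" >&2
        return 1
    fi
    return 0
}
```

It could be called as `check_repo_layout pub/repos/cloudera-manager` on the Local Repository Host. A non-zero exit status means one of the three pieces is missing.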
3. Make sure the yum repository file for Cloudera Manager points to the local repository:
$ cat /etc/yum.repos.d/cloudera-manager.repo
[cloudera-manager]
name=Cloudera Manager package mirror
baseurl=http://Local Repository Host:8000/pub/repos/cloudera-manager
gpgkey=http://Local Repository Host:8000/pub/repos/cloudera-manager/RPM-GPG-KEY-cloudera
gpgcheck=1
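If you have several nodes to prepare, the repo file can be generated instead of typed by hand. The following is a minimal sketch, not part of the original procedure: `write_cm_repo` is a hypothetical helper, and the host name in the usage comment is an assumed placeholder.

```shell
#!/bin/sh
# Hypothetical helper: write a yum .repo file that points at the local mirror.
# $1 = base URL of the Cloudera Manager mirror, $2 = output path.
write_cm_repo() {
    base_url=$1
    out_file=$2
    cat > "$out_file" <<EOF
[cloudera-manager]
name=Cloudera Manager package mirror
baseurl=${base_url}
gpgkey=${base_url}/RPM-GPG-KEY-cloudera
gpgcheck=1
EOF
}

# Example with an assumed host name:
# write_cm_repo "http://repo-host.example.com:8000/pub/repos/cloudera-manager" \
#     /etc/yum.repos.d/cloudera-manager.repo
```

Deriving the gpgkey URL from the base URL keeps the two lines from drifting apart. That only holds if the key sits at the top of the repository directory, as it does in this post.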
4. Log on to Cloudera Manager. Stop Cloudera Management Service:
5. Make sure all services are stopped. Sample screens after stopping below:
6. Stop the Hadoop Cluster:
7. SSH to Cloudera Manager Server. Stop Cloudera Manager Service:
# sudo service cloudera-scm-server status
cloudera-scm-server (pid 6963) is running...
# sudo service cloudera-scm-server stop
Stopping cloudera-scm-server: [ OK ]
# sudo service cloudera-scm-server status
cloudera-scm-server is stopped
8. Before proceeding with the upgrade, make sure you back up the Cloudera Manager database and the databases used by CDH services such as Hive Metastore, Oozie and Sentry.
9. When you are ready, issue the following command to upgrade Cloudera Manager:
# yum upgrade cloudera-manager-server cloudera-manager-daemons
Make sure the Upgrade Version for Cloudera Manager is as below:
10. To verify if the upgrade is successful issue the following command:
# rpm -qa 'cloudera-manager-*'
cloudera-manager-daemons-5.5.0-1.cm550.p0.61.el6.x86_64
cloudera-manager-agent-5.5.0-1.cm550.p0.61.el6.x86_64
cloudera-manager-server-5.5.0-1.cm550.p0.61.el6.x86_64
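The same check can be scripted so that it fails loudly when a package was left behind. This is a rough sketch under stated assumptions: `all_packages_at_version` is a hypothetical helper that reads package names on standard input, one per line, in the format printed by `rpm -qa`.

```shell
#!/bin/sh
# Hypothetical helper: read package names (one per line, as printed by
# `rpm -qa 'cloudera-manager-*'`) on stdin and succeed only if at least one
# line was read and every line contains "-<expected version>-".
all_packages_at_version() {
    expected=$1
    awk -v v="$expected" '
        NF == 0 { next }                          # skip blank lines
        { seen++ }                                # count package lines
        index($0, "-" v "-") == 0 {               # version string not present
            bad++
            print "unexpected version: " $0 > "/dev/stderr"
        }
        END { exit (seen > 0 && bad == 0) ? 0 : 1 }
    '
}

# Example:
# rpm -qa 'cloudera-manager-*' | all_packages_at_version 5.5.0 && echo "all packages at 5.5.0"
```

The function treats empty input as a failure, since an empty package list usually means the query was run on the wrong host.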
11. Start Cloudera Manager:
# service cloudera-scm-server start
Starting cloudera-scm-server: [ OK ]
12. Monitor the Cloudera Manager Server Log for errors. The Cloudera Manager Server console is ready for use once you see the "Started Jetty Server" message in the log:
# tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log
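If you would rather not watch the log by hand, a small polling loop can wait for the message. The sketch below is an assumption-laden illustration: `wait_for_log_marker` is a hypothetical helper, the default timeout of 300 seconds is arbitrary, and the function cannot tell a fresh message from one left in the log by an earlier start.

```shell
#!/bin/sh
# Hypothetical helper: poll a log file once per second until a marker string
# appears or the timeout (in seconds) expires. Returns 0 if the marker is found.
# $1 = log file, $2 = marker text, $3 = timeout in seconds (default 300).
wait_for_log_marker() {
    log_file=$1
    marker=$2
    timeout=${3:-300}
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # -F treats the marker as a fixed string, not a regular expression.
        if grep -qF -- "$marker" "$log_file" 2>/dev/null; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}

# Example:
# wait_for_log_marker /var/log/cloudera-scm-server/cloudera-scm-server.log \
#     "Started Jetty Server" 600 && echo "Cloudera Manager console is up"
```

Because old messages may still be in the file, it is safer to note the log's line count before the restart and check only the newer lines if you need a strict guarantee.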
13. Log on to Cloudera Manager. You should now see the following screen. Note the running version:
14. Choose Option as below to upgrade Cloudera Manager Agents. Press Continue:
15. Choose Custom Repository:
In the first box add: http://Local Repository Host:8000/pub/repos/cloudera-manager
In the second box add: http://Local Repository Host:8000/pub/repos/cloudera-manager/RPM-GPG-KEY-cloudera
Press Continue.
16. Check JDK/Java options as below and press Continue:
17. Provide SSH credentials and Press Continue:
18. Cloudera Manager will now upgrade the Agents:
19. Verify Completion. Press Continue:
20. Inspect Hosts for Correctness. Press Continue:
21. You should now see a Confirmation Screen as below:
22. Upgrade Cloudera Management Service. Press Continue:
23. Confirm Restart of Cloudera Management Service:
24. Verify Cloudera Management Service restarted. Press Finish:
25. On the Cloudera Manager Home Screen, choose Deploy Client Configuration:
26. Verify Client Configurations Deployed:
27. Start the Cluster:
28. Verify Services on the Cluster are Active:
29. Verify Cloudera Manager Version:
30. Verify Agents Upgraded. Issue the following commands on all nodes:
# rpm -qa 'cloudera-manager-*'
cloudera-manager-daemons-5.5.0-1.cm550.p0.61.el6.x86_64
cloudera-manager-agent-5.5.0-1.cm550.p0.61.el6.x86_64
31. Congratulations. Upgrade of Cloudera Manager was successful:
2. Upgrade Cloudera Distribution
Now that Cloudera Manager has been upgraded, let's upgrade CDH to version 5.5.
1. Download the CDH 5.5.0 parcel, its checksum file and the manifest on the Local Repository Host:
wget https://archive.cloudera.com/cdh5/parcels/5.5/CDH-5.5.0-1.cdh5.5.0.p0.8-el6.parcel
wget https://archive.cloudera.com/cdh5/parcels/5.5/CDH-5.5.0-1.cdh5.5.0.p0.8-el6.parcel.sha1
wget https://archive.cloudera.com/cdh5/parcels/5.5/manifest.json
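Since the parcel is large and will be copied between hosts, it is worth comparing it against the downloaded checksum before publishing it. The following is a minimal sketch: `verify_sha1` is a hypothetical helper, and it assumes that the first whitespace-separated field of the `.sha1` file is the hex digest.

```shell
#!/bin/sh
# Hypothetical helper: compare a file's SHA-1 digest with the digest stored in
# a companion file. Assumes the first whitespace-separated field on the first
# line of the companion file is the hex digest.
# $1 = data file, $2 = checksum file.
verify_sha1() {
    data_file=$1
    sha_file=$2
    expected=$(awk 'NR == 1 { print tolower($1) }' "$sha_file") || return 1
    actual=$(sha1sum "$data_file" | awk '{ print tolower($1) }') || return 1
    # An empty expected digest is treated as a failure, not as a match.
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Example:
# verify_sha1 CDH-5.5.0-1.cdh5.5.0.p0.8-el6.parcel \
#     CDH-5.5.0-1.cdh5.5.0.p0.8-el6.parcel.sha1 && echo "parcel checksum OK"
```

Both digests are lower-cased before comparison so that a difference in hex letter case does not cause a false mismatch.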
2. Create or refresh the local repository for CDH by copying the downloaded files to the pub/repos/cloudera-cdh5/ directory on the Local Repository Host.
Expected output for http://Local Repository Host:8000/pub/repos/cloudera-cdh5/
3. Back up HDFS metadata using the following command:
$ whoami
hdfs
$ hdfs dfsadmin -fetchImage ~
15/11/27 19:23:58 INFO namenode.TransferFsImage: Opening connection to https://ip-10-169-250-118.ec2.internal:50070/imagetransfer?getimage=1&txid=latest
15/11/27 19:23:58 INFO namenode.TransferFsImage: Image Transfer timeout configured to 60000 milliseconds
15/11/27 19:23:58 INFO namenode.TransferFsImage: Transfer took 0.09s at 2715.91 KB/s
$ ls -l
total 244
-rw-rw-r--. 1 hdfs hdfs 244838 Nov 27 19:23 fsimage_0000000000000015418
4. Back up the databases used by the various CDH services. The following screen shows the database details for services such as Oozie, HUE and Sentry:
5. Log on to Cloudera Manager.
6. Verify the parcel download setting is pointing to the local repository for CDH. Press the Parcels icon on the Cloudera Manager Home Page. Press Edit settings:
7. Choose the following Option to start upgrade of CDH:
8. Choose version 5.5:
9. Make sure you have backed up all databases:
10. The following screen indicates that we are all set to proceed. Press Continue:
11. CDH Version 5.5 parcels will now be downloaded, distributed to all nodes and unpacked. Press Continue:
12. Hosts will be inspected for correctness. Press Continue:
13. Verify that no one is using the HH-TEST cluster. Choose Full Cluster Restart. Press Continue:
14. The HH-TEST cluster will now be stopped, upgraded and restarted. Press Continue:
15. The confirmation screen should now show the upgraded version of CDH. Press Continue:
16. Review additional post-upgrade instructions. Press Finish.
17. Verify CDH version on Cloudera Manager Home Page:
18. Verify CDH version on back-end. SSH to any node in the cluster:
$ hadoop version
Hadoop 2.6.0-cdh5.5.0
Subversion https://github.com/cloudera/hadoop -r fd21232cef7b8c1f536965897ce20f50b83ee7b2
Compiled by jenkins on 2015-11-09T20:37Z
Compiled with protoc 2.5.0
From source with checksum 98e07176d1787150a6a9c087627562c
This command was run using /opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/jars/hadoop-common-2.6.0-cdh5.5.0.jar
$ hadoop fs -ls /
Found 3 items
drwxrwxr-x - solr solr 0 2015-11-26 20:58 /solr
drwxrwxrwt - hdfs supergroup 0 2015-11-27 02:29 /tmp
drwxr-xr-x - hdfs supergroup 0 2015-11-27 02:29 /user
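To run this check across many nodes, the CDH version can be pulled out of the `hadoop version` output and compared in a script. This is only a sketch: `cdh_version_from_hadoop` is a hypothetical helper, and it assumes the output has a line of the form `Hadoop <version>-cdh<version>`, as in the output shown in step 18.

```shell
#!/bin/sh
# Hypothetical helper: read `hadoop version` output on stdin and print the CDH
# version taken from the "Hadoop <x>-cdh<y>" line. Prints nothing if no such
# line is present.
cdh_version_from_hadoop() {
    sed -n 's/^Hadoop .*-cdh\([0-9][0-9.]*\).*$/\1/p' | head -n 1
}

# Example:
# [ "$(hadoop version | cdh_version_from_hadoop)" = "5.5.0" ] && echo "CDH 5.5.0 confirmed"
```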
19. This completes the upgrade of CDH: