Hadoop Installation on Monsoon

December 5, 2017 | Mike DeHart

Tags: SAP Vora | cluster | hadoop | hdp | hortonworks | monsoon

Description: This tutorial is offered as a ‘quick and dirty’ guide to installing a 3-node Hadoop cluster on SAP Monsoon. Most configurations are kept close to default, and as such this guide is ideal for development and testing environments. Ambari will run on nodes with as little as 8 GB RAM, but more is recommended, especially if the cluster will be used for testing. Generally, 3 medium (4 CPU / 16 GB) nodes are recommended. If you do not have access to space on Monsoon, clusters can be provisioned using another cloud platform (AWS, Azure, etc.), but that will not be covered in this tutorial. Due to some proxy constraints on Monsoon, I will note workarounds or additional steps related to Monsoon with the label ‘Monsoon only’.

Environment:
Nodes: 4 CPU / 16G RAM x 3
OS: SUSE 12 SP01
HortonWorks Ambari 2.6.0

Contents:
1. Setting up the cluster
2. Prerequisites
3. Ambari Installation
4. HDP Installation

1. Setting up the cluster

I will be using SAP’s Monsoon in order to provision the servers where Hadoop will be installed. Type: medium_4_16. In this instance I have one master node of type HANA: hana_4_32 (4 CPU / 32 GB) and two medium nodes for my workers. HDP can support SUSE (64-bit) 11 and 12. All nodes are running SuSE 12 SP01.
We’ll be using PuTTY to connect to our nodes. Make sure all nodes are running the same operating system and patch level, and make sure your SSH key is provisioned for these nodes. Under your user profile, create an SSH key if one is not already created and click “Provision.” If you are creating a new key, be sure to save your private key to a local file, as there is no way to get it back later! Save your private key to your local machine.

You will need only the putty.exe client and puttygen.exe; we’ll use puttygen.exe to convert our private key to a format PuTTY can read.

From your local user folder, create a folder named ‘.ssh’. On Windows this will likely have to be done through the command line, as Explorer doesn’t like to create folders that begin with a period:

mkdir "C:\Users\<username>\.ssh"

The new .ssh folder is where we’ll store our private key.

Launch puttygen.exe. Click on the Load button and select your saved private key (if your key isn’t listed, make sure All Files is selected in the loading screen). Once loaded, select Save Private Key, select Yes when prompted to save the key without a passphrase, and save it to a file in your .ssh directory. The new file is saved as a .ppk file and is what we’ll use with PuTTY to connect to our servers.

Once saved, launch putty.exe. On the main page, specify the hostname for one of the nodes. Under Connection > Data, specify your username under Auto-login username. Under Connection > SSH > Auth, make sure Allow agent forwarding is checked and, under Private Key for Authentication, browse to your puttygen-created private key file. Add any other customization you want (appearance, selection behavior, etc.), then navigate back to the Session page, give your profile a name, and click Save. Once the profile is saved, click Open to connect to the node. Once the connection is successful, repeat the above process for the other two nodes.

2. Prerequisites

Now that we can connect via SSH to all three nodes,
we will do a quick update and create our administration user.

On all three nodes we’ll first do an update to make sure we’re running the latest version:

> sudo su
# zypper update -t patch

Monsoon instance users are pre-configured with passwordless sudo access as long as your user is part of the sysadmin group on the server. This means issuing a sudo su command will allow you to run as root. Since we already have a sysadmin group with password-less sudo access, we need only create a new user and make sure it is added to the sysadmin group (as well as any other groups you may need). We’ll use this user to connect to our servers when installing our Ambari cluster; as such, this user needs password-less connectivity to all nodes as well as sudo access. I’m naming my user cadmin (cluster admin):

# /usr/sbin/useradd -m -g users -G sysadmin,monsoon cadmin

Create this user on all three nodes in the cluster.

Next, we’ll create an RSA key for the new cadmin user to allow key-based SSH authentication. From your primary node, log in as the cadmin user and run ssh-keygen to create this RSA key:

> sudo su cadmin
> ssh-keygen -t rsa
[ENTER]
[ENTER]

Two files are created in the user’s .ssh folder (/home/cadmin/.ssh/):

id_rsa – this is the private key. We’ll need this during Ambari installation, so save it to notepad so we can access it quickly later.

id_rsa.pub – this is the public key. We’ll need to add this to an authorized_keys file on all nodes in order for cadmin to connect using the private key. The authorized_keys file is located in the user’s .ssh folder and is read whenever that user tries to connect to the server.

As such, we’ll first copy the public key into the file on our main node:

> cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys

Save this public key to notepad as well, since we will add it to the other two nodes under the cadmin home directory. Connect to your other nodes via PuTTY and run:

> sudo su cadmin
> echo "XXXXXX" > ~/.ssh/authorized_keys

Where “XXXXXX” is the cadmin public key saved in the step above. You can run cat on the file to make sure it was written correctly:

> cat ~/.ssh/authorized_keys
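Put together, the key steps above can be sketched as a single script. This is only a sketch, run against a throwaway mktemp directory (on the real nodes the directory is /home/cadmin/.ssh), and it appends to authorized_keys rather than overwriting it, which is the safer habit when a node may already hold other keys:

```shell
# Sketch of the cadmin key setup, against a temporary directory.
# On the real nodes, replace $SSH_DIR with /home/cadmin/.ssh.
SSH_DIR="$(mktemp -d)"

# Generate the RSA keypair non-interactively; -N "" is the empty
# passphrase (the two [ENTER]s above) and -f sets the output path.
ssh-keygen -t rsa -N "" -f "$SSH_DIR/id_rsa" -q

# Install the public key. '>>' appends, so any keys already present
# survive; sshd also refuses keys when the file is too permissive,
# hence the chmod.
cat "$SSH_DIR/id_rsa.pub" >> "$SSH_DIR/authorized_keys"
chmod 600 "$SSH_DIR/authorized_keys"

# Verify the key landed in authorized_keys:
grep -c 'ssh-rsa' "$SSH_DIR/authorized_keys"
# prints: 1
```

The same grep against /home/cadmin/.ssh/authorized_keys is a quick check on each worker node after pasting the key in.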
Now we can test to make sure cadmin is able to connect. From your primary node (where you first generated the keypair) run:

> ssh cadmin@<NODE-2>

Where NODE-2 is the hostname of one of your worker nodes. You may get a prompt regarding the authenticity of the host; answer ‘yes’ and you should be connected. If the key was not accepted, or you are prompted for a password, double-check that the public key is listed in the authorized_keys file and try troubleshooting via this link.

3. Ambari Installation

Assuming we now have a working cadmin user, in this section we’ll add the Ambari repository and install the cluster manager. The Ambari manager will only be installed on our primary (master) node, so the steps below only need to be applied once.

First, see this HortonWorks Ambari Repositories page and copy the “Repo File” link for your flavor of OS. In my case, for SLES 12, my link is:

http://public-repo-1.hortonworks.com/ambari/sles12/2.x/updates/2.6.0.0/ambari.repo

Connect via SSH to your primary node, if you aren’t already, and issue the following:

> sudo su
# cd /etc/zypp/repos.d
# wget http://public-repo-1.hortonworks.com/ambari/sles12/2.x/updates/2.6.0.0/ambari.repo
# zypper ref

This will add the ambari repository to the zypper package manager and refresh the repository list. You should see a line after the refresh pulling packages from the ‘ambari Version – ambari-2.6.0.0’ repository.
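If the refresh does not show the ambari repository, it can help to inspect the downloaded repo file before debugging zypper itself. The sketch below fabricates a minimal ambari.repo (the fields beyond the URL from this tutorial, such as gpgcheck and enabled, are assumptions about its contents) and pulls out the baseurl zypper will use; on the node you would run the same grep against /etc/zypp/repos.d/ambari.repo:

```shell
# Sketch: sanity-check an ambari.repo file before running 'zypper ref'.
# A minimal stand-in file is fabricated here for illustration.
REPO="$(mktemp)"
cat > "$REPO" <<'EOF'
[ambari-2.6.0.0]
name=ambari Version - ambari-2.6.0.0
baseurl=http://public-repo-1.hortonworks.com/ambari/sles12/2.x/updates/2.6.0.0
gpgcheck=1
enabled=1
EOF

# Pull out the base URL zypper will fetch packages from:
grep '^baseurl=' "$REPO" | cut -d= -f2-
```

If the printed URL is unreachable from the node (a curl or wget of it fails), the problem is the Monsoon proxy rather than zypper.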
Now we’ll install the ambari server:

# zypper install ambari-server

Once installed, run the below to set up Ambari (as root):

# ambari-server setup

Accept the defaults for all prompts.

****Monsoon Only****: Due to the built-in proxy, in some Monsoon instances ambari-server setup will be unable to get JDK 1.8 or the JCE policy files from the public internet. The setup output will hang after a prompt similar to:

Downloading JDK from http://public-repo-1.hortonworks.com/ARTIFACTS/jdk-8u112-linux-x64.tar.gz to /var/lib/ambari-server/resources/jdk-8u112-linux-x64.tar.gz

And again for the JCE Policy file:

Downloading JCE Policy archive from http://public-repo-1.hortonworks.com/ARTIFACTS/jce_policy-8.zip to /var/lib/ambari-server/resources/jce_policy-8.zip

The easiest workaround for this is to kill the setup process (Ctrl-C) and manually use curl or wget to download the files and save them to their respective directories. In this case, after killing the process, a simple wget command will use the correct OS proxy to obtain each file:

# wget http://public-repo-1.hortonworks.com/ARTIFACTS/jdk-8u112-linux-x64.tar.gz
# wget http://public-repo-1.hortonworks.com/ARTIFACTS/jce_policy-8.zip

Finally, re-run the setup command and both files should be picked up.

********

Once setup completes, restart the ambari server, and in the next section we will install the Hadoop services:

# /usr/sbin/ambari-server restart

4. HDP Installation

Now that we have the Ambari manager running, we can access the UI from the web via port 8080:

http://<node1>.mo.sap.corp:8080

Once the page loads, you can log in with the default credentials:

Username: admin
Password: admin

Once logged in, you can access the Users link on the left to change the admin password if desired. Otherwise, click on Launch Install Wizard to begin creating your Hadoop cluster. Enter a name for your cluster and click Next.

Make sure Use Public Repository is selected. If it is not, this may be due to a proxy issue (especially on Monsoon); see below.

****Monsoon Only****: By default, Ambari won’t be able to read the public repository until we update the proxy. Close the UI and stop the Ambari server:

> sudo ambari-server stop

We must add the proxy to /var/lib/ambari-server/ambari-env.sh. Open the file, and under AMBARI_JVM_ARGS add the following:

-Dhttp.proxyHost=<yourProxyHost> -Dhttp.proxyPort=<yourProxyPort>

To confirm your OS-level proxy you can issue:

> echo $http_proxy

which should provide the host and port to enter under AMBARI_JVM_ARGS. For more advanced proxy configurations, or proxies that require authentication, see the HortonWorks documentation.
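Turning the value of $http_proxy into the two JVM arguments is a simple string split. A sketch, using a made-up proxy value in place of the real Monsoon one:

```shell
# Sketch: derive -Dhttp.proxyHost / -Dhttp.proxyPort from $http_proxy.
# The proxy value below is hypothetical; on Monsoon, delete this line
# and use the real environment variable instead.
http_proxy="http://proxy.example.corp:8080"

hostport="${http_proxy#*://}"   # strip the scheme -> proxy.example.corp:8080
hostport="${hostport%/}"        # strip any trailing slash
host="${hostport%:*}"           # everything before the last ':'
port="${hostport##*:}"          # everything after the last ':'

echo "-Dhttp.proxyHost=$host -Dhttp.proxyPort=$port"
# -> -Dhttp.proxyHost=proxy.example.corp -Dhttp.proxyPort=8080
```

The printed string is exactly what goes on the AMBARI_JVM_ARGS line.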
Once added, save the file and restart the ambari server:

> sudo ambari-server start

********

Under Select Version, select your HDP version (in my case HDP 2.6) and click Next.

Under Install Options we need to enter the domains of all three of our hosts, as well as connectivity information (remember that cadmin private key I told you to save?). Add all three fully-qualified Monsoon domains to the Target Hosts text box and copy/paste the cadmin private key under Host Registration Information. Make sure to update the user from root to cadmin as well. Then click Register and Confirm to continue.

At this point, Ambari will connect to and provision the hosts in the cluster. If any errors occur, click the ‘Failed’ status to view the install log and troubleshoot further via the web. In my case, registration failed with an error:

<host> failed due to EOF occurred in violation of protocol (_ssl.c:661)

From a web search I was able to fix the issue by adding:

force_https_protocol=PROTOCOL_TLSv1_2

under the [security] section of /etc/ambari-agent/conf/ambari-agent.ini on all nodes.

Once all nodes succeed, you can see the results of all health checks and address any other warnings that may have been raised. When finished, click Next.

Here is where we choose the services for your Hadoop installation. Services chosen will differ depending on your needs; HDFS and YARN are required for the majority of Hadoop installations, and any prerequisites needed for the selected services will automatically be added. In my case I am using this cluster for Vora and Spark testing, so I’ve selected: HDFS, YARN + MR2, Tez, Hive, Pig, ZooKeeper, Spark, and Ambari Metrics. Once selected, click Next.

Next, we can assign services to their respective nodes. In most situations these can remain the defaults.

On the next page we can assign slaves and clients to our nodes. I assign Clients,
NodeManager, Spark servers, and DataNodes to all nodes. Generally, it is a good idea to assign more rather than less.

Next we have to configure all the services. There will be errors that need addressing, indicated by red circles. Most of these are easily fixed by taking out the directories starting with /home. With the exception of Hive, in most situations the configurations can remain the defaults.

Hive requires that we set up a database. For this we’ll use PostgreSQL. SSH to your master node and log in as root:

> sudo su
# zypper install postgresql-jdbc
# ls /usr/share/java/postgresql-jdbc.jar
# chmod 644 /usr/share/java/postgresql-jdbc.jar
# ambari-server setup --jdbc-db=postgres --jdbc-driver=/usr/share/java/postgresql-jdbc.jar

Now we need to log in to postgres and create our database and user / password. In this case we’re using ‘hive’ for all three:

# sudo su postgres
> psql
postgres=# create database hive;
postgres=# create user hive with password 'hive';
postgres=# grant all privileges on database hive to hive;
postgres=# \q
> exit

Now we just need to back up and update the pg_hba configuration file. As root:

# cp /var/lib/pgsql/data/pg_hba.conf /var/lib/pgsql/data/pg_hba.conf.bak
# vi /var/lib/pgsql/data/pg_hba.conf

Add hive to the list of users at the bottom of the file (so it reads hive,ambari,mapred). Save and exit with :wq, then restart postgres:

> sudo service postgresql restart

Now, back to the cluster setup. Select “Existing PostgreSQL Database” and make sure hive is set for the DB name, username, and password. Make sure the Database URL also correctly reflects the node where we installed and configured postgresql, and test the database connection. Once successful, click Next and the deployment should begin.

Similar to when we registered the hosts, the logs for any failures can be viewed by clicking on the respective “Failed” status. Possible errors are too vast to cover here, but web searches or searches of the HortonWorks forums will most likely provide answers.

Once all deployments are successful, click Next to access the Ambari dashboard and view your services. Any alerts can also be addressed and service customization can be configured. Congratulations!
You now have a deployed Hadoop cluster!
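As a first smoke test of the finished cluster, Ambari’s REST API can be queried for the cluster you just named. The sketch below is a dry run only: it builds and prints the curl command rather than executing it, since the URL placeholder and the default admin:admin credentials both come from the steps above:

```shell
# Sketch: a post-install check via Ambari's REST API (dry run).
# <node1> is the same placeholder used for the UI address above;
# change the credentials if you updated the admin password.
AMBARI_URL="http://<node1>.mo.sap.corp:8080"
CMD="curl -s -u admin:admin $AMBARI_URL/api/v1/clusters"
echo "$CMD"
```

Run the printed command on the master node; a JSON listing that includes your cluster name indicates the Ambari server and cluster registration are healthy.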