After you have created your virtual machine and installed Ambari, you can create your cluster.
1. Launch Install Wizard
I picked the cluster name “sandbox”.
Pick HDP 2.4
Note: You’ll need to ssh to the virtual machine as root to get the private key.
Be sure to use the fully qualified host name hdb.localdomain and the private key from the root account.
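If the root account on the VM doesn't already have a key pair for the wizard's host registration step, one way to set it up looks like the following. This is a sketch of the assumed workflow, not a step from the original walkthrough; on the VM you would use /root/.ssh directly, while a scratch directory is used here so the commands are safe to rerun anywhere.

```shell
# Sketch: create a key pair for Ambari's root SSH registration.
# On the VM, substitute /root/.ssh for the scratch directory.
KEYDIR=$(mktemp -d)
ssh-keygen -q -t rsa -N "" -f "$KEYDIR/id_rsa"
cat "$KEYDIR/id_rsa.pub" >> "$KEYDIR/authorized_keys"
chmod 600 "$KEYDIR/authorized_keys"
# The wizard's private key field takes the full contents of id_rsa:
cat "$KEYDIR/id_rsa"
```

Paste the printed private key into the wizard's SSH private key field when registering the host.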
Installation will take a while.
In this step, you can pick the services you want for your virtual machine. HAWQ only needs HDFS to run, but you can add more services. You can also use YARN with HAWQ for resource management. PXF also supports Hive and HBase, so install those services if you want to test that integration.
There is only a single host so this part is easy!
Under the HDFS tab in the Advanced hdfs-site section, make the following change:
Under the HDFS tab in the Custom hdfs-site section, make the following additions:
dfs.block.local-path-access.user=gpadmin
dfs.client.socket-timeout=300000000
dfs.client.use.legacy.blockreader.local=false
dfs.datanode.handler.count=60
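For reference, Ambari writes custom properties like these into hdfs-site.xml in the standard Hadoop XML form. A sketch of how the first two additions above would appear (you don't edit this file by hand; the wizard generates it):

```xml
<property>
  <name>dfs.block.local-path-access.user</name>
  <value>gpadmin</value>
</property>
<property>
  <name>dfs.client.socket-timeout</name>
  <value>300000000</value>
</property>
```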
Change the NameNode Server Threads value to 600.
Under the HDFS tab in the Advanced core-site section, make the following change:
Under the HDFS tab in the Custom core-site section, make the following additions:
Now click on the HAWQ tab and change “Segment Memory Usage Limit” to 4GB.
Click on the Advanced tab and set the HAWQ Master Port to 5432.
Set the HAWQ System User Password to “changeme”.
Click Next and proceed. You’ll be greeted with a warning screen, but proceed anyway.
Before clicking Deploy, add an entry to the /etc/hosts file in the VM. This step is required for HAWQ to install properly. It’s a “feature” of installing on a single node and isn’t a problem in a multi-node cluster. Use the IP address reported by ifconfig.
echo "192.168.175.135 hdb.localdomain" >> /etc/hosts