Automate your Hadoop Cluster (CDH) – Part 2
In Part 1, our goal was to install and configure Cloudera Manager and set up the repository from the command line.
In Part 2, our goal is to install and configure the CDH services using the CM API.
- Part 1: Install Cloudera Manager (cluster) like a boss.
- **Part 2: Add services and configure CDH using the API**
- Part 3: Secure your CDH Cloudera server with Kerberos
- Part 4: Configure TLS encryption on CDH services
Create CDH Services Using API
- Install jq
jq is a great tool for parsing and building JSON documents from the Unix shell. We will use jq to create JSON content.
For more information, see the jq documentation.
Use the following commands to install jq:
sudo yum install epel-release
sudo yum install jq
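Every later step depends on both curl and jq, so it can help to check for them up front. The following is a minimal sketch; `require_tools` is a hypothetical helper name, not something provided by Cloudera or jq.

```shell
# Hypothetical helper (not part of the original steps): verify that every
# tool named on the command line can be found on the PATH.
require_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing required tool: $tool" >&2
      missing=1
    fi
  done
  # 0 when everything was found, 1 when at least one tool is missing
  return "$missing"
}

# Example (not run here):
#   require_tools curl jq || exit 1
```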
- Start Trial
basicauth=admin:admin
apiURL=http://$(hostname -f):7180/api/v19
curl -i -X POST -u "${basicauth}" ${apiURL}/cm/trial/begin
- Register all Data node hosts with Cloudera Manager
basicauth=admin:admin
apiURL=http://$(hostname -f):7180/api/v19
export DATA_NODE_HOSTS='"data-1.example.com", "data-2.example.com", "data-3.example.com"'
PRIVATE_KEY=$(sed 's/$/\\n/' ~cloudera/.ssh/id_rsa | paste -sd '' -)
echo '{ "hostNames": ['${DATA_NODE_HOSTS}'],
"userName" : "cloudera", "privateKey":"'$PRIVATE_KEY'",
"unlimitedJCE":"true", "javaInstallStrategy":"NONE" }' > /tmp/hostInstall.json
curl -i -X POST -u "${basicauth}" -H "content-type:application/json" -d @'/tmp/hostInstall.json' ${apiURL}/cm/commands/hostInstall
The response should look like this:
HTTP/1.1 200 OK
Expires: Thu, 01-Jan-1970 00:00:00 GMT
Set-Cookie: CLOUDERA_MANAGER_SESSIONID=7j26djm83ly1usy2rl2sbj89;Path=/;HttpOnly
Content-Type: application/json
Date: Sat, 22 Dec 2018 19:23:05 GMT
Transfer-Encoding: chunked
Server: Jetty(6.1.26.cloudera.4)
{
"id" : 12,
"name" : "GlobalHostInstall",
"startTime" : "2018-12-22T19:23:05.009Z",
"active" : true,
"children" : {
"items" : [ ]
}
}
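The payload written to /tmp/hostInstall.json above is assembled with sed and string concatenation. As an alternative, here is a sketch that lets jq do the escaping. `build_host_install_json` is a hypothetical helper name; the field names and values are the ones used in the payload above.

```shell
# Hypothetical helper: build the hostInstall payload with jq so that quotes
# and newlines in the private key are escaped by jq.
# usage: build_host_install_json USER KEY_FILE HOST...
build_host_install_json() {
  local user=$1 key_file=$2
  shift 2
  # One host name per line -> JSON strings -> JSON array.
  # $(cat ...) drops the trailing newline of the key file, so the jq
  # program appends a single "\n" again.
  printf '%s\n' "$@" | jq -R . | jq -s \
    --arg user "$user" \
    --arg key "$(cat "$key_file")" \
    '{hostNames: ., userName: $user, privateKey: ($key + "\n"),
      unlimitedJCE: "true", javaInstallStrategy: "NONE"}'
}

# Example (not run here):
#   build_host_install_json cloudera ~cloudera/.ssh/id_rsa \
#     data-1.example.com data-2.example.com > /tmp/hostInstall.json
```

The resulting file can be posted with the same curl command as before.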
- Check for Installed hosts
basicauth=admin:admin
apiURL=http://$(hostname -f):7180/api/v19
curl -i -X GET -u "${basicauth}" -H "content-type:application/json" ${apiURL}/hosts |grep "hostname"
Wait until all hosts are installed.
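The wait can be scripted instead of re-running the check by hand. This is a rough sketch that counts the "hostname" lines from the same endpoint; `wait_for_hosts`, `MAX_TRIES` and `SLEEP_SECS` are hypothetical names, and a host showing up in the list is only assumed to mean that it is registered.

```shell
# Hypothetical helper: poll /hosts until at least EXPECTED hosts are listed.
# basicauth and apiURL are the variables set in the steps above.
wait_for_hosts() {
  local expected=$1 tries=0 count=0
  while [ "$tries" -lt "${MAX_TRIES:-60}" ]; do
    # Same filter as the manual check: one "hostname" line per host.
    # "|| true" keeps a zero count from being treated as an error.
    count=$(curl -sS -u "${basicauth}" "${apiURL}/hosts" | grep -c '"hostname"' || true)
    if [ "$count" -ge "$expected" ]; then
      echo "registered hosts: $count"
      return 0
    fi
    tries=$((tries + 1))
    sleep "${SLEEP_SECS:-10}"
  done
  echo "timed out waiting for $expected hosts (saw $count)" >&2
  return 1
}

# Example (not run here):
#   wait_for_hosts 3
```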
- Install Cloudera Management Service
export basicauth=admin:admin
export apiURL=http://$(hostname -f):7180/api/v19
export POSTGRES_SERVER=cm.example.com
curl -i -X PUT -u "${basicauth}" -H "content-type:application/json" -d '{ "name": "mgmt" ,"displayName":"Cloudera Management Service"}' ${apiURL}/cm/service
curl -i -X PUT -u "${basicauth}" ${apiURL}/cm/service/autoAssignRoles
curl -i -X PUT -u "${basicauth}" ${apiURL}/cm/service/autoConfigure
curl -u "${basicauth}" \
-H "Content-Type: application/json" \
-X POST \
--data '{}' \
${apiURL}/cm/service/commands/restart
- Configure reporting database
curl -i -X PUT -u "${basicauth}" \
-H "content-type:application/json" \
-d '{ "items": [{"name": "headlamp_database_host", "value": "'${POSTGRES_SERVER}'"},
{"name": "headlamp_database_name", "value": "reportman"},
{"name": "headlamp_database_password", "value": "averycomplexpassword"},
{"name": "headlamp_database_user", "value": "reportman"},
{"name": "headlamp_database_type", "value": "postgresql"}
]}' \
${apiURL}/cm/service/roleConfigGroups/mgmt-REPORTSMANAGER-BASE/config
- Delete Navigator Entry
curl -X DELETE -u "${basicauth}" \
${apiURL}/cm/service/roles/$(curl -sS -X GET -u "${basicauth}" ${apiURL}/cm/service/roles | grep -B1 '"type" : "NAVIGATORMETASERVER"' | grep name | cut -d'"' -f4)
curl -X DELETE -u "${basicauth}" \
${apiURL}/cm/service/roles/$(curl -sS -X GET -u "${basicauth}" ${apiURL}/cm/service/roles | grep -B1 '"type" : "NAVIGATOR"' | grep name | cut -d'"' -f4)
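The grep -B1 lookup above only works while "name" is printed on the line directly before "type". A jq filter does not depend on line order. The sketch below assumes the roles response is an object with an `items` array whose entries carry `name` and `type`, which is what the grep pipeline also relies on; `role_names_by_type` is a hypothetical helper name.

```shell
# Hypothetical helper: read the JSON from GET ${apiURL}/cm/service/roles on
# stdin and print the names of all roles with exactly the given type.
role_names_by_type() {
  jq -r --arg type "$1" '.items[] | select(.type == $type) | .name'
}

# Example (not run here):
#   for t in NAVIGATORMETASERVER NAVIGATOR; do
#     curl -sS -u "${basicauth}" "${apiURL}/cm/service/roles" |
#       role_names_by_type "$t" |
#       while read -r role; do
#         curl -X DELETE -u "${basicauth}" "${apiURL}/cm/service/roles/${role}"
#       done
#   done
```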
- Create Service cluster and add hosts
export clusterName=cdhcluster-01
curl -X POST -u "${basicauth}" \
-H "content-type:application/json" \
-d '{ "items": [
{
"name": "'${clusterName}'",
"version": "CDH5",
"fullVersion":"5.13.1"
}
] }' \
${apiURL}/clusters
# Add hosts
hostIds=$(curl -sS -X GET -u ${basicauth} ${apiURL}/hosts |grep "hostId" | cut -d'"' -f4)
for hostid in ${hostIds};do
curl -X POST -u "${basicauth}" \
-H 'content-type:application/json' \
-d '{ "items": [ {"hostId": "'${hostid}'"} ]}' \
${apiURL}/clusters/${clusterName}/hosts
done
- Distribute CDH Parcel
# define a function
wait_for_parcel () {
wait_for=$1
service=$2
version=$3
while [ 1 ]
do
curl -sS -X GET -u "${basicauth}" ${apiURL}/clusters/${clusterName}/parcels/products/$service/versions/$version | grep '"stage" : "'$wait_for'"' && break
sleep 5
done
}
# Distribute parcel
service=CDH
PARCEL_VERSION=5.13.1-1.cdh5.13.1.p0.2
curl -i -X POST -u "${basicauth}" ${apiURL}/clusters/${clusterName}/parcels/products/${service}/versions/${PARCEL_VERSION}/commands/startDistribution
# Check Status until DISTRIBUTED
wait_for_parcel DISTRIBUTED ${service} ${PARCEL_VERSION}
# Activate
curl -i -X POST -u "${basicauth}" ${apiURL}/clusters/${clusterName}/parcels/products/${service}/versions/${PARCEL_VERSION}/commands/activate
wait_for_parcel ACTIVATED $service $PARCEL_VERSION
- Distribute KAFKA Parcel
# define a function
wait_for_parcel () {
wait_for=$1
service=$2
version=$3
while [ 1 ]
do
curl -sS -X GET -u "${basicauth}" ${apiURL}/clusters/${clusterName}/parcels/products/$service/versions/$version | grep '"stage" : "'$wait_for'"' && break
sleep 5
done
}
# Distribute parcel
service=KAFKA
PARCEL_VERSION=3.1.0-1.3.1.0.p0.35
curl -i -X POST -u "${basicauth}" ${apiURL}/clusters/${clusterName}/parcels/products/${service}/versions/${PARCEL_VERSION}/commands/startDistribution
# Check Status until DISTRIBUTED
wait_for_parcel DISTRIBUTED ${service} ${PARCEL_VERSION}
# Activate
curl -i -X POST -u "${basicauth}" ${apiURL}/clusters/${clusterName}/parcels/products/${service}/versions/${PARCEL_VERSION}/commands/activate
wait_for_parcel ACTIVATED $service $PARCEL_VERSION
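The CDH and KAFKA steps repeat the same distribute, wait, activate, wait sequence. Here is one way to fold it into a single call with a bounded wait. All function names plus `MAX_TRIES` and `SLEEP_SECS` are hypothetical, and the stage parsing assumes the pretty-printed `"stage" : "..."` format that the grep above matches.

```shell
# Hypothetical helpers; basicauth, apiURL and clusterName are set above.
parcel_url() {
  echo "${apiURL}/clusters/${clusterName}/parcels/products/$1/versions/$2"
}

# Print the current stage of a parcel, for example DISTRIBUTED.
parcel_stage() {
  curl -sS -u "${basicauth}" "$(parcel_url "$1" "$2")" |
    sed -n 's/.*"stage" *: *"\([A-Z_]*\)".*/\1/p' | head -n 1
}

# Poll until the parcel reaches the wanted stage; give up after MAX_TRIES.
wait_for_stage() {
  local want=$1 product=$2 version=$3 tries=0
  while [ "$tries" -lt "${MAX_TRIES:-120}" ]; do
    if [ "$(parcel_stage "$product" "$version")" = "$want" ]; then
      return 0
    fi
    tries=$((tries + 1))
    sleep "${SLEEP_SECS:-5}"
  done
  echo "parcel $product $version never reached $want" >&2
  return 1
}

# Distribute, wait, activate, wait: the same order as the manual steps.
install_parcel() {
  local product=$1 version=$2 url
  url=$(parcel_url "$product" "$version")
  curl -sS -X POST -u "${basicauth}" "${url}/commands/startDistribution" >/dev/null &&
    wait_for_stage DISTRIBUTED "$product" "$version" &&
    curl -sS -X POST -u "${basicauth}" "${url}/commands/activate" >/dev/null &&
    wait_for_stage ACTIVATED "$product" "$version"
}

# Example (not run here):
#   install_parcel CDH 5.13.1-1.cdh5.13.1.p0.2
#   install_parcel KAFKA 3.1.0-1.3.1.0.p0.35
```

Unlike the endless `while [ 1 ]` loop, this version returns a non-zero status when the stage is not reached in time, so a calling script can stop.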
- Create CDH Services
export basicauth=admin:admin
export apiURL=http://$(hostname -f):7180/api/v19
export clusterName=cdhcluster-01
curl -X POST -u "${basicauth}" \
-H "content-type:application/json" \
-d '{ "items": [ {"name": "zookeeper", "type": "ZOOKEEPER","displayName":"Zookeeper"}, {"name": "yarn", "type": "YARN","displayName":"YARN (MR2 Included)"}, {"name": "hdfs", "type": "HDFS","displayName":"HDFS"}, {"name": "kafka", "type": "KAFKA","displayName":"Kafka"}, {"name": "hbase", "type": "HBASE","displayName":"HBase"} ] }' \
${apiURL}/clusters/${clusterName}/services
Configure CDH Services Using API
- HDFS
curl -i -u "${basicauth}" \
-H "Content-Type: application/json" -i \
-X PUT \
--data '{
"items" : [ {
"name" : "zookeeper_service",
"value" : "zookeeper",
"sensitive" : false
} ]
}' \
${apiURL}/clusters/${clusterName}/services/hdfs/config
- YARN
curl -i -u "${basicauth}" \
-H "Content-Type: application/json" -i \
-X PUT \
--data '{
"items" : [ {
"name" : "hdfs_service",
"value" : "hdfs",
"sensitive" : false
}, {
"name" : "zookeeper_service",
"value" : "zookeeper",
"sensitive" : false
} ]
}' \
${apiURL}/clusters/${clusterName}/services/yarn/config
- KAFKA
curl -i -u "${basicauth}" \
-H "Content-Type: application/json" -i \
-X PUT \
--data '{
"items" : [ {
"name" : "zookeeper_service",
"value" : "zookeeper",
"sensitive" : false
} ]
}' \
${apiURL}/clusters/${clusterName}/services/kafka/config
- HBASE
curl -i -u "${basicauth}" \
-H "Content-Type: application/json" -i \
-X PUT \
--data '{
"items" : [ {
"name" : "hdfs_service",
"value" : "hdfs",
"sensitive" : false
}, {
"name" : "zookeeper_service",
"value" : "zookeeper",
"sensitive" : false
} ]
}' \
${apiURL}/clusters/${clusterName}/services/hbase/config
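The four config calls above differ only in their name/value pairs, so the JSON body can be generated with jq. A minimal sketch follows; `config_items` is a hypothetical helper name, and it leaves out the "sensitive" field, just like the reporting database call earlier does.

```shell
# Hypothetical helper: turn name=value arguments into the {"items": [...]}
# body used by the config endpoints. Only the first "=" separates the name
# from the value, so values may contain "=" themselves.
config_items() {
  printf '%s\n' "$@" |
    jq -R 'split("=") | {name: .[0], value: (.[1:] | join("="))}' |
    jq -s -c '{items: .}'
}

# Example (not run here):
#   curl -i -X PUT -u "${basicauth}" -H "Content-Type: application/json" \
#     --data "$(config_items hdfs_service=hdfs zookeeper_service=zookeeper)" \
#     ${apiURL}/clusters/${clusterName}/services/yarn/config
```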
Create CDH services Roles
- Generate hostIds
export basicauth=admin:admin
export apiURL=http://$(hostname -f):7180/api/v19
export clusterName=cdhcluster-01
# Save the Cloudera host IDs of all nodes
curl -X GET -u "${basicauth}" ${apiURL}/hosts | jq -r '.items[].hostId' > /tmp/all-hosts-id.txt
# Save the Cloudera host ID of the first node;
# it is used for roles that run on a single host, such as NAMENODE and NODEMANAGER
curl -X GET -u "${basicauth}" ${apiURL}/hosts | jq -r '.items[0].hostId' > /tmp/hosts-id-1.txt
- configure zookeeper roles
export serviceName="zookeeper"
export roleType="SERVER"
jq -R '.' /tmp/all-hosts-id.txt | jq -s '{items:map({type:"'$roleType'",hostRef:{hostId:.}})}' > /tmp/items-json-${serviceName}-${roleType}.txt
curl -i -X POST -u "${basicauth}" ${apiURL}/clusters/${clusterName}/services/${serviceName}/roles -H "content-type:application/json" -d @/tmp/items-json-${serviceName}-${roleType}.txt
- configure hdfs roles
export serviceName="hdfs"
export roleTypes="NAMENODE SECONDARYNAMENODE BALANCER"
for roleType in ${roleTypes};do
jq -R '.' /tmp/hosts-id-1.txt | jq -s '{items:map({type:"'$roleType'",hostRef:{hostId:.}})}' > /tmp/items-json-${serviceName}-${roleType}.txt
curl -i -X POST -u "${basicauth}" ${apiURL}/clusters/${clusterName}/services/${serviceName}/roles -H "content-type:application/json" -d @/tmp/items-json-${serviceName}-${roleType}.txt
done
export roleTypes="DATANODE"
for roleType in ${roleTypes};do
jq -R '.' /tmp/all-hosts-id.txt | jq -s '{items:map({type:"'$roleType'",hostRef:{hostId:.}})}' > /tmp/items-json-${serviceName}-${roleType}.txt
curl -i -X POST -u "${basicauth}" ${apiURL}/clusters/${clusterName}/services/${serviceName}/roles -H "content-type:application/json" -d @/tmp/items-json-${serviceName}-${roleType}.txt
done
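The jq and curl pair is repeated for every role type, so it can be wrapped in two small functions. This is only a sketch; `role_items_json` and `add_roles` are hypothetical names. Passing the role type with `--arg` avoids splicing a shell variable into the jq program, and `-d @-` makes curl read the body from stdin, so no temporary file is needed.

```shell
# Hypothetical helper: turn a file with one hostId per line into the body
# for the .../roles endpoint.
role_items_json() {
  local role_type=$1 host_file=$2
  jq -R . "$host_file" |
    jq -s -c --arg type "$role_type" \
      '{items: map({type: $type, hostRef: {hostId: .}})}'
}

# Hypothetical helper: post that body for one service and one role type.
# basicauth, apiURL and clusterName are the variables exported above.
add_roles() {
  local service=$1 role_type=$2 host_file=$3
  role_items_json "$role_type" "$host_file" |
    curl -sS -X POST -u "${basicauth}" -H "content-type:application/json" \
      -d @- "${apiURL}/clusters/${clusterName}/services/${service}/roles"
}

# Example (not run here):
#   add_roles hdfs NAMENODE /tmp/hosts-id-1.txt
#   add_roles hdfs DATANODE /tmp/all-hosts-id.txt
```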
- configure hdfs directories
curl -i -u "${basicauth}" \
-H "Content-Type: application/json" -i \
-X PUT \
--data '{
"items" : [ {
"name" : "dfs_name_dir_list",
"value" : "/dfs/nn",
"sensitive" : false
} ]
}' \
${apiURL}/clusters/${clusterName}/services/hdfs/roleConfigGroups/hdfs-NAMENODE-BASE/config
curl -i -u "${basicauth}" \
-H "Content-Type: application/json" -i \
-X PUT \
--data '{
"items" : [ {
"name" : "fs_checkpoint_dir_list",
"value" : "/dfs/snn",
"sensitive" : false
}
]
}
' \
${apiURL}/clusters/${clusterName}/services/hdfs/roleConfigGroups/hdfs-SECONDARYNAMENODE-BASE/config
curl -i -u "${basicauth}" \
-H "Content-Type: application/json" -i \
-X PUT \
--data '{
"items" : [ {
"name" : "dfs_data_dir_list",
"value" : "/dfs/dn",
"sensitive" : false
}
]
}' \
${apiURL}/clusters/${clusterName}/services/hdfs/roleConfigGroups/hdfs-DATANODE-BASE/config
- configure hbase roles
export serviceName="hbase"
export roleTypes="MASTER"
for roleType in ${roleTypes};do
jq -R '.' /tmp/hosts-id-1.txt | jq -s '{items:map({type:"'$roleType'",hostRef:{hostId:.}})}' > /tmp/items-json-${serviceName}-${roleType}.txt
curl -i -X POST -u "${basicauth}" ${apiURL}/clusters/${clusterName}/services/${serviceName}/roles -H "content-type:application/json" -d @/tmp/items-json-${serviceName}-${roleType}.txt
done
export roleTypes="REGIONSERVER"
for roleType in ${roleTypes};do
jq -R '.' /tmp/all-hosts-id.txt | jq -s '{items:map({type:"'$roleType'",hostRef:{hostId:.}})}' > /tmp/items-json-${serviceName}-${roleType}.txt
curl -i -X POST -u "${basicauth}" ${apiURL}/clusters/${clusterName}/services/${serviceName}/roles -H "content-type:application/json" -d @/tmp/items-json-${serviceName}-${roleType}.txt
done
- configure yarn roles
export serviceName="yarn"
export roleTypes="NODEMANAGER RESOURCEMANAGER JOBHISTORY"
for roleType in ${roleTypes};do
jq -R '.' /tmp/hosts-id-1.txt | jq -s '{items:map({type:"'$roleType'",hostRef:{hostId:.}})}' > /tmp/items-json-${serviceName}-${roleType}.txt
curl -i -X POST -u "${basicauth}" ${apiURL}/clusters/${clusterName}/services/${serviceName}/roles -H "content-type:application/json" -d @/tmp/items-json-${serviceName}-${roleType}.txt
done
- configure yarn directories
curl -i -u "${basicauth}" \
-H "Content-Type: application/json" -i \
-X PUT \
--data '{
"items" : [ {
"name" : "yarn_nodemanager_local_dirs",
"value" : "/yarn/nm",
"sensitive" : false
}, {
"name" : "yarn_nodemanager_log_dirs",
"value" : "/yarn/container-logs",
"sensitive" : false
}]
}' \
${apiURL}/clusters/${clusterName}/services/yarn/roleConfigGroups/yarn-NODEMANAGER-BASE/config
- configure kafka roles
export serviceName="kafka"
export roleTypes="KAFKA_BROKER"
for roleType in ${roleTypes};do
jq -R '.' /tmp/all-hosts-id.txt | jq -s '{items:map({type:"'$roleType'",hostRef:{hostId:.}})}' > /tmp/items-json-${serviceName}-${roleType}.txt
curl -i -X POST -u "${basicauth}" ${apiURL}/clusters/${clusterName}/services/${serviceName}/roles -H "content-type:application/json" -d @/tmp/items-json-${serviceName}-${roleType}.txt
done
Execute firstRun
- Deploy Client Config
curl -u ${basicauth} \
-H "Content-Type:application/json" \
-X POST \
-i ${apiURL}/clusters/${clusterName}/commands/deployClientConfig
The output is a JSON document that contains the command id. In this case it is 51, but it will vary.
{
"id" : 51,
"name" : "DeployClusterClientConfig",
"startTime" : "2018-12-25T06:35:00.200Z",
"active" : true,
"clusterRef" : {
"clusterName" : "cdhcluster-01"
}
}
Wait until the command has finished successfully:
"success" : true
The following command can be used to check the status:
export commandId=51
curl -i -X GET -u ${basicauth} ${apiURL}/commands/${commandId}
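The status check can be put in a loop so that a script blocks until the command is done. A minimal sketch, assuming the command JSON keeps the top-level "active" field shown above and gains "success" once it has finished; `wait_for_command`, `MAX_TRIES` and `SLEEP_SECS` are hypothetical names.

```shell
# Hypothetical helper: poll a command until it is no longer active.
# Returns 0 only when the finished command reports "success" : true.
wait_for_command() {
  local id=$1 tries=0 body
  while [ "$tries" -lt "${MAX_TRIES:-120}" ]; do
    body=$(curl -sS -u "${basicauth}" "${apiURL}/commands/${id}")
    if [ "$(printf '%s' "$body" | jq -r '.active')" = "false" ]; then
      if [ "$(printf '%s' "$body" | jq -r '.success')" = "true" ]; then
        return 0
      fi
      echo "command ${id} finished without success" >&2
      return 1
    fi
    tries=$((tries + 1))
    sleep "${SLEEP_SECS:-5}"
  done
  echo "command ${id} still active after ${tries} checks" >&2
  return 1
}

# Example (not run here):
#   wait_for_command 51 && echo "client config deployed"
```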
First Run
- zookeeper
export serviceName=zookeeper
curl -X POST -u ${basicauth} ${apiURL}/clusters/${clusterName}/services/${serviceName}/commands/firstRun
Wait until the command has finished successfully:
"success" : true
The following command can be used to check the status:
export commandId=<commandId>
curl -i -X GET -u ${basicauth} ${apiURL}/commands/${commandId}
- hdfs
export serviceName=hdfs
curl -X POST -u ${basicauth} ${apiURL}/clusters/${clusterName}/services/${serviceName}/commands/firstRun
Wait until the command has finished successfully:
"success" : true
The following command can be used to check the status:
export commandId=<commandId>
curl -i -X GET -u ${basicauth} ${apiURL}/commands/${commandId}
- hbase
export serviceName=hbase
curl -X POST -u ${basicauth} ${apiURL}/clusters/${clusterName}/services/${serviceName}/commands/firstRun
Wait until the command has finished successfully:
"success" : true
The following command can be used to check the status:
export commandId=<commandId>
curl -i -X GET -u ${basicauth} ${apiURL}/commands/${commandId}
- yarn
export serviceName=yarn
curl -X POST -u ${basicauth} ${apiURL}/clusters/${clusterName}/services/${serviceName}/commands/firstRun
Wait until the command has finished successfully:
"success" : true
The following command can be used to check the status:
export commandId=<commandId>
curl -i -X GET -u ${basicauth} ${apiURL}/commands/${commandId}
- kafka
export serviceName=kafka
curl -X POST -u ${basicauth} ${apiURL}/clusters/${clusterName}/services/${serviceName}/commands/firstRun
Wait until the command has finished successfully:
"success" : true
The following command can be used to check the status:
export commandId=<commandId>
curl -i -X GET -u ${basicauth} ${apiURL}/commands/${commandId}
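The five firstRun blocks differ only in the service name, and each needs a command id copied by hand. The sketch below reads the id from the response instead. It assumes firstRun returns a command object with an "id" field, like the deployClientConfig output shown earlier; `start_first_run` is a hypothetical helper name.

```shell
# Hypothetical helper: start firstRun for one service and print the id of
# the command it creates. basicauth, apiURL and clusterName are set above.
start_first_run() {
  curl -sS -X POST -u "${basicauth}" \
    "${apiURL}/clusters/${clusterName}/services/$1/commands/firstRun" |
    jq -r '.id'
}

# Example (not run here), using the same service order as above:
#   for svc in zookeeper hdfs hbase yarn kafka; do
#     commandId=$(start_first_run "$svc")
#     curl -sS -u "${basicauth}" "${apiURL}/commands/${commandId}"
#   done
```

In a real script, each service should finish its first run before the next one starts, so the status check inside the loop would need to be repeated until "success" : true appears.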
At this point we should have a working Cloudera cluster. Part 3 will cover the security configuration using the Cloudera CDH API.