REST API
Function Description
Use the HTTP REST API to view more information about MapReduce tasks. Currently, the MapReduce REST API can be used to query the status of completed tasks. For details about the API, see the official website:
Preparing the Running Environment
- Install the client, for example, to the /opt/client directory on the node. For details, see section "Installing a Client."
- Go to the client installation directory and run the following commands to configure the environment variables:
kinit service user
The kinit authentication is valid for 24 hours. If you run the sample more than 24 hours later, run the kinit command again.
- HTTPS-based access differs from HTTP-based access. Because HTTPS uses SSL encryption, the SSL protocol used by the curl command must also be supported by the cluster. If the cluster does not support that protocol, change the SSL protocol settings in the cluster. For example, if curl supports only the TLSv1 protocol, perform the following steps:
Log in to FusionInsight Manager and choose Cluster > Name of the desired cluster > Services > Yarn > Configurations > All Configurations. Search for hadoop.ssl.enabled.protocols in the search box and check whether the parameter value contains TLSv1. If it does not, add TLSv1 to the hadoop.ssl.enabled.protocols configuration item. Clear the value of ssl.server.exclude.cipher.list; otherwise, you cannot access Yarn using HTTPS. Click Save, and then click More > Restart Service to restart the service.
- The values of MapReduce configuration items hadoop.ssl.enabled.protocols and ssl.server.exclude.cipher.list directly reference the values of the corresponding configuration items in Yarn. Therefore, you need to change the values of the corresponding configuration items in Yarn and restart the Yarn and MapReduce services.
- TLSv1 has security vulnerabilities. Exercise caution when using it.
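The check on hadoop.ssl.enabled.protocols asks whether the value contains TLSv1. A plain substring search can mislead here, because TLSv1.1 and TLSv1.2 both contain the text TLSv1. The following is a minimal Python sketch of an exact-match check, assuming the parameter value is a comma-separated list of protocol names. The function name ensure_protocol is hypothetical, and the sketch only computes the new value; it does not change the cluster, so the value still has to be saved in FusionInsight Manager.

```python
from typing import Tuple


def ensure_protocol(enabled_protocols: str, required: str = "TLSv1") -> Tuple[str, bool]:
    """Return (value, added) for a protocol list.

    Assumption: the configuration value is a comma-separated list such as
    "TLSv1.1,TLSv1.2". Entries are compared exactly, so "TLSv1.2" does not
    count as "TLSv1".
    """
    # Split on commas and drop surrounding whitespace and empty entries.
    protocols = [p.strip() for p in enabled_protocols.split(",") if p.strip()]
    if required in protocols:
        return ",".join(protocols), False
    # Append the missing protocol at the end of the list.
    return ",".join(protocols + [required]), True


if __name__ == "__main__":
    value, added = ensure_protocol("TLSv1.1,TLSv1.2")
    print(value, "(TLSv1 added)" if added else "(unchanged)")
```

Because TLSv1 has known vulnerabilities, a helper like this is best used only to confirm what needs changing, not to enable the protocol automatically.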
Procedure
Obtain detailed information about tasks that have been completed on MapReduce.
- Command for the operation:
curl -k -i --negotiate -u : "https://10.120.85.2:26014/ws/v1/history/mapreduce/jobs"
In the preceding command, 10.120.85.2 is the value of JHS_FLOAT_IP for MapReduce, and 26014 is the port number of the JobHistoryServer node.
In RedHat 6.x and CentOS 6.x, a compatibility problem occurs when the curl command is used to access the JobHistoryServer. As a result, the correct result cannot be returned.
- You can view the status information about historical tasks, such as the task IDs, start time, end time, and task execution status.
- Execution result
{
    "jobs": {
        "job": [
            {
                "submitTime": 1525693184360,
                "startTime": 1525693194840,
                "finishTime": 1525693215540,
                "id": "job_1525686535456_0001",
                "name": "QuasiMonteCarlo",
                "queue": "default",
                "user": "mapred",
                "state": "SUCCEEDED",
                "mapsTotal": 1,
                "mapsCompleted": 1,
                "reducesTotal": 1,
                "reducesCompleted": 1
            }
        ]
    }
}
- Result analysis:
Using this API, you can query the completed MapReduce tasks in the current cluster and obtain the information listed in Table 1.
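To make the returned fields easier to read, the response can be post-processed. The following is a minimal Python sketch that parses the execution result shown above and derives the queue time and run time of each job. It relies on the time fields being milliseconds since the epoch. The function name summarize_jobs and the output field names are hypothetical, and SAMPLE is simply the response body from the example; in practice you would pass in the body returned by the curl command.

```python
import json
from datetime import datetime, timezone

# Response body copied from the execution result in this section.
SAMPLE = """
{ "jobs": { "job": [ {
    "submitTime": 1525693184360, "startTime": 1525693194840,
    "finishTime": 1525693215540, "id": "job_1525686535456_0001",
    "name": "QuasiMonteCarlo", "queue": "default", "user": "mapred",
    "state": "SUCCEEDED", "mapsTotal": 1, "mapsCompleted": 1,
    "reducesTotal": 1, "reducesCompleted": 1 } ] } }
"""


def summarize_jobs(response_text):
    """Return one summary dict per job in a history-server jobs response."""
    payload = json.loads(response_text)
    # Defensive: tolerate a missing or null "jobs"/"job" entry.
    jobs = (payload.get("jobs") or {}).get("job") or []
    rows = []
    for job in jobs:
        # Timestamps are milliseconds since the epoch; convert to UTC.
        started = datetime.fromtimestamp(job["startTime"] / 1000, tz=timezone.utc)
        rows.append({
            "id": job["id"],
            "state": job["state"],
            "started_utc": started.strftime("%Y-%m-%d %H:%M:%S"),
            "queued_ms": job["startTime"] - job["submitTime"],
            "run_ms": job["finishTime"] - job["startTime"],
        })
    return rows


if __name__ == "__main__":
    for row in summarize_jobs(SAMPLE):
        print(row["id"], row["state"], row["started_utc"],
              "queued %.2f s" % (row["queued_ms"] / 1000),
              "ran %.2f s" % (row["run_ms"] / 1000))
```

For the sample job, the sketch reports about 10.48 seconds between submission and start, and 20.7 seconds of execution.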