Updated on 2022-07-11 GMT+08:00

REST API

Function Description

Use the HTTP REST API to view more information about MapReduce tasks. Currently, the MapReduce REST API can be used to query the status of completed tasks. For details about the API, see the official website:

http://hadoop.apache.org/docs/r3.1.1/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/HistoryServerRest.html

Preparing the Running Environment

  1. Install the client on the node, for example, in the /opt/client directory. For details, see section "Installing a Client."
  2. Go to the client installation directory and run the following command to configure the environment variables:

    source bigdata_env

Procedure

Obtain detailed information about tasks that have been completed on MapReduce.

  • Command for the operation:
    curl -k -i --negotiate -u : "http://10.120.85.2:19888/ws/v1/history/mapreduce/jobs"

    In the preceding command, 10.120.85.2 is the value of JHS_FLOAT_IP for MapReduce, and 19888 is the port number of the JobHistoryServer node.

    On Red Hat 6.x and CentOS 6.x, a compatibility problem occurs when the curl command is used to access the JobHistoryServer, so the correct result is not returned.

  • You can view the status information about historical tasks, such as the task IDs, start time, end time, and task execution status.
  • Execution result:
    {
        "jobs":{
            "job":[
                {
                    "submitTime":1525693184360,
                    "startTime":1525693194840,
                    "finishTime":1525693215540,
                    "id":"job_1525686535456_0001",
                    "name":"QuasiMonteCarlo",
                    "queue":"default",
                    "user":"mapred",
                    "state":"SUCCEEDED",
                    "mapsTotal":1,
                    "mapsCompleted":1,
                    "reducesTotal":1,
                    "reducesCompleted":1
                }
            ]
        }
    }
  • Result analysis:

    Using this API, you can query the completed MapReduce tasks in the current cluster and obtain the information listed in Table 1.

    Table 1 Common information

    Parameter     Description
    submitTime    Time when a task is submitted
    startTime     Start time
    finishTime    End time
    queue         Task queue
    user          User who submits the task
    state         Task state, success or failure
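
The fields in Table 1 can also be read programmatically. The following is a minimal Python sketch, not part of the product, that parses a response body like the execution result above and reports the fields from Table 1 together with the queue wait time and run time of each task. It assumes that the time fields are milliseconds since the Unix epoch, as described in the Hadoop History Server REST API documentation. The names SAMPLE_RESPONSE, ms_to_utc, and summarize_jobs are hypothetical and used for illustration only. The input must be the JSON body alone, so run curl without the -i option (which adds the HTTP response headers to the output) before passing the result to this code.

```python
import json
from datetime import datetime, timezone

# Sample body copied from the execution result above (HTTP headers removed).
SAMPLE_RESPONSE = """
{
    "jobs": {
        "job": [
            {
                "submitTime": 1525693184360,
                "startTime": 1525693194840,
                "finishTime": 1525693215540,
                "id": "job_1525686535456_0001",
                "name": "QuasiMonteCarlo",
                "queue": "default",
                "user": "mapred",
                "state": "SUCCEEDED",
                "mapsTotal": 1,
                "mapsCompleted": 1,
                "reducesTotal": 1,
                "reducesCompleted": 1
            }
        ]
    }
}
"""


def ms_to_utc(ms):
    """Convert milliseconds since the Unix epoch to an aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def summarize_jobs(response_text):
    """Return one summary dict per task found in the JSON response body."""
    payload = json.loads(response_text)
    # Assumption: treat a missing or null "jobs" element as "no completed tasks".
    jobs = (payload.get("jobs") or {}).get("job") or []
    summaries = []
    for job in jobs:
        summaries.append({
            "id": job["id"],
            "user": job["user"],
            "queue": job["queue"],
            "state": job["state"],
            "submitted": ms_to_utc(job["submitTime"]).isoformat(),
            # Time spent waiting between submission and start, in seconds.
            "wait_seconds": (job["startTime"] - job["submitTime"]) / 1000,
            # Time between start and finish, in seconds.
            "run_seconds": (job["finishTime"] - job["startTime"]) / 1000,
        })
    return summaries


if __name__ == "__main__":
    for summary in summarize_jobs(SAMPLE_RESPONSE):
        print(summary)
```

The sketch reads only the fields shown in the execution result; a response from a real cluster may contain additional fields, which this code ignores.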