
Using DIS to Import Local Data to Elasticsearch

You can use Data Ingestion Service (DIS) to upload log data stored on a local Windows PC to a DIS stream, and then use Cloud Data Migration (CDM) to migrate the data to Elasticsearch in CSS. This lets you efficiently manage and query logs through Elasticsearch. The data files can be in JSON or CSV format.

Figure 1 shows the data transmission process.

Figure 1 Process of using DIS to import local data to Elasticsearch

Procedure

  1. Log in to the DIS management console.
  2. Purchase a DIS stream.

    For details, see Creating a DIS Stream in the Data Ingestion Service User Guide.

  3. Install and configure DIS Agent.

    For details, see Installing DIS Agent and Configuring DIS Agent in the Data Ingestion Service User Guide.
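
    DIS Agent reads its settings from an agent.yml configuration file. The following is a minimal sketch of such a file for this scenario; the key names and all values shown are illustrative assumptions, so verify them against the Data Ingestion Service User Guide before use.

    # agent.yml - illustrative sketch; confirm key names and values in the DIS documentation
    region: cn-north-1                  # region where the DIS stream resides (assumed)
    ak: YOUR_ACCESS_KEY                 # access key ID (placeholder)
    sk: YOUR_SECRET_KEY                 # secret access key (placeholder)
    projectId: YOUR_PROJECT_ID          # project ID for the region (placeholder)
    flows:
      - DISStream: dis-demo             # name of the DIS stream purchased in step 2 (assumed)
        filePattern: D:\logs\*.log      # local log files to collect (assumed Windows path)
        initialPosition: START_OF_FILE  # read existing file content from the beginning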

  4. Start DIS Agent and upload the collected local data to the DIS stream.

    For details, see Starting DIS Agent in the Data Ingestion Service User Guide.

    For example, use DIS Agent to upload the following data to the DIS stream:

    {"logName":"aaa","date":"bbb"}
    {"logName":"ccc","date":"ddd"}
    {"logName":"eee","date":"fff"}
    {"logName":"ggg","date":"hhh"}
    {"logName":"mmm","date":"nnn"}
  5. Log in to the CSS management console.
  6. In the navigation pane on the left, choose Clusters > Elasticsearch to switch to the Clusters page.
  7. From the cluster list, locate the row that contains the cluster to which you want to import data, and click Access Kibana in the Operation column.
  8. In the Kibana navigation pane on the left, choose Dev Tools.
  9. Run the following command on the console to check whether the cluster has indexes:
    GET _cat/indices?v
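
    The response lists the cluster's indexes, one per line. The following output is an illustrative sketch (the index name, UUID, and size figures are assumed):

    health status index     uuid                   pri rep docs.count docs.deleted store.size pri.store.size
    green  open   .kibana_1 cnMOIh2ZQPqLoZHzRn6nGA   1   0          2            0     20.6kb         20.6kb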

    If the cluster already contains an index into which you want to import the data, you do not need to create one. Go to step 11.

    If the cluster contains no such index, go to the next step to create one.

  10. On the Console page, run a command to create the index that will store the data, and specify a custom mapping that defines the field types.

    For example, run the following command to create an index named apache, with a custom mapping that analyzes logName as text using the ik_smart analyzer and stores date as a keyword. Mapping types (logs in the first example below) are required in versions earlier than 7.x but were removed in 7.x and later.

    Versions earlier than 7.x

    PUT /apache
    {
        "settings": {
            "number_of_shards": 1
        },
        "mappings": {
            "logs": {
                "properties": {
                    "logName": {
                        "type": "text",
                        "analyzer": "ik_smart"
                    },
                    "date": {
                        "type": "keyword"
                    }
                }
            }
        }
    }

    Versions 7.x and later

    PUT /apache
    {
        "settings": {
            "number_of_shards": 1
        },
        "mappings": {
            "properties": {
                "logName": {
                    "type": "text",
                    "analyzer": "ik_smart"
                },
                "date": {
                    "type": "keyword"
                }
            }
        }
    }

    If the following information is displayed, the index has been created successfully.

    {
      "acknowledged" : true,
      "shards_acknowledged" : true,
      "index" : "apache"
    }
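
    Optionally, verify that the mapping took effect by retrieving it with the standard get mapping API. The response (not shown here) should echo the logName and date field definitions:

    GET apache/_mapping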
  11. Log in to the CDM management console.
  12. Purchase a CDM cluster.

    For details, see Creating a CDM Cluster in the Cloud Data Migration User Guide.

  13. Create a link between CDM and CSS.

    For details, see Creating Links in the Cloud Data Migration User Guide.

  14. Create a link between CDM and DIS.

    For details, see Creating Links in the Cloud Data Migration User Guide.

  15. Create a job on the purchased CDM cluster to migrate the data from the DIS stream to the target cluster in CSS.

    For details, see Table/File Migration Jobs in the Cloud Data Migration User Guide.

  16. On the Console page of Kibana, search for the imported data.

    Run the following command to query the index. If the returned documents are consistent with the imported data, the import was successful.

    GET apache/_search

    If information similar to the following is displayed, the data has been imported successfully.

    {
      "took": 81,
      "timed_out": false,
      "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 5,
        "max_score": 1,
        "hits": [
          {
            "_index": "apache",
            "_type": "logs",
            "_id": "txfbqnEBPuwwWJWL-qvP",
            "_score": 1,
            "_source": {
              "date": """{"logName":"aaa"""",
              "logName": """"date":"bbb"}"""
            }
          },
          {
            "_index": "apache",
            "_type": "logs",
            "_id": "uBfbqnEBPuwwWJWL-qvP",
            "_score": 1,
            "_source": {
              "date": """{"logName":"ccc"""",
              "logName": """"date":"ddd"}"""
            }
          },
          {
            "_index": "apache",
            "_type": "logs",
            "_id": "uRfbqnEBPuwwWJWL-qvP",
            "_score": 1,
            "_source": {
              "date": """{"logName":"eee"""",
              "logName": """"date":"fff"}"""
            }
          },
          {
            "_index": "apache",
            "_type": "logs",
            "_id": "uhfbqnEBPuwwWJWL-qvP",
            "_score": 1,
            "_source": {
              "date": """{"logName":"ggg"""",
              "logName": """"date":"hhh"}"""
            }
          },
          {
            "_index": "apache",
            "_type": "logs",
            "_id": "uxfbqnEBPuwwWJWL-qvP",
            "_score": 1,
            "_source": {
              "date": """{"logName":"mmm"""",
              "logName": """"date":"nnn"}"""
            }
          }
        ]
      }
    }

    apache is the name of the index created in step 10. Change it based on site requirements.
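
    To verify a specific record rather than listing all documents, you can also run a standard match query. For example, the following query (values assumed from the sample data above) returns only documents whose logName matches aaa:

    GET apache/_search
    {
      "query": {
        "match": {
          "logName": "aaa"
        }
      }
    }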