Updated on 2024-05-29 GMT+08:00

Importing IoTDB Data

Scenario

This section describes how to use import-csv.sh to import data in CSV format to IoTDB.

Prerequisites

  • The client has been installed. For details, see . For example, the installation directory is /opt/client. The client directory in the following operations is only an example. Change it based on the actual installation directory onsite.
  • Service component users have been created by the MRS cluster administrator by referring to . In security mode, machine-machine users need to download the keytab file. For details, see . A human-machine user must change the password upon the first login.
  • By default, SSL is enabled on the server. You have generated the truststore.jks certificate by following the instructions provided in Using the IoTDB Client and copied it to the Client installation directory/IoTDB/iotdb/conf directory.

Procedure

  1. Prepare a CSV file named example-filename.csv on the local PC with the following content:
    Time,root.fit.d1.s1,root.fit.d1.s2,root.fit.d2.s1,root.fit.d2.s3,root.fit.p.s1
    1,100,hello,200,300,400
    2,500,world,600,700,800
    3,900,"hello, \"world\"",1000,1100,1200

    Before importing data, pay attention to the following:

    • The data to be imported cannot contain spaces. Otherwise, importing that line of data fails and is skipped, but subsequent import operations are not affected.
    • Data that contains commas (,) must be enclosed in backquotes (`). For example, hello,world is changed to `hello,world`.
    • Double quotation marks (") in the data must be replaced with the escape sequence \". For example, "world" is changed to \"world\".
    • Single quotation marks (') in the data must be replaced with the escape sequence \'. For example, 'world' is changed to \'world\'.
    • If the data to be imported is time, the format is yyyy-MM-dd'T'HH:mm:ss, yyyy-MM-dd HH:mm:ss, or yyyy-MM-dd'T'HH:mm:ss.SSSZ, for example, 2022-02-28T11:07:00, 2022-02-28 11:07:00, or 2022-02-28T11:07:00.000Z.
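
    The comma and quotation mark rules above can be sketched as a small shell helper for preparing individual field values. This is only an illustration under the rules as listed in this step, not part of the IoTDB tooling: the function name escape_field is hypothetical, and it does not handle spaces or time values.

```shell
#!/bin/sh
# escape_field is a hypothetical helper, not shipped with the IoTDB client.
# It applies the comma and quotation mark rules from this step to one field value.
escape_field() {
    # Prefix every double and single quotation mark with a backslash.
    escaped=$(printf '%s' "$1" | sed -e 's/"/\\"/g' -e "s/'/\\\\'/g")
    case $escaped in
        # Values that contain a comma are enclosed in backquotes.
        *,*) printf '`%s`\n' "$escaped" ;;
        *)   printf '%s\n' "$escaped" ;;
    esac
}

# Sample values taken from the examples in this step.
escape_field 'hello,world'
escape_field '"world"'
escape_field "'world'"
```

    Keeping the escaping in one function makes it easier to apply the same rules to every field when a CSV file is generated by a script.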
  2. Use WinSCP to upload the CSV file to a directory on the node where the client is installed, for example, /opt/client/IoTDB/iotdb/tools.
  3. Log in to the node where the client is installed as the client installation user.
  4. Run the following command to switch to the client installation directory:

    cd /opt/client

  5. Run the following command to configure environment variables:

    source bigdata_env

  6. Before logging in to the IoTDB client for the first time, perform the following steps to generate a client SSL certificate:
    1. Run the following command to generate a client SSL certificate:

      keytool -noprompt -import -alias myservercert -file ca.crt -keystore truststore.jks

      After running this command, you are required to set a password.

    2. Copy the generated truststore.jks file to the Client installation directory/IoTDB/iotdb/conf directory.

      cp truststore.jks Client installation directory/IoTDB/iotdb/conf
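
      Before logging in, it can help to confirm that the certificate file is where the client expects it. The following is a minimal sketch, assuming the example installation directory /opt/client; the function name has_truststore and the CLIENT_DIR variable are hypothetical and used only for illustration.

```shell
#!/bin/sh
# CLIENT_DIR defaults to the example client installation directory in this section.
CLIENT_DIR=${CLIENT_DIR:-/opt/client}

# has_truststore is a hypothetical helper: it succeeds if truststore.jks
# exists and is not empty in the conf directory under the given client directory.
has_truststore() {
    [ -s "$1/IoTDB/iotdb/conf/truststore.jks" ]
}

if has_truststore "$CLIENT_DIR"; then
    echo "truststore.jks found"
else
    echo "truststore.jks missing or empty"
fi
```

      The check only looks at the file itself. It does not validate the certificate stored in it.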

  7. (Optional) Perform this step to authenticate the current user if Kerberos authentication is enabled for the cluster. If Kerberos authentication is not enabled, skip this step.

    kinit Component service user

  8. Run the following command to switch to the directory where the IoTDB client running script is stored:

    cd /opt/client/IoTDB/iotdb/sbin

  9. If Kerberos authentication is disabled for the cluster (the cluster is in normal mode), invoke the alter-cli-password.sh script to change the default password of the default user root.

    sh alter-cli-password.sh IP address of the IoTDBServer instance RPC port number

    • To view the IP address of the IoTDBServer instance node, log in to FusionInsight Manager and choose Cluster > Services > IoTDB > Instances.
    • The IoTDBServer RPC port can be configured in the IoTDB_SERVER_RPC_PORT parameter. The default ports are as follows:
      • The default open-source port number is 6667.
      • The default customized port number is 22260.

      Port customization/open source: When creating an LTS version cluster, you can set Component Port to Open source or Custom. If Open source is selected, the open source port is used. If Custom is selected, the customized port is used.

    • The initial password of user root is root in versions earlier than MRS 3.3.0 and Iotdb@123 in MRS 3.3.0 and later versions.

      The password must contain at least four characters in versions earlier than MRS 3.3.0 and at least eight characters in MRS 3.3.0 and later versions, and cannot contain spaces.
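
    The length and space rules for the new password can be checked locally before alter-cli-password.sh is invoked. The following is a sketch only: check_password is a hypothetical helper, it is not part of the IoTDB client, and it covers just the two rules stated above.

```shell
#!/bin/sh
# check_password is a hypothetical pre-check, not part of alter-cli-password.sh.
# Usage: check_password MIN_LENGTH PASSWORD
# Succeeds if PASSWORD has at least MIN_LENGTH characters and contains no spaces.
check_password() {
    min_len=$1
    password=$2
    # Reject passwords that contain a space.
    case $password in
        *' '*) return 1 ;;
    esac
    # ${#password} expands to the number of characters in the value.
    [ "${#password}" -ge "$min_len" ]
}

# Use 8 for MRS 3.3.0 and later versions, and 4 for earlier versions.
# The password below is a placeholder for illustration.
if check_password 8 'Example@12345'; then
    echo "password meets the length and space rules"
else
    echo "password rejected"
fi
```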

  10. Run the following command to log in to the client:

    ./start-cli.sh -h Service IP address of the IoTDBServer instance node -p IoTDBServer RPC port

    • You can log in to FusionInsight Manager and choose Cluster > Services > IoTDB > Instances to view the service IP address of the IoTDBServer instance node.
    • To obtain the default RPC port number, choose Cluster > Services > IoTDB, choose Configurations > All Configurations, and search for IOTDB_SERVER_RPC_PORT.
    • If Kerberos authentication is disabled for the cluster (the cluster is in normal mode), use the default user root to log in to the IoTDB client.

    After you run this command, specify the service username as required.

    • To specify the service username, enter yes and enter the service username and password as prompted.

    • If you do not want to specify the service username, enter no. In this case, you will perform subsequent operations as the user authenticated in step 7.

    • If you enter anything else, you are logged out.

  11. (Optional) Create metadata.
    IoTDB has the capability of type inference, so it is not necessary to create metadata before data import. However, it is recommended that you create metadata before using the CSV tool to import data, because this avoids unnecessary type conversion errors. The commands are as follows:
    SET STORAGE GROUP TO root.fit.d1;
    SET STORAGE GROUP TO root.fit.d2;
    SET STORAGE GROUP TO root.fit.p;
    CREATE TIMESERIES root.fit.d1.s1 WITH DATATYPE=INT32,ENCODING=RLE;
    CREATE TIMESERIES root.fit.d1.s2 WITH DATATYPE=TEXT,ENCODING=PLAIN;
    CREATE TIMESERIES root.fit.d2.s1 WITH DATATYPE=INT32,ENCODING=RLE;
    CREATE TIMESERIES root.fit.d2.s3 WITH DATATYPE=INT32,ENCODING=RLE;
    CREATE TIMESERIES root.fit.p.s1 WITH DATATYPE=INT32,ENCODING=RLE;
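
    To see which time series a CSV file refers to before writing the CREATE TIMESERIES statements, the header line can be split into one path per line. This is a minimal sketch: list_series is a hypothetical helper, and it assumes that the first column is Time and that no column name contains a comma.

```shell
#!/bin/sh
# list_series is a hypothetical helper: it prints one time series path per line
# from the header of a CSV file, skipping the first column (Time).
list_series() {
    head -n 1 "$1" | tr ',' '\n' | tail -n +2
}

csv_file="example-filename.csv"
if [ -f "$csv_file" ]; then
    list_series "$csv_file"
fi
```

    The data type and encoding of each series still have to be chosen manually, because the header does not carry that information.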
  12. Run the following command to exit the client:

    quit;

  13. Run the following command to switch to the directory where the import-csv.sh script is stored:

    cd /opt/client/IoTDB/iotdb/tools

  14. Run the following command to run import-csv.sh and import the example-filename.csv file:

    ./import-csv.sh -h Service IP address of the IoTDBServer instance -p IoTDBServer RPC port -f example-filename.csv

    Enter the service username and password in interactive mode as prompted. If information in the following figure is displayed, the CSV file is imported:

  15. Verify data consistency.
    1. Run the following command to switch to the directory where the IoTDB client running script is stored:

      cd /opt/client/IoTDB/iotdb/sbin

    2. Log in to the IoTDB client by referring to step 10. Run SQL statements to query data and compare the data with that in the CSV file prepared in step 1.
    3. Check whether the imported data is consistent with the data in the CSV file prepared in step 1. If they are, the import is successful.

      Run the following command to check the imported data:

      SELECT * FROM root.fit.**;
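
      As a quick cross-check, the number of data rows in the CSV file can be counted on the client node and compared with the number of rows returned by the query. The following is a sketch only: count_data_rows is a hypothetical helper, and the comparison assumes that every row was imported and that the time series held no data before the import.

```shell
#!/bin/sh
# count_data_rows is a hypothetical helper: it prints the number of
# non-blank lines in a CSV file, not counting the header line.
count_data_rows() {
    awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "$1"
}

csv_file="example-filename.csv"
if [ -f "$csv_file" ]; then
    echo "data rows in $csv_file: $(count_data_rows "$csv_file")"
fi
```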

      • To prevent security risks, you are advised to import CSV files in interactive mode.
      • You can also import CSV files by running the ./import-csv.sh -h Service IP address of the IoTDBServer instance -p IoTDBServer RPC port -u Service username -pw Service user password -f example-filename.csv command.

        If information in the following figure is displayed, the CSV file is imported.

      • If nanosecond (ns) time precision is enabled for the IoTDB on the server, the -tp ns parameter needs to be added when the client imports data with the nanosecond timestamp. To check whether nanosecond time precision is enabled for a cluster, log in to FusionInsight Manager, choose Cluster > Configurations > All Non-default Values, and search for timestamp_precision.
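
      The import options described in this section can be collected in a small script so that the command is assembled consistently. This is a minimal sketch under stated assumptions: the IP address is a placeholder, the variable names and the build_import_cmd function are hypothetical, and only the -h, -p, -f, and -tp options mentioned above are used. The script prints the command for review instead of running it.

```shell
#!/bin/sh
# All values below are placeholders; replace them with the values for your cluster.
IOTDB_HOST="192.0.2.10"          # placeholder service IP address of the IoTDBServer instance
IOTDB_PORT="22260"               # default customized RPC port listed in this section
CSV_FILE="example-filename.csv"
TIME_PRECISION=""                # set to "ns" if nanosecond precision is enabled on the server

# build_import_cmd is a hypothetical helper: it prints the import-csv.sh command line.
# The user name and password are left to the interactive prompt.
build_import_cmd() {
    cmd="./import-csv.sh -h $IOTDB_HOST -p $IOTDB_PORT -f $CSV_FILE"
    if [ -n "$TIME_PRECISION" ]; then
        cmd="$cmd -tp $TIME_PRECISION"
    fi
    printf '%s\n' "$cmd"
}

build_import_cmd
```

      Leaving out -u and -pw keeps the password out of the shell history, which matches the recommendation to import in interactive mode.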