Updated on 2024-12-03 GMT+08:00

Incrementally Importing Data to a Graph (2.1.14)

Function

This API is used to import data to graphs incrementally.

  1. To ensure that data can be recovered after a system restart, do not delete graph data stored in OBS while the graph is in use.
  2. A single file to be imported, whether specified directly or located in the import directory, cannot exceed 5 GB. Otherwise, the import fails. Split larger files into multiple files smaller than 5 GB before importing them.
  3. The total size of files imported at once (including vertex and edge datasets) cannot exceed 1/5 of the available memory. To find the available memory, go to the Node Monitoring area on the O&M monitoring dashboard, hover over the memory usage rate of the nodes whose names end with ges-dn-1-1 and ges-dn-2-1, and take the minimum available memory value.
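
The two size limits above can be checked locally before an import is submitted. The following is a minimal sketch, not part of the GES API: the function name check_import_sizes is hypothetical, file sizes and available memory are supplied by the caller, and 1 GB is assumed to mean 1024^3 bytes.

```python
from typing import Dict, List

# Per-file limit from the notes above. Treating 1 GB as 1024**3 bytes is an assumption.
MAX_FILE_BYTES = 5 * 1024**3


def check_import_sizes(file_sizes: Dict[str, int], available_memory_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the batch passes both checks.

    file_sizes maps a file name to its size in bytes (vertex and edge files together).
    available_memory_bytes is the minimum available memory read from the dashboard.
    """
    problems = []
    for name, size in file_sizes.items():
        if size > MAX_FILE_BYTES:
            problems.append(f"{name}: {size} bytes exceeds the 5 GB per-file limit")

    total = sum(file_sizes.values())
    # The total must not exceed 1/5 of the available memory.
    if total * 5 > available_memory_bytes:
        problems.append(f"total size {total} bytes exceeds 1/5 of the available memory")
    return problems
```

An empty result only means that the two documented limits are met. It does not guarantee that the import succeeds.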

URI

POST /v2/{project_id}/graphs/{graph_id}/import-graph

Table 1 URI parameters

| Parameter | Mandatory | Type | Description |
| --- | --- | --- | --- |
| project_id | Yes | String | Project ID. For details about how to obtain the project ID, see Obtaining a Project ID. |
| graph_id | Yes | String | Graph ID |

Request Parameters

Table 2 Request header parameter

| Parameter | Mandatory | Type | Description |
| --- | --- | --- | --- |
| X-Auth-Token | Yes | String | User token. It is used to obtain the permission to call APIs. For details about how to obtain the token, see Authentication. The value of X-Subject-Token in the response header is the token. |

Table 3 Request body parameters

| Parameter | Mandatory | Type | Description |
| --- | --- | --- | --- |
| edgeset_path | No | String | Edge file directory or name |
| edgeset_format | No | String | Edge dataset format. The value can be csv or txt. The default value is csv. |
| vertexset_path | No | String | Vertex file directory or name |
| vertexset_format | No | String | Vertex dataset format. The value can be csv or txt. The default value is csv. |
| schema_path | No | String | Path for storing the metadata file of the new data. |
| log_dir | No | String | Directory for storing import logs. This directory stores the data that fails to be imported and the detailed error causes. |
| parallel_edge | No | parallel_edge object | How to process repetitive edges. For details, see Table 4. |
| delimiter | No | String | Field separator in a CSV file. The default value is comma (,). The default element separator in a field of the list/set type is semicolon (;). |
| trim_quote | No | String | Field quote character in a CSV file. The default value is double quotation marks ("). They are used to enclose a field if the field contains separators or line breaks. |
| offline | No | Boolean | Whether offline import is selected. The value can be true or false. The default value is false. true: Offline import is selected. The import is fast, but the graph is locked and cannot be read or written during the import. false: Online import is selected. It is slower than offline import, but the graph can be read (not written) during the import. |
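
The delimiter, trim_quote, and list/set element separator described in Table 3 follow ordinary CSV conventions. As an illustration only, the sketch below parses one row with Python's standard csv module using the documented defaults. The sample row and the helper names parse_row and split_list_field are made up for this example, and the GES parser itself may differ in details not covered here.

```python
import csv
import io
from typing import List


def parse_row(line: str, delimiter: str = ",", quote: str = '"') -> List[str]:
    """Split one CSV line into fields, honoring the quote character."""
    # newline="" keeps line breaks inside quoted fields intact.
    reader = csv.reader(io.StringIO(line, newline=""), delimiter=delimiter, quotechar=quote)
    return next(reader)


def split_list_field(field: str, list_sep: str = ";") -> List[str]:
    """Split a list/set field into its elements; an empty field yields no elements."""
    return field.split(list_sep) if field else []


# Hypothetical row: the third field contains the delimiter, so it is quoted.
fields = parse_row('v1,movie,"Lethal Weapon, Part 2",Action;Comedy')
genres = split_list_field(fields[3])
```

Here fields[2] keeps its embedded comma because the field is quoted, and the last field splits into two elements on the semicolon.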
Table 4 parallel_edge

| Parameter | Mandatory | Type | Description |
| --- | --- | --- | --- |
| action | No | String | Processing mode of repetitive edges. The value can be allow, ignore, or override. The default value is allow. allow: Repetitive edges are allowed. ignore: Subsequent repetitive edges are ignored. override: The previous repetitive edges are overwritten. |
| ignore_label | No | Boolean | Whether to ignore labels on repetitive edges. The value can be true or false. The default value is true. true: The repetitive edge definition does not contain the label. That is, <source vertex, target vertex> identifies an edge, regardless of the label. false: The repetitive edge definition contains the label. That is, <source vertex, target vertex, label> identifies an edge. |
| sort_key_column | No | String | Position of the sort key in the edge file, which can only be set to lastColumn. If the edge file does not contain a sort key, this parameter is not required. The sort key distinguishes duplicate edges (edges with the same source vertex, target vertex, and label) by assigning them different sort key values. This parameter is required only for database edition graphs. |
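
To make the interaction between action and ignore_label concrete, the sketch below applies the semantics described in Table 4 to an in-memory list of edges. It is an illustration of the documented behavior only, not the service's implementation: the function name apply_parallel_edge is hypothetical, and the sketch ignores sort keys and any edges already present in the graph.

```python
from typing import Iterable, List, Tuple

Edge = Tuple[str, str, str]  # (source vertex, target vertex, label)


def apply_parallel_edge(edges: Iterable[Edge], action: str = "allow",
                        ignore_label: bool = True) -> List[Edge]:
    """Return the edges that remain after repetitive-edge handling."""
    if action not in ("allow", "ignore", "override"):
        raise ValueError(f"unsupported action: {action}")
    if action == "allow":
        return list(edges)  # every edge is kept, repetitive or not

    kept = {}  # identity key -> edge
    for edge in edges:
        src, dst, label = edge
        # With ignore_label=True the label is not part of the edge identity.
        key = (src, dst) if ignore_label else (src, dst, label)
        if action == "ignore" and key in kept:
            continue  # a later repetitive edge is dropped
        kept[key] = edge  # "override": a later edge replaces the earlier one
    return list(kept.values())
```

For example, given the edges (a, b, rate) and (a, b, friend), ignore with ignore_label set to true keeps only (a, b, rate), override keeps only (a, b, friend), and either action with ignore_label set to false keeps both because their labels differ.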

Response Parameters

Status code: 200

Table 5 Response body parameters

| Parameter | Type | Description |
| --- | --- | --- |
| job_id | String | ID of an asynchronous job |

Status code: 400

Table 6 Response body parameters

| Parameter | Type | Description |
| --- | --- | --- |
| error_code | String | System prompt. If the execution succeeds, this parameter may be left blank. If the execution fails, this parameter contains the error code. |
| error_msg | String | System prompt. If the execution succeeds, this parameter may be left blank. If the execution fails, this parameter contains the error message. |

Example Request

Incrementally import graph data. The edge file directory is testbucket/demo_movie/edges/ and the edge dataset is in CSV format. The vertex file directory is testbucket/demo_movie/vertices/ and the vertex dataset is in CSV format.

POST http://Endpoint/v2/{project_id}/graphs/{graph_id}/import-graph

{
  "edgeset_path" : "testbucket/demo_movie/edges/",
  "edgeset_format" : "csv",
  "vertexset_path" : "testbucket/demo_movie/vertices/",
  "vertexset_format" : "csv",
  "schema_path" : "testbucket/demo_movie/incremental_data_schema.xml",
  "log_dir" : "testbucket/importlogdir",
  "parallel_edge" : {
    "action" : "override",
    "ignore_label" : true
  },
  "delimiter" : ",",
  "trim_quote" : "\"",
  "offline" : false
}
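
The request above could be assembled in Python roughly as follows. This is a minimal sketch using only the standard library. The helper name build_import_request is hypothetical, the caller supplies the base URL (scheme plus endpoint), project ID, graph ID, and token, and sending a Content-Type header of application/json is an assumption that this page does not state.

```python
import json
import urllib.request


def build_import_request(base_url: str, project_id: str, graph_id: str,
                         token: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for an incremental import."""
    url = f"{base_url}/v2/{project_id}/graphs/{graph_id}/import-graph"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "X-Auth-Token": token,               # user token, see Table 2
            "Content-Type": "application/json",  # assumption, not stated on this page
        },
    )
```

Passing the returned object to urllib.request.urlopen sends the request. The body argument is a dictionary with the fields from Table 3, such as the example body shown above.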

Example Response

Status code: 200

Example response for a successful request

{
  "job_id" : "b4f2e9a0-0439-4edd-a3ad-199bb523b613"
}

Status code: 400

Example response for a failed request

{
  "error_msg" : "parameter format error",
  "error_code" : "GES.8013"
}
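
A caller typically needs either the job_id from a 200 response or the error_code and error_msg from a failed one. The sketch below shows one way to separate the two cases. The names GesApiError and parse_import_response are hypothetical, and the sketch assumes that a failed response carries the body shape shown in Table 6.

```python
import json


class GesApiError(Exception):
    """Raised for any response other than status code 200 (hypothetical helper)."""

    def __init__(self, status: int, error_code: str, error_msg: str):
        super().__init__(f"HTTP {status}: {error_code} {error_msg}".strip())
        self.status = status
        self.error_code = error_code
        self.error_msg = error_msg


def parse_import_response(status: int, body_text: str) -> str:
    """Return the asynchronous job ID, or raise GesApiError on failure."""
    payload = json.loads(body_text) if body_text else {}
    if status == 200:
        return payload["job_id"]
    # error_code and error_msg may be missing or blank, so fall back to "".
    raise GesApiError(status, payload.get("error_code", ""), payload.get("error_msg", ""))
```

The returned job_id identifies the asynchronous import job. The import itself may still be running when this call returns.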

Status Codes

| Return Value | Description |
| --- | --- |
| 400 Bad Request | Request error. |
| 401 Unauthorized | Authorization failed. |
| 403 Forbidden | No operation permissions. |
| 404 Not Found | No resources found. |
| 500 Internal Server Error | Internal server error. |
| 503 Service Unavailable | Service unavailable. |

Error Codes

See Error Codes.