Importing Data
DLI provides an API for importing data. You can use it to import data stored in OBS into an existing DLI or OBS table. The example code is as follows:
//Instantiate the importJob object. The input parameters of the constructor include the queue, database name, table name (obtained by instantiating the Table object), and data path.
private static void importData(Queue queue, Table DLITable) throws DLIException {
    String dataPath = "OBS Path";
    queue = client.getQueue("queueName");
    CsvFormatInfo formatInfo = new CsvFormatInfo();
    formatInfo.setWithColumnHeader(true);
    formatInfo.setDelimiter(",");
    formatInfo.setQuoteChar("\"");
    formatInfo.setEscapeChar("\\");
    formatInfo.setDateFormat("yyyy/MM/dd");
    formatInfo.setTimestampFormat("yyyy-MM-dd HH:mm:ss");
    String dbName = DLITable.getDb().getDatabaseName();
    String tableName = DLITable.getTableName();
    ImportJob importJob = new ImportJob(queue, dbName, tableName, dataPath);
    importJob.setStorageType(StorageType.CSV);
    importJob.setCsvFormatInfo(formatInfo);
    System.out.println("start submit import table: " + DLITable.getTableName());
    //Call the submit interface of the ImportJob object to submit the data importing job.
    importJob.submit();
    //Call the getStatus interface of the ImportJob object to query the status of the data importing job.
    JobStatus status = importJob.getStatus();
    System.out.println("Job id: " + importJob.getJobId() + ", Status : " + status.getName());
}
- Before submitting the data import job, you can set the format of the data to be imported. In the sample code, the setStorageType interface of the ImportJob object sets the storage type to CSV, and the setCsvFormatInfo interface sets the detailed CSV format options.
- Before submitting the data import job, you can set the partition to import into and whether to overwrite existing data. Call the setPartitionSpec API of the ImportJob object to set the partition information, for example, importJob.setPartitionSpec(new PartitionSpec("part1=value1,part2=value2")); the partition can also be specified through constructor parameters when creating the ImportJob object. By default, data is appended. To overwrite the existing data, call the setOverWrite API of the ImportJob object, for example, importJob.setOverWrite(Boolean.TRUE). A combined sketch follows this list.
- If a folder and a file under an OBS bucket directory have the same name, data is preferentially loaded from the file rather than the folder. It is recommended that files and folders at the same level have different names when you create OBS objects.
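For example, a minimal sketch of configuring these options on the importJob object from the sample above before calling submit (the partition values part1=value1 and part2=value2 are placeholders):

//Minimal sketch: set the partition and overwrite behavior on the importJob built above.
importJob.setPartitionSpec(new PartitionSpec("part1=value1,part2=value2"));
//Overwrite the existing data in the specified partition instead of appending (the default).
importJob.setOverWrite(Boolean.TRUE);
importJob.submit();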
Importing Partition Data
DLI provides an API for importing data. You can use it to import data stored in OBS into a specified partition of an existing DLI or OBS table. The example code is as follows:
//Instantiate the importJob object. The input parameters of the constructor include the queue, database name, table name (obtained by instantiating the Table object), and data path.
private static void importData(Queue queue, Table DLITable) throws DLIException {
    String dataPath = "OBS Path";
    queue = client.getQueue("queueName");
    CsvFormatInfo formatInfo = new CsvFormatInfo();
    formatInfo.setWithColumnHeader(true);
    formatInfo.setDelimiter(",");
    formatInfo.setQuoteChar("\"");
    formatInfo.setEscapeChar("\\");
    formatInfo.setDateFormat("yyyy/MM/dd");
    formatInfo.setTimestampFormat("yyyy-MM-dd HH:mm:ss");
    String dbName = DLITable.getDb().getDatabaseName();
    String tableName = DLITable.getTableName();
    PartitionSpec partitionSpec = new PartitionSpec("part1=value1,part2=value2");
    Boolean isOverWrite = true;
    ImportJob importJob = new ImportJob(queue, dbName, tableName, dataPath, partitionSpec, isOverWrite);
    importJob.setStorageType(StorageType.CSV);
    importJob.setCsvFormatInfo(formatInfo);
    System.out.println("start submit import table: " + DLITable.getTableName());
    //Call the submit interface of the ImportJob object to submit the data importing job.
    importJob.submit();
    //Call the getStatus interface of the ImportJob object to query the status of the data importing job.
    JobStatus status = importJob.getStatus();
    System.out.println("Job id: " + importJob.getJobId() + ", Status : " + status.getName());
}
- When the ImportJob object is created, the partition information (PartitionSpec) can also be passed in directly as a partition string.
- If the table has multiple partition columns but partitionSpec specifies only some of them, the partition columns that are not specified contain abnormal values (such as null) after the data is imported.
- In the example, isOverWrite indicates whether to overwrite existing data: true overwrites and false appends. Currently, overwriting the entire table is not supported; only the specified partition can be overwritten. To append data to a specified partition, set isOverWrite to false when creating the import job, as shown in the sketch below.
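A minimal sketch of appending data to a single partition rather than overwriting it, reusing the queue, dbName, tableName, dataPath, and formatInfo variables from the example above (the partition value part1=value1 is a placeholder):

//Minimal sketch: append data to one partition instead of overwriting it.
PartitionSpec appendPartition = new PartitionSpec("part1=value1");
Boolean overwritePartition = Boolean.FALSE; //false: append; true: overwrite the specified partition
ImportJob appendJob = new ImportJob(queue, dbName, tableName, dataPath, appendPartition, overwritePartition);
appendJob.setStorageType(StorageType.CSV);
appendJob.setCsvFormatInfo(formatInfo);
appendJob.submit();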
Exporting Data
DLI provides an API for exporting data. You can use it to export data from a DLI table to OBS. The example code is as follows:
//Instantiate the ExportJob object and transfer the queue, database name, table name (obtained by instantiating the Table object), and storage path of the exported data. The table type must be MANAGED.
private static void exportData(Queue queue, Table DLITable) throws DLIException {
    String dataPath = "OBS Path";
    queue = client.getQueue("queueName");
    String dbName = DLITable.getDb().getDatabaseName();
    String tableName = DLITable.getTableName();
    ExportJob exportJob = new ExportJob(queue, dbName, tableName, dataPath);
    exportJob.setStorageType(StorageType.CSV);
    exportJob.setCompressType(CompressType.GZIP);
    exportJob.setExportMode(ExportMode.ERRORIFEXISTS);
    System.out.println("start export DLI Table data...");
    //Call the submit interface of the ExportJob object to submit the data exporting job.
    exportJob.submit();
    //Call the getStatus interface of the ExportJob object to query the status of the data exporting job.
    JobStatus status = exportJob.getStatus();
    System.out.println("Job id: " + exportJob.getJobId() + ", Status : " + status.getName());
}
- Before submitting the data export job, you can optionally set the data format, compression type, and export mode. In the preceding sample code, the setStorageType, setCompressType, and setExportMode interfaces of the ExportJob object set the data format, compression type, and export mode, respectively. The setStorageType interface supports only the CSV format. A variant without compression is sketched after this list.
- If a folder and a file under an OBS bucket directory have the same name, data is preferentially loaded from the file rather than the folder. It is recommended that files and folders at the same level have different names when you create OBS objects.
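For example, a minimal sketch of exporting uncompressed CSV data instead of GZIP-compressed data, reusing the exportJob object from the sample above (CompressType.NONE is the same constant used in Exporting Query Results below):

//Minimal sketch: export uncompressed CSV data instead of GZIP-compressed data.
exportJob.setStorageType(StorageType.CSV);
exportJob.setCompressType(CompressType.NONE);
exportJob.setExportMode(ExportMode.ERRORIFEXISTS);
exportJob.submit();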
Submitting a Job
DLI provides APIs for submitting and querying jobs. You can call them to submit a job and to query the job result. The example code is as follows:
//Instantiate the SQLJob object and construct input parameters for executing SQL, including the queue, database name, and SQL statements.
private static void runSqlJob(Queue queue, Table obsTable) throws DLIException {
    String sql = "select * from " + obsTable.getTableName();
    String queryResultPath = "OBS Path";
    SQLJob sqlJob = new SQLJob(queue, obsTable.getDb().getDatabaseName(), sql);
    System.out.println("start submit SQL job...");
    //Call the submit interface of the SQLJob object to submit the querying job.
    sqlJob.submit();
    //Call the getStatus interface of the SQLJob object to query the status of the querying job.
    JobStatus status = sqlJob.getStatus();
    System.out.println(status);
    System.out.println("start export Result...");
    //Call the exportResult interface of the SQLJob object to export the query result. queryResultPath refers to the path of the data to be exported.
    sqlJob.exportResult(queryResultPath, StorageType.CSV,
            CompressType.GZIP, ExportMode.ERRORIFEXISTS, null);
    System.out.println("Job id: " + sqlJob.getJobId() + ", Status : " + status.getName());
}
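Because a newly submitted job may still be in the Launching or Running state when getStatus is first called, you may want to poll until it leaves those states before exporting the result. Below is a minimal polling sketch; it assumes that getStatus re-queries the service on each call and that JobStatus defines LAUNCHING and RUNNING constants corresponding to the states named in this document:

//Minimal polling sketch (assumption: getStatus re-queries the service on each call).
private static void waitForCompletion(SQLJob sqlJob) throws DLIException, InterruptedException {
    JobStatus status = sqlJob.getStatus();
    while (JobStatus.LAUNCHING.equals(status) || JobStatus.RUNNING.equals(status)) {
        Thread.sleep(1000); //Wait one second between status checks.
        status = sqlJob.getStatus();
    }
    System.out.println("Final status: " + status.getName());
}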
Canceling a Job
DLI provides an API for canceling jobs. You can use it to cancel any job in the Launching or Running state. The following sample code cancels jobs in the Launching state:
private static void cancelSqlJob(DLIClient client) throws DLIException {
    List<JobResultInfo> jobResultInfos = client.listAllJobs(JobType.QUERY);
    for (JobResultInfo jobResultInfo : jobResultInfos) {
        //Cancel jobs in the LAUNCHING state.
        if (JobStatus.LAUNCHING.equals(jobResultInfo.getJobStatus())) {
            //Cancel the job with a specific job ID.
            client.cancelJob(jobResultInfo.getJobId());
        }
    }
}
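The same pattern can be applied to jobs in the Running state mentioned above. A minimal sketch, assuming JobStatus also defines a RUNNING constant corresponding to that state:

private static void cancelUnfinishedSqlJobs(DLIClient client) throws DLIException {
    List<JobResultInfo> jobResultInfos = client.listAllJobs(JobType.QUERY);
    for (JobResultInfo jobResultInfo : jobResultInfos) {
        //Cancel jobs in the LAUNCHING or RUNNING state (RUNNING is assumed here).
        if (JobStatus.LAUNCHING.equals(jobResultInfo.getJobStatus())
                || JobStatus.RUNNING.equals(jobResultInfo.getJobStatus())) {
            client.cancelJob(jobResultInfo.getJobId());
        }
    }
}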
Querying All Jobs
DLI provides an API for querying jobs. You can use it to query all jobs of the current project. The example code is as follows:
private static void listAllSqlJobs(DLIClient client) throws DLIException {
    //Return the collection of JobResultInfo lists.
    List<JobResultInfo> jobResultInfos = client.listAllJobs();
    //Traverse the JobResultInfo lists to view job information.
    for (JobResultInfo jobResultInfo : jobResultInfos) {
        //Job ID
        System.out.println(jobResultInfo.getJobId());
        //Job description
        System.out.println(jobResultInfo.getDetail());
        //Job status
        System.out.println(jobResultInfo.getJobStatus());
        //Job type
        System.out.println(jobResultInfo.getJobType());
    }
    //Filter the query result by job type.
    List<JobResultInfo> jobResultInfos1 = client.listAllJobs(JobType.DDL);
    //Filter the query result by job type and a time range (start and end Unix timestamps).
    List<JobResultInfo> jobResultInfos2 = client.listAllJobs(1502349803729L, 1502349821460L, JobType.DDL);
    //Filter the query result by page.
    List<JobResultInfo> jobResultInfos3 = client.listAllJobs(100, 1, JobType.DDL);
    //Filter the query result by page, time range, and job type.
    List<JobResultInfo> jobResultInfos4 = client.listAllJobs(100, 1, 1502349803729L, 1502349821460L, JobType.DDL);
    //Use tags to query jobs that meet the conditions.
    JobFilter jobFilter = new JobFilter();
    jobFilter.setTags("workspace=space002,jobName=name002");
    List<JobResultInfo> jobResultInfos5 = client.listAllJobs(jobFilter);
    //Use tags to query the jobs on a specified page.
    JobFilter pageJobFilter = new JobFilter();
    pageJobFilter.setTags("workspace=space002,jobName=name002");
    pageJobFilter.setPageSize(100);
    pageJobFilter.setCurrentPage(0);
    List<JobResultInfo> jobResultInfos6 = client.listJobsByPage(pageJobFilter);
}
Querying Job Results
DLI provides an API for querying job results. You can use it to query information about the job with a specific job ID. The example code is as follows:
private static void getJobResultInfo(DLIClient client) throws DLIException {
    String jobId = "4c4f7168-5bc4-45bd-8c8a-43dfc85055d0";
    JobResultInfo jobResultInfo = client.queryJobResultInfo(jobId);
    //View information about the job.
    System.out.println(jobResultInfo.getJobId());
    System.out.println(jobResultInfo.getDetail());
    System.out.println(jobResultInfo.getJobStatus());
    System.out.println(jobResultInfo.getJobType());
}
Querying Jobs of the SQL Type
DLI provides an API for querying SQL jobs. You can use it to query information about recently executed jobs submitted using SQL statements in the current project. The example code is as follows:
private static void getJobResultInfos(DLIClient client) throws DLIException {
    //Return the collection of JobResultInfo lists.
    List<JobResultInfo> jobResultInfos = client.listSQLJobs();
    //Traverse the list to view job information.
    for (JobResultInfo jobResultInfo : jobResultInfos) {
        //Job ID
        System.out.println(jobResultInfo.getJobId());
        //Job description
        System.out.println(jobResultInfo.getDetail());
        //Job status
        System.out.println(jobResultInfo.getJobStatus());
        //Job type
        System.out.println(jobResultInfo.getJobType());
    }
    //Use tags to query SQL jobs that meet the conditions.
    JobFilter jobFilter = new JobFilter();
    jobFilter.setTags("workspace=space002,jobName=name002");
    List<JobResultInfo> jobResultInfos1 = client.listAllSQLJobs(jobFilter);
    //Use tags to query the SQL jobs on a specified page.
    JobFilter pageJobFilter = new JobFilter();
    pageJobFilter.setTags("workspace=space002,jobName=name002");
    pageJobFilter.setPageSize(100);
    pageJobFilter.setCurrentPage(0);
    List<JobResultInfo> jobResultInfos2 = client.listSQLJobsByPage(pageJobFilter);
}
Exporting Query Results
DLI provides an API for exporting query results. You can use it to export the result of a query job submitted in the editing box of the current project. The example code is as follows:
//Instantiate the SQLJob object and construct input parameters for executing SQL, including the queue, database name, and SQL statements.
private static void exportSqlResult(Queue queue, Table obsTable) throws DLIException {
    String sql = "select * from " + obsTable.getTableName();
    String queryResultPath = "OBS Path";
    String queueName = "queueName";
    SQLJob sqlJob = new SQLJob(queue, obsTable.getDb().getDatabaseName(), sql);
    System.out.println("start submit SQL job...");
    //Call the submit interface of the SQLJob object to submit the querying job.
    sqlJob.submit();
    //Call the getStatus interface of the SQLJob object to query the status of the querying job.
    JobStatus status = sqlJob.getStatus();
    System.out.println(status);
    System.out.println("start export Result...");
    //Call the exportResult interface of the SQLJob object to export the query result. queryResultPath indicates the path for exporting data. JSON indicates the export format. queueName indicates the queue for executing the export job. limitNum indicates the number of results to export; 0 indicates that all data is exported.
    sqlJob.exportResult(queryResultPath + "result", StorageType.JSON, CompressType.NONE,
            ExportMode.ERRORIFEXISTS, queueName, true, 5);
}
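Following the parameter description in the comment above, a minimal sketch of exporting the complete result set as GZIP-compressed CSV: limitNum is set to 0 so that all data is exported, and queryResultPath and queueName are the placeholders declared in the example (use a path that does not already exist, because ERRORIFEXISTS fails otherwise):

//Minimal sketch: export the complete result set as GZIP-compressed CSV (limitNum 0 exports all data).
sqlJob.exportResult(queryResultPath + "all", StorageType.CSV, CompressType.GZIP,
        ExportMode.ERRORIFEXISTS, queueName, true, 0);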
Previewing Job Results
DLI provides an API for previewing job results. You can call this API to obtain the first 1000 records in the result set.
// Initialize a SQLJob object and pass the queue, database name, and SQL statement to execute the SQL.
private static void getPreviewJobResult(Queue queue, Table obsTable) throws DLIException {
    String sql = "select * from " + obsTable.getTableName();
    SQLJob sqlJob = new SQLJob(queue, obsTable.getDb().getDatabaseName(), sql);
    System.out.println("start submit SQL job...");
    // Call the submit method on the SQLJob object.
    sqlJob.submit();
    // Call the previewJobResult method on the SQLJob object to query the first 1000 records in the result set.
    List<Row> rows = sqlJob.previewJobResult();
    if (rows.size() > 0) {
        Integer value = rows.get(0).getInt(0);
        System.out.println("Data value in the first column of the first row: " + value);
    }
    System.out.println("Job id: " + sqlJob.getJobId() + ", previewJobResultSize : " + rows.size());
}
Deprecated API
The getJobResult method is deprecated. Call DownloadJob instead to obtain the job result.
For details about the DownloadJob method, obtain the dli-sdk-java-x.x.x.zip package by referring to Downloading SDK and decompress it.