Java Example Code
Scenario
This section provides Java example code that demonstrates how to use a Spark job to access data from the GaussDB(DWS) data source.
A datasource connection has been created and bound to a queue on the DLI management console. For details, see Enhanced Datasource Connections.
Hard-coded or plaintext passwords pose significant security risks. To ensure security, encrypt your passwords, store them in configuration files or environment variables, and decrypt them when needed.
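A minimal sketch of this approach in Java, assuming the password is exported in an environment variable named DWS_PASSWORD (the variable name is illustrative; the connection details are reused from the examples below):

import org.apache.spark.sql.SparkSession;

public class DwsWithEnvPassword {
    public static void main(String[] args) {
        // Read the password from an environment variable instead of hard-coding it.
        // DWS_PASSWORD is an illustrative name; use whatever your deployment defines.
        String password = System.getenv("DWS_PASSWORD");
        if (password == null || password.isEmpty()) {
            throw new IllegalStateException("DWS_PASSWORD is not set");
        }

        SparkSession sparkSession = SparkSession.builder().appName("datasource-dws").getOrCreate();

        // The password is injected at run time and never appears in the source code.
        sparkSession.sql("CREATE TABLE IF NOT EXISTS dli_to_dws USING JDBC OPTIONS ("
                + "'url'='jdbc:postgresql://10.0.0.233:8000/postgres',"
                + "'dbtable'='test',"
                + "'user'='dbadmin',"
                + "'password'='" + password + "')");

        sparkSession.close();
    }
}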
Preparations
- Import dependencies.
- Maven dependency required:
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.3.2</version>
</dependency>
- Import dependency packages.
import org.apache.spark.sql.SparkSession;
- Create a session.
SparkSession sparkSession = SparkSession.builder().appName("datasource-dws").getOrCreate();
Accessing a Data Source Through a SQL API
- Create a table to connect to a GaussDB(DWS) data source and set connection parameters.
sparkSession.sql("CREATE TABLE IF NOT EXISTS dli_to_dws USING JDBC OPTIONS ('url'='jdbc:postgresql://10.0.0.233:8000/postgres','dbtable'='test','user'='dbadmin','password'='**')");
- Insert data.
sparkSession.sql("insert into dli_to_dws values(3,'Liu'),(4,'Xie')");
- Query data.
sparkSession.sql("select * from dli_to_dws").show();
Response:
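Assuming the test table in GaussDB(DWS) was empty before the insert above, show() prints output like the following (the column names id and name are illustrative, since they depend on the schema of test):

+---+----+
| id|name|
+---+----+
|  3| Liu|
|  4| Xie|
+---+----+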
Submitting a Spark Job
- Generate a JAR package based on the code file and upload the package to DLI.
For details about console operations, see Creating a Package. For details about API operations, see Uploading a Package Group.
- In the Spark job editor, select the corresponding dependency module and execute the Spark job.
For details about console operations, see Creating a Spark Job. For details about API operations, see Creating a Batch Processing Job.
- If the Spark version is 2.3.2 (to be taken offline soon) or 2.4.5, set Module to sys.datasource.dws when you submit the job.
- If the Spark version is 3.1.1, you do not need to select a module. Instead, configure the following Spark parameters (--conf):
spark.driver.extraClassPath=/usr/share/extension/dli/spark-jar/datasource/dws/*
spark.executor.extraClassPath=/usr/share/extension/dli/spark-jar/datasource/dws/*
- For details about how to submit a job on the console, see the description of the table "Parameters for selecting dependency resources" in Creating a Spark Job.
- For details about how to submit a job through an API, see the description of the modules parameter in Table 2 "Request parameters" in Creating a Batch Processing Job.
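A minimal sketch of such a request body for Spark 2.3.2 or 2.4.5, assuming the field names match Table 2 "Request parameters" in Creating a Batch Processing Job; the file, className, and queue values are placeholders:

{
    "file": "dws/java_dws.jar",
    "className": "java_dws",
    "queue": "your_queue_name",
    "modules": ["sys.datasource.dws"]
}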
Complete Example Code
Accessing GaussDB(DWS) tables through SQL APIs
import org.apache.spark.sql.SparkSession;

public class java_dws {
    public static void main(String[] args) {
        SparkSession sparkSession = SparkSession.builder().appName("datasource-dws").getOrCreate();

        // Create a table that maps to the GaussDB(DWS) data source.
        sparkSession.sql("CREATE TABLE IF NOT EXISTS dli_to_dws USING JDBC OPTIONS ('url'='jdbc:postgresql://10.0.0.233:8000/postgres','dbtable'='test','user'='dbadmin','password'='**')");

        //*****************************SQL model***********************************
        //Insert data into the DLI data table
        sparkSession.sql("insert into dli_to_dws values(3,'Liu'),(4,'Xie')");

        //Read data from DLI data table
        sparkSession.sql("select * from dli_to_dws").show();

        //drop table
        sparkSession.sql("drop table dli_to_dws");

        sparkSession.close();
    }
}