Creating and Submitting a Spark SQL Job
You can use DLI to submit a Spark SQL job to query data. The general procedure is as follows:
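Step 1: Logging In to the Cloud Platform
Step 2: Uploading Data to OBS
Step 3: Logging In to the DLI Management Console
Step 4: Creating a Queue
Step 5: Creating a Database
Step 6: Creating a Table
Step 7: Querying Data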
The following illustrates how to query OBS data using DLI. Operations to query DLI data are similar.
Step 1: Logging In to the Cloud Platform
- Open the DLI homepage.
- On the login page, enter the username and password, and click Log In.
Step 2: Uploading Data to OBS
DLI allows you to query data stored on OBS. Before querying the data, you need to upload data to OBS.
- In the service list, click Object Storage Service (OBS) under Storage. The OBS console page is displayed.
- Create a bucket. The bucket name must be globally unique. In this example, assume that the bucket name is obs1.
- Click Create Bucket in the upper right corner.
- On the displayed Create Bucket page, enter the Bucket Name. Retain the default values for other parameters or set them as required.
- Click Create Now.
- Click obs1 to switch to the Overview page.
- In the left navigation pane, click Objects, and then click Upload Object. In the displayed dialog box, drag the files or folders to the upload area, or add a file (for example, sampledata.csv). Then, click Upload.
To prepare the sample file, create a text file named sampledata.txt, paste the following comma-separated content into it, and save the file as sampledata.csv.
12,test
After the file is uploaded successfully, the file path is obs://obs1/sampledata.csv.
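If you prefer to generate the sample file with a script instead of a text editor, the following is a minimal sketch using only the Python standard library. The function name write_sample_csv is hypothetical and not part of DLI or OBS; the script only creates the local file, and the upload itself is still done on the OBS console as described above.

```python
import csv
from pathlib import Path


def write_sample_csv(path="sampledata.csv", rows=((12, "test"),)):
    """Write rows as comma-separated lines without a header row."""
    target = Path(path)
    # newline="" lets the csv module control line endings itself.
    with target.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, lineterminator="\n")
        writer.writerows(rows)
    return target


if __name__ == "__main__":
    created = write_sample_csv()
    # Show the generated content so it can be compared with the sample above.
    print(created.read_text(encoding="utf-8"), end="")
```

The file is written without a header row because the sample content in this example contains only data.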
For more information about OBS operations, see the Object Storage Service Console Operation Guide.
You are advised to use an OBS tool, such as OBS Browser+, to upload large files because OBS Console restricts the size and number of files that can be uploaded.
- OBS Browser+ is a graphical tool that provides complete functions for managing your buckets and objects in OBS.
For more information about the tool, see the OBS Tool Guide.
Step 4: Creating a Queue
A queue is the basis for using DLI. Before executing an SQL job, you need to create a queue.
- DLI provides a preset queue named default that is available for use.
- You can also create queues as needed.
- On the DLI management console, click SQL Editor in the navigation pane on the left. The SQL Editor page is displayed.
- In the left pane, select the Queues tab, and click the icon next to Queues to create a queue.
For details, see Creating a Queue.
Step 5: Creating a Database
Before querying data, create a database, for example, db1.
The database named default is built in. You cannot create a database named default.
- On the DLI management console, click SQL Editor in the navigation pane on the left. The SQL Editor page is displayed.
- In the editing window on the right of the SQL Editor page, enter the following SQL statement and click Execute. Read and agree to the privacy agreement, and click OK.
create database db1;
After database db1 is successfully created, db1 will be displayed in the Database list.
When you execute a query on the DLI management console for the first time, you need to read the privacy agreement. You can perform operations only after you agree to the agreement. For later queries, you will not need to read the privacy agreement again.
Step 6: Creating a Table
After database db1 is created, create a table in db1 (for example, table1) based on the sample file obs://obs1/sampledata.csv stored on OBS.
- In the SQL editing window of the SQL Editor page, select the default queue and database db1.
- Enter the following SQL statement in the job editor window and click Execute:
create table table1 (id int, name string) using csv options (path 'obs://obs1/sampledata.csv');
After the table is created, click the Databases tab and then select db1. The created table table1 is displayed in the Table area.
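The table definition above declares id as int and name as string, so the rows in the CSV file should have exactly two columns, with an integer in the first. The following is a rough local check, sketched in Python, that you could run on the file content before uploading it. The function name find_schema_mismatches is hypothetical, and Python's int() is only an approximation of how Spark parses integers; Spark's CSV reader may handle mismatched rows differently (for example, by returning null values), so treat this as a sanity check and not as a description of DLI behavior.

```python
import csv
import io


def find_schema_mismatches(text):
    """Return (line_number, reason) pairs for rows that do not fit (id int, name string)."""
    problems = []
    for line_number, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:
            # Skip completely empty lines.
            continue
        if len(row) != 2:
            problems.append((line_number, f"expected 2 columns, found {len(row)}"))
            continue
        try:
            int(row[0])
        except ValueError:
            problems.append((line_number, f"id is not an integer: {row[0]!r}"))
    return problems


if __name__ == "__main__":
    # The sample content from this example fits the schema, so nothing is reported.
    print(find_schema_mismatches("12,test\n"))
```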
Step 7: Querying Data
After performing the preceding steps, you can start querying data.
- On the Table tab of the SQL Editor page, double-click the created table table1. A query statement is automatically entered in the SQL job editing window in the right pane. Run the following statement to query up to 1,000 records in table1:
select * from db1.table1 limit 1000;
- Click Execute. The system starts the query.
After the SQL statement is executed successfully, you can view the query result in View Result under the SQL job editing window.
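The three statements used in Steps 5 to 7 differ only in the database name, table name, OBS path, and row limit. To repeat the procedure with your own names, a small helper like the one sketched below can generate the statements to paste into the SQL editor. The function build_statements and its identifier check are assumptions made for illustration: the check is deliberately conservative and does not reflect the actual DLI naming rules, and the column list (id int, name string) is fixed to match this example.

```python
import re

# Conservative identifier pattern (an assumption, not the DLI naming rules).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_statements(database, table, obs_path, limit=1000):
    """Return the create-database, create-table, and query statements as strings."""
    for name in (database, table):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsupported identifier: {name!r}")
    if "'" in obs_path:
        raise ValueError("obs_path must not contain a single quote")
    return [
        f"create database {database};",
        # As in Step 6, this statement is run with the database selected in the editor.
        f"create table {table} (id int, name string) "
        f"using csv options (path '{obs_path}');",
        f"select * from {database}.{table} limit {int(limit)};",
    ]


if __name__ == "__main__":
    for statement in build_statements("db1", "table1", "obs://obs1/sampledata.csv"):
        print(statement)
```

Called with db1, table1, and obs://obs1/sampledata.csv, the helper reproduces the statements shown in this example.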