Updated on 2023-05-19 GMT+08:00

How Do I Use Python Scripts to Access MySQL If the pymysql Module Is Missing When Spark Job Results Are Stored in MySQL?

  1. If the pymysql module is missing, check whether the corresponding egg package exists. If it does not, upload the egg package as a PyFile package on the Package Management page:
    1. Upload the egg package to an OBS path.
    2. Log in to the DLI management console and choose Data Management > Package Management.
    3. On the Package Management page, click Create Package in the upper right corner to create a package.
    4. In the Create Package dialog, set the following parameters:
      • Type: Select PyFile.
      • OBS Path: Select the OBS path where the egg package is stored.
      • Group and Group Name: Set as needed.
    5. Click OK.
    6. On the Spark job editing page where the error is reported, choose the uploaded egg package from the Python File Dependencies drop-down list and run the Spark job again.
  2. To connect PySpark jobs to MySQL, create a datasource connection that establishes network connectivity between DLI and RDS.

    For details about how to create a datasource connection on the management console, see "Enhanced Datasource Connections" in Data Lake Insight User Guide.

    For details about how to call an API to create a datasource connection, see "Creating an Enhanced Datasource Connection" in Data Lake Insight API Reference.
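
Once the egg package is attached and the datasource connection is in place, the Python script can check for the module before writing results. The following is a minimal sketch, not an official DLI sample: the helper names module_available and save_results, the job_results table, and its name and value columns are assumptions made for illustration. Replace the connection parameters with those of your own RDS for MySQL instance.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `name` can be imported in the current Python environment."""
    return importlib.util.find_spec(name) is not None


def save_results(rows, host, port, user, password, database):
    """Insert (name, value) rows into a hypothetical `job_results` table."""
    if not module_available("pymysql"):
        # Fail early with a hint instead of a bare ModuleNotFoundError.
        raise RuntimeError(
            "pymysql is not importable; select the uploaded egg package under "
            "Python File Dependencies and run the job again"
        )
    import pymysql  # imported lazily so the check above runs first

    conn = pymysql.connect(host=host, port=port, user=user,
                           password=password, database=database)
    try:
        with conn.cursor() as cur:
            # Table and column names are assumptions for this sketch.
            cur.executemany(
                "INSERT INTO job_results (name, value) VALUES (%s, %s)", rows
            )
        conn.commit()
    finally:
        conn.close()
```

In a PySpark job, save_results would typically be called on the driver with a small collected result set, for example save_results(df.collect(), ...), where df holds two columns matching the assumed table.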
