Updated on 2024-05-07 GMT+08:00

Verifying that Jupyter Notebook Can Access MRS

  1. Run the following command on the client node to start Jupyter Notebook:

    PYSPARK_PYTHON=./Python/bin/python3 PYSPARK_DRIVER_PYTHON=jupyter-notebook PYSPARK_DRIVER_PYTHON_OPTS="--allow-root" pyspark --master yarn --executor-memory 2G --driver-memory 1G

  2. Use EIP:9999 to log in to the Jupyter web UI. Ensure that the ECS security group allows access from the local public IP address to port 9999. The login password is the password configured earlier in 2.

  3. Create code.

    Create a Python 3 task and use Spark to read files.

    The result is as follows:

    Log in to FusionInsight Manager and view the submitted PySpark application on the YARN web UI.

  4. Verify that the pandas library can be called.
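The file-read and pandas checks above can be sketched as notebook code. This is a minimal illustration, not the exact code from the original procedure. It assumes the notebook was started through the pyspark command shown earlier, so that `SparkSession.builder.getOrCreate()` returns a session bound to YARN. The HDFS path `/tmp/test.txt` and the helper names `library_version`, `read_text_preview`, and `run_checks` are hypothetical.

```python
import importlib


def library_version(name):
    """Return the version of an importable library, or None if the import fails."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    # Some modules do not expose __version__; report "unknown" in that case.
    return getattr(module, "__version__", "unknown")


def read_text_preview(spark, path, limit=5):
    """Read a text file with Spark and return its first `limit` lines as strings."""
    # spark.read.text() yields a DataFrame with a single string column named "value".
    rows = spark.read.text(path).limit(limit).collect()
    return [row.value for row in rows]


def run_checks(path="/tmp/test.txt"):
    """Run both checks. The default path is a placeholder, not a file MRS provides."""
    # Imported here so that the helpers above stay usable without PySpark installed.
    from pyspark.sql import SparkSession

    # Reuses the session created by the pyspark launcher if one already exists.
    spark = SparkSession.builder.getOrCreate()
    print("First lines:", read_text_preview(spark, path))
    # This checks only the Python interpreter that runs the notebook (the driver).
    print("pandas version:", library_version("pandas"))


# In a notebook cell, call run_checks() with a path that exists in your cluster.
```

If `library_version("pandas")` returns None, pandas is not installed in the Python environment that runs the notebook. After `run_checks()` completes, the application should also appear on the YARN web UI.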