Scenario

Use Spark to insert, query, update, incrementally query, query as of a specific point in time, and delete data in Hudi.
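The operations listed above differ mainly in the options passed to the Hudi Spark datasource. The following is a minimal sketch in Python, not the code of the shipped samples: the helper names, the column names (uuid, partitionpath, ts), and the instant times are hypothetical, while the hoodie.* keys are standard Hudi datasource options. It only builds the option dictionaries, so it runs without a cluster.

```python
# Sketch only: builds option dictionaries for the Hudi Spark datasource.
# Helper names, column names, and instant times are assumptions for illustration.

def hudi_write_options(table_name, operation="upsert"):
    """Options for a Hudi write; operation is "insert", "upsert", or "delete"."""
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.operation": operation,
        # Assumed column names; replace them with the columns of your own data.
        "hoodie.datasource.write.recordkey.field": "uuid",
        "hoodie.datasource.write.partitionpath.field": "partitionpath",
        "hoodie.datasource.write.precombine.field": "ts",
    }


def hudi_incremental_options(begin_instant, end_instant=None):
    """Options for an incremental read starting after begin_instant.

    Passing end_instant as well bounds the read, which is one way to
    express a query as of a specific point in time.
    """
    options = {
        "hoodie.datasource.query.type": "incremental",
        "hoodie.datasource.read.begin.instanttime": begin_instant,
    }
    if end_instant is not None:
        options["hoodie.datasource.read.end.instanttime"] = end_instant
    return options


# Intended use inside a Spark job (requires Spark with the Hudi bundle on the classpath):
#   df.write.format("hudi").options(**hudi_write_options("hudi_trips_cow")) \
#       .mode("append").save(base_path)
#   spark.read.format("hudi").options(**hudi_incremental_options(begin)) \
#       .load(base_path)
```

With these helpers, an update is a write with the upsert operation, a deletion is a write with the delete operation, and a point-in-time query is an incremental read with begin instant "000" and the chosen commit time as the end instant.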

Data Preparation

  • Create a test user on FusionInsight Manager, assign the required permissions to the user, and download the user's krb5.conf and user.keytab files to the /opt/huditest/example/ directory.
  • Run the spark-submit command that matches the language of the sample program:

    Scala:

    spark-submit --keytab <user_keytab_path> --principal=<principal_name> --jars /opt/huditest/example/hudi-security-examples-0.8.0.jar --class com.huawei.bigdata.hudi.examples.HoodieDataSourceExample /opt/huditest/example/* hdfs://hacluster/tmp/huditest/example/scala hoodie_rt_scala

    Python:

    spark-submit /opt/huditest/example/HudiPythonExample.py hdfs://hacluster/tmp/huditest/example/python hudi_trips_cow

    Java:

    spark-submit --keytab <user_keytab_path> --principal=<principal_name> --jars /opt/huditest/example/hudi-security-examples-0.8.0.jar --class com.huawei.bigdata.hudi.examples.HoodieWriteClientExample /opt/huditest/example/* hdfs://hacluster/tmp/huditest/example/java hoodie_java
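
Before running any of the commands above, it can help to confirm that the authentication files from the first step are in place. The following is a small optional check, written as a sketch: the function name missing_auth_files is hypothetical, and the default directory is the one used in this section.

```python
import os

# File names and directory are taken from the data preparation step above.
REQUIRED_FILES = ("krb5.conf", "user.keytab")


def missing_auth_files(example_dir="/opt/huditest/example/"):
    """Return the required authentication files that are absent from example_dir."""
    return [
        name
        for name in REQUIRED_FILES
        if not os.path.isfile(os.path.join(example_dir, name))
    ]
```

An empty list means both files were found; otherwise the list names the files that still need to be downloaded from FusionInsight Manager.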