
Example

Use Spark to perform operations on a Hudi table, such as data insertion, querying, updating, incremental queries, point-in-time queries, and data deletion.
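The following is a minimal sketch, not the packaged sample code itself, of an insert followed by a snapshot query through the Spark DataSource API for Hudi. The object name, table name, base path, and schema are illustrative assumptions; the option keys follow the common Hudi configuration names and may vary slightly between Hudi versions.

    import org.apache.spark.sql.{SaveMode, SparkSession}

    object HudiBasicSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("HudiBasicSketch").getOrCreate()
        import spark.implicits._

        // Illustrative table name and base path; the packaged samples take these as command-line arguments.
        val tableName = "hoodie_scala"
        val basePath = "hdfs://hacluster/tmp/example/hoodie_scala"

        // Insert: write a small DataFrame as a Hudi table (copy-on-write by default).
        val df = Seq((1, "alice", 100L), (2, "bob", 200L)).toDF("id", "name", "ts")
        df.write.format("hudi").
          option("hoodie.datasource.write.recordkey.field", "id").
          option("hoodie.datasource.write.precombine.field", "ts").
          option("hoodie.table.name", tableName).
          mode(SaveMode.Overwrite).
          save(basePath)

        // Snapshot query: read the latest state of the table.
        spark.read.format("hudi").load(basePath).show()

        spark.stop()
      }
    }

Writing the same record keys again with mode(SaveMode.Append) would update the existing rows, because Hudi's default write operation is upsert.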

Data Preparation

  • Create the test user on FusionInsight Manager, assign permissions to the user, and download the krb5.conf and user.keytab files to the /opt/example/ directory.
  • Run one of the following spark-submit commands, depending on the sample language (a follow-up query sketch appears after the commands):

    Java:

    spark-submit --keytab <user_keytab_path> --principal=<principal_name> --class com.huawei.bigdata.hudi.examples.HoodieWriteClientExample /opt/example/hudi-java-security-examples-1.0.jar hdfs://hacluster/tmp/example/hoodie_java hoodie_java

    Scala:

    spark-submit --keytab <user_keytab_path> --principal=<principal_name> --class com.huawei.bigdata.hudi.examples.HoodieDataSourceExample /opt/example/hudi-scala-security-examples-1.0.jar hdfs://hacluster/tmp/example/hoodie_scala hoodie_scala

    Python:

    spark-submit /opt/example/HudiPythonExample.py hdfs://hacluster/tmp/huditest/example/python hudi_trips_cow
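If the Scala sample runs successfully, the table written to hdfs://hacluster/tmp/example/hoodie_scala can also be queried incrementally. The sketch below is illustrative only; the begin-instant handling and option keys use common Hudi configuration names and may differ across Hudi versions.

    import org.apache.spark.sql.SparkSession

    object HudiIncrementalQuerySketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("HudiIncrementalQuerySketch").getOrCreate()

        // Illustrative path: the location written by the Scala sample command above.
        val basePath = "hdfs://hacluster/tmp/example/hoodie_scala"

        // List commit times on the table, oldest first.
        val commits = spark.read.format("hudi").load(basePath).
          select("_hoodie_commit_time").distinct().
          orderBy("_hoodie_commit_time").
          collect().map(_.getString(0))

        // Incremental query: the begin instant is exclusive, so this returns only
        // records committed after the first commit.
        val beginTime = commits.head
        spark.read.format("hudi").
          option("hoodie.datasource.query.type", "incremental").
          option("hoodie.datasource.read.begin.instanttime", beginTime).
          load(basePath).
          show()

        spark.stop()
      }
    }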