
Using Python3.x to Connect to Kafka in a Security Cluster

Issue

The user does not know how to connect to a Kafka cluster with Kerberos authentication enabled in the Python3.x environment.

Symptom

The user needs an operation guide to connect to the Kafka cluster with Kerberos authentication enabled in the Python3.x environment.

Procedure

  1. Log in to the master node and run the following command to configure the Huawei Cloud EulerOS mirror (repository) source:

    wget http://mirrors.myhuaweicloud.com/repo/mirrors_source.sh && sh mirrors_source.sh

  2. Run the following commands to install the dependencies required for compiling Python3.x:

    yum groupinstall "Development tools" -y

    yum -y install zlib zlib-devel

    yum -y install bzip2 bzip2-devel

    yum -y install ncurses ncurses-devel

    yum -y install readline readline-devel

    yum -y install openssl openssl-devel

    yum -y install openssl-static

    yum -y install xz lzma xz-devel

    yum -y install sqlite sqlite-devel

    yum -y install gdbm gdbm-devel

    yum -y install tk tk-devel

    yum -y install libffi libffi-devel

  3. After the dependencies are installed, run the following commands to download and decompress the .tgz package of Python3.x:

    wget https://www.python.org/ftp/python/3.6.7/Python-3.6.7.tgz

    tar -zxvf Python-3.6.7.tgz

    cd Python-3.6.7

    You can also download the .tgz package of Python3.x from the official Python website. Python-3.6.X is recommended, because the take function of RDD cannot be used in Python 3.7.

  4. Run the following commands to configure, compile, and install Python3.x in the /opt/Bigdata/python3 directory:

    ./configure --prefix=/opt/Bigdata/python3 --enable-shared CFLAGS=-fPIC

    make && make install

    The installation directory can be customized.

  5. Run the following commands to configure the Python3.x library path and link the python3 and pip3 commands:

    echo "/opt/Bigdata/python3/lib" >> /etc/ld.so.conf

    ldconfig

    ln -s /opt/Bigdata/python3/bin/python3 /usr/bin/python3

    ln -s /opt/Bigdata/python3/bin/pip3 /usr/bin/pip3

    The paths used here must be the same as the installation directory specified in 4.

  6. After the configuration is successful, run the following commands to install the kafka-python and gssapi modules in the Python3.x environment:

    cp /usr/include/gssapi/* /home/omm/kerberos/include/gssapi/

    pip3 install kafka-python

    pip3 install gssapi
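
    As an optional sanity check (a minimal sketch, not part of the original procedure), you can confirm that both modules were installed into the new Python3.x environment by importing them:

    # Optional check: confirm that the kafka-python and gssapi packages
    # are importable from the Python3.x environment installed above.
    # "kafka" and "gssapi" are the import names of the two packages.
    import kafka
    import gssapi

    print("kafka-python version:", kafka.__version__)
    print("gssapi module loaded from:", gssapi.__file__)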

  7. After the installation is successful, run the following command to configure environment variables:

    source Client installation directory/bigdata_env

  8. Run the following command to authenticate the current user:

    kinit Kafka user

    The Kafka user is the user used to log in to Manager and must have the permissions of the kafka user group.
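
    Before writing a full producer or consumer, you can optionally verify that the Kerberos ticket obtained with kinit works against the cluster. The following is a minimal sketch that lists the topics visible to the authenticated user; broker_ip, port 21007, and the Kerberos settings are the same placeholders used in the sample scripts in 9:

    # Optional verification sketch: connect using the kinit ticket and list topics.
    # broker_ip is a placeholder for an actual broker IP address.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(bootstrap_servers=["broker_ip:21007"],
                             security_protocol="SASL_PLAINTEXT",
                             sasl_mechanism="GSSAPI",
                             sasl_kerberos_service_name="kafka",
                             sasl_kerberos_domain_name="hadoop.hadoop.com")
    print(consumer.topics())   # set of topic names the authenticated user can see
    consumer.close()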

  9. Run the Python3.x scripts.

    Sample scripts:

    producer:
    # Produce messages to a Kerberos-enabled Kafka cluster over SASL_PLAINTEXT/GSSAPI.
    # Replace broker_ip with an actual broker IP address.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers=["broker_ip:21007"],
                             security_protocol="SASL_PLAINTEXT",
                             sasl_mechanism="GSSAPI",
                             sasl_kerberos_service_name="kafka",
                             sasl_kerberos_domain_name="hadoop.hadoop.com")
    for _ in range(100):
        # send() returns a future; get() blocks until the broker acknowledges the record.
        response = producer.send("test-topic", b"testmessage")
        result = response.get(timeout=50)
        print(result)
    
    consumer:
    # Consume messages from the same topic using the same Kerberos settings.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer("test-topic",
                             bootstrap_servers=["broker_ip:21007"],
                             group_id="test-group",
                             enable_auto_commit=True,   # boolean, not the string "true"
                             security_protocol="SASL_PLAINTEXT",
                             sasl_mechanism="GSSAPI",
                             sasl_kerberos_service_name="kafka",
                             sasl_kerberos_domain_name="hadoop.hadoop.com")
    for message in consumer:
        print(message)
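
    Save the scripts as, for example, producer.py and consumer.py (the file names are arbitrary) and run them with python3. When messages are sent without waiting on each future, it is good practice to flush and close the producer before the script exits so that buffered records are delivered. The following is a minimal sketch based on the producer sample above, using the same placeholder broker address, topic, and Kerberos settings:

    # Producer variant with an explicit flush/close so buffered records are
    # sent before the process exits.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers=["broker_ip:21007"],
                             security_protocol="SASL_PLAINTEXT",
                             sasl_mechanism="GSSAPI",
                             sasl_kerberos_service_name="kafka",
                             sasl_kerberos_domain_name="hadoop.hadoop.com")
    try:
        for i in range(100):
            producer.send("test-topic", b"testmessage")
    finally:
        producer.flush()   # block until all buffered records are sent
        producer.close()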