
Flink Jar Jobs Using DEW to Acquire Access Credentials for Reading and Writing Data from and to OBS

Updated on 2024-09-20 GMT+08:00

Scenario

To write the output data of a Flink Jar job to OBS, an AK/SK pair is required to access OBS. To keep the AK/SK secure, you can use Data Encryption Workshop (DEW) and Cloud Secret Management Service (CSMS) to manage the AK/SK centrally, avoiding the sensitive-information leakage and business risks caused by hard-coding credentials or configuring them in plaintext.

This section walks you through how a Flink Jar job acquires an AK/SK pair to read data from and write data to OBS.

Prerequisites

  • A shared secret has been created on the DEW console and the secret value has been stored. For details, see Creating a Shared Secret.
  • An agency has been created and authorized for DLI to access DEW. The agency must have been granted the following permissions:
    • Permission to call the ShowSecretVersion API, which queries secret versions and secret values in DEW: csms:secretVersion:get.
    • Permission to call the ListSecretVersions API, which lists secret versions in DEW: csms:secretVersion:list.
    • Permission to decrypt DEW secrets: kms:dek:decrypt.

    For example agency permission policies, see Customizing DLI Agency Permissions and Agency Permission Policies in Common Scenarios. A minimal policy sketch is also shown after this list.

  • DEW can be used to manage access credentials only in Flink 1.15. When creating the Flink job, select version 1.15 and configure the agency that allows DLI to access DEW for the job. For details about how to create and configure a custom agency, see Customizing DLI Agency Permissions.
  • To use this function, configure AK/SK authentication for every OBS bucket that the job reads from or writes to.
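
For reference, a minimal custom policy granting only the three permissions listed above might look like the sketch below. It is written in Huawei Cloud IAM policy syntax and is an assumption-based illustration, not the exact policy from the referenced guides; verify it against Agency Permission Policies in Common Scenarios before use.

{
    "Version": "1.1",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "csms:secretVersion:get",
                "csms:secretVersion:list",
                "kms:dek:decrypt"
            ]
        }
    ]
}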

Syntax

On the Flink Jar job editing page, set Runtime Configuration as needed. The configuration information is as follows:

Different OBS buckets can use different AK/SK credentials. Use the following configuration template to specify AK/SK information on a per-bucket basis; a two-bucket example follows the template. For details about the parameters, see Table 1.

flink.hadoop.fs.obs.bucket.USER_BUCKET_NAME.dew.access.key=USER_AK_CSMS_KEY
flink.hadoop.fs.obs.bucket.USER_BUCKET_NAME.dew.secret.key=USER_SK_CSMS_KEY
flink.hadoop.fs.obs.security.provider=com.dli.provider.UserObsBasicCredentialProvider
flink.hadoop.fs.dew.csms.secretName=CredentialName
flink.hadoop.fs.dew.endpoint=ENDPOINT
flink.hadoop.fs.dew.csms.version=VERSION_ID
flink.hadoop.fs.dew.csms.cache.time.second=CACHE_TIME
flink.dli.job.agency.name=USER_AGENCY_NAME
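
For example, a job that reads from one bucket and writes to another can point each bucket to its own key names in the same CSMS shared secret. The bucket names and key names below are placeholders only:

flink.hadoop.fs.obs.bucket.your-source-bucket.dew.access.key=SRC_AK_KEY_NAME
flink.hadoop.fs.obs.bucket.your-source-bucket.dew.secret.key=SRC_SK_KEY_NAME
flink.hadoop.fs.obs.bucket.your-sink-bucket.dew.access.key=SINK_AK_KEY_NAME
flink.hadoop.fs.obs.bucket.your-sink-bucket.dew.secret.key=SINK_SK_KEY_NAME
flink.hadoop.fs.obs.security.provider=com.dli.provider.UserObsBasicCredentialProvider
flink.hadoop.fs.dew.csms.secretName=CredentialName
flink.hadoop.fs.dew.endpoint=ENDPOINT
flink.dli.job.agency.name=USER_AGENCY_NAME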

Parameter Description

Table 1 Parameters

flink.hadoop.fs.obs.bucket.USER_BUCKET_NAME.dew.access.key
  Mandatory: Yes | Default value: None | Data type: String
  Replace USER_BUCKET_NAME with the name of your OBS bucket. The value of this parameter is the key defined by the user in the CSMS shared secret; the value stored under that key is the user's access key ID (AK). The user must have permission to access the bucket on OBS.

flink.hadoop.fs.obs.bucket.USER_BUCKET_NAME.dew.secret.key
  Mandatory: Yes | Default value: None | Data type: String
  Replace USER_BUCKET_NAME with the name of your OBS bucket. The value of this parameter is the key defined by the user in the CSMS shared secret; the value stored under that key is the user's secret access key (SK). The user must have permission to access the bucket on OBS.

flink.hadoop.fs.obs.security.provider
  Mandatory: Yes | Default value: None | Data type: String
  OBS AK/SK authentication mechanism, which uses DEW CSMS secret management to obtain the AK and SK for accessing OBS. Set this parameter to com.dli.provider.UserObsBasicCredentialProvider.

flink.hadoop.fs.dew.endpoint
  Mandatory: Yes | Default value: None | Data type: String
  Endpoint of the DEW service to be used. See Regions and Endpoints.
  Example: flink.hadoop.fs.dew.endpoint=kms.cn-xxxx.myhuaweicloud.com

flink.hadoop.fs.dew.projectId
  Mandatory: No | Default value: ID of the project where the Flink job runs | Data type: String
  ID of the project that DEW belongs to. If not specified, the ID of the project where the Flink job runs is used. See Obtaining a Project ID.

flink.hadoop.fs.dew.csms.secretName
  Mandatory: Yes | Default value: None | Data type: String
  Name of the shared secret in DEW secret management.
  Example: flink.hadoop.fs.dew.csms.secretName=secretInfo

flink.hadoop.fs.dew.csms.version
  Mandatory: No | Default value: Latest version | Data type: String
  Version identifier of the shared secret in DEW secret management. If not specified, the latest version of the shared secret is used.
  Example: flink.hadoop.fs.dew.csms.version=v1

flink.hadoop.fs.dew.csms.cache.time.second
  Mandatory: No | Default value: 3600 | Data type: Long
  How long, in seconds, the CSMS shared secret is cached after the Flink job obtains it. The default value is 3600 seconds.

flink.dli.job.agency.name
  Mandatory: Yes | Default value: None | Data type: String
  Name of the custom agency.
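
For clarity, the CSMS shared secret named by flink.hadoop.fs.dew.csms.secretName stores the AK and SK as key/value pairs, and the key names must match the values configured for ...dew.access.key and ...dew.secret.key. A minimal sketch of such a secret, using placeholder key names and values, might look like this:

Secret name: CredentialName
Key/value pairs in the secret:
  USER_AK_CSMS_KEY = <your access key ID (AK)>
  USER_SK_CSMS_KEY = <your secret access key (SK)>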

Sample Code

This example writes data produced by a DataGen source to OBS. Modify the parameters in the sample Java code based on site requirements.

  1. Create an agency for DLI to access DEW and complete authorization. For details, see Customizing DLI Agency Permissions.
  2. Create a shared secret in DEW. For details, see Creating a Shared Secret.
    1. Log in to the DEW management console.
    2. In the navigation pane on the left, choose Cloud Secret Management Service > Secrets.
    3. On the displayed page, click Create Secret. Set basic secret information.
  3. Set job parameters on the DLI Flink Jar job editing page.
    • Class Name
      com.dli.demo.dew.DataGen2FileSystemSink
    • Class Arguments
      --checkpoint.path obs://test/flink/jobs/checkpoint/120891/ 
      --output.path obs://dli/flink.db/79914/DataGen2FileSystemSink
    • Runtime Configuration
      flink.hadoop.fs.obs.bucket.USER_BUCKET_NAME.dew.access.key=USER_AK_CSMS_KEY
      flink.hadoop.fs.obs.bucket.USER_BUCKET_NAME.dew.secret.key=USER_SK_CSMS_KEY
      flink.hadoop.fs.obs.security.provider=com.dli.provider.UserObsBasicCredentialProvider
      flink.hadoop.fs.dew.csms.secretName=obsAksK
      flink.hadoop.fs.dew.endpoint=kmsendpoint
      flink.hadoop.fs.dew.csms.version=v6
      flink.hadoop.fs.dew.csms.cache.time.second=3600
      flink.dli.job.agency.name=***
  4. Flink Jar job example.
    • Environment preparation

      A development tool such as IntelliJ IDEA, the JDK, and Maven have been installed and configured.

      Dependencies configured in the POM file:

      <properties>
          <flink.version>1.15.0</flink.version>
      </properties>

      <dependencies>
          <dependency>
              <groupId>org.apache.flink</groupId>
              <artifactId>flink-statebackend-rocksdb</artifactId>
              <version>${flink.version}</version>
              <scope>provided</scope>
          </dependency>

          <dependency>
              <groupId>org.apache.flink</groupId>
              <artifactId>flink-streaming-java</artifactId>
              <version>${flink.version}</version>
              <scope>provided</scope>
          </dependency>

          <!-- fastjson, used by the sample source to build JSON records -->
          <dependency>
              <groupId>com.alibaba</groupId>
              <artifactId>fastjson</artifactId>
              <version>2.0.15</version>
          </dependency>
      </dependencies>
    • Sample code
      package com.dli.demo.dew;
      
      import com.alibaba.fastjson.JSONObject;

      import org.apache.flink.api.common.serialization.SimpleStringEncoder;
      import org.apache.flink.api.java.utils.ParameterTool;
      import org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend;
      import org.apache.flink.core.fs.Path;
      import org.apache.flink.streaming.api.datastream.DataStream;
      import org.apache.flink.streaming.api.environment.CheckpointConfig;
      import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
      import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
      import org.apache.flink.streaming.api.functions.sink.filesystem.rollingpolicies.OnCheckpointRollingPolicy;
      import org.apache.flink.streaming.api.functions.source.ParallelSourceFunction;
      import org.slf4j.Logger;
      import org.slf4j.LoggerFactory;
      
      import java.time.LocalDateTime;
      import java.time.ZoneOffset;
      import java.time.format.DateTimeFormatter;
      import java.util.Random;
      
      public class DataGen2FileSystemSink {
          private static final Logger LOG = LoggerFactory.getLogger(DataGen2FileSystemSink.class);
      
          public static void main(String[] args) {
              ParameterTool params = ParameterTool.fromArgs(args);
              LOG.info("Params: " + params.toString());
              try {
                  StreamExecutionEnvironment streamEnv = StreamExecutionEnvironment.getExecutionEnvironment();
      
                  // set checkpoint
                  String checkpointPath = params.get("checkpoint.path", "obs://bucket/checkpoint/jobId_jobName/");
                  LocalDateTime localDateTime = LocalDateTime.ofEpochSecond(System.currentTimeMillis() / 1000,
                      0, ZoneOffset.ofHours(8));
                  String dt = localDateTime.format(DateTimeFormatter.ofPattern("yyyyMMdd_HH:mm:ss"));
                  checkpointPath = checkpointPath + dt;
      
                  streamEnv.setStateBackend(new EmbeddedRocksDBStateBackend());
                  streamEnv.getCheckpointConfig().setCheckpointStorage(checkpointPath);
                  streamEnv.getCheckpointConfig().setExternalizedCheckpointCleanup(
                      CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);
                  streamEnv.enableCheckpointing(30 * 1000);
      
                  DataStream<String> stream = streamEnv.addSource(new DataGen())
                      .setParallelism(1)
                      .disableChaining();
      
                  String outputPath = params.get("output.path", "obs://bucket/outputPath/jobId_jobName");
      
                  // Sink OBS
                  final StreamingFileSink<String> sinkForRow = StreamingFileSink
                      .forRowFormat(new Path(outputPath), new SimpleStringEncoder<String>("UTF-8"))
                      .withRollingPolicy(OnCheckpointRollingPolicy.build())
                      .build();
      
                  stream.addSink(sinkForRow);
      
                  streamEnv.execute("sinkForRow");
              } catch (Exception e) {
                  LOG.error(e.getMessage(), e);
              }
          }
      }
      
      class DataGen implements ParallelSourceFunction<String> {
      
          private volatile boolean isRunning = true;  // volatile: cancel() is called from a different thread
      
          private Random random = new Random();
      
          @Override
          public void run(SourceContext<String> ctx) throws Exception {
              while (isRunning) {
                  JSONObject jsonObject = new JSONObject();
                  jsonObject.put("id", random.nextLong());
                  jsonObject.put("name", "Molly" + random.nextInt());
                  jsonObject.put("address", "hangzhou" + random.nextInt());
                  jsonObject.put("birthday", System.currentTimeMillis());
                  jsonObject.put("city", "hangzhou" + random.nextInt());
                  jsonObject.put("number", random.nextInt());
                  ctx.collect(jsonObject.toJSONString());
                  Thread.sleep(1000);
              }
          }
      
          @Override
          public void cancel() {
              isRunning = false;
          }
      }
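
After the project is packaged into a JAR (for example, with mvn clean package), use the resulting JAR as the application of the DLI Flink Jar job together with the class name, arguments, and runtime configuration from step 3. At runtime, the job obtains the AK/SK from the DEW shared secret and writes the generated data to the configured OBS output path.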
