
gs_loader

Updated on 2024-06-07 GMT+08:00

Overview

  • gs_loader is used to import data. It converts the syntax supported by the control file into \COPY syntax, imports the data using the existing \COPY function, and records the \COPY results in logs.
  • Before using gs_loader, ensure that the gs_loader version is consistent with the gsql version and the database version.

Log Level Configuration

Set the log level to control the diagnostic output for developers. After the setting, the tool's runtime information is printed on the console.

export gs_loader_log_level=debug
export gs_loader_log_level=info
export gs_loader_log_level=warning
export gs_loader_log_level=error
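For example, to get verbose output for a single run, the level can be exported in the current shell session before starting gs_loader (a minimal sketch; the level names are those listed above):

```shell
# Set the log level for the current shell session only.
# gs_loader reads this environment variable when it starts.
export gs_loader_log_level=debug

# Confirm the setting before running the import.
echo "$gs_loader_log_level"
```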

Permission

The application scenarios are classified into separation-of-duties and non-separation-of-duties scenarios. You can set enableSeparationOfDuty to on or off to enable or disable the separation of duties function.

  • If enableSeparationOfDuty is set to off:

    The user can be a common user or the database administrator. A common user needs the administrator to grant permissions first; the administrator account can be used directly.

    Create a user.

    CREATE USER load_user WITH PASSWORD '************';

    Create related tables and grant permissions.

    GRANT ALL ON FUNCTION copy_error_log_create() TO load_user;
    GRANT ALL ON SCHEMA public TO load_user;
    SELECT copy_error_log_create();
    SELECT copy_summary_create();
    GRANT ALL PRIVILEGES ON  public.pgxc_copy_error_log To load_user;
    GRANT ALL PRIVILEGES ON  public.gs_copy_summary To load_user;
  • If enableSeparationOfDuty is set to on:

    The user can be a common user or the database administrator. Create the pgxc_copy_error_log and gs_copy_summary tables in the user's own schema and add indexes. No permission granting is required.

    Create a user.

    CREATE USER load_user WITH PASSWORD '********';

    Create related tables and add indexes.

    CREATE TABLE load_user.pgxc_copy_error_log (relname varchar, begintime timestamptz, filename varchar, lineno int8, rawrecord text, detail text);
    CREATE INDEX copy_error_log_relname_idx ON load_user.pgxc_copy_error_log(relname);
    CREATE TABLE load_user.gs_copy_summary(relname varchar, begintime timestamptz, endtime timestamptz, id bigint, pid bigint, readrows bigint, skiprows bigint, loadrows bigint, errorrows bigint, whenrows bigint, allnullrows bigint, detail text);
    CREATE INDEX gs_copy_summary_idx ON load_user.gs_copy_summary(id);

Usage Environment

You need to add the tool path to PATH. gs_loader supports SSL encrypted communication. The method of using gs_loader is the same as that of using gsql. For details, refer to the Administrator Guide.
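For example, assuming the tool is installed under /opt/gaussdb/tools/bin (a hypothetical path; substitute your actual installation directory), it can be added to PATH for the current session as follows:

```shell
# /opt/gaussdb/tools/bin is a placeholder; use the directory that actually
# contains gs_loader.
export PATH=/opt/gaussdb/tools/bin:$PATH

# Verify that the directory is now on PATH.
case ":$PATH:" in
  *:/opt/gaussdb/tools/bin:*) echo "path configured" ;;
  *) echo "path missing" ;;
esac
```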

Adding System Catalogs

The gs_copy_summary table is added to record the COPY execution result summary, including the number of successful rows, number of error rows, number of ignored rows, and number of empty rows.

The copy_summary_create function is added to create the gs_copy_summary table.

The format of the gs_copy_summary table is as follows:

relname     | public.sqlldr_tbl
begintime   | 2021-09-03 16:00:11.7129-04
endtime     | 2021-09-03 16:00:15.259908-04
id          | 21870
pid         | 47582725060352
readrows    | 100000
skiprows    | 0
loadrows    | 111
errorrows   | 0
whenrows    | 99889
allnullrows | 0
detail      | 111 Rows successfully loaded.
            | 0 Rows not loaded due to data errors.
            | 99889 Rows not loaded because all WHEN clauses were failed.
            | 0 Rows not loaded because all fields were null.
            | 
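Once imports have run, the latest summary row can be inspected with an ordinary query, for example (a sketch, assuming the non-separation-of-duty setup where the table is created in the public schema):

```sql
-- Show the most recent COPY summary recorded by gs_loader.
SELECT relname, readrows, loadrows, errorrows, whenrows, allnullrows, detail
FROM public.gs_copy_summary
ORDER BY begintime DESC
LIMIT 1;
```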

Usage Guidelines

  1. Create a user and related tables, and add indexes.

    • If the separation of duties function is disabled, this setup is required for common users only:

      Create a user.

      CREATE USER load_user WITH PASSWORD '************';

      Create related tables and grant permissions.

      GRANT ALL ON FUNCTION copy_error_log_create() TO load_user;
      GRANT ALL ON SCHEMA public TO load_user;
      SELECT copy_error_log_create();
      SELECT copy_summary_create();
      GRANT ALL PRIVILEGES ON  public.pgxc_copy_error_log To load_user;
      GRANT ALL PRIVILEGES ON  public.gs_copy_summary To load_user;
    • If the separation of duties function is enabled, this setup is required for both common users and administrators:

      Create a user.

      CREATE USER load_user WITH PASSWORD '********';

      Create related tables and add indexes.

      CREATE TABLE load_user.pgxc_copy_error_log (relname varchar, begintime timestamptz, filename varchar, lineno int8, rawrecord text, detail text);
      CREATE INDEX copy_error_log_relname_idx ON load_user.pgxc_copy_error_log(relname);
      CREATE TABLE load_user.gs_copy_summary(relname varchar, begintime timestamptz, endtime timestamptz, id bigint, pid bigint, readrows bigint, skiprows bigint, loadrows bigint, errorrows bigint, whenrows bigint, allnullrows bigint, detail text);
      CREATE INDEX gs_copy_summary_idx ON load_user.gs_copy_summary(id);

  2. Create a table and a control file, and prepare a data file.

    Create the loader_tbl table.

    CREATE TABLE  loader_tbl
    (
        ID   NUMBER,
        NAME VARCHAR2(20),
        CON  VARCHAR2(20),
        DT   DATE
    );
    

    Create the loader.ctl control file.

    LOAD DATA
    truncate into table loader_tbl
    WHEN (2:2) = ',' 
    fields terminated by ','
    trailing nullcols
    (
        id integer external,
        name char(32),
        con ":id || '-' || :name",
        dt date
    )
    

    Create the data.csv data file.

    1,OK,,2007-07-8
    2,OK,,2008-07-8
    3,OK,,2009-07-8
    4,OK,,2007-07-8
    43,DISCARD,,2007-07-8
    ,,,
    32,DISCARD,,2007-07-8
    a,ERROR int,,2007-07-8
    8,ERROR date,,2007-37-8
    ,,,,
     ,
    8,ERROR fields,,2007-37-8
    ,,,
    5,OK,,2021-07-30
    

  3. Import the data.

    Before importing data, ensure that the gs_loader tool has the execute permission and that the current path has write permission on files, because gs_loader generates temporary files during processing and automatically deletes them after the import is complete.

    gs_loader control=loader.ctl data=data.csv db=testdb bad=loader.bad errors=5 port=8000 passwd=************ user=load_user
    

    Execution result:

    gs_loader: version 0.1
     
     10 Rows successfully loaded. 
     
    log file is: 
     loader.log
    

Parameter Description

Table 1 gs_loader parameter description

  • help: Displays help information.
  • user: Database connection user (equivalent to -U). String.
  • -U: Database connection user (equivalent to user). String.
  • passwd: User password (equivalent to -W). String.
  • -W: User password (equivalent to passwd). String.
  • db: Database name. This parameter is mandatory and is equivalent to -d. String.
  • -d: Database name. This parameter is mandatory and is equivalent to db. String.
  • host: Host name of the machine on which the server is running, or the directory for the Unix-domain socket (equivalent to -h). See the gsql --host parameter.
  • -h: Host name of the machine on which the server is running, or the directory for the Unix-domain socket (equivalent to host). See the gsql --host parameter.
  • port: Port number of the database server (equivalent to -p). See the gsql --port parameter.
  • -p: Port number of the database server (equivalent to port). See the gsql --port parameter.
  • create: Determines whether to create the pgxc_copy_error_log and gs_copy_summary tables. The value can be true or false; the default value is true.
  • clean: Determines whether to clear the error records. The value can be true or false; the default value is false.
  • data: (Mandatory) Data file. You can specify multiple data files, or use wildcards (*) and question marks (?) to match multiple data files. String.
  • control: (Mandatory) Name of the control file. String.
  • log: Name of the log file. String.
  • bad: Name of the file recording the error lines. You can also specify a directory; the file name is then generated from the data file name. String.
  • discard: Name of the file recording the lines that fail to match any WHEN clause. You can also specify a directory; the file name is then generated from the data file name. String.
  • errors: Maximum number of error lines allowed in a data file. Integer; default value: 0.
  • skip: Number of lines to skip at the beginning of a data file. Integer; default value: 0.
  • bindsize: Accepted for syntax compatibility only; the function is not implemented.
  • rows: Accepted for syntax compatibility only; the function is not implemented.

CAUTION:
  • All parameters are in lowercase and are compatible with the gsql login options: -p (port), -h (host), -d (database), -U (username), and -W (password).
  • gs_loader uses a .bad file to record errors from the rawrecord column of the error table. If an error record cannot be decoded in the current encoding, rawrecord is not recorded in the error table, and a blank line is written to the .bad file.
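Wildcards in the data parameter follow glob-style matching: a question mark matches exactly one character and an asterisk matches any run of characters. The difference can be seen with ordinary shell globbing over illustrative file names:

```shell
# Create a scratch directory with three illustrative data files.
dir=$(mktemp -d)
touch "$dir/data1.csv" "$dir/data2.csv" "$dir/data10.csv"

# '?' matches exactly one character: data1.csv and data2.csv.
ls "$dir"/data?.csv | wc -l

# '*' matches any sequence of characters: all three files.
ls "$dir"/data*.csv | wc -l
```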

Control Files

  • Syntax
    LOAD [ DATA ]
    [CHARACTERSET char_set_name]
    [INFILE [directory_path] [filename ] ]
    [BADFILE [directory_path] [filename ] ]
    [{ INSERT | APPEND | REPLACE | TRUNCATE }] 
    INTO TABLE table_name
    [{ INSERT | APPEND | REPLACE | TRUNCATE }] 
    [FIELDS CSV]
    [TERMINATED [BY] { 'string' }]
    [OPTIONALLY ENCLOSED BY { 'string' }]
    [TRAILING NULLCOLS]
    [ WHEN { (start:end) | column_name } {= | !=} 'string' ]
    [(
    col_name [ [ POSITION ({ start:end }) ]  ["sql_string"] ] | [ FILLER [column_type [external] ] ] | [ CONSTANT "string" ] | [ SEQUENCE ( { COUNT | MAX | integer } [, incr] ) ]|[NULLIF (COL=BLANKS)]
    [, ...]
    )]
    
  • Parameter description:
    • CHARACTERSET

      Character set.

      Value range: a string.

    • INFILE

      This keyword currently has no effect and must occupy a separate line in the control file. It is ignored at run time; specify the data file in the gs_loader command-line parameters instead.

    • BADFILE

      This keyword currently has no effect and is ignored at run time. If the bad parameter is not specified in the gs_loader command, a bad file is generated based on the name of the corresponding control file.

    • INSERT | APPEND | REPLACE | TRUNCATE

      Import mode.

      INSERT: If the table contains data, an error is reported.

      APPEND: Data is inserted directly.

      REPLACE: If the table contains data, all data is deleted and then inserted.

      TRUNCATE: If the table contains data, all data is deleted and then inserted.

    • table_name

      Specifies the name (possibly schema-qualified) of an existing table.

      Value range: an existing table name

    • FIELDS csv

      Specifies that the CSV mode of COPY is used. In CSV mode, the default separator is a comma (,), and the default quotation mark is a double quotation mark (").

    • TERMINATED [BY] { 'string' }

      The string that separates columns within each row (line) of the file; it cannot be longer than 10 bytes.

      Value range: The delimiter cannot include any of the following characters: \.abcdefghijklmnopqrstuvwxyz0123456789

      Default value: a tab character in text format, a comma in CSV format.

    • OPTIONALLY ENCLOSED BY { 'string' }

      Specifies a quoted character string for a CSV file.

      The default value is double quotation marks (") only in CSV mode that is explicitly specified by the FIELDS CSV parameter.

      In other modes, there is no default value.

    • TRAILING NULLCOLS

      Specifies that when trailing columns of a row are missing from the source data file, they are loaded as NULL values.

    • WHEN { (start:end) | column_name } {= | !=}

      Filters rows by character string between start and end or by column name.

      Value range: a string.

    • POSITION ({ start:end })

      Processes columns and obtains the corresponding character string between positions start and end.

    • "sql_string"

      Processes columns and calculates column values based on column expressions.

      Value range: a string.

    • FILLER

      Processes columns. If FILLER occurs, this column is skipped.

    • CONSTANT

      Processes columns and sets the inserted columns to constants.

      Value range: a string.

    • SEQUENCE ( { COUNT | MAX | integer } [, incr] )

      Processes columns to generate the corresponding sequence values.

      • COUNT: The count starts based on the number of rows in the table.
      • MAX: The count starts from the maximum value of this column in the table.
      • integer: The count starts from the specified value.
      • incr: indicates the increment each time.
    • NULLIF

      Sets the field to NULL. Currently, only the COL POSITION() CHAR NULLIF (COL=BLANKS) syntax is supported.

    CAUTION:
    • OPTIONS, INFILE, and BADFILE are not supported. Syntax errors for them are suppressed only in specific scenarios.
    • gs_loader uses a .bad file to record errors from the rawrecord column of the error table. If an error record cannot be decoded in the current encoding, rawrecord is not recorded in the error table, and a blank line is written to the .bad file.

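Putting several of the column-processing clauses together, a control file might look like the following sketch (the table name demo_tbl and all column names are hypothetical, for illustration only):

```
LOAD DATA
APPEND INTO TABLE demo_tbl
FIELDS CSV
TERMINATED BY ','
TRAILING NULLCOLS
(
    seq_id SEQUENCE (MAX, 1),
    name char(32),
    note FILLER,
    src CONSTANT "batch_01",
    upper_name "upper(:name)"
)
```

Here seq_id continues counting from the column's current maximum value with an increment of 1, note is read from the data file but skipped, src is set to the constant batch_01 for every row, and upper_name is computed from the name field by a column expression.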