Updated on 2023-06-13 GMT+08:00

Step 4: Metadata Collection

To manage and monitor the raw data that has been migrated to the cloud, use the DataArts Catalog module of DataArts Studio to collect and monitor metadata at the Source Data Integration (SDI) layer.

Collecting and Monitoring Metadata

  1. On the DataArts Studio console, locate an instance and click Access. On the displayed page, locate a workspace and click DataArts Catalog.

    Figure 1 DataArts Catalog

  2. Choose Metadata Collection > Collection Tasks in the left navigation pane, right-click a directory in the directory tree, and choose Create Directory from the shortcut menu. In the displayed dialog box, enter a directory name (for example, transport), select a parent directory, and click OK.

    Figure 2 Collection Tasks

  3. Select the transport directory in the directory tree and click Add Task.
  4. Create a collection task named transport_all, configure the parameters as shown in the following figures, and click Next.

    Figure 3 Creating a collection task (basic settings)
    Figure 4 Creating a metadata collection task

  5. Configure the scheduling mode and click Submit.

    Figure 5 Configuring the scheduling mode

  6. In the collection task list, locate the new collection task and click Start Scheduling in the row that contains the task.

    Figure 6 Starting scheduling

  7. Choose Metadata Collection > Task Monitoring in the left navigation pane and check whether the collection task has succeeded.

    Figure 7 Viewing a monitoring task

  8. After the collection task succeeds, choose Data Map > Data Catalog in the left navigation pane, click the Technical Assets tab, and set the filter criteria. For example, select mrs_hive_link for Data Connections and Table for Types. All tables that meet the filter criteria are displayed.

    Figure 8 Technical assets

  9. Click the target metadata name to view its details.

    Figure 9 Metadata details
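
The filter in step 8 combines its criteria with AND: an asset is listed only if it matches both the selected data connection and the selected type. The following is a minimal local sketch of that logic in Python. It does not call any DataArts Studio API. The Asset class, the filter_assets function, and all sample asset names are hypothetical and exist only for illustration. Only mrs_hive_link and Table come from the example above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Asset:
    """Hypothetical stand-in for one collected metadata entry."""
    name: str
    connection: str   # data connection the asset was collected from
    asset_type: str   # for example "Table" or "Column"


def filter_assets(
    assets: List[Asset],
    connection: Optional[str] = None,
    asset_type: Optional[str] = None,
) -> List[Asset]:
    """Return the assets that match every criterion that is set.

    A criterion left as None is not applied, which mirrors leaving
    a filter unselected on the Technical Assets tab.
    """
    return [
        a for a in assets
        if (connection is None or a.connection == connection)
        and (asset_type is None or a.asset_type == asset_type)
    ]


# Made-up sample data; real asset names depend on what the task collected.
SAMPLE_ASSETS = [
    Asset("sdi_example_trips", "mrs_hive_link", "Table"),
    Asset("pickup_time", "mrs_hive_link", "Column"),
    Asset("sdi_example_vendors", "mrs_hive_link", "Table"),
    Asset("dws_example_orders", "dws_link", "Table"),
]

matches = filter_assets(
    SAMPLE_ASSETS, connection="mrs_hive_link", asset_type="Table"
)
print([a.name for a in matches])
# prints ['sdi_example_trips', 'sdi_example_vendors']
```

Setting only one criterion widens the result. For example, filtering by asset_type="Table" alone would also return the table from the hypothetical dws_link connection.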