Video Dataset Processing Operators
The data processing operators provide multiple data operation capabilities, including data extraction, filtering, conversion, and labeling. These operators help you extract useful information from massive data and perform deep processing to generate high-quality training data.
The platform supports processing of video datasets, including data extraction, data conversion, data filtering, and data labeling. Table 1 lists the capabilities of video dataset processing operators.
Category | Operator Name | Operator Description |
---|---|---|
Data extraction | Scene Splitting | Splits a long video into short clips at scene changes. If a clip is longer than the specified time threshold, it is further split by duration. |
Data conversion | | Adds a full-screen text watermark to a video. |
Data conversion | Video Cropping | Crops unnecessary elements from a video, such as subtitles, logos, watermarks, borders, and dense text, and filters out video files whose area ratio after cropping exceeds the preset threshold. Before using this operator, run the subtitle, logo, watermark, border, and dense text recognition operators. |
Data filtering | | Filters videos based on video metadata (frame rate, resolution, and video duration) and retains only the videos that meet the specified conditions. Note: The standard frame rate of a movie is 24 FPS or 30 FPS. |
Data filtering | Video Aspect Ratio Filtering | Filters videos based on the aspect ratio, which is the ratio of the width to the height of the video image. |
Data labeling | Pornographic Video Detection | Labels pornographic content. |
Data labeling | Violent and Terrorism Video Detection | Labels violent and terrorism content. |
Data labeling | Political Video Content Detection and Scoring | Labels political content. |
Data labeling | Motion Range Scoring | Calculates and scores the motion range of each pixel in each frame, and identifies videos with too fast motion (for example, > 100 optical flows) or too slow motion (for example, ≤ 2 optical flows). A larger value indicates faster motion. |
Data labeling | Basic Quality Scoring | Scores the basic video quality (definition, brightness, blur, image shaking and ghosting, overexposure in low light, and glitch). The value range is (0, 1). A higher value indicates better quality. A video whose score is greater than 0.05 is considered to have high basic quality. |
Data labeling | Aesthetics Scoring | Scores the aesthetics of a video along the following dimensions: content (attractive and clear), composition (good object position), color (vital and pleasant), light (obvious contrast), and track (continuous and stable). The value range is (0, 1). A higher value indicates better aesthetics. A video whose score is greater than 0.95 is considered to have high aesthetics. |
Data labeling | Watermark Identification | Identifies whether a video contains watermarks. |
Data labeling | Subtitle Identification | Identifies whether a video contains subtitles. |
Data labeling | Logo Identification | Identifies whether a video contains a logo. |
Data labeling | | Checks whether a video contains black bars. |
Data labeling | Dense Text Identification | Identifies whether a video contains dense text. A video in which the proportion of dense text area exceeds the specified proportion is a video with dense text. By default, a video with a cropping area proportion greater than or equal to 7% is a video with dense text. |
Data labeling | Video Classification (InterVideo2) | Returns video label categories. There are seven categories at L1 and 700 categories at L4. |
Data labeling | Video Synopsis Generation (Simplified) | Extracts frames from a video and generates a simplified video synopsis through model inference. |
Data labeling | Video Synopsis Generation (Detailed) | Extracts frames from a video and generates a detailed video synopsis through model inference. |
Scene Splitting
- Applicable file format: video > mp4/avi.
- Parameter description:
Video to be split: Videos that meet the resolution, duration, and frame rate criteria are split.
Specifications after video splitting: The maximum duration of a single video slice can be customized. If a slice produced by the initial scene-based split is longer than the specified value, that slice is split further, so that every final slice is no longer than the specified threshold.
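The duration cap described above can be sketched as follows. This is a minimal illustration, not the operator's implementation: the function names are hypothetical, scene boundaries are assumed to be already known as (start, end) pairs in seconds, and an over-long clip is assumed to be cut into fixed-length pieces with a shorter remainder (the source does not say how the operator divides it).

```python
from typing import List, Tuple

Clip = Tuple[float, float]  # (start_seconds, end_seconds)


def split_by_duration(clip: Clip, max_len: float) -> List[Clip]:
    """Cut one clip into consecutive slices that are each at most max_len long."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    start, end = clip
    slices = []
    cursor = start
    while cursor < end:
        nxt = min(cursor + max_len, end)  # never run past the end of the clip
        slices.append((cursor, nxt))
        cursor = nxt
    return slices


def split_scenes(scene_clips: List[Clip], max_len: float) -> List[Clip]:
    """Apply the duration cap to every scene-based clip."""
    result = []
    for clip in scene_clips:
        result.extend(split_by_duration(clip, max_len))
    return result


if __name__ == "__main__":
    # Two scenes of 4 s and 11 s; with a 5 s cap only the second is split further.
    print(split_scenes([(0, 4), (4, 15)], max_len=5))
    # prints [(0, 4), (4, 9), (9, 14), (14, 15)]
```

Clips that are already within the cap pass through unchanged, so the scene boundaries are preserved wherever possible.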
Video Cropping
- Applicable file format: video > mp4/avi.
- Parameter description:
Items to be cropped: Remove useless information such as subtitles, logos, watermarks, borders, and dense text from videos.
Cropping area ratio: The ratio of the cropped video area to the original video area is the cropping area ratio. Videos whose cropping area ratio exceeds the preset threshold will be filtered out. The default value is 30%.
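As a rough illustration of the threshold check, the sketch below computes a cropping area ratio from frame dimensions. It assumes that "cropped video area" means the area removed by cropping, so that a larger ratio means more of the frame was lost; the source does not spell this out. The function names are hypothetical.

```python
def cropping_area_ratio(orig_w: int, orig_h: int, kept_w: int, kept_h: int) -> float:
    """Fraction of the original frame area removed by the crop (assumed meaning)."""
    if orig_w <= 0 or orig_h <= 0:
        raise ValueError("original dimensions must be positive")
    if not (0 <= kept_w <= orig_w and 0 <= kept_h <= orig_h):
        raise ValueError("kept region must fit inside the original frame")
    return 1.0 - (kept_w * kept_h) / (orig_w * orig_h)


def is_filtered_out(ratio: float, threshold: float = 0.30) -> bool:
    """A video is filtered out when its ratio exceeds the threshold (default 30%)."""
    return ratio > threshold


if __name__ == "__main__":
    # Removing a subtitle band: 1920x1080 cropped to 1920x864 loses about 20% of the area.
    light = cropping_area_ratio(1920, 1080, 1920, 864)
    # Heavy crop: 1920x1080 cropped to 1280x720 loses more than half of the area.
    heavy = cropping_area_ratio(1920, 1080, 1280, 720)
    print(is_filtered_out(light), is_filtered_out(heavy))
```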
Video Aspect Ratio Filtering
- Applicable file format: video > mp4/avi.
- Parameter description:
Aspect ratio threshold: Videos whose aspect ratio exceeds the threshold will be filtered out. The threshold range is (1, 10). You can enter one decimal place.
- Example:
There are two videos, and their respective aspect ratios are 1.77 and 1.79.
Set the aspect ratio threshold to 1.78. After operator processing, the result is as follows.
Only the video whose aspect ratio is 1.79 is retained.
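The example can be reproduced with a short sketch. It follows the worked example, in which the 1.79 video is retained at a threshold of 1.78. The function names and the `keep_above` switch are hypothetical and only illustrate the comparison, not the operator's actual interface.

```python
from typing import Dict, List


def aspect_ratio(width: int, height: int) -> float:
    """Aspect ratio is the width of the video image divided by its height."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return width / height


def filter_by_aspect_ratio(
    ratios: Dict[str, float], threshold: float, keep_above: bool = True
) -> List[str]:
    """Return the names of the retained videos.

    keep_above=True mirrors the worked example (ratios above the threshold
    are retained); keep_above=False retains the remaining videos instead.
    """
    if keep_above:
        return [name for name, r in ratios.items() if r > threshold]
    return [name for name, r in ratios.items() if r <= threshold]


if __name__ == "__main__":
    videos = {"a.mp4": 1.77, "b.mp4": 1.79}
    print(filter_by_aspect_ratio(videos, threshold=1.78))  # prints ['b.mp4']
```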
Pornographic Video Detection
- Applicable file format: video > mp4/avi.
- Operator description: Labels pornographic content.
- Parameter configuration example:
- Detection example:
The results are stored in the annotation file as the video_anti_porn object.
suggestion: indicates whether the file passes the check. pass indicates that the file passes the check and no problem occurs. review indicates that manual review is required. You can choose to bypass or block the file based on your review policy. block indicates that the file to be reviewed is problematic.
confidence: the model's confidence in its suggestion. If suggestion is pass, the value is 0. If suggestion is review or block, the value ranges from 0 to 1.
label: label of the pornographic content detected by the model. If no pornographic content is detected, the value is empty.
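A downstream script might route files on the suggestion field roughly as follows. This is a sketch under assumptions: the annotation record is taken to be a JSON object that holds video_anti_porn at the top level, and the routing names ("keep", "manual_review", "drop", "unlabeled") are illustrative, not platform values.

```python
import json


def route(record: dict, key: str = "video_anti_porn") -> str:
    """Map the moderation suggestion to a hypothetical pipeline action."""
    result = record.get(key)
    if result is None:
        return "unlabeled"  # the operator has not been run on this file
    suggestion = result["suggestion"]
    if suggestion == "pass":
        return "keep"
    if suggestion == "review":
        return "manual_review"
    if suggestion == "block":
        return "drop"
    raise ValueError(f"unexpected suggestion: {suggestion!r}")


if __name__ == "__main__":
    # Hypothetical annotation line; only the three documented fields are used.
    line = '{"video_anti_porn": {"suggestion": "review", "confidence": 0.83, "label": "example_label"}}'
    print(route(json.loads(line)))  # prints manual_review
```

Passing `key="video_anti_terrorism"` would apply the same routing to the violent and terrorism detection result, which uses the same three fields.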
Violent and Terrorism Video Detection
- Applicable file format: video > mp4/avi.
- Operator description: Labels violent and terrorism content.
- Parameter configuration example:
- Detection example: The results are stored in the annotation file as the video_anti_terrorism object.
suggestion: indicates whether the file passes the check. pass indicates that the file passes the check and no problem occurs. review indicates that manual review is required. You can choose to bypass or block the file based on your review policy. block indicates that the file to be reviewed is problematic.
confidence: the model's confidence in its suggestion. If suggestion is pass, the value is 0. If suggestion is review or block, the value ranges from 0 to 1.
label: label of the violent and terrorism content detected by the model. If no violent or terrorism content is detected, the value is empty.
Political Video Content Detection and Scoring
- Applicable file format: video > mp4/avi.
- Operator description: Labels political content.
- Parameter configuration example:
- Detection example:
The results are stored in the annotation file as the video_anti_politics object.
suggestion: indicates whether the file passes the check. pass indicates that the file passes the check and no problem occurs. review indicates that manual review is required. You can choose to bypass or block the file based on your review policy. block indicates that the file to be reviewed is problematic.
result: result returned by the model after file detection, including the suggestion, confidence, and label. One or more records can be returned.
confidence: the model's confidence in its suggestion. If suggestion is pass, the value is 0. If suggestion is review or block, the value ranges from 0 to 1.
label: label of the political content detected by the model. If no political content is detected, the value is empty.
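Because result can contain more than one record, a consumer may want a single decisive record per file. The sketch below picks the strictest one. It assumes each record is a dictionary with the suggestion, confidence, and label fields described above. The severity ordering (pass < review < block) and the tie-break on confidence are assumptions, not documented behavior.

```python
from typing import Dict, List

# Assumed ordering: a block outranks a review, which outranks a pass.
SEVERITY = {"pass": 0, "review": 1, "block": 2}


def strictest(records: List[Dict]) -> Dict:
    """Return the record with the most severe suggestion; ties go to higher confidence."""
    if not records:
        raise ValueError("result must contain at least one record")
    return max(records, key=lambda r: (SEVERITY[r["suggestion"]], r["confidence"]))


if __name__ == "__main__":
    # Hypothetical result list with placeholder labels.
    result = [
        {"suggestion": "pass", "confidence": 0, "label": ""},
        {"suggestion": "review", "confidence": 0.6, "label": "label_a"},
        {"suggestion": "review", "confidence": 0.9, "label": "label_b"},
    ]
    print(strictest(result)["label"])  # prints label_b
```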
Motion Range Scoring
- Applicable file format: video > mp4/avi.
- Scoring description:
Identifies videos with too fast or too slow motion. A larger value indicates faster motion. If the motion range is greater than 100 optical flows, the motion is too fast. If the motion range is less than or equal to 2 optical flows, the motion is too slow.
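The two thresholds translate directly into a three-way classification. In this sketch the function name and the "normal" category for scores in between are illustrative. The boundary handling follows the text: strictly greater than 100 is too fast, and less than or equal to 2 is too slow.

```python
def classify_motion(score: float, slow_max: float = 2.0, fast_min: float = 100.0) -> str:
    """Classify a motion range score using the thresholds from the scoring description."""
    if score > fast_min:
        return "too_fast"
    if score <= slow_max:
        return "too_slow"
    return "normal"  # hypothetical name for everything in between


if __name__ == "__main__":
    for value in (1.5, 2.0, 2.5, 100.0, 100.1):
        print(value, classify_motion(value))
```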
Basic Quality Scoring
- Applicable file format: video > mp4/avi.
- Scoring description:
Scores the basic video quality (definition, brightness, blur, image shaking and ghosting, overexposure in low light, and glitch). The value range is (0, 1). A higher value indicates better quality. A video whose score is greater than 0.05 is considered as a video with high basic quality.
Aesthetics Scoring
- Applicable file format: video > mp4/avi.
- Scoring description:
Scores the aesthetics of a video from the following dimensions: content (attractive and clear), composition (good object position), color (vital and pleasant), light (obvious contrast), and track (continuous and stable). The value range is (0, 1). A higher value indicates better aesthetics. A video whose score is greater than 0.95 is considered as a video with high aesthetics.
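The basic quality and aesthetics scores share the same open range (0, 1) but use very different cut-offs (0.05 and 0.95), so it can help to keep the two thresholds explicit when selecting videos. The following sketch assumes hypothetical field names `quality_score` and `aesthetics_score`; the names under which the platform stores these scores are not given here.

```python
from typing import Dict, List

BASIC_QUALITY_THRESHOLD = 0.05  # score > 0.05: high basic quality
AESTHETICS_THRESHOLD = 0.95     # score > 0.95: high aesthetics


def is_high(score: float, threshold: float) -> bool:
    """Check a score in the open range (0, 1) against a strict lower threshold."""
    if not 0 < score < 1:
        raise ValueError("score must lie in the open range (0, 1)")
    return score > threshold


def select_high_quality(videos: List[Dict]) -> List[str]:
    """Keep videos that pass both thresholds (field names are hypothetical)."""
    return [
        v["file"]
        for v in videos
        if is_high(v["quality_score"], BASIC_QUALITY_THRESHOLD)
        and is_high(v["aesthetics_score"], AESTHETICS_THRESHOLD)
    ]


if __name__ == "__main__":
    videos = [
        {"file": "a.mp4", "quality_score": 0.40, "aesthetics_score": 0.97},
        {"file": "b.mp4", "quality_score": 0.40, "aesthetics_score": 0.80},
        {"file": "c.mp4", "quality_score": 0.03, "aesthetics_score": 0.99},
    ]
    print(select_high_quality(videos))  # prints ['a.mp4']
```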
Watermark Identification
- Applicable file format: video > mp4/avi.
- Operator description: Identifies whether a video contains watermarks.
- Example: The JSONL file shows whether a watermark is identified. If the value of consist_watermark is 1, a watermark is identified. If the value is 0, no watermark is identified.
Subtitle Identification
- Applicable file format: video > mp4/avi.
- Operator description: Identifies whether a video contains subtitles.
- Example: The JSONL file shows whether subtitles are identified. If the value of consist_subtitle is 1, subtitles are identified. If the value is 0, no subtitles are identified.
Logo Identification
- Applicable file format: video > mp4/avi.
- Operator description: Identifies whether a video contains a logo.
- Example: The JSONL file shows whether a logo is identified. If the value of consist_logo is 1, a logo is identified. If the value is 0, no logo is identified.
Dense Text Identification
- Applicable file format: video > mp4/avi.
- Parameter description:
Proportion of dense text area: A video in which the proportion of dense text area exceeds the specified proportion is a video with dense text. By default, a video with a cropping area proportion greater than or equal to 7% is a video with dense text.
- Example: In the JSONL file, if the value of consist_densetext is 1, dense text is identified. If the value is 0, dense text is not identified.
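The four identification flags (consist_watermark, consist_subtitle, consist_logo, and consist_densetext) can be read from the JSONL output with a few lines of code. This sketch assumes that each JSONL line is one JSON object and that the flags sit at the top level of that object. The `file` key in the sample is hypothetical.

```python
import json
from typing import Dict, List

FLAGS = ("consist_watermark", "consist_subtitle", "consist_logo", "consist_densetext")


def read_jsonl(text: str) -> List[Dict]:
    """Parse JSONL text: one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def detected_elements(record: Dict) -> List[str]:
    """Return the flags whose value is 1, meaning the element was identified."""
    return [flag for flag in FLAGS if record.get(flag) == 1]


if __name__ == "__main__":
    sample = "\n".join([
        json.dumps({"file": "a.mp4", "consist_watermark": 1, "consist_subtitle": 0,
                    "consist_logo": 0, "consist_densetext": 0}),
        json.dumps({"file": "b.mp4", "consist_watermark": 0, "consist_subtitle": 0,
                    "consist_logo": 0, "consist_densetext": 0}),
    ])
    for record in read_jsonl(sample):
        print(record["file"], detected_elements(record))
```

A non-empty list marks a video as a candidate for the Video Cropping operator, which requires these recognition operators to have been run first.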
Video Classification (InterVideo2)
Video Synopsis Generation (Detailed)
- Applicable file format: video > mp4/avi.
- Operator description:
Extracts frames from a video and generates a detailed video synopsis through model inference.
- Parameter configuration example:
- Example: The long_prompt field in the description indicates the detailed video synopsis.
Video Synopsis Generation (Simplified)
- Applicable file format: video > mp4/avi.
- Operator description:
Extracts frames from a video and generates a simplified video synopsis through model inference.
- Parameter configuration example:
- Example: The prompt field in the description indicates the simplified video synopsis.
Figure 1 Example