Caffe Operator Boundaries

Updated on 2024-09-30 GMT+08:00

For the Caffe framework, if an operator's input does not have four dimensions and the operator has an axis parameter, axis cannot be set to a negative number.

Table 1 shows the boundaries of Caffe operators supported by .om models.

Table 1 Caffe operator boundaries

No.

Operator

Definition

Boundary

1

Absval

Computes the absolute value of the input.

[Input]

One input

[Parameter]

engine: (optional) enum, default to 0, CAFFE = 1, CUDNN = 2

[Constraint]

None

[Quantitative tool support]

Yes

2

Argmax

Returns the index number corresponding to the maximum input value.

[Input]

One input

[Parameter]

  • out_max_val: (optional) bool, default to false
  • top_k: (optional) uint32, default to 1
  • axis: (optional) int32

[Constraint]

None

[Quantitative tool support]

No

3

BatchNorm

Normalizes the input:

(x – avg(x))/sqrt(variance(x) + eps)

[Input]

One input

[Parameter]

  • use_global_stats: bool, must be true
  • moving_average_fraction: (optional) float, default to 0.999
  • eps: (optional) float, default to 1e-5

[Constraint]

Only the C dimension can be normalized.

[Quantitative tool support]

Yes

4

Concat

Concatenates the input along the given dimension.

[Input]

Multiple inputs

[Parameter]

  • concat_dim: (optional) uint32, default to 1, greater than 0
  • axis: (optional) int32, default to 1, exclusive with concat_dim. When axis is –1, four input dimensions are required. Otherwise, the result may be incorrect.

[Constraint]

  • For the input tensor, the sizes of its dimensions must be the same except the dimension for concatenation.
  • The range of the input tensor count is [1, 1,000].

[Quantitative tool support]

Yes

5

DepthwiseConvolution

Depthwise convolution

[Input]

One 4D input, with a constant filter

[Parameter]

  • num_output: (optional) uint32
  • bias_term: (optional) bool, default to true
  • pad: uint32, default to 0, array
  • kernel_size: uint32, array
  • stride: uint32, default to 1, array
  • dilation: uint32, default to 1, array
  • pad_h: (optional) uint32, default to 0 (2D only)
  • pad_w: (optional) uint32, default to 0 (2D only)
  • kernel_h: (optional) uint32 (2D only)
  • kernel_w: (optional) uint32 (2D only)
  • stride_h: (optional) uint32 (2D only)
  • stride_w: (optional) uint32 (2D only)
  • group: (optional) uint32, default to 1
  • weight_filler: (optional) FillerParameter
  • bias_filler: (optional) FillerParameter
  • engine: (optional) enum, default to 0, CAFFE = 1, CUDNN = 2
  • force_nd_im2col: (optional) bool, default to false
  • axis: (optional) int32, default to 1

[Constraint]

filterN=inputC=group

[Quantitative tool support]

Yes

6

Convolution

Convolution

[Input]

One 4D input, with a constant filter

[Parameter]

  • num_output: (optional) uint32
  • bias_term: (optional) bool, default to true
  • pad: uint32, default to 0, array
  • kernel_size: uint32, array
  • stride: uint32, default to 1, array
  • dilation: uint32, default to 1, array
  • pad_h: (optional) uint32, default to 0 (2D only)
  • pad_w: (optional) uint32, default to 0 (2D only)
  • kernel_h: (optional) uint32 (2D only)
  • kernel_w: (optional) uint32 (2D only)
  • stride_h: (optional) uint32 (2D only)
  • stride_w: (optional) uint32 (2D only)
  • group: (optional) uint32, default to 1
  • weight_filler: (optional) FillerParameter
  • bias_filler: (optional) FillerParameter
  • engine: (optional) enum, default to 0, CAFFE = 1, CUDNN = 2
  • force_nd_im2col: (optional) bool, default to false
  • axis: (optional) int32, default to 1

[Constraint]

  • (inputW + padWHead + padWTail) ≥ (((FilterW-1) x dilationW) + 1)
  • (inputW + padWHead + padWTail)/StrideW + 1 ≤ 2147483647
  • (inputH + padHHead + padHTail) ≥ (((FilterH-1) x dilationH) + 1)
  • (inputH + padHHead + padHTail)/StrideH + 1 ≤ 2147483647
  • 0 ≤ Pad < 256, 0 < FilterSize < 256, 0 < Stride < 64, 1 ≤ dilationsize < 256
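
The Convolution constraints above can be checked before model conversion with a short script. The following is a minimal sketch, not part of any Huawei tool: the function name conv_violations and its argument layout are hypothetical, and "/" in the constraints is assumed to mean integer division.

```python
INT32_MAX = 2147483647


def conv_violations(in_h, in_w, filter_h, filter_w,
                    stride=(1, 1), dilation=(1, 1),
                    pad_h=(0, 0), pad_w=(0, 0)):
    """Return the list of violated Convolution constraints (empty if none).

    stride and dilation are (H, W) pairs; pad_h and pad_w are (head, tail) pairs.
    """
    problems = []
    stride_h, stride_w = stride
    dilation_h, dilation_w = dilation
    padded_h = in_h + sum(pad_h)
    padded_w = in_w + sum(pad_w)

    # The padded input must be at least as large as the dilated filter.
    if padded_w < (filter_w - 1) * dilation_w + 1:
        problems.append("padded W is smaller than the dilated filter W")
    if padded_h < (filter_h - 1) * dilation_h + 1:
        problems.append("padded H is smaller than the dilated filter H")

    # Value ranges from the last constraint in the list.
    if not all(0 <= p < 256 for p in (*pad_h, *pad_w)):
        problems.append("pad out of range [0, 256)")
    if not all(0 < k < 256 for k in (filter_h, filter_w)):
        problems.append("filter size out of range (0, 256)")
    if not all(1 <= d < 256 for d in dilation):
        problems.append("dilation out of range [1, 256)")
    if not all(0 < s < 64 for s in stride):
        problems.append("stride out of range (0, 64)")
    else:
        # Assumption: "/" in the constraint is integer division.
        if padded_w // stride_w + 1 > INT32_MAX:
            problems.append("W output bound exceeds INT32_MAX")
        if padded_h // stride_h + 1 > INT32_MAX:
            problems.append("H output bound exceeds INT32_MAX")
    return problems
```

For example, a 3 x 3 filter with dilation 2 covers 5 x 5 pixels, so conv_violations(2, 2, 3, 3, dilation=(2, 2)) reports both the W and the H size violations.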

[Quantitative tool support]

Yes

7

Crop

Crops the input.

[Input]

Two inputs

[Parameter]

  • axis: (optional) int32, default to 2. When axis is –1, four input dimensions are required.
  • offset: uint32, array

[Constraint]

None

[Quantitative tool support]

No

8

Deconvolution

Deconvolution

[Input]

One 4D input, with a constant filter

[Parameter]

  • num_output: (optional) uint32
  • bias_term: (optional) bool, default to true
  • pad: uint32, default to 0, array
  • kernel_size: uint32, array
  • stride: uint32, default to 1, array
  • dilation: uint32, default to 1, array
  • pad_h: (optional) uint32, default to 0 (2D only)
  • pad_w: (optional) uint32, default to 0 (2D only)
  • kernel_h: (optional) uint32 (2D only)
  • kernel_w: (optional) uint32 (2D only)
  • stride_h: (optional) uint32 (2D only)
  • stride_w: (optional) uint32 (2D only)
  • group: (optional) uint32, default to 1
  • weight_filler: (optional) FillerParameter
  • bias_filler: (optional) FillerParameter
  • engine: (optional) enum, default to 0, CAFFE = 1, CUDNN = 2
  • force_nd_im2col: (optional) bool, default to false
  • axis: (optional) int32, default to 1

[Constraint]

  • group = 1
  • dilation = 1
  • filterH - padHHead - 1 ≥ 0
  • filterW - padWHead - 1 ≥ 0

Restrictions involving intermediate variables:

  1. a = ALIGN(filter_num, 16) x ALIGN(filter_c, 16) x filter_h x filter_w x 2
  2. If ALIGN(filter_c, 16)%32 = 0, a = a/2
  3. conv_input_width = (deconvolution input W – 1) x strideW + 1
  4. b = (conv_input_width) x filter_h x ALIGN(filter_num, 16) x 2 x 2
  5. a + b ≤ 1024 x 1024
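
The intermediate variables above can be evaluated as follows. This is a sketch under stated assumptions: ALIGN(x, 16) is assumed to round x up to the next multiple of 16, and the function names (align, deconv_terms, deconv_fits) are hypothetical.

```python
def align(value, multiple=16):
    """Round value up to the next multiple (assumed meaning of ALIGN)."""
    return (value + multiple - 1) // multiple * multiple


def deconv_terms(filter_num, filter_c, filter_h, filter_w, in_w, stride_w):
    """Compute the intermediate variables a and b for Deconvolution."""
    a = align(filter_num) * align(filter_c) * filter_h * filter_w * 2
    if align(filter_c) % 32 == 0:
        a //= 2
    conv_input_width = (in_w - 1) * stride_w + 1
    b = conv_input_width * filter_h * align(filter_num) * 2 * 2
    return a, b


def deconv_fits(filter_num, filter_c, filter_h, filter_w, in_w, stride_w):
    """Check the final restriction a + b <= 1024 x 1024."""
    a, b = deconv_terms(filter_num, filter_c, filter_h, filter_w, in_w, stride_w)
    return a + b <= 1024 * 1024
```

The sketch covers only the restrictions involving intermediate variables; group = 1, dilation = 1, and the filter/pad conditions still need to be checked separately.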

[Quantitative tool support]

Yes

9

DetectionOutput

Generates detection results and outputs FSR.

[Input]

Three inputs

[Parameter]

  • num_classes: (mandatory) int32, indicating the number of classes to be predicted
  • share_location: (optional) bool, default to true, indicating that classes share one bounding box
  • background_label_id: (optional) int32, default to 0
  • nms_param: (optional) indicating non-maximum suppression (NMS)
  • save_output_param: (optional) indicating whether to save the detection result
  • code_type: (optional) default to CENTER_SIZE
  • variance_encoded_in_target: (optional) bool, default to true. The value true indicates that the variance is encoded in the target. Otherwise, the prediction offset needs to be adjusted accordingly.
  • keep_top_k: (optional) int32, indicating the total number of BBoxes to be reserved for each image after NMS
  • confidence_threshold: (optional) float, indicating that only the detection whose confidence is above the threshold is considered. If this parameter is not set, all boxes are considered.
  • nms_threshold: (optional) float
  • top_k: (optional) int32
  • boxes: (optional) int32, default to 1
  • relative: (optional) bool, default to true
  • objectness_threshold: (optional) float, default to 0.5
  • class_threshold: (optional) float, default to 0.5
  • biases: array
  • general_nms_param: optional

[Constraint]

  • Used for Faster R-CNN
  • Non-maximum suppression (NMS) ratio nmsThreshold is within (0, 1)
  • Probability threshold postConfThreshold is within (0, 1)
  • Classes ≥ 2
  • Input box count ≤ 1,024
  • Output W dimension = 16

[Quantitative tool support]

Yes

10

Eltwise

Computes element-wise operations (PROD, MAX, and SUM).

[Input]

At least two inputs

[Parameter]

  • operation: (optional) enum, PROD = 0, SUM = 1, MAX = 2; default to SUM
  • coeff: float array
  • stable_prod_grad: (optional) bool, default to true

[Constraint]

  • Up to four inputs
  • Compared with the native operator, this operator does not support the stable_prod_grad parameter.
  • PROD, MAX, and SUM operations are supported.

[Quantitative tool support]

Yes

11

Elu

Activation function

[Input]

One input

[Parameter]

alpha: (optional) float, default to 1

[Constraint]

None

[Quantitative tool support]

No

12

Exp

Applies e as the base and x as the exponent.

[Input]

One input

[Parameter]

  • base: (optional) float, default to –1.0
  • scale: (optional) float, default to 1.0
  • shift: (optional) float, default to 0.0

[Constraint]

None

[Quantitative tool support]

No

13

Flatten

Converts an input of shape N * C * H * W to a vector output of shape N * (C * H * W).

[Input]

One input

(top_size ≠ bottom_size ≠ 1. When axis is –1, four input dimensions are required.)

[Parameter]

  • axis: (optional) int32, default to 1
  • end_axis: (optional) int32, default to -1

[Constraint]

axis < end_axis

[Quantitative tool support]

Yes

14

FullConnection

Computes an inner product.

[Input]

One input

[Parameter]

  • num_output: (optional) uint32
  • bias_term: (optional) bool, default to true
  • weight_filler: (optional) FillerParameter, 2D
  • bias_filler: (optional) FillerParameter, 1D
  • axis: (optional) int32, default to 1
  • transpose: (optional) bool, default to false

[Constraint]

  • transpose = false, axis = 1
  • Bias_C ≤ 56832
  • To quantify the model, the following dimension restrictions must be satisfied:

    − When N = 1, then 2 x CEIL(C, 16) x 16 x xH x xW ≤ 1024 x 1024

    − When N > 1, then 2 x 16 x CEIL(C, 16) x 16 x xH x xW ≤ 1024 x 1024
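
The quantization restriction above can be sketched as a small check. Two assumptions apply: CEIL(C, 16) is read as C divided by 16 and rounded up, and the name fc_quantizable is hypothetical.

```python
def fc_quantizable(n, c, h, w):
    """Check the FullConnection dimension restriction for quantization."""
    c_blocks = -(-c // 16)  # assumed meaning of CEIL(C, 16): ceiling division
    size = 2 * c_blocks * 16 * h * w
    if n > 1:
        size *= 16  # extra factor of 16 in the N > 1 case
    return size <= 1024 * 1024
```

With an input of C = 1024 and H = W = 7, the check passes for N = 1 (100,352) but fails for N = 8 (1,605,632).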

[Quantitative tool support]

Yes

15

Interp

Interpolation layer

[Input]

One input

[Parameter]

  • height: (optional) int32, default to 0
  • width: (optional) int32, default to 0
  • zoom_factor: (optional) int32, default to 1
  • shrink_factor: (optional) int32, default to 1
  • pad_beg: (optional) int32, default to 0
  • pad_end: (optional) int32, default to 0

NOTE:

  • zoom_factor and shrink_factor are exclusive.
  • height and zoom_factor are exclusive.
  • height and shrink_factor are exclusive.

[Constraint]

(outputH x outputW)/(inputH x inputW) > 1/30

[Quantitative tool support]

No

16

Log

Performs logarithmic operation on the input.

[Input]

One input

[Parameter]

  • base: (optional) float, default to –1.0
  • scale: (optional) float, default to 1.0
  • shift: (optional) float, default to 0.0

[Constraint]

None

[Quantitative tool support]

No

17

LRN

Normalizes the input in a local region.

[Input]

One non-constant input

[Parameter]

  • local_size: (optional) uint32, default to 5
  • alpha: (optional) float, default to 1
  • beta: (optional) float, default to 0.75
  • norm_region: (optional) enum, default to ACROSS_CHANNELS (ACROSS_CHANNELS = 0, WITHIN_CHANNEL = 1)
  • lrnk: (optional) float, default to 1
  • engine: (optional) enum, default to 0, CAFFE = 1, CUDNN = 2

[Constraint]

  • local_size is an odd number greater than 0.
  • Inter-channel: If local_size is within [1, 15], lrnK > 0.00001 and beta > 0.01; otherwise, lrnK and beta can be any values. lrnK and alpha cannot both be 0. When the C dimension is greater than 1,776, local_size < 1728.
  • Intra-channel: lrnK = 1, local_size is within [1, 15], beta > 0.01.

[Quantitative tool support]

Yes

18

LSTM

Long short-term memory (LSTM) network

[Input]

Two or three inputs

  • X: time sequence data (T x B x Xt) in the NCHW 4D format, where N corresponds to the time sequence length T, C corresponds to the batch size B, H corresponds to the input data Xt at time point t, and W is fixed at 1.
  • Cont: sequence continuity flag (T x B)
  • Xs: (optional) static data (B x Xt)

[Parameter]

  • num_output: (optional) uint32, default to 0
  • weight_filler: (optional) FillerParameter
  • bias_filler: (optional) FillerParameter
  • debug_info: (optional) bool, default to false
  • expose_hidden: (optional) bool, default to false

[Constraint]

Restrictions involving intermediate variables:

a = (ALIGN(xt, 16) + ALIGN(output, 16)) x 16 x 2 x 2

b = (ALIGN(xt, 16) + ALIGN(output, 16)) x 16 x 4 x 2 x 2

c = use_projection ? (ALIGN(ht, 16) x ALIGN(output, 16) x 2) : 0

d = 16 x ALIGN(ht, 16) x 2

e = batchNum x 4

The constraints are as follows:

a + b + c ≤ 1024 x 1024

d ≤ 256 x 1024/8

e ≤ 256 x 1024/32
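
The LSTM restrictions above translate directly into code. The sketch below assumes that ALIGN(x, 16) rounds x up to the next multiple of 16. The names align16 and lstm_fits are hypothetical, and xt, output, ht, batch_num, and use_projection are passed in as plain arguments because the table does not say how they are derived from the layer.

```python
def align16(value):
    """Round value up to the next multiple of 16 (assumed meaning of ALIGN)."""
    return (value + 15) // 16 * 16


def lstm_fits(xt, output, ht, batch_num, use_projection=False):
    """Evaluate a to e and check the three LSTM constraints."""
    a = (align16(xt) + align16(output)) * 16 * 2 * 2
    b = (align16(xt) + align16(output)) * 16 * 4 * 2 * 2
    c = align16(ht) * align16(output) * 2 if use_projection else 0
    d = 16 * align16(ht) * 2
    e = batch_num * 4
    return (a + b + c <= 1024 * 1024
            and d <= 256 * 1024 // 8
            and e <= 256 * 1024 // 32)
```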

[Quantitative tool support]

No

19

Normalize

Normalization layer

[Input]

One input

[Parameter]

  • across_spatial: (optional) bool, default to true
  • scale_filler: (optional) default to 1.0
  • channel_shared: (optional) bool, default to true
  • eps: (optional) float, default to 1e-10

[Constraint]

  • 1e-7 < eps ≤ 0.1 + 1e-6
  • across_spatial can only be true for Caffe, indicating normalization by channel.

[Quantitative tool support]

Yes

20

Permute

Permutes the input dimensions according to a given mode.

[Input]

One input

[Parameter]

order: uint32, array

[Constraint]

None

[Quantitative tool support]

Yes

21

Pooling

Pools the input.

[Input]

One input

[Parameter]

  • pool: (optional) enum, indicating the pooling method, MAX = 0, AVE = 1, and STOCHASTIC = 2, default to MAX
  • pad: (optional) uint32, default to 0
  • pad_h: (optional) uint32, default to 0
  • pad_w: (optional) uint32, default to 0
  • kernel_size: (optional) uint32, exclusive with kernel_h/kernel_w
  • kernel_h: (optional) uint32
  • kernel_w: (optional) uint32, used in pair with kernel_h
  • stride: (optional) uint32, default to 1
  • stride_h: (optional) uint32
  • stride_w: (optional) uint32
  • engine: (optional) enum, default to 0, CAFFE = 1, CUDNN = 2
  • global_pooling: (optional) bool, default to false
  • ceil_mode: (optional) bool, default to true
  • round_mode: (optional) enum, CEIL = 0, FLOOR = 1, default to CEIL

[Constraint]

  • kernelH ≤ inputH + padTop + padBottom
  • kernelW ≤ inputW + padLeft + padRight
  • padTop < windowH
  • padBottom < windowH
  • padLeft < windowW
  • padRight < windowW

In addition to common restrictions, the following restrictions must be satisfied.

The global pool mode supports only the following ranges:

  1. outputH==1 && outputW==1 && kernelH>=inputH && kernelW>=inputW
  2. inputH*inputW ≤ 10,000
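
A minimal sketch of the Pooling checks follows. It assumes that windowH and windowW in the constraints refer to the kernel height and width. The name pooling_violations and the (top, bottom, left, right) pad layout are hypothetical, and the outputH == outputW == 1 condition of global pooling is not evaluated.

```python
def pooling_violations(in_h, in_w, kernel_h, kernel_w,
                       pad=(0, 0, 0, 0), global_pooling=False):
    """Return the list of violated Pooling constraints (empty if none).

    pad is (top, bottom, left, right).
    """
    top, bottom, left, right = pad
    problems = []
    if kernel_h > in_h + top + bottom:
        problems.append("kernelH exceeds padded input H")
    if kernel_w > in_w + left + right:
        problems.append("kernelW exceeds padded input W")
    # Assumption: windowH/windowW are the kernel dimensions.
    if top >= kernel_h or bottom >= kernel_h:
        problems.append("vertical pad must be smaller than windowH")
    if left >= kernel_w or right >= kernel_w:
        problems.append("horizontal pad must be smaller than windowW")
    if global_pooling:
        if kernel_h < in_h or kernel_w < in_w:
            problems.append("global pooling needs kernel >= input")
        if in_h * in_w > 10000:
            problems.append("global pooling needs inputH*inputW <= 10,000")
    return problems
```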

[Quantitative tool support]

Yes

22

Power

Computes the output y as (scale * x + shift)^power.

[Input]

One input

[Parameter]

  • power: (optional) float, default to 1.0
  • scale: (optional) float, default to 1.0
  • shift: (optional) float, default to 0.0

[Constraint]

  • power ≠ 1
  • scale * x + shift > 0

[Quantitative tool support]

Yes

23

Prelu

Activation function

[Input]

One input

[Parameter]

  • filler: optional
  • channel_shared: (optional) bool, indicating whether to share slope parameters across channels, default to false

[Constraint]

None

[Quantitative tool support]

Yes

24

PriorBox

Obtains the real location of the target from the box proposals.

[Input]

One input

[Parameter]

  • min_size: (mandatory) indicating the minimum frame size (in pixels)
  • max_size: (mandatory) indicating the maximum frame size (in pixels)
  • aspect_ratio: array, float. A repeated ratio is ignored. If no aspect ratio is provided, the default ratio 1 is used.
  • flip: (optional) bool, default to true. The value true indicates that each aspect ratio is reversed. For example, for aspect ratio r, the aspect ratio 1.0/r is generated.
  • clip: (optional) bool, default to false. The value true indicates that the previous value is clipped to the range [0, 1].
  • variance: array, used to adjust the variance of the BBoxes
  • img_size: (optional) uint32, exclusive with img_h or img_w
  • img_h: (optional) uint32
  • img_w: (optional) uint32
  • step: (optional) float, exclusive with step_h or step_w
  • step_h: (optional) float
  • step_w: (optional) float
  • offset: float, default to 0.5

[Constraint]

Used for the SSD network only

Output dimensions: [n, 2, detected boxes x 4, 1]

[Quantitative tool support]

Yes

25

Proposal

Sorts the box proposals by (proposal, score) and obtains the top N proposals by using the NMS.

[Input]

Three inputs (scores, bbox_pred, im_info)

[Parameter]

  • feat_stride: (optional) float
  • base_size: (optional) float
  • min_size: (optional) float
  • ratio: float array
  • scale: float array
  • pre_nms_topn: (optional) int32
  • post_nms_topn: (optional) int32
  • nms_thresh: (optional) float

[Constraint]

  • Used only for Faster R-CNN
  • ProposalParameter and PythonParameter are exclusive.
  • Value range of preTopK: 1-6,144
  • Value range of postTopK: 1-1,024
  • scaleCnt x ratioCnt ≤ 64
  • nmsThresh: threshold for Intersection-over-Union (IoU) box filtering, 0 < nmsThresh ≤ 1
  • minSize: minimum edge length of a box. A value less than this parameter is filtered out.
  • featStride: H/W stride between the two adjacent boxes used in default box generation
  • baseSize: default box size used in default box generation
  • ratio and scale: used in default box generation
  • imgH and imgW: height and width of the image input to the network. The values must be greater than 0.
  • Restrictions on the input dimensions:

    clsProb: C = 2 x scaleCnt x ratioCnt

    bboxPred: C = 4 x scaleCnt x ratioCnt

    bboxPrior: N = clsProb.N, C = 4 x scaleCnt x ratioCnt

    imInfo: N = clsProb.N, C = 3
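
The channel counts required by the input restrictions above follow from the number of scales and ratios. The sketch below only illustrates that arithmetic; the function name proposal_channels is hypothetical.

```python
def proposal_channels(scale_cnt, ratio_cnt):
    """Return the expected C dimension of each Proposal input."""
    anchors = scale_cnt * ratio_cnt
    if anchors > 64:
        raise ValueError("scaleCnt x ratioCnt must not exceed 64")
    return {
        "clsProb": 2 * anchors,
        "bboxPred": 4 * anchors,
        "bboxPrior": 4 * anchors,
        "imInfo": 3,
    }
```

With three scales and three ratios, clsProb needs 18 channels and bboxPred needs 36.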

[Quantitative tool support]

Yes

26

PSROIPooling

Position-sensitive region-of-interest pooling (PSROIPooling)

[Input]

Two inputs

[Parameter]

  • spatial_scale: (mandatory) float
  • output_dim: (mandatory) int32, indicating the number of output channels
  • group_size: (mandatory) int32, indicating the number of groups to encode position-sensitive score maps

[Constraint]

Used for the Region-based Fully Convolutional Network (R-FCN)

  • ROI coordinates [roiN, roiC, roiH, roiW]: 1 ≤ roiN ≤ 65535, roiC == 5, roiH == 1, roiW == 1
  • Dimensions of the input feature map: [xN, xC, xH, xW]

    pooledH == pooledW == groupSize ≤ 128

    pooledH and pooledW indicate the length and width of the pooled ROI.

  • Output format: y [yN, yC, yH, yW]
  • poolingMode == avg pooling, pooledH == pooledW == groupSize, pooledH ≤ 128, spatialScale > 0, groupSize > 0, outputDim > 0
  • 1 ≤ xN ≤ 65535, roisN % xN == 0
  • HW_LIMIT defines the limits of xH and xW.

    xHW = xH * xW

    pooledHW = pooledH * pooledW

    HW_LIMIT = (64 x 1024 – 8 x 1024)/32

    xH ≥ pooledH, xW ≥ pooledW

    xHW ≥ pooledHW

    xHW/pooledHW ≤ HW_LIMIT

  • In multi-batch scenarios, the ROIs are allocated equally to the batches. In addition, the batch sequence of the ROIs is the same as the feature.
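
The HW_LIMIT restrictions on the feature map can be sketched as follows. The name psroi_feature_map_ok is hypothetical, and the ratio xHW/pooledHW is assumed to be an exact (non-integer) division, which is the stricter reading.

```python
HW_LIMIT = (64 * 1024 - 8 * 1024) // 32  # evaluates to 1792


def psroi_feature_map_ok(x_h, x_w, group_size):
    """Check the xH/xW limits for PSROIPooling (pooledH == pooledW == groupSize)."""
    if not 0 < group_size <= 128:
        return False
    pooled_h = pooled_w = group_size
    if x_h < pooled_h or x_w < pooled_w:
        return False
    # Exact form of xHW / pooledHW <= HW_LIMIT, written without division.
    return x_h * x_w <= HW_LIMIT * pooled_h * pooled_w
```

xHW ≥ pooledHW is not tested separately because it follows from xH ≥ pooledH and xW ≥ pooledW.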

[Quantitative tool support]

Yes

27

Relu

Activation function, including common ReLU and Leaky ReLU, which can be specified by parameters

[Input]

One input

[Parameter]

  • negative_slope: (optional) float, default to 0
  • engine: (optional) enum, default to 0, CAFFE = 1, CUDNN = 2

[Constraint]

None

[Quantitative tool support]

Yes

28

Reshape

Reshapes the input.

[Input]

One input

[Parameter]

  • shape: constant, int64 or int
  • axis: (optional) int32, default to 0
  • num_axes: (optional) int32, default to -1

[Constraint]

None

[Quantitative tool support]

Yes

29

ROIAlign

Aggregates features using ROIs.

[Input]

At least two inputs

[Parameter]

  • pooled_h: (optional) uint32, default to 0
  • pooled_w: (optional) uint32, default to 0
  • spatial_scale: (optional) float, default to 1
  • sampling_ratio: (optional) int32, default to -1

[Constraint]

Mainly used for Mask R-CNN

  • Restrictions on the feature map:

1) H x W ≤ 5,248 (N > 1) or W x C < 40,960 (N = 1)

2) C ≤ 1280

3) ((C - 1)/128 + 1) x pooledW ≤ 216

  • Restrictions on the ROI:

1) C = 5 (caffe), H = 1, W = 1

2) samplingRatio * pooledW ≤ 128, samplingRatio * pooledH ≤ 128

3) H ≥ pooledH, W ≥ pooledW

[Quantitative tool support]

Yes

30

ROIPooling

Maps ROI proposals to a feature map.

[Input]

At least two inputs

[Parameter]

  • pooled_h: (mandatory) uint32, default to 0
  • pooled_w: (mandatory) uint32, default to 0
  • spatial_scale: (mandatory) float, default to 1. The multiplication spatial scale factor is used to convert ROI coordinates from the input scale to the pool scale.

[Constraint]

Mainly used for Faster R-CNN

  • Input dimensions: H x W ≤ 8,160, H ≤ 120, W ≤ 120
  • Output dimensions: pooledH ≤ 20, pooledW ≤ 20
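
The dimension limits above amount to a single boolean check. This is a sketch only, with a hypothetical function name.

```python
def roi_pooling_ok(in_h, in_w, pooled_h, pooled_w):
    """Check the ROIPooling input and output dimension limits."""
    return (in_h * in_w <= 8160 and in_h <= 120 and in_w <= 120
            and pooled_h <= 20 and pooled_w <= 20)
```

Note that a 100 x 100 feature map is rejected even though H and W are each below 120, because H x W exceeds 8,160.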

[Quantitative tool support]

Yes

31

Scale

out = alpha x Input + beta

[Input]

Two inputs, each with four dimensions

[Parameter]

  • axis: (optional) int32, default to 1. Only 1 or –3 is supported.
  • num_axes: (optional) int32, default to 1
  • filler: (optional) ignored unless only one bottom is given and scale is a learned parameter
  • bias_term: (optional) bool, default to false, indicating whether to learn a bias (equivalent to ScaleLayer + BiasLayer, but may be more efficient). Initialized with bias_filler.
  • bias_filler: (optional) default to 0

[Constraint]

shape of scale and bias: (n, c, 1, 1), with the C dimension equal to that of the input

[Quantitative tool support]

Yes

32

ShuffleChannel

Shuffles information across the feature channels.

[Input]

One input

[Parameter]

group: (optional) uint32, default to 1

[Constraint]

None

[Quantitative tool support]

Yes

33

Sigmoid

Activation function

[Input]

One input

[Parameter]

engine: (optional) enum, default to 0, CAFFE = 1, CUDNN = 2

[Constraint]

None

[Quantitative tool support]

Yes

34

Slice

Slices an input into multiple outputs.

[Input]

One input

[Parameter]

  • slice_dim: (optional) uint32, default to 1, exclusive with axis
  • slice_point: array, uint32
  • axis: (optional) int32, default to 1, indicating slicing along the channel dimension

[Constraint]

None

[Output]

None

[Quantitative tool support]

Yes

35

Softmax

Normalized exponential function

[Input]

One input

[Parameter]

  • engine: (optional) default to 0, CAFFE = 1, CUDNN = 2
  • axis: (optional) int32, default to 1, indicating the axis along which softmax is performed

[Constraint]

Softmax can be performed on each of the four input dimensions.

According to axis:

  • When axis = 1: C ≤ ((256 x 1024/4) – 8 x 1024 – 256)/2
  • When axis = 0: n ≤ (56 x 1024 – 256)/2
  • When axis = 2: W = 1, 0 < h < (1024 x 1024/32)
  • When axis = 3: 0 < W < (1024 x 1024/32)

If the input contains fewer than four dimensions, softmax is performed only on the last dimension, with the last dimension ≤ 46,080.
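
The per-axis limits above can be evaluated numerically. In the sketch below, the function name softmax_supported is hypothetical, shapes are assumed to be in NCHW order, and "/" in the limits is taken as integer division (all of the divisions involved are exact).

```python
def softmax_supported(shape, axis=1):
    """Check the Softmax size limits for an input shape (NCHW when 4D)."""
    if len(shape) < 4:
        # Softmax runs only on the last dimension in this case.
        return shape[-1] <= 46080
    if len(shape) != 4:
        raise ValueError("only inputs with up to four dimensions are covered")
    n, c, h, w = shape
    if axis == 1:
        return c <= ((256 * 1024 // 4) - 8 * 1024 - 256) // 2  # 28544
    if axis == 0:
        return n <= (56 * 1024 - 256) // 2  # 28544
    if axis == 2:
        return w == 1 and 0 < h < 1024 * 1024 // 32  # h < 32768
    if axis == 3:
        return 0 < w < 1024 * 1024 // 32  # w < 32768
    raise ValueError("axis must be 0, 1, 2, or 3 in this sketch")
```

Both the axis = 1 and axis = 0 limits evaluate to 28,544, so a typical 1,000-class classifier output of shape (1, 1000, 1, 1) is well within range.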

[Quantitative tool support]

Yes

36

Tanh

Activation function

[Input]

One input

[Parameter]

engine: (optional) enum, default to 0, CAFFE = 1, CUDNN = 2

[Constraint]

The number of tensor elements cannot exceed INT32_MAX.

[Quantitative tool support]

Yes

37

Upsample

Backward propagation of max pooling

[Input]

Two inputs

[Parameter]

scale: (optional) int32, default to 1

[Constraint]

None

[Quantitative tool support]

Yes

38

SSDDetectionOutput

SSD network detection output

[Input]

Three inputs

[Parameter]

  • num_classes: (mandatory) int32, indicating the number of classes to be predicted
  • share_location: (optional) bool, default to true, indicating that classes share one bounding box
  • background_label_id: (optional) int32, default to 0
  • nms_param: (optional) indicating non-maximum suppression (NMS)
  • save_output_param: (optional) indicating whether to save the detection result
  • code_type: (optional) default to CENTER_SIZE
  • variance_encoded_in_target: (optional) bool, default to true. The value true indicates that the variance is encoded in the target. Otherwise, the prediction offset needs to be adjusted accordingly.
  • keep_top_k: (optional) int32, indicating the total number of BBoxes to be reserved for each image after NMS
  • confidence_threshold: (optional) float, indicating that only the detection whose confidence is above the threshold is considered. If this parameter is not set, all boxes are considered.
  • nms_threshold: (optional) float
  • top_k: (optional) int32
  • boxes: (optional) int32, default to 1
  • relative: (optional) bool, default to true
  • objectness_threshold: (optional) float, default to 0.5
  • class_threshold: (optional) float, default to 0.5
  • biases: array
  • general_nms_param: optional

[Constraint]

  • Used for the SSD network only
  • Value range of preTopK and postTopK: 1–1024
  • shareLocation = true
  • nmsEta = 1
  • Value range of numClasses: 1–2048
  • code_type = CENTER_SIZE
  • Value range of nms_threshold and confidence_threshold: 0.0–1.0

[Quantitative tool support]

Yes

39

Reorg

Real-time object detection

[Input]

One input

[Parameter]

  • stride: (optional) uint32, default to 2
  • reverse: (optional) bool, default to false

[Constraint]

Used only for YOLOv2

[Quantitative tool support]

No

40

Reverse

Reverses the input along a given axis.

[Input]

One input

[Parameter]

axis: (optional) int32, default to 1. Controls the axis to be reversed. The content layout will not be reversed.

[Constraint]

None

[Quantitative tool support]

No

41

LeakyRelu

LeakyRelu activation function

[Input]

One input

[Parameter]

Same as ReLU

[Constraint]

None

[Quantitative tool support]

Yes

42

YOLODetectionOutput

YOLO network detection output

[Input]

Four inputs

[Parameter]

  • num_classes: (mandatory) int32, indicating the number of classes to be predicted
  • share_location: (optional) bool, default to true, indicating that classes share one bounding box
  • background_label_id: (optional) int32, default to 0
  • nms_param: (optional) indicating non-maximum suppression (NMS)
  • save_output_param: (optional) indicating whether to save the detection result
  • code_type: (optional) default to CENTER_SIZE
  • variance_encoded_in_target: (optional) bool, default to true. The value true indicates that the variance is encoded in the target. Otherwise, the prediction offset needs to be adjusted accordingly.
  • keep_top_k: (optional) int32, indicating the total number of BBoxes to be reserved for each image after NMS
  • confidence_threshold: (optional) float, indicating that only the detection whose confidence is above the threshold is considered. If this parameter is not set, all boxes are considered.
  • nms_threshold: (optional) float
  • top_k: (optional) int32
  • boxes: (optional) int32, default to 1
  • relative: (optional) bool, default to true
  • objectness_threshold: (optional) float, default to 0.5
  • class_threshold: (optional) float, default to 0.5
  • biases: array
  • general_nms_param: optional

[Constraint]

  • Used only for YOLOv2
  • classNum < 10,240; anchorBox ≤ 8
  • W ≤ 1,536
  • The upper layer of yolodetectionoutput must be the yoloregion operator.

[Quantitative tool support]

No
