Caffe Operator Boundaries

Updated on 2024-09-30 GMT+08:00

For the Caffe framework, if an operator's input does not have four dimensions and the operator has an axis parameter, axis cannot be set to a negative number.
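One way to respect the axis rule above is to rewrite any negative axis as its non-negative equivalent before the model is converted. The following is a minimal sketch, assuming the usual Caffe convention that a negative axis counts from the last dimension; the helper name `normalize_axis` is hypothetical and is not part of any conversion tool.

```python
def normalize_axis(axis, rank):
    """Map an axis to its non-negative form for an input with `rank` dimensions.

    Assumption: a negative axis counts from the last dimension, so -1 is the
    last axis. `normalize_axis` is an illustrative helper, not a tool API.
    """
    if not -rank <= axis < rank:
        raise ValueError(f"axis {axis} is out of range for a {rank}-D input")
    # Non-negative values pass through; negative values are shifted by rank.
    return axis + rank if axis < 0 else axis


if __name__ == "__main__":
    # A 3-D input with axis = -1 refers to dimension 2.
    print(normalize_axis(-1, 3))
```

For a 3-D input, `normalize_axis(-1, 3)` returns 2, which can then be written into the prototxt in place of the negative value.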

Table 1 shows the boundaries of Caffe operators supported by .om models.

**Table 1** Caffe operator boundaries
No.	Operator	Definition	Boundary
1	Absval	Computes the absolute value of the input.	[Input] One input [Parameter] engine: (optional) enum, default to 0, CAFFE = 1, CUDNN = 2 [Constraint] None [Quantitative tool support] Yes
2	Argmax	Returns the index of the maximum input value.	[Input] One input [Parameter] out_max_val: (optional) bool, default to false top_k: (optional) uint32, default to 1 axis: (optional) int32 [Constraint] None [Quantitative tool support] No
3	BatchNorm	Normalizes the input: (x – avg(x))/sqrt(var(x) + eps)	[Input] One input [Parameter] use_global_stats: bool, must be true moving_average_fraction: (optional) float, default to 0.999 eps: (optional) float, default to 1e-5 [Constraint] Only the C dimension can be normalized. [Quantitative tool support] Yes
4	Concat	Concatenates the input along the given dimension.	[Input] Multiple inputs [Parameter] concat_dim: (optional) uint32, default to 1, greater than 0 axis: (optional) int32, default to 1, exclusive with concat_dim. When axis is –1, four input dimensions are required. Otherwise, the result may be incorrect. [Constraint] For the input tensor, the sizes of its dimensions must be the same except the dimension for concatenation. The range of the input tensor count is [1, 1,000]. [Quantitative tool support] Yes
5	DepthwiseConvolution	Depthwise convolution	[Input] One 4D input, with a constant filter [Parameter] num_output: (optional) uint32 bias_term: (optional) bool, default to true pad: uint32, default to 0, array kernel_size: uint32, array stride: uint32, default to 1, array dilation: uint32, default to 1, array pad_h: (optional) uint32, default to 0 (2D only) pad_w: (optional) uint32, default to 0 (2D only) kernel_h: (optional) uint32 (2D only) kernel_w: (optional) uint32 (2D only) stride_h: (optional) uint32 (2D only) stride_w: (optional) uint32 (2D only) group: (optional) uint32, default to 1 weight_filler: (optional) FillerParameter bias_filler: (optional) FillerParameter engine: (optional) enum, default to 0, CAFFE = 1, CUDNN = 2 force_nd_im2col: (optional) bool, default to false axis: (optional) int32, default to 1 [Constraint] filterN=inputC=group [Quantitative tool support] Yes
6	Convolution	Convolution	[Input] One 4D input, with a constant filter [Parameter] num_output: (optional) uint32 bias_term: (optional) bool, default to true pad: uint32, default to 0, array kernel_size: uint32, array stride: uint32, default to 1, array dilation: uint32, default to 1, array pad_h: (optional) uint32, default to 0 (2D only) pad_w: (optional) uint32, default to 0 (2D only) kernel_h: (optional) uint32 (2D only) kernel_w: (optional) uint32 (2D only) stride_h: (optional) uint32 (2D only) stride_w: (optional) uint32 (2D only) group: (optional) uint32, default to 1 weight_filler: (optional) FillerParameter bias_filler: (optional) FillerParameter engine: (optional) enum, default to 0, CAFFE = 1, CUDNN = 2 force_nd_im2col: (optional) bool, default to false axis: (optional) int32, default to 1 [Constraint] (inputW + padWHead + padWTail) ≥ (((FilterW-1) x dilationW) + 1) (inputW + padWHead + padWTail)/StrideW + 1 ≤ 2147483647 (inputH + padHHead + padHTail) ≥ (((FilterH-1) x dilationH) + 1) (inputH + padHHead + padHTail)/StrideH + 1 ≤ 2147483647 0 ≤ Pad < 256, 0 < FilterSize < 256, 0 < Stride < 64, 1 ≤ dilationsize < 256 [Quantitative tool support] Yes
7	Crop	Crops the input.	[Input] Two inputs [Parameter] axis: (optional) int32, default to 2. When axis is –1, four input dimensions are required. offset: uint32, array [Constraint] None [Quantitative tool support] No
8	Deconvolution	Deconvolution	[Input] One 4D input, with a constant filter [Parameter] num_output: (optional) uint32 bias_term: (optional) bool, default to true pad: uint32, default to 0, array kernel_size: uint32, array stride: uint32, default to 1, array dilation: uint32, default to 1, array pad_h: (optional) uint32, default to 0 (2D only) pad_w: (optional) uint32, default to 0 (2D only) kernel_h: (optional) uint32 (2D only) kernel_w: (optional) uint32 (2D only) stride_h: (optional) uint32 (2D only) stride_w: (optional) uint32 (2D only) group: (optional) uint32, default to 1 weight_filler: (optional) FillerParameter bias_filler: (optional) FillerParameter engine: (optional) enum, default to 0, CAFFE = 1, CUDNN = 2 force_nd_im2col: (optional) bool, default to false axis: (optional) int32, default to 1 [Constraint] group = 1 dilation = 1 filterH - padHHead - 1 ≥ 0 filterW - padWHead - 1 ≥ 0 Restrictions involving intermediate variables: a = ALIGN(filter_num, 16) x ALIGN(filter_c, 16) x filter_h x filter_w x 2 If ALIGN(filter_c, 16)%32 = 0, a = a/2 conv_input_width = (deconvolution input W – 1) x strideW + 1 b = (conv_input_width) x filter_h x ALIGN(filter_num, 16) x 2 x 2 a + b ≤ 1024 x 1024 [Quantitative tool support] Yes
9	DetectionOutput	Generates detection results and outputs FSR.	[Input] Three inputs [Parameter] num_classes: (mandatory) int32, indicating the number of classes to be predicted share_location: (optional) bool, default to true, indicating that classes share one bounding box background_label_id: (optional) int32, default to 0 nms_param: (optional) indicating non-maximum suppression (NMS) save_output_param: (optional) indicating whether to save the detection result code_type: (optional) default to CENTER_SIZE variance_encoded_in_target: (optional) bool, default to true. The value true indicates that the variance is encoded in the target, otherwise the prediction offset needs to be adjusted accordingly. keep_top_k: (optional) int32, indicating the total number of BBoxes to be reserved for each image after NMS confidence_threshold: (optional) float, indicating that only the detection whose confidence is above the threshold is considered. If this parameter is not set, all boxes are considered. nms_threshold: (optional) float top_k: (optional) int32 boxes: (optional) int32, default to 1 relative: (optional) bool, default to true objectness_threshold: (optional) float, default to 0.5 class_threshold: (optional) float, default to 0.5 biases: array general_nms_param: optional [Constraint] Used for Faster R-CNN Non-maximum suppression (NMS) ratio nmsThreshold is within (0, 1) Probability threshold postConfThreshold is within (0, 1) Classes ≥ 2 Input box count ≤ 1,024 Output W dimension = 16 [Quantitative tool support] Yes
10	Eltwise	Computes element-wise operations (PROD, MAX, and SUM).	[Input] At least two inputs [Parameter] operation: (optional) enum, PROD = 0, SUM = 1, MAX = 2; default to SUM coeff: float array stable_prod_grad: (optional) bool, default to true [Constraint] Up to four inputs Compared with the native operator, this operator does not support the stable_prod_grad parameter. PROD, MAX, and SUM operations are supported. [Quantitative tool support] Yes
11	Elu	Activation function	[Input] One input [Parameter] alpha: (optional) float, default to 1 [Constraint] None [Quantitative tool support] No
12	Exp	Applies e as the base and x as the exponent.	[Input] One input [Parameter] base: (optional) float, default to –1.0 scale: (optional) float, default to 1.0 shift: (optional) float, default to 0.0 [Constraint] None [Quantitative tool support] No
13	Flatten	Converts an input of shape N * C * H * W to a vector output of shape N * (C * H * W).	[Input] One input (top_size ≠ bottom_size ≠ 1. When axis is –1, four input dimensions are required.) [Parameter] axis: (optional) int32, default to 1 end_axis: (optional) int32, default to -1 [Constraint] axis < end_axis [Quantitative tool support] Yes
14	FullConnection	Computes an inner product.	[Input] One input [Parameter] num_output: (optional) uint32 bias_term: (optional) bool, default to true weight_filler: (optional) FillerParameter, 2D bias_filler: (optional) FillerParameter, 1D axis: (optional) int32, default to 1 transpose: (optional) bool, default to false [Constraint] transpose = false, axis = 1 Bias_C ≤ 56832 To quantize the model, the following dimension restrictions must be satisfied: − When N = 1, 2 x CEIL(C, 16) x 16 x xH x xW ≤ 1024 x 1024 − When N > 1, 2 x 16 x CEIL(C, 16) x 16 x xH x xW ≤ 1024 x 1024 [Quantitative tool support] Yes
15	Interp	Interpolation layer	[Input] One input [Parameter] height: (optional) int32, default to 0 width: (optional) int32, default to 0 zoom_factor: (optional) int32, default to 1 shrink_factor: (optional) int32, default to 1 pad_beg: (optional) int32, default to 0 pad_end: (optional) int32, default to 0 NOTE: zoom_factor and shrink_factor are exclusive; height and zoom_factor are exclusive; height and shrink_factor are exclusive. [Constraint] (outputH x outputW)/(inputH x inputW) > 1/30 [Quantitative tool support] No
16	Log	Performs logarithmic operation on the input.	[Input] One input [Parameter] base: (optional) float, default to –1.0 scale: (optional) float, default to 1.0 shift: (optional) float, default to 0.0 [Constraint] None [Quantitative tool support] No
17	LRN	Normalizes the input in a local region.	[Input] One non-constant input [Parameter] local_size: (optional) uint32, default to 5 alpha: (optional) float, default to 1 beta: (optional) float, default to 0.75 norm_region: (optional) enum, default to ACROSS_CHANNELS (ACROSS_CHANNELS = 0, WITHIN_CHANNEL = 1) lrnk: (optional) float, default to 1 engine: (optional) enum, default to 0, CAFFE = 1, CUDNN = 2 [Constraint] local_size is an odd number greater than 0. Inter-channel: If local_size is within [1, 15]: lrnK > 0.00001 and beta > 0.01; Otherwise, lrnK and beta are any values. lrnK and alpha are not 0 at the same time. When the C dimension is greater than 1,776, local_size < 1728. Intra-channel: lrnK = 1, local_size is within [1, 15], beta > 0.01. [Quantitative tool support] Yes
18	LSTM	Long short-term memory (LSTM) network	[Input] Two or three inputs X: time sequence data (T x B x Xt), which is in the NCHW 4D format, where N corresponds to the time sequence length T, C corresponds to the batch size B, H corresponds to the input data Xt at time point t, and W is fixed at 1. Cont: sequence continuity flag (T x B) Xs: (optional) static data (B x Xt) [Parameter] num_output: (optional) uint32, default to 0 weight_filler: (optional) FillerParameter bias_filler: (optional) FillerParameter debug_info: (optional) bool, default to false expose_hidden: (optional) bool, default to false [Constraint] Restrictions involving intermediate variables: a = (ALIGN(xt, 16) + ALIGN(output, 16)) x 16 x 2 x 2 b = (ALIGN(xt, 16) + ALIGN(output, 16)) x 16 x 4 x 2 x 2 c = use_projection ? (ALIGN(ht, 16) x ALIGN(output, 16) x 2) : 0 d = 16 x ALIGN(ht, 16) x 2 e = batchNum x 4 The constraints are as follows: a + b + c ≤ 1024 x 1024 d ≤ 256 x 1024/8 e ≤ 256 x 1024/32 [Quantitative tool support] No
19	Normalize	Normalization layer	[Input] One input [Parameter] across_spatial: (optional) bool, default to true scale_filler: (optional) default to 1.0 channel_shared: (optional) bool, default to true eps: (optional) float, default to 1e-10 [Constraint] 1e-7 < eps ≤ 0.1 + 1e-6 across_spatial can only be true for Caffe, indicating normalization by channel. [Quantitative tool support] Yes
20	Permute	Permutes the input dimensions according to a given mode.	[Input] One input [Parameter] order: uint32, array [Constraint] None [Quantitative tool support] Yes
21	Pooling	Pools the input.	[Input] One input [Parameter] pool: (optional) enum, indicating the pooling method, MAX = 0, AVE = 1, and STOCHASTIC = 2, default to MAX pad: (optional) uint32, default to 0 pad_h: (optional) uint32, default to 0 pad_w: (optional) uint32, default to 0 kernel_size: (optional) uint32, exclusive with kernel_h/kernel_w kernel_h: (optional) uint32 kernel_w: (optional) uint32, used in pair with kernel_h stride: (optional) uint32, default to 1 stride_h: (optional) uint32 stride_w: (optional) uint32 engine: (optional) enum, default to 0, CAFFE = 1, CUDNN = 2 global_pooling: (optional) bool, default to false ceil_mode: (optional) bool, default to true round_mode: (optional) enum, CEIL = 0, FLOOR = 1, default to CEIL [Constraint] kernelH ≤ inputH + padTop + padBottom kernelW ≤ inputW + padLeft + padRight padTop < windowH padBottom < windowH padLeft < windowW padRight < windowW In addition to common restrictions, the following restrictions must be satisfied. The global pool mode supports only the following ranges: outputH==1 && outputW==1 && kernelH>=inputH && kernelW>=inputW inputH*inputW ≤ 10,000 [Quantitative tool support] Yes
22	Power	Computes the output y as (scale * x + shift)^power.	[Input] One input [Parameter] power: (optional) float, default to 1.0 scale: (optional) float, default to 1.0 shift: (optional) float, default to 0.0 [Constraint] power != 1 scale * x + shift > 0 [Quantitative tool support] Yes
23	Prelu	Activation function	[Input] One input [Parameter] filler: optional channel_shared: (optional) bool, indicating whether to share slope parameters across channels, default to false [Constraint] None [Quantitative tool support] Yes
24	PriorBox	Obtains the real location of the target from the box proposals.	[Input] One input [Parameter] min_size: (mandatory) indicating the minimum frame size (in pixels) max_size: (mandatory) indicating the maximum frame size (in pixels) aspect_ratio: array, float. A repeated ratio is ignored. If no aspect ratio is provided, the default ratio 1 is used. flip: (optional) bool, default to true. The value true indicates that each aspect ratio is reversed. For example, for aspect ratio r, the aspect ratio 1.0/r is generated. clip: (optional) bool, default to false. The value true indicates that the previous value is clipped to the range [0, 1]. variance: array, used to adjust the variance of the BBoxes img_size: (optional) uint32, exclusive with img_h or img_w img_h: (optional) uint32 img_w: (optional) uint32 step: (optional) float, exclusive with step_h or step_w step_h: (optional) float step_w: (optional) float offset: float, default to 0.5 [Constraint] Used for the SSD network only Output dimensions: [n, 2, detected boxes x 4, 1] [Quantitative tool support] Yes
25	Proposal	Sorts the box proposals by (proposal, score) and obtains the top N proposals by using the NMS.	[Input] Three inputs (scores, bbox_pred, im_info) [Parameter] feat_stride: (optional) float base_size: (optional) float min_size: (optional) float ratio: float array scale: float array pre_nms_topn: (optional) int32 post_nms_topn: (optional) int32 nms_thresh: (optional) float [Constraint] Used only for Faster R-CNN ProposalParameter and PythonParameter are exclusive. Value range of preTopK: 1-6,144 Value range of postTopK: 1-1,024 scaleCnt x ratioCnt ≤ 64 nmsThresh: threshold for Intersection-over-Union (IoU) box filtering, 0 < nmsThresh ≤ 1 minSize: minimum edge length of a box. A box with an edge shorter than this value is filtered out. featStride: H/W stride between the two adjacent boxes used in default box generation baseSize: default box size used in default box generation ratio and scale: used in default box generation imgH and imgW: height and width of the image input to the network. The values must be greater than 0. Restrictions on the input dimensions: clsProb: C = 2 x scaleCnt x ratioCnt bboxPred: C = 4 x scaleCnt x ratioCnt bboxPrior: N = clsProb.N, C = 4 x scaleCnt x ratioCnt imInfo: N = clsProb.N, C = 3 [Quantitative tool support] Yes
26	PSROIPooling	Position-sensitive region-of-interest pooling (PSROIPooling)	[Input] Two inputs [Parameter] spatial_scale: (mandatory) float output_dim: (mandatory) int32, indicating the number of output channels group_size: (mandatory) int32, indicating the number of groups to encode position-sensitive score maps [Constraint] Used for the Region-based Fully Convolutional Network (R-FCN) ROI coordinates [roiN, roiC, roiH, roiW]: 1 ≤ roiN ≤ 65535, roiC == 5, roiH == 1, roiW == 1 Dimensions of the input feature map: [xN, xC, xH, xW] pooledH == pooledW == groupSize ≤ 128 pooledH and pooledW indicate the length and width of the pooled ROI. Output format: y [yN, yC, yH, yW] poolingMode == avg pooling, pooledH == pooledW == groupSize, pooledH ≤ 128, spatialScale > 0, groupSize > 0, outputDim > 0 1 ≤ xN ≤ 65535, roisN % xN == 0 HW_LIMIT defines the limits of xH and xW. xHW = xH * xW pooledHW = pooledH * pooledW HW_LIMIT = (64 x 1024 – 8 x 1024)/32 xH ≥ pooledH, xW ≥ pooledW xHW ≥ pooledHW xHW/pooledHW ≤ HW_LIMIT In multi-batch scenarios, the ROIs are allocated equally to the batches. In addition, the batch sequence of the ROIs is the same as the feature. [Quantitative tool support] Yes
27	Relu	Activation function, including common ReLU and Leaky ReLU, which can be specified by parameters	[Input] One input [Parameter] negative_slope: (optional) float, default to 0 engine: (optional) enum, default to 0, CAFFE = 1, CUDNN = 2 [Constraint] None [Quantitative tool support] Yes
28	Reshape	Reshapes the input.	[Input] One input [Parameter] shape: constant, int64 or int axis: (optional) int32, default to 0 num_axes: (optional) int32, default to -1 [Constraint] None [Quantitative tool support] Yes
29	ROIAlign	Aggregates features using ROIs.	[Input] At least two inputs [Parameter] pooled_h: (optional) uint32, default to 0 pooled_w: (optional) uint32, default to 0 spatial_scale: (optional) float, default to 1 sampling_ratio: (optional) int32, default to -1 [Constraint] Mainly used for Mask R-CNN Restrictions on the feature map: 1) H x W ≤ 5,248 (N > 1) or W x C < 40,960 (N = 1) 2) C ≤ 1280 3) ((C - 1)/128 + 1) x pooledW ≤ 216 Restrictions on the ROI: 1) C = 5 (caffe), H = 1, W = 1 2) samplingRatio * pooledW ≤ 128, samplingRatio * pooledH ≤ 128 3) H ≥ pooledH, W ≥ pooledW [Quantitative tool support] Yes
30	ROIPooling	Maps ROI proposals to a feature map.	[Input] At least two inputs [Parameter] pooled_h: (mandatory) uint32, default to 0 pooled_w: (mandatory) uint32, default to 0 spatial_scale: (mandatory) float, default to 1. The multiplication spatial scale factor is used to convert ROI coordinates from the input scale to the pool scale. [Constraint] Mainly used for Faster R-CNN Input dimensions: H x W ≤ 8,160, H ≤ 120, W ≤ 120 Output dimensions: pooledH ≤ 20, pooledW ≤ 20 [Quantitative tool support] Yes
31	Scale	out = alpha x Input + beta	[Input] Two inputs, each with four dimensions [Parameter] axis: (optional) int32, default to 1. Only 1 or –3 is supported. num_axes: (optional) int32, default to 1 filler: (optional) ignored unless only one bottom is given and scale is a learned parameter bias_term: (optional) bool, default to false, indicating whether to learn a bias (equivalent to ScaleLayer + BiasLayer, but may be more efficient). Initialized with bias_filler. bias_filler: (optional) default to 0 [Constraint] shape of scale and bias: (n, c, 1, 1), with the C dimension equal to that of the input [Quantitative tool support] Yes
32	ShuffleChannel	Shuffles information across the feature channels.	[Input] One input [Parameter] group: (optional) uint32, default to 1 [Constraint] None [Quantitative tool support] Yes
33	Sigmoid	Activation function	[Input] One input [Parameter] engine: (optional) enum, default to 0, CAFFE = 1, CUDNN = 2 [Constraint] None [Quantitative tool support] Yes
34	Slice	Slices an input into multiple outputs.	[Input] One input [Parameter] slice_dim: (optional) uint32, default to 1, exclusive with axis slice_point: array, uint32 axis: (optional) int32, default to 1, indicating slicing along the channel dimension [Constraint] None [Output] None [Quantitative tool support] Yes
35	Softmax	Normalized exponential function	[Input] One input [Parameter] engine: (optional) default to 0, CAFFE = 1, CUDNN = 2 axis: (optional) int32, default to 1, indicating the axis along which softmax is performed [Constraint] Softmax can be performed on each of the four input dimensions. According to axis: When axis = 1: C ≤ ((256 x 1024/4) – 8 x 1024 – 256)/2 When axis = 0: N ≤ (56 x 1024 – 256)/2 When axis = 2: W = 1, 0 < H < (1024 x 1024/32) When axis = 3: 0 < W < (1024 x 1024/32) If the input contains fewer than four dimensions, softmax is performed only on the last dimension, with the last dimension ≤ 46,080. [Quantitative tool support] Yes
36	Tanh	Activation function	[Input] One input [Parameter] engine: (optional) enum, default to 0, CAFFE = 1, CUDNN = 2 [Constraint] The number of tensor elements cannot exceed INT32_MAX. [Quantitative tool support] Yes
37	Upsample	Backward propagation of max pooling	[Input] Two inputs [Parameter] scale: (optional) int32, default to 1 [Constraint] None [Quantitative tool support] Yes
38	SSDDetectionOutput	SSD network detection output	[Input] Three inputs [Parameter] num_classes: (mandatory) int32, indicating the number of classes to be predicted share_location: (optional) bool, default to true, indicating that classes share one bounding box background_label_id: (optional) int32, default to 0 nms_param: (optional) indicating non-maximum suppression (NMS) save_output_param: (optional) indicating whether to save the detection result code_type: (optional) default to CENTER_SIZE variance_encoded_in_target: (optional) bool, default to true. The value true indicates that the variance is encoded in the target, otherwise the prediction offset needs to be adjusted accordingly. keep_top_k: (optional) int32, indicating the total number of BBoxes to be reserved for each image after NMS confidence_threshold: (optional) float, indicating that only the detection whose confidence is above the threshold is considered. If this parameter is not set, all boxes are considered. nms_threshold: (optional) float top_k: (optional) int32 boxes: (optional) int32, default to 1 relative: (optional) bool, default to true objectness_threshold: (optional) float, default to 0.5 class_threshold: (optional) float, default to 0.5 biases: array general_nms_param: optional [Constraint] Used for the SSD network only Value range of preTopK and postTopK: 1–1024 shareLocation = true nmsEta = 1 Value range of numClasses: 1–2048 code_type = CENTER_SIZE Value range of nms_threshold and confidence_threshold: 0.0–1.0 [Quantitative tool support] Yes
39	Reorg	Real-time object detection	[Input] One input [Parameter] stride: (optional) uint32, default to 2 reverse: (optional) bool, default to false [Constraint] Used only for YOLOv2 [Quantitative tool support] No
40	Reverse	Reversion	[Input] One input [Parameter] axis: (optional) int32, default to 1. Controls the axis to be reversed. The content layout will not be reversed. [Constraint] None [Quantitative tool support] No
41	LeakyRelu	LeakyRelu activation function	[Input] One input [Parameter] Same as ReLU [Constraint] None [Quantitative tool support] Yes
42	YOLODetectionOutput	YOLO network detection output	[Input] Four inputs [Parameter] num_classes: (mandatory) int32, indicating the number of classes to be predicted share_location: (optional) bool, default to true, indicating that classes share one bounding box background_label_id: (optional) int32, default to 0 nms_param: (optional) indicating non-maximum suppression (NMS) save_output_param: (optional) indicating whether to save the detection result code_type: (optional) default to CENTER_SIZE variance_encoded_in_target: (optional) bool, default to true. The value true indicates that the variance is encoded in the target, otherwise the prediction offset needs to be adjusted accordingly. keep_top_k: (optional) int32, indicating the total number of BBoxes to be reserved for each image after NMS confidence_threshold: (optional) float, indicating that only the detection whose confidence is above the threshold is considered. If this parameter is not set, all boxes are considered. nms_threshold: (optional) float top_k: (optional) int32 boxes: (optional) int32, default to 1 relative: (optional) bool, default to true objectness_threshold: (optional) float, default to 0.5 class_threshold: (optional) float, default to 0.5 biases: array general_nms_param: optional [Constraint] Used only for YOLOv2 classNum < 10,240; anchorBox ≤ 8 W ≤ 1,536 The upper layer of yolodetectionoutput must be the yoloregion operator. [Quantitative tool support] No
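
The arithmetic constraints in the Convolution and Deconvolution rows of Table 1 can be checked before a model is converted. The following is a minimal sketch, not part of any conversion tool: the function names are hypothetical, and ALIGN(x, 16) is assumed to mean rounding x up to the next multiple of 16, which the table does not define.

```python
INT32_MAX = 2147483647


def align(value, multiple):
    """Round value up to a multiple (assumed meaning of ALIGN in Table 1)."""
    return -(-value // multiple) * multiple


def conv_violations(in_h, in_w, filt_h, filt_w,
                    pad=(0, 0, 0, 0), stride=(1, 1), dilation=(1, 1)):
    """List the Convolution constraints of Table 1 that a layer violates.

    pad is (head_h, tail_h, head_w, tail_w); stride and dilation are (h, w).
    """
    problems = []
    if not all(0 <= p < 256 for p in pad):
        problems.append("pad must satisfy 0 <= Pad < 256")
    if not all(0 < f < 256 for f in (filt_h, filt_w)):
        problems.append("filter size must satisfy 0 < FilterSize < 256")
    if not all(0 < s < 64 for s in stride):
        problems.append("stride must satisfy 0 < Stride < 64")
    if not all(1 <= d < 256 for d in dilation):
        problems.append("dilation must satisfy 1 <= dilation < 256")

    padded = (in_h + pad[0] + pad[1], in_w + pad[2] + pad[3])
    for name, size, filt, dil, strd in zip(
            "HW", padded, (filt_h, filt_w), dilation, stride):
        # The padded input must cover the dilated filter extent.
        if size < (filt - 1) * dil + 1:
            problems.append(f"padded input{name} is smaller than the dilated filter")
        # Skip the division when the stride is already reported as invalid.
        if strd > 0 and size / strd + 1 > INT32_MAX:
            problems.append(f"padded input{name}/Stride{name} + 1 exceeds INT32_MAX")
    return problems


def deconv_buffer_ok(filter_num, filter_c, filter_h, filter_w, in_w, stride_w):
    """Evaluate the Deconvolution restriction a + b <= 1024 x 1024."""
    a = align(filter_num, 16) * align(filter_c, 16) * filter_h * filter_w * 2
    if align(filter_c, 16) % 32 == 0:
        a //= 2
    conv_input_width = (in_w - 1) * stride_w + 1
    b = conv_input_width * filter_h * align(filter_num, 16) * 2 * 2
    return a + b <= 1024 * 1024


if __name__ == "__main__":
    print(conv_violations(224, 224, 3, 3, pad=(1, 1, 1, 1)))
    print(deconv_buffer_ok(16, 16, 3, 3, in_w=32, stride_w=2))
```

Returning a list of violated constraints, rather than a single boolean, lets a caller report every problem in a layer at once. The group = 1, dilation = 1, and filter/pad restrictions on Deconvolution are not covered by this sketch.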