Advertisement

EffificientDet: Scalable and Effificient Object Detection

阅读量:

动机:

CC

CC

CC

CC

CC

CC

解题思路:

CC

CC

CC

CC

CC

BiFPN

CC

CC

CC

在这里插入图片描述

Figure 2 illustrates the design of feature networks.
(a) The FPN framework incorporates an upper pathway that integrates features across scales from levels 3 through 7 (P3-P7).
(b) PANet enhances this by adding a lower layer pathway.
(c) NAS-FPN leverages neural architecture search to identify an irregular feature topology and consistently applies identical blocks.
(d) Our BiFPN offers improved accuracy and efficiency trade-offs compared to existing methods.

CC

CC

CC

CC

CC

CC

CC

CC

CC

  • Weighted Feature Fusion

CC

CC

Unbounded fusion:

在这里插入图片描述

CC

Softmax-based fusion:

在这里插入图片描述

CC

CC

Fast normalized fusion:

在这里插入图片描述

CC

Our finalized BiFPN combines two-way cross-scale connections with fast normalized fusion. As a specific instance, we elaborate on the two integrated features at level six for BiFPN as depicted in Figure 2(d):

在这里插入图片描述
在这里插入图片描述

CC

CC

EffificientDet

CC

CC

在这里插入图片描述

Figure 3 illustrates the EfficientDet architecture, which integrates a backbone network based on EfficientNet [36], coupled with a feature pyramid network (BiFPN), alongside a shared class and box prediction network. The BiFPN layers and class/box prediction layers are meticulously designed to be repeatedly employed under diverse resource constraints, as detailed in Table 1.

CC

CC

CC

CC

CC

  1. Backbone network

CC

CC

CC

  1. BiFPN network

Mathematically, the number of channels and layers in a BiFPN are determined by the following equation. Additionally, the network's depth is represented by Dbifpn (#layers), while its width is denoted by Wbifpn (number of channels).

在这里插入图片描述

【CC】也就是个检验公式,没啥好说的,下面作者自己也写了,也是试验出来的

  1. Box/class prediction network

We set their width equal to that of BiFPN (i.e., Wpred = Wbifpn), while gradually increasing the depth (#layers) through an equation.

在这里插入图片描述
  1. Input image resolution

In the BiFPN architecture, the input resolution must be divisible by 27, which equals 128. Thus, we linearly increase the resolutions through the use of an equation.

在这里插入图片描述

CC

CC

CC

在这里插入图片描述

CC

CC

CC

在这里插入图片描述

CC

全部评论 (0)

还没有任何评论哟~