Figure 1. Abstract representation of triplet attention with three branches capturing cross-dimension interaction. Given an input tensor, triplet attention captures inter-dimensional dependencies by rotating the tensor and then applying residual transformations.
Figure 2. Comparisons with different attention modules: (a) Squeeze Excitation (SE) Module; (b) Convolutional Block Attention Module (CBAM); (c) Global Context (GC) Module; (d) triplet attention (ours). The feature maps are denoted by their dimensions, e.g. C × H × W denotes a feature map with channel number C, height H and width W. ⊗ represents matrix multiplication, ⊙ denotes broadcast element-wise multiplication and ⊕ denotes broadcast element-wise addition.
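For reference alongside panel (a), here is a minimal PyTorch sketch of a Squeeze-and-Excitation block; the reduction ratio `reduction` is an assumed hyperparameter (16 in the SE paper):

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Minimal SE block: global average pooling ("squeeze") followed by a
    two-layer bottleneck MLP ("excitation") that yields per-channel weights."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3)))   # squeeze to (N, C), then excite
        return x * w.view(n, c, 1, 1)     # broadcast element-wise multiplication
```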
1st Para: Background
2nd Para: Classic prior work
3rd Para: How our method differs
4th Para: Our method
5th Para: Advantages
6th Para: Contributions
Figure 3. Illustration of the proposed triplet attention, which has three branches. The top branch is responsible for computing attention weights across the channel dimension C and the spatial dimension W. Similarly, the middle branch is responsible for the channel dimension C and the spatial dimension H. The final branch at the bottom is used to capture spatial dependencies (H and W). In the first two branches, we adopt a rotation operation to build connections between the channel dimension and either one of the spatial dimensions. Finally, the weights are aggregated by simple averaging. More details can be found in Sec. 3.2.
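In practice, the 90° rotations of the first two branches can be realized as tensor permutations. A small shape demo (a sketch of the idea, not necessarily the authors' exact implementation):

```python
import torch

x = torch.randn(1, 32, 64, 48)      # (N, C, H, W)

# Branch 1: swap C and H so the attention map couples C with W.
x_hat1 = x.permute(0, 2, 1, 3)      # -> (N, H, C, W)
# Branch 2: swap C and W so the attention map couples C with H.
x_hat2 = x.permute(0, 3, 2, 1)      # -> (N, W, H, C)

print(x_hat1.shape)                 # torch.Size([1, 64, 32, 48])
print(x_hat2.shape)                 # torch.Size([1, 48, 64, 32])
# After attention is applied in the rotated view, each branch permutes the
# result back to (N, C, H, W) before the three outputs are averaged.
```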
1st Para: General overview
1st Para: CBAM
$$
\omega=\sigma\big(f_{(\mathbf{W}_0,\mathbf{W}_1)}(g(\chi))+f_{(\mathbf{W}_0,\mathbf{W}_1)}(\delta(\chi))\big) \tag{1}
$$

$$
g(\chi)=\frac{1}{W\times H}\sum_{i=1}^{H}\sum_{j=1}^{W}\chi_{i,j} \tag{2}
$$

$$
\delta(\chi)=\max_{H,W}(\chi) \tag{3}
$$

$$
\omega=\sigma\big(\mathbf{W}_1\,\mathrm{ReLU}(\mathbf{W}_0\,g(\chi))+\mathbf{W}_1\,\mathrm{ReLU}(\mathbf{W}_0\,\delta(\chi))\big) \tag{4}
$$
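A minimal PyTorch sketch of the channel attention described by Eqs. (1)-(4): the shared bottleneck MLP plays the role of f_(W0, W1), and `reduction` is an assumed hyperparameter:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """CBAM channel attention, Eqs. (1)-(4): average- and max-pooled channel
    descriptors share one bottleneck MLP, and their outputs are summed
    before the sigmoid."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(                                    # f_(W0, W1)
            nn.Linear(channels, channels // reduction, bias=False),  # W0
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),  # W1
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        g = x.mean(dim=(2, 3))                        # Eq. (2): global average pooling
        d = x.amax(dim=(2, 3))                        # Eq. (3): global max pooling
        w = torch.sigmoid(self.mlp(g) + self.mlp(d))  # Eq. (4)
        return x * w.view(x.size(0), -1, 1, 1)        # re-weight the channels
```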
1st Para:
2nd Para: Overview
3rd Para: Cross-Dimension Interaction
4th Para: Z-pool
$$
Z\text{-pool}(\chi)=\big[\mathrm{MaxPool}_{0d}(\chi),\ \mathrm{AvgPool}_{0d}(\chi)\big] \tag{5}
$$
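Eq. (5) reduces the pooled axis to two feature maps (a max slice and an average slice). A minimal sketch, assuming batched NCHW tensors so the 0th dimension of χ corresponds to dim 1:

```python
import torch

def z_pool(x: torch.Tensor) -> torch.Tensor:
    """Eq. (5): concatenate max- and average-pooling over the channel axis,
    reducing (N, C, H, W) to (N, 2, H, W)."""
    return torch.cat(
        (x.amax(dim=1, keepdim=True),    # MaxPool over the channel axis
         x.mean(dim=1, keepdim=True)),   # AvgPool over the channel axis
        dim=1,
    )

print(z_pool(torch.randn(1, 32, 64, 48)).shape)  # torch.Size([1, 2, 64, 48])
```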
5th Para: Triplet Attention
6th Para:
7th Para:
8th Para: Summary
$$
y=\frac{1}{3}\Big(\overline{\hat{\chi}_1\sigma(\psi_1(\hat{\chi}_1^*))}+\overline{\hat{\chi}_2\sigma(\psi_2(\hat{\chi}_2^*))}+\chi\,\sigma(\psi_3(\hat{\chi}_3))\Big) \tag{6}
$$

$$
y=\frac{1}{3}\big(\overline{\hat{\chi}_1\omega_1}+\overline{\hat{\chi}_2\omega_2}+\chi\,\omega_3\big)=\frac{1}{3}\big(\overline{y_1}+\overline{y_2}+y_3\big) \tag{7}
$$
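Putting Eqs. (5)-(7) together, a self-contained sketch of the three-branch module. The 7×7 kernel for ψ_i follows the paper; the exact layer details (bias-free conv followed by batch norm) are assumptions of this sketch:

```python
import torch
import torch.nn as nn

class BranchGate(nn.Module):
    """One branch: Z-pool -> k x k conv (psi_i) -> batch norm -> sigmoid,
    then the attention map re-weights the branch input."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)
        self.bn = nn.BatchNorm2d(1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        pooled = torch.cat((x.amax(1, keepdim=True), x.mean(1, keepdim=True)), dim=1)  # Z-pool
        return x * torch.sigmoid(self.bn(self.conv(pooled)))

class TripletAttention(nn.Module):
    """Eqs. (6)-(7): two rotated branches plus one plain spatial branch,
    aggregated by simple averaging."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.cw = BranchGate(kernel_size)   # branch 1: C interacts with W
        self.ch = BranchGate(kernel_size)   # branch 2: C interacts with H
        self.hw = BranchGate(kernel_size)   # branch 3: spatial attention over H, W

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Rotate (permute), attend, then rotate back to (N, C, H, W).
        y1 = self.cw(x.permute(0, 2, 1, 3).contiguous()).permute(0, 2, 1, 3)
        y2 = self.ch(x.permute(0, 3, 2, 1).contiguous()).permute(0, 3, 2, 1)
        y3 = self.hw(x)
        return (y1 + y2 + y3) / 3.0         # Eq. (7): simple averaging
```

Since each branch adds only one small conv and a batch norm, the module is nearly parameter-free and preserves the input shape, e.g. `TripletAttention()(torch.randn(2, 64, 56, 56))` returns a tensor of shape (2, 64, 56, 56).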
1st Para: Summary
2nd Para: Outlook