
Swin Transformer explained, with a code reproduction

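1. Swin Transformer network architecture

In practice, the structure we reproduce in code is the one shown in the figure below. The following sections implement it piece by piece, following that figure.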



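2. Patch Partition & Patch Embedding

The image is first fed into the Patch Partition module, which splits it into patches: every 4x4 block of adjacent pixels forms one patch, which is then flattened along the channel dimension. For an RGB input, each patch has 4x4=16 pixels and each pixel has three values (R, G, B), so a flattened patch holds 16x3=48 values. Patch Partition therefore changes the image shape from [H, W, 3] to [H/4, W/4, 48]. The Linear Embedding layer then applies a linear transform to the channel data at each position, mapping 48 to C, so the shape goes from [H/4, W/4, 48] to [H/4, W/4, C]. In the source code, Patch Partition and Linear Embedding are implemented together by a single convolution layer, with exactly the same structure as the Embedding layer in Vision Transformer.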
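The shape bookkeeping above can be checked with a small pure-Python sketch. It is only an illustration of Patch Partition on nested lists, not the convolution-based implementation used in the model; the helper name `patch_partition` and the toy 8x8 image are assumptions made for this example.

```python
def patch_partition(image, patch_size=4):
    """Split an H x W x C image (nested lists) into non-overlapping patches.

    Returns a (H/patch_size) x (W/patch_size) grid in which every cell is a
    flattened patch holding patch_size * patch_size * C values.
    """
    height, width = len(image), len(image[0])
    assert height % patch_size == 0 and width % patch_size == 0
    grid = []
    for top in range(0, height, patch_size):
        row = []
        for left in range(0, width, patch_size):
            flat = []
            for y in range(top, top + patch_size):
                for x in range(left, left + patch_size):
                    flat.extend(image[y][x])  # append the C channel values
            row.append(flat)
        grid.append(row)
    return grid


# Toy 8 x 8 RGB image whose pixel at (y, x) is [y, x, 0].
image = [[[y, x, 0] for x in range(8)] for y in range(8)]
patches = patch_partition(image, patch_size=4)

# [H, W, 3] = [8, 8, 3] becomes [H/4, W/4, 48] = [2, 2, 48].
print(len(patches), len(patches[0]), len(patches[0][0]))  # 2 2 48
```

A linear layer mapping 48 to C applied to every cell would then give the [H/4, W/4, C] tensor; a convolution with kernel size and stride both equal to 4 performs both steps at once.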

import paddle
import paddle.nn as nn

class PatchEmbedding(nn.Layer):
    def __init__(self, patch_size=4, embed_dim=96):
        super().__init__()
        # kernel_size == stride == patch_size: Patch Partition and Linear Embedding in one conv
        self.patch_embed = nn.Conv2D(3, embed_dim, kernel_size=patch_size, stride=patch_size)
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self, x):
        x = self.patch_embed(x)      # [B, embed_dim, h, w]
        x = x.flatten(2)             # [B, embed_dim, h*w]
        x = x.transpose([0, 2, 1])   # [B, h*w, embed_dim]
        x = self.norm(x)
        return x


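3. Patch Merging

As mentioned earlier, every stage except Stage 1 starts with a Patch Merging layer that downsamples the feature map. As shown in the figure below, suppose the input to Patch Merging is a 4x4 single-channel feature map. Patch Merging groups every 2x2 block of adjacent pixels into a patch, then gathers the pixels at the same position (same color) in every patch, which yields four feature maps. These four feature maps are concatenated along the depth dimension and passed through a LayerNorm layer. Finally, a fully connected layer applies a linear transform along the depth dimension, reducing the depth from 4C (after concatenation) to 2C, where C is the input depth. As this simple example shows, after a Patch Merging layer the height and width of the feature map are halved and the depth is doubled.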

class PatchMerging(nn.Layer):
    def __init__(self, resolution, dim):
        super().__init__()
        self.resolution = resolution
        self.dim = dim
        self.reduction = nn.Linear(4*dim, 2*dim)   # 4C -> 2C
        self.norm = nn.LayerNorm(4*dim)

    def forward(self, x):
        h, w = self.resolution
        b, _, c = x.shape
        x = x.reshape([b, h, w, c])
        # pick the four pixels of every 2x2 patch by their position in the patch
        x0 = x[:, 0::2, 0::2, :]   # top-left
        x1 = x[:, 0::2, 1::2, :]   # top-right
        x2 = x[:, 1::2, 0::2, :]   # bottom-left
        x3 = x[:, 1::2, 1::2, :]   # bottom-right
        x = paddle.concat([x0, x1, x2, x3], axis=-1)   # [b, h/2, w/2, 4c]
        x = x.reshape([b, -1, 4*c])                    # [b, h/2 * w/2, 4c]
        x = self.norm(x)
        x = self.reduction(x)                          # [b, h/2 * w/2, 2c]
        return x

PS: a demonstration of what slices such as x[:,0::2,0::2,:] do.
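A minimal pure-Python sketch of these strided slices on a 4x4 single-channel feature map. The helper `strided` is a hypothetical stand-in for the tensor expression `x[row_start::2, col_start::2]`, with the batch and channel axes left out to keep the example small.

```python
def strided(grid, row_start, col_start):
    """Pure-Python counterpart of grid[row_start::2, col_start::2]."""
    return [row[col_start::2] for row in grid[row_start::2]]


# 4 x 4 single-channel feature map with values 0..15, row by row.
feature = [[4 * r + c for c in range(4)] for r in range(4)]

x0 = strided(feature, 0, 0)  # top-left pixel of every 2 x 2 patch
x1 = strided(feature, 0, 1)  # top-right pixel
x2 = strided(feature, 1, 0)  # bottom-left pixel
x3 = strided(feature, 1, 1)  # bottom-right pixel

# Concatenate along depth: each output position holds the 4 pixels of one patch.
merged = [[[x0[r][c], x1[r][c], x2[r][c], x3[r][c]] for c in range(2)]
          for r in range(2)]

print(x0)            # [[0, 2], [8, 10]]
print(x3)            # [[5, 7], [13, 15]]
print(merged[0][0])  # [0, 1, 4, 5]
```

Each of the four slices is a 2x2 map, so after concatenation the spatial size is halved and the depth is four times the input depth, which is what the `4*dim` LayerNorm and the `4*dim -> 2*dim` linear layer expect.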

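4. W-MSA (Windows Multi-head Self-Attention) and SW-MSA (Shifted Windows Multi-head Self-Attention)

The Windows Multi-head Self-Attention (W-MSA) module is introduced to reduce computation. With W-MSA, self-attention is computed only inside each window, so no information can pass between windows. To solve this problem, the authors introduce the Shifted Windows Multi-Head Self-Attention (SW-MSA) module.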
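To make the saving concrete, the sketch below evaluates the two complexity formulas given in the Swin Transformer paper: 4hwC^2 + 2(hw)^2 C for global MSA and 4hwC^2 + 2M^2 hwC for W-MSA with M x M windows. The function names are hypothetical, and the numbers assume the first-stage setting used later in this post (56x56 feature map, C=96, M=7).

```python
def msa_flops(h, w, c):
    """Global multi-head self-attention: 4hwC^2 + 2(hw)^2 C."""
    return 4 * h * w * c ** 2 + 2 * (h * w) ** 2 * c


def wmsa_flops(h, w, c, m):
    """Window attention with M x M windows: 4hwC^2 + 2 M^2 hw C."""
    return 4 * h * w * c ** 2 + 2 * m ** 2 * h * w * c


h = w = 56   # feature-map height and width
c = 96       # channel depth
m = 7        # window size

global_cost = msa_flops(h, w, c)
window_cost = wmsa_flops(h, w, c, m)
print(global_cost)   # 2003828736
print(window_cost)   # 145108992
```

The projection term 4hwC^2 is the same in both; only the attention term changes, from quadratic in hw to linear in hw for a fixed window size.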

# Split the feature map into windows; attention is then computed inside each window
def windows_partition(x, window_size):
    B, H, W, C = x.shape
    x = x.reshape([B, H//window_size, window_size, W//window_size, window_size, C])
    # transpose and reshape return new tensors, so the results must be assigned back
    x = x.transpose([0, 1, 3, 2, 4, 5])
    # [B, H//window_size, W//window_size, window_size, window_size, C]
    x = x.reshape([-1, window_size, window_size, C])
    # [B * H//window_size * W//window_size, window_size, window_size, C]
    return x

# Merge the windows back into one feature map (inverse of windows_partition)
def window_reverse(window, window_size, H, W):
    B = window.shape[0] // ((H//window_size) * (W//window_size))
    x = window.reshape([B, H//window_size, W//window_size, window_size, window_size, -1])
    x = x.transpose([0, 1, 3, 2, 4, 5])
    # [B, H//window_size, window_size, W//window_size, window_size, C]
    x = x.reshape([B, H, W, -1])
    return x
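The partition and reverse steps should be exact inverses of each other. The following pure-Python sketch checks that on a 4x4 grid with 2x2 windows; `partition` and `reverse` are hypothetical list-based stand-ins for the tensor functions above, without the batch and channel axes.

```python
def partition(grid, ws):
    """Split an H x W grid into a list of ws x ws windows (row-major order)."""
    height, width = len(grid), len(grid[0])
    windows = []
    for top in range(0, height, ws):
        for left in range(0, width, ws):
            windows.append([row[left:left + ws] for row in grid[top:top + ws]])
    return windows


def reverse(windows, ws, height, width):
    """Inverse of partition: stitch the windows back into an H x W grid."""
    per_row = width // ws
    grid = [[None] * width for _ in range(height)]
    for index, window in enumerate(windows):
        top, left = (index // per_row) * ws, (index % per_row) * ws
        for dy in range(ws):
            for dx in range(ws):
                grid[top + dy][left + dx] = window[dy][dx]
    return grid


grid = [[4 * r + c for c in range(4)] for r in range(4)]
windows = partition(grid, 2)
restored = reverse(windows, 2, 4, 4)

print(windows[0])        # [[0, 1], [4, 5]]
print(windows[1])        # [[2, 3], [6, 7]]
print(restored == grid)  # True
```

A window must contain spatially adjacent pixels, which is why the tensor version needs the transpose between the two reshapes; a reshape alone would group the wrong elements.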




Next, self-attention is computed inside each window. Leaving the mask aside, this attention is no different from the self-attention in a standard Transformer.

class window_attention(nn.Layer):
    def __init__(self, dim, window_size, num_heads):
        super().__init__()
        self.dim = dim
        self.dim_head = dim // num_heads
        self.num_heads = num_heads
        self.scale = self.dim_head ** -0.5
        self.softmax = nn.Softmax(-1)
        self.qkv = nn.Linear(dim, int(dim*3))
        self.proj = nn.Linear(dim, dim)

    def transpose_multi_head(self, x):
        new_shape = x.shape[:-1] + [self.num_heads, self.dim_head]
        x = x.reshape(new_shape)
        # [B, num_patches, num_heads, dim_head]
        x = x.transpose([0, 2, 1, 3])
        # [B, num_heads, num_patches, dim_head]
        return x

    def forward(self, x, mask=None):
        # x: [batch * num_windows, N, C], with N = window_size * window_size
        B, N, C = x.shape
        qkv = self.qkv(x).chunk(3, -1)
        q, k, v = map(self.transpose_multi_head, qkv)
        q = q * self.scale
        attn = paddle.matmul(q, k, transpose_y=True)   # [B, num_heads, N, N]

        if mask is not None:
            # mask: [num_windows, N, N]; broadcast it over the batch and the heads
            num_windows = mask.shape[0]
            attn = attn.reshape([B//num_windows, num_windows, self.num_heads, N, N])
            attn = attn + mask.unsqueeze(1).unsqueeze(0)
            attn = attn.reshape([-1, self.num_heads, N, N])
        attn = self.softmax(attn)

        attn = paddle.matmul(attn, v)
        # [B, num_heads, num_patches, dim_head]
        attn = attn.transpose([0, 2, 1, 3])
        # [B, num_patches, num_heads, dim_head]
        attn = attn.reshape([B, N, C])
        out = self.proj(attn)
        return out
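Why adding the mask before the softmax works can be seen on a single row of attention scores. This is a standard-library sketch with made-up scores; the value -100.0 is the one used for the mask later in this post.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


scores = [1.0, 2.0, 0.5, 1.5]        # one query against four keys (made-up values)
mask = [0.0, 0.0, -100.0, -100.0]    # keys 2 and 3 belong to a different region

weights = softmax([s + m for s, m in zip(scores, mask)])
print([round(wt, 4) for wt in weights])  # [0.2689, 0.7311, 0.0, 0.0]
```

The masked keys receive a weight that is zero for all practical purposes, while the remaining weights still sum to one, so each query effectively attends only to keys from its own region.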

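As for how SW-MSA (Shifted Windows Multi-head Self-Attention) is implemented in detail, see the referenced blog post. Here I have written a few demos for the parts I found hardest, to make them easier to understand.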

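paddle.roll()

About paddle.roll (same as torch.roll): in the figure below, b is a shifted by two positions along axis 0 and along axis 1; applying the same operation to b again brings it back to a.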
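A cyclic shift is easy to reproduce without any framework. The helper `roll2d` below is a hypothetical pure-Python version of a two-axis roll on a nested list, following the usual convention that a positive shift moves elements towards higher indices.

```python
def roll2d(grid, shift_rows, shift_cols):
    """Cyclically shift a 2-D list by (shift_rows, shift_cols)."""
    height, width = len(grid), len(grid[0])
    return [[grid[(r - shift_rows) % height][(c - shift_cols) % width]
             for c in range(width)] for r in range(height)]


a = [[4 * r + c for c in range(4)] for r in range(4)]

# Shift up and left by one, the direction used before SW-MSA.
b = roll2d(a, -1, -1)
print(b[0])                    # [5, 6, 7, 4]

# The opposite shift restores the original layout.
print(roll2d(b, 1, 1) == a)    # True

# On a grid of size 4, shifting by 2 twice also returns to the start.
print(roll2d(roll2d(a, 2, 2), 2, 2) == a)  # True
```

In the Swin block the feature map is rolled by -shift_size before window partitioning and has to be rolled back by +shift_size after the windows are merged.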

How the attention mask is generated

About self.register_buffer and the attention mask

        if self.shift_size > 0:
            H, W = self.resolution
            # label every position of the shifted feature map with a region id (0..8)
            img_mask = paddle.zeros((1, H, W, 1))
            h_slices = (slice(0, -self.window_size),
                        slice(-self.window_size, -self.shift_size),
                        slice(-self.shift_size, None))
            w_slices = (slice(0, -self.window_size),
                        slice(-self.window_size, -self.shift_size),
                        slice(-self.shift_size, None))
            cnt = 0
            for h in h_slices:
                for w in w_slices:
                    img_mask[:, h, w, :] = cnt
                    cnt += 1
            # split the label map into windows and flatten each one:
            # [num_windows, window_size*window_size]
            mask_windows = windows_partition(img_mask, self.window_size)
            mask_windows = mask_windows.reshape((-1, self.window_size * self.window_size))
            # pairwise difference inside each window; non-zero means different regions
            attn_mask = mask_windows.unsqueeze(1) - mask_windows.unsqueeze(2)
            # different regions get -100 (softmax weight close to 0), same region gets 0
            attn_mask = paddle.where(attn_mask != 0,
                                     paddle.ones_like(attn_mask) * float(-100.0),
                                     paddle.zeros_like(attn_mask))
        else:
            attn_mask = None

        # a buffer is saved with the model but is not updated by the optimizer
        self.register_buffer("attn_mask", attn_mask)


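Normally, the state of a network is saved as an OrderedDict. It contains two kinds of entries. The first kind is the parameters held by the modules of the model, i.e. nn.Parameter; we can of course also define additional nn.Parameter objects in the network. The second kind is buffers. Parameters are updated on every optim.step, while buffers are not.
The remaining steps are to split the label map into windows, flatten each window, and take the pairwise difference of each flattened window with itself (via broadcasting), which finally gives the attention mask (shown in the figure two images above).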
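The mask construction can be traced by hand on a tiny case. The sketch below repeats the same steps in pure Python for a 4x4 feature map with window size 2 and shift size 1; the helper names `region_ids` and `window_masks` are hypothetical, and the batch and channel axes are omitted.

```python
def region_ids(height, width, window_size, shift_size):
    """Label every position with the id (0..8) of the region it falls in."""
    def bands(length):
        # Same three bands as slice(0, -ws), slice(-ws, -shift), slice(-shift, None).
        return [range(0, length - window_size),
                range(length - window_size, length - shift_size),
                range(length - shift_size, length)]

    ids = [[0] * width for _ in range(height)]
    count = 0
    for rows in bands(height):
        for cols in bands(width):
            for r in rows:
                for c in cols:
                    ids[r][c] = count
            count += 1
    return ids


def window_masks(ids, window_size, neg=-100.0):
    """Per window, an N x N mask: 0 if two positions share a region, else neg."""
    masks = []
    for top in range(0, len(ids), window_size):
        for left in range(0, len(ids[0]), window_size):
            flat = [ids[top + dy][left + dx]
                    for dy in range(window_size) for dx in range(window_size)]
            masks.append([[0.0 if a == b else neg for b in flat] for a in flat])
    return masks


ids = region_ids(4, 4, window_size=2, shift_size=1)
masks = window_masks(ids, window_size=2)

for row in ids:
    print(row)
# [0, 0, 1, 2]
# [0, 0, 1, 2]
# [3, 3, 4, 5]
# [6, 6, 7, 8]
print(masks[0][0])  # [0.0, 0.0, 0.0, 0.0]
print(masks[3][0])  # [0.0, -100.0, -100.0, -100.0]
```

The top-left window lies entirely in region 0, so nothing is masked there. The bottom-right window mixes four regions that were not neighbours before the cyclic shift, so each position may only attend to itself.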

class Identity(nn.Layer):
    def __init__(self):
        super().__init__()
    def forward(self,x):
        return x
class Mlp(nn.Layer):
    def __init__(self, embed_dim, mlp_ratio=4.0, dropout=0.):
        super().__init__()
        w_att_1, b_att_1 = self.init_weight()
        w_att_2, b_att_2 = self.init_weight()
        # expand to embed_dim * mlp_ratio, then project back to embed_dim
        self.fc1 = nn.Linear(embed_dim, int(embed_dim*mlp_ratio), weight_attr=w_att_1, bias_attr=b_att_1)
        self.fc2 = nn.Linear(int(embed_dim*mlp_ratio), embed_dim, weight_attr=w_att_2, bias_attr=b_att_2)
        self.dropout = nn.Dropout(dropout)
        self.act = nn.GELU()

    def init_weight(self):
        # truncated-normal weights (std=0.02, as in the official Swin code) and zero biases
        weight_attr = paddle.ParamAttr(initializer=nn.initializer.TruncatedNormal(std=0.02))
        bias_attr = paddle.ParamAttr(initializer=nn.initializer.Constant(.0))
        return weight_attr, bias_attr

    def forward(self, x):
        x = self.fc1(x)
        x = self.act(x)
        x = self.dropout(x)
        x = self.fc2(x)
        x = self.dropout(x)
        return x

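5. Swin block

With all the modules written, we now chain them together into a Swin block. Apart from choosing between W-MSA and SW-MSA, it is the same as the encoder in a Transformer. After patch embedding, the patches are split into windows, W-MSA or SW-MSA is applied inside each window, followed by a residual connection, then the MLP, and then another residual connection.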

class SwinBlock(nn.Layer):
    def __init__(self,dim,input_resolution,num_heads,window_size,shift_size):
        super().__init__()
        self.dim = dim
        self.resolution = input_resolution
        self.window_size = window_size
        self.att_norm = nn.LayerNorm(dim)
        self.attn = window_attention(dim=dim,window_size=window_size, num_heads=num_heads)
        self.mlp = Mlp(dim)
        self.shift_size = shift_size
        self.mlp_norm = nn.LayerNorm(dim)
        if self.shift_size > 0:
            H, W = self.resolution
            img_mask = paddle.zeros((1, H, W, 1))
            h_slices = (slice(0, -self.window_size),
                        slice(-self.window_size, -self.shift_size),
                        slice(-self.shift_size, None))
            w_slices = (slice(0, -self.window_size),
                        slice(-self.window_size, -self.shift_size),
                        slice(-self.shift_size, None))
            cnt = 0
            for h in h_slices:
                for w in w_slices:
                    img_mask[:, h, w, :] = cnt
                    cnt += 1
            mask_windows = windows_partition(img_mask, self.window_size)
            mask_windows = mask_windows.reshape((-1, self.window_size * self.window_size))
            attn_mask = mask_windows.unsqueeze(1) - mask_windows.unsqueeze(2)
            attn_mask = paddle.where(attn_mask != 0,
                                     paddle.ones_like(attn_mask) * float(-100.0),
                                     attn_mask)
            attn_mask = paddle.where(attn_mask == 0,
                                     paddle.zeros_like(attn_mask),
                                     attn_mask)
        else:
            attn_mask = None
        self.register_buffer("attn_mask", attn_mask)

    def forward(self, x):
        H, W = self.resolution
        B, N, C = x.shape
        h = x                                   # residual branch
        x = self.att_norm(x)
        x = x.reshape([B, H, W, C])
        if self.shift_size > 0:
            # cyclic shift up/left so the regular window grid covers the shifted windows
            shift_x = paddle.roll(x, shifts=(-self.shift_size, -self.shift_size), axis=(1, 2))
        else:
            shift_x = x
        x_windows = windows_partition(shift_x, self.window_size)
        x_windows = x_windows.reshape([-1, self.window_size*self.window_size, C])
        attn_windows = self.attn(x_windows, mask=self.attn_mask)
        attn_windows = attn_windows.reshape([-1, self.window_size, self.window_size, C])
        shifted_x = window_reverse(attn_windows, self.window_size, H, W)
        if self.shift_size > 0:
            # undo the cyclic shift: roll back by +shift_size (not -shift_size again)
            x = paddle.roll(shifted_x, shifts=(self.shift_size, self.shift_size), axis=(1, 2))
        else:
            x = shifted_x
        x = x.reshape([B, -1, C])
        x = h + x                               # first residual connection
        h = x
        x = self.mlp_norm(x)
        x = self.mlp(x)
        x = h + x                               # second residual connection
        return x

6. Next, we chain all the modules together into a stage

A stage consists of several Swin Transformer blocks followed by one Patch Merging layer.

class SwinTransformerStage(nn.Layer):
    def __init__(self, dim, input_resolution, depth, num_heads, window_size, patch_merging=None):
        super().__init__()
        self.blocks = nn.LayerList()
        for i in range(depth):
            # even blocks use W-MSA (shift 0), odd blocks use SW-MSA (shift window_size//2)
            self.blocks.append(SwinBlock(dim=dim,
                                         input_resolution=input_resolution,
                                         num_heads=num_heads,
                                         window_size=window_size,
                                         shift_size=0 if (i % 2 == 0) else window_size//2))
        if patch_merging is None:
            self.patch_merging = Identity()
        else:
            self.patch_merging = patch_merging(input_resolution, dim)

    def forward(self, x):
        for block in self.blocks:
            x = block(x)
        x = self.patch_merging(x)
        return x

class Swin(nn.Layer):
    def __init__(self,
                 image_size=224,
                 patch_size=4,
                 in_channels=3,
                 embed_dim=96,
                 window_size=7,
                 num_heads=[3, 6, 12, 24],
                 depths=[2, 2, 6, 2],   # Swin-T: one depth per stage, matching num_heads
                 num_classes=1000):
        super().__init__()
        self.num_classes = num_classes
        self.depths = depths
        self.num_heads = num_heads
        self.embed_dim = embed_dim
        self.num_stages = len(depths)
        # the channel depth doubles after every stage except the last one
        self.num_features = int(self.embed_dim * 2 ** (self.num_stages - 1))
        self.patch_resolution = [image_size//patch_size, image_size//patch_size]
        self.patch_embedding = PatchEmbedding(patch_size=patch_size, embed_dim=embed_dim)
        self.stages = nn.LayerList()
        for idx, (depth, heads) in enumerate(zip(self.depths, self.num_heads)):
            stage = SwinTransformerStage(dim=int(self.embed_dim*2**idx),
                                         input_resolution=(self.patch_resolution[0]//(2**idx),
                                                           self.patch_resolution[1]//(2**idx)),
                                         depth=depth,
                                         num_heads=heads,
                                         window_size=window_size,
                                         # no downsampling after the last stage
                                         patch_merging=PatchMerging if (idx < self.num_stages-1) else None)
            self.stages.append(stage)
        self.norm = nn.LayerNorm(self.num_features)
        self.avgpool = nn.AdaptiveAvgPool1D(1)
        self.fc = nn.Linear(self.num_features, self.num_classes)

    def forward(self, x):
        x = self.patch_embedding(x)     # [B, h*w, embed_dim]
        for stage in self.stages:
            x = stage(x)
        x = self.norm(x)
        x = x.transpose([0, 2, 1])      # [B, num_features, num_patches]
        x = self.avgpool(x)             # [B, num_features, 1]
        x = x.flatten(1)                # [B, num_features]
        x = self.fc(x)                  # [B, num_classes]
        return x
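How the per-stage settings follow from the constructor arguments can be tabulated with a few lines of plain Python. This is a sketch that assumes the standard Swin-T depths (2, 2, 6, 2); the function name `stage_schedule` is hypothetical and nothing here touches the model itself.

```python
def stage_schedule(image_size=224, patch_size=4, embed_dim=96,
                   depths=(2, 2, 6, 2), window_size=7):
    """Return (dim, resolution, shift size of each block) for every stage."""
    base = image_size // patch_size          # resolution after patch embedding
    schedule = []
    for idx, depth in enumerate(depths):
        dim = embed_dim * 2 ** idx           # depth doubles at every stage
        resolution = base // 2 ** idx        # height/width halve at every stage
        # even blocks use W-MSA (shift 0), odd blocks use SW-MSA
        shifts = [0 if i % 2 == 0 else window_size // 2 for i in range(depth)]
        schedule.append((dim, resolution, shifts))
    return schedule


schedule = stage_schedule()
for dim, resolution, shifts in schedule:
    print(dim, resolution, shifts)
# 96 56 [0, 3]
# 192 28 [0, 3]
# 384 14 [0, 3, 0, 3, 0, 3]
# 768 7 [0, 3]
```

The last stage has dimension 768, which is the `num_features` value fed into the final LayerNorm and the classification layer when four stages are used.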

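7. Printing the network

    # dummy input batch; the shape is an assumption, the original does not show how t was created
    t = paddle.randn([4, 3, 224, 224])
    model = Swin()
    print(model)
    out = model(t)
    print(out.shape)

The printout below shows three stages, with a third stage far deeper than six blocks. This appears to correspond to a run with depths=[2, 2, 62] rather than the Swin-T setting depths=[2, 2, 6, 2].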
Swin(
  (patch_embedding): PatchEmbedding(
    (patch_embed): Conv2D(3, 96, kernel_size=[4, 4], stride=[4, 4], data_format=NCHW)
    (norm): LayerNorm(normalized_shape=[96], epsilon=1e-05)
  )
  (stages): LayerList(
    (0): SwinTransformerStage(
      (blocks): LayerList(
        (0): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[96], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=96, out_features=288, dtype=float32)
            (proj): Linear(in_features=96, out_features=96, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=96, out_features=384, dtype=float32)
            (fc2): Linear(in_features=384, out_features=96, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[96], epsilon=1e-05)
        )
        (1): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[96], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=96, out_features=288, dtype=float32)
            (proj): Linear(in_features=96, out_features=96, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=96, out_features=384, dtype=float32)
            (fc2): Linear(in_features=384, out_features=96, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[96], epsilon=1e-05)
        )
      )
      (patch_merging): PatchMerging(
        (reduction): Linear(in_features=384, out_features=192, dtype=float32)
        (norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
      )
    )
    (1): SwinTransformerStage(
      (blocks): LayerList(
        (0): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[192], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=192, out_features=576, dtype=float32)
            (proj): Linear(in_features=192, out_features=192, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=192, out_features=768, dtype=float32)
            (fc2): Linear(in_features=768, out_features=192, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[192], epsilon=1e-05)
        )
        (1): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[192], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=192, out_features=576, dtype=float32)
            (proj): Linear(in_features=192, out_features=192, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=192, out_features=768, dtype=float32)
            (fc2): Linear(in_features=768, out_features=192, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[192], epsilon=1e-05)
        )
      )
      (patch_merging): PatchMerging(
        (reduction): Linear(in_features=768, out_features=384, dtype=float32)
        (norm): LayerNorm(normalized_shape=[768], epsilon=1e-05)
      )
    )
    (2): SwinTransformerStage(
      (blocks): LayerList(
        (0): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (1): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (2): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (3): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (4): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (5): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (6): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (7): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (8): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (9): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (10): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (11): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (12): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (13): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (14): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (15): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (16): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (17): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (18): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (19): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (20): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (21): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (22): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (23): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (24): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (25): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (26): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (27): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (28): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (29): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (30): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (31): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (32): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (33): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (34): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (35): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (36): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (37): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (38): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (39): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (40): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (41): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (42): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (43): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (44): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (45): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (46): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (47): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (48): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (49): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (50): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (51): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (52): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (53): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (54): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (55): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (56): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (57): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (58): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (59): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (60): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
        (61): SwinBlock(
          (att_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
          (attn): window_attention(
            (softmax): Softmax(axis=-1)
            (qkv): Linear(in_features=384, out_features=1152, dtype=float32)
            (proj): Linear(in_features=384, out_features=384, dtype=float32)
          )
          (mlp): Mlp(
            (fc1): Linear(in_features=384, out_features=1536, dtype=float32)
            (fc2): Linear(in_features=1536, out_features=384, dtype=float32)
            (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
            (act): GELU(approximate=False)
          )
          (mlp_norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
        )
      )
      (patch_merging): Identity()
    )
  )
  (norm): LayerNorm(normalized_shape=[384], epsilon=1e-05)
  (avgpool): AdaptiveAvgPool1D(output_size=1)
  (fc): Linear(in_features=384, out_features=1000, dtype=float32)
)



---------------------------------------------------------------------------

NameError                                 Traceback (most recent call last)

/tmp/ipykernel_790/2976751405.py in <module>
      1 model = Swin()
      2 print(model)
----> 3 out = model(t)
      4 print(out.shape)


NameError: name 't' is not defined

7. About Relative Position Bias

For details, see the linked post,
or the video.
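
Because the explanation is only linked, here is a minimal NumPy sketch of how the relative position index is commonly built. It follows the indexing scheme used in the official Swin Transformer implementation and is not the code from this article. The function name `relative_position_index` and the randomly initialized `bias_table` are assumptions made for illustration; in a real model the table would be a learnable parameter.

```python
import numpy as np

def relative_position_index(window_size):
    """Return an [M*M, M*M] table of indices into a (2M-1)^2-row bias table."""
    m = window_size
    # Row/column coordinates of every token in an M x M window: shape [2, M, M]
    coords = np.stack(np.meshgrid(np.arange(m), np.arange(m), indexing="ij"))
    coords = coords.reshape(2, -1)                      # [2, M*M]
    # Pairwise offsets (query minus key), each in [-(M-1), M-1]
    rel = coords[:, :, None] - coords[:, None, :]       # [2, M*M, M*M]
    rel = rel.transpose(1, 2, 0) + (m - 1)              # shift into [0, 2M-2]
    # Flatten the 2-D offset into a single index: row * (2M-1) + col
    return rel[:, :, 0] * (2 * m - 1) + rel[:, :, 1]

window_size, num_heads = 7, 3
index = relative_position_index(window_size)            # [49, 49]
# Stand-in for the learnable table (hypothetical random values, trained in practice)
bias_table = np.random.randn((2 * window_size - 1) ** 2, num_heads) * 0.02
# Look up one bias per (query, key) pair and move heads first: [heads, 49, 49]
bias = bias_table[index.reshape(-1)].reshape(49, 49, num_heads).transpose(2, 0, 1)
```

In window attention, a bias of this shape would typically be added to the attention logits (shape [B*num_windows, heads, M*M, M*M]) before the softmax, so that every window shares the same (2M-1)^2 learned offsets per head.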

8. References

Code reference

Video reference

Blog reference
