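Semantic segmentation combines image classification, object detection, and image segmentation. It partitions an image into regions that carry semantic meaning and identifies the semantic class of each region, reasoning from low-level to high-level semantics, and finally produces a segmentation map in which every pixel carries a semantic label. The goal when designing a loss function is for the loss and the gradient to change together, where the gradient is taken with respect to the output of the network's last weighted layer. With a constant learning rate, we want a large loss and a large gradient when the prediction is far from the ground truth, and a small loss and a small gradient when the prediction is close to it.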
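The most commonly used loss function is the pixel-wise cross entropy loss (CE). It examines each pixel individually, comparing the predicted class distribution for that pixel (a probability vector) with the one-hot encoded label vector.
Suppose each pixel is to be assigned one of $5$ classes; the predicted probability vector then also has length $5$.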
The corresponding loss for each pixel is
$$\pmb{loss_{pixel}}=-\sum_{class}y_{true}^{class}\log(y_{pred}^{class})$$
Let $y_{pred}=softmax(x)$. The gradient propagated back to the logit of each class is then
$$\frac{\partial\,loss_{pixel}}{\partial x^{class}}=y_{pred}^{class}-y_{true}^{class}$$
For the logit of the true class of a one-hot label this equals $\sum_{class}y_{true}^{class}(y_{pred}^{class}-1)=y_{pred}^{class}-1$. The gradient is proportional to the prediction error of each class, so during optimization a small loss comes with a small gradient.
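The gradient statement above can be checked numerically. The following is a minimal sketch, assuming PyTorch is available; the logits are random and the true class index `2` is a made-up value. It compares the autograd gradient of the per-pixel cross entropy with the closed form `softmax(x) - y_true`.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Logits for a single pixel with 5 classes (hypothetical values).
x = torch.randn(5, requires_grad=True)
true_class = 2  # assumed ground-truth class for this pixel
y_true = F.one_hot(torch.tensor(true_class), num_classes=5).float()

# Per-pixel cross entropy: -sum_c y_true[c] * log(softmax(x)[c]).
y_pred = torch.softmax(x, dim=0)
loss_pixel = -(y_true * torch.log(y_pred)).sum()
loss_pixel.backward()

# Closed-form gradient with respect to the logits: softmax(x) - y_true.
analytic_grad = (y_pred - y_true).detach()
print(x.grad)
print(analytic_grad)
```

The two printed vectors should agree up to floating-point error, and the entry at `true_class` equals `y_pred[true_class] - 1`.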
The loss for the whole image is the mean of the losses over all $n$ pixels
$$\pmb{loss_{ce}}=\frac{1}{n}\sum_{pixel=1}^{n}loss_{pixel}$$
F.cross_entropy(input, target, weight=self.weight, ignore_index=self.ignore_index, reduction=self.reduction, label_smoothing=self.label_smoothing)
PyTorch API
- This case is equivalent to the combination of `torch.nn.LogSoftmax` and `torch.nn.NLLLoss`.
  $$\text{LogSoftmax}(x_{i}) = \log\left(\frac{\exp(x_i)}{\sum_j \exp(x_j)}\right)$$
  $$NLL(x, y) = L = \{l_1,\dots,l_N\}^\top, \quad l_n = - w_{y_n} x_{n,y_n}, \quad w_{c} = \text{weight}[c] \cdot \mathbb{1}\{c \not= \text{ignore\_index}\}$$
- It is useful when training a classification problem with `C` classes.
- If provided, the optional argument `weight` should be a 1D `Tensor` assigning a weight to each of the classes. This is particularly useful when you have an unbalanced training set.
- The `input` is expected to contain raw, unnormalized scores for each class. `input` has to be a Tensor of size `(C)` for unbatched input, `(N, C)` or `(N, C, d_1, d_2, ..., d_K)` with $K\geq 1$ for the `K`-dimensional case, where `C` is the number of classes and `N` is the batch size.
- The `target` that this criterion expects should contain either
  - class indices in the range `[0, C)` where `C` is the number of classes, not one-hot, dtype long. If `ignore_index` is specified, the loss also accepts this class index (this index may not necessarily be in the class range); or
  - class probabilities, with the same shape as the input and each value in `[0, 1]`, dtype float.
- The unreduced (i.e. with `reduction` set to `'none'`) loss for the class-index case can be described as
  $$\ell(x, y) = L = \{l_1,\dots,l_N\}^\top, \quad l_n = - w_{y_n} \log \frac{\exp(x_{n,y_n})}{\sum_{c=1}^C \exp(x_{n,c})} \cdot \mathbb{1}\{y_n \not= \text{ignore\_index}\}$$
  where `x` is the input, `y` is the target, `w` is the weight, `C` is the number of classes, and `N` spans the minibatch dimension as well as `d_1, ..., d_K` for the `K`-dimensional case.
- The performance of this criterion is generally better when `target` contains class indices, as this allows for optimized computation. Consider providing `target` as class probabilities only when a single class label per minibatch item is too restrictive.
- The output: if `reduction` is `'none'`, same shape as the target. Otherwise, a scalar.
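To make the shape and weighting rules above concrete, here is a small sketch on a made-up segmentation batch. The tensor sizes, the class weights, and the use of `255` as the ignored label are illustrative assumptions, not values from the original post.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical segmentation batch: N=2 images, C=3 classes, 4x4 pixels.
logits = torch.randn(2, 3, 4, 4)            # raw scores, shape (N, C, H, W)
target = torch.randint(0, 3, (2, 4, 4))     # class indices, shape (N, H, W), dtype long
target[0, 0, :] = 255                       # mark one row as unlabeled (assumed convention)

class_weight = torch.tensor([1.0, 2.0, 4.0])  # assumed per-class weights

# reduction='none' keeps one loss value per pixel, same shape as the target.
per_pixel = F.cross_entropy(logits, target, weight=class_weight,
                            ignore_index=255, reduction="none")

# reduction='mean' divides by the summed weights of the non-ignored pixels.
mean_loss = F.cross_entropy(logits, target, weight=class_weight,
                            ignore_index=255, reduction="mean")

valid = target != 255
manual_mean = per_pixel[valid].sum() / class_weight[target[valid]].sum()
print(per_pixel.shape, mean_loss.item(), manual_mean.item())
```

Ignored pixels contribute zero to the per-pixel map, and with class weights the `'mean'` reduction is a weighted mean rather than a plain average over pixels.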
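Mathematically, `torch.nn.CrossEntropyLoss` is equivalent to `torch.nn.LogSoftmax` followed by `torch.nn.NLLLoss`, but the API implementations differ: the target of `torch.nn.NLLLoss` cannot contain class probabilities, whereas the target of `torch.nn.CrossEntropyLoss` can. In terms of accepted targets, `torch.nn.CrossEntropyLoss` therefore covers a superset of what `LogSoftmax` plus `NLLLoss` handles.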
```python
import numpy as np
import torch
import torch.nn as nn

ce = nn.CrossEntropyLoss()
ls = nn.LogSoftmax(dim=1)
nll = nn.NLLLoss()


# Reference re-implementation of the API for (N, C, H, W) inputs
def cross_entropy(inputs, targets):
    inputs = inputs.numpy()
    targets = targets.numpy()
    outputs = 0.
    weight = 1.
    if targets.dtype == np.int64:
        # Mode 1: targets hold class indices in [0, C-1], shape (N, H, W)
        assert len(inputs.shape) == 4 and len(targets.shape) == 3
        for k in range(targets.shape[0]):
            for i in range(targets.shape[-2]):
                for j in range(targets.shape[-1]):
                    logits = inputs[k, :, i, j]
                    log_prob = np.log(np.exp(logits) / np.sum(np.exp(logits)))
                    outputs += -1. * weight * log_prob[int(targets[k, i, j])]
    elif targets.dtype == np.float32:
        # Mode 2: targets hold class probabilities in [0, 1], same shape as inputs
        assert inputs.shape == targets.shape
        for k in range(targets.shape[0]):
            for i in range(targets.shape[-2]):
                for j in range(targets.shape[-1]):
                    logits = inputs[k, :, i, j]
                    log_prob = np.log(np.exp(logits) / np.sum(np.exp(logits)))
                    outputs += -1. * weight * np.sum(log_prob * targets[k, :, i, j])
    else:
        raise TypeError(f'target dtype must be int64 or float32, not {targets.dtype}')
    # Average over every pixel of every image
    return float(outputs / (targets.shape[0] * targets.shape[-2] * targets.shape[-1]))


inputs = torch.rand(1, 5, 5, 5)
targets = torch.rand(1, 5, 5).random_(5).long()
# targets = torch.nn.Softmax(dim=1)(torch.rand(1, 5, 5, 5))

outputs = ce(inputs, targets)
print(f'ce {outputs:.6f}')
if targets.dtype == torch.int64:
    outputs = nll(ls(inputs), targets)
    print(f'logsoftmax+nll {outputs:.6f}')
outputs = cross_entropy(inputs, targets)
print(f'cross_entropy {outputs:.6f}')

"""
Output reported in the original post (inputs are random, so values differ between runs):
ce 0.725609
logsoftmax+nll 0.725609
cross_entropy 0.725609
"""
```
The binary cross entropy loss (BCE) applies when the target has only two classes
$$\pmb{loss_{bce}}=-y_{true}\log(y_{pred})-(1-y_{true})\log(1-y_{pred})$$
If $y_{pred}=sigmoid(x)$, the gradient propagated back is
$$\frac{d(loss_{bce})}{dx}=y_{pred}-y_{true}$$
which is proportional to the error, so during optimization a small loss comes with a small gradient.
F.binary_cross_entropy(input, target, weight=self.weight, reduction=self.reduction)
PyTorch API
- `weight` (Tensor, optional): a manual rescaling weight given to the loss of each batch element.
- This is used for measuring the error of a reconstruction in, for example, an auto-encoder.
- The unreduced (i.e. with `reduction` set to `'none'`) loss can be described as
  $$\ell(x, y) = L = \{l_1,\dots,l_N\}^\top, \quad l_n = - w_n \left[ y_n \cdot \log x_n + (1 - y_n) \cdot \log (1 - x_n) \right],$$
  where `N` is the batch size.
- Targets `y` should be numbers between 0 and 1.
- If `reduction` is not `'none'` (default `'mean'`), then
  $$\ell(x, y) = \begin{cases} \operatorname{mean}(L), & \text{if reduction} = \text{`mean';}\\ \operatorname{sum}(L), & \text{if reduction} = \text{`sum'.} \end{cases}$$
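When BCE is used for regression-like tasks such as image reconstruction, the ground-truth labels are not binary and may take any value in $[0, 1]$. For example, a label may split its mass between a foreground and a background class, with the two parts summing to $1$. In that case the cross entropy still reaches its minimum when the prediction exactly equals the label, but that minimum is no longer $0$.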
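The claim about the non-zero minimum can be illustrated with a short standard-library sketch. The soft label `0.3` and the grid of candidate predictions are made-up values chosen for the illustration.

```python
import math


def bce(y_true: float, y_pred: float) -> float:
    """Binary cross entropy for a single value, using the natural logarithm."""
    return -(y_true * math.log(y_pred) + (1.0 - y_true) * math.log(1.0 - y_pred))


y_true = 0.3  # hypothetical soft label
candidates = [i / 100 for i in range(1, 100)]   # predictions 0.01 .. 0.99
best_pred = min(candidates, key=lambda p: bce(y_true, p))
min_loss = bce(y_true, best_pred)

print(best_pred)            # the prediction that minimizes the loss
print(round(min_loss, 4))   # the minimum itself, which is greater than zero
```

The minimizer coincides with the label, and the remaining loss is the binary entropy of the label, which vanishes only for labels of exactly 0 or 1.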
```python
import numpy as np
import torch

bce = torch.nn.BCELoss()


# Reference re-implementation of the API for (N, C, H, W) inputs
def binary_cross_entropy(inputs, targets):
    # Flatten both tensors to shape (N*C, H*W)
    inputs = inputs.numpy()
    inputs = inputs.reshape((inputs.shape[0] * inputs.shape[1], inputs.shape[-2] * inputs.shape[-1]))
    targets = targets.numpy()
    targets = targets.reshape((targets.shape[0] * targets.shape[1], targets.shape[-2] * targets.shape[-1]))
    outputs = 0.
    weight = 1.
    for i in range(targets.shape[0]):
        temp = 0.
        for j in range(targets.shape[1]):
            temp += -1. * weight * (targets[i, j] * np.log(inputs[i, j])
                                    + (1 - targets[i, j]) * np.log(1 - inputs[i, j]))
        # Mean over the pixels of one channel
        outputs += temp / targets.shape[1]
    # Mean over all channels of all images
    return outputs / targets.shape[0]


inputs = torch.rand((1, 2, 2, 2))  # probabilities in [0, 1)
targets = torch.tensor([[[[0, 1.], [1., 0]], [[0, 1.], [1., 0]]]])
# targets = torch.nn.Softmax(dim=1)(torch.rand(1, 2, 2, 2))
print(f'bce {bce(inputs, targets):.6f}',
      f'binary_cross_entropy {binary_cross_entropy(inputs, targets):.6f}', sep="\n")

"""
Output reported in the original post (inputs are random, so values differ between runs):
bce 0.586063
binary_cross_entropy 0.586063
"""
```
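The cross entropy loss evaluates the class prediction of each pixel separately and then averages over all pixels, so in effect every pixel in the image is weighted equally during learning. If the classes are unevenly distributed in the image, training can be dominated by the classes with many pixels: the model mainly learns the features of the frequent classes, and the learned model is biased towards predicting those classes.
In the papers on the fully convolutional network (FCN) and U-Net, each value in the output probability vector is weighted so that the model pays more attention to under-represented samples, which alleviates class imbalance within an image.
For example, if positives and negatives in a binary problem occur at a ratio of $1:99$, a model that predicts every sample as negative still reaches $99\%$ accuracy, which is meaningless in practice.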
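The 1:99 example can be reproduced in a few lines of plain Python. The labels below are synthetic, and recall is added as one assumed way of exposing what accuracy hides.

```python
# Hypothetical labels: 1 positive and 99 negatives.
labels = [1] + [0] * 99
predictions = [0] * 100          # a model that always predicts "negative"

correct = sum(p == t for p, t in zip(predictions, labels))
accuracy = correct / len(labels)

# Recall on the positive class: how many true positives were actually found.
true_positives = sum(p == 1 and t == 1 for p, t in zip(predictions, labels))
recall = true_positives / sum(labels)

print(accuracy, recall)
```

Accuracy comes out at 0.99 while recall on the positive class is 0, which is why accuracy alone says little under heavy imbalance.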
To balance this gap, the losses of positive and negative samples are given different weights, which yields the weighted binary cross entropy loss (weighted loss)
$$\pmb{loss_{weighted}}=-pos_{weighted}\times y_{true}\log(y_{pred})-(1-y_{true})\log(1-y_{pred})\\ \pmb{pos_{weighted}}=\frac{neg_{num}}{pos_{num}}$$
Let $y_{pred}=sigmoid(x)$. The gradient propagated back is then
$$\frac{d(loss_{weighted})}{dx}=(1-y_{true})y_{pred}-pos_{weighted}\times y_{true}(1-y_{pred})$$
which is still proportional to the error. For a positive sample it equals $pos_{weighted}(y_{pred}-1)$, i.e. the plain BCE gradient scaled by $pos_{weighted}$; for a negative sample it equals $y_{pred}$, unchanged from plain BCE. During optimization a small loss therefore still comes with a small gradient. When $pos_{weighted}>1$ (positives are rare) the positive samples are amplified relative to the negatives; when $pos_{weighted}<1$ the positives are suppressed and the negatives are relatively strengthened.
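As a sketch of how the weighted loss can be used in PyTorch, the snippet below compares `torch.nn.BCEWithLogitsLoss` with its `pos_weight` argument against the formula written out by hand. The batch of 2 positives and 8 negatives is made up for the illustration.

```python
import torch

torch.manual_seed(0)

# Hypothetical imbalanced batch: 2 positives, 8 negatives.
y_true = torch.tensor([1., 1., 0., 0., 0., 0., 0., 0., 0., 0.])
x = torch.randn(10)  # raw logits

# pos_weighted = neg_num / pos_num, kept as a 1-element tensor so it broadcasts.
pos_weight = ((y_true == 0).sum() / (y_true == 1).sum()).reshape(1)

criterion = torch.nn.BCEWithLogitsLoss(pos_weight=pos_weight)
loss_api = criterion(x, y_true)

# The same loss written out from the formula, averaged over the batch.
y_pred = torch.sigmoid(x)
loss_manual = (-pos_weight * y_true * torch.log(y_pred)
               - (1 - y_true) * torch.log(1 - y_pred)).mean()

print(pos_weight.item(), loss_api.item(), loss_manual.item())
```

`BCEWithLogitsLoss` takes logits and applies the sigmoid internally, so `x` is passed in directly rather than `y_pred`.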
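Sometimes it is not enough to correct for the imbalance in pixel counts between classes; pixels also need to be divided into hard and easy samples. The model predicts easy samples correctly with little effort, and getting a large number of easy samples right already lowers the loss considerably, so the model fails to attend to the hard samples. The model should therefore be made to focus more on the hard samples.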
Samples of different difficulty can be given different weights
$$-(1-y_{pred})^{\gamma}\times y_{true}\log(y_{pred})-y_{pred}^{\gamma}(1-y_{true})\times \log(1-y_{pred})$$
For example, take a positive sample with $\gamma=2$. A prediction of $0.95$ is an easy sample: $(1-0.95)^2=0.0025$, so its loss is reduced to $\frac{1}{400}$ of the original. A prediction of $0.5$ is a hard sample: $(1-0.5)^2=0.25$, so its loss is reduced to only $\frac{1}{4}$ of the original. The relative reduction is far smaller, so overall the hard samples carry more weight and the model concentrates on learning them.
Taking both the positive/negative imbalance and the sample difficulty into account gives the focal loss
$$\pmb{loss_{focal}}=-\alpha(1-y_{pred})^{\gamma}\times y_{true}\log(y_{pred})-(1-\alpha)y_{pred}^{\gamma}(1-y_{true})\times \log(1-y_{pred})\\ \pmb{default\;\gamma=2}$$
Its gradient behaves similarly to that of the weighted loss.
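The focal loss formula translates almost directly into PyTorch. The function below is a minimal sketch, not a library API: the name `binary_focal_loss`, the default `alpha=0.25`, and the clamping constant `eps` are assumptions made for the illustration, while `gamma=2` follows the default stated above.

```python
import torch


def binary_focal_loss(logits, targets, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss on raw logits (alpha and eps defaults are assumptions)."""
    y_pred = torch.sigmoid(logits).clamp(eps, 1.0 - eps)  # avoid log(0)
    pos_term = -alpha * (1.0 - y_pred) ** gamma * targets * torch.log(y_pred)
    neg_term = -(1.0 - alpha) * y_pred ** gamma * (1.0 - targets) * torch.log(1.0 - y_pred)
    return (pos_term + neg_term).mean()


def logit(p):
    """Inverse of the sigmoid, used to build logits with a known probability."""
    return torch.log(p / (1.0 - p))


target = torch.ones(1)                 # one positive sample
easy = logit(torch.tensor([0.95]))     # predicted probability 0.95
hard = logit(torch.tensor([0.5]))      # predicted probability 0.5

# alpha=1 isolates the modulating factor; gamma=0 switches it off (plain BCE).
ratio_easy = (binary_focal_loss(easy, target, alpha=1.0)
              / binary_focal_loss(easy, target, alpha=1.0, gamma=0.0))
ratio_hard = (binary_focal_loss(hard, target, alpha=1.0)
              / binary_focal_loss(hard, target, alpha=1.0, gamma=0.0))
print(ratio_easy.item(), ratio_hard.item())
```

The two ratios reproduce the factors from the worked example: roughly 1/400 for the easy sample and 1/4 for the hard one.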
Another commonly used loss function is based on the $Dice$ coefficient (soft dice loss, SD). The coefficient is essentially a measure of the overlap between two samples and ranges over $0\sim1$, where $1$ means complete overlap
$$Dice=\frac{2|A\cap B|}{|A|+|B|}=\frac{2TP}{2TP+FP+FN}$$
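$|A\cap B|$ is the number of elements common to sets $A$ and $B$, and $|A|$ and $|B|$ are the numbers of elements in $A$ and $B$ respectively. The factor $2$ in the numerator keeps the value within $[0,1]$. In practice $|A\cap B|$ is computed as the element-wise product of the predicted mask and the label mask, followed by a sum over the resulting matrix.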
In the $Dice$ coefficient, $TP$ denotes true positives, $FP$ false positives, and $FN$ false negatives. Since $precision=\frac{TP}{TP+FP}$ and $recall=\frac{TP}{TP+FN}$, $Dice$ accounts for both precision and recall.
Each class has to be predicted as a whole so that the prediction for every class overlaps the ground truth as much as possible, i.e. $TP$ should be as large as possible and $FP$ and $FN$ as small as possible.
Computing $1-Dice$ for every class, then summing and averaging over the $n$ classes, gives the final soft dice loss
$$\pmb{loss_{sd}}=\frac{1}{n}\sum_{class=1}^{n}\left\{1-\frac{2\sum_{pixel}(y_{true}y_{pred})}{\sum_{pixel}(y_{true}+y_{pred})}\right\}$$
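A per-class version of the formula above might look as follows in PyTorch. This is a sketch under stated assumptions: the function name `soft_dice_loss`, the softmax over the class dimension, the smoothing term `eps`, and the choice to sum over the whole batch are illustrative and not taken from the original post.

```python
import torch
import torch.nn.functional as F


def soft_dice_loss(logits, target, eps=1e-6):
    """logits: (N, C, H, W) raw scores; target: (N, H, W) class indices."""
    num_classes = logits.shape[1]
    y_pred = torch.softmax(logits, dim=1)
    y_true = F.one_hot(target, num_classes).permute(0, 3, 1, 2).float()
    dims = (0, 2, 3)  # sum over batch and pixels, keep the class dimension
    intersection = (y_true * y_pred).sum(dims)
    denominator = (y_true + y_pred).sum(dims)
    dice = (2.0 * intersection + eps) / (denominator + eps)
    return (1.0 - dice).mean()  # average of 1 - Dice over the classes


torch.manual_seed(0)
target = torch.arange(16).reshape(1, 4, 4) % 3           # every class is present
logits = torch.randn(1, 3, 4, 4, requires_grad=True)

loss = soft_dice_loss(logits, target)
loss.backward()                                           # probabilities keep it differentiable

# Logits that strongly favour the correct class should give a loss near zero.
perfect_logits = 20.0 * F.one_hot(target, 3).permute(0, 3, 1, 2).float()
perfect_loss = soft_dice_loss(perfect_logits, target)
print(loss.item(), perfect_loss.item())
```

Working on probabilities instead of thresholded masks is what makes the loss "soft" and lets gradients flow back to the logits.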
For binary segmentation, let $y_{pred}=sigmoid(x)$ and write $y^{pixel}$ for $y_{pred}^{pixel}$. The gradient propagated back is then
$$\frac{d(loss_{sd}^{pixel})}{dy^{pixel}}=\frac{1}{2}\sum_{class=1}^{2}\left\{-\frac{2[y_{true}^{pixel}(y_{true}^{pixel}+y_{pred}^{pixel})-y_{true}^{pixel}y_{pred}^{pixel}]}{(y_{true}^{pixel}+y_{pred}^{pixel})^2}\right\}=\frac{1}{2}\sum_{class=1}^{2}\begin{cases}0, & y_{true}^{pixel}=0\\ -\dfrac{2}{(1+y_{pred}^{pixel})^2}, & y_{true}^{pixel}=1\end{cases}$$
$$\frac{d(loss_{sd}^{pixel})}{dx^{pixel}}=\frac{d(loss_{sd}^{pixel})}{dy^{pixel}}\times\frac{e^{-x^{pixel}}}{(e^{-x^{pixel}}+1)^2}$$
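As $x^{pixel}$ increases, the loss (blue) tends to zero and the gradient (red) tends to zero; as $x^{pixel}$ decreases, the loss tends to one and the gradient again tends to zero. This resembles the mean squared error (MSE) applied to a sigmoid output, where the gradient is small whether the prediction is close to the true value or close to the wrong one.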
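The behaviour described above can be tabulated without a plot. The sketch below evaluates the per-pixel loss $1-\frac{2y_{true}y_{pred}}{y_{true}+y_{pred}}$ and its derivative with respect to $x$ for a positive pixel ($y_{true}=1$), for a single class and without the class-averaging factor; the sample points are arbitrary.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def pixel_dice_loss_and_grad(x, y_true=1.0):
    """Loss 1 - 2*t*p/(t+p) for one pixel and its derivative with respect to x."""
    p = sigmoid(x)
    loss = 1.0 - 2.0 * y_true * p / (y_true + p)
    dloss_dp = -2.0 * y_true ** 2 / (y_true + p) ** 2   # derivative w.r.t. y_pred
    dp_dx = p * (1.0 - p)                               # sigmoid derivative
    return loss, dloss_dp * dp_dx


results = {x: pixel_dice_loss_and_grad(x) for x in (-8.0, -2.0, 0.0, 2.0, 8.0)}
for x, (loss, grad) in results.items():
    print(f"x={x:+.1f}  loss={loss:.4f}  grad={grad:.4f}")
```

The gradient magnitude peaks around the middle of the range and shrinks towards zero at both ends, including the end where the prediction is confidently wrong.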
```python
import torch
import torch.nn as nn


def _take_channels(*xs, ignore_channels=None):
    """Drop the channels listed in ignore_channels from every tensor in xs."""
    if ignore_channels is None:
        return xs
    channels = [channel for channel in range(xs[0].shape[1]) if channel not in ignore_channels]
    return [torch.index_select(x, dim=1, index=torch.tensor(channels).to(x.device)) for x in xs]


def _threshold(x, threshold=None):
    """Binarize x at threshold; leave it unchanged when threshold is None."""
    if threshold is not None:
        return (x > threshold).type(x.dtype)
    return x


class DiceLoss(nn.Module):
    def __init__(self, eps=1, threshold=None, ignore_channels=None):
        super().__init__()
        self.eps = eps
        # Thresholding is not differentiable, so keep it None when training
        self.threshold = threshold
        self.ignore_channels = ignore_channels

    def forward(self, probs, targets):
        assert probs.shape[0] == targets.shape[0]
        probs = _threshold(probs, threshold=self.threshold)
        pr, gt = _take_channels(probs, targets, ignore_channels=self.ignore_channels)
        tp = torch.sum(gt * pr)
        fp = torch.sum(pr) - tp
        fn = torch.sum(gt) - tp
        score = (2 * tp + self.eps) / (2 * tp + fn + fp + self.eps)
        # Return 1 - Dice so that a better overlap gives a smaller loss
        return 1 - score
```
The $IoU$ coefficient, also called the Jaccard similarity, is computed with a formula very similar to that of the $Dice$ coefficient; the difference is that $TP$ is counted only once
$$IoU=\frac{TP}{TP+FP+FN}=\frac{|A\cap B|}{|A|+|B|-|A\cap B|}=\frac{|A\cap B|}{|A\cup B|}$$
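Because both coefficients are built from the same $TP$, $FP$ and $FN$ counts, one can be converted into the other via $Dice=\frac{2\,IoU}{1+IoU}$. The plain-Python sketch below checks this on two made-up masks, represented as sets of foreground pixel indices.

```python
# Hypothetical binary masks given as sets of foreground pixel indices.
pred_mask = {0, 1, 2, 3, 4, 5}
true_mask = {3, 4, 5, 6, 7, 8, 9}

tp = len(pred_mask & true_mask)   # predicted foreground that is truly foreground
fp = len(pred_mask - true_mask)   # predicted foreground that is background
fn = len(true_mask - pred_mask)   # foreground that was missed

dice = 2 * tp / (2 * tp + fp + fn)
iou = tp / (tp + fp + fn)

print(dice, iou, 2 * iou / (1 + iou))
```

Since $TP$ is counted twice in Dice and once in IoU, Dice is never smaller than IoU for the same pair of masks.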
Computing $1-IoU$ for the mask of every class, then summing and averaging, gives the loss based on the $IoU$ coefficient (soft IoU loss, SI)
$$\pmb{loss_{si}}=\frac{1}{n}\sum_{class=1}^{n}\left\{1-\frac{\sum_{pixel}(y_{true}y_{pred})}{\sum_{pixel}(y_{true}+y_{pred}-y_{true}y_{pred})}\right\}$$
Its gradient behaves similarly to that of the soft dice loss.
```python
# _take_channels and _threshold are the same as in the DiceLoss listing above


class IouLoss(nn.Module):
    def __init__(self, eps=1, threshold=None, ignore_channels=None):
        super().__init__()
        self.eps = eps
        # Thresholding is not differentiable, so keep it None when training
        self.threshold = threshold
        self.ignore_channels = ignore_channels

    def forward(self, probs, targets):
        probs = _threshold(probs, threshold=self.threshold)
        pr, gt = _take_channels(probs, targets, ignore_channels=self.ignore_channels)
        intersection = torch.sum(gt * pr)
        union = torch.sum(gt) + torch.sum(pr) - intersection + self.eps
        score = (intersection + self.eps) / union
        # Return 1 - IoU so that a better overlap gives a smaller loss
        return 1 - score
```
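The cross entropy loss treats every pixel as an independent sample, whereas the similarity-based losses look at the final prediction in a more holistic way. The two kinds of loss target different situations and each has its strengths and weaknesses; in practice they can be used together to complement each other.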
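One simple way to combine the two kinds of loss is a weighted sum. The sketch below is only an illustration: the function name `combined_loss`, the mixing weight `dice_weight=0.5`, and the smoothing term `eps` are assumptions, and the right balance depends on the task.

```python
import torch
import torch.nn.functional as F


def combined_loss(logits, target, dice_weight=0.5, eps=1e-6):
    """Weighted sum of pixel-wise cross entropy and a per-class soft dice loss."""
    ce = F.cross_entropy(logits, target)                  # per-pixel view

    y_pred = torch.softmax(logits, dim=1)
    y_true = F.one_hot(target, logits.shape[1]).permute(0, 3, 1, 2).float()
    inter = (y_true * y_pred).sum(dim=(0, 2, 3))
    denom = (y_true + y_pred).sum(dim=(0, 2, 3))
    dice = 1.0 - ((2.0 * inter + eps) / (denom + eps)).mean()   # region-level view

    return (1.0 - dice_weight) * ce + dice_weight * dice


torch.manual_seed(0)
logits = torch.randn(2, 3, 8, 8, requires_grad=True)     # hypothetical (N, C, H, W) scores
target = torch.randint(0, 3, (2, 8, 8))                  # hypothetical labels

loss = combined_loss(logits, target)
loss.backward()
print(loss.item())
```

Setting `dice_weight` to 0 or 1 recovers the pure cross entropy or the pure soft dice loss, which makes it easy to compare the three variants in an experiment.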