An Analysis of the MSCOCO Object Detection Evaluation System

Author: admin
Published: June 18, 2024
Category: Machine learning project testing

Source: https://zhuanlan.zhihu.com/p/110676412

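Detection quality is usually measured by AP. The existing evaluation codebases, such as COCO, VOC, and Wider Face, all compute AP on the same principle: approximating the area under the PR curve (see https://github.com/huangh12/Object-Detection-Metrics). Getting from a PR curve to AP is therefore much the same everywhere. The real difference lies in how the PR curve is obtained, and that comes down to the strategy for matching detections (det) to ground truths (gt).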
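To make "approximating the area under the PR curve" concrete, here is a minimal sketch in the spirit of COCO's 101-point recall sampling. The function name `average_precision` and its input format are illustrative assumptions, not part of any evaluation toolkit, and the detections are assumed to be already matched and sorted by descending score.

```python
def average_precision(is_tp, num_gt, num_points=101):
    """AP from detections already sorted by descending score.

    is_tp[i] is True if the i-th detection is a true positive.
    num_gt is the number of (non-ignored) ground truths.
    """
    if num_gt == 0:
        return 0.0
    precisions, recalls = [], []
    tp = fp = 0
    for hit in is_tp:
        if hit:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_gt)
    # Make precision monotonically non-increasing (the PR "envelope").
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sample the envelope at evenly spaced recall levels and average.
    total = 0.0
    j = 0
    for k in range(num_points):
        r = k / (num_points - 1)
        # Advance to the first point whose recall reaches r.
        while j < len(recalls) and recalls[j] < r:
            j += 1
        total += precisions[j] if j < len(recalls) else 0.0
    return total / num_points


ap_perfect = average_precision([True, True], num_gt=2)
ap_half = average_precision([True, False], num_gt=2)
```

In the second call only half of the ground truths are ever recalled, so the sampled precision drops to zero for every recall level above 0.5.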

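In COCO, a gt also carries a crowd attribute. I had long assumed that crowd simply means ignore, but a close reading of the code shows that the two are different: ignore is a separate attribute of the gt, and the two are handled differently when AP is computed, so extra care is needed.

As noted, the core difference between evaluation systems is the det-to-gt matching strategy. COCO's matching logic lives mainly in the evaluateImg method listed further below.

First, dets are sorted by score in descending order, and gts are sorted by their ignore flag (0 = not ignored, 1 = ignored) in ascending order, which pushes the gts with ignore = 1 to the end.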
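The two sorts can be tried out in isolation. The sketch below uses made-up detections and ground truths; Python's `sorted` is stable, which plays the same role as `kind='mergesort'` in the original `np.argsort` calls, so items with equal keys keep their relative order.

```python
# Toy data (hypothetical ids and scores, for illustration only).
dets = [
    {"id": 1, "score": 0.3},
    {"id": 2, "score": 0.9},
    {"id": 3, "score": 0.6},
]
gts = [
    {"id": 10, "_ignore": 1},
    {"id": 11, "_ignore": 0},
    {"id": 12, "_ignore": 1},
    {"id": 13, "_ignore": 0},
]

# Highest score first.
dets_sorted = sorted(dets, key=lambda d: -d["score"])
# Non-ignored gts first, ignored gts last; ties keep their original order.
gts_sorted = sorted(gts, key=lambda g: g["_ignore"])

det_order = [d["id"] for d in dets_sorted]  # [2, 3, 1]
gt_order = [g["id"] for g in gts_sorted]    # [11, 13, 10, 12]
```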

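The matching logic can be summarized as follows:

1. Highest score first: the det with the higher score gets to match gts first.
2. For each det, the gt candidates are the gts that are still unmatched plus the gts that are already matched but have crowd set to true.
3. The gts are traversed from front to back, and among the candidates the one with the largest IoU with the current det (provided that IoU reaches the IoU threshold) becomes the best match.
4. If the det has already made a regular match (a "reg match", meaning it matched a non-ignore gt; note that the condition is non-ignore, not non-crowd) and the traversal reaches the ignore = 1 part of the gt list, the loop breaks early and matching for this det ends.
5. When a matching pass ends (one pass is one det looking for a gt), the code records the id of the gt matched by the det (dtm), the id of the det matched by the gt (gtm), and the det's ignore status (dtIg, which is the ignore attribute of the gt it matched).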

def evaluateImg(self, imgId, catId, aRng, maxDet):
    '''
    perform evaluation for single category and image
    :return: dict (single image results)
    '''
    p = self.params
    if p.useCats:
        gt = self._gts[imgId,catId]
        dt = self._dts[imgId,catId]
    else:
        gt = [_ for cId in p.catIds for _ in self._gts[imgId,cId]]
        dt = [_ for cId in p.catIds for _ in self._dts[imgId,cId]]
    if len(gt) == 0 and len(dt) ==0:
        return None
 
 
    for g in gt:
        if g['ignore'] or (g['area']<aRng[0] or g['area']>aRng[1]):
            g['_ignore'] = 1
        else:
            g['_ignore'] = 0
 
    # sort dt highest score first, sort gt ignore last
    gtind = np.argsort([g['_ignore'] for g in gt], kind='mergesort')
    gt = [gt[i] for i in gtind]
    dtind = np.argsort([-d['score'] for d in dt], kind='mergesort')
    dt = [dt[i] for i in dtind[0:maxDet]]
    iscrowd = [int(o['iscrowd']) for o in gt]
    # load computed ious
    ious = self.ious[imgId, catId][:, gtind] if len(self.ious[imgId, catId]) > 0 else self.ious[imgId, catId]
 
    T = len(p.iouThrs)
    G = len(gt)
    D = len(dt)
    gtm  = np.zeros((T,G))
    dtm  = np.zeros((T,D))
    gtIg = np.array([g['_ignore'] for g in gt])
    dtIg = np.zeros((T,D))
    if not len(ious)==0:
        for tind, t in enumerate(p.iouThrs):
            for dind, d in enumerate(dt):
                # information about best match so far (m=-1 -> unmatched)
                iou = min([t,1-1e-10])
                m   = -1
                for gind, g in enumerate(gt):
                    # if this gt already matched, and not a crowd, continue
                    if gtm[tind,gind]>0 and not iscrowd[gind]:
                        continue
                    # if dt matched to reg gt, and on ignore gt, stop
                    if m>-1 and gtIg[m]==0 and gtIg[gind]==1:
                        break
                    # continue to next gt unless better match made
                    if ious[dind,gind] < iou:
                        continue
                    # if match successful and best so far, store appropriately
                    iou=ious[dind,gind]
                    m=gind
                # if match made store id of match for both dt and gt
                if m ==-1:
                    continue
                dtIg[tind,dind] = gtIg[m]
                dtm[tind,dind]  = gt[m]['id']
                gtm[tind,m]     = d['id']
    # set unmatched detections outside of area range to ignore
    a = np.array([d['area']<aRng[0] or d['area']>aRng[1] for d in dt]).reshape((1, len(dt)))
    dtIg = np.logical_or(dtIg, np.logical_and(dtm==0, np.repeat(a,T,0)))
    # store results for given image and category
    return {
            'image_id':     imgId,
            'category_id':  catId,
            'aRng':         aRng,
            'maxDet':       maxDet,
            'dtIds':        [d['id'] for d in dt],
            'gtIds':        [g['id'] for g in gt],
            'dtMatches':    dtm,
            'gtMatches':    gtm,
            'dtScores':     [d['score'] for d in dt],
            'gtIgnore':     gtIg,
            'dtIgnore':     dtIg,
        }
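For experimenting with this logic outside pycocotools, the matching loop can be reduced to a small standalone sketch for a single image, category, and IoU threshold. Everything here is an illustrative assumption: the name `match_detections`, the dict-based bookkeeping (instead of the `gtm`/`dtm` arrays), and the hand-written IoU table. Area ranges and `maxDet` are left out.

```python
def match_detections(dets, gts, ious, iou_thr=0.5):
    """Greedy COCO-style matching for one image, category and IoU threshold.

    dets: list of dicts with 'id' and 'score'
    gts:  list of dicts with 'id', 'ignore' (0/1) and 'iscrowd' (0/1)
    ious: ious[d][g] is the IoU between dets[d] and gts[g]
    Returns (dt_match, gt_match, dt_ignore), keyed by det id / gt id.
    """
    # Dets: highest score first. Gts: ignored ones last (stable sort).
    d_order = sorted(range(len(dets)), key=lambda i: -dets[i]["score"])
    g_order = sorted(range(len(gts)), key=lambda i: gts[i]["ignore"])

    dt_match, gt_match, dt_ignore = {}, {}, {}
    for di in d_order:
        best_iou = min(iou_thr, 1 - 1e-10)
        best = None
        for gi in g_order:
            g = gts[gi]
            # An already matched gt stays a candidate only if it is a crowd.
            if g["id"] in gt_match and not g["iscrowd"]:
                continue
            # Regular match found and we reached the ignored gts: stop.
            if best is not None and not gts[best]["ignore"] and g["ignore"]:
                break
            if ious[di][gi] < best_iou:
                continue
            best_iou = ious[di][gi]
            best = gi
        if best is None:
            continue
        dt_match[dets[di]["id"]] = gts[best]["id"]
        dt_ignore[dets[di]["id"]] = bool(gts[best]["ignore"])
        gt_match[gts[best]["id"]] = dets[di]["id"]
    return dt_match, gt_match, dt_ignore


gts = [
    {"id": 1, "ignore": 0, "iscrowd": 0},  # regular object
    {"id": 2, "ignore": 1, "iscrowd": 1},  # crowd region
]
dets = [
    {"id": 101, "score": 0.9},
    {"id": 102, "score": 0.8},
    {"id": 103, "score": 0.7},
    {"id": 104, "score": 0.6},
]
ious = [[0.8, 0.1], [0.7, 0.1], [0.0, 0.6], [0.0, 0.55]]
dt_match, gt_match, dt_ignore = match_detections(dets, gts, ious)
```

In this toy case det 102 is left unmatched because gt 1 is already taken by det 101, while dets 103 and 104 both match the crowd gt 2 and are both flagged as ignored.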

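COCO's matching procedure is quite meticulous. As the procedure above shows, the evaluation never allows more than one det to match the same gt whose crowd is False (see the check below), yet it does allow multiple dets to match a gt whose crowd is True.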

# if this gt already matched, and not a crowd, continue
if gtm[tind,gind]>0 and not iscrowd[gind]:
    continue

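The reason crowd gts get this leniency becomes clear from the definition of crowd in the original MSCOCO paper.

In short, a crowd object really is a dense cluster of things (a truckload of bananas, for example) that is hard to annotate individually, so the crowd label is used to lighten the annotation burden.

When cocoeval.py sets the ignore flag, the flag is determined entirely by the crowd attribute.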

# set ignore flag
for gt in gts:
    gt['ignore'] = gt['ignore'] if 'ignore' in gt else 0
    gt['ignore'] = 'iscrowd' in gt and gt['iscrowd']

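A gt with ignore set to True plays no part in the final AP computation, so a gt with crowd set to True indeed has no effect on the final AP.

The crowd attribute that the COCO evaluation code defines for gts is a sensible and carefully considered design, and an admirable one.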

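Back to the matching code. As just noted, matching allows multiple dets to match a gt whose crowd is True. For a gt whose ignore is True, however, the code says nothing about whether multiple dets may match it.

And this design is reasonable!

The ignore attribute exists essentially to exclude certain gts when AP is computed, whereas a crowd gt marks a dense cluster of objects, which is why multiple dets are allowed to match it.

ignore merely masks out certain gts during AP computation. For example, when an area range is set for the gts to be evaluated, it is enough to reassign the ignore attribute, setting ignore to True for gts outside the area range.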
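The area-range case can be sketched as a tiny helper that mirrors the `_ignore` assignment in evaluateImg. The helper name `effective_ignore` is made up, and the range table reflects what I understand to be the pycocotools defaults, so treat the exact numbers as an assumption.

```python
# Area ranges in pixels^2 (assumed to match the pycocotools defaults).
AREA_RANGES = {
    "all": (0, 1e5 ** 2),
    "small": (0, 32 ** 2),
    "medium": (32 ** 2, 96 ** 2),
    "large": (96 ** 2, 1e5 ** 2),
}


def effective_ignore(gt, area_range):
    """A gt is ignored if it is flagged ignore or lies outside the range."""
    lo, hi = area_range
    return bool(gt["ignore"]) or gt["area"] < lo or gt["area"] > hi


# Toy ground truths (hypothetical values).
gts = [
    {"id": 1, "area": 500, "ignore": 0},
    {"id": 2, "area": 5000, "ignore": 0},
    {"id": 3, "area": 20000, "ignore": 1},
]
small_flags = [effective_ignore(g, AREA_RANGES["small"]) for g in gts]
medium_flags = [effective_ignore(g, AREA_RANGES["medium"]) for g in gts]
all_flags = [effective_ignore(g, AREA_RANGES["all"]) for g in gts]
```

The same gt list thus yields a different set of ignored gts for each area range, without touching the crowd attribute at all.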

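Earlier, to remove the influence of crowd gts on AP, every gt with crowd set to True had its ignore set to True as well.

The meaning and purpose of the crowd and ignore attributes are, in fact, already reflected in their names.

So when may multiple dets match an ignored gt? That depends entirely on whether the gt's crowd is True and has nothing to do with the value of ignore.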

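For example, for a gt with ignore = True and crowd = False, matching allows at most one det to match it, and whether zero or one det matches makes no difference to the final AP. But if many dets all fire on this gt, AP is penalized: the surplus dets can only try to match other gts and, failing that, count as false positives.

For a gt with ignore = True and crowd = True, by contrast, matching allows any number of dets to match it, and none of them affects the final AP.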
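The two scenarios can be checked with a few lines that follow how matched and ignored detections are turned into TP/FP counts: a det counts as TP if it is matched and not ignored, and as FP if it is unmatched and not ignored. The helper `count_tp_fp` is a hypothetical name, and the match results are written out by hand under the assumption that three dets overlap the single gt, no other gt is available, and all dets fall inside the area range.

```python
def count_tp_fp(dt_matched, dt_ignore):
    """Count true and false positives from per-detection flags.

    dt_matched[i]: the i-th det was matched to some gt
    dt_ignore[i]:  the i-th det is ignored (it matched an ignored gt)
    """
    pairs = list(zip(dt_matched, dt_ignore))
    tp = sum(1 for m, ig in pairs if m and not ig)
    fp = sum(1 for m, ig in pairs if not m and not ig)
    return tp, fp


# Scenario A: gt has ignore=True, crowd=False.
# Only the top-scoring det matches (and is ignored); the other two stay
# unmatched and are therefore counted as false positives.
case_a = count_tp_fp([True, False, False], [True, False, False])

# Scenario B: gt has ignore=True, crowd=True.
# All three dets match the crowd gt and are all ignored.
case_b = count_tp_fp([True, True, True], [True, True, True])
```

Scenario A ends with two false positives and scenario B with none, which is exactly the penalty difference described above.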

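So the old habit of treating crowd as equivalent to ignore has to go.

crowd is a region-level attribute, while ignore is an instance-level one. A rigorous detection annotation spec should label the two independently: crowd to mark a densely clustered region, and ignore to skip an individual instance whose class label is unclear.

In current production work, however, only the ignore attribute is annotated, and in practice it is used as if it were crowd.

VOC and Wider Face do not define a crowd attribute either; when computing AP they simply use ignore as crowd. Their matching procedure also differs from COCO's.

Both VOC and Wider Face match each det to the single gt with the largest IoU, a procedure so simple that it hardly counts as matching at all.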
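For comparison, here is a minimal sketch of this simpler argmax-style matching, assuming the rows of the IoU table are detections already sorted by descending score. The function name `voc_style_match` and the data layout are invented for illustration, and the flag is called ignore to follow this article (VOC annotations name it `difficult`). It is not a copy of either toolkit's code.

```python
def voc_style_match(ious, gt_ignore, iou_thr=0.5):
    """Label each det as 'tp', 'fp' or 'ignored'.

    ious[d][g]:   IoU between det d (sorted by descending score) and gt g
    gt_ignore[g]: True if gt g should be ignored
    """
    detected = [False] * len(gt_ignore)
    labels = []
    for row in ious:
        if not gt_ignore:
            labels.append("fp")
            continue
        # Each det looks only at the single gt with the largest IoU.
        j = max(range(len(gt_ignore)), key=lambda g: row[g])
        if row[j] > iou_thr:
            if gt_ignore[j]:
                # Any number of dets may land on an ignored gt.
                labels.append("ignored")
            elif not detected[j]:
                detected[j] = True
                labels.append("tp")
            else:
                # Duplicate detection of an already detected gt.
                labels.append("fp")
        else:
            labels.append("fp")
    return labels


gt_ignore = [False, True]
ious = [[0.8, 0.0], [0.7, 0.0], [0.0, 0.9], [0.0, 0.6], [0.2, 0.3]]
labels = voc_style_match(ious, gt_ignore)
```

Unlike the COCO loop, a det whose best gt is already taken does not go on to consider other gts, and the ignored gt absorbs any number of dets, which is the "ignore used as crowd" behavior described above.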

Last modified: June 18, 2024


