2018 | Deng-Ping Fan, Cheng Gong, Yang Cao, Bo Ren, Ming-Ming Cheng, Ali Borji
The paper introduces E-measure (Enhanced-alignment measure), a novel evaluation measure for binary foreground maps (FMs). Traditional measures, such as IOU, F1/JI, and VQ, consider either pixel-level matching or image-level statistics but not both, which limits how well they evaluate FMs. The E-measure combines local pixel values with the image-level mean value in a single term, so it captures global statistics and local pixel matching jointly. The authors demonstrate the superiority of the E-measure over existing measures on four popular datasets using five meta-measures: application ranking, demoting generic maps, random Gaussian noise maps, ground-truth switch, and human judgments. The E-measure improves on almost all of these meta-measures; in application ranking, for instance, the improvement over other popular measures ranges from 9.08% to 19.65%. The paper also introduces a new dataset, FMDatabase, which contains 555 human-ranked maps, to assess how well evaluation measures correlate with human judgments. The authors conclude by discussing the limitations of their metric and future work, including the potential to develop a new segmentation model and loss function based on the E-measure.
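To make the idea of combining local pixel values with the image-level mean concrete, here is a minimal sketch in plain Python. It is not taken from the summary above. It assumes the formulation usually given for the E-measure: each map is centered by subtracting its own mean (a "bias" value per pixel), the two centered maps are compared pixel by pixel through a normalized alignment term, that term is mapped to [0, 1] with the quadratic (1 + x)^2 / 4, and the result is averaged over all pixels. The function name `e_measure`, the nested-list input format, and the `eps` stabilizer are illustrative choices, and the sketch does not treat the degenerate case of an all-zero or all-one ground truth specially.

```python
def e_measure(fm, gt, eps=1e-8):
    """Sketch of an E-measure-style score for two binary maps.

    fm, gt: equally sized nested lists of 0/1 values (hypothetical input format).
    eps: small constant that avoids division by zero (an assumption of this sketch).
    """
    fm_flat = [float(v) for row in fm for v in row]
    gt_flat = [float(v) for row in gt for v in row]
    if not gt_flat or len(fm_flat) != len(gt_flat):
        raise ValueError("maps must be non-empty and have the same size")

    n = len(gt_flat)
    mu_fm = sum(fm_flat) / n  # image-level mean of the foreground map
    mu_gt = sum(gt_flat) / n  # image-level mean of the ground truth

    total = 0.0
    for f, g in zip(fm_flat, gt_flat):
        # Bias values: local pixel value minus the global mean.
        b_f = f - mu_fm
        b_g = g - mu_gt
        # Alignment term in [-1, 1]: +1 when the two bias values agree.
        align = 2.0 * b_g * b_f / (b_g * b_g + b_f * b_f + eps)
        # Quadratic enhancement maps the alignment to [0, 1].
        total += (1.0 + align) ** 2 / 4.0
    return total / n


gt = [[0, 0, 1, 1],
      [0, 0, 1, 1]]
inverted = [[1 - v for v in row] for row in gt]
partial = [[0, 0, 1, 0],
           [0, 0, 1, 1]]

score_perfect = e_measure(gt, gt)         # close to 1
score_inverted = e_measure(inverted, gt)  # close to 0
score_partial = e_measure(partial, gt)    # strictly between the two
```

Because every pixel is compared after subtracting the map's own mean, a single misclassified pixel changes both its local term and, through the mean, every other term. That is one way to read the claim that the measure captures global statistics and local matching in a single term.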