MACnet: Mask augmented counting network for class-agnostic counting

Tadhg McCarthy, John Jethro Virtusio, Jose Jaena Mari Ople, Daniel Stanley Tan, Divina Amalin, Kai Lung Hua*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review


Class-agnostic counting involves counting the instances of any user-defined class. It is usually phrased as a matching problem in which the model finds all matching objects in a query image given exemplar patches containing the target object. Typically, users define exemplar patches by placing bounding boxes around the target object. However, a bounding box inevitably captures both the target object (foreground) and its surrounding background. As a result, the model may unintentionally match patches that resemble the background, leading to an inaccurate count. Moreover, for objects poorly represented by a bounding box (e.g., non-axis-aligned, irregular, or non-rectangular shapes), the exemplar patch may contain a disproportionate amount of background relative to the foreground, leading to poor matches. In this paper, we propose to use segmentation masks to separate target objects from their background. We derive these segmentation masks from extreme points, which require no additional annotation effort from the user compared to annotating bounding boxes. Moreover, we design a module that learns the mask features as a residual to the object features, allowing the network to learn how to better incorporate the segmentation masks. Our model improves upon state-of-the-art methods by up to 3.7 MAE points on the FSC-147 benchmark dataset, showing the effectiveness of our masking approach.
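
The three ideas the abstract relies on can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the element-wise gating form, and the `scale` parameter are hypothetical choices made only for illustration. The sketch shows (1) that four extreme points already determine the bounding box, which is why they cost no extra annotation effort, (2) one simple way mask information could be added as a residual on top of object features, and (3) the standard definition of mean absolute error (MAE) used to report counting accuracy.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def bbox_from_extreme_points(points: Sequence[Point]) -> Tuple[float, float, float, float]:
    """Return (x_min, y_min, x_max, y_max) spanned by the extreme points.

    The leftmost, topmost, rightmost, and bottommost points of an object
    fully determine its axis-aligned bounding box.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def fuse_residual(features: Sequence[float], mask: Sequence[float], scale: float) -> List[float]:
    """Add a mask-dependent term on top of the unchanged object features.

    Assumption: the residual is modeled as scale * mask * feature. With
    scale = 0 the output equals the input features, which is the defining
    property of a residual connection. In a trained network this term
    would be produced by learned layers instead.
    """
    return [f + scale * m * f for f, m in zip(features, mask)]


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average absolute difference between predicted and true counts."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Extreme points in image coordinates: left, top, right, bottom.
box = bbox_from_extreme_points([(2, 5), (6, 1), (9, 4), (5, 8)])
fused = fuse_residual([1.0, 2.0, 3.0], [1.0, 0.0, 1.0], scale=0.5)
mae = mean_absolute_error([10, 20], [12, 17])
```

In this toy setting, positions where the mask is zero (background) pass through unchanged, while foreground positions receive the additional mask-dependent term.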

Original language: English
Pages (from-to): 75-80
Number of pages: 6
Journal: Pattern Recognition Letters
Publication status: Published - May 2023


  • Class-agnostic counting
  • Extreme points
  • Segmentation masks

