
Shape-based detection - An overview

This overview is still being written up.
A range of work in the contour-based paradigm achieves state-of-the-art performance for several object categories using contour information; see Figure 1 for an overview. The research falls into four main categories: (i) learning codebooks of contour fragments, (ii) approximating contours by piecewise segments, (iii) using local descriptions of the contour at selected interest points, or (iv) assigning entire edges to either foreground or background. Each work uses additional techniques, for example learned deformation models, sophisticated cost functions or probabilistic grouping.

Common evaluation datasets:

The first is the ETHZ Shape Classes dataset from Vittorio Ferrari. It consists of five object classes and a total of 255 images. All classes contain significant intra-class variations and scale changes. The images sometimes contain multiple instances of a category and have a large amount of background clutter.
Download ETHZ Shape Classes dataset v1.2

The second is the INRIA horses dataset from Frédéric Jurie and Vittorio Ferrari, which consists of 170 images with one or more horses in side view at several scales against cluttered backgrounds, and 170 images without horses.
Download INRIA horses dataset v1.03
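
The result tables on this page report detection rates under an overlap protocol (for example PASCAL 50%) at a given number of false positives per image (FPPI). The following is a minimal sketch of how such numbers can be computed, assuming axis-aligned bounding boxes given as (x_min, y_min, x_max, y_max) and scored detections. The function names and the greedy one-to-one matching are illustrative assumptions, not the evaluation code of any cited work.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max)."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


def is_correct_detection(detection, ground_truth, min_overlap=0.5):
    """PASCAL-style test: the overlap (IoU) must exceed min_overlap."""
    return iou(detection, ground_truth) > min_overlap


def detection_rate_and_fppi(images, score_threshold, min_overlap=0.5):
    """images: list of (detections, ground_truths) per image, where
    detections is a list of (score, box) and ground_truths a list of boxes.
    Returns (detection rate, false positives per image) at the threshold."""
    true_pos = false_pos = total_gt = 0
    for detections, ground_truths in images:
        total_gt += len(ground_truths)
        unmatched = list(ground_truths)
        # Visit detections from highest to lowest score.
        for score, box in sorted(detections, key=lambda d: d[0], reverse=True):
            if score < score_threshold:
                continue
            match = next(
                (g for g in unmatched if is_correct_detection(box, g, min_overlap)),
                None,
            )
            if match is None:
                false_pos += 1  # no unmatched ground truth overlaps enough
            else:
                unmatched.remove(match)  # each ground truth counts only once
                true_pos += 1
    rate = true_pos / total_gt if total_gt else 0.0
    fppi = false_pos / len(images) if images else 0.0
    return rate, fppi
```

Sweeping score_threshold and reading off the detection rate where the FPPI reaches the desired value gives one entry of the form "rate @ FPPI".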

Learning codebooks:

Shotton et al. [1] and Opelt et al. [2] concurrently proposed to construct shape fragments tailored to specific object classes. Both match a predefined fragment codebook to the query image by chamfer matching and then find detections with a star-shaped voting model. Because they rely on chamfer matching, both methods are sensitive to clutter and rotation. The major aspect of both approaches is to learn discriminative combinations of boundary parts as weak classifiers and to combine them by boosting into a strong detector.
Method | Dataset | Protocol | Detection rate | Notes
Shotton [1] | Cow detection | XYZ Overlap | ??% @ FPPI 1.0 | Chamfer+Boosting fragments
Opelt [2] | Cow detection | XYZ Overlap | ??% @ FPPI 1.0 | Chamfer+Boosting fragments
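
To make the chamfer matching step concrete, here is a minimal brute-force sketch. It assumes that both the fragment and the image edges are given as lists of (x, y) points and that only translation is searched; the function names are hypothetical. Practical implementations usually precompute a distance transform of the edge map instead of searching for the nearest edge point each time.

```python
import math


def chamfer_distance(template_points, edge_points, offset=(0, 0)):
    """Mean distance from each shifted template point to its nearest edge point."""
    ox, oy = offset
    total = 0.0
    for tx, ty in template_points:
        total += min(
            math.hypot(tx + ox - ex, ty + oy - ey) for ex, ey in edge_points
        )
    return total / len(template_points)


def best_offset(template_points, edge_points, candidate_offsets):
    """Candidate translation with the lowest chamfer distance."""
    return min(
        candidate_offsets,
        key=lambda o: chamfer_distance(template_points, edge_points, o),
    )
```

The sensitivity to clutter is visible in this formulation: any stray edge point close to a template point lowers the distance, whether or not it belongs to the object.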

Piecewise approximation:

Ferrari et al. [15, 3] build groups of approximately straight adjacent segments (kAS) that work together as a team to match the model parts. The segments are matched within a contour segmentation network, which exploits connectedness to provide combinations of multiple simple segments. In later work they also show how to automatically learn codebooks [3] and how to learn category shape models from cropped training images [16]. In a verification step they use thin-plate-spline (TPS)-based matching to accurately localize the object boundary. Similarly, Ravishankar et al. [4] use short segments to approximate the outer contour of objects. In contrast to straight segments, they prefer slightly curved segments because these discriminate better between segments. They further use a sophisticated scoring function that takes local deformations in scale and orientation into account. However, they break the reference template at high-curvature points to be able to match parts, again resulting in disjoint approximations of the actual contour. Their verification stage uses the gradient maps as the underlying basis for object detection, avoiding error-prone edge detection.
Method | Dataset | Protocol | Detection rate | Notes
Ferrari [15] | ETHZ | PASCAL 50% | ??% @ FPPI 1.0 | Contour segmentation network
Ferrari [16] | ETHZ | PASCAL 50% | ??% @ FPPI 1.0 | Contour segmentation network
Ferrari [16] | Caltech | PASCAL 50% | ??% @ FPPI 1.0 | Learn models
Ravishankar [4] | ETHZ | PASCAL 50% | ??% @ FPPI 1.0 | Curved segments
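
Breaking a contour at high-curvature points, as described above, can be sketched as follows. The sketch assumes the contour is an ordered polyline of (x, y) points and measures curvature by the turning angle at each interior point. The 30 degree threshold and the function names are illustrative assumptions, not values taken from the cited work.

```python
import math


def turning_angle(prev_pt, pt, next_pt):
    """Absolute change of direction (radians) at pt along the polyline."""
    a_in = math.atan2(pt[1] - prev_pt[1], pt[0] - prev_pt[0])
    a_out = math.atan2(next_pt[1] - pt[1], next_pt[0] - pt[0])
    # Wrap the difference to [-pi, pi) before taking the magnitude.
    diff = (a_out - a_in + math.pi) % (2 * math.pi) - math.pi
    return abs(diff)


def split_at_high_curvature(contour, max_turn=math.radians(30)):
    """Split an ordered polyline into segments at points that turn sharply."""
    if len(contour) < 3:
        return [list(contour)]
    segments = []
    current = [contour[0]]
    for i in range(1, len(contour) - 1):
        current.append(contour[i])
        if turning_angle(contour[i - 1], contour[i], contour[i + 1]) > max_turn:
            segments.append(current)
            current = [contour[i]]  # the break point starts the next segment
    current.append(contour[-1])
    segments.append(current)
    return segments
```

Each resulting segment is matched on its own, which is why such a decomposition yields a disjoint approximation of the contour.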

Shape-based interest points:

This category uses descriptors to capture and match coarse descriptions of the local shape around interest points. Leordeanu et al. [17] use simple features based on normal orientations and pairwise interactions between them to learn and detect object models in images. Their simple features are represented as pairwise relations in category-specific models that can learn hundreds of parts. Berg et al. [14] formulate the object detection problem as a deformable shape matching problem. However, they require hand-segmented training images and do not learn deformation models in training. Along the same line are the works of Maji and Malik [5] and Ommer and Malik [6], which match geometric blur features to training images. The former use a max-margin framework to learn discriminative weights for each feature type to ensure maximal discrimination during the voting stage. The latter provide an interesting adaptation of the usual Hough-style center voting: Ommer and Malik transform the discrete scale voting to a continuous domain where the scale is another unknown in the voting space. Instead of multiple discrete center vectors, they formulate the votes as lines and cluster these to find scale-coherent hypotheses. Verification uses a fast HOG-based intersection kernel SVM (IKSVM).
Method | Dataset | Protocol | Detection rate | Notes
Leordeanu [17] | ETHZ | PASCAL 50% | ??% @ FPPI 1.0 | Simple pairwise features
Berg [14] | ETHZ | PASCAL 50% | ??% @ FPPI 1.0 | Deformable shapes
Maji [5] | ETHZ | PASCAL 50% | ??% @ FPPI 1.0 | Max-margin
Ommer [6] | ETHZ | PASCAL 50% | ??% @ FPPI 1.0 | Voting lines
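
The idea of votes as lines in a continuous scale space can be illustrated with a small sketch. It assumes that each matched feature at image position p with model offset d predicts the object center c(s) = p + s*d for an unknown scale s, and that all given votes belong to one object. Instead of the clustering used by Ommer and Malik, this simplified stand-in computes the single scale and center that minimize the squared spread of the predicted centers; the function name is hypothetical.

```python
def scale_coherent_hypothesis(votes):
    """votes: list of ((px, py), (dx, dy)), each a line c(s) = p + s * d.
    Returns (center_x, center_y, scale) minimizing the summed squared
    distance between the predicted centers and their mean."""
    n = len(votes)
    mean_px = sum(p[0] for p, _ in votes) / n
    mean_py = sum(p[1] for p, _ in votes) / n
    mean_dx = sum(d[0] for _, d in votes) / n
    mean_dy = sum(d[1] for _, d in votes) / n

    # With centered positions q and offsets e, the spread is
    # sum ||q + s * e||^2, which is minimal at s = -sum(q.e) / sum(e.e).
    num = 0.0
    den = 0.0
    for (px, py), (dx, dy) in votes:
        qx, qy = px - mean_px, py - mean_py
        ex, ey = dx - mean_dx, dy - mean_dy
        num += qx * ex + qy * ey
        den += ex * ex + ey * ey
    if den == 0.0:
        raise ValueError("votes do not constrain the scale")
    scale = -num / den
    return (mean_px + scale * mean_dx, mean_py + scale * mean_dy, scale)
```

With discrete center voting, the same votes would have to be cast once per candidate scale; here the scale falls out of the fit together with the center.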

Figure / ground assignment:

Similar in concept but not in practice are the works of Zhu et al. [8] and Lu et al. [9]. They cast the problem as figure / ground labeling of edges and decide, for a rather small set of edges, which belong to the foreground and which are background clutter. By this labeling they reduce the clutter and focus on salient edges in their verification step. Lu et al. use particle filters under static observation to simultaneously group and label the edge contours. They use a new shape descriptor based on angles to decide edge contour similarity. Zhu et al. use control points along the reference contour to find possible edge contour combinations and then solve cost functions efficiently using linear programming. They find a maximal matching between a set of query image contours and a set of salient contour parts from the reference template, which was manually split into a set of reference segments. Both assume that entire edge contours are matched to the reference sets and require long salient contours. Recent work by Bai et al. [7] is also based on a background clutter removal stage, called shapeband. Shapeband is a new type of sliding window adapted to the shape of objects. It is used to provide location hypotheses and to select edge contour candidates. However, in their runtime-intensive verification step they iteratively compute shape context descriptors [18] to select similar edge contours. Another recent approach by Gu et al. [19] proposes to use regions instead of local interest points or contours to better estimate the location and scale of objects.
Method | Dataset | Protocol | Detection rate | Notes
Zhu [8] | ETHZ | Ferrari 20% | ??% @ FPPI 1.0 | Set2Set
Lu [9] | ETHZ | Ferrari 20% | ??% @ FPPI 1.0 | ParticleFilter
Bai [7] | ETHZ | PASCAL 50% | ??% @ FPPI 1.0 | Shapeband
Gu [19] | ETHZ | Ferrari 50% | ??% @ FPPI 1.0 | Regions

Efficient Partial Contour Matching

We place our method in between the aforementioned approaches. We use edge contours in the query image and match them at any length, from short contour segments up to full region boundaries, using partial shape matching. In such a setting the similarity to the prototype shape decides the complexity and length of the considered contours.

We propose a method for object category localization by partially matching edge contours to a single shape prototype of the category. Previous work in this area either relies on piecewise contour approximations, requires meaningful supervised decompositions, or matches coarse shape-based descriptions at local interest points. Our method avoids error-prone pre-processing steps by using all obtained edges in a partial contour matching setting. The matched fragments are efficiently summarized and aggregated to form location hypotheses. The efficiency and accuracy of our edge-fragment-based voting step yield high-quality hypotheses in low computation time.
Method | Dataset | Protocol | Detection rate | Notes
Riemenschneider [X] | ETHZ | PASCAL 50% | 90.5% @ FPPI 0.4 | 5 seconds, no approximations
Riemenschneider [X] | INRIA | PASCAL 50% | 83.7% @ FPPI 1.0 | 5 seconds, no approximations
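
The notion of matching contours "at any length" can be illustrated with a deliberately simplified sketch. It assumes that each contour has already been sampled and reduced to a sequence of one scalar descriptor value per point (for example a turning angle), and it searches for the longest run of consecutive values that agree within a tolerance. The descriptor, the tolerance and the function name are assumptions made for illustration; this is not the matching algorithm of the method described here.

```python
def longest_partial_match(query, prototype, tolerance=0.1):
    """Longest run of consecutive descriptor values that agree within tolerance.
    Returns (length, query_start, prototype_start)."""
    best = (0, 0, 0)
    prev = [0] * (len(prototype) + 1)
    for i in range(1, len(query) + 1):
        curr = [0] * (len(prototype) + 1)
        for j in range(1, len(prototype) + 1):
            if abs(query[i - 1] - prototype[j - 1]) <= tolerance:
                # Extend the run that ended at the previous pair of points.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best[0]:
                    best = (curr[j], i - curr[j], j - curr[j])
        prev = curr
    return best
```

A short query fragment and a full object boundary are handled by the same routine: the length of the returned match simply grows with the similarity to the prototype.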


The following video for efficient partial contour matching illustrates the concept of our method:



Publications

  1. Using Partial Edge Contour Matches for Efficient Object Category Localization (Paper as PDF)
    Hayko Riemenschneider, Michael Donoser and Horst Bischof
    Proceedings of European Conference on Computer Vision (ECCV), 2010
  2. Efficient Partial Shape Matching of Outer Contours (Paper as PDF)
    Michael Donoser, Hayko Riemenschneider and Horst Bischof
    Proceedings of Asian Conference on Computer Vision (ACCV), 2009