Large-scale 3D Shape Retrieval
from ShapeNet Core55

3D content is becoming increasingly prevalent and important in everyday life. With commodity depth sensors, anyone can easily scan 3D models from the real world, and better modeling tools allow designers to produce 3D models with less effort. With the advent of virtual reality, the demand for high-quality 3D models will only increase. This growing availability of 3D models requires scalable and efficient algorithms to manage and analyze them. A key research problem is the retrieval of relevant 3D models, a task the community has been actively working on for more than a decade. However, existing algorithms are usually evaluated on datasets with only thousands of models, even though millions of 3D models are now available on the Internet. Thanks to the efforts of the ShapeNet [1] team, we can now use a much larger dataset of 3D models to develop and evaluate new algorithms. In this track, we aim to evaluate the performance of 3D shape retrieval methods on a subset of the ShapeNet dataset.


The full track report is now available HERE.

Answers to participant questions, and clarifications about the submission process:

Q1: Which are the query models and the target database models (are they the test models?)
A1: Please treat each test model as a query model and all of the models in the test set (including the model itself) as the target retrieval database. This is consistent with the approach taken in previous years of the competition. Please submit result zip archives for the training, validation and test model sets (both normal and perturbed versions of each). Follow the instructions provided at the competition webpage for packaging and result format. In total you should submit six result sets (train/val/test x normal/perturbed).
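For concreteness, the bookkeeping in A1 might be sketched as follows. This is a minimal Python illustration only: the `similarity` function and the result-set naming are hypothetical, and the authoritative packaging and file format are those specified on the competition webpage.

```python
from itertools import product

# The three splits and two variants requested in A1.
SPLITS = ["train", "val", "test"]
VARIANTS = ["normal", "perturbed"]

def result_sets():
    """Enumerate the six result sets to submit (split x variant)."""
    return [f"{split}_{variant}" for split, variant in product(SPLITS, VARIANTS)]

def ranked_list(query_id, model_ids, similarity):
    """Rank every model in the split (including the query model itself)
    by descending similarity to the query, as described in A1."""
    return sorted(model_ids, key=lambda m: similarity(query_id, m), reverse=True)
```

Since the query model is part of the retrieval database, a reasonable method should typically rank it first in its own list.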

Q2: There are two empty model files: train/04090263/model_004004.obj and test/model_039231.obj (and their counterparts in the _perturbed sets). How should they be handled?
A2: Please ignore/discard these models. They will not be considered in the contest.

Q3: I noticed that there are inconsistencies in the models provided in the zip files and the corresponding annotation csv files. What should I do?
A3: In some cases, the zip files contain extra models, and the annotation csv files list models that are not in the zip files. This is due to discrepancies when deduplicating the model data across different synsets. Please ignore any such missing models -- we apologize for any confusion these discrepancies may have caused. Treat the annotation csv files as the canonical lists of models and model synset assignments for the purposes of the contest and evaluation.
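A defensive way to apply A3 is to intersect the models found on disk with the ids listed in the annotation csv, keeping the csv as the single source of truth. A sketch, assuming the csv has a header row with a `modelId` column as described in the dataset section:

```python
import csv

def canonical_model_ids(csv_path):
    """Read an annotation csv and return the canonical set of model ids."""
    with open(csv_path, newline="") as f:
        return {row["modelId"] for row in csv.DictReader(f)}

def reconcile(models_on_disk, canonical_ids):
    """Keep only models that appear in the annotation csv. Extra models
    in the zip and csv entries missing from the zip are both dropped."""
    return sorted(set(models_on_disk) & set(canonical_ids))
```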

Q4: How should the method description be submitted? Is there a template?
A4: Please write a report describing your method and its implementation. The maximum length is one page, with at most two figures (counted toward the page limit). There is no need to include test result details -- we will compute the evaluation statistics for all participants. We require both a PDF file and the source LaTeX files. Please use the Eurographics 2016 LaTeX template (link). Include the names and affiliations of all team members when you submit your method description. For examples, please refer to reports from previous SHREC competitions, such as SHREC 2014 Large Scale Comprehensive 3D Shape Retrieval.

Q5: How and when should the full submission (including method writeup and results) be made?
A5: Package your results as specified in A1 and prepare your method writeup as specified in A4. Then create a single zip archive containing all of your files and compute an MD5 checksum for it. Send the MD5 checksum and a download link for the archive to the organizers by Wednesday, March 2nd, 11:59PM UTC. We will confirm each submission once it has been successfully received. Note that we are extending the submission deadline by a couple of days due to demand. Please note that this is a hard deadline; we cannot accept any later submissions since we need to run and compile all participant results.
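The checksum itself can be computed with any standard tool (e.g. `md5sum` on Linux); an equivalent sketch using Python's standard `hashlib`, reading in chunks so a multi-gigabyte archive need not fit in memory:

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, streaming it in 1 MiB chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```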

Q6: Where should participants contact the organizers about questions for the fastest response?
A6: Please send email to


We use the ShapeNetCore subset of ShapeNet, which contains about 51,300 3D models across 55 common categories, each subdivided into several subcategories. We created a 70%/10%/20% training/validation/test split from this dataset. Models are provided in OBJ format, and two dataset versions are available: a consistently aligned ("regular") dataset, and a more challenging version in which models are perturbed by random rotations. Category and subcategory labels are provided for training and validation models as comma-separated files with a header row specifying the meaning of each column: modelId, synsetId (category label) and subSynsetId (subcategory label). Download links:
Training Models (8.5GB) | Training Models Perturbed (9.3GB) | Training Model Labels
Validation Models (1.2GB) | Validation Models Perturbed (1.3GB) | Validation Model Labels
Test Models (2.5GB) | Test Models Perturbed (2.7GB)
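The label files described above can be loaded in a few lines of Python. A sketch, assuming the header columns are named `modelId`, `synsetId`, and `subSynsetId` as stated:

```python
import csv
from collections import defaultdict

def load_labels(csv_path):
    """Map modelId -> (synsetId, subSynsetId) from an annotation csv
    with a header row."""
    with open(csv_path, newline="") as f:
        return {row["modelId"]: (row["synsetId"], row["subSynsetId"])
                for row in csv.DictReader(f)}

def by_category(labels):
    """Group model ids by category (synsetId)."""
    groups = defaultdict(list)
    for model_id, (synset_id, _) in labels.items():
        groups[synset_id].append(model_id)
    return dict(groups)
```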

Procedure and Schedule


Category and subcategory labels are given for the training and validation splits of the dataset. The test set labels will be used for evaluation as described below (subcategory labels are used only to establish a more challenging graded relevance for the NDCG metric). Each participant will submit a set of ranked retrieval lists using each test set model as the query. The ranked list format and submission procedure are described in more detail HERE. Ranked lists should order retrieved test models by similarity to the query test model. The ranked lists are evaluated against the ground truth category and subcategory annotations of the test set. A set of standard information retrieval evaluation metrics is used:

The first three metrics will be evaluated on binary in-category vs. out-of-category relevance, whereas the NDCG metric will use a graded relevance: 3 for a perfect category and subcategory match between query and retrieved model, 2 when the categories match and the subcategory is identical to the category (i.e., the model carries no finer subcategory label), 1 for a correct category with a sibling subcategory, and 0 for no match.
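The graded relevance above can be written down directly, together with a standard NDCG computation using a log2 discount. This is a sketch of one reading of the grading rules, not the organizers' evaluation code; models are represented here as hypothetical (category, subcategory) pairs:

```python
import math

def relevance(query, retrieved):
    """Graded relevance between a query and a retrieved model, each given
    as a (category, subcategory) pair, following the grading above."""
    q_cat, q_sub = query
    r_cat, r_sub = retrieved
    if q_cat != r_cat:
        return 0                             # no category match
    if q_sub == r_sub:
        # 2 if the shared "subcategory" is just the category itself,
        # 3 for a perfect match on a genuine subcategory
        return 2 if q_sub == q_cat else 3
    return 1                                 # same category, sibling subcategory

def dcg(gains):
    """Discounted cumulative gain with a log2 position discount."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg(gains):
    """DCG normalized by the DCG of the ideal (sorted) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0
```

A ranked list whose gains are already in descending order scores an NDCG of exactly 1.0.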
Macro-averaged versions of the above metrics will be used to give an unweighted average over the entire dataset. Micro-averaged versions will also be used to adjust for model category sizes, giving a representative performance metric averaged across categories. The organizers provide evaluation code for computing all of these metrics HERE.
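The two averaging schemes can be sketched as follows. The function names here are neutral and descriptive rather than official: one takes a plain mean over all query models (the unweighted average over the dataset), the other averages per-category means so that each category contributes equally regardless of its size:

```python
from collections import defaultdict

def average_over_models(scores):
    """Plain mean over every query model; large categories contribute
    proportionally more models. scores maps query id -> metric value."""
    return sum(scores.values()) / len(scores)

def average_over_categories(scores, category_of):
    """Mean of per-category means: each category contributes equally,
    adjusting for differences in category size."""
    per_cat = defaultdict(list)
    for query_id, score in scores.items():
        per_cat[category_of[query_id]].append(score)
    means = [sum(v) / len(v) for v in per_cat.values()]
    return sum(means) / len(means)
```

The two numbers diverge exactly when performance differs between small and large categories, which is why the evaluation reports both.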


The full track report document is available HERE. Below is a summary table of evaluation results and precision-recall plots for all participating teams and methods, on all the competition datasets. The results were computed as specified in the evaluation section, using the ranked list files submitted by each participating team; they can be regenerated using the Evaluator code. The full results data visualized below are provided in CSV format in a zip archive HERE.

Evaluation metrics summary table

Precision-recall plots



Organizers

  • Manolis Savva - Stanford University
  • Fisher Yu - Princeton University
  • Hao Su - Stanford University

Advisory Board

  • Leonidas Guibas - Stanford University
  • Pat Hanrahan - Stanford University
  • Silvio Savarese - Stanford University
  • Qixing Huang - Toyota Technological Institute at Chicago
  • Jianxiong Xiao - Princeton University
  • Thomas Funkhouser - Princeton University


[1] Chang et al., "ShapeNet: An Information-Rich 3D Model Repository", arXiv:1512.03012, 2015.
[2] Wu et al., "3D ShapeNets: A Deep Representation for Volumetric Shapes", CVPR 2015.
[3] Shilane et al., "The Princeton Shape Benchmark", Shape Modeling International, June 2004.