Object Detection Evaluation


The DETRAC Object Detection Benchmark consists of a training set (DETRAC-train: 83,791 frames, 577,899 bounding boxes) and a testing set (DETRAC-test: 56,340 frames, 632,270 bounding boxes). All images are 24-bit color JPEGs at a resolution of 960×540. Specifically, we annotated 8,250 vehicles in the benchmark: 5,936 vehicles in the DETRAC-train set ("car": 5,177, "bus": 106, "van": 610, "others": 43) and 2,314 vehicles in the DETRAC-test set ("car": 1,961, "bus": 199, "van": 123, "others": 31). All vehicles in each video frame, except those in the "don't care" regions, are annotated and evaluated in our benchmark. Every 10th frame of the video sequences in the DETRAC-train set is used for detector training. The precision vs. recall (PR) curve is used for object detection evaluation. Our development kit provides details about the data format as well as MATLAB / C++ utility functions for reading and writing the annotation files.
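The every-10th-frame training sampling can be sketched as below. This is an illustrative guess at the selection logic, not code from the official development kit; the function name and its inputs are assumptions.

```python
def training_frames(frame_ids, step=10):
    """Select every `step`-th frame of a sequence for detector training.

    Illustrative sketch only: the official DETRAC dev kit defines the
    actual sampling; `frame_ids` is assumed to be the sequence's frame list.
    """
    return [f for i, f in enumerate(sorted(frame_ids)) if i % step == 0]
```

For example, for a 25-frame sequence numbered 1-25, this keeps frames 1, 11, and 21.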


This demo video illustrates the ground-truth labeling for vehicle detection in the DETRAC dataset. Vehicles are highlighted by red bounding boxes. Black opaque regions mark general background and are ignored in the benchmark. The weather condition, camera status, and vehicle density are shown in the bottom left of each frame.

The demo video may take a few seconds to load.

Evaluation Protocol

We use the precision vs. recall (PR) curve for object detection evaluation. The PR curve is generated by varying the score threshold of an object detector to produce different precision and recall values. Detectors are evaluated per frame, following the KITTI-D benchmark. A detection is counted as a hit if the overlap (intersection-over-union) between the detected bounding box and a ground-truth bounding box is at least 0.7; otherwise it is a miss. The average precision (AP) score, i.e., the area under the PR curve, summarizes overall performance: a larger AP score indicates a better object detection algorithm. The performance of the evaluated detection algorithms is presented on the result page.
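As a concrete reference, the protocol above (greedy per-frame matching at IoU ≥ 0.7, then AP as the area under the PR curve) can be sketched in Python. This is a minimal illustration, not the official MATLAB/C++ dev-kit code: it omits "don't care" region handling, and all function names and data layouts are assumptions.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def pr_curve(detections, gt_per_frame, iou_thr=0.7):
    """detections: list of (frame, score, box); gt_per_frame: {frame: [box, ...]}.
    Returns precision and recall values, one per detection, in score order."""
    detections = sorted(detections, key=lambda d: -d[1])
    matched = {f: [False] * len(boxes) for f, boxes in gt_per_frame.items()}
    n_gt = sum(len(boxes) for boxes in gt_per_frame.values())
    tp = fp = 0
    precisions, recalls = [], []
    for frame, _score, box in detections:
        # greedily match to the best-overlapping unmatched ground truth
        best_iou, best_j = 0.0, -1
        for j, gt_box in enumerate(gt_per_frame.get(frame, [])):
            o = iou(box, gt_box)
            if o > best_iou and not matched[frame][j]:
                best_iou, best_j = o, j
        if best_iou >= iou_thr:
            matched[frame][best_j] = True  # hit: IoU meets the 0.7 threshold
            tp += 1
        else:
            fp += 1                        # miss: duplicate or low overlap
        precisions.append(tp / (tp + fp))
        recalls.append(tp / n_gt)
    return precisions, recalls

def average_precision(precisions, recalls):
    """Step-wise area under the PR curve."""
    ap, prev_r = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_r)
        prev_r = r
    return ap
```

For instance, with two ground-truth boxes in one frame, one perfect detection plus one false positive gives precisions [1.0, 0.5], recalls [0.5, 0.5], and AP 0.5.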


  • Dataset:
  • Annotations:
  • Baseline methods:
    • DPM: P. F. Felzenszwalb, R. B. Girshick, D. A. McAllester, and D. Ramanan. Object detection with discriminatively trained part-based models. TPAMI, 32(9):1627-1645, 2010.
    • ACF: P. Dollár, R. Appel, S. Belongie, and P. Perona. Fast feature pyramids for object detection. TPAMI, 36(8):1532-1545, 2014.
    • R-CNN: R. B. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In CVPR, pages 580-587, 2014.
    • CompACT: Z. Cai, M. Saberian, and N. Vasconcelos. Learning complexity-aware cascades for deep pedestrian detection. In ICCV, 2015.

Related Datasets

  • Caltech Pedestrian Detection Benchmark: including 10 hours of video with 350,000 annotated pedestrian bounding boxes, captured from a moving vehicle with the driver's viewpoint.
  • KITTI Detection Benchmark: including 14,999 frames with a total of 80,256 labeled objects, captured from a moving vehicle with the driver's viewpoint.
  • MIT Street Scene: Each image was taken with a DSC-F717 camera in and around Boston, MA, and was then hand-labeled with polygons surrounding each instance of 9 object categories: cars, pedestrians, bicycles, buildings, trees, skies, roads, sidewalks, and stores.
  • KAIST Benchmark: A multispectral pedestrian detection benchmark providing well-aligned color-thermal image pairs captured by beam-splitter-based special hardware.
  • CVC-ADAS: A series of pedestrian sequences including pedestrian videos acquired on-board, virtual-world pedestrians, and occluded pedestrians.
  • ETHZ: A series of video sequences with annotated pedestrians, captured from a stroller in urban scenes.
  • PASCAL VOC: Diverse object views and poses in static images.
  • ImageNet: An image database consisting of 14,197,122 images.


If you use this dataset in your research, please include the following citation in any published results.

@article{wen2015detrac,
  author    = {Longyin Wen and Dawei Du and Zhaowei Cai and Zhen Lei and Ming{-}Ching Chang and
               Honggang Qi and Jongwoo Lim and Ming{-}Hsuan Yang and Siwei Lyu},
  title     = { {DETRAC:} {A} New Benchmark and Protocol for Multi-Object Detection and Tracking},
  journal   = {CoRR},
  volume    = {abs/1511.04136},
  year      = {2015}
}