The DETRAC Object Detection Benchmark consists of a training set (DETRAC-train: 83,791 frames, 577,899 bounding boxes) and a testing set (DETRAC-test: 56,340 frames, 632,270 bounding boxes). All images are 24-bit color JPEGs at a resolution of 960×540. Specifically, we annotated 8,250 vehicles in the benchmark: 5,936 vehicles ("car": 5,177, "bus": 106, "van": 610, "others": 43) in the DETRAC-train set and 2,314 vehicles ("car": 1,961, "bus": 199, "van": 123, "others": 31) in the DETRAC-test set. All vehicles in a video frame, except those inside "don't care" regions, are annotated and evaluated in our benchmark. Every 10th frame of each video sequence in the DETRAC-train set is used for detector training. Object detection is evaluated with the precision vs. recall (PR) curve. Our development kit documents the data format and provides MATLAB / C++ utility functions for reading and writing the annotation files.
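For readers unfamiliar with PR curves, the following is a minimal Python sketch of how such a curve can be traced from scored detections. It is not the development kit's code (the kit ships MATLAB / C++ utilities). The function name `pr_curve` and its inputs are hypothetical. The sketch assumes each detection has already been matched against the ground truth and labeled as a true or false positive, and that detections falling inside "don't care" regions were discarded beforehand; the matching rule itself is not specified here.

```python
from typing import List, Sequence, Tuple


def pr_curve(
    scores: Sequence[float],
    is_tp: Sequence[bool],
    num_gt: int,
) -> List[Tuple[float, float]]:
    """Return (recall, precision) points, one per detection.

    Hypothetical helper for illustration only.
    scores : detector confidence for each detection
    is_tp  : True if that detection was matched to a ground-truth box
             (matching is assumed to have been done already)
    num_gt : number of annotated ground-truth boxes being evaluated
    """
    if len(scores) != len(is_tp):
        raise ValueError("scores and is_tp must have the same length")
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")

    # Visit detections from most to least confident; each step corresponds
    # to lowering the score threshold enough to admit one more detection.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    tp = 0
    fp = 0
    points: List[Tuple[float, float]] = []
    for i in order:
        if is_tp[i]:
            tp += 1
        else:
            fp += 1
        recall = tp / num_gt
        precision = tp / (tp + fp)
        points.append((recall, precision))
    return points


# Toy data (made up for this example, not taken from the benchmark):
# four detections, four ground-truth vehicles, one false positive.
toy_scores = [0.9, 0.8, 0.7, 0.6]
toy_is_tp = [True, False, True, True]
curve = pr_curve(toy_scores, toy_is_tp, num_gt=4)
```

Plotting precision against recall for the returned points gives the PR curve; with the toy data, recall ends at 0.75 because one of the four ground-truth vehicles is never detected.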