Video Tracking 2021


  • Classic object tracking

        • Classic feature detection (SIFT and SURF), combined with a machine learning algorithm such as KNN or SVM for classification, or with a descriptor matcher such as FLANN for object detection.

        • Kalman filtering, sparse and dense optical flow.

        • Example: Simple Online and Realtime Tracking (SORT), which combines the Hungarian algorithm (for data association) with a Kalman filter (for motion prediction).

  • Single object tracking (SOT) has been a hot topic over the last decade. Early visual tracking methods relied on extracting hand-crafted features of candidate target regions, and used matching algorithms or hand-crafted discriminative classifiers to generate tracking results.

  • Multi-object tracking (MOT) aims to recover the trajectories of objects in video sequences. It is an important problem in computer vision with many applications, such as surveillance, activity analysis, and sports video analysis.

  • Video object detection datasets. The video object detection task aims to detect objects of different categories in video sequences.

  • Multi-object tracking datasets

  • Large-scale multi-class multi-object tracking benchmark datasets

  • The VisDrone dataset is captured in various unconstrained scenes and focuses on four core problems in computer vision, i.e., image object detection, video object detection, single object tracking, and multi-object tracking.

  • The accuracy of detection methods suffers from degraded object appearances in videos, such as motion blur, pose variations, and video defocus. Exploiting temporal coherence and aggregating features across consecutive frames may be two effective ways to handle this issue.

    • Temporal coherence. A feasible way to exploit temporal coherence is to use object trackers.

    • Feature aggregation. Aggregating features across consecutive frames is also a useful way to improve detection performance.

  • List of Datasets

    • MOT20

    • KITTI Tracking

    • MOTChallenge 2015

    • UA-DETRAC Tracking

    • DukeMTMC

    • Campus

    • MOT17


    • VisDrone
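
The SORT-style association step mentioned under classic object tracking can be sketched roughly as follows. This is a minimal illustration, not the SORT implementation: a constant-velocity shift stands in for the Kalman filter prediction, and an exhaustive search over permutations stands in for the Hungarian algorithm (it finds the same optimal assignment, but is only practical for a handful of boxes). The function names, the (x1, y1, x2, y2) box format, and the 0.3 IoU threshold are assumptions made for this example.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def predict(box, velocity):
    """Shift a box by a per-frame velocity (dx, dy).

    Stand-in for the Kalman filter prediction step (assumption).
    """
    dx, dy = velocity
    return (box[0] + dx, box[1] + dy, box[2] + dx, box[3] + dy)


def associate(tracks, detections, iou_threshold=0.3):
    """Match predicted track boxes to detections by maximising total IoU.

    Exhaustive search replaces the Hungarian algorithm here (assumption).
    Returns (matches, unmatched_track_ids, unmatched_detection_ids).
    """
    n_t, n_d = len(tracks), len(detections)
    if n_t == 0 or n_d == 0:
        return [], list(range(n_t)), list(range(n_d))
    scores = [[iou(t, d) for d in detections] for t in tracks]

    # Enumerate every one-to-one assignment of the smaller set to the larger.
    if n_t <= n_d:
        candidates = ([(i, p[i]) for i in range(n_t)]
                      for p in permutations(range(n_d), n_t))
    else:
        candidates = ([(p[j], j) for j in range(n_d)]
                      for p in permutations(range(n_t), n_d))

    best_pairs, best_score = [], -1.0
    for pairs in candidates:
        score = sum(scores[i][j] for i, j in pairs)
        if score > best_score:
            best_pairs, best_score = pairs, score

    # Discard pairs whose overlap is too small to be the same object.
    matches = sorted((i, j) for i, j in best_pairs
                     if scores[i][j] >= iou_threshold)
    matched_t = {i for i, _ in matches}
    matched_d = {j for _, j in matches}
    unmatched_t = [i for i in range(n_t) if i not in matched_t]
    unmatched_d = [j for j in range(n_d) if j not in matched_d]
    return matches, unmatched_t, unmatched_d


# Two existing tracks with assumed velocities, three new detections.
tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
velocities = [(2, 0), (0, 2)]
detections = [(21, 22, 31, 32), (2, 1, 12, 11), (100, 100, 110, 110)]

predicted = [predict(t, v) for t, v in zip(tracks, velocities)]
matches, lost_tracks, new_detections = associate(predicted, detections)
print(matches, lost_tracks, new_detections)
```

In a full tracker, unmatched detections would typically start new tracks and tracks that stay unmatched for several frames would be dropped. For realistic numbers of objects, a polynomial-time assignment solver would replace the exhaustive search.
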

Source code

Self collected datasets

Video labeling

some examples