Video Tracking

Resilient object detection and tracking on Edge and cloud (AWS):

The best object tracking methods run on a GPU, so the versions of the deep learning frameworks are crucial. For example, the latest JetPack release (the OS image for the Jetson Nano) uses the latest CUDA, but PyTorch currently supports only up to CUDA 10.1. We therefore need to either install an older JetPack version on the Jetson Nano or compile PyTorch from source. I compiled PyTorch; it took a few hours and there were many issues to solve. Ubuntu 20 has no support for CUDA 10, so you need to install CUDA 11 and compile PyTorch along with many libraries. Installing CUDA 10.x on Ubuntu 20 is possible, but resolving the conflicts takes time; eGPU support is another issue on older Ubuntu versions. On macOS everything is easy to install because there is no GPU support, but many of the libraries and frameworks used by tracking source code require the GPU version. Even installing the CPU version of every library does not guarantee that the tracking methods will run.

Another aspect is speed. Tracking is slow even on a GPU: in my experience, YOLOv3, which is one of the fastest object detectors, processes up to 15 FPS on Full HD video on an RTX 2070.

Tracking methods differ widely. The first generation is based entirely on classic computer vision. The second generation combines the Kalman filter with more advanced computer vision (SIFT). The third generation uses deep learning together with some methods of the previous generation, such as the Kalman filter. The fourth generation combines two deep learning methods. The latest generation uses complete end-to-end models such as RNNs.

Object tracking works in all combinations of environments: moving objects, or moving objects and a moving camera in dynamic environments. From the moment an object appears in the frame until it disappears, the tracker can follow it and identify it as a single object, no matter how many FPS.

Tracking

  • Classic object tracking

        • Classic feature detection (SIFT and SURF), combined with a machine learning algorithm like KNN or SVM for classification, or with a descriptor matcher like FLANN for object detection.

        • Kalman filtering, sparse and dense optical flow

        • Example: Simple Online and Realtime Tracking (SORT), which uses a combination of the Hungarian algorithm and Kalman filter

  • Single object tracking (SOT) has been a hot topic over the last decade. Early visual tracking methods relied on extracting hand-crafted features of candidate target regions and used matching algorithms or hand-crafted discriminative classifiers to generate tracking results.

  • Multiple object tracking (MOT) aims to recover the trajectories of objects in video sequences. It is an important problem in computer vision with many applications, such as surveillance, activity analysis, and sports video analysis.

  • Video object detection datasets. The video object detection task aims to detect objects of different categories in video sequences.

  • Multi-object tracking datasets

  • Large-scale multi-class multi-object tracking benchmark datasets

  • The VisDrone dataset is captured in various unconstrained scenes and focuses on four core problems in computer vision: image object detection, video object detection, single object tracking, and multi-object tracking.

  • The accuracy of detection methods suffers from degraded object appearances in videos, such as motion blur, pose variations, and video defocus. Exploiting temporal coherence and aggregating features across consecutive frames might be two effective ways to handle this issue.

    • Temporal coherence. A feasible way to exploit temporal coherence is using object trackers

    • Feature aggregation. Aggregating features in consecutive frames is also a useful way to improve the performance.

  • List of Datasets

    • MOT20

    • KITTI Tracking

    • MOTChallenge 2015

    • UA-DETRAC Tracking

    • DukeMTMC

    • Campus

    • MOT17

    • UAVDT-MOT

    • VisDrone

Source code

Self collected datasets

Video labeling

some examples

An attempt to summarize MOT:

The best methods run on a GPU, so the versions of the deep learning frameworks are crucial. For example, the latest JetPack release (the OS image for the Jetson Nano) uses CUDA 11, but PyTorch currently supports only up to CUDA 10.1. We therefore need to either install an older JetPack version on the Jetson Nano or compile PyTorch from source. I compiled PyTorch; it took a few hours and there were many issues to solve. Ubuntu 20 has no support for CUDA 10, so you need to install CUDA 11 and compile PyTorch along with many libraries. Installing CUDA 10.x on Ubuntu 20 is possible, but resolving the conflicts takes time; eGPU support is another issue on older Ubuntu versions. On macOS everything is easy to install because there is no GPU support, but many of the libraries and frameworks used by tracking source code require the GPU version. Even installing the CPU version of every library does not guarantee that the tracking methods will run. Another aspect is speed. Tracking is slow even on a GPU: in my experience, YOLOv3, which is one of the fastest object detectors, may run in real time on an RTX 2070.
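The version constraints above can be checked before trying to install a tracker. The following is a minimal sketch, assuming PyTorch may or may not be installed; the function name report_gpu_stack and the dictionary keys are my own, not part of any library.

```python
import platform


def report_gpu_stack():
    """Collect the version facts that decide whether a GPU tracker can run.

    The function name and the keys are illustrative (hypothetical).
    """
    info = {
        "machine": platform.machine(),  # CPU architecture string reported by the OS
        "torch_installed": False,
        "torch_version": None,
        "torch_cuda_build": None,       # CUDA version PyTorch was built against
        "cuda_available": False,
    }
    try:
        import torch
    except ImportError:
        # No PyTorch at all: nothing more to report.
        return info

    info["torch_installed"] = True
    info["torch_version"] = str(torch.__version__)
    info["torch_cuda_build"] = torch.version.cuda        # None on CPU-only builds
    info["cuda_available"] = bool(torch.cuda.is_available())
    return info
```

Printing the result of report_gpu_stack() on each machine (Jetson Nano, Ubuntu, macOS) gives a quick view of which CUDA version the installed PyTorch build expects and whether a GPU is actually usable.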

Tracking methods differ widely. The first generation is based entirely on classic computer vision. The second generation combines machine learning, the Kalman filter, and more advanced computer vision (SIFT). The third generation uses deep learning together with some methods of the previous generation, such as the Kalman filter. The fourth generation combines two deep learning methods. The latest generation uses complete end-to-end models with RNNs.

Object tracking works in all combinations of environments: moving objects, or moving objects and a moving camera in dynamic environments. From the moment an object appears in the frame until it disappears, the tracker can follow it and identify it as a single object, no matter how many FPS.

The Multiple Object Tracking course on edX has around 130 videos, which shows that this topic is huge and deserves more attention in research and development.

Running MOT on the Jetson Nano is tricky and hacky in many ways. First, the CPU is ARM-based, and not many packages are built for it.

Datasets for Tracking:

MOTChallenge

MOT15

MOT16/17

MOT19

KITTI

UA-DETRAC tracking benchmark

metrics

  • Mostly Tracked (MT) trajectories: number of ground-truth trajectories that are correctly tracked in at least 80% of the frames.

  • Fragments: trajectory hypotheses which cover at most 80% of a ground truth trajectory. Observe that a true trajectory can be covered by more than one fragment.

  • Mostly Lost (ML) trajectories: number of ground-truth trajectories that are correctly tracked in less than 20% of the frames.

  • False trajectories: predicted trajectories which do not correspond to a real object (i.e. to a ground truth trajectory).

  • ID switches: number of times when the object is correctly tracked, but the associated ID for the object is mistakenly changed.
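The MT, ML, and ID switch definitions above can be sketched in a few lines. This is only an illustration under my own assumptions: the input formats, the function names, and the "PT" (partially tracked) bucket for everything between the two thresholds are not taken from any benchmark toolkit.

```python
def classify_trajectories(tracked_fraction):
    """Count Mostly Tracked / Mostly Lost ground-truth trajectories.

    tracked_fraction: dict mapping a ground-truth ID to the fraction of its
    frames in which it is correctly tracked (assumed input format).
    """
    counts = {"MT": 0, "PT": 0, "ML": 0}
    for fraction in tracked_fraction.values():
        if fraction >= 0.8:        # tracked in at least 80% of the frames
            counts["MT"] += 1
        elif fraction < 0.2:       # tracked in less than 20% of the frames
            counts["ML"] += 1
        else:                      # everything in between (assumed "PT" bucket)
            counts["PT"] += 1
    return counts


def count_id_switches(assigned_ids):
    """Count ID switches for one ground-truth object.

    assigned_ids: per-frame predicted ID, or None where the object is not
    tracked (assumed input format).
    """
    switches = 0
    last_id = None
    for predicted_id in assigned_ids:
        if predicted_id is None:
            continue               # untracked frames do not count as a switch
        if last_id is not None and predicted_id != last_id:
            switches += 1
        last_id = predicted_id
    return switches
```

For example, an object whose per-frame IDs are [1, 1, None, 2, 2, 1] has two ID switches: 1 to 2, and 2 back to 1.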

Test:

https://github.com/xingyizhou/CenterTrack

Works only on Ubuntu, not on macOS; can run on a GPU; webcam input is not working.

https://github.com/tianweiy/CenterPoint

Only GPU

YouTube:

OpenCV

Tracking Objects | OpenCV Python Tutorials for Beginners 2020

Multiple Object Tracking

Python: Real-time Multiple Object Tracking (MOT) with Yolov3, Tensorflow and Deep SORT [FULL COURSE]

There are at least 7 tracker algorithms that can be used in OpenCV. Most are not deep learning (DL) based; GOTURN is the exception:

  • MIL

  • BOOSTING

  • MEDIANFLOW

  • TLD

  • KCF

  • GOTURN

  • MOSSE
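A small helper for creating any of the trackers listed above could look like the sketch below. It assumes the usual OpenCV factory names (TrackerKCF_create and so on) and that, depending on the installed OpenCV version, some of them live under cv2.legacy instead of cv2. The helper name make_tracker and the lookup table are my own. The cv2 module is passed in as an argument so the helper itself does not depend on a particular build.

```python
# Factory function names as exposed by the OpenCV Python bindings.
TRACKER_FACTORIES = {
    "MIL": "TrackerMIL_create",
    "BOOSTING": "TrackerBoosting_create",
    "MEDIANFLOW": "TrackerMedianFlow_create",
    "TLD": "TrackerTLD_create",
    "KCF": "TrackerKCF_create",
    "GOTURN": "TrackerGOTURN_create",
    "MOSSE": "TrackerMOSSE_create",
}


def make_tracker(name, cv2_module):
    """Create a tracker by name (hypothetical helper, not an OpenCV API).

    Looks for the factory on the module itself first, then on its
    'legacy' submodule if the build has one.
    """
    factory_name = TRACKER_FACTORIES[name.upper()]
    for namespace in (cv2_module, getattr(cv2_module, "legacy", None)):
        factory = getattr(namespace, factory_name, None)
        if factory is not None:
            return factory()
    raise RuntimeError(f"{factory_name} is not available in this OpenCV build")


# Typical use (needs an OpenCV build that includes the tracking module):
#   import cv2
#   tracker = make_tracker("KCF", cv2)
#   tracker.init(first_frame, (x, y, w, h))
#   ok, box = tracker.update(next_frame)
```

GOTURN additionally needs its pretrained model files to be present, which is one more reason it behaves differently from the other six.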

Kalman filtering and sparse and dense optical flow are classic tracking techniques. An example is Simple Online and Realtime Tracking (SORT), which uses a combination of the Hungarian algorithm and a Kalman filter to achieve decent object tracking.
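The association step of a SORT-style tracker can be sketched as follows. It assumes the Kalman filter has already predicted one box per track for the current frame, and it matches those boxes to detections by IoU. To stay dependency-free it uses a brute-force search over permutations as a stand-in for the Hungarian algorithm, which is only practical for a handful of boxes; in real code scipy.optimize.linear_sum_assignment would do this job. The function names and the 0.3 threshold are assumptions for illustration.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(predicted, detections, iou_threshold=0.3):
    """Match predicted track boxes to detections, maximizing total IoU.

    Returns sorted (track_index, detection_index) pairs. Brute force
    replaces the Hungarian algorithm here, so keep the inputs small.
    """
    if not predicted or not detections:
        return []
    # Iterate over the smaller set so every permutation is a valid matching.
    swap = len(predicted) > len(detections)
    rows, cols = (detections, predicted) if swap else (predicted, detections)

    best_score, best_perm = -1.0, ()
    for perm in permutations(range(len(cols)), len(rows)):
        score = sum(iou(rows[i], cols[j]) for i, j in enumerate(perm))
        if score > best_score:
            best_score, best_perm = score, perm

    pairs = []
    for i, j in enumerate(best_perm):
        track, det = (j, i) if swap else (i, j)
        # Reject matches whose overlap is too small to be the same object.
        if iou(predicted[track], detections[det]) >= iou_threshold:
            pairs.append((track, det))
    return sorted(pairs)
```

Tracks left without a pair would be candidates for deletion, and unmatched detections would start new tracks; that bookkeeping is left out of the sketch.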

R-CNN

around 2000 region proposals

selective search

share colors and textures, lighting conditions

slow to train and test

Fast R-CNN

computes a convolutional feature map for the entire input image in a single forward pass of the network

architecture is trained end-to-end with a multi-task loss

https://github.com/ZQPei/deep_sort_pytorch

Simple Online and Realtime Tracking with a Deep Association Metric. 2017

https://arxiv.org/pdf/1703.07402

https://mcv-m6-video.github.io/deepvideo-2019/

https://www.youtube.com/watch?v=Cf1INvUsvkM

The online course about multiple object tracking on edX:

Course Section 0: Welcome and Introduction

Part 1: Introduction to Multiple Object Tracking (MOT): good; many definitions: 15 videos

Introductory examples

Is about the accurate perception of the driving environment

Avoid collisions at the airport

Crowd surveillance

Crowd behavior

Planning of emergency procedures

Pedestrian tracking using LIDAR

Tracking based on detections

Group behavior

Part 2: Single Object Tracking in clutter (SOT): a lot of math; basic methods: 23 videos

Introduction to SOT in Clutter

Pruning and merging

Pruning : remove hypotheses with small weights (and renormalize)

Merging: approximate a mixture of densities by a single density (often Gaussian)

Gating: technique to disregard unreasonable detections [pruning]
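The three operations above can be illustrated for one-dimensional Gaussian hypotheses. This is a minimal sketch under my own assumptions: each hypothesis is a (weight, mean, variance) tuple, merging is done by moment matching, and gating uses the squared Mahalanobis distance. The function names and parameters are illustrative and not taken from the course material.

```python
def prune(hypotheses, min_weight):
    """Remove hypotheses with small weights and renormalize the rest.

    hypotheses: list of (weight, mean, variance) tuples.
    """
    kept = [h for h in hypotheses if h[0] >= min_weight]
    total = sum(weight for weight, _, _ in kept)
    return [(weight / total, mean, var) for weight, mean, var in kept]


def merge(hypotheses):
    """Approximate a Gaussian mixture by a single Gaussian.

    Moment matching: the result has the same mean and variance as the
    mixture. Weights are assumed to sum to 1.
    """
    mean = sum(w * m for w, m, _ in hypotheses)
    var = sum(w * (v + (m - mean) ** 2) for w, m, v in hypotheses)
    return mean, var


def gate(detections, predicted_mean, innovation_var, gate_size):
    """Keep only detections close enough to the predicted measurement.

    A detection z passes if its squared Mahalanobis distance
    (z - predicted_mean)^2 / innovation_var is at most gate_size.
    """
    return [
        z for z in detections
        if (z - predicted_mean) ** 2 / innovation_var <= gate_size
    ]
```

For instance, merging two equally weighted hypotheses with means 0 and 2 and unit variance gives a single Gaussian with mean 1 and variance 2: the spread between the means is folded into the variance.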

SOT

  • Gaussian densities

    • Nearest neighbour (NN) filter [pruning]

    • Probabilistic data association (PDA) filter [merging]

  • Gaussian mixture densities

    • Gaussian sum filter (GSF) [pruning/merging]

Part 3: Tracking a known number of objects in clutter: 30 videos

3.3.6 Predicting the n object density

3.4.1 Introduction to data association

Part 4: Random Finite Sets: 24 videos

Part 5: Multiple Object Tracking using conjugate priors: 25 videos [only on YouTube]

Part 6: Outlook - what is next?: 18 videos [only on YouTube]