Video Tracking
Resilient object detection and tracking on Edge and cloud (AWS):
The best object tracking methods run on GPU, so the versions of the deep learning frameworks and of CUDA are crucial. For example, the latest JetPack (the OS for the Jetson Nano) uses the latest CUDA, but PyTorch currently supports only up to CUDA 10.1. We therefore need to either install an older JetPack version on the Jetson Nano or compile PyTorch from source. I compiled PyTorch; it took a few hours and there were many issues to solve. Ubuntu 20 has no support for CUDA 10, so you need to install CUDA 11 and compile PyTorch together with many libraries. Installing CUDA 10.x on Ubuntu 20 is possible, but resolving the conflicts takes time, and eGPU support is another issue on older Ubuntu versions. On macOS everything is easy to install because there is no GPU support, but many of the libraries and frameworks used by the tracking source code require the GPU version. Even installing the CPU version of every library does not guarantee that the tracking methods will run.
Speed is another aspect. Tracking is very slow even on a GPU: in my experience YOLOv3, one of the fastest object detectors, processes at most 15 FPS on Full HD video with an RTX 2070.
Tracking methods differ widely. The first generation is based entirely on classical computer vision. The second generation combines the Kalman filter with more advanced computer vision (SIFT). The third generation uses deep learning together with some methods from the previous generation, such as the Kalman filter. The fourth generation combines two deep learning methods. The latest generation uses complete end-to-end models such as RNNs.
Object tracking works in every combination of environments, such as moving objects alone, or both moving objects and a moving camera, in dynamic environments. From the moment an object appears in the frame until it disappears, the tracker can follow it and identify it as a single object, no matter how many FPS.
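Since the PyTorch/CUDA pairing is where most of the trouble starts, it can help to print what is actually installed before trying a tracker. The following is only a minimal sketch: the helper name describe_torch_cuda is hypothetical, and it assumes PyTorch may or may not be present on the machine.

```python
import importlib.util
import platform


def describe_torch_cuda():
    """Report the Python / PyTorch / CUDA combination on this machine.

    Works even when PyTorch is missing, so it can run on a fresh image.
    """
    info = {
        "python": platform.python_version(),
        "machine": platform.machine(),  # e.g. 'aarch64' on ARM boards, 'x86_64' on a PC
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }
    if info["torch_installed"]:
        import torch  # imported lazily so the script still runs without it

        info["torch_version"] = torch.__version__
        info["torch_built_for_cuda"] = torch.version.cuda  # None for CPU-only builds
        info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_torch_cuda().items():
        print(f"{key}: {value}")
```

If torch_built_for_cuda does not match the CUDA version installed by the OS image, that mismatch is a likely reason a tracking repository fails to run.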
Tracking
Classic object tracking
Classic feature detection (SIFT and SURF), combined with a machine learning algorithm such as KNN or SVM for classification, or with a descriptor matcher such as FLANN for object detection.
Kalman filtering, sparse and dense optical flow.
Example: Simple Online and Realtime Tracking (SORT), which uses a combination of the Hungarian algorithm and Kalman filter
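To make the Kalman filter part concrete, here is a minimal sketch of a constant-velocity filter for a single coordinate (for example the x position of a box centre). It is not the SORT implementation: the class name, the noise values p, q, r and the simplified diagonal process noise are all assumptions chosen for illustration.

```python
class ConstantVelocityKalman1D:
    """Kalman filter with state [position, velocity] and position-only measurements."""

    def __init__(self, position, velocity=0.0, p=10.0, q=0.01, r=1.0):
        self.x = [position, velocity]      # state estimate
        self.P = [[p, 0.0], [0.0, p]]      # state covariance
        self.q = q                         # process noise (added to the diagonal, assumed)
        self.r = r                         # measurement noise variance (assumed)

    def predict(self, dt=1.0):
        """Propagate the state one step: x = F x, P = F P F^T + Q with F = [[1, dt], [0, 1]]."""
        pos, vel = self.x
        self.x = [pos + dt * vel, vel]
        (p00, p01), (p10, p11) = self.P
        a00, a01 = p00 + dt * p10, p01 + dt * p11   # first row of F P
        self.P = [
            [a00 + dt * a01 + self.q, a01],
            [p10 + dt * p11, p11 + self.q],
        ]
        return self.x[0]

    def update(self, z):
        """Correct the prediction with a measured position z (H = [1, 0])."""
        (p00, p01), (p10, p11) = self.P
        innovation = z - self.x[0]
        s = p00 + self.r                   # innovation variance
        k0, k1 = p00 / s, p10 / s          # Kalman gain
        self.x = [self.x[0] + k0 * innovation, self.x[1] + k1 * innovation]
        self.P = [
            [(1 - k0) * p00, (1 - k0) * p01],
            [p10 - k1 * p00, p11 - k1 * p01],
        ]
        return self.x[0]


if __name__ == "__main__":
    kf = ConstantVelocityKalman1D(position=0.0)
    for frame in range(1, 21):             # object moving 2 px per frame
        kf.predict()
        kf.update(2.0 * frame)
    print("estimated position and velocity:", kf.x)
```

In a tracker, predict() gives the expected position in the next frame, which is then compared with the new detections; update() is called only for the detection that gets assigned to this track.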
Single object tracking (SOT) has been a hot topic over the last decade. Early visual tracking methods relied on extracting hand-crafted features of candidate target regions, and used matching algorithms or hand-crafted discriminative classifiers to generate tracking results.
The multi-object tracking (MOT) track aims to recover the trajectories of objects in video sequences, which is an important problem in computer vision with many applications, such as surveillance, activity analysis, and sport video analysis.
Video object detection datasets. The video object detection task aims to detect objects of different categories in video sequences.
Multi-object tracking datasets
large-scale benchmark Multi-Class Multi-object tracking datasets
The VisDrone dataset is captured in various unconstrained scenes and focuses on four core problems in computer vision: image object detection, video object detection, single-object tracking, and multi-object tracking.
The accuracy of detection methods suffers from degraded object appearance in videos, such as motion blur, pose variations, and video defocus. Exploiting temporal coherence and aggregating features across consecutive frames may be two effective ways to handle this issue.
Temporal coherence. A feasible way to exploit temporal coherence is to use object trackers.
Feature aggregation. Aggregating features across consecutive frames is also a useful way to improve performance.
- List of Datasets
MOT20
KITTI Tracking
MOTChallenge 2015
UA-DETRAC Tracking
DukeMTMC
Campus
MOT17
UAVDT-MOT
VisDrone
Source code
ROLO
TensorFlow: link
SiamMask
PyTorch 0.4.1: link
Deep SORT
TrackR-CNN
TensorFlow 1.13.1: link
Tracktor++
PyTorch 1.3.1: link
JDE
PyTorch ≥ 1.2.0: link
Self collected datasets
Reference
Vision Meets Drones: Past, Present and Future
https://blog.netcetera.com/object-detection-and-tracking-in-2020-f10fb6ff9af3
https://pythonawesome.com/yolo-rcnn-object-detection-and-multi-object-tracking/
https://cv-tricks.com/object-tracking/quick-guide-mdnet-goturn-rolo/
Deep Learning in Video Multi-Object Tracking: A Survey https://arxiv.org/abs/1907.12740
HOTA: A Higher Order Metric for Evaluating Multi-object Tracking https://link.springer.com/article/10.1007/s11263-020-01375-2
some examples
Endeavor to summarize MOT:
The best methods run on GPU, so the versions of the deep learning frameworks are crucial. For example, the latest JetPack (the OS for the Jetson Nano) uses CUDA 11, but PyTorch currently supports only up to CUDA 10.1, so we need to install an older JetPack version on the Jetson Nano or compile PyTorch. I compiled PyTorch; it took a few hours with many issues to solve. Ubuntu 20 does not support CUDA 10, so you need to install CUDA 11 and compile PyTorch with many libraries. Installing CUDA 10.x on Ubuntu 20 is possible, but resolving the conflicts takes time, and eGPU support is another issue on older Ubuntu versions. On macOS everything is easy to install because there is no GPU support, but many libraries and frameworks in the tracking source code require the GPU version, and even installing the CPU version of every library does not guarantee that the tracking methods will run. Speed is another aspect: tracking is very slow even on a GPU. In my experience YOLOv3, one of the fastest object detectors, may run in real time on an RTX 2070.
Tracking methods differ widely. The first generation is based entirely on classical computer vision. The second generation combines machine learning, the Kalman filter and more advanced computer vision (SIFT). The third generation uses deep learning together with some methods from the previous generation, such as the Kalman filter. The fourth generation combines two deep learning methods. The latest generation uses complete end-to-end models with RNNs.
Object tracking works in every combination of environments, such as moving objects alone, or both moving objects and a moving camera, in dynamic environments. From the moment an object appears in the frame until it disappears, the tracker can follow it and identify it as a single object, no matter how many FPS.
The Multiple Object Tracking course on edX has around 130 videos, which shows that this topic is huge and needs more attention, research and development.
Running MOT on the Jetson Nano is tricky and hacky in many ways. First, the CPU is ARM-based and not many packages are built for it.
Datasets for Tracking:
MOTChallenge
MOT15
MOT16/17
MOT19
KITTI
UA-DETRAC tracking benchmark
metrics
Mostly Tracked (MT) trajectories: number of ground-truth trajectories that are correctly tracked in at least 80% of the frames.
Fragments: trajectory hypotheses which cover at most 80% of a ground truth trajectory. Observe that a true trajectory can be covered by more than one fragment.
Mostly Lost (ML) trajectories: number of ground-truth trajectories that are correctly tracked in less than 20% of the frames.
False trajectories: predicted trajectories which do not correspond to a real object (i.e. to a ground truth trajectory).
ID switches: number of times when the object is correctly tracked, but the associated ID for the object is mistakenly changed.
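The definitions above can be sketched in a few lines. This is only an illustration under simple assumptions: the input formats (a dict of per-trajectory tracked fractions, and a per-frame list of predicted IDs) and the function names are hypothetical, and the "PT" (partially tracked) bucket for everything between the two thresholds is an addition not listed in these notes.

```python
def classify_trajectories(tracked_ratio):
    """Count Mostly Tracked / Mostly Lost ground-truth trajectories.

    tracked_ratio maps a ground-truth id to the fraction of its frames
    in which it was correctly tracked (0.0 to 1.0).
    """
    mostly_tracked = sum(1 for r in tracked_ratio.values() if r >= 0.8)
    mostly_lost = sum(1 for r in tracked_ratio.values() if r < 0.2)
    partially = len(tracked_ratio) - mostly_tracked - mostly_lost
    return {"MT": mostly_tracked, "ML": mostly_lost, "PT": partially}


def count_id_switches(assigned_ids):
    """Count ID switches for one ground-truth object.

    assigned_ids holds the predicted track ID in each frame,
    or None in frames where the object was missed.
    """
    switches, last = 0, None
    for predicted_id in assigned_ids:
        if predicted_id is None:
            continue                      # a miss is not a switch by itself
        if last is not None and predicted_id != last:
            switches += 1
        last = predicted_id
    return switches


if __name__ == "__main__":
    ratios = {"a": 0.95, "b": 0.5, "c": 0.1, "d": 0.8}
    print(classify_trajectories(ratios))              # {'MT': 2, 'ML': 1, 'PT': 1}
    print(count_id_switches([1, 1, None, 2, 2, 1]))   # 2
```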
Test:
https://github.com/xingyizhou/CenterTrack
Runs only on Ubuntu, not on macOS; runs on GPU; webcam input did not work.
https://github.com/tianweiy/CenterPoint
Only GPU
YouTube:
OpenCV
Tracking Objects | OpenCV Python Tutorials for Beginners 2020
Multiple Object Tracking
Python: Real-time Multiple Object Tracking (MOT) with Yolov3, Tensorflow and Deep SORT [FULL COURSE]
There are at least 7 tracker algorithms that can be used in OpenCV. Most are not deep learning based; the exception is GOTURN, which uses a CNN:
MIL
BOOSTING
MEDIANFLOW
TLD
KCF
GOTURN
MOSSE
Classic building blocks are Kalman filtering and sparse and dense optical flow. An example is Simple Online and Realtime Tracking (SORT), which uses a combination of the Hungarian algorithm and a Kalman filter to achieve decent object tracking.
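The association step (matching existing tracks to new detections) can be sketched as a minimum-cost assignment over IoU scores. Note the swap: instead of the Hungarian algorithm, this sketch uses brute force over permutations, which finds the same optimal assignment but is only usable for a handful of boxes. In practice one would likely use scipy.optimize.linear_sum_assignment. The function names and the min_iou threshold are assumptions for illustration.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, min_iou=0.3):
    """Return (track_index, detection_index) pairs that maximise the total IoU.

    Pairs whose IoU falls below min_iou are dropped, so the corresponding
    track and detection stay unmatched.
    """
    if not tracks or not detections:
        return []
    swapped = len(tracks) > len(detections)
    small, large = (detections, tracks) if swapped else (tracks, detections)
    best_score, best_perm = -1.0, ()
    # Try every way of giving each box in `small` a distinct box in `large`.
    for perm in permutations(range(len(large)), len(small)):
        score = sum(iou(small[i], large[j]) for i, j in enumerate(perm))
        if score > best_score:
            best_score, best_perm = score, perm
    pairs = []
    for i, j in enumerate(best_perm):
        if iou(small[i], large[j]) >= min_iou:
            pairs.append((j, i) if swapped else (i, j))
    return pairs


if __name__ == "__main__":
    tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
    detections = [(21, 21, 31, 31), (1, 1, 11, 11), (100, 100, 110, 110)]
    print(associate(tracks, detections))   # [(0, 1), (1, 0)]
```

Unmatched detections (index 2 in the example) would start new tracks, and tracks that stay unmatched for several frames would be deleted.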
R-CNN
around 2000 region proposals
proposed regions group pixels that share colors, textures and lighting conditions
slow to train and test
Fast R-CNN
computes a convolutional feature map for the entire input image in a single forward pass of the network
architecture is trained end-to-end with a multi-task loss
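As a sketch of what the multi-task loss means for a single region of interest: a classification term (log loss on the true class) plus a box regression term (smooth L1) that is only applied to non-background classes. The function names, the argument layout and the convention that class 0 is background are assumptions made for this illustration, not the code of any particular repository.

```python
import math


def smooth_l1(x):
    """Quadratic near zero, linear for large errors."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def multi_task_loss(class_probs, true_class, pred_box, target_box, lam=1.0):
    """Loss for one region of interest.

    class_probs: softmax output over classes (index 0 = background, assumed).
    pred_box / target_box: 4 regression offsets for the true class.
    lam: weight balancing the two tasks.
    """
    cls_loss = -math.log(class_probs[true_class])
    if true_class == 0:
        return cls_loss                     # background has no box to regress
    loc_loss = sum(smooth_l1(p - t) for p, t in zip(pred_box, target_box))
    return cls_loss + lam * loc_loss


if __name__ == "__main__":
    probs = [0.1, 0.7, 0.2]
    print(multi_task_loss(probs, 1, [0.1, 0.0, 0.2, -0.1], [0.0, 0.0, 0.0, 0.0]))
```

Because both terms feed one scalar, the classifier and the box regressor are trained together in a single backward pass, which is what "end-to-end" refers to here.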
https://github.com/ZQPei/deep_sort_pytorch
Simple Online and Realtime Tracking with a Deep Association Metric. 2017
https://arxiv.org/pdf/1703.07402
https://mcv-m6-video.github.io/deepvideo-2019/
https://www.youtube.com/watch?v=Cf1INvUsvkM
Course Section 0: Welcome and Introduction
Part 1: Introduction to Multiple Object Tracking (MOT): good; many definitions: 15 videos
Is about the accurate perception of the driving environment
Avoid collisions at the airport
Crowd surveillance
Crowd behavior
Planning of emergency procedures
Pedestrian tracking using LIDAR
Tracking based on detections
Group behavior
Part 2: Single Object Tracking in clutter (SOT): a lot of math; basic methods: 23 videos
Introduction to SOT in Clutter
Pruning and merging
Pruning: remove hypotheses with small weights (and renormalize)
Merging: approximate a mixture of densities by a single density (often Gaussian)
Gating: technique to disregard unreasonable detections [pruning]
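Pruning and merging can be sketched for a one-dimensional Gaussian mixture, where each hypothesis is a (weight, mean, variance) tuple. This is only a minimal illustration: the tuple layout, the function names and the threshold value are assumptions, and merging is done by moment matching (the single Gaussian with the same mean and variance as the mixture).

```python
def prune(hypotheses, threshold=0.01):
    """Drop hypotheses with weight below threshold and renormalize the rest."""
    kept = [(w, m, v) for (w, m, v) in hypotheses if w >= threshold]
    if not kept:
        raise ValueError("all hypotheses were pruned; lower the threshold")
    total = sum(w for w, _, _ in kept)
    return [(w / total, m, v) for (w, m, v) in kept]


def merge(hypotheses):
    """Approximate the mixture by one Gaussian with matching mean and variance."""
    total = sum(w for w, _, _ in hypotheses)
    mean = sum(w * m for w, m, _ in hypotheses) / total
    # Variance = average component variance + spread of the component means.
    var = sum(w * (v + (m - mean) ** 2) for w, m, v in hypotheses) / total
    return mean, var


if __name__ == "__main__":
    mixture = [(0.6, 0.0, 1.0), (0.395, 2.0, 1.0), (0.005, 50.0, 1.0)]
    pruned = prune(mixture)
    print(pruned)          # the unlikely hypothesis at 50.0 is gone
    print(merge(pruned))   # one Gaussian summarising the remaining two
```

Pruning keeps the remaining hypotheses exact but discards information; merging keeps all of it in summary form but loses the multi-modal shape.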
SOT
Gaussian densities
Nearest neighbour (NN) filter [pruning]
Probabilistic data association (PDA) filter [merging]
Gaussian mixture densities
Gaussian sum filter (GSF) [pruning/merging]
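Gating followed by a nearest neighbour choice can be sketched in one dimension. This is a simplification under stated assumptions: detections are scalar positions, the gate is a threshold on the squared normalized distance (gate_size = 9.0, roughly 3 standard deviations, is an assumed value), and the sketch always takes the closest gated detection, ignoring the weight of the "object was not detected" hypothesis that a full NN filter would also compare.

```python
def gate(detections, predicted, innovation_var, gate_size=9.0):
    """Keep only detections that are plausibly caused by the tracked object."""
    return [
        z for z in detections
        if (z - predicted) ** 2 / innovation_var <= gate_size
    ]


def nearest_neighbour(detections, predicted, innovation_var, gate_size=9.0):
    """Return the gated detection closest to the prediction, or None if none pass."""
    candidates = gate(detections, predicted, innovation_var, gate_size)
    if not candidates:
        return None                       # treat this frame as a missed detection
    return min(candidates, key=lambda z: (z - predicted) ** 2)


if __name__ == "__main__":
    detections = [4.2, 5.5, 30.0]         # 30.0 is clutter far from the object
    print(gate(detections, predicted=5.0, innovation_var=1.0))
    print(nearest_neighbour(detections, predicted=5.0, innovation_var=1.0))
```

The PDA filter would instead weight all gated detections and merge the resulting hypotheses into one Gaussian, rather than committing to a single detection.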
Part 3: Tracking a known number of objects in clutter: 30 videos
3.3.6 Predicting the n object density
3.4.1 Introduction to data association
Part 4: Random Finite Sets: 24 videos
Part 5: Multiple Object Tracking using conjugate priors: 25 videos [only on YouTube]
Part 6: Outlook - what is next?: 18 videos [only on YouTube]