I would like to share some of my experience with AI projects.
How to train model to add new classes?
How to add a new class to an existing classifier in deep learning?
Adding new Class to One Shot Learning trained model
Is it possible to train a neural network as new classes are given?
Merging several models into one detection system for all these tasks.
Online learning is a term used to refer to a model which takes a continual or sequential stream of input data while training, in contrast to offline learning (also called batch learning), where the model is pre-trained on a static predefined dataset.
Continual learning (also called incremental, continuous, lifelong learning) refers to a branch of ML working in an online learning context where models are designed to learn new tasks while maintaining performance on historic tasks. It can be applied to multiple problem paradigms (including Class-incremental learning, where each new task presents new class labels for an ever expanding super-classification problem).
Do I need to train my whole model again on all four classes or is there any way I can just train my model on new class?
Naively re-training the model on the updated dataset is indeed a solution. Continual learning seeks to address contexts where access to historic data (i.e. the original 3 classes) is not possible, or where retraining on an increasingly large dataset is impractical (because of efficiency, space, privacy or similar concerns). Multiple such models using different underlying architectures have been proposed, but almost all examples deal exclusively with image classification problems.
By using continual learning approaches, a model can be trained on new classes without losing the original classes. These approaches fall into 3 categories:
If you have access to the original dataset, you can download it and add all your new classes, so that you have 'N' COCO classes + 'M' new classes.
After that, you can fine-tune the model on the new dataset. You do not need the whole dataset; the same number of images for every class is enough.
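The advice about using the same number of images per class can be sketched as a small sampling helper. This is only an illustration: `balanced_subset`, the file names and the class names are hypothetical, and the fine-tuning step itself is left out.

```python
import random
from collections import defaultdict


def balanced_subset(samples, per_class, seed=0):
    """Pick up to `per_class` samples for every label.

    `samples` is a list of (item, label) pairs (a hypothetical layout).
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for item, label in samples:
        by_label[label].append(item)

    subset = []
    for label in sorted(by_label):
        items = by_label[label]
        # Never ask for more samples than the class actually has.
        chosen = rng.sample(items, min(per_class, len(items)))
        subset.extend((item, label) for item in chosen)
    rng.shuffle(subset)
    return subset


# Toy data: 3 original classes with many images, 1 new class with few.
old = [(f"{c}_{i}.jpg", c) for c in ("cat", "dog", "car") for i in range(100)]
new = [(f"bird_{i}.jpg", "bird") for i in range(20)]
finetune_set = balanced_subset(old + new, per_class=20)
```

Here the new class only has 20 images, so every original class is cut down to 20 as well before fine-tuning.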
Before starting your machine learning project, ask these questions and prepare: What is your inference hardware? Specify the use case. Specify the model interface. How will we monitor performance after deployment? How can we approximate post-deployment monitoring before deployment? Build a model and iteratively improve it. How will the model be deployed at the end? Monitor performance after deployment. What is your metric? How do you split your data (training and validation)?
Preparation ML Project Workflow
specify the use case
specify model interface
how would we monitor performance after deployment?
how can we approximate post-deployment monitoring before deployment?
build a model and iteratively improve it
deploy the model
what is your metric?
How do you split your data?
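For the data-splitting question, one common option is a stratified split that keeps class proportions similar in the training and validation sets. The sketch below is an assumption about what a reasonable split looks like, not a prescribed method; `stratified_split` and the toy labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.2, seed=0):
    """Return (train_indices, val_indices) with similar class proportions."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    train, val = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # At least one validation sample per class; a class with a single
        # sample therefore ends up in validation only.
        n_val = max(1, round(len(indices) * val_fraction))
        val.extend(indices[:n_val])
        train.extend(indices[n_val:])
    return sorted(train), sorted(val)


labels = ["a"] * 80 + ["b"] * 20
train_idx, val_idx = stratified_split(labels, val_fraction=0.2)
```

With 80 samples of class "a" and 20 of class "b", the validation set receives 16 and 4 samples respectively, so the rare class is not lost by chance.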
Before Training deep learning model
start training with a large model, because:
it can be faster to train, with less overfitting and faster convergence
it is easier to compress, and can be compressed further, in the final stage
model compression and acceleration: reducing parameters without significantly decreasing the model performance
Data: how to have good data for training deep learning models, and how to build and enhance a good dataset for your deep learning project: use the same configuration and data for training and inference, remove redundant data (delete data which you don't need), get more data, handle missing data, use data augmentation techniques or a GAN to generate more data, re-scale and balance the data, transform your data (change data types), and select features based on the dataset and use case.
The data you don't need: removing redundant samples
get more data
Invent more data
Transform your data
Feature selection based on dataset and use case
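As a small illustration of "inventing" more data with augmentation, the sketch below generates flipped copies of an image stored as a nested list of pixel values. The function names are hypothetical, and flips are only a safe augmentation when the label does not depend on orientation.

```python
def flip_horizontal(image):
    """Mirror each row (left-right flip)."""
    return [list(reversed(row)) for row in image]


def flip_vertical(image):
    """Reverse the row order (top-bottom flip)."""
    return [list(row) for row in reversed(image)]


def augment(image):
    """Return the original image plus its three flipped variants."""
    return [
        image,
        flip_horizontal(image),
        flip_vertical(image),
        flip_vertical(flip_horizontal(image)),
    ]


image = [
    [1, 2],
    [3, 4],
]
variants = augment(image)
```

Each input image yields four training samples here; real projects usually add crops, rotations and color changes on top of this.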
ML-Augmented Video Object Tracking: applying and evaluating multiple algorithmic models enhances the ability to scale object tracking in high-density video compositions.
Training deep learning model
Using Hyperparameter tuning / Hyperparameter optimization tools
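To show what a hyperparameter tuning tool does at its core, here is a minimal random search. It is a sketch under assumptions: `toy_validation_loss` is a hypothetical stand-in for "train the model and return the validation loss", and the search space values are made up.

```python
import random


def random_search(objective, space, n_trials=50, seed=0):
    """Sample hyperparameters at random and keep the lowest score."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_trials):
        params = {name: rng.choice(values) for name, values in space.items()}
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_loss(params):
    """Hypothetical objective; a real one would train and evaluate a model."""
    return (params["lr"] - 0.01) ** 2 + (params["batch_size"] - 32) ** 2 * 1e-6


space = {
    "lr": [0.1, 0.03, 0.01, 0.003, 0.001],
    "batch_size": [16, 32, 64, 128],
}
best_params, best_score = random_search(toy_validation_loss, space, n_trials=200)
```

Dedicated tuning tools follow the same loop but choose the next trial more cleverly and can stop unpromising runs early.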
After Training deep learning model
model pruning: reducing redundant parameters which are not sensitive to the performance.
aim: remove all connections with absolute weights below a threshold
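The pruning aim above (drop every connection whose absolute weight is below a threshold) can be sketched on a plain weight matrix. The layer values and the threshold are made up for illustration; real frameworks usually apply a mask instead of editing the weights directly.

```python
def prune_by_magnitude(weights, threshold):
    """Zero every weight whose absolute value is below `threshold`."""
    return [[0.0 if abs(w) < threshold else w for w in row] for row in weights]


def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    flat = [w for row in weights for w in row]
    return sum(1 for w in flat if w == 0.0) / len(flat)


layer = [
    [0.80, -0.02, 0.01],
    [-0.50, 0.03, 0.40],
]
pruned = prune_by_magnitude(layer, threshold=0.05)
```

In this toy layer half of the weights fall below the threshold and are removed; the model is normally fine-tuned afterwards to recover accuracy.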
quantization: compresses the model by reducing the number of bits used to represent the weights
quantization effectively constrains the number of different weights we can use inside our kernels
per-channel quantization of the weights improves performance through model compression and latency reduction
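A minimal sketch of per-channel weight quantization, assuming a symmetric 8-bit scheme with one scale per output channel (one row of the matrix). This is one common variant, not necessarily the one used by any particular toolkit, and the weight values are made up.

```python
def quantize_per_channel(weights, num_bits=8):
    """Symmetric quantization with a separate scale for each channel (row)."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    quantized, scales = [], []
    for channel in weights:
        max_abs = max(abs(w) for w in channel)
        scale = max_abs / qmax if max_abs > 0 else 1.0
        # Round to the nearest integer level and clip to the allowed range.
        q = [max(-qmax, min(qmax, round(w / scale))) for w in channel]
        quantized.append(q)
        scales.append(scale)
    return quantized, scales


def dequantize(quantized, scales):
    """Map integer levels back to approximate floating-point weights."""
    return [[q * s for q in channel] for channel, s in zip(quantized, scales)]


weights = [
    [0.50, -0.25, 0.10],  # channel with a small range
    [8.00, -4.00, 2.00],  # channel with a large range
]
q, scales = quantize_per_channel(weights)
restored = dequantize(q, scales)
```

Because each channel has its own scale, the small-range channel keeps fine resolution instead of being crushed by the large-range one, which is the main argument for per-channel over per-tensor quantization.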
Low rank matrix factorization (LRMF)
there exist latent structures in the data; by uncovering them we can obtain a compressed representation of the data
LRMF factorizes the original matrix into lower rank matrices while preserving latent structures and addressing the issue of sparseness
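To make the idea of factorizing a matrix into lower-rank pieces concrete, the sketch below computes a rank-1 factorization by power iteration. Restricting to rank 1 and using power iteration are simplifications chosen so the example needs no external library; practical LRMF usually relies on a truncated SVD or a learned factorization. The function names and the toy matrix are hypothetical.

```python
import math


def rank1_factor(matrix, iters=100):
    """Return (u, v) such that matrix is approximated by the outer product u v^T."""
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iters):
        # One power-iteration step on M^T M: v <- normalize(M^T (M v)).
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # u carries the singular value, v is a unit vector.
    u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return u, v


def outer(u, v):
    """Outer product of two vectors as a nested list."""
    return [[a * b for b in v] for a in u]


# A 4 x 3 matrix whose rows are all multiples of [1, 2, 3] (exactly rank 1).
matrix = [
    [2.0, 4.0, 6.0],
    [1.0, 2.0, 3.0],
    [3.0, 6.0, 9.0],
    [4.0, 8.0, 12.0],
]
u, v = rank1_factor(matrix)
approx = outer(u, v)
```

The 12 stored values are replaced by 4 + 3 = 7, and for this exactly rank-1 matrix the reconstruction is exact up to rounding; for real weight matrices a higher rank trades size against approximation error.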
Compact convolutional filters (Video/CNN)
designing special structural convolutional filters to save parameters
replace over-parameterized filters with compact filters to achieve overall speedup while maintaining comparable accuracy
knowledge distillation: training a compact neural network with the distilled knowledge of a large model
distillation (knowledge transfer) from an ensemble of big networks into a much smaller network, which learns directly from the cumbersome model's outputs and is lighter to deploy
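Learning "directly from the cumbersome model's outputs" is usually expressed as a loss between temperature-softened teacher and student distributions. The sketch below assumes the common KL-divergence form scaled by the squared temperature; the logits are made-up values and the usual extra term on the true labels is omitted.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax with temperature; higher temperature gives softer outputs."""
    scaled = [z / temperature for z in logits]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - top) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs, scaled by temperature^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


teacher = [4.0, 1.0, 0.5]
student_close = [3.5, 1.2, 0.4]
student_far = [0.5, 1.0, 4.0]
loss_close = distillation_loss(student_close, teacher)
loss_far = distillation_loss(student_far, teacher)
```

A student whose outputs resemble the teacher's gets a small loss, while one that ranks the classes differently is penalized heavily, which is what pushes the small network toward the large model's behaviour.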
Binarized Neural Networks (BNNs)
Apache TVM (incubating) is a compiler stack for deep learning systems
Neural Networks Compression Framework (NNCF)
Deep learning model in production
security: controls access to model(s) through secure packaging and execution
using parallel processing and libraries such as GStreamer
My Keynote (February 2021)
Machine Learning/ Deep Learning
Machine learning is an application of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed
supervised Machine Learning
Deep Convolutional Neural Networks (DCNN) Architecture
Visualizing and Understanding Convolutional Networks
Object Detection by Deep Learning
semi-supervised Machine Learning/ Deep Reinforcement learning (DRL)
unsupervised Machine Learning
Generative Adversarial Networks (GANs)
Pre trained model
Effect of Augmented Datasets to Train DCNNs
Training for more classes
business, Gartner, Hype Cycle for emerging technologies, 2025
Advanced and practical
Deep Convolutional Neural Networks Architecture
Dropout ; L2 pooling
Max-pooling is useful
How to see inside each layer and find important features
Visualizing and Understanding Convolutional Networks
Hands on python for deep learning
Fundamental deep learning
Installation: TensorFlow, PyTorch
Summary of the summit
containing or not containing a face
Eigenface, Fisherface, waveletface, PCA (Principal Component Analysis), LDA (Linear Discriminant Analysis), Haar wavelet transform, and so on.
illumination changes and occlusion
depth information is used to filter the regions of the image where a candidate face region is found by the Viola–Jones (VJ) detector
- the first filtering rule is defined on the color of the region; since some false positives have colors not compatible with the face (e.g. shadows on jeans) a skin detector is applied to remove the candidate face regions that do not contain skin pixels;
- the second filtering rule is defined on the size of the face: using the depth map it is quite easy to calculate the size of the candidate face region, which is useful to discard smallest and largest faces from the final result set;
- the third filtering rule is defined on the depth map to discard flat objects (e.g. candidate faces found in a wall) or uneven objects (e.g. candidate face found in the leaves of a tree). Combining color and depth data the candidate face region can be extracted from the background and measures of depth and regularity are used for filtering out false positives.
The size criteria simply remove the candidate faces not included in a fixed range size ([12.5, 30] cm). The size of a candidate face region is extracted from the depth map according to the following approach.
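As an illustration of the size rule only, the sketch below estimates the physical width of a candidate region with a simple pinhole-camera relation and keeps it if it falls inside the [12.5, 30] cm range. The pinhole relation, the focal length and the candidate values are assumptions made for this example; they are not taken from the quoted work and may differ from its actual size computation.

```python
MIN_FACE_CM, MAX_FACE_CM = 12.5, 30.0  # range quoted in the text

FOCAL_LENGTH_PX = 500.0  # hypothetical focal length of the camera, in pixels


def region_width_cm(width_px, depth_cm, focal_length_px):
    """Pinhole-camera estimate of physical width (assumed model)."""
    return width_px * depth_cm / focal_length_px


def passes_size_filter(width_px, depth_cm, focal_length_px):
    """Keep a candidate only if its estimated width is a plausible face size."""
    width_cm = region_width_cm(width_px, depth_cm, focal_length_px)
    return MIN_FACE_CM <= width_cm <= MAX_FACE_CM


# Hypothetical candidates: (width in pixels, depth of the region in cm).
candidates = [(100, 100.0), (20, 100.0), (300, 200.0)]
kept = [c for c in candidates if passes_size_filter(*c, FOCAL_LENGTH_PX)]
```

Under these assumed numbers the first candidate is about 20 cm wide and survives, while the other two (about 4 cm and 120 cm) are discarded as too small and too large.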
Gaussian mixture 3D morphable face model
Face Synthesis for Eyeglass-Robust Face Recognition
GeneGAN: Learning Object Transfiguration and Attribute Subspace from Unpaired Data
FacePoseNet: Making a Case for Landmark-Free Face Alignment
Learning to Regress 3D Face Shape and Expression from an Image without 3D Supervision
Unsupervised Eyeglasses Removal in the Wild
How far are we from solving the 2D & 3D Face Alignment problem? (and a dataset of 230,000 3D facial landmarks)
(a) we construct, for the first time, a very strong baseline by combining a state-of-the-art architecture for landmark localization with a state-of-the-art residual block, train it on a very large yet synthetically expanded 2D facial landmark dataset and finally evaluate it on all other 2D facial landmark datasets.
(b) We create a guided by 2D landmarks network which converts 2D landmark annotations to 3D and unifies all existing datasets, leading to the creation of LS3D-W, the largest and most challenging 3D facial landmark dataset to date (~230,000 images).
(c) Following that, we train a neural network for 3D face alignment and evaluate it on the newly introduced LS3D-W.
(d) We further look into the effect of all “traditional” factors affecting face alignment performance like large pose, initialization and resolution, and introduce a “new” one, namely the size of the network.
(e) We show that both 2D and 3D face alignment networks achieve performance of remarkable accuracy which is probably close to saturating the datasets used.
Training and testing code as well as the dataset can be downloaded from https://www.adrianbulat.com/face-alignment/