I would like to share some of my experience with AI projects.


How to train model to add new classes?

How to add a new class to an existing classifier in deep learning?

Adding new Class to One Shot Learning trained model

Is it possible to train a neural network as new classes are given?

Merging several models into one detection system that handles all these tasks.

Answer 1:

There are several ways to add new classes to an already trained model that require training only on the new classes.

  • Incremental training (GitHub)

  • continuously learn a stream of data (GitHub)

  • online machine learning (GitHub)

  • Transfer Learning Twice

  • Continual learning approaches (Regularization, Expansion, Rehearsal) (GitHub)

Answer 2:

Online learning is a term used to refer to a model which takes a continual or sequential stream of input data while training, in contrast to offline learning (also called batch learning), where the model is pre-trained on a static predefined dataset.
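As a rough illustration of the difference, the sketch below updates a tiny logistic-regression model one example at a time as samples arrive, instead of fitting it once on a fixed dataset. The model, the `online_update` helper, the learning rate, and the toy stream are all assumptions made for illustration; none of them come from the answers quoted here.

```python
import math
import random

def predict(weights, bias, x):
    """Probability of the positive class under a logistic model."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
    return 1.0 / (1.0 + math.exp(-z))

def online_update(weights, bias, x, y, lr=0.1):
    """One SGD step on a single (x, y) example taken from the stream."""
    error = predict(weights, bias, x) - y  # gradient of the log-loss w.r.t. z
    new_weights = [w - lr * error * xi for w, xi in zip(weights, x)]
    return new_weights, bias - lr * error

def toy_stream(n, seed=0):
    """Hypothetical data stream: the label is 1 when x0 + x1 > 0."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        yield x, 1 if x[0] + x[1] > 0 else 0

# The model never sees the whole dataset at once; it learns sample by sample.
weights, bias = [0.0, 0.0], 0.0
for x, y in toy_stream(2000):
    weights, bias = online_update(weights, bias, x, y)
```

An offline (batch) learner would instead collect all 2000 samples first and then iterate over them for several epochs.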

Continual learning (also called incremental, continuous, lifelong learning) refers to a branch of ML working in an online learning context where models are designed to learn new tasks while maintaining performance on historic tasks. It can be applied to multiple problem paradigms (including Class-incremental learning, where each new task presents new class labels for an ever expanding super-classification problem).

Do I need to train my whole model again on all four classes or is there any way I can just train my model on new class?

Naively re-training the model on the updated dataset is indeed a solution. Continual learning seeks to address contexts where access to historic data (i.e. the original 3 classes) is not possible, or where retraining on an increasingly large dataset is impractical (because of efficiency, storage, privacy, or similar concerns). Multiple such models using different underlying architectures have been proposed, but almost all examples deal exclusively with image classification problems.

Answer 3:

You could use transfer learning to achieve that: take a pre-trained model, change its last layer to accommodate the new classes, and re-train this slightly modified model, perhaps with a lower learning rate. However, transfer learning does not necessarily attempt to retain any of the previously acquired information, especially if you do not use a very small learning rate, keep on training, and do not freeze the weights of the convolutional layers. Its purpose is rather to speed up training, or to help when your new dataset is not big enough, by starting from a model that has already learned general features that are supposedly similar to the features needed for your specific task. There is also the related domain adaptation problem.
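The "change its last layer" step can be sketched without any framework. The example below is a minimal sketch that assumes the classifier head is a plain dense layer stored as Python lists; `extend_output_layer`, the toy weights, and the small random initialisation of the new row are illustrative assumptions. In a real framework you would do the equivalent on the final layer and then decide which earlier layers to freeze.

```python
import random

def extend_output_layer(weights, biases, num_new_classes, seed=0):
    """Return a copy of a dense output layer with extra rows for new classes.

    weights: list of rows, one row of input weights per existing class.
    biases:  list with one bias per existing class.
    """
    rng = random.Random(seed)
    num_features = len(weights[0])
    new_weights = [row[:] for row in weights]  # keep the old class weights
    new_biases = biases[:]
    for _ in range(num_new_classes):
        # Small random initialisation for each new class row (an assumption).
        new_weights.append([rng.gauss(0.0, 0.01) for _ in range(num_features)])
        new_biases.append(0.0)
    return new_weights, new_biases

def logits(weights, biases, features):
    """Raw class scores of the dense layer for one feature vector."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, biases)]

# Toy head: 3 existing classes, 4 input features.
old_w = [[0.2, -0.1, 0.0, 0.5], [0.1, 0.3, -0.2, 0.0], [-0.4, 0.2, 0.1, 0.1]]
old_b = [0.0, 0.1, -0.1]
new_w, new_b = extend_output_layer(old_w, old_b, num_new_classes=1)
```

Right after the extension, the scores of the three original classes are unchanged. Whether they stay useful depends on how the subsequent re-training is done, which is exactly the forgetting issue described above.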

There are more suitable approaches to perform incremental class learning (which is what you are asking for!), which directly address the catastrophic forgetting problem. For instance, you can take a look at the paper Class-incremental Learning via Deep Model Consolidation, which proposes the Deep Model Consolidation (DMC) approach. There are other continual/incremental learning approaches, many of which are described here or in more detail here.
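Rehearsal, one of the continual learning families listed in Answer 1, can be sketched as a small memory of old examples that is mixed into every batch of new-class data. The sketch below is not the DMC method from the paper; the `RehearsalBuffer` class, its capacity, and the sample names are hypothetical, and reservoir sampling is just one simple way to decide what to keep.

```python
import random

class RehearsalBuffer:
    """Fixed-size memory of past examples, filled by reservoir sampling."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        """Offer one example; every example seen so far is kept with equal probability."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            # Replace a stored item with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = example

    def mixed_batch(self, new_examples, replay_size):
        """New examples plus a random sample of stored old ones."""
        k = min(replay_size, len(self.items))
        return list(new_examples) + self.rng.sample(self.items, k)

# Fill the buffer while training on the 3 original classes.
buffer = RehearsalBuffer(capacity=50)
for i in range(1000):
    buffer.add(("old_image_%d" % i, i % 3))  # (sample, label)

# Later, each batch for the new class (label 3) also replays old samples.
batch = buffer.mixed_batch([("new_image_0", 3), ("new_image_1", 3)], replay_size=6)
```

This only helps when storing a few old samples is allowed; when no historic data may be kept, regularization or expansion approaches are the alternatives.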

Answer 4:

You can use continual learning approaches to train on new classes without losing the original ones. These approaches fall into three categories: regularization, expansion, and rehearsal.




Answer 5:

If you have access to the original dataset, you can download it and add all your new classes, so that you end up with 'N' COCO classes + 'M' new classes.

After that, you can fine-tune the model on the new dataset. You do not need the whole dataset; the same number of images for each class is enough.
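Taking the same number of images per class can be sketched as below. This is a minimal sketch: `balanced_subset`, the file names, and the class names are made up for illustration, and the per-class count is an arbitrary choice.

```python
import random
from collections import defaultdict

def balanced_subset(samples, per_class, seed=0):
    """Pick up to `per_class` random samples for every label.

    samples: iterable of (item, label) pairs.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for item, label in samples:
        by_label[label].append(item)
    subset = []
    for label in sorted(by_label):
        items = by_label[label]
        chosen = rng.sample(items, min(per_class, len(items)))
        subset.extend((item, label) for item in chosen)
    rng.shuffle(subset)  # avoid feeding the classes in blocks
    return subset

# Hypothetical data: many images for the old classes, few for the new one.
data = [("person_%d.jpg" % i, "person") for i in range(500)]
data += [("car_%d.jpg" % i, "car") for i in range(300)]
data += [("drone_%d.jpg" % i, "drone") for i in range(40)]
train_set = balanced_subset(data, per_class=40)
```

Here the smallest class sets the budget, so the fine-tuning set contains 40 images for each of the three classes.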

Before starting your machine learning project, ask these questions and make these preparations: What is your inference hardware? What is the use case? What is the model interface? How will we monitor performance after deployment? How can we approximate post-deployment monitoring before deployment? How will the model be built and iteratively improved? How will the model be deployed at the end? What is your metric? How do you split your data (training and validation)?

Preparation ML Project Workflow

  • What is your hardware?

  • specify the use case

  • specify model interface

  • how would we monitor performance after deployment?

  • how can we approximate post-deployment monitoring before deployment?

  • build a model and iteratively improve it

  • deploy the model

  • monitor performance

    • what is your metric?

    • How do you split your data?
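For the data-split question, one common option is a stratified split, which keeps the class proportions roughly equal in the training and validation sets. The sketch below is only one possible answer; `stratified_split`, the 20% validation fraction, and the toy labels are assumptions.

```python
import random
from collections import defaultdict

def stratified_split(labels, val_fraction=0.2, seed=0):
    """Return (train_indices, val_indices) that keep class proportions."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)
    train, val = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # At least one validation sample per class, even for rare classes.
        n_val = max(1, round(len(indices) * val_fraction))
        val.extend(indices[:n_val])
        train.extend(indices[n_val:])
    return sorted(train), sorted(val)

# Imbalanced toy labels: 80 cats, 15 dogs, 5 birds.
labels = ["cat"] * 80 + ["dog"] * 15 + ["bird"] * 5
train_idx, val_idx = stratified_split(labels)
```

A purely random split of such an imbalanced dataset could leave the rare class out of the validation set entirely, which would make the chosen metric misleading for that class.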

Before Training deep learning model

  • start by training a large model, because

    • it tends to converge faster and overfit less

    • it is easier to compress, and compresses further, in the final stage

      • model compression and acceleration: reducing parameters without significantly decreasing the model performance

  • Data: how to build and enhance a good dataset for your deep learning project: use the same configuration and data for training and inference, remove redundant samples (delete data you don't need), get more data, handle missing data, use data augmentation techniques or GANs to generate more data, re-scale and balance the data, transform your data (change data types), and select features based on the dataset and use case

      • The data you don't need: removing redundant samples

      • get more data

      • Invent more data

        • data augmentation

      • Re-scale data

        • balance datasets

      • Transform your data

      • Feature selection based on dataset and use case

      • ML-Augmented Video Object Tracking: applying and evaluating multiple algorithmic models improves the ability to scale object tracking in high-density video compositions.

Training deep learning model

  • automated hyper-parameters

    • Using Hyperparameter tuning / Hyperparameter optimization tools

    • AutoML

    • genetic algorithm

    • population based training

    • bayesian optimization

  • You need to set some parameters and config for training

      • Diagnostics

      • Weight Initialization

      • Learning rate

      • Activation function

      • Network Topology

      • Batches and Epochs

      • Regularization

      • Optimization and Loss

      • Early Stopping
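Of the settings listed above, early stopping is easy to show in isolation. The sketch below assumes you monitor a validation loss once per epoch; the `EarlyStopping` class, the patience value, and the loss numbers are illustrative and stand in for a real training loop.

```python
class EarlyStopping:
    """Stop when the validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.best_epoch = None
        self.bad_epochs = 0

    def step(self, epoch, val_loss):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.best_epoch = epoch  # this is where you would save a checkpoint
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Made-up validation losses standing in for a real training loop.
val_losses = [0.90, 0.70, 0.60, 0.58, 0.59, 0.61, 0.60, 0.62]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.step(epoch, loss):
        stopped_at = epoch
        break
```

With these numbers the best loss is reached at epoch 3 and training stops at epoch 6, after three epochs without improvement; the model to keep is the checkpoint from the best epoch, not the last one.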

Continuous delivery

  • evolve with latest detection models

  • more data (no labels)

    • semi-supervised learning: big self-supervised models are strong semi-supervised learners

After Training deep learning model

  • Parameter pruning

    • model pruning: removing redundant parameters to which the performance is not sensitive.

      • aim: remove all connections with absolute weights below a threshold

  • Quantization

    • compresses by reducing the number of bits used to represent the weights

    • quantization effectively constrains the number of different weights we can use inside our kernels

    • per-channel quantization for weights, which improves performance by model compression and latency reduction.

  • Low rank matrix factorization (LRMF)

    • there exist latent structures in the data; by uncovering them we can obtain a compressed representation of the data

    • LRMF factorizes the original matrix into lower rank matrices while preserving latent structures and addressing the issue of sparseness

  • Compact convolutional filters (Video/CNN)

    • designing special structural convolutional filters to save parameters

    • replace over-parameterized filters with compact filters to achieve overall speedup while maintaining comparable accuracy

  • Knowledge distillation

    • training a compact neural network with distilled knowledge of a large model

    • distillation (knowledge transfer) from an ensemble of big networks into a much smaller network, which learns directly from the cumbersome model's outputs and is lighter to deploy

  • Binarized Neural Networks (BNNs)

  • Apache TVM (incubating) is a compiler stack for deep learning systems

  • Neural Networks Compression Framework (NNCF)
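Two of the techniques in this list, magnitude pruning and weight quantization, can be illustrated on a flat list of weights. This is a minimal sketch under simplifying assumptions: the helper names, the threshold, and the 4-bit setting are made up, and real toolkits work per layer or per channel and usually fine-tune the model afterwards.

```python
def magnitude_prune(weights, threshold):
    """Zero every weight whose absolute value is below `threshold`."""
    return [0.0 if abs(w) < threshold else w for w in weights]

def quantize(weights, num_bits=8):
    """Uniform affine quantization to integer codes in [0, 2**num_bits - 1]."""
    lo, hi = min(weights), max(weights)
    levels = 2 ** num_bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo

def dequantize(codes, scale, lo):
    """Map integer codes back to approximate float weights."""
    return [c * scale + lo for c in codes]

weights = [0.82, -0.03, 0.40, 0.01, -0.65, 0.002, 0.15, -0.09]
pruned = magnitude_prune(weights, threshold=0.05)   # small weights become 0.0
codes, scale, lo = quantize(pruned, num_bits=4)     # at most 16 distinct values
restored = dequantize(codes, scale, lo)
```

After pruning, three of the eight weights are zero. After 4-bit quantization each weight is stored as a small integer, and the restored value differs from the pruned one by at most half a quantization step. Note that in this simple scheme an exact zero is not guaranteed to survive the round trip.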

Deep learning model in production

  • security: controls access to model(s) through secure packaging and execution

  • Test

  • auto training

  • using parallel processing and libraries such as GStreamer






My Keynote (February 2021)

  1. introduction

  2. Machine Learning/ Deep Learning

Machine learning is an application of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed

  1. supervised Machine Learning

    1. Deep Convolutional Neural Networks (DCNN) Architecture

    2. Visualizing and Understanding Convolutional Networks

    3. Object Detection by Deep Learning

    4. Video Tracking

    5. Style Transfer

  2. semi-supervised Machine Learning/ Deep Reinforcement learning (DRL)

    1. Google

    2. Deep Reinforcement learning (DRL)

  3. unsupervised Machine Learning

    1. Auto Encoder

  4. Generative Adversarial Networks (GANs)

  5. Tools

  6. Pre trained model

  7. Effect of Augmented Datasets to Train DCNNs

  8. Training for more classes

  9. Optimization

  10. Hardware

  11. Production setup

  12. post development

  13. business, Gartner, Hype Cycle for emerging technologies, 2025

Advanced and practical

  1. Inside CNN

    1. Deep Convolutional Neural Networks Architecture

    2. Convolution

    3. Convolution Layer

    4. Conv/FC Filters

    5. Activation Functions

    6. Layer Activations

    7. Pooling Layer

    8. Dropout ; L2 pooling

    9. Why

      1. Max-pooling is useful

      2. How to see inside each layer and find important features

  1. Hands on python for deep learning

  2. Fundamental deep learning

  3. Installation: TensorFlow, PyTorch

  4. Using PC+eGPU for training video tracking

Summary of the summit


  • Effective and precise face detection based on color and depth data


      • containing or not containing a face

      • Eigenface, Fisherface, waveletface, PCA (Principal Component Analysis), LDA (Linear Discriminant Analysis), Haar wavelet transform, and so on.

      • Viola–Jones detector

      • illumination changes and occlusion

      • depth information is used to filter the regions of the image where a candidate face region is found by the Viola–Jones (VJ) detector

      • the first filtering rule is defined on the color of the region; since some false positives have colors not compatible with the face (e.g. shadows on jeans), a skin detector is applied to remove the candidate face regions that do not contain skin pixels;

      • the second filtering rule is defined on the size of the face: using the depth map it is quite easy to calculate the size of the candidate face region, which is useful to discard the smallest and largest faces from the final result set;

      • the third filtering rule is defined on the depth map to discard flat objects (e.g. candidate faces found in a wall) or uneven objects (e.g. candidate faces found in the leaves of a tree). Combining color and depth data, the candidate face region can be extracted from the background, and measures of depth and regularity are used for filtering out false positives.

      • The size criteria simply remove the candidate faces not included in a fixed size range ([12.5, 30] cm). The size of a candidate face region is extracted from the depth map according to the following approach.

      • image below

  • Gaussian mixture 3D morphable face model

  • Face Synthesis for Eyeglass-Robust Face Recognition

  • GeneGAN: Learning Object Transfiguration and Attribute Subspace from Unpaired Data

  • FacePoseNet: Making a Case for Landmark-Free Face Alignment

  • Learning to Regress 3D Face Shape and Expression from an Image without 3D Supervision

  • Unsupervised Eyeglasses Removal in the Wild

  • How far are we from solving the 2D & 3D Face Alignment problem? (and a dataset of 230,000 3D facial landmarks)


    • (a) We construct, for the first time, a very strong baseline by combining a state-of-the-art architecture for landmark localization with a state-of-the-art residual block, train it on a very large yet synthetically expanded 2D facial landmark dataset and finally evaluate it on all other 2D facial landmark datasets.

    • (b) We create a network, guided by 2D landmarks, which converts 2D landmark annotations to 3D and unifies all existing datasets, leading to the creation of LS3D-W, the largest and most challenging 3D facial landmark dataset to date (~230,000 images).

    • (c) Following that, we train a neural network for 3D face alignment and evaluate it on the newly introduced LS3D-W.

    • (d) We further look into the effect of all “traditional” factors affecting face alignment performance like large pose, initialization and resolution, and introduce a “new” one, namely the size of the network.

    • (e) We show that both 2D and 3D face alignment networks achieve performance of remarkable accuracy which is probably close to saturating the datasets used.

    • Training and testing code as well as the dataset can be downloaded from https: //