I would like to share some of my experience with AI projects.
Before you start a machine learning project, work through these questions and preparation steps: What is your inference hardware? What is the use case? What is the model interface? How will you monitor performance after deployment? How can you approximate post-deployment monitoring before deployment? How will you build a model and iteratively improve it? How will you deploy the model at the end? What is your metric? How do you split your data (training and validation)?
Preparation: ML Project Workflow
specify the use case
specify model interface
how would we monitor performance after deployment?
how can we approximate post-deployment monitoring before deployment?
build a model and iteratively improve it
deploy the model
what is your metric?
How do you split your data?
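The last two questions (metric and data split) can be pinned down in a few lines before any model exists. The following is a minimal sketch using only the standard library; the function names, the 20% validation fraction, and accuracy as the metric are illustrative assumptions, not choices made in these notes.

```python
import random


def train_val_split(samples, val_fraction=0.2, seed=0):
    """Shuffle a copy of `samples` and split it into (train, validation)."""
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    shuffled = list(samples)
    # A fixed seed keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(round(len(shuffled) * val_fraction)))
    return shuffled[n_val:], shuffled[:n_val]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels (assumed metric)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


train, val = train_val_split(range(10), val_fraction=0.2, seed=42)
```

Fixing the seed and the metric up front means every later model iteration is compared on the same validation samples with the same score.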
Before training a deep learning model
start by training a large model, because
it tends to converge in fewer training steps and to overfit less
it is easier to compress, and compresses further, in the final stage
model compression and acceleration: reducing parameters without significantly decreasing the model performance
Data: how to build and enhance a good dataset for your deep learning project: use the same configuration and data for training and inference, remove redundant data (delete what you don't need), get more data, handle missing data, use data augmentation techniques or a GAN to generate more data, rescale or balance the data, transform your data (change data types), and select features based on the dataset and use case
The data you don't need: removing redundant samples
get more data
Invent more data
Transform your data
Feature selection based on dataset and use case
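A few of the data steps above (removing redundant samples, handling missing data, rescaling) can be sketched in plain Python. This is an illustrative sketch only: the helper names, the choice of mean imputation, and min-max scaling are assumptions, and a real project would normally use a data library instead.

```python
def remove_duplicates(rows):
    """Drop exact duplicate rows, keeping the first occurrence and the order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(list(row))
    return unique


def fill_missing_with_mean(rows):
    """Replace None in each column with the mean of that column's known values."""
    n_cols = len(rows[0])
    means = []
    for c in range(n_cols):
        known = [r[c] for r in rows if r[c] is not None]
        means.append(sum(known) / len(known) if known else 0.0)
    return [[means[c] if r[c] is None else r[c] for c in range(n_cols)]
            for r in rows]


def min_max_rescale(rows):
    """Rescale each column to [0, 1]; constant columns map to 0.0."""
    n_cols = len(rows[0])
    lo = [min(r[c] for r in rows) for c in range(n_cols)]
    hi = [max(r[c] for r in rows) for c in range(n_cols)]
    return [[(r[c] - lo[c]) / (hi[c] - lo[c]) if hi[c] > lo[c] else 0.0
             for c in range(n_cols)] for r in rows]


# Toy table: one duplicated row and one missing value.
raw = [[1.0, 10.0], [1.0, 10.0], [3.0, None], [5.0, 30.0]]
clean = min_max_rescale(fill_missing_with_mean(remove_duplicates(raw)))
```

Whatever statistics are computed here (column means, minima, maxima) should be stored and reused at inference time, in line with the note about using the same configuration for training and inference.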
ML-Augmented Video Object Tracking: applying and evaluating multiple algorithmic models enhances the ability to scale object tracking in high-density video compositions.
Training a deep learning model
Use hyperparameter tuning (hyperparameter optimization) tools
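To make the idea concrete, here is a minimal random-search sketch, one of the simplest strategies that hyperparameter optimization tools implement. The objective below is a toy stand-in for "train a model and return the validation loss", and the search space, parameter names, and trial count are all hypothetical.

```python
import random


def random_search(objective, space, n_trials=50, seed=0):
    """Sample configurations uniformly and return the one with the lowest score."""
    rng = random.Random(seed)
    best_config, best_score = None, float("inf")
    for _ in range(n_trials):
        config = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = objective(config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_validation_loss(config):
    # Stand-in objective (assumption): lowest at log10_lr = -3 and dropout = 0.3.
    return (config["log10_lr"] + 3.0) ** 2 + (config["dropout"] - 0.3) ** 2


# Hypothetical search space: learning rate on a log scale, dropout rate.
space = {"log10_lr": (-5.0, -1.0), "dropout": (0.0, 0.8)}
best_config, best_score = random_search(toy_validation_loss, space,
                                        n_trials=200, seed=1)
```

Dedicated tools add smarter sampling, early stopping of bad trials, and parallel execution on top of this basic loop.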
After training a deep learning model
model pruning: removing redundant parameters to which the performance is not sensitive.
aim: remove all connections with absolute weights below a threshold
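Threshold-based pruning as described above can be sketched on a weight matrix stored as nested lists. The layer values and the 0.05 threshold are made up for illustration; frameworks typically apply the same rule through a mask and then fine-tune the remaining weights.

```python
def prune_by_magnitude(weights, threshold):
    """Zero every weight whose absolute value is below `threshold`."""
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]


def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    flat = [w for row in weights for w in row]
    return sum(1 for w in flat if w == 0.0) / len(flat)


# Hypothetical 2x3 layer.
layer = [[0.80, -0.02, 0.10],
         [-0.50, 0.03, -0.01]]
pruned = prune_by_magnitude(layer, threshold=0.05)
```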
Quantization: compresses the model by reducing the number of bits used to represent the weights
quantization effectively constrains the number of different weights we can use inside our kernels
per-channel quantization for weights, which improves performance by model compression and latency reduction.
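The following sketch shows one common form of per-channel weight quantization: symmetric 8-bit quantization with a separate scale for each output channel (each row here). The symmetric scheme, the row-as-channel layout, and the example weights are assumptions for illustration.

```python
def quantize_per_channel(weights, num_bits=8):
    """Symmetric quantization with one scale per output channel (row)."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    quantized, scales = [], []
    for channel in weights:
        max_abs = max(abs(w) for w in channel)
        scale = max_abs / qmax if max_abs > 0 else 1.0
        # Round to the nearest integer level and clamp to [-qmax, qmax].
        quantized.append([max(-qmax, min(qmax, round(w / scale)))
                          for w in channel])
        scales.append(scale)
    return quantized, scales


def dequantize_per_channel(quantized, scales):
    """Map integer levels back to approximate float weights."""
    return [[q * s for q in channel] for channel, s in zip(quantized, scales)]


# Two channels with very different magnitudes (hypothetical values).
weights = [[0.5, -1.0, 0.25],
           [0.01, -0.02, 0.005]]
q, scales = quantize_per_channel(weights)
restored = dequantize_per_channel(q, scales)
```

Because each channel gets its own scale, the small second channel still spans the full integer range; a single scale shared by the whole tensor would squeeze it onto only a few levels.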
Low rank matrix factorization (LRMF)
there exist latent structures in the data; by uncovering them we can obtain a compressed representation of the data
LRMF factorizes the original matrix into lower rank matrices while preserving latent structures and addressing the issue of sparseness
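One standard way to obtain such a factorization is a truncated singular value decomposition (SVD). The sketch below assumes NumPy is available and uses a synthetic matrix that is exactly rank 4 as a stand-in for a layer's weights; the shapes and the rank are illustrative choices.

```python
import numpy as np


def low_rank_factorize(matrix, rank):
    """Return (a, b) such that a @ b is the best rank-`rank` approximation
    of `matrix` in the least-squares sense (truncated SVD)."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # shape (m, rank), singular values folded in
    b = vt[:rank, :]            # shape (rank, n)
    return a, b


rng = np.random.default_rng(0)
# Synthetic 64x32 matrix of rank 4 (assumption made for the demo).
w = rng.standard_normal((64, 4)) @ rng.standard_normal((4, 32))
a, b = low_rank_factorize(w, rank=4)

params_before = w.size          # 64 * 32 = 2048
params_after = a.size + b.size  # 64 * 4 + 4 * 32 = 384
```

Replacing one dense layer by the two smaller factors keeps the output (exactly here, approximately for real weights) while storing far fewer parameters.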
Compact convolutional filters (Video/CNN)
designing special structural convolutional filters to save parameters
replace over-parameterized filters with compact filters to achieve overall speedup while maintaining comparable accuracy
Knowledge distillation: training a compact neural network with the distilled knowledge of a large model
distillation (knowledge transfer) from an ensemble of big networks into a much smaller network that learns directly from the cumbersome model's outputs and is lighter to deploy
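Learning "directly from the cumbersome model's outputs" is usually done with a soft-target loss. The sketch below uses a common formulation: cross-entropy between the temperature-softened teacher and student distributions, scaled by the squared temperature. The logits and the temperature of 2.0 are hypothetical values.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by `temperature`."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student outputs, times T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    ce = -sum(pt * math.log(ps) for pt, ps in zip(p_teacher, p_student))
    return ce * temperature ** 2


# Hypothetical logits for a 3-class problem.
teacher = [4.0, 1.0, 0.5]
good_student = [3.8, 1.1, 0.4]
bad_student = [0.5, 4.0, 1.0]

loss_good = distillation_loss(good_student, teacher)
loss_bad = distillation_loss(bad_student, teacher)
```

In practice this term is typically combined with the ordinary loss on the true labels, and the student is trained by gradient descent on the sum.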
Binarized Neural Networks (BNNs)
Apache TVM (incubating) is a compiler stack for deep learning systems
Neural Networks Compression Framework (NNCF)
Deep learning model in production
security: controls access to model(s) through secure packaging and execution
using parallel processing and libraries such as GStreamer
My Keynote (February 2021)
Machine Learning / Deep Learning
Machine learning is an application of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed
supervised Machine Learning
Deep Convolutional Neural Networks (DCNN) Architecture
Visualizing and Understanding Convolutional Networks
Object Detection by Deep Learning
semi-supervised Machine Learning / Deep Reinforcement Learning (DRL)
unsupervised Machine Learning
Generative Adversarial Networks (GANs)
Pre-trained model
Effect of Augmented Datasets to Train DCNNs
Training for more classes
business, Gartner, Hype Cycle for emerging technologies, 2025
Advanced and practical
Deep Convolutional Neural Networks Architecture
Dropout; L2 pooling
Max-pooling is useful
How to see inside each layer and find important features
Visualizing and Understanding Convolutional Networks
Hands-on Python for deep learning
Fundamental deep learning
Installation: TensorFlow, PyTorch
Summary of the summit
containing or not containing a face
Eigenface, Fisherface, waveletface, PCA (Principal Component Analysis), LDA (Linear Discriminant Analysis), Haar wavelet transform, and so on.
illumination changes and occlusion
depth information is used to filter the regions of the image where a candidate face region is found by the Viola–Jones (VJ) detector
- the first filtering rule is defined on the color of the region; since some false positives have colors not compatible with the face (e.g. shadows on jeans) a skin detector is applied to remove the candidate face regions that do not contain skin pixels;
- the second filtering rule is defined on the size of the face: using the depth map it is quite easy to calculate the size of the candidate face region, which is useful to discard smallest and largest faces from the final result set;
- the third filtering rule is defined on the depth map to discard flat objects (e.g. candidate faces found in a wall) or uneven objects (e.g. candidate face found in the leaves of a tree). Combining color and depth data the candidate face region can be extracted from the background and measures of depth and regularity are used for filtering out false positives.
The size criteria simply remove the candidate faces not included in a fixed range size ([12.5, 30] cm). The size of a candidate face region is extracted from the depth map according to the following approach.
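The source's own size-extraction approach is not reproduced in these notes. As a rough illustration of the size rule only, the sketch below assumes a pinhole camera model, in which physical width is pixel width times depth divided by focal length. The focal length, the detections, and the function names are hypothetical; only the [12.5, 30] cm range comes from the text.

```python
MIN_FACE_CM, MAX_FACE_CM = 12.5, 30.0  # size range given in the text


def face_width_cm(width_px, depth_cm, focal_length_px):
    """Physical width from pixel width, assuming a pinhole camera model."""
    return width_px * depth_cm / focal_length_px


def passes_size_filter(width_px, depth_cm, focal_length_px):
    """Keep a candidate only if its estimated width lies in the fixed range."""
    size = face_width_cm(width_px, depth_cm, focal_length_px)
    return MIN_FACE_CM <= size <= MAX_FACE_CM


FOCAL_PX = 500.0  # hypothetical focal length in pixels

# Hypothetical detections: (bounding-box width in pixels, depth in cm).
candidates = [(100, 100.0), (20, 100.0), (400, 100.0)]
kept = [c for c in candidates if passes_size_filter(c[0], c[1], FOCAL_PX)]
```

Under these assumptions the 100-pixel box corresponds to 20 cm and is kept, while the 4 cm and 80 cm candidates are discarded as too small and too large.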
Gaussian mixture 3D morphable face model
Face Synthesis for Eyeglass-Robust Face Recognition
GeneGAN: Learning Object Transfiguration and Attribute Subspace from Unpaired Data
FacePoseNet: Making a Case for Landmark-Free Face Alignment
Learning to Regress 3D Face Shape and Expression from an Image without 3D Supervision
Unsupervised Eyeglasses Removal in the Wild
How far are we from solving the 2D & 3D Face Alignment problem? (and a dataset of 230,000 3D facial landmarks)
(a) we construct, for the first time, a very strong baseline by combining a state-of-the-art architecture for landmark localization with a state-of-the-art residual block, train it on a very large yet synthetically expanded 2D facial landmark dataset and finally evaluate it on all other 2D facial landmark datasets.
(b) We create a guided by 2D landmarks network which converts 2D landmark annotations to 3D and unifies all existing datasets, leading to the creation of LS3D-W, the largest and most challenging 3D facial landmark dataset to date (~230,000 images).
(c) Following that, we train a neural network for 3D face alignment and evaluate it on the newly introduced LS3D-W.
(d) We further look into the effect of all “traditional” factors affecting face alignment performance like large pose, initialization and resolution, and introduce a “new” one, namely the size of the network.
(e) We show that both 2D and 3D face alignment networks achieve performance of remarkable accuracy which is probably close to saturating the datasets used.
Training and testing code as well as the dataset can be downloaded from https://www.adrianbulat.com/face-alignment/