idea
Smart robot for people care
My AI journey in a nutshell
Computer Vision and Deep Learning Optimisation: efficiency, real-time performance, and accuracy in smart computer vision and deep learning for IoT
The data you don't need: removing redundant samples
3D SLAM based on your hardware: Raspberry Pi, Jetson Nano, FPGA, ...
enhancing your deep learning model by using
Embedded mixed reality
A humanoid robot for elderly care (aged care) addresses the special needs and requirements unique to senior citizens with the help of an AI system. This AI system supports services such as assisted living, adult day care, long-term care, nursing homes (often referred to as residential care), hospice care, and home care. The form such care takes varies widely across countries and with differing cultural perspectives on elderly citizens.
I am excited to introduce the “RC” (Robot Care) project: a pre-built, get-started API template for deploying your #AI applications on a humanoid robot. We would love to see more features added, and we appreciate any feedback and new DL examples. Push a PR!
Labeling videos: AI-based, smart, tracking, asynchronous
Announcing important objects/places/locations passed through
Anomaly detection
User experience. Personalized.
Prediction to manage societal mobility
Personalization
Covenant
Platform.
Blind spots
Deploying large AI models
deploy using a web framework (e.g. Flask):
suitable for developing small-scale, simple web apps
create a wrapper (API) around the model when deploying it (see the Flask sketch after this list)
deploy in the cloud (e.g. Azure/AWS/Google cloud):
cloud services offer web interfaces and software kits such as AWS Lambda
deploy using scalable, easy-to-manage open frameworks (e.g. Kubernetes):
Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications (e.g. Docker containers)
TensorFlow/PyTorch serving:
most DL models are trained using toolkits like TensorFlow and PyTorch
these toolkits provide serving: high-performance model deployment systems (TensorFlow Serving, TorchServe)
federated learning
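A minimal sketch of the Flask wrapper idea mentioned above; the model file, input format, and endpoint name are assumptions for illustration, not a fixed API.

    # Hypothetical Flask wrapper around a trained model.
    from flask import Flask, request, jsonify
    import torch

    app = Flask(__name__)
    model = torch.jit.load("model.pt")   # assumed TorchScript export of the model
    model.eval()

    @app.route("/predict", methods=["POST"])
    def predict():
        features = request.get_json()["features"]           # assumed input schema
        x = torch.tensor(features, dtype=torch.float32).unsqueeze(0)
        with torch.no_grad():
            y = model(x)
        return jsonify({"prediction": y.squeeze(0).tolist()})

    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=5000)   # small-scale, single-process serving

For anything beyond small-scale use, the same wrapper can be containerized and handed to Kubernetes or a cloud service as described above.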
Deep Learning Optimization: Some of the methods we can use to compress and accelerate deep learning models are: parameter/model pruning, quantization, Binarized Neural Networks (BNNs), low rank matrix factorization (LRMF), compact convolutional filters (Video/CNN), knowledge distillation, and Apache TVM.
model compression and acceleration: reducing parameters without significantly decreasing the model performance
parameter pruning
model pruning: removing redundant parameters that are not sensitive to performance
aim: remove all connections with absolute weights below a threshold
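A minimal sketch of that aim, assuming PyTorch and a hand-picked threshold.

    import torch

    def prune_by_magnitude(model: torch.nn.Module, threshold: float = 1e-2):
        """Zero out all connections whose absolute weight is below the threshold."""
        with torch.no_grad():
            for name, param in model.named_parameters():
                if "weight" in name:
                    mask = (param.abs() >= threshold).to(param.dtype)
                    param.mul_(mask)     # pruned weights become exact zeros
        return model

    # usage (hypothetical model): model = prune_by_magnitude(model, threshold=0.01)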
quantization
compresses by reducing the number of bits used to represent the weights
quantization effectively constrains the number of different weights we can use inside our kernels
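A minimal sketch of uniform weight quantization, assuming PyTorch tensors and an 8-bit budget; real toolchains (e.g. post-training quantization in TensorFlow Lite or PyTorch) do this per layer with calibration.

    import torch

    def quantize_dequantize(w: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
        """Map weights onto 2**num_bits levels, then back to floats."""
        qmin, qmax = 0, 2 ** num_bits - 1
        scale = (w.max() - w.min()) / (qmax - qmin)
        scale = torch.clamp(scale, min=1e-12)       # avoid division by zero for constant weights
        zero_point = w.min()
        codes = torch.clamp(((w - zero_point) / scale).round(), qmin, qmax)
        return codes * scale + zero_point           # only 2**num_bits distinct values remain

    # usage (hypothetical layer): layer.weight.data = quantize_dequantize(layer.weight.data)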
low rank matrix factorization (LRMF)
there exist latent structures in the data; by uncovering them we can obtain a compressed representation of the data
LRMF factorizes the original matrix into lower rank matrices while preserving latent structures and addressing the issue of sparseness
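A minimal sketch of the low-rank idea with a truncated SVD; the matrix size and rank are assumptions.

    import numpy as np

    def low_rank_factorize(W: np.ndarray, rank: int):
        """Approximate W by the product of two thin matrices, W ≈ A @ B."""
        U, S, Vt = np.linalg.svd(W, full_matrices=False)
        A = U[:, :rank] * S[:rank]      # shape (m, rank)
        B = Vt[:rank, :]                # shape (rank, n)
        return A, B

    W = np.random.randn(512, 512)       # e.g. a fully connected layer's weights
    A, B = low_rank_factorize(W, rank=32)
    print(W.size, A.size + B.size)      # 262144 vs 32768 parameters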
compact convolutional filters (Video/CNN)
designing special structural convolutional filters to save parameters
replace over-parameterized filters with compact filters to achieve an overall speedup while maintaining comparable accuracy
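One common instance of such a compact filter design is the depthwise-separable convolution (as used in MobileNet); the channel sizes below are assumptions for the parameter-count comparison.

    import torch.nn as nn

    class DepthwiseSeparableConv(nn.Module):
        """Depthwise 3x3 convolution followed by a pointwise 1x1 convolution."""
        def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 3):
            super().__init__()
            self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                       padding=kernel_size // 2, groups=in_ch)
            self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

        def forward(self, x):
            return self.pointwise(self.depthwise(x))

    # standard 3x3 conv, 64 -> 128 channels: 64 * 128 * 3 * 3 = 73,728 weights
    # depthwise-separable equivalent:        64 * 3 * 3 + 64 * 128 = 8,768 weights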
knowledge distillation
training a compact neural network with distilled knowledge of a large model
distillation (knowledge transfer) from an ensemble of big networks into a much smaller network that learns directly from the cumbersome model's outputs and is lighter to deploy
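A minimal sketch of the distillation loss, assuming PyTorch; the temperature and mixing weight are assumptions that are normally tuned.

    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
        """Blend soft-target matching (teacher) with the usual hard-label loss."""
        soft_targets = F.softmax(teacher_logits / T, dim=1)
        log_student = F.log_softmax(student_logits / T, dim=1)
        kd = F.kl_div(log_student, soft_targets, reduction="batchmean") * (T * T)
        ce = F.cross_entropy(student_logits, labels)
        return alpha * kd + (1 - alpha) * ce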
Binarized Neural Networks (BNNs)
TVM
challenges with large scale models
deep neural networks are:
computationally expensive
memory intensive
hindering their deployment in:
devices with low memory resources
applications with strict latency requirements
other issues:
data security: models tend to memorize everything, including PII
bias (e.g. profanity): models are trained on large-scale public data
self discovering: instead of manually configuring conversational flows, automatically discover them from your data
self training: let your system train itself with new examples
self managing: let your system optimize by itself
knowledge distillation
depth maps
applications
computer graphics (z-buffering, subsurface scattering, ...)
autonomous navigation
SLAM systems
object tracking
3D reconstruction
defocus
augmented reality
OpenSlam's Gmapping: the gmapping package provides laser-based SLAM (Simultaneous Localization and Mapping): http://wiki.ros.org/gmapping
amcl: a probabilistic localization system for a robot moving in 2D: http://wiki.ros.org/amcl
navigation: a 2D navigation stack that takes in information from odometry, sensor streams, and a goal pose and outputs safe velocity commands that are sent to a mobile base: http://wiki.ros.org/navigation
hector_slam: a metapackage that installs hector_mapping and related packages: http://wiki.ros.org/hector_slam
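A minimal sketch of feeding that navigation stack a goal pose through move_base (ROS 1, Python); the goal coordinates and frame are assumptions.

    import rospy
    import actionlib
    from move_base_msgs.msg import MoveBaseAction, MoveBaseGoal

    rospy.init_node("send_nav_goal")
    client = actionlib.SimpleActionClient("move_base", MoveBaseAction)
    client.wait_for_server()

    goal = MoveBaseGoal()
    goal.target_pose.header.frame_id = "map"        # assumed fixed frame
    goal.target_pose.header.stamp = rospy.Time.now()
    goal.target_pose.pose.position.x = 1.0          # 1 m ahead in the map frame (assumed)
    goal.target_pose.pose.orientation.w = 1.0       # keep the current heading

    client.send_goal(goal)                          # the navigation stack emits the velocity commands
    client.wait_for_result()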
Traffic, crowds, and traffic lights
Anti-vandalism
Identify road conditions for optimized navigation
Event based scenario
Self collected dataset by integrated cameras on scooters
Data is different from what we can capture with eye-level detection
Labeling data
Testing models
Extreme condition
Low light condition
Night vision:
Using different cameras
Using different methods
Fine-tune the model to work at night
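A minimal sketch of that fine-tuning step, assuming a recent torchvision classifier backbone; the dataset loader, number of classes, and hyperparameters are assumptions.

    import torch
    import torchvision

    model = torchvision.models.resnet18(weights="IMAGENET1K_V1")
    for p in model.parameters():
        p.requires_grad = False                              # freeze the day-time features
    model.fc = torch.nn.Linear(model.fc.in_features, 5)      # new head, 5 classes (assumed)

    optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
    loss_fn = torch.nn.CrossEntropyLoss()

    def train_one_epoch(night_loader):                       # night_loader: low-light images (assumed to exist)
        model.train()
        for images, labels in night_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()
            optimizer.step()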
Summary of trip:
Understand environment
Landmark detection
Best picture
Nice view
Gesture detection
Start/stop recording
Start/stop picture
Combine the two camera images to capture the person and the landmark together
Using a drone:
Understand environment
Observe the state of scooters
Observe the crowded place
People
Scooter
Parking zone
Summary of the trip sent to the user
Improve positioning with VPS
Realtime detection
Parking
Monitoring the city for maintenance
Visualize data in 3D
City/Street health check
Transferring data to the cloud, and processing and storing it in the best way
Safety:
Unsafe environments
Optical flow for movements (see the sketch after this list)
Detect vehicles and pedestrians
Avoid hitting obstacles
Obstacles
Road problem
Traffic, crowds, and traffic lights
Small information
Identify road conditions for optimized navigation
Identify hot spots for rebalancing
Bicycle lane optimization at junctions
Improve positioning with VPS
Detect car parking spaces in real-time
Help cities to maintain and plan their infrastructure
Visualize data in a 3D environment with VR/AR glasses for better understanding
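A minimal sketch of the optical-flow item above, using OpenCV's Farneback method on two consecutive frames; the file names and motion threshold are assumptions.

    import cv2

    prev = cv2.cvtColor(cv2.imread("frame_0.png"), cv2.COLOR_BGR2GRAY)
    curr = cv2.cvtColor(cv2.imread("frame_1.png"), cv2.COLOR_BGR2GRAY)

    flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                        pyr_scale=0.5, levels=3, winsize=15,
                                        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
    magnitude, angle = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    moving = magnitude > 2.0                    # pixels whose displacement exceeds the threshold
    print("moving pixels:", int(moving.sum()))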
Extreme condition:
Low light condition
Event-based scenario:
Hitting
Conflict
Accident
Presentation:
My topics related to the humanoid robot and computer vision algorithms
Large motions, specular reflections, motion blur, defocus blur, atmospheric effects
Do you want to know about scene flow? A scene-flow-based motion estimator can also detect animals that move along the viewing direction, while 2D vision-based motion estimation can only detect animals that move perpendicular to the direction of driving.
3D Semantic Scene Understanding: The world around us exists spatially in 3D, and it is crucial to understand real-world scenes in 3D to enable virtual or robotic interactions with such environments. We investigate machine learning approaches to infer semantic understanding of real-world scenes and the objects inside them from visual data, including images and depth/3D observations.
Generating 3D Models From Visual Data: Imagine creating 3D photos, holograms, or your own custom video game content from a quick video observation. We develop generative 3D models from 2D or 3D observations, focusing on indoor environments.
3D vision-guided robotic solutions suited to customer needs
visual odometry and SLAM
3D geometries
LiDAR point cloud processing, 3D analysis software, point cloud registration, point cloud instance segmentation
sensor fusion using Camera and LiDAR using Deep Learning
3D mapping worlds
2D/3D single- and two-stage Object Detection
Semantic, Instance, Panoptic Segmentation (incl. Lane Markings, Free Space etc.)
Dense Depth and Optical Flow Estimation (SLAM)
Early, Mid and Late Fusion concepts in multi-modal sensor frontends (incl. LiDAR, Radar, IR)
Ensembles, knowledge distillation, regularization, multi-task learning, compression, quantization …
System-level AV integration, ROS
Traditional computer vision (particularly around low-level pre-processing of image sensor data)
3D SLAM on our LiDAR data (SLAM, IMU, ROS)
Detection of moving objects /people with a moving 3D LiDAR (ROS, PCL)
Build an IOT Cloud for 3D LiDAR data processing (IOT Frameworks, ROS)
Reliably find markers in 3D LiDAR data (ROS, PCL)
Implementation of realtime point cloud processing in embedded systems (ARM Cortex, ROS, Linux)
Object classification of 3D LiDAR data
Create a web based visualization for ROS LiDAR data (Javascript, ROS)
Create a LiDAR data showcase with Web technology (Javascript)
3D model of the city for better understanding: SLAM, accurate positioning system
Crowd flow of people: fire alarm in ...
3D model of the city: buildings, planning, minimizing wind around buildings, LiDAR, wind flow
Update the 3D model in real time
Pothole: road condition