Hardware

Jetson Nano + OpenCV AI KIT OAK-D-LITE depth camera

Hardware for Deep Learning (machine learning)

My experience

Raspberry Pi 4

Smart AI IoT, Robotic, 3D SLAM, AR, VR

RISC-V

I have worked with many different hardware platforms, such as:

Camera

What is important?

Scaled-YOLOv4: scaling model based on hardware

Cost

How to use computer vision with deep learning on IoT devices. Running machine learning inference on the edge requires some extra steps.

Use specialized frameworks or libraries for edge devices:

In some cases you need to enhance the model for inference. There are many techniques you can use, such as:

How

Jetson Nano + OpenCV AI KIT OAK-D-LITE depth camera

Camera
- Camera specs (color camera / stereo pair):
  - Sensor: IMX214 (PY014 AF, PY114 FF) / OV7251 (PY013)
  - DFOV / HFOV / VFOV: 81° / 69° / 54° (color), 86° / 73° / 58° (stereo)
  - Resolution: 13MP (4208x3120) (color), 480P (640x480) (stereo)
  - Focus: AF 8cm - ∞ or FF 50cm - ∞ (color), fixed-focus 6.5cm - ∞ (stereo)
  - Max framerate: 35 FPS (color), 120 FPS (stereo)
- Dimensions: width 91 mm, height 28 mm, length 17.5 mm, baseline 75 mm, weight 61 g
- Chips:
  - Robotics Vision Core 2 (RVC2 for short)
  - The Myriad X is integrated into the Robotics Vision Core 2

Speed ML
- Model name, Input size, FPS, Latency [ms]
  - MobileOne S0, 224x224, 165.5, 11.1
  - YoloV8n, 416x416, 31.3, 56.9
  - YoloV8n, 640x640, 14.3, 123.6
  - YoloV8s, 416x416, 15.2, 111.9
  - YoloV8m, 416x416, 6.0, 273.8

Hardware for Deep Learning (machine learning)

https://www.tiziran.com/topics/hardware

I have experimented with many different hardware platforms for training and running deep learning applications. The list below gives my suggestions, comparisons, and expectations for each of them, covering embedded AI as well as distributed data-parallel and distributed model-parallel solutions.

#hardware #deep_learning #IoT #training_machine_learning_model #tiziran

Laptop:

NVIDIA Geforce RTX 3080 Ti
- Razer Blade 17: 17.3-inch gaming laptop (NVIDIA GeForce RTX 3080 Ti, Intel i9-12900H, 4K UHD 144 Hz display, 32 GB DDR5 RAM, 1 TB SSD)

Desktop

eGPU
- Razer RC21-01310100-R351 Core X External Graphics Card Case = ~ 300 Euro + GPU
- Cooler Master MasterCase EG200 External GPU Enclosure - Thunderbolt 3 Compatible eGPU Enclosure, 1 PWM 92mm Fan, V550 SFX Gold Fully Modular PSU, USB Hub, Vertical Laptop Support - EU Plug = ~ 300 Euro + GPU
GPU
- NVIDIA GeForce RTX 3090, 24 GB GDDR6X, 384-bit
- MSI GeForce RTX 3090 GAMING TRIO 24G Gaming Graphics Card - NVIDIA RTX 3090, GPU 1740MHz, 24GB GDDR6X memory = ~ 2800 Euro

IoT:

Raspberry Pi 3 (you need an accelerator)
Raspberry Pi 4 (you need an accelerator)
Intel® Neural Compute Stick 2
- Intel® Distribution of OpenVINO™ Toolkit
- I attached it to a Raspberry Pi 4 over USB 3, and it works very well for many deep learning models
Google Coral
- I attached it to a Raspberry Pi 4 over USB 3, and it works very well for TensorFlow models
- Why TensorFlow Lite on the edge: lightweight, low latency, privacy, improved power consumption, efficient models ready to use
NVIDIA Jetson Nano (2 GB and 4 GB RAM)
- I tested Multi-Class Multi-Object Multi-Camera Tracking (MCMOMCT) on it; under heavy workloads it can run for up to 30 minutes
NVIDIA Jetson AGX Xavier
NVIDIA Jetson AGX Orin = ~ 1900 Euro
- Compared with the AGX Xavier, the NVIDIA Jetson AGX Orin offers 8x the AI performance and a more advanced Ampere GPU, CPU, memory, and storage
OpenCV AI Kit
- OAK = ~ 100 Euro
- OAK-D = ~ 200 Euro
- OAK-D + WiFi = ~ 250 Euro
- OAK-D-PoE = ~ 250 Euro
- OAK-D Lite = ~ 100 Euro

My experience

I have tested many different hardware platforms for computer vision applications in the areas of IoT and robotics.

AI Edge: how to run inference for deep learning models on edge/IoT devices; enabling efficient, high-performance inference; accelerators and optimization for deep learning

#computervision #AI #objectdetection #objecttracking #ml #research #CNN #gans #convolutionalneuralnetworks #ai #vr #reinforcementlearning #mlops #aiforbusiness #science #researcher #phd #cameracalibration #opticalflow #videostablization #humanoidrobot #localization #3dSLAM #reconstruction #pointcloud #mixedreality #edgecomputing #raspberrypi #intelstick #googlecoral #jetsonnano #nvidiavgpu #tensorflowjs #pytorch #opencv #aikit #caffee #DIGITS #c++ #python #ubuntu #farshidpirahansiah #tiziran.com #farshid #pirahansiah #robotics #tiziran.com #farshid #pirahansiah #MultiCameraMultiClassMultiObjectTracking #deeplearning #machinelearning #artificialintelligence #tensorflow #robotics #3dvision #sterovision #depthmap #RCNN #machinevision #imageprocessing #patternrecognition #compiler #RISC-V #RNN #fullStackDeepLearning #productinnovation #patents #TensorRT #ApacheTVM #TFLite #PyTorchmobile #dockers #gRPC #RESTAPIs #GRPC #GraphQL #imageprocessing #patternrecognition

Raspberry Pi 4

How to update the Raspberry Pi 4 EEPROM with the boot recovery image (released 2020-09-14) in order to install and boot from USB 3 (SSD):

Update the Raspberry Pi 4 EEPROM with the boot recovery image
Install Ubuntu 20 on the SSD
Edit config.txt and add "program_usb_boot_mode=1" at the end of the file
Remove the microSD card and boot from the SSD
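
The config.txt step above can be scripted so that the setting is added only once. This is a minimal sketch, not an official tool: the helper names `with_config_line` and `update_config` are made up for illustration, and the location of config.txt depends on the OS image, so any path you pass in is an assumption you should check on your own system.

```python
from pathlib import Path


def with_config_line(text: str, line: str) -> str:
    """Return `text` with `line` appended, unless an identical line already exists."""
    if line in (existing.strip() for existing in text.splitlines()):
        return text  # already configured, nothing to do
    if text and not text.endswith("\n"):
        text += "\n"  # keep the new setting on its own line
    return text + line + "\n"


def update_config(path: Path, line: str) -> None:
    """Rewrite the file at `path` only if `line` is missing (hypothetical helper)."""
    original = path.read_text()
    updated = with_config_line(original, line)
    if updated != original:
        path.write_text(updated)


# Hypothetical config.txt content, used only to show the behaviour.
example = with_config_line("arm_64bit=1", "program_usb_boot_mode=1")
```

Calling `update_config(Path(...), "program_usb_boot_mode=1")` with the real path of your config.txt applies the same change to the file, and running it twice leaves the file unchanged.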

Smart AI IoT, Robotic, 3D SLAM, AR, VR

3D printed humanoid robot: NimbRo-OP2 and NimbRo-OP2X hardware

RISC-V

I have worked with many different hardware platforms, such as:

Raspberry Pi 3
Raspberry Pi 4
Intel® Neural Compute Stick 2
- Intel® Distribution of OpenVINO™ Toolkit
- I attached it to a Raspberry Pi 4 over USB 3, and it works very well for many deep learning models
Google Coral
- I attached it to a Raspberry Pi 4 over USB 3, and it works very well for TensorFlow models
- Why TensorFlow Lite on the edge: lightweight, low latency, privacy, improved power consumption, efficient models ready to use
NVIDIA Jetson Nano
- I tested Multi-Class Multi-Object Multi-Camera Tracking (MCMOMCT) on it; under heavy workloads it can run for up to 30 minutes
NVIDIA Jetson AGX Xavier
- The best hardware
- I have attended many conferences and summits on hardware for deep learning, such as:
  - AI Hardware Europe Summit (July 2020)
  - The Edge AI & Brain Inspired Computing (November 2020)
  - Apache TVM And Deep Learning Compilation Conference (December 2020)
  - RISC-V Summit (December 2020)
OpenCV AI Kit

Camera

I worked with many different cameras such as:

Camera Module V1
Camera Module V2
Camera Module V2.1
multispectral camera
USB webcam
IP camera
high resolution camera > 8K
depth camera
stereo camera

What is important?

camera calibration is important
Quantum efficiency [%] (spectral response)
Sensor size [inches or mm] and pixel size [micrometers]
Dynamic range [dB]
Image noise and signal-to-noise ratio (SNR), PSNR, SSIM: a greater SNR yields better contrast and clarity, as well as improved low-light performance
Interface, max cable length [m], max bandwidth [MB/s], multi-camera, cable costs, real time, plug and play
- FireWire, 4.5, 64, *, *, **, **
- GigE, 100, 100, **, **, *, *
- USB, 8, 350, *, *, **, **
- Camera Link, 10, 850, -, -, **, -
- USB-C, 10, 40 GB, , , ,
Distortions, scaling factors, and quality are important; calculate the minimum sensor resolution, then determine your sensor size and focal length
- sensor resolution = image resolution = 2 * (field of view (FOV) / smallest feature)
Some online tools: baslerweb.com, edmundoptics.com, flir.com
To sum up:
- Use a USB-C camera. It will help with future hardware upgrades and is easy to use, with fewer issues
- Find your best trade-off between working distance (WD) and FOV
- Sometimes you cannot have everything in life!
- Your lens aperture (f/#) is your friend, use it!
- A larger depth of field (DOF) requires a larger f/#
- Lens performance curves are the ultimate documentation to read when selecting a lens
- Understanding them properly requires good knowledge of optics, but it is totally worth it
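
The minimum sensor resolution rule from the list above (two pixels per smallest feature, along each axis) is easy to check with a few lines of code. This is a small sketch: the function name and the 400 mm x 300 mm field of view with a 0.5 mm feature are hypothetical values chosen only for illustration.

```python
import math


def min_sensor_resolution(fov_mm: float, smallest_feature_mm: float) -> int:
    """Minimum pixel count along one axis: 2 * (FOV / smallest feature)."""
    if fov_mm <= 0 or smallest_feature_mm <= 0:
        raise ValueError("FOV and feature size must be positive")
    return math.ceil(2 * fov_mm / smallest_feature_mm)


# Hypothetical inspection task: 400 mm x 300 mm field of view, 0.5 mm defects.
width_px = min_sensor_resolution(400, 0.5)    # 1600 pixels
height_px = min_sensor_resolution(300, 0.5)   # 1200 pixels
megapixels = width_px * height_px / 1e6       # 1.92 MP
```

Apply the formula separately to the horizontal and vertical FOV, then pick a sensor whose resolution meets or exceeds both results.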

Scaled-YOLOv4: scaling model based on hardware

Cost

How much does a patent cost?
Mobile, Open Hardware, RISC-V System-on-Chip (SoC) Development Kit
Hardware
- NVIDIA Jetson Xavier NX Developer Kit
- WIFI
- SparkFun GPS-RTK Dead Reckoning pHAT
- Micro Sd card
- Mophie Powerstation USB C 20000
- ZED 2 Stereo Camera
- 3D-printed box
AWS
- AWS S3
- AWS ml.p2.xlarge EC2 instances
- AWS Sagemaker
Hackboard 2 with Ubuntu Linux ($99), Intel CPU
3D printed humanoid robot: NimbRo-OP2 and NimbRo-OP2X hardware
Ship products to customers with:
- easyship
- fulfillmentcrowd
- ChinaDivision
- ORQA FPV
- floship

Update 26.April.2021

How to use computer vision with deep learning on IoT devices. Running machine learning inference on the edge requires some extra steps.

I tested several hardware platforms, such as Raspberry Pi 3, Raspberry Pi 4, Intel® Neural Compute Stick 2, OpenCV AI Kit, Google Coral, and NVIDIA Jetson Nano, with different operating systems: real-time operating systems (RTOS), NASA cFS (core Flight System), and Real-Time Executive for Multiprocessor Systems (RTEMS).

anomaly detection, object detection, object tracking, ...

Use specialized frameworks or libraries for edge devices:

NVIDIA TensorRT
TensorFlow Lite: TensorFlow Lite on microcontrollers (gesture recognition), OpenMV/TensorFlow, studio.edgeimpulse.com
TensorFlow.js
PyTorch Lightning
PyTorch Mobile
Intel® Distribution of OpenVINO Toolkit
CoreML
ML kit
FRITZ
MediaPipe
Apache TVM
TinyML: enabling ultra-low-power machine learning at the edge; tiny machine learning with Arduino
Libraries: ffmpeg, GStreamer, Celery
GPU libraries for Python: PyCUDA, NumbaPro, PyOpenCL, CuPy

Moreover, consider which deep learning model suits your specific hardware from the very first stage.

In some cases you need to enhance the model for inference. There are many techniques you can use, such as:

Pruning
Quantization
Distillation Techniques
Binarized Neural Networks (BNNs)
Apache TVM (incubating) is a compiler stack for deep learning systems
Distributed machine learning and load balancing strategy
Low rank matrix factorization (LRMF)
Compact convolutional filters (Video/CNN)
Knowledge distillation
Neural Networks Compression Framework (NNCF)
Parallel programming

How

Pruning

Model pruning reduces redundant parameters to which the performance is not sensitive. Aim: remove all connections whose absolute weights are below a threshold. 🤔 Go for a bigger network with many layers and then prune it: this works much better and faster.
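
The threshold rule above can be sketched in a few lines. This is only an illustration in plain Python lists to stay dependency-free; the function names and the example weights are hypothetical, and a real project would apply the same idea through the pruning tools of its deep learning framework.

```python
def prune_by_magnitude(weights, threshold):
    """Zero every connection whose absolute weight is below `threshold`."""
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]


def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    flat = [w for row in weights for w in row]
    return sum(1 for w in flat if w == 0.0) / len(flat)


# Hypothetical 2x3 layer, used only to show the effect of the threshold.
layer = [[0.80, -0.02, 0.05],
         [-0.40, 0.01, 0.30]]
pruned = prune_by_magnitude(layer, threshold=0.1)
pruned_fraction = sparsity(pruned)  # half of the connections are removed
```

The zeros only save memory and time when the runtime can exploit sparsity, and in practice pruning is usually followed by a short fine-tuning run to recover accuracy.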

Quantization

The best way is to use Google's library, which supports the most comprehensive set of methods.

Quantization compresses a model by reducing the number of bits used to represent the weights. It effectively constrains the number of different weights we can use inside our kernels. Per-channel quantization of weights improves performance through model compression and latency reduction.
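
To make per-channel weight quantization concrete, here is a minimal sketch of symmetric 8-bit quantization in plain Python. The symmetric scheme, the function names, and the example weights are my assumptions for illustration; this is not the API of any particular library.

```python
def quantize_per_channel(weights, num_bits=8):
    """Symmetric quantization with one scale per output channel (row)."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    quantized, scales = [], []
    for channel in weights:
        max_abs = max(abs(w) for w in channel)
        scale = max_abs / qmax if max_abs > 0 else 1.0
        # Round to the nearest integer level and clamp to the signed range.
        quantized.append([max(-qmax, min(qmax, round(w / scale))) for w in channel])
        scales.append(scale)
    return quantized, scales


def dequantize(quantized, scales):
    """Map the integer levels back to approximate float weights."""
    return [[q * s for q in channel] for channel, s in zip(quantized, scales)]


# Hypothetical weights: the two channels have very different value ranges.
weights = [[0.5, -1.0, 0.25],
           [0.01, 0.02, -0.04]]
q, scales = quantize_per_channel(weights)
restored = dequantize(q, scales)
```

Because each channel gets its own scale, the small-valued second channel keeps its resolution instead of being squeezed into the range of the first one.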

Distillation Techniques

Knowledge distillation means training a compact neural network with the distilled knowledge of a large model. The knowledge is transferred from an ensemble of big networks into a much smaller network, which learns directly from the cumbersome model's outputs and is lighter to deploy.

Distill-Net: Application-Specific Distillation of Deep Convolutional Neural Networks for Resource-Constrained IoT Platforms
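
One common way to let a small network learn from a large model's outputs is a temperature-softened KL divergence between the two output distributions. The sketch below assumes that formulation (this page does not say which loss was used), and the function names, temperature, and logits are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax with temperature; higher values give softer probabilities."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened outputs, scaled by temperature squared."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical logits for a 3-class problem.
teacher = [4.0, 1.0, 0.5]
loss_far = distillation_loss([0.5, 1.0, 4.0], teacher)   # student disagrees
loss_same = distillation_loss(teacher, teacher)          # student matches
```

In training, this term is typically combined with the ordinary cross-entropy loss on the true labels.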

Binarized Neural Networks (BNNs)

BNNs are not supported by GPU hardware such as the Jetson Nano; implementations are mostly CPU-based.
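
As a rough illustration of why binarized networks map onto simple bitwise operations, the sketch below binarizes two vectors to +1/-1, packs them into integers, and computes their dot product with XOR and a bit count. The function names and values are hypothetical, and this shows only the core arithmetic, not a full BNN layer.

```python
def binarize(values):
    """Replace each value by its sign: +1 for non-negative, -1 otherwise."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a +1/-1 vector into an integer, one bit per element (+1 -> 1)."""
    bits = 0
    for i, s in enumerate(signs):
        if s == 1:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n."""
    differing = bin(a_bits ^ b_bits).count("1")  # positions where signs differ
    return n - 2 * differing


# Hypothetical weights and activations.
w = [0.7, -0.2, 0.1, -0.9, 0.3]
x = [1.5, 0.3, -0.4, -2.0, 0.8]
reference = sum(a * b for a, b in zip(binarize(w), binarize(x)))
fast = binary_dot(pack_bits(binarize(w)), pack_bits(binarize(x)), len(w))
```

Both `reference` and `fast` give the same result; the packed version replaces multiplications with one XOR and one bit count.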

Apache TVM (incubating) is a compiler stack for deep learning systems

Challenges with large-scale deep neural networks:
- computationally expensive
- memory intensive

These hinder their deployment in:
- devices with low memory resources
- applications with strict latency requirements

Other issues:
- data security: models tend to memorize everything, including PII
- bias, e.g. profanity: models are trained on large-scale public data
- self discovering: instead of manually configuring conversational flows, automatically discover them from your data
- self training: let your system train itself with new examples
- self managing: let your system optimize by itself
- knowledge distillation

Distributed machine learning and load balancing strategy

Run models that use all available processing power (CPU, GPU, DSP, AI chip) together to improve inference performance. Dynamic pruning of kernels aims at parsimonious inference by learning to exploit and dynamically remove the redundant capacity of a CNN architecture. Partitioning techniques based on convolution layer fusion dynamically select the optimal partition according to the availability of computational resources and network conditions.
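
The partition selection described above can be illustrated with a simple model: run the first layers on the edge device, send the intermediate data over the network, and run the rest on a faster machine. The sketch below tries every split point and keeps the one with the lowest total latency. All timings are made-up numbers and `best_split` is a hypothetical helper, not part of any framework.

```python
def best_split(device_ms, server_ms, transfer_ms):
    """Choose how many leading layers to run on the edge device.

    device_ms[i] and server_ms[i] are the times of layer i on each side.
    transfer_ms[k] is the time to ship the data after the first k layers ran
    locally (k = 0 sends the raw input, k = number of layers sends nothing).
    Returns (k, total latency in ms).
    """
    n = len(device_ms)
    if len(server_ms) != n or len(transfer_ms) != n + 1:
        raise ValueError("expected n layer costs per side and n + 1 transfer costs")

    def latency(k):
        return sum(device_ms[:k]) + transfer_ms[k] + sum(server_ms[k:])

    k = min(range(n + 1), key=latency)
    return k, latency(k)


# Hypothetical per-layer timings in milliseconds for a 4-layer model.
device = [10, 20, 30, 40]
server = [1, 2, 3, 4]
fast_network = best_split(device, server, [50, 30, 5, 5, 0])
slow_network = best_split(device, server, [500, 300, 200, 150, 0])
```

With the fast network the best choice is to run two layers locally and offload the rest; with the slow network everything stays on the device. Re-running the selection whenever the measured timings change gives the dynamic behaviour.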

Low rank matrix factorization (LRMF)

There are latent structures in the data; by uncovering them we can obtain a compressed representation of the data. LRMF factorizes the original matrix into lower-rank matrices while preserving latent structures and addressing the issue of sparseness.
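
The saving from a low-rank factorization is easiest to see by counting parameters: a matrix W of size m x n is replaced by U (m x r) and V (r x n). The sketch below shows only the deployment side, using a small matrix that is exactly rank 2 by construction; finding the factors for a trained layer (for example with a truncated SVD) is a separate step not shown here. All names and numbers are hypothetical.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matvec(m, v):
    """Matrix-vector product."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def dense_params(rows, cols):
    return rows * cols


def factorized_params(rows, cols, rank):
    return rank * (rows + cols)


# A 4x6 matrix built as U @ V, so it has rank 2 by construction.
U = [[1, 0], [0, 1], [1, 1], [2, -1]]
V = [[1, 2, 0, 1, 3, 0], [0, 1, 1, 0, 2, 1]]
W = matmul(U, V)
x = [1, 2, 3, 4, 5, 6]
direct = matvec(W, x)                # one large layer
factored = matvec(U, matvec(V, x))   # two thin layers, same output

# A 1024x1024 layer at rank 64 needs 8x fewer parameters.
ratio = dense_params(1024, 1024) / factorized_params(1024, 1024, 64)
```

For real layers the factorization is approximate, so the chosen rank trades accuracy against size and speed.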

Compact convolutional filters (Video/CNN)

Design special structural convolutional filters to save parameters: replace over-parameterized filters with compact filters to achieve an overall speedup while maintaining comparable accuracy.
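
One well-known example of a compact filter design is the depthwise separable convolution used in MobileNet-style networks; I use it here only as an illustration, since the text above does not name a specific design. The sketch compares parameter counts (biases ignored) for a hypothetical layer with 128 input and 256 output channels.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Depthwise (one kernel x kernel filter per input channel) plus 1x1 pointwise."""
    return kernel * kernel * c_in + c_in * c_out


# Hypothetical layer: 3x3 kernel, 128 input channels, 256 output channels.
standard = standard_conv_params(3, 128, 256)           # 294912
separable = depthwise_separable_params(3, 128, 256)    # 33920
reduction = standard / separable                       # roughly 8.7x fewer weights
```

The accuracy cost of this replacement depends on the task, so it should be validated on your own data.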

Knowledge distillation

Neural Networks Compression Framework (NNCF)

If the objects are large, we do not need small anchors. In MobileNet we can remove the small part of the network that is related to small objects, and in YOLO we can reduce the number of anchors. Decreasing the input image size also speeds up inference, but it reduces the accuracy.
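
The effect of the input size can be estimated before running anything: the work in convolution layers grows roughly with the number of input pixels. The sketch below applies that rule of thumb (a simplification, and my assumption) to the YoloV8n figures from the speed list on this page, 31.3 FPS at 416x416 and 14.3 FPS at 640x640.

```python
def relative_conv_cost(new_size: int, old_size: int) -> float:
    """Approximate convolution cost ratio for square inputs (scales with pixel count)."""
    return (new_size / old_size) ** 2


cost_ratio = relative_conv_cost(416, 640)   # about 0.42 of the original work
predicted_speedup = 1 / cost_ratio          # about 2.4x
measured_speedup = 31.3 / 14.3              # about 2.2x from the FPS figures
```

The measured speedup is a little lower than the estimate, which fits the idea that part of the pipeline does not shrink with the image.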

Parallel programming, clean code, and design patterns.
