Full Stack Deep Learning
FSDL 2022
Lecture 07
https://www.tiziran.com/topics/courses/fsdl
Note on Full Stack Deep Learning FSDL 2022;
#Full_Stack_Deep_Learning #tiziran #Farshid_PirahanSiah
Lecture 06
Lecture 05
Lecture 06: Continual Learning (FSDL 2022)
- what metrics to monitor
- Outcomes and feedback from users
- Model performance metrics
- Proxy metrics
- Data quality testing
- accuracy
- completeness
- consistency
- timeliness
- validity
- integrity
- Distribution drift
- type
- instantaneous drift
- gradual drift
- periodic drifts
- temporary drift
- measure
- reference window
- metric
- 1D: KL divergence, KS (Kolmogorov-Smirnov) test
- dealing with high-dimensional data
- *projections*
- system metrics
- how to tell if those metrics are "bad"
- KS-Test
- good
- 1 fixed rule
- 2 specified range
- 3 predicted range
- 4 unsupervised detection -
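A minimal sketch of the drift notes above: compare a reference window against a current window with the two-sample KS statistic, then apply a fixed rule. The window names, sample sizes, and the `alpha` default are hypothetical. The threshold uses the standard large-sample critical value, which is an approximation; with SciPy available, `scipy.stats.ks_2samp` would be the usual choice instead of this hand-rolled version.

```python
import math
import random


def ks_statistic(reference, current):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    n, m = len(ref), len(cur)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(ref[i], cur[j])
        # Advance both pointers past every value equal to x (handles ties).
        while i < n and ref[i] == x:
            i += 1
        while j < m and cur[j] == x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def drifted(reference, current, alpha=0.05):
    """Fixed rule: flag drift when D exceeds the large-sample critical value."""
    n, m = len(reference), len(current)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    threshold = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, current) > threshold


# Hypothetical 1D feature values: one reference window, one shifted window.
rng = random.Random(0)
reference_window = [rng.gauss(0.0, 1.0) for _ in range(500)]
shifted_window = [rng.gauss(1.0, 1.0) for _ in range(500)]

print(drifted(reference_window, reference_window))  # identical data: D is 0
print(drifted(reference_window, shifted_window))    # mean shifted by 1.0
```

A fixed threshold like this is the simplest of the four options listed; a specified or predicted range would replace the constant `threshold` with bounds learned from history.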
- tools for monitoring
- system monitoring tools
- datadog
- honeycomb.io
- NewRelic
- amazon cloudwatch
- OSS ML monitoring: Evidently AI, whylogs
- 1 logging
- profiling
- sampling
- 2 curation
- L1: just sample randomly
- L2: stratified sampling
- L3: curate "interesting" data
- manually
- similarity-based curation
- projection-based curation
- automatically curating data using active learning
- scoring function
- most uncertain
- highest predicted loss
- most different from labels
- most representative
- big impact on training
- tools: scale nucleus, data-centric ML tools
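As an illustration of the "most uncertain" scoring function above, here is a small sketch that ranks unlabeled examples by the entropy of their predicted class probabilities. The example ids and probabilities are made up, and entropy is only one possible uncertainty score; the notes do not say which one the lecture used.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def most_uncertain(predictions, k):
    """Return the ids of the k examples with the highest prediction entropy."""
    ranked = sorted(
        predictions,
        key=lambda ex_id: entropy(predictions[ex_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical softmax outputs for four unlabeled production examples.
predictions = {
    "img_001": [0.98, 0.01, 0.01],
    "img_002": [0.40, 0.35, 0.25],
    "img_003": [0.70, 0.20, 0.10],
    "img_004": [0.34, 0.33, 0.33],
}

to_label = most_uncertain(predictions, k=2)
print(to_label)  # ['img_004', 'img_002']
```

The other scoring functions in the list would slot into the same shape by swapping the `key` function.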
- 3 retraining triggers
- based on performance
- online learning
- 4 dataset formation
- 1: train on all available data
- 2: sliding window
- 3: online batch selection
- 4: continual fine-tuning
- 5: offline testing
- dynamic
- expectation tests
- 6: online testing
- shadow mode, A/B test, roll out gradually, roll back, ...
- trying it all
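The "sliding window" option under dataset formation can be sketched as a simple time filter over logged examples. The record layout (`id`, `logged_at`) and the 30-day window are assumptions for illustration only.

```python
from datetime import datetime, timedelta


def sliding_window(examples, now, window_days):
    """Keep only examples logged within the last `window_days` days."""
    cutoff = now - timedelta(days=window_days)
    return [ex for ex in examples if ex["logged_at"] >= cutoff]


# Hypothetical logged examples with timestamps.
now = datetime(2022, 9, 30)
examples = [
    {"id": 1, "logged_at": datetime(2022, 6, 1)},
    {"id": 2, "logged_at": datetime(2022, 9, 10)},
    {"id": 3, "logged_at": datetime(2022, 9, 29)},
]

train_set = sliding_window(examples, now, window_days=30)
print([ex["id"] for ex in train_set])  # [2, 3]
```

Training on all available data is the same call with the filter removed; the window trades history for freshness.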
Lecture 04
Lecture 04: Data Management (FSDL 2022)
- fixing/adding/augmenting data: keep it simple
- data sources
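- filesystem: local disk speeds: NVMe M.2 SSD, latency: nice visualization -
- object storage: usually stores binary objects; can provide versioning and redundancy (e.g., S3)
- database: persistent, fast, scalable; working data held in RAM; store object-store URLs instead of binary data; e.g., Postgres, SQLite
- data warehouse: OLAP (analytical queries), in contrast to OLTP databases; loaded via ETL
- data lake: unstructured data; ELT (load first, transform later)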
- SQL and DataFrames
- SQL: structured
- DataFrames: pandas; Dask parallelizes pandas, RAPIDS runs pandas-style operations on GPUs
- Airflow: specify the DAG of tasks using python
- Prefect
- Dagster
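To make "specify the DAG of tasks using python" concrete, here is a sketch using only the standard library's `graphlib`. This is not the Airflow, Prefect, or Dagster API; the task names and the `run` stand-in are hypothetical, and a real orchestrator adds scheduling, retries, and distributed execution on top of the same dependency idea.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dag = {
    "extract": set(),
    "clean": {"extract"},
    "label": {"clean"},
    "featurize": {"clean"},
    "train": {"label", "featurize"},
}


def run(task, log):
    # Stand-in for real work; an orchestrator would launch a job here.
    log.append(task)


executed = []
# static_order() yields every task after all of its dependencies.
for task in TopologicalSorter(dag).static_order():
    run(task, executed)

print(executed)
```

`label` and `featurize` do not depend on each other, so their relative order is not fixed; that independence is what lets a workflow manager run them in parallel.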
- feature stores
- tecton.ai
- FEAST
- Featureform
- Hu
- Activeloop
- Labeling
- self-supervised learning
- image data augmentation
- HIVE
- scale.ai
- labelbox
- label studio **
- diffgram
- aquarium and scale nucleus
- weak supervision: snorkel.ai, rubrix
- data versioning: level 1 - level 3: DVC
- privacy
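The data-source notes above pair a database with object-store URLs and mention SQLite and SQL. A minimal sketch of that pattern, using Python's built-in `sqlite3`: the table name, columns, and `s3://hypothetical-bucket/...` URLs are all invented for illustration.

```python
import sqlite3

# In-memory SQLite database holding metadata; binary images stay in object storage.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE examples (id INTEGER PRIMARY KEY, label TEXT, image_url TEXT)"
)
conn.executemany(
    "INSERT INTO examples (label, image_url) VALUES (?, ?)",
    [
        ("cat", "s3://hypothetical-bucket/img_001.png"),
        ("dog", "s3://hypothetical-bucket/img_002.png"),
        ("cat", "s3://hypothetical-bucket/img_003.png"),
    ],
)

# Structured query: how many examples per label?
label_counts = conn.execute(
    "SELECT label, COUNT(*) FROM examples GROUP BY label ORDER BY label"
).fetchall()
conn.close()

print(label_counts)  # [('cat', 2), ('dog', 1)]
```

The same aggregation on a pandas DataFrame would be a `groupby` on `label`; the choice between the two is mostly about where the data already lives.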
Lecture 03: testing
Lecture 02: Development Infrastructure & Tooling (FSDL 2022)
Lecture 01 + Lab 1&2&3:
Notes: ResnetTransformer, teacher_forward, --precision 16, --limit_train_batches 10
Lecture 01: When to Use ML and Course Vision
Formulating the problem and estimating project cost
Sourcing, cleaning, processing, labeling, synthesizing, and augmenting data
Picking the right framework and compute infrastructure
Troubleshooting training and ensuring reproducibility
Deploying the model at scale
✨ Monitoring and continually improving the deployed model ✨
✨ How ML teams work and how to manage ML projects ✨
✨ Building on Large Language Models and other Foundation Models ✨