Full Stack Deep Learning

Common solutions for under-fitting or over-fitting: check the data-set, error analysis, choose a different model architecture, hyper-parameter tuning

Under-fitting (reducing bias): ⬆️ bigger model ⬇️ reduce regularization 🤔 error analysis 🤔 different model architecture 🤔 tune hyper-parameters ⬆️ add features

Over-fitting (reducing variance): ⬆️ add more training data ⬆️ add normalization (batch norm, layer norm) ⬆️ add data augmentation ⬆️ increase regularization (dropout, L2, weight decay) 🤔 error analysis 🤔 choose a different model architecture 🤔 tune hyper-parameters ⬇️ early stopping ⬇️ remove features ⬇️ reduce model size
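
The two checklists above can be tied to measured errors with a small helper. This is only a minimal sketch: the function name `diagnose_fit`, the `tolerance` threshold, and the idea of comparing against a `target_error` (for example human-level or baseline error) are illustrative assumptions, not something the notes prescribe.

```python
def diagnose_fit(train_error, val_error, target_error, tolerance=0.02):
    """Label a run as under-fitting and/or over-fitting.

    Assumption: errors are fractions in [0, 1] and `tolerance` is a
    hypothetical threshold chosen for illustration only.
    """
    issues = []
    # A large gap between training error and the target suggests high bias.
    if train_error - target_error > tolerance:
        issues.append("under-fitting")
    # A large gap between validation and training error suggests high variance.
    if val_error - train_error > tolerance:
        issues.append("over-fitting")
    return issues or ["ok"]


# First remedies to try, taken from the two checklists above.
REMEDIES = {
    "under-fitting": ["bigger model", "reduce regularization", "add features"],
    "over-fitting": ["add more training data", "increase regularization",
                     "add data augmentation"],
    "ok": [],
}

if __name__ == "__main__":
    for issue in diagnose_fit(train_error=0.20, val_error=0.40, target_error=0.05):
        print(issue, "->", REMEDIES[issue])
```

Both labels can be returned at once; in that case the under-fitting remedies are usually tried first, since variance is hard to judge before the model fits the training set.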

{

"root": {

"id": "fxrwupio",

"text": "<br><br>Full Stack Deep Learning<br><br><br>Farshid PirahanSiah<br><br><br>",

"notes": "<span style=\"color: rgb(61, 71, 77); font-family: Avenir, &quot;Segoe UI&quot;, Helvetica, Arial, sans-serif; font-size: 17.55px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 700; letter-spacing: normal; orphans: 2; text-align: left; text-indent: 0px; text-transform: none; white-space: pre-wrap; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial; display: inline !important; float: none;\">Full Stack Deep Learning\nhttps://fullstackdeeplearning.com/spring2021 </span>",

"layout": "graph-right",

"children": [

{

"id": "zxdfdkhy",

"text": "1: DL Fundamentals",

"notes": null,

"side": "right",

"children": [

{

"id": "syortvgz",

"text": "* Machine learning <br><br>* Supervised <br>Deep Convolutional Neural Networks (DCNN) Architecture<br>Visualizing and Understanding Convolutional Networks<br>Object Detection by Deep Learning <br>Style Transfer <br><br>* Semi-supervised <br>Deep Reinforcement learning&nbsp; (DLR)<br>DRL Applications<br><br>* Unsupervised learning <br>Auto Encoder <br><br>* Generative Adversarial Networks (GANs)<br>Adversarial attacks<br><br><br>pose estimation<br>Mesh R-CNN<br>3D<br>Style Transfer<br>",

"notes": null,

"children": [

{

"id": "zlkokxcu",

"text": "IEC 61508<br>ISO 26262: for automotive industry<br>",

"notes": null,

"children": [

{

"id": "icmrjdth",

"text": "appropriate level of rigour<br>",

"notes": null,

"children": [

{

"id": "vcvqgbjw",

"text": "safety",

"notes": null,

"children": [

{

"id": "osnwedee",

"text": "failures not harm people<br>",

"notes": null

},

{

"id": "fokyuzlj",

"text": "severity: the potential harm<br>exposure: the probability of occurrence<br>controllability: the ability of the system to avoid the specified harm<br>",

"notes": null

},

{

"id": "srsmurnw",

"text": "ASIL (automotive safety integrity levels): A &lt; B &lt; C &lt; D<br>",

"notes": null

}

]

},

{

"id": "qdzconfu",

"text": "intrinsic quality<br>",

"notes": null,

"children": [

{

"id": "kqchtziq",

"text": "simplicity,robustness,maintainability, testability<br>",

"notes": null

},

{

"id": "uhhpiaoq",

"text": "V-model",

"notes": null,

"children": [

{

"id": "zligwtna",

"text": "Requirements",

"notes": null

},

{

"id": "jqjvexkv",

"text": "Architecture",

"notes": null

},

{

"id": "pmarxogn",

"text": "Design",

"notes": null

},

{

"id": "uihsuolp",

"text": "Code",

"notes": null

},

{

"id": "wslrdkdd",

"text": "Unit test<br>",

"notes": null

},

{

"id": "grsxzdsl",

"text": "Integration test<br>",

"notes": null

},

{

"id": "ouwpxjbx",

"text": "system test<br>",

"notes": null

}

]

},

{

"id": "nhtqsxsj",

"text": "Process",

"notes": null,

"children": [

{

"id": "fiehpcxp",

"text": "Tool Aspects: <br>",

"notes": null

},

{

"id": "vcrvstmk",

"text": "Techniques: suggested to required<br>",

"notes": null

},

{

"id": "minzymnv",

"text": "Methodologies: <br>",

"notes": null

},

{

"id": "kgazizht",

"text": "Artefacts: <br>",

"notes": null

},

{

"id": "dbaddzgl",

"text": "Safety Aspects<br>",

"notes": null

}

]

}

]

}

]

}

]

}

]

},

{

"id": "owhbwhhl",

"text": "Google Colab:<br>!python --version<br>!pip list | grep tensor<br>!pip list | grep torch<br>!nvidia-smi",

"notes": null

}

]

},

{

"id": "nosjuhrh",

"text": "2A: CNNs",

"notes": null,

"side": "right",

"children": [

{

"id": "jggcazyj",

"text": "Why not FC<br>Convolutional filters<br>stacks CNN<br>strides<br>padding<br>(paper: A guid to convolution arithmetic for deep learning)",

"notes": null,

"shape": "box",

"children": [

{

"id": "zsakvvds",

"text": "size=(last size - filters + 2*padding)/ stride+1<br><br>parameters= layer * ( filter * filter * volume)",

"notes": null,

"shape": "box"

}

]

}

]

},

{

"id": "todulkhq",

"text": "2B: Computer Vision",

"notes": null,

"side": "right",

"children": [

{

"id": "hqxfotta",

"text": "AlexNet<br>ZFNet: deconvolution<br>VGG: 19,<br>GoogLeNet: inception module<br>ResNet: DenseNet, ResNeXt, 152 layers<br>SENet: <br>SqueezeNet: 50x less parameters",

"notes": null

},

{

"id": "nokmmuye",

"text": "Overfeat<br>YOLO V1,2,3,4 : you only look once<br>SSD: single shot detector",

"notes": null,

"children": [

{

"id": "vfdklxzo",

"text": "NMS: non-maximum suppression<br>IOU: intersection over union",

"notes": null

}

]

},

{

"id": "unsozaxj",

"text": "Region proposal methods (RPN):<br>R-CNN 2014<br>Faster R-CNN<br>RPN<br>Mask R-CNN (instance segmentation)",

"notes": null

},

{

"id": "tbdwwand",

"text": "Fully convolutional Nets",

"notes": null,

"children": [

{

"id": "swhurnxl",

"text": "Upsampling:<br>unpooling<br>transpose convolutions<br>dilated convolutions",

"notes": null

},

{

"id": "htaczxkq",

"text": "Mesh R-CNN: 3D shape inference",

"notes": null

}

]

}

]

},

{

"id": "skylajlz",

"text": "3 RNNs<br>",

"notes": null,

"side": "left",

"children": [

{

"id": "bkccfyou",

"text": "Image Captioning<br>Video processing: <br>video content understanding",

"notes": null,

"shape": "ellipse"

},

{

"id": "ekgvzqxy",

"text": "1 to Many<br>many to 1<br>many to many<br>",

"notes": null,

"shape": "ellipse"

},

{

"id": "rlncqiwv",

"text": "problem 1: variable length inputs<br>problem 2: memory scaling<br>problem 3: overkill",

"notes": null,

"shape": "ellipse"

},

{

"id": "aqbojirf",

"text": "LSTMs<br>GRUs",

"notes": null,

"shape": "ellipse"

},

{

"id": "dukpagvq",

"text": "residual connections<br>stacked LSTMs (encoder-decoder)<br>Attention",

"notes": null,

"shape": "ellipse"

},

{

"id": "ofmzbkmv",

"text": "Stacked biderictional LSTMs with attention<br>",

"notes": null,

"shape": "ellipse"

},

{

"id": "idojfeqq",

"text": "CTC loss<br>",

"notes": null,

"shape": "ellipse"

},

{

"id": "qlkcugmv",

"text": "+ model arbitrary<br>+ many success NLP<br><br>- training not parallelizable<br>-much slower<br>-finicky to train<br>",

"notes": null,

"shape": "ellipse"

},

{

"id": "ytzhqioo",

"text": "non RCC:<br>Model: WaveNet<br>causal convolution<br>dilated convolutions<br><br>",

"notes": null,

"shape": "ellipse"

}

]

},

{

"id": "raovzfbf",

"text": "4: Transformers",

"notes": null,

"side": "left",

"children": [

{

"id": "hhnlrvfh",

"text": "Transfer Learning in Computer Vision<br>",

"notes": null

},

{

"id": "uxddflrb",

"text": "language models:<br>map one-hot to dense vectors<br>",

"notes": null

},

{

"id": "vatxmkrf",

"text": "NLPs: <br>ELMO (2018)<br>ULMFit (2018)<br>",

"notes": null

},

{

"id": "xpvcnnea",

"text": "Transformers:",

"notes": null,

"children": [

{

"id": "oizvmtqk",

"text": "Attention is all you need<br>* encoder-decoder with only attention and fully-conncted layers<br>* set new SOTA on translation datasets<br>",

"notes": null,

"children": [

{

"id": "ipnsptbb",

"text": "Multi-head attention<br>",

"notes": null

}

]

},

{

"id": "mygjhclr",

"text": "self-attention layer -&gt; layer normalization -&gt; dense layer<br>",

"notes": null

},

{

"id": "wuodbnte",

"text": "Generative pre-trained transformer (GPT) 1,2,3: 175B params<br>Bidirectional encoder representations form transformers (BERT)<br>google switch transformer over 1.5<br>",

"notes": null

},

{

"id": "fnpskdwj",

"text": "image generation <br>",

"notes": null

}

]

}

]

},

{

"id": "cocsfimj",

"text": "5: ML Projects",

"notes": null,

"side": "left",

"children": [

{

"id": "jwrmstim",

"text": "ML is still research<br>",

"notes": null

},

{

"id": "gssfvbmx",

"text": "Full stack robotics<br>",

"notes": null

},

{

"id": "bkbaoqir",

"text": "Lifecycle of a ML project<br>",

"notes": null,

"value": 1,

"children": [

{

"id": "sogvukce",

"text": "planning&amp;project setup<br>",

"notes": null

},

{

"id": "gjrzjlvm",

"text": "data collection&amp;labeling<br>",

"notes": null

},

{

"id": "fxgkkoyk",

"text": "training&amp;debugging",

"notes": null

},

{

"id": "tuucvhgs",

"text": "Deploying &amp; testing<br>",

"notes": null

}

]

},

{

"id": "afcqmbit",

"text": "prioritizing ML projects<br>",

"notes": null,

"value": 2,

"children": [

{

"id": "ybrciqfl",

"text": "What role does ML play in your app?<br>* critical or complementary?<br>* private or public?<br>* proactive or retroactive?<br>* visible or invisible?<br>* dynamic or static?<br>",

"notes": null

}

]

},

{

"id": "pgfwwaiw",

"text": "archetypes",

"notes": null,

"value": 3,

"children": [

{

"id": "iuwtnkek",

"text": "<a href=\"https://developer.apple.com/design/human-interface-guidelines/machine-learning/overview/introduction\">https://developer.apple.com/design/<br>human-interface-guidelines/<br>machine-learning/overview/introduction</a>/ <br>",

"notes": null

},

{

"id": "jztpdbbh",

"text": "<a href=\"https://www.microsoft.com/en-us/research/project/guidelines-for-human-ai-interaction/\">https://www.microsoft.com/en-us/<br>research/project/<br>guidelines-for-human-ai-interaction/</a>",

"notes": null

}

]

},

{

"id": "ajlencdc",

"text": "metrics",

"notes": null,

"value": 4,

"children": [

{

"id": "fqpjslxa",

"text": "confusion matrix<br>precision, recall<br>nAP <br>",

"notes": null

},

{

"id": "qozfgtvy",

"text": "simple average<br>threshold n-1 metrics<br><br>",

"notes": null

}

]

},

{

"id": "rxxgdyxk",

"text": "baselines",

"notes": null,

"value": 5,

"children": [

{

"id": "jqqtqqob",

"text": "External baselines<br>internal baseline<br>",

"notes": null

}

]

}

]

},

{

"id": "nufukixy",

"text": "6: Infrastructure &amp; Tooling",

"notes": null,

"side": "right",

"children": [

{

"id": "nprkjeea",

"text": "1- data collection, aggregate, clean, label, and version data<br>2- write and debug model code<br>3- provision compute<br>4- run experiments, review results<br>5-deploy model<br>6-monitor predictions and close data flywheel loop<br>",

"notes": null,

"children": [

{

"id": "ucewqujb",

"text": "Construction good dataset<br>",

"notes": null

}

]

},

{

"id": "igpoqjmb",

"text": "Data",

"notes": null,

"value": 1,

"children": [

{

"id": "zjlmevwj",

"text": "Source: S3<br>data lake/warehouse: databricks<br>processing: spark<br>exploration: pandas<br>versioning: DVC<br>labeling: CVat, Ultimatelabelling <br>",

"notes": null

}

]

},

{

"id": "vosffaax",

"text": "Training/Evaluation",

"notes": null,

"value": 2,

"children": [

{

"id": "ruidxkhf",

"text": "Compute: sagemaker<br>resource management: docker<br><b>software engineering:</b> git, visual code<br>frameworks&amp;distributed training: pytorch<br>experiment management: Tensorflow board, MLflow<br>hyperparameter tuning: autoML, Sigopt, Ray Tune, <br>Determined AI<br>Domino Data Lab, SageMaker, GC ML Engine<br>",

"notes": null,

"children": [

{

"id": "lrqugina",

"text": "Visual code, jupyter lab<br>",

"notes": null

},

{

"id": "jfdxtzin",

"text": "coreweave: 4.8$ hr<br>on-prem: build your own<br>lambda labes: buy pc<br>",

"notes": null

},

{

"id": "czfaeboo",

"text": "resource management: <br>python scripts, SLURM (job file), Docker+Kubernetes+<br>kubeflow,SageMaker, Software specialized for ML<br>",

"notes": null

},

{

"id": "nabxjywr",

"text": "<br>",

"notes": null

}

]

},

{

"id": "odwczdnx",

"text": "CI/Testing: <br>edge: tensorflow lite, tensorRT<br>web: seldon core<br>feature store:<br>monitoring:<br>",

"notes": null

}

]

},

{

"id": "fipcfnvx",

"text": "Deployment",

"notes": null,

"value": 3

}

]

},

{

"id": "ygeirvft",

"text": "7: Troubleshooting Deep Neural Networks",

"notes": null,

"side": "left",

"status": "yes",

"shape": "ellipse",

"children": [

{

"id": "fslqvxdr",

"text": "Why poor model performance<br>",

"notes": null,

"children": [

{

"id": "xidjauhc",

"text": "Implementation bugs<br>Hyperparameter choices<br>Data/model fit<br>dataset construction<br>",

"notes": null

}

]

},

{

"id": "jvypfgjk",

"text": "key idea of DL troubleshooting<br>",

"notes": null,

"children": [

{

"id": "mchyjcdl",

"text": "start simple and gradually ramp up complexity: LeNet and subset of data<br>implement&amp;debug: overfit a single batc<br>evaluation: bias-variance decomposition<br>tune hyperparameters: coarse to fine random searches<br>improve model/data: bigger model if underfit; regularize if overfit<br>meets requirements<br>",

"notes": null,

"shape": "box",

"children": [

{

"id": "tcoulhqy",

"text": "Common solution for under-fitting or over-fitting<br>check data-set <br>error analysis<br>choose a different model architecture<br>hyper-parameter tuning <br>",

"notes": null,

"color": "#3dd",

"icon": "fa-eye",

"shape": "box"

},

{

"id": "thxfkjfp",

"text": "<b>Under-fitting (reducing bias):</b><br>&nbsp;⬆️ bigger model<br>⬇️ reduce regularization<br>🤔 error analysis<br>🤔 different model architecture<br>🤔 tune hyper-parameters<br>⬆️ add features<br>",

"notes": null,

"color": "#e33",

"icon": "fa-eye",

"status": "computed",

"shape": "box"

},

{

"id": "lbeszktj",

"text": "<b>over-fitting (reducing variance):</b><br>⬆️ add more training data<br>⬆️ add normalization (batch norm, layer norm)<br>⬆️ add data augmentation<br>⬆️ increase regularization (dropout, L2, weight decay)<br>🤔 error analysis<br>🤔 choose a different model architecture<br>🤔 tune hyperparameters<br>⬇️ early stopping<br>⬇️ remove features<br>⬇️ reduce model size<br>",

"notes": null,

"color": "#33e",

"icon": "fa-eye",

"status": "yes",

"shape": "box"

},

{

"id": "kdrncupp",

"text": "Noise <br>in case of noise in data or noise in production you can add noise slowly to the training<br>increase noise level until robust model trained<br>",

"notes": null

},

{

"id": "uhpihlfn",

"text": "distribution shift:<br>analyze test-val set errors: <br>- collect more training data to compensate<br>- synthesize more training data to compensate<br>apply domain adaptation techniques to training and test distributions<br><br>",

"notes": null

},

{

"id": "zjwpbclr",

"text": "bayesian hyperparam optimization; coarse-to-fine random searches; <br>",

"notes": null

}

]

}

]

}

]

},

{

"id": "csjevydi",

"text": "8: Data Management",

"notes": null,

"side": "right",

"children": [

{

"id": "poaydywi",

"text": "data sources -&gt; training<br>adding/augmenting data is best way<br>",

"notes": null

},

{

"id": "euztpiwl",

"text": "Data Sources:<br>label dataset<br>data flywheel<br>semi-supervised learning: SEER, SOTA<br>data augmentation<br>synthetic data<br>",

"notes": null

},

{

"id": "cssfcmuw",

"text": "filesystem: fastest way SSD; binary data, TFRecord, HDF5, parquet, Apache feather<br>object storage: Amazon S3, Google cloud storage, <br>database: persistent, fast, scalable, structured data; OLTP; store references not binary; JSON; Redis<br>Data Warehouse: OLSP; ETL, <br>Data Lake: ELT, Lake House, Databricks, <br>AirFlow; tensorflow datasets + apache Beam; Prefect; dbt; dagster; <br><br>",

"notes": null

},

{

"id": "nnmltylv",

"text": "Feature Store:<br>1. online <br>2. offline <br>dask; rapids; <br>",

"notes": null

},

{

"id": "daudfpvm",

"text": "data labeling: <br>training the annotators is crucial<br>CVat<br>Ultimate labeling<br>label studio<br>snorkel.org<br><br>",

"notes": null

},

{

"id": "zioanxib",

"text": "Data versioning:<br>level 0: filesystem/S3: <br>level 1: snapshot, version deployed<br>level 2: mix of assets and code, JSON, lazydata, git signature<br>level 3: DVC, pachyderm, Quill, delta lake, git large file storage, dolt, lakeFS<br><br>",

"notes": null

},

{

"id": "zjgftamd",

"text": "privacy:<br>federated learning: training a global model from data on local devices, without ever having access to the data<br>differential privacy: aggregating data such that individual points cannot be identified<br>learning on encrypted data<br>",

"notes": null

}

]

},

{

"id": "kwlevahl",

"text": "9: AI Ethics",

"notes": null,

"side": "left",

"children": [

{

"id": "jnpgxvqe",

"text": "divine command; virtue ethics; deontology; utilitarianism<br>statistical bias; COMPAS; Group Fairness; <br>trade-offs; <br>fast.ai<br>",

"notes": null

}

]

},

{

"id": "cgubvplk",

"text": "10: Testing &amp; Explainability",

"notes": null,

"side": "right",

"status": "yes",

"children": [

{

"id": "lnctmaca",

"text": "<b>Software testing: <br></b>-types of tests: unit test (class or function)[one batch/epoch]; integration test (units work together)[training tests,model consistent,sliding window]; end-to-end tests<br>-best practices: pyramid, 70/20/10 %,&nbsp; solitary tests(mocking); test coverage (percentage of lines of code called);test-driven development; <br>-testing in production: canary deployments; A/B testing; Real user monitoring; exploratory testing <br>-CI/CD: circleCI, Travis, Github Actions/ : SaaS, commands, docker, No GPUs; Jenkins/Buildkite; <br>",

"notes": null,

"color": "#3dd",

"status": "yes",

"shape": "box"

},

{

"id": "sywcrzgx",

"text": "<b>Testing machine learning systems:</b><br>evaluate: metrics,datasets,slices,<br>behavioral testing,robustness, privacy and fairness, simulation tests <br>tools: what-if tool,slicefinder, <br><br>",

"notes": null

},

{

"id": "zwkmctlw",

"text": "shadow test: bug in production, inconsistencies offline and online mode, issues production data =&gt; run model in production but don't return predictions to users, save the data and run the offline model on it<br>",

"notes": null

},

{

"id": "uavlwusl",

"text": "A/B test: canarying (percentage of predictions ) <br>",

"notes": null

},

{

"id": "lchvkmxm",

"text": "unit tests for data: define rules about properties of each of data/stage in cleaning and preprocessing pipeline<br>tools: greatexpectations.io; <br>",

"notes": null,

"shape": "box"

},

{

"id": "acznkecx",

"text": "explain model=&gt; interpretable model family<br>interpretability SHAP, LIME help interpretability threshold faster<br>",

"notes": null

}

]

},

{

"id": "krygkxlt",

"text": "11: Deployment &amp; Monitoring",

"notes": null,

"side": "left",

"children": [

{

"id": "rjabixek",

"text": "Architectures of Deployment",

"notes": null,

"children": [

{

"id": "blyhufqk",

"text": "batch prediction: periodically run model for new data ; works if inputs small<br>",

"notes": null,

"children": [

{

"id": "ipawwqfi",

"text": "+:<br>simple<br>low latency<br>-:<br>doesn't scale<br>not up to date predictions<br>stale model<br>",

"notes": null

}

]

},

{

"id": "ehjfpbtd",

"text": "model-in-service: package model with web server; web server loads model<br>",

"notes": null,

"children": [

{

"id": "crciyyxi",

"text": "+:<br>re-uses existing infrastructure<br>-:<br>may different language<br>model more update<br>lare models use more resources<br>not optimize or GPU<br>scale differently<br>",

"notes": null

}

]

},

{

"id": "sjqhwmvl",

"text": "Model-as-service: run model in own web server; backend use model <br>",

"notes": null,

"children": [

{

"id": "lsmyvhxf",

"text": "+:<br>dependability<br>scalability<br>flexibility<br>-:<br>can add latency<br>adds infrastructural complexity<br>run model service<br>",

"notes": null,

"children": [

{

"id": "vpdtdraw",

"text": "REST APIs/<b>gRPC</b>/GRPC (tensorflow serving)/GraphQL/FastAPI<br>Dependency management<br>Performance optimization<br>Horizontal scaling<br>Deployment<br>managed options<br>",

"notes": null,

"children": [

{

"id": "prxuxlee",

"text": "<b>gRPC</b>:<br>HTTP/V2; secure tls build in; used in kubernetes and docker; multi language; micro-service;<br>tools: etcd 3; kubernetes; openshift<br>instead of Json use protocol buffer",

"notes": null

},

{

"id": "hxpmebtc",

"text": "<b>Dependency management:</b><br>model predictions=code+model weights+dependencies<br>1- constrain the dependencies for model: ONNX (open NN exchange)<br>2- use containers: Docker, Dockerfile, DockerHub, orchestration(Kubernetes, Docker Compose)<br>",

"notes": null

},

{

"id": "mvbbolqn",

"text": "<b>Performance optimization</b>:<br>CPU or GPU<br><b>concurrency</b>: multiple copies of model; careful about thread tuning<br><b>model distillation</b>:smaller model to imitate lager model; DistilBERT<br><b>quantization</b>: FP32 to INT8; tradeoffs with accuracy; <b>quantization-aware training</b> <br><b>caching</b>: some inputs more common; first check cache; python functools<br><b>batching</b>: parallel or GPU; batch size need to be tuned, <br><b>sharing GPU</b>: multiple models on GPU; model serving solution<br><b>Model serving libraries</b>: TensorFlow serving, TorchServe, RAY serve, NVIDIA Triton Inference Server<br><b>Horizontal scaling</b>: load balancer to use multiple copies model; <br>1- container orchestration (Kubernetes) <br>2- serverless (AWS Lambda):only pay for compute-time<br>frameworks for ML deployment on kubernetes:MLflow, Seldon Core, KFserving <br>",

"notes": null

},

{

"id": "rmgavdgr",

"text": "<b>Deployment</b>:<br>how roll out , manage and update<br>toll out gradually; roll back instantly; deploy pipelines of models<br>",

"notes": null

},

{

"id": "xyziarsr",

"text": "<b>Managed options:</b><br>Google AI, Algorithmia, SageMaker, Cortex<br><br>",

"notes": null

}

]

}

]

}

]

},

{

"id": "nhcsjaqu",

"text": "Edge prediction<br>",

"notes": null,

"children": [

{

"id": "ccnqkogy",

"text": "send model to device<br>+:<br>low-latency<br>not internet connection<br>data security<br>-:<br>limited hardware<br>less full featured<br>difficult to update models<br>difficult to monitor and debug<br>",

"notes": null,

"children": [

{

"id": "seaqmbmz",

"text": "<b>Edge deployment </b><br>TensorRT<br>Apache TVM<br>TFLite<br>PyTorch mobile<br>TensorFlow.js<br><br>CoreML<br>ML kit<br>FRITZ<br>",

"notes": null,

"children": [

{

"id": "wojporyb",

"text": "more efficient models:<br>quantization and distillation<br>mobile-friendly model<br>-MobileNets<br>-DistillBERT<br>",

"notes": null,

"children": [

{

"id": "apoyyidk",

"text": "Web deployment is easier<br>framework for hardware<br>TVM<br>",

"notes": null

}

]

}

]

}

]

}

]

}

]

},

{

"id": "khaionzg",

"text": "Monitorin",

"notes": null,

"children": [

{

"id": "sxfuezzo",

"text": "Validation loss &lt;<br>test loss &lt;<br>model performs &lt;<br>qualitatively &lt;<br><br>",

"notes": null,

"children": [

{

"id": "feghnimw",

"text": "<b>Data drift: </b><br>- instantaneous drift(new domain), <br>- gradual drift(preferences change, new concepts), <br>- periodic drift(different time zones), <br>-temporary drift(new user different demographics, first time user)<br><b>model drift</b><br><b>concept drift</b><br><b>the long tail</b><br><b>domain shift</b><br>",

"notes": null

}

]

},

{

"id": "xelabzcu",

"text": "model monitoring<br>",

"notes": null,

"children": [

{

"id": "gqkogvxo",

"text": "model metrics<br>business metrics<br>model inputs and predictions<br>system performance<br>",

"notes": null

},

{

"id": "yvmtstjj",

"text": "measuring distribution <br>(data validation)<br>",

"notes": null,

"children": [

{

"id": "nbolpblu",

"text": "<b>1D data<br>select windows then distance metrics:</b><br>rule-based : tools: great expectations <br>K-L: Kullback-Leibler divergence<br>K-S: Kolmogorov-Smirnov statistic<br>D1 distance<br>earth mover distance<br>population stability index<br>",

"notes": null

},

{

"id": "tuguoqeo",

"text": "<b>more than 1D data</b><br>maximum mean discrepancy<br>do a bunch of 1D comparisons<br>prioritize some features for 1D comparisons<br>Projections:&nbsp; <br>- analytical projections, <br>- random projections,statistical(autoencoder, PCA,T-SNE)<br>",

"notes": null

}

]

},

{

"id": "xnnviopw",

"text": "is change bad?<br>",

"notes": null,

"children": [

{

"id": "ovzrteke",

"text": "statistical tests: KS-test<br><br>",

"notes": null

}

]

},

{

"id": "sbxaaiqc",

"text": "&nbsp;monitoring tools<br>",

"notes": null,

"children": [

{

"id": "xpbbldtk",

"text": "system monitoring tools<br>",

"notes": null,

"children": [

{

"id": "nyixrqcl",

"text": "DataDog<br>Amazon cloudWatch<br>NewRelic<br>honeycomb.io<br>",

"notes": null

}

]

},

{

"id": "znlybjuk",

"text": "data quality tools<br>",

"notes": null,

"children": [

{

"id": "hjymyjpy",

"text": "great_expectations<br>anomalo<br>monte carlo<br>",

"notes": null

}

]

},

{

"id": "nelnuowb",

"text": "ML monitoring<br>",

"notes": null,

"children": [

{

"id": "kksjzcdw",

"text": "arize<br>arthur<br>fiddler",

"notes": null

}

]

}

]

},

{

"id": "xhfjemwt",

"text": "monitoring broader ML system<br>",

"notes": null,

"children": [

{

"id": "hgjaknds",

"text": "<br>",

"notes": null

}

]

}

]

}

]

}

]

},

{

"id": "vvagknmo",

"text": "12: Research Directions",

"notes": null,

"side": "right",

"children": [

{

"id": "qeqgeuzv",

"text": "Deep Semi-supervised Learning<br>",

"notes": null,

"children": [

{

"id": "wfaydwtt",

"text": "classification problem<br>each data-point belongs to one of the classes<br>",

"notes": null

},

{

"id": "yzmjhvqk",

"text": "<b>Noisy-Student</b><br>train teacher model with labeled data-&gt; (2)infer pseudo-labels on unlabeled data<br>data augmentation+dropout+stochastic depth-&gt; and (2) =&gt; (3) train student with noise injected<br>make the student a new teacher<br>",

"notes": null

}

]

},

{

"id": "oakpqtbs",

"text": "Deep unsupervised learning<br>",

"notes": null,

"children": [

{

"id": "kryqkwkd",

"text": "transfer with multi-headed networks<br>",

"notes": null

},

{

"id": "cjzgjjrp",

"text": "Example:<br>predict missing patch<br>solving jigsaw puzzles<br>rotation prediction<br>SimCLR and MoCo<br>",

"notes": null

}

]

},

{

"id": "mjtgeurf",

"text": "Deep Reinforcement Learning (DRL)<br>",

"notes": null,

"children": [

{

"id": "jhakclyx",

"text": "DRL Atari<br>",

"notes": null

},

{

"id": "blqvpzss",

"text": "Deep Q-Network<br>DDQN<br>TRPO,A3C,DDPG,PPO,Rainbow,... fully general RL algorithms<br>",

"notes": null

},

{

"id": "vciytrsc",

"text": "DRL GO<br>",

"notes": null

},

{

"id": "tifxcxqx",

"text": "robot locomotion<br>",

"notes": null

},

{

"id": "jpnuzzcv",

"text": "dynamic animation<br>",

"notes": null

},

{

"id": "heiqotxc",

"text": "BRETT: berkeley robot for the elimination of tedious task<br>tensegrity robotics: NASA superBall<br>",

"notes": null

},

{

"id": "hharxvug",

"text": "Contrastive learning<br>",

"notes": null,

"shape": "box",

"children": [

{

"id": "ppkuakts",

"text": "contrastive+RL:<br>CURL(2020)<br>",

"notes": null

}

]

}

]

},

{

"id": "acxjnthk",

"text": "Meta reinforcement learning<br>Meta-training environments<br>learning to learn<br>",

"notes": null

},

{

"id": "iuiwwwkh",

"text": "Imitation learning<br>learning a one-shot imitator <br>",

"notes": null

},

{

"id": "bogxvbul",

"text": "domain randomization:<br>use realistic simulated data<br>domain confusion/adaptation<br>domain randomization<br>",

"notes": null

}

]

}

]

}

}
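
The convolution arithmetic noted under "2A: CNNs" (output size and parameter count) can be checked with a few lines of Python. This is a hedged sketch: the function names are made up for illustration, floor division is assumed for strides that do not divide evenly (the usual framework convention), and one bias term per filter is an added assumption that the note itself does not mention.

```python
def conv_output_size(input_size, filter_size, padding=0, stride=1):
    """Spatial output size of a convolution along one dimension.

    Implements (input - filter + 2*padding) / stride + 1, using floor
    division (an assumption) when the stride does not divide evenly.
    """
    return (input_size - filter_size + 2 * padding) // stride + 1


def conv_param_count(num_filters, filter_size, input_depth, bias=True):
    """Number of learnable parameters in one square-filter conv layer.

    Weights: num_filters * filter_size * filter_size * input_depth.
    The optional bias (one per filter) is an illustrative addition.
    """
    weights = num_filters * filter_size * filter_size * input_depth
    return weights + (num_filters if bias else 0)


if __name__ == "__main__":
    # 32x32 input, 5x5 filter, padding 2, stride 1 keeps the spatial size.
    print(conv_output_size(32, 5, padding=2, stride=1))
    # Ten 5x5 filters over a 3-channel input.
    print(conv_param_count(10, 5, 3))
```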