Fundamentals of Deep Learning and Generative AI Training
This Generative AI (GenAI) training course teaches the fundamentals of Deep Learning and GenAI and builds the skills needed to apply these technologies to real-world scenarios. The course introduces key concepts, tools, and frameworks such as TensorFlow and Keras, as well as neural network architectures such as CNNs, RNNs, and Transformers. Students learn about large language models (LLMs), GANs, and diffusion models, and their applications in diverse domains.
Through interactive, hands-on labs and exercises, attendees reinforce their understanding of fundamental concepts such as neural network architectures, embeddings, multimodality, fine-tuning, and transfer learning, while exploring ethical considerations and responsible AI practices.
Duration
5 days
Skills Gained
- Construct predictive models using machine learning and deep learning techniques, understanding their applications and limitations
- Build and evaluate artificial neural networks (ANNs) for various tasks, optimizing their architecture and monitoring convergence
- Develop robust deep learning models for classification and regression tasks, implementing preprocessing, validation, and regularization strategies
- Apply generative AI techniques to create new content across various domains, understanding the ethical considerations involved
- Utilize recurrent neural networks (RNNs) and variational autoencoders (VAEs) for sequential data generation and other applications
- Create generative adversarial networks (GANs) to generate realistic data samples and address adversarial examples
- Leverage transformer architectures for natural language processing and time series classification tasks
- Explore popular generative AI tools like ChatGPT, DALL-E 2, and Bing AI, and understand their capabilities
- Fine-tune medium-sized LLMs like Stanford Alpaca and Meta's LLaMA with your own data for specific use cases
Prerequisites
Basic knowledge of Python and familiarity with the NumPy library.
Audience
Data practitioners, business analysts, software engineers, and IT architects.
Course Outline
- Introduction to Neural Networks and Deep Learning
- What is an Artificial Neural Network?
- Types of Neural Networks
- Machine Learning with Neural Networks
- Deep Learning
- Navigating Neural Network Layers
- Positional Types of Layers
- The Network and the Model
- Model Properties
- A Bit of Terminology
- Data Pre-processing
- How Does My Network Know Which Problem I Want It to Solve?
- A Neuron
- The Artificial Neuron
- The Perceptron
- The Perceptron Symbol
- A Breakthrough in Neural Network Design
- Perceptrons and MLPs
- A Basic Neural Network Example
- Popular Activation Functions
- Supervised Model Training
- Measuring the Error with the Loss (Cost) Function
- Mini-batches and Epochs
- Neural Network Training Steps
- Applying Efficiencies with Autodiff ...
- Neural Network Libraries and Frameworks
- Neural Network Concepts and Terminology
- Why We Need Terminology ...
- Features and Targets
- Observations (Examples)
- Notation for Observations
- Data Structures: Tensors
- Continuous and Categorical Features
- Continuous Features
- Categorical Features
- Feature Types Visually
- Feature Importance
- Supervised and Unsupervised Machine Learning
- Self-Supervised Learning
- Common Distance Metrics
- The Euclidean Distance
- Visualizing Data on the X-Y Plane
- What is a Model?
- Model Life-Cycles
- Model Parameters and Hyperparameters
- The Train/Validate/Test Machine Learning Triad
- Training/Validation/Test Data Split Ratios
- Data Splitting Considerations
- Cross-Validation Technique
- Test Data Leakage
- Bias-Variance (Underfitting vs Overfitting) Trade-off
- Bias and Variance Visually
- Model Underfitting vs Model Overfitting Visually
- Ways to Balance the Bias-Variance Trade-off
- Training Error vs Validation Error Diagram
- Loss (Cost) Functions
- Loss Function Properties
- Mean Squared Error (MSE)
- Mean Absolute Error (MAE)
- (Categorical) Cross Entropy Loss
- The Cross Entropy Loss Visually
- The Chain Rule in Calculus
- The Chain Rule in Neural Networks
- Gradient Descent in Neural Networks (1/2)
- Gradient Descent Visually
- Gradient Descent in Neural Networks (2/2)
- An Annotated Example of Gradient Calculation
- The softmax Function
- Coding Softmax
- Model Accuracy in Classification Tasks
- Confusion Matrix
- The Binary Classification Confusion Matrix
- Multi-class Classification Confusion Matrix Example
- Feature Engineering
- Data Scaling and Normalization
- The Data Normalization Tooling
- Regularization
- A Hands-On Exercise
- Mathematical Formulations ...
- Dimensionality Reduction
- Online Machine Learning Glossaries
- TensorFlow Introduction
- What is TensorFlow?
- The TensorFlow Logo
- Tensors and Python API
- Python TensorFlow Interfaces Diagram
- PyTorch
- GPUs and TPUs
- Google Colab
- Data Tools
- TensorFlow Variants
- TensorFlow Core API
- TensorFlow Lite
- TFX (TensorFlow Extended)
- A TFX Pipeline Example
- XLA Optimization
- TensorFlow Toolkit Stack
- Keras
- TensorBoard
- Introduction to Keras
- What is Keras?
- Keras 3.0
- Core Keras Data Structures
- Layers in Keras
- The Dense Layer
- Defining the Layer Activation Function
- Models in Keras
- Components of a Keras Model
- Creating Neural Networks in Keras
- The Sequential Model
- A Sequential Model Code Example
- The Strengths and Weaknesses of Sequential Models
- The Functional API
- A Functional API Example
- The Strengths and Weaknesses of the Functional API
- Making New Layers and Models via Subclassing
- A Layer Subclassing Example
- A Model Subclassing Example
- The Strengths and Weaknesses of Subclassing
- Introduction to CNNs
- Convolutional Neural Networks (CNNs)
- Kernels and Convolutions
- A Convolution Mathematically
- A Convolution Visually
- A Quiz
- Kernels and Feature Maps
- Feature Maps in CNNs
- CNN Efficiencies
- Feature Maps Visually
- The Stride Hyperparameter
- The CNN Architecture
- The Conv2D Class
- A Quiz
- An Example of a Pooling Layer
- Finally, Putting it All Together
- Summary
- Introduction to RNNs
- Recurrent Neural Networks (RNNs)
- How Do RNNs Do It?
- Feedforward Neural Networks vs RNNs
- Mathematical Formulations
- (Simplified) RNN Visual Representations
- A More Accurate RNN Diagram
- Sampling the Data
- Problems with RNNs
- LSTM and GRU Networks
- Problems with LSTM and GRU Networks
- RNNs as a Precursor of Generative AI
- Embeddings
- Embeddings ...
- Understanding the Embeddings Visually
- Dimensionality
- The Semantic Aspect of Embeddings
- Word Embeddings in NLP
- Embeddings in Transformers
- Cosine Similarity
- Introduction to Generative AI
- The Age of Digital Assistants ...
- What is Generative AI?
- Applications
- What Are Natural Language Models?
- The Probabilistic Language Model
- Training a Language Model to Predict the Next Word
- Generative AI, the Precursor Technologies
- RNN Limitations
- Wait, There is More ...
- Transformers
- The Problem Domain
- LLMs
- Multimodality of LLMs
- Infographic of Multimodality Tasks
- Generative Foundation Models
- Inferring a Movie from Emoji
- Fine-Tuning and Transfer Learning
- Transfer Learning in Computer Vision
- The Transfer Learning Diagram
- Can I Have My Very Own Model?
- The Age of Digital Assistants ... Transformed ...
- The Training Datasets
- The Training Techniques
- Hugging Face
- The Evolutionary Tree of LLMs
- The LLM Capabilities vs LLM Size (in Parameters)
- Does the Model Size Matter?
- Inference Accuracy vs LLM Size
- The Microsoft 365 Copilot Ecosystem
- The LLaMA Family of LLMs
- LLaMA 2
- The AI-Powered Chatbots
- Options for Accessing LLMs
- Cloud Hosting
- Prompt Engineering
- Context Window and Prompts
- Zero- and Few-Shot Prompting
- Understanding Model Sizes
- Physical Model Sizes
- Quantization
- Generative Adversarial Networks
- Generator and Discriminator Networks
- A High-Level GAN Diagram
- The Above Generator's Sample Output
- The Diffusion Models, Names, and “Competition”
- The Core Diffusion Modeling Idea
- The Diffusion Process
- AI Alignment
- Ethical AI
- Introduction to Transformers
- What is a Transformer?
- Transformer Use Cases
- Transformers, Encoders, and Decoders
- Recurrent Neural Networks
- Why Transformers?
- The Transformer Evolution Path
- A Short Summary of the Transformer Inner Workings
- N-Grams
- Tokenization
- Two Types of Weights
- (Self-)Attention (1 of 3)
- Multi-Head Attention
- The Encoder
- The Decoder
- The Head-First Approach ...
- The Transformer Model Architecture
- Model Training
- The Overall Translation (Inference) Process
- Positional Encoding
- The Encoding Part (a Big Picture)
- The Decoder Attention Units
- Cross-Attention
- The Decoder Part (a Big Picture)
- The Attention Weights Matrix
Lab Exercises
- Lab 1. Learning the Colab Jupyter Notebook Environment
- Lab 2. Neural Network Playground Web App
- Lab 3. Multi-layer Perceptron Classifier
- Lab 4. Vectors and Matrix Operations
- Lab 5. Understanding the Gradient Descent Algorithm
- Lab 6. Understanding Regularization
- Lab 7. TensorFlow Basics
- Lab 8. Using Keras for Image Classification
- Lab 9. Using CNNs for Image Classification
- Lab 10. Understanding RNNs
- Lab 11. Word2vec Pre-Trained Embeddings
- Lab 12. Hello, Generative AI!
- Lab 13. Using OpenAI API
- Lab 14. NLP and NLU with Transformers