Machine Learning and Neural Networks - page 17

 

An explanation of the chain rule, with a schematic example of a neural net and worked calculations, from 27:15 onwards:

https://youtu.be/uXt8qF2Zzfo?t=1635

And an explanation from a coder's perspective, in case you are fed up with people just blurting out the math because "that's how it works":

https://www.mql5.com/en/blogs/post/752198

12a: Neural Nets
  • 2016.04.20
  • www.youtube.com
*NOTE: These videos were recorded in Fall 2015 to update the Neural Nets portion of the class. MIT 6.034 Artificial Intelligence, Fall 2010. View the complete c...
 

PyTorch for Deep Learning & Machine Learning – Full Course (parts 11-16)


PyTorch for Deep Learning & Machine Learning – Full Course


Part 11

  • 10:00:00 In this section of the video course, the instructor provides an overview of the loss functions and optimizers commonly used in PyTorch. The instructor explains binary cross entropy loss, cross entropy loss, mean absolute error, and mean squared error, and which ones are typically used for regression versus classification tasks. For binary classification, the code examples use torch.nn.BCEWithLogitsLoss and torch.nn.BCELoss. The video also covers the concept of a logit in deep learning and explores two commonly used optimizers, SGD and Adam. The instructor notes that other optimizers are available, but sticking with these two still achieves good results on many problems.

  • 10:05:00 In this section of the video, the speaker sets up the loss function and optimizer in PyTorch. The loss function is nn.BCEWithLogitsLoss, which has the sigmoid activation function built in. The speaker notes that online resources are available for anyone who wants to learn more about activation functions in neural networks. The optimizer chosen is stochastic gradient descent (SGD) with a learning rate of 0.1, and it is given the model's parameters so that it can update them with respect to the loss. Finally, the speaker creates an evaluation metric.

  • 10:10:00 In this section, the instructor discusses the importance of accuracy as an evaluation metric and demonstrates how to create an accuracy function using PyTorch. The accuracy function compares the predictions to the ground truth labels and returns the percentage of correct predictions out of the total number of samples. The instructor also provides an overview of the steps involved in a PyTorch training loop: the forward pass, loss calculation, optimizer zero grad, backpropagation, and gradient descent. The steps are listed out and the importance of each step is discussed.

  • 10:15:00 In this section, the instructor explains how to go from raw logits to prediction probabilities to prediction labels. The raw outputs of the model are referred to as logits, which can be converted into prediction probabilities by passing them to an activation function: sigmoid for binary classification and softmax for multi-class classification. Prediction probabilities can then be converted to prediction labels by rounding them for binary classification or taking the argmax for multi-class classification. The instructor also explains that an activation function is something separate from a layer, and shows how a linear layer applies a linear transformation to incoming data through a dot product and a bias term.

  • 10:20:00 In this section of the video, the instructor explains how to use the sigmoid activation function to turn the raw outputs of a model, called logits, into prediction probabilities that can be used for binary classification. The instructor applies the sigmoid function to the model logits to create the prediction probabilities, which are then passed to torch.round to obtain prediction labels. These prediction labels determine which class an input belongs to, using a decision boundary often set at 0.5. The instructor also emphasizes the importance of performing these steps on the raw logits so that the predictions end up in the same format as the test labels.

  • 10:25:00 In this section, the video discusses the process of turning raw logits from the model into prediction probabilities using an activation function and then converting them into prediction labels. The steps are demonstrated through a fair bit of code, where the prediction labels are produced in one go by chaining the full sequence: logits to prediction probabilities to prediction labels. The model's predictions are compared to the test labels, and the squeeze function is used to put them in the same format. The next step is to build a training and testing loop, which involves doing a forward pass, calculating the loss, and optimizing the parameters. The video encourages the viewer to attempt this on their own before proceeding to the next video for further instructions.

  • 10:30:00 In this section, the instructor sets a manual seed for reproducibility, including a CUDA random seed for operations on a CUDA device. They then move on to putting the data on the target device and building the training and evaluation loop. The instructor highlights a little tidbit in the forward pass: the model outputs raw logits, which must be passed through torch.sigmoid and then torch.round to convert them into prediction probabilities and labels. Lastly, they calculate the loss and accuracy, noting that although calculating accuracy is not necessary, it can be useful to visualize different metrics while the model is training.

  • 10:35:00 In this section, the video discusses the difference between BCE loss and BCE with logits loss in PyTorch. nn.BCEWithLogitsLoss expects logits as input and combines a sigmoid layer and BCE loss in one step, making it more numerically stable. nn.BCELoss, on the other hand, expects prediction probabilities as input, so torch.sigmoid needs to be called on the logits first. The video also outlines the steps of the PyTorch optimization loop: zeroing the gradients, performing backpropagation, and updating the parameters to reduce the loss. Likewise, when testing or making predictions, the model should be put in inference mode, and the test logits should be passed through the sigmoid function to obtain prediction probabilities.

  • 10:40:00 In this section, the instructor discusses how to calculate test loss and accuracy for the classification model. To calculate the test loss, the instructor passes the test logits and the y test labels to the BCE with logits loss function. To calculate the test accuracy, the instructor calls the accuracy function on the y true and y pred variables. The argument order of the accuracy function is reversed relative to the loss function, as the instructor based it on scikit-learn's metrics package. Finally, the instructor prints the epoch number, training loss and accuracy, and test loss and accuracy at every 10th epoch. The instructor encourages viewers to run this mammoth block of code and fix any errors that arise.

  • 10:45:00 In this section, the instructor discusses the results of the model training from the previous section, which did not show any significant improvement in accuracy. The instructor notes that an ideal model would have an accuracy of 100% and a loss of zero. However, the current model's accuracy is below 50%, which is no better than random guessing. To determine the reason for the poor performance, the instructor suggests visualizing the predictions made by the model. The instructor imports a function called "plot decision boundary" from a helper function file to be used in this visualization process. The instructor also recommends a resource, madewithml.com, for those interested in learning more about machine learning foundations and MLOps.

  • 10:50:00 In this section, the instructor explains how to download helper functions from the PyTorch learn repository programmatically using Python's "pathlib" module and the "requests" library. The instructor first checks whether the helper functions file already exists, and if it does not, makes a request to download it and saves it as "helper_functions.py". The instructor then imports "plot_predictions" and "plot_decision_boundary" from the downloaded file, both of which will be used later in the course. Finally, the instructor tests the helpers by calling "plot_decision_boundary", which successfully plots the model's decision boundary for the training set.

  • 10:55:00 In this section of the video, the presenter discusses the limitations of a linear model in separating circular data with straight lines, as shown in a visualization of the model's decision boundary. The solution to improving the model's accuracy is to add more layers, i.e., increase the depth of the neural network, which allows for more chances to learn about patterns in the data. Other ways to improve a model's performance include increasing the amount of training data and adjusting hyperparameters such as learning rate and batch size. Importing and using helper functions from external Python scripts is also mentioned as a common practice.
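
The logits to probabilities to labels chain, the accuracy function, and the difference between the two BCE losses described in Part 11 can be sketched in plain Python. This is a minimal illustration using only the standard library, not the course's code and not PyTorch's implementation: all function names and sample values below are made up for the example, and the logits form of the loss uses one common numerically stable formulation.

```python
import math

def sigmoid(logit):
    """Squash a raw logit into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))

def to_labels(logits):
    """logits -> prediction probabilities -> rounded 0/1 prediction labels."""
    return [round(sigmoid(z)) for z in logits]

def accuracy(y_true, y_pred):
    """Percentage of predictions that match the ground-truth labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)

def bce_from_probs(probs, targets):
    """Binary cross entropy on probabilities (sigmoid already applied)."""
    losses = [-(t * math.log(p) + (1 - t) * math.log(1 - p))
              for p, t in zip(probs, targets)]
    return sum(losses) / len(losses)

def bce_from_logits(logits, targets):
    """Binary cross entropy straight from logits, in a stable form:
    max(z, 0) - z * t + log(1 + exp(-|z|))."""
    losses = [max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
              for z, t in zip(logits, targets)]
    return sum(losses) / len(losses)

# Hypothetical raw model outputs and ground-truth labels.
logits = [2.0, -1.0, 0.5, -3.0]
targets = [1, 0, 0, 0]

labels = to_labels(logits)
print(labels)                      # [1, 0, 1, 0]
print(accuracy(targets, labels))   # 75.0

# Both losses agree on moderate logits...
probs = [sigmoid(z) for z in logits]
print(bce_from_probs(probs, targets), bce_from_logits(logits, targets))

# ...but only the logits form survives an extreme logit: here sigmoid(1000.0)
# rounds to exactly 1.0, so bce_from_probs would hit log(0) and raise.
print(bce_from_logits([1000.0], [0]))  # 1000.0
```

The last line is the practical reason to prefer a loss that takes logits: the sigmoid and the logarithm are folded into one expression, so neither a probability of exactly 0 nor exactly 1 is ever passed to a logarithm.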

Part 12

  • 11:00:00 In this section, the instructor discusses ways to improve a model: adding more hidden units, fitting for longer, changing activation functions, adjusting the learning rate, and changing the loss function. The instructor points out that increasing the number of parameters in a model can potentially help represent the data better, but too many parameters may make the model too complex for a simple dataset. The instructor also illustrates how experimentation can help improve the model by changing its structure and hyperparameters. Finally, the instructor shows graphical examples of how adding layers, increasing the number of hidden units, adding activation functions, and changing the optimization function can potentially improve the model.

  • 11:05:00 In this section, the instructor discusses how to improve a model from a model perspective. He explains the concept of hyperparameters, which are values that machine learning engineers and data scientists can change to improve the model's results. The instructor demonstrates how to change the hyperparameters of a model, such as the number of hidden units, the number of layers, and the number of epochs. He also highlights the importance of testing these changes one at a time to identify which one causes the improvement or degradation. Finally, he explains the difference between parameters and hyperparameters and why it is important to make this distinction.

  • 11:10:00 In this section, the instructor creates a three-layer model with more hidden units to see if training this model for longer yields better results. The forward method is overridden to pass data through each layer, with extra hidden units and one extra layer overall. A way of writing the forward pass that chains all the layer calls into a single expression, which can benefit from speed-ups behind the scenes, is also demonstrated. An instance of the three-layer model is created and sent to the target device, followed by the creation of a loss function, an optimizer, and a training and evaluation loop for model one.

  • 11:15:00 In this section, the video continues from the previous one, in which nn.Module was subclassed to create a new model, Circle Model V1, with more hidden units and an extra layer. The next step in the workflow is to select a loss function, and the video uses nn.BCEWithLogitsLoss() as before, with the same optimizer, torch.optim.SGD(). The video sets the learning rate to 0.1 and the number of epochs to 1000, and puts the data on the target device (CPU or GPU). The video then loops through the epochs, passes the training data through the model with the new architecture, calculates the loss, and updates the parameters using torch's autograd.

  • 11:20:00 In this section of the video, the instructor goes over the steps for evaluating the model's accuracy and loss. The loss function takes in the model's outputs and compares them to the actual label values. The accuracy function is used to determine how accurate the model's predictions are. The optimizer is used to adjust the model's parameters to create a better representation of the data. Testing is done by calling the model's eval() method and turning on inference mode. Logits are created by passing input data to the model, and then torch.sigmoid() followed by torch.round() is used to convert them to predictions. The loss and accuracy are calculated for the test data and printed out every 100 epochs during training.

  • 11:25:00 In this section of the PyTorch for Deep Learning & Machine Learning course, the instructor discusses troubleshooting techniques for when a model is not working, such as testing out a smaller problem to see if the model can learn anything at all. He suggests replicating the data set from a previous section where a linear model was able to fit a straight line and using it to see if the current model can learn anything, as it is currently only guessing and unable to draw a straight line to separate the circular data. The instructor also mentions that some methods to improve a model include changing the hyperparameters, such as the number of layers and hidden units, and changing the activation and loss functions.

  • 11:30:00 In this section, the instructor creates a data set using the linear regression formula to see if the model works on any kind of problem. The data set is called X_regression and contains 100 samples, with one x-value per y-value. The instructor then creates training and test splits for the data and checks their lengths. Finally, the plot predictions function from the helper functions file is used to visually inspect the data.

  • 11:35:00 In this section, the presenter discusses a side project to see if their model can fit a straight-line data set before attempting to fit a non-straight-line data set. They adjust Model 1 to fit the new data set by changing the number of input features from two to one to match the data, while keeping the output features of the first layer at 10 to give the model as many parameters as possible. They also create Model 2 using nn.Sequential, which passes data through its layers in order, and set up a loss function and an optimizer.

  • 11:40:00 In this section, the instructor uses the L1 loss function (mean absolute error), since this is a regression problem, and the SGD optimizer with a learning rate of 0.1 to optimize the model's parameters. After loading the dataset and putting it on the target device, the model is trained for a thousand epochs in a loop. In each epoch, the forward pass is performed, the loss is calculated, and the parameters are updated using the backward and step functions. The training progress is printed out with the epoch, loss, and test loss every 100 epochs. The loss goes down as the model is optimized.

  • 11:45:00 In this section of the video, the instructor recaps the previous section, where they created a straight-line data set and trained a model to fit it. They confirm that the model is learning something and suggest that learners experiment with different values of the learning rate. The instructor then explains how to turn on evaluation mode and make predictions, also known as inference. They also show how to use the plot predictions function and run into an error because the data is not on the same device as the plotting code expects, which they solve by calling .cpu() on their tensor inputs.

  • 11:50:00 In this section, the instructor introduces the importance of nonlinearity in machine learning and deep learning models. Linear functions alone cannot capture the complex patterns in data that require nonlinear functions like curves to represent them accurately. Neural networks are built by combining linear functions with nonlinear functions, or activations, to model complex data patterns. The instructor hints at upcoming videos that will cover nonlinear activations and their role in deep learning models.

  • 11:55:00 In this section, the instructor discusses the power of nonlinearity in machine learning and neural networks. Nonlinearity is essential in machine learning because data is not always made up of straight lines. The instructor then demonstrates how to create and plot nonlinear data using the make_circles function, convert the data to tensors, and create train and test splits using PyTorch and the train_test_split function from scikit-learn.
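
The training loop steps that recur through Parts 11 and 12 (forward pass, loss, zero the gradients, backward pass, optimizer step) can be written out by hand for the straight-line case. The sketch below is plain Python with hand-derived gradients rather than PyTorch's autograd. The line y = 0.7x + 0.3, the starting parameters, and the learning rate of 0.01 are assumptions chosen for the illustration (the summary above quotes 0.1 for the video); the variable names are hypothetical.

```python
# Straight-line data: y = 0.7 * x + 0.3 (weight and bias are assumed values).
xs = [i / 100 for i in range(100)]
ys = [0.7 * x + 0.3 for x in xs]

def mae(w, b):
    """L1 loss (mean absolute error) of the line w * x + b on the data."""
    return sum(abs(w * x + b - y) for x, y in zip(xs, ys)) / len(xs)

def sign(v):
    """-1, 0 or 1: the derivative of |v| with respect to v (0 at v == 0)."""
    return (v > 0) - (v < 0)

w, b = 0.0, 0.0   # model parameters, learned during training
lr = 0.01         # learning rate, a hyperparameter (assumed value)
initial_loss = mae(w, b)

for epoch in range(1000):
    # 1. Forward pass: compute predictions with the current parameters.
    preds = [w * x + b for x in xs]

    # 2. Calculate the loss.
    loss = sum(abs(p - y) for p, y in zip(preds, ys)) / len(xs)

    # 3. Zero the gradients so nothing carries over from the last epoch.
    grad_w, grad_b = 0.0, 0.0

    # 4. Backward pass: d|p - y|/dw = sign(p - y) * x, d|p - y|/db = sign(p - y).
    for x, p, y in zip(xs, preds, ys):
        s = sign(p - y)
        grad_w += s * x / len(xs)
        grad_b += s / len(xs)

    # 5. Optimizer step: move each parameter against its gradient.
    w -= lr * grad_w
    b -= lr * grad_b

    if epoch % 100 == 0:
        print(f"epoch {epoch:4d} | loss {loss:.4f}")

final_loss = mae(w, b)
print(f"w = {w:.3f}, b = {b:.3f}, loss {initial_loss:.4f} -> {final_loss:.4f}")
```

Because the gradient of an absolute error has constant magnitude, the parameters keep hopping around the optimum by roughly the step size instead of settling exactly on it, which is one reason the learning rate is worth experimenting with.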

Part 13

  • 12:00:00 In this section, the instructor introduces nonlinearity, which is a crucial component of building a model. The instructor challenges the viewer to find a specific nonlinear function in the torch.nn module, which contains pooling layers, padding layers, and activation functions that each perform some mathematical operation on an input. nn.Sigmoid and nn.ReLU are given as examples of nonlinear activations. The instructor then demonstrates how to build a classification model using nonlinearity with PyTorch. Nonlinear means that the graph is not a straight line, whereas linear means that it is.

  • 12:05:00 In this section, the instructor introduces the concept of nonlinear data and explains that neural networks and machine learning models can work with numbers in hundreds of dimensions, which makes it easier for them to handle nonlinear data. A new neural network, Circle Model V2, is created as a class with a constructor and several layers that perform linear operations, with the addition of a nonlinear activation function called ReLU. This function turns negative inputs to zero while leaving positive inputs as they are. The model's output is then passed through the sigmoid function to obtain prediction probabilities.

  • 12:10:00 In this section, the instructor challenges viewers to recreate a neural network model in the TensorFlow Playground with two hidden layers and five neurons, using the Rectified Linear Unit (ReLU) activation function instead of the linear activation function, which they've been using. The instructor explains that the ReLU activation function is a popular and effective nonlinear activation function necessary for modeling nonlinear data, which neural networks are designed to do. The instructor demonstrates the effect of changing the learning rate on the training loss and encourages viewers to experiment with different learning rates to observe the effect on the loss curve.

  • 12:15:00 In this section, the instructor discusses building an optimizer and a loss function for a binary classification problem using PyTorch. They set the model's non-linear activation function to ReLU and create random seeds for CUDA. They then loop through 1000 epochs to train the model, and calculate the loss and accuracy using the BCE with logits loss function and an accuracy function respectively. The instructor encourages thinking about how to functionalize the training code and suggests that this section is to build experience and momentum towards working on real-world PyTorch projects.

  • 12:20:00 In this section, the instructor explains the process of optimizing the model with backpropagation in PyTorch. Before performing backpropagation, the optimizer's gradients are zeroed so that each step starts from a clean slate. After executing loss.backward(), the optimizer's step method is called to perform gradient descent on the model parameters. The instructor also demonstrates how to inspect the model's parameters and explains that the ReLU activation function has no learnable parameters of its own. Finally, the instructor prints out the training and test loss, accuracy, and epoch to track the progress of the model's learning.

  • 12:25:00 In this section, the instructor troubleshoots a shape issue in the PyTorch code and fixes it by using the squeeze function to remove an extra dimension from test_logits. They then discuss the power of nonlinearity and how adding ReLU layers improved the performance of the model, allowing it to potentially draw a boundary that separates the circles in the dataset. The instructor also emphasizes the importance of visualization in evaluating the model and making predictions, and challenges the viewers to plot the decision boundaries.

  • 12:30:00 In this section, the instructor demonstrates using the plot decision boundary function to visualize the performance of a nonlinear model compared to a linear model. The nonlinear model is shown to have better accuracy than the linear one, but the instructor challenges the viewer to improve upon the accuracy even more. The instructor then moves on to discuss how neural networks use linear and nonlinear functions as tools for discovering patterns in data, leading into a demonstration of how to create tensors and use nonlinear activation functions in PyTorch.

  • 12:35:00 In this section, the instructor explains how to write activation functions by hand in PyTorch by replicating the popular ReLU and sigmoid functions. The instructor first creates a tensor of values from negative 10 to 10 with a data type of torch.float32 and plots it as a straight line. A ReLU function replicating torch.relu and nn.ReLU is then written by taking an input tensor x and returning the maximum of zero and x. Similarly, a sigmoid function is written by taking an input tensor and returning one divided by one plus the exponential of negative x. The instructor checks the hand-written ReLU and sigmoid functions by plotting them and comparing them to the PyTorch built-in functions.

  • 12:40:00 In this section of the PyTorch course, the instructor explains the importance of combining linear and nonlinear functions to find patterns in data in order to fit a data set. The idea behind neural networks is to stack layers of these functions to create a model. While it is possible to build these layers from scratch, PyTorch offers pre-built layers that have been error-tested and compute as fast as possible behind the scenes while also allowing for the use of GPUs. The instructor also discusses the difference between binary classification, which involves two possible outcomes, and multi-class classification, which involves more than two possible outcomes. Finally, the section concludes by reiterating the importance of nonlinearity in neural networks, and the instructor issues a challenge to improve upon the previous binary classification model.

  • 12:45:00 In this section, the course instructor introduces multi-class classification and the differences between it and binary classification. The softmax activation function is used instead of sigmoid, and cross entropy loss instead of binary cross entropy. The instructor then proceeds to create a toy multi-class data set using the make_blobs function from sklearn.datasets, generating four classes with two features each. The cluster standard deviation is adjusted to give the clusters some randomness and shake them up a bit, making the problem a little harder for the model.

  • 12:50:00 In this section, the instructor shows how to prepare data for a multi-class classification model using PyTorch. They turn the data into tensors and use the train_test_split function from scikit-learn to split the data into training and test sets. They also visualize the data using plt.figure and set the random seed to ensure reproducibility. After creating the multi-class classification dataset, they consider whether nonlinearity is needed to separate the data and then proceed to build a model for the data.

  • 12:55:00 In this section, the instructor discusses how to set up a multi-class classification model using PyTorch. He explains the process step by step, starting with defining the input layer shape and determining the number of neurons per hidden layer. The instructor then explains how to set the output layer shape, which requires one output feature per class. To create the model, the instructor creates a class called "blob model" that inherits from nn.Module and sets some parameters for the model. Finally, the instructor demonstrates how to initialize the multi-class classification model with input features and output features.
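
The hand-written ReLU and sigmoid from 12:35:00, and the point that stacked linear layers need a nonlinearity between them, can be checked with a few lines of plain Python. This is a scalar sketch, not the course's tensor code: the weights and biases are arbitrary values picked for the example, and all function names are hypothetical.

```python
import math

def relu(x):
    """Return the maximum of zero and x."""
    return max(0.0, x)

def sigmoid(x):
    """Return 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def linear(x, weight, bias):
    """A one-input, one-output linear layer: weight * x + bias."""
    return weight * x + bias

def two_linear_layers(x):
    """Two linear layers stacked with nothing in between."""
    return linear(linear(x, 2.0, 1.0), 3.0, -4.0)

def single_equivalent_layer(x):
    """3 * (2x + 1) - 4 simplifies to 6x - 1: still one straight line."""
    return linear(x, 6.0, -1.0)

def two_layers_with_relu(x):
    """The same two layers with a ReLU between them."""
    return linear(relu(linear(x, 2.0, 1.0)), 3.0, -4.0)

inputs = [-2.0, -1.0, 0.0, 1.0, 2.0]

print([two_linear_layers(x) for x in inputs])        # [-13.0, -7.0, -1.0, 5.0, 11.0]
print([single_equivalent_layer(x) for x in inputs])  # [-13.0, -7.0, -1.0, 5.0, 11.0]
print([two_layers_with_relu(x) for x in inputs])     # [-4.0, -4.0, -1.0, 5.0, 11.0]
print(sigmoid(0.0))                                  # 0.5
```

Without the activation, the two layers collapse into a single line no matter how many are stacked. With ReLU in between, the output is flat on one side and sloped on the other, and bends of this kind are what let a network trace a curved boundary such as the one around the circles data.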

Part 14

  • 13:00:00 In this section, the instructor discusses the creation of a linear layer stack model using PyTorch's nn.Sequential. To instantiate the model, the number of input features and the number of output classes are used to determine the configuration of the layers. The instructor sets up a sequential stack of layers to pass data through each layer one by one. They also explain how nonlinearity could be added to the model, and then create a forward method that sends the input through the specified layers sequentially. Finally, an instance of the blob model is created with the appropriate number of input and output features.

  • 13:05:00 In this section of the video, the instructor creates a multi-class classification model by subclassing nn.Module and sets up parameters for the class constructor to customize the input and output features. They also explain that the output features parameter lines up with the number of classes in the data. To create a loss function for a multi-class classification model, the instructor searches for and finds nn.CrossEntropyLoss in the torch.nn module, which computes the loss between input and target and is useful when training a classification problem with C classes. The instructor also explains that the weight parameter is useful when dealing with an unbalanced training set.

  • 13:10:00 In this section, the instructor discusses the creation of a loss function and optimizer for multi-class classification. He recommends two common optimizers, SGD and Adam, but chooses to use SGD for this example. He then challenges viewers to do a forward pass with the model created in the previous video and to consider what the raw outputs of a model are. The instructor also reminds viewers to pay attention to device parameters, as a runtime error can occur if tensors are not on the same device. Finally, he turns the model into eval mode and makes some predictions.

  • 13:15:00 In this section, the instructor explains how to convert a model's output logits into prediction probabilities and prediction labels for multi-class classification problems. To do this, the softmax function is used to convert the logits into prediction probabilities, and then the prediction with the highest probability is considered the predicted label. The instructor demonstrates this process using PyTorch code and also notes that the sum of the probabilities for each sample will always be one due to the nature of the softmax function.

  • 13:20:00 In this section, the instructor explains how to go from the raw output of a PyTorch model for a multi-class classification problem to prediction probabilities using the softmax activation function, and then to prediction labels by taking the argmax of the prediction probabilities. The model's raw outputs are the logits; the softmax function turns them into prediction probabilities, and the argmax of those probabilities gives the prediction labels. The instructor notes that while the current predictions are random because the model has not been trained yet, these steps will be used in a training loop to train and evaluate the model.

  • 13:25:00 In this section, the instructor begins building a training and testing loop for a multi-class model. First, they set up manual seeds to attempt to get the same output each time, but note that this is not guaranteed. They then set the number of epochs to 100 and put the data onto the target device. The loop through the data begins, and for each epoch, the model does a forward pass that creates logits from X_blob_train. The logits are passed through torch.softmax and argmax to get the prediction labels used for the accuracy, while the cross entropy loss is calculated from the logits. The optimizer is then zeroed, and backpropagation is performed before the optimizer is stepped. The code for testing or inference is also presented, which involves setting the model to evaluation mode.

  • 13:30:00 In this section, the instructor discusses how evaluation mode turns off dropout layers and batch norm, and how torch inference mode makes predictions faster. They explain that during training, dropout layers randomly drop out some of the neurons to avoid overfitting. The instructor also demonstrates how to calculate the test logits, test loss, and test accuracy from the test data and test labels. They then discuss a pesky data type issue that caused a runtime error and how they resolved it. The instructor emphasizes that troubleshooting code is an essential part of machine learning, and that it takes time to identify and resolve errors.

  • 13:35:00 In this section, the instructor works through several troubleshooting challenges while creating a multi-class classification model. First, he figures out that the error in his code is due to one of the tensors having the wrong data type. Through some research and experimentation, he changes the label tensor to a torch.LongTensor, which is the type that cross entropy loss expects for class-index targets. Later, he encounters another error caused by a size mismatch between tensors in the training and test code. By debugging the code on the fly, he identifies the issue and reassigns the data. Despite these challenges, the model's accuracy and loss behave as expected, indicating that the model works on the multi-class classification dataset.

  • 13:40:00 In this section, the instructor discusses how to evaluate the trained multi-class classification model by making predictions and evaluating them. The instructor explains that predictions are made after setting the model to evaluation mode, passing in the test data, and obtaining raw logits as a result. The next step is to convert the logits into prediction probabilities by calling torch.softmax on them. Then, prediction labels are obtained by calling torch.argmax on the prediction probabilities. The instructor emphasizes the importance of visualizing the predictions by plotting them and comparing them with the actual data.

  • 13:45:00 In this section, the instructor evaluates the multi-class classification model visually and explores the linear and non-linear functions used to separate the data. The instructor also mentions that most data requires both linear and non-linear functions for classification, and PyTorch makes it easy to add these functions to models. Additionally, the section covers the importance of evaluating models and introduces precision and recall as important metrics when dealing with classes with different amounts of values.

  • 13:50:00 In this section, the instructor discusses various classification evaluation methods, including accuracy, precision, recall, F1 score, confusion matrix, and classification report. The instructor explains that while accuracy is the default metric for classification problems, it may not be the best for imbalanced datasets, where precision and recall are more informative. Precision is the number of true positives divided by the sum of true positives and false positives, while recall is the number of true positives divided by the sum of true positives and false negatives. The instructor also notes the trade-off between precision and recall, where increasing one metric tends to lower the other. The use of the torchmetrics and scikit-learn libraries for classification metrics is also discussed.

  • 13:55:00 In this section, the instructor shows how to import and use pre-built metrics functions in PyTorch using the torchmetrics package. They demonstrate how to install torchmetrics, import the accuracy metric, and use it to calculate the accuracy of a multi-class model. However, they also caution that when using torchmetrics, the metrics must be on the same device as the data, using device-agnostic code. The instructor provides a link to the torchmetrics module and extracurricular articles for further exploration. They also introduce exercises and solutions for practicing the code covered in the previous sections.
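
The multi-class evaluation path from Part 14 (logits to softmax probabilities to argmax labels, then precision and recall per class) can be sketched in plain Python. This only illustrates the arithmetic and stands in for neither torch.softmax nor torchmetrics: the logits and labels are invented for the example, the function names are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here.

```python
import math

def softmax(logits):
    """Turn one sample's logits into probabilities that sum to 1."""
    largest = max(logits)                    # subtract the max for stability
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    """Index of the largest value, i.e. the predicted class."""
    return max(range(len(values)), key=lambda i: values[i])

def precision_recall(y_true, y_pred, positive):
    """Precision = TP / (TP + FP), recall = TP / (TP + FN) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical logits for four samples and four classes (one row per sample).
logits_batch = [
    [2.0, 0.5, 0.1, -1.0],
    [0.2, 3.0, 0.3, 0.0],
    [0.1, 0.2, 0.0, 2.5],
    [1.5, 0.1, 0.3, 0.2],
]
y_true = [0, 1, 3, 1]

probs = [softmax(row) for row in logits_batch]
y_pred = [argmax(row) for row in probs]

print(y_pred)                                # [0, 1, 3, 0]
print(precision_recall(y_true, y_pred, 0))   # (0.5, 1.0)
print(precision_recall(y_true, y_pred, 1))   # (1.0, 0.5)
```

The last sample is a class 1 item predicted as class 0. That single mistake costs class 0 its precision (a false positive) and class 1 its recall (a false negative), which is the kind of detail a single accuracy figure hides.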

Part 15

  • 14:00:00 In this section, the instructor advises viewers on where to get help for PyTorch computer vision code, including following along with the code, using Google Colab's doc string feature, searching for code on Stack Overflow or in PyTorch documentation, and asking questions on the PyTorch deep learning repo's Discussions tab. The section also covers examples of computer vision problems, such as binary or multi-class classification problems, where a machine learning model learns patterns from different examples of images to determine if an image is a steak or pizza, or to classify images into multiple categories.

  • 14:05:00 In this section, the speaker discusses different applications of computer vision using machine learning, such as multi-class classification for image problems, object detection, and image segmentation. The speaker provides an example of Nutrify, which uses machine learning to classify up to 100 different foods from an uploaded image. The speaker also discusses how Tesla uses computer vision to plan its self-driving cars' movements using 3-dimensional vector space and machine learning. The speaker notes that Tesla uses PyTorch, which is the same code that is taught in the course.

  • 14:10:00 In this section of the video, the instructor discusses using PyTorch to create a computer vision model for multi-class image classification. Using the example of Nutrify, a photo recognition technology for food, the instructor explains the typical inputs and outputs for a computer vision problem. Inputs include a tensor representing an image's height, width, and color channels. The instructor also mentions that existing algorithms may already exist for popular computer vision problems, but one can be built if needed. The desired output for the Nutrify example is three outputs, one for each food class.

  • 14:15:00 In this section, the video explains how machine learning models can be used for image classification, utilizing PyTorch and convolutional neural networks (CNNs) to represent information numerically and train the model to recognize patterns in the data. The example given is predicting the types of food in an image, such as sushi, steak, and pizza, with the use of PyTorch to encode the information and CNNs to recognize patterns in the images. The video emphasizes that the input and output shapes will vary based on the problem being solved and that CNNs are typically the best choice for image data, although other models can be used. Finally, the video introduces a problem involving grayscale images of fashion items that will be used to further demonstrate the same principles learned in this section.

  • 14:20:00 In this section, the instructor discusses the representation of image data in PyTorch and other deep learning libraries. Many libraries expect color channels last, but PyTorch defaults to representing image data with color channels first. The video explains the importance of aligning the input and output shapes of a model for a given problem. The instructor provides an overview of the PyTorch workflow for building models, including getting data ready using transforms and data loaders, building or picking a pre-trained model, selecting an optimizer and a loss function, evaluating the model using metrics, and experimenting to improve the model. The next section discusses the architecture of a convolutional neural network (CNN).

  • 14:25:00 In this section, the instructor discusses the architecture of a typical convolutional neural network (CNN). The input data goes through various layers, including convolutional, activation, and pooling layers, until it gets converted into an output shape that can be converted into class names. The instructor emphasizes that there are almost unlimited ways of stacking a CNN, and demonstrates one way of doing it through slides. However, the best way to learn is to code it out, and the instructor directs users to a Google Colab notebook where they can practice building a CNN using PyTorch and the TorchVision library. The instructor also provides additional resources, including a reference notebook and a PyTorch computer vision section in LearnPyTorch.io.

  • 14:30:00 In this section of the video, the instructor introduces the different PyTorch libraries for different domains, highlighting PyTorch's strength in computer vision. The main library for computer vision is torchvision, which contains datasets, pre-trained models for computer vision, and transforms for manipulating vision data into numbers usable by machine learning models. The instructor demonstrates importing PyTorch (torch), torch.nn, and torchvision, and walks through the transforms module, which contains common image transformations that can be chained together using Compose. The ToTensor transform is introduced as a main transform for turning image data into tensor format.

  • 14:35:00 In this section of the PyTorch for Deep Learning & Machine Learning course, the instructor covers the fundamental computer vision libraries in PyTorch, including torchvision, modules stemming off torchvision, and torch.utils.data.Dataset, which is the base dataset class for PyTorch. The instructor also discusses the importance of using Matplotlib for visualization and the necessity of converting images into tensors to use with models. The instructor then introduces the FashionMNIST dataset, which is a take on the original MNIST database, featuring grayscale images of pieces of clothing. This dataset will be used to demonstrate computer vision techniques. The instructor explains that while serious machine learning researchers consider MNIST to be overused and not representative of modern computer vision tasks, FashionMNIST is a useful dataset to get started with.

  • 14:40:00 In this section, the instructor discusses how to download and use datasets from the TorchVision library. They mention various image classification datasets such as Caltech101, CIFAR-100, and CIFAR-10, and how to download them using the torchvision.datasets module. The instructor then goes on to demonstrate how to download and use the Fashion-MNIST dataset, explaining the various parameters and transforms that can be applied to the dataset. They also provide sample code to download both the training and testing datasets.

  • 14:45:00 In this section, the instructor explains how to use PyTorch's torchvision.datasets to download example computer vision datasets, specifically the FashionMNIST dataset. We can store the data in a variable called "data" and use torchvision.transforms to convert the image data into tensors. The instructor also demonstrates how to check the length of the training and testing datasets, view a training example's image and label, and get more information about the class names through the use of attributes such as ".classes" and ".class_to_idx". Finally, they explain that a label does not have a shape because it's only an integer.

  • 14:50:00 In this section, the instructor discusses the input and output shapes of the Fashion MNIST dataset, which is comprised of grayscale images of different types of clothing. The input shape of the images is in NCHW format, where the batch size is set to "none," and the output shape is 10. To better understand the data, the instructor uses Matplotlib to visualize an image and its shape, but encounters an error because the data format does not match the expected format. The section emphasizes the importance of understanding input and output shapes and formatting when working with machine learning models.

  • 14:55:00 In this section, the video explores how to plot and visualize image data using PyTorch and Matplotlib. The instructor demonstrates how to plot a single image and remove extra dimensions using image.squeeze(). Next, they plot a set of 16 random images from the dataset using a fixed random seed and Matplotlib's subplot function. The instructor also shows how to use the cmap parameter to change the plot's color map to grayscale. They then discuss the importance of visually exploring a dataset to gain a better understanding of the data and identify potential issues, such as similarities between pullovers and shirts in the dataset.
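
The shape handling described at 14:20:00, 14:50:00 and 14:55:00 (channels first versus channels last, and squeezing away the single color channel before plotting) can be illustrated without any library. The sketch below uses nested Python lists in place of tensors; shape, squeeze_channel and chw_to_hwc are hypothetical helpers written for this example, standing in for tensor.shape, image.squeeze() and a channel permutation.

```python
def shape(nested):
    """Shape of a non-empty rectangular nested list, e.g. [1, 28, 28]."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return dims


def squeeze_channel(chw):
    """[1, H, W] -> [H, W]: drop a singleton channel dimension."""
    if len(chw) != 1:
        raise ValueError("expected exactly one channel")
    return chw[0]


def chw_to_hwc(chw):
    """[C, H, W] -> [H, W, C]: move the color channels last."""
    channels, height, width = shape(chw)
    return [[[chw[c][h][w] for c in range(channels)]
             for w in range(width)]
            for h in range(height)]


# A blank grayscale image stored channels first, like one FashionMNIST sample.
gray = [[[0.0] * 28 for _ in range(28)]]
print(shape(gray))                    # [1, 28, 28]
print(shape(squeeze_channel(gray)))   # [28, 28]

# A tiny 3-channel image (C=3, H=2, W=3) with distinguishable values.
rgb = [[[c * 10 + h * 3 + w for w in range(3)] for h in range(2)]
       for c in range(3)]
hwc = chw_to_hwc(rgb)
print(shape(hwc))    # [2, 3, 3]
print(hwc[1][2])     # [5, 15, 25]
```

A plotting function that expects [H, W] or [H, W, C] input would reject the [1, 28, 28] layout, which matches the error mentioned in the 14:50:00 summary.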

Part 16

  • 15:00:00 In this section, the instructor explains the importance of preparing data for a computer vision model and how to do so using PyTorch data sets and data loaders. He also discusses the potential need for nonlinearity in modeling the 60,000 images of clothing to be classified into 10 different classes and how breaking the data set into smaller batches can improve computational efficiency. The goal of this preparation is to create a Python iterable that can be used by the model to identify patterns in the data.

  • 15:05:00 In this section, the instructor explains the concept of mini batches and why it's commonly used in deep learning, starting with breaking down a dataset of 60,000 images into batches of 32. The two main reasons for using mini batches are to make the neural network more computationally efficient by avoiding GPU memory limitations and to give the network more chances to update its gradients per epoch. The data is batched using the data loader from torch.utils.data by passing it a data set, defining the batch size, and setting shuffle to true to avoid the network memorizing the order of the data. The instructor provides code to create train and test data loaders, which will be used in the training loop.

  • 15:10:00 In this section, the importance of mini-batches in deep learning problems is emphasized, and the process of creating train and test data loaders is explained using PyTorch. The batch size hyperparameter is set to 32, and the data sets are turned into iterables. The train and test data sets are loaded using DataLoader, with the batch size set to 32 for the train data and the test data, and shuffle set to True for train data and False for test data. The attributes of the train data loader, such as batch size and data set, are explored. The length of both train and test data loaders is printed to determine the number of batches in each.

  • 15:15:00 In this section, the instructor discusses how to visualize batches of images using PyTorch. The transcript excerpt shows how the length of train data loader is determined, based on the batch size and number of training samples. Then, the instructor shows how to visualize a single image from a batch using randomness and checks the image size and label associated with that sample. The instructor emphasizes that these input and output shapes will vary depending on the specific problem, but the basic premise remains the same – data is turned into batches to pass it to a model.

  • 15:20:00 In this section, the video instructor explains how to visualize images in a batch and turn the data into data loaders. They also introduce the concept of a baseline model, which is a simple model used as a starting point that can be improved upon later through experimentation. The instructor then introduces a new layer, "flatten", which flattens a contiguous range of dims into a tensor for use with sequential, and shows how to use it as a standalone model.

  • 15:25:00 In this section, we learn about flattening and how it's used to transform multi-dimensional data into a single vector. After printing the shapes before and after flattening, we see that the output is now a one-dimensional vector with a length of 784 (28 × 28). We also see that this process is similar to the encoding of information in Tesla's cameras for use in deep learning models. We then see how the flattened data will be used in the linear layer of our PyTorch model. The model is defined using nn.Sequential and includes a flatten layer and two linear layers. The input and output shapes are defined, and we see that the first linear layer's out features match the second linear layer's in features.

  • 15:30:00 In this section, the instructor explains how to create a simple neural network model using PyTorch. The model consists of a flatten layer followed by two linear layers, with no non-linearities. The forward method of the model is defined, which takes an input, passes it through the flatten layer, then through the two linear layers, and returns the output. The instructor then sets up an instance of the model and performs a dummy forward pass to ensure the model is working as expected. Additionally, they also explain the input and output shape of each layer and how they are arranged to obtain the desired output shape. Finally, they demonstrate the importance of using the flatten layer and why it is needed to combine the output of the previous layer into a single vector.

  • 15:35:00 In this section, the instructor reviews the previous video where they created model zero for a computer vision problem and reiterated the importance of ensuring input and output shapes align with where they need to be. They also explain that the weights and bias matrices represent different features in the images, which the model will learn through deep learning and machine learning. Moving forward, they discuss the selection of a loss function, optimizer, and evaluation metric for the model, choosing cross-entropy loss, stochastic gradient descent optimizer, and accuracy evaluation metric respectively. They also provide a reference to an online PyTorch resource for classification evaluation metrics.

  • 15:40:00 In this section, the video instructor discusses the concept of using helper functions in Python machine learning projects. He provides sample code for importing a Python script containing common functions, including a helper function called accuracy. The accuracy function calculates the accuracy metric, and the instructor demonstrates that it can be imported successfully by checking for a doc string. He also explains that using helper functions in Python projects can save a lot of time and effort, especially when dealing with common functionalities that do not need to be rewritten every time. Finally, he sets up the loss function as nn.CrossEntropyLoss and an optimizer to train the model.

  • 15:45:00 In this section, the instructor sets up the optimizer for stochastic gradient descent and sets a relatively high learning rate of 0.1 for the simple dataset of 28x28 images. They then discuss the importance of tracking a model's performance as well as its runtime, as there is often a trade-off between the two. They proceed to demonstrate how to create a function to time the model's training using the time module in Python and passing in the torch.device to compare how fast the model runs on different devices.

  • 15:50:00 In this section, the instructor discusses the importance of timing functions for measuring how long a model takes to train. He demonstrates how to create a timer using default_timer from Python's "timeit" module and shows how to incorporate it into the training process. The instructor also explains how to use Google Colab's reconnection feature and provides a reminder about using data loaders to split data into batches for training. He then outlines the steps involved in creating a training loop and training a model on batches of data, emphasizing the need to loop through epochs and batches, perform training steps, and calculate the train loss per batch. Finally, he mentions that the model will be evaluated at the same step as training.

  • 15:55:00 In this section, the instructor begins by importing tqdm for a progress bar that will indicate how many epochs the training loop has gone through. tqdm is an open source Python progress bar with low overhead, and because it is so popular, it is built into Google Colab. The instructor sets the seed and starts the timer before setting the number of epochs to three, which keeps training time short so that more experiments can be run. They create a training and test loop, instantiate the train loss, and calculate the training loss per epoch. The data is already split into batches, and a loop is added to go through the training batch data.
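
The batching described at 15:05:00 through 15:15:00 can be sketched with the standard library alone. This is a rough stand-in for what a data loader does, not the real torch.utils.data.DataLoader: make_batches and its seed argument are hypothetical, the datasets are just lists of indices, and the test-set size of 10,000 is FashionMNIST's known size rather than a figure quoted in the summaries.

```python
import random


def make_batches(dataset, batch_size, shuffle, seed=42):
    """Split a dataset into lists of at most batch_size items."""
    indices = list(range(len(dataset)))
    if shuffle:
        # Shuffle so the model cannot memorize the order of the samples.
        random.Random(seed).shuffle(indices)
    return [[dataset[i] for i in indices[start:start + batch_size]]
            for start in range(0, len(indices), batch_size)]


BATCH_SIZE = 32
train_data = list(range(60_000))   # stand-in for 60,000 training images
test_data = list(range(10_000))    # stand-in for 10,000 test images

train_batches = make_batches(train_data, BATCH_SIZE, shuffle=True)
test_batches = make_batches(test_data, BATCH_SIZE, shuffle=False)

print(len(train_batches))      # 1875 batches of 32
print(len(test_batches))       # 313 batches
print(len(test_batches[-1]))   # 16 samples in the last, partial batch

# Flattening one 1 x 28 x 28 grayscale image gives a vector of this length.
print(1 * 28 * 28)             # 784
```

With 1,875 batches per epoch, the optimizer gets 1,875 parameter updates per pass over the training data instead of one.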

PyTorch for Deep Learning & Machine Learning – Full Course
  • 2022.10.06
  • www.youtube.com
Learn PyTorch for deep learning in this comprehensive course for beginners. PyTorch is a machine learning framework written in Python.✏️ Daniel Bourke develo...
 

PyTorch for Deep Learning & Machine Learning – Full Course (description of parts 17-22)


PyTorch for Deep Learning & Machine Learning – Full Course


Part 17

  • 16:00:00 In this section, the instructor sets up the training loop for the neural network. The model is put in training mode and the forward pass is performed to calculate the loss. The training loss values are accumulated every batch and the optimizer is updated once per batch rather than once per epoch. The instructor also prints out how many samples have been seen so far and the average training loss per batch for each epoch for monitoring purposes. This loop will continue until all batches have been processed in the train data loader.

  • 16:05:00 In this section, the instructor walks through the code for the testing loop in PyTorch, which involves setting up a test loss variable and using a forward pass to evaluate the patterns learned on the training data. The accuracy for testing is also calculated using the downloaded accuracy function, and the test loss and test accuracy values are accumulated per batch and then divided by the number of batches to find the average per epoch. These values are then printed out to track the progress of the model.

  • 16:10:00 In this section, the instructor discusses the final steps of setting up the training loop, which includes printing the train loss, test loss, and test accuracy, as well as calculating the training time. They also provide troubleshooting tips for potential errors that may arise while coding in PyTorch. They conclude by demonstrating how to run the code and show the progress bar of the training loop.

  • 16:15:00 In this section, the instructor discusses the results of the training loop and introduces the idea of baseline accuracy and training time. He highlights that the numbers may vary slightly due to the inherent randomness of machine learning and the hardware used. The instructor then moves on to evaluate the model by creating a function that evaluates a given model, so that multiple models can be built and their results compared later on. The function takes in a model, a data loader, a loss function, and an accuracy function and returns a dictionary containing the results of the model predicting on the data loader. He explains that the function is similar to the testing loop but is functionized to be used with multiple models and data loaders. The instructor also mentions that the following sections will cover making predictions and evaluating the model on the GPU and with a convolutional neural network.

  • 16:20:00 In this section, the video creator shows how to make the loss and accuracy function generalizable so that it can be used with any model and data loader. They demonstrate how to accumulate the loss and accuracy values per batch, scale them to find the average loss/accuracy per batch, and return the results in a dictionary format. They also show how to create a new function to calculate the model's results on the test dataset, using the functions that were defined earlier. Finally, they use a progress bar to track the performance of the model on the test dataset.

  • 16:25:00 In this section of the PyTorch for Deep Learning and Machine Learning course, the instructor discusses how to set up device-agnostic code to train models on both CPUs and GPUs. They demonstrate checking for the availability of CUDA and switching to the GPU to leverage its processing power. The instructor also recommends starting with smaller datasets and models before increasing complexity and size. Finally, they propose building a new model for the dataset and testing its performance with and without nonlinearities while running on the GPU.

  • 16:30:00 In this section, the instructor introduces the concept of nonlinearity in neural networks and encourages viewers to experiment with creating a model with nonlinear functions. The benefits of nonlinearity for modeling nonlinear data are discussed, and the instructor explains how to implement a neural network with both linear and nonlinear layers using PyTorch. They walk through the code step-by-step and emphasize the importance of experimentation in finding the best model for a given dataset.

  • 16:35:00 In this section, the instructor discusses customizing neural networks with linear and nonlinear functions, and demonstrates adding in two ReLU activation functions to a previously defined network. The forward method is then overridden to allow the input to pass through the layer stack, and the model is instantiated on the device. The video then moves on to creating a loss function, optimizer, and evaluation metrics for the new model, which has added non-linear layers, emphasizing the importance of running experiments to understand how different functions may influence neural networks.

  • 16:40:00 In this section, the speaker discusses creating helper functions and functionizing training and evaluation loops in PyTorch. They mention importing the accuracy function and setting up a loss function as well as the optimizer. The next step is to build training and evaluation loops as functions so that they can be called repeatedly without the risk of errors. The speaker walks through the process of creating a train step function, which requires a model, data loader, loss function, optimizer, and optionally, an accuracy function and target device as inputs. The train step function loops through a data loader, performs a forward pass, calculates the loss, backpropagates, and updates the model parameters with the optimizer.

  • 16:45:00 In this section, the presenter explains how to perform a training step in PyTorch. They start by defining the inputs of the function, including a model, a data loader, a loss function, an optimizer, and a device. Then, they go through each line of the function, starting with a loop through the data loader and putting the data on the target device. They also add an accuracy function to accumulate the accuracy score per batch. Finally, at the end of the training step, they calculate the average loss and accuracy per batch, and print out the results.

  • 16:50:00 In this section, the instructor functionizes the testing loop by creating a test step function that takes in a model, data loader, loss function, accuracy function, and device as inputs. The instructor demonstrates how to set up a test loss and accuracy and put the model in eval mode before looping through the data loader and performing a forward pass. The instructor also explains the importance of using the inference mode context manager and creating device-agnostic code. The test predictions are calculated by passing X through the model, and the test loss and accuracy are accumulated per batch using the respective functions. Finally, the instructor converts the output logits to prediction labels by taking the argmax.

  • 16:55:00 In this section of the PyTorch full course, the instructor demonstrates how to create a test step function and an optimization and evaluation loop using the train step and test step functions. The new functions are used to train a model and evaluate its accuracy for three epochs. The instructor also shows how to measure the time it takes the model to run on the GPU versus the CPU.
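
The train step outlined at 16:40:00 and 16:45:00 (loop over batches, forward pass, loss, backpropagation, parameter update, then average the loss per batch) and the timer from 15:50:00 can be sketched without PyTorch. Everything here is an assumption made for illustration: the model is a one-variable linear fit, the gradients of the mean squared error are written out by hand in place of loss.backward() and optimizer.step(), and the data, the train_step signature and the 200 epochs are invented (the course trains for three epochs).

```python
from timeit import default_timer as timer


def train_step(w, b, batches, lr):
    """One pass over all batches; returns updated w, b and the mean batch loss."""
    total_loss = 0.0
    for xs, ys in batches:
        # 1. Forward pass
        preds = [w * x + b for x in xs]
        # 2. Loss (mean squared error), accumulated per batch
        errors = [p - y for p, y in zip(preds, ys)]
        total_loss += sum(e * e for e in errors) / len(errors)
        # 3. Gradients of the loss, derived by hand for this tiny model
        grad_w = sum(2 * e * x for e, x in zip(errors, xs)) / len(errors)
        grad_b = sum(2 * e for e in errors) / len(errors)
        # 4. Gradient descent update, once per batch
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, total_loss / len(batches)


# Toy data following y = 2x + 1, split into 5 batches of 4 samples.
xs = [i / 10 for i in range(20)]
ys = [2 * x + 1 for x in xs]
batches = [(xs[i:i + 4], ys[i:i + 4]) for i in range(0, len(xs), 4)]

w, b = 0.0, 0.0
history = []
start = timer()
for epoch in range(200):
    w, b, avg_loss = train_step(w, b, batches, lr=0.1)
    history.append(avg_loss)
elapsed = timer() - start

print(f"w={w:.3f} b={b:.3f}")
print(f"first epoch loss={history[0]:.4f} last epoch loss={history[-1]:.6f}")
print(f"train time: {elapsed:.4f} seconds")
```

Wrapping the per-batch work in a function keeps the epoch loop short and makes the same step reusable across experiments, which is the motivation given in the summaries.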

Part 18

  • 17:00:00 In this section, the instructor takes us through setting up the test step for our deep learning model and creating a timer to measure the training time. The code is simple and efficient as it is designed to be reusable in future projects. Furthermore, we run the second modeling experiment, which uses nonlinear layers, and compare its results with those of the first model. Though the second model was slightly faster in terms of training time, it did not produce better results than the previous one. The instructor notes that your numbers might not be the same as his but should be quite similar in direction. Finally, the instructor explains that our model is not too complex, and our data set is not very large, so this could explain why the CPU and GPU training times are not dramatically different.

  • 17:05:00 In this section, the instructor explains that sometimes a model trains faster on a CPU than on a GPU. The two main reasons for this are that the overhead of copying the data and model to and from the GPU outweighs the compute benefits offered by the GPU, and that the hardware being used has a better CPU in terms of compute capability than the GPU. However, the instructor notes that this is less common and that, generally, a modern GPU is faster at computing deep learning algorithms than a general CPU. Next, the instructor shares a resource that talks about how to make deep learning go faster by optimizing bandwidth and overhead costs, among other things. Finally, the instructor creates a results dictionary for model one to compare modeling results later.

  • 17:10:00 In this section, the instructor discusses a common error that can occur in deep learning models regarding device mismatches between data and model. He explains that the error occurred because the data and model were on different devices and suggests a fix by making the code device-agnostic. He also cautions that it is always better to create device agnostic code. Finally, the instructor introduces the next modeling experiment, which involves building a convolutional neural network (CNN) and explains the architecture of a typical CNN.

  • 17:15:00 In this section, the instructor explains the different types of layers in a simple convolutional neural network (CNN). The CNN starts with an input image, preprocessed into a tensor with red, green, and blue color channels. The input then passes through a combination of convolutional layers, ReLU layers, and pooling layers. The deep learning model can have more layers added to it to find more patterns in the data, with each layer performing a different combination of mathematical operations on the data. The instructor demonstrates the CNN using the CNN explainer website, where the input of different images passes through different layers, with the final output being the class with the highest value.

  • 17:20:00 In this section, the instructor explains how convolutional neural networks work and the beauty of deep learning. Each layer of the network is designed to learn different features of the data, with the network itself figuring out the best way to learn those features. The instructor then introduces CNN explainer website as a resource for learning more about convolutional neural networks, but also encourages learners to join in replicating the neural network in PyTorch code. The instructor then proceeds to build a tiny VGG convolutional neural network in PyTorch and explains that authors of research papers get to name new model architectures to make it easier for future reference. The code is initialized with input shape, hidden units, and output shape, which are typical parameters in building a PyTorch model.

  • 17:25:00 In this section, the instructor explains how to create a neural network using blocks in PyTorch, which are often referred to as convolutional blocks. These blocks are comprised of multiple layers and an overall architecture is comprised of multiple blocks. The instructor shows how to create convolutional blocks by writing two examples of layers with hyperparameters such as in and out channels, kernel size, stride, and padding. The instructor also provides interactive resources for learners to understand the basics of hyperparameters and encourages them to go through it.

  • 17:30:00 In this section, the instructor walks through the code for building a deep learning model using PyTorch, specifically focusing on the convolutional block layers. The model takes in 2D image data and the layers are used to learn a compressed representation of the input data, with max pooling used to take the maximum value of the input data. The code is broken down into two blocks, and then an output layer is added. The inputs of the final layer are flattened before being put through the last linear layer to create the final output.

  • 17:35:00 In this section, the instructor builds the classifier layer for a convolutional neural network (CNN) called tiny VGG, which has two layers that act as feature extractors, and a final layer that classifies these features into target classes. The instructor codes the classifier layer using sequential and passes in a flatten layer to flatten the output of the two previous layers into a single feature vector. The feature vector is then passed to an nn.Linear layer, whose in features are calculated from the number of hidden units and whose out features are set to the number of classes. Finally, the instructor sets up the forward method, and prints out x.shape to track the shape changes of each layer. The instructor establishes the input shape for the CNN model, which has only one color channel for black and white images, sets the hidden units value for each layer, and finishes by instantiating the model.

  • 17:40:00 In this section of the video, the instructor goes through the code they wrote in the previous section to create a convolutional neural network using PyTorch. They identify and correct some typos in the code and explain that the nn.MaxPool2d layer does not have any learnable parameters. They then introduce the nn.Conv2d layer and explain that its weight tensor and bias value manipulate the input to produce the output. They show how to reproduce the first layer of the CNN explainer website using a dummy input in PyTorch and provide a link to the PyTorch documentation for further reading. They also demonstrate how to create a batch of images in PyTorch style, with color channels first.

  • 17:45:00 In this section, the video tutorial explores the composition of a PyTorch model and how convolutions work in practice. The instructor explains how the model, which starts out comprised of random numbers, adjusts those numbers to best represent the data using conv2d layers. After passing some random data through one of these layers, the tutorial dives into the kernel size and how it determines the operation performed by the convolution. The instructor elaborates on the purpose of a convolutional layer, which is to ensure that this kernel is able to correctly perform the operation to provide the right output.

  • 17:50:00 In this section, the instructor explains the effects of changing the stride and padding values in a convolutional layer in PyTorch. A stride value of 1 means that the convolution hops over one pixel at a time, whereas a stride value of 2 hops over two pixels at a time, leading to a decrease in the output size. Meanwhile, adding padding to the edges of the image allows the kernel to operate on image information at the edges. The instructor also notes that when unsure about what values to set for the different parameters, it's common to copy existing values and adjust as needed. The section concludes with a demonstration of how to add a batch dimension to a test image and pass it through a convolutional layer in PyTorch.

  • 17:55:00 In this section, the video covers the convolutional and max pooling layers in PyTorch for deep learning and machine learning. The video demonstrates how to use PyTorch to create convolutional layers by passing a test image through a convolutional layer to produce an output. By playing with the values of kernel size, stride, and padding, users can observe how the output size changes. The video also covers the max pooling layer and shows how to create a sample max pooling layer with a kernel size of two.
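
The effect of kernel size, stride and padding on output size (17:50:00 and 17:55:00) follows the standard formula floor((n + 2p - k) / s) + 1 for a convolution without dilation. The sketch below applies that formula and adds a plain-Python max pooling on a nested list. The helper names are hypothetical, and the tiny-VGG settings used in the trace (3x3 kernels, stride 1, padding 1, 2x2 pooling, 10 hidden units) are assumptions for illustration rather than values stated in the summaries.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Output height or width of a convolution or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel_size) // stride + 1


def max_pool2d(grid, kernel_size):
    """Non-overlapping max pooling (stride equal to kernel_size) on a 2D list."""
    rows = len(grid) // kernel_size
    cols = len(grid[0]) // kernel_size
    return [[max(grid[r * kernel_size + i][c * kernel_size + j]
                 for i in range(kernel_size)
                 for j in range(kernel_size))
             for c in range(cols)]
            for r in range(rows)]


# How stride and padding change the output size of a 64 x 64 test image.
print(conv_output_size(64, kernel_size=3))                        # 62
print(conv_output_size(64, kernel_size=3, padding=1))             # 64
print(conv_output_size(64, kernel_size=3, stride=2, padding=1))   # 32

# Max pooling keeps the largest value in each 2 x 2 window.
grid = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
print(max_pool2d(grid, kernel_size=2))   # [[6, 8], [14, 16]]

# Trace a 28 x 28 image through two assumed conv blocks (conv, conv, pool).
size = 28
for _ in range(2):
    size = conv_output_size(size, kernel_size=3, stride=1, padding=1)
    size = conv_output_size(size, kernel_size=3, stride=1, padding=1)
    size = conv_output_size(size, kernel_size=2, stride=2)  # 2 x 2 max pool
hidden_units = 10  # assumed value
print(size)                          # 7
print(hidden_units * size * size)    # 490 inputs to the final linear layer
```

Under these assumed settings the flattened feature vector has hidden_units × 7 × 7 entries, which is why the in features of the classifier's linear layer depend on the number of hidden units.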

Part 19

  • 18:00:00 In this section, the instructor demonstrates how to pass data through a convolutional layer and a max pool layer in PyTorch. They start by passing the test image through the conv layer and printing out the shape. They then pass the output of the conv layer through the max pool layer and again print out the resulting shape. The instructor explains that the max pool layer takes the maximum over a certain range of the input tensor and so reduces the output size of the convolutional layer. They also demonstrate how the shapes change if the values of the layers and parameters are altered.

  • 18:05:00 In this section, the instructor explains the concept of Max Pooling in Convolutional Neural Networks (CNNs). The goal is to compress the input data into a smaller feature vector that can be used for future predictions. Max pooling involves taking the maximum value of a certain section of input data to determine the most important feature in that region. The instructor demonstrates the effects of varying the kernel size for max pooling and how it affects the feature space. They also provide a visual example using a smaller random tensor to show the process of max pooling. Overall, max pooling is a useful technique for reducing data dimensionality while maintaining important features for predictions.

  • 18:10:00 In this section of the course, the instructor discusses the purpose of the max pool layer in a convolutional neural network, which is to compress the learned features from the convolutional layer into a smaller space, ultimately leading to a compressed representation of the input data that can be used to make predictions. The instructor also challenges the viewers to create a dummy tensor and pass it through the tiny VGG network that they constructed in the previous videos to see what happens to the shape of the dummy tensor as it moves through the convolutional blocks. Finally, the instructor explains that the purpose of replicating a model from somewhere else and passing data through it is a common practice in deep learning.

  • 18:15:00 In this section, the instructor provides an example of a forward pass in PyTorch and demonstrates how to deal with shape mismatch errors. They use a previously created image from the Fashion MNIST dataset and create a tensor of the same shape as the image. However, they get an error because the tensor is missing the batch dimension that the model expects. To fix this, they unsqueeze the tensor at dimension zero to add the batch dimension. They also ensure that the tensor is on the same device as the model and demonstrate how to troubleshoot and find the shapes needed for different layers of the model. The section concludes with the instructor recreating the model using the gathered information about shapes.

  • 18:20:00 In this section of the video, the instructor demonstrates a trick for debugging the shapes of the layers in a neural network model. By passing dummy data through the model and printing the shapes of the output at each layer, the instructor is able to determine where the shape mismatches are occurring and to identify issues with the classifier layer. The instructor then shows how to calculate the input and output shapes of convolutional layers by hand, but also asserts the benefits of using code to perform these calculations. Finally, the instructor uses the trick to ensure that the model is compatible with matrix multiplication rules and to confirm that the model can process data with the desired shape.

  • 18:25:00 In this section, the instructor discusses the input and output shapes of each layer in the model they've built. They pass a random tensor through the model and obtain an output of shape (1, 10), since they have ten classes in their dataset. They then move on to setting up a loss function and optimizer for their second model and explain how they are going to train their first convolutional neural network (CNN). They import an accuracy function, set up a cross entropy loss function, and keep the optimizer the same as before, torch.optim.SGD. They then introduce the train step and test step functions that will be used to train model two, which they'll cover in detail in the next video. Finally, they set up the training and testing functionality by performing the training step with the model on a data loader.

  • 18:30:00 In this section, the video focuses on training a convolutional neural network, using tqdm to show progress and a timer to measure how long training takes. They set up the accuracy function, loss function, optimizer, train data loader, and test data loader. They also record the end time to know how long the code has taken to run. They had a code issue with a printout, but fixed it and successfully trained their first CNN, achieving a test accuracy of 88.5% in about 42 seconds. The video advises being aware that a better performing model generally takes longer to train.

  • 18:35:00 In this section, the instructor discusses the importance of comparing results and training time across different models in machine learning experiments. They introduce three model results dictionaries and create a data frame using pandas to compare the accuracy, loss, and training time of each model. They find that the convolutional neural network (Model 2) outperformed the other models with an accuracy of 88% and encourage viewers to experiment with different model architectures, hyperparameters, and training times for improved results. The instructor emphasizes the importance of considering the trade-off between model performance and speed in practical applications.

  • 18:40:00 In this section, the instructor discusses comparing the results of the three experiments conducted in the previous section using a data frame and a graph. The training time and accuracy are compared for each model, and the instructor notes that the training time will vary depending on the hardware used. The best performing model was the convolutional neural network, but it had the longest training time. The instructor suggests trying to make predictions on random samples from the test data set using the best performing model.

  • 18:45:00 In this section, the instructor discusses how to create a function called "make predictions" to evaluate a trained machine learning model. The function takes in a model of type torch.nn.Module, some data, and a device type. The goal is to take random samples from the test dataset, make predictions on them using the model, and visualize the predictions. The function prepares each sample by unsqueezing it and passing it to the target device. Then it does a forward pass on the model to get the raw logits and applies the softmax activation function to get the prediction probabilities. Finally, the list of prediction probabilities for the samples is stacked into a single tensor, ready to be turned into prediction labels. The section ends with a demonstration of the function in action using test samples.

  • 18:50:00 In this section, the instructor explains how to randomly sample test data and create test labels to evaluate the model's predictions. The test data is not yet converted into a data loader, and the code samples nine random test data samples. The instructor emphasizes the importance of making predictions on random test data samples even after training the model to understand how the model is performing. The instructor also discusses how to convert prediction probabilities into prediction labels using argmax to take the index of the highest value in the probabilities.

  • 18:55:00 In this section, the instructor writes code to plot the predictions and images for random samples. The code creates a Matplotlib figure with three rows and three columns, and enumerates through each sample in the test samples. For each sample, a subplot is created and the target image is plotted. The prediction label and truth label are also found and converted to text form using class names and the pred classes and test labels indexes. Finally, a title is created for the plot and the color of the title text is changed to green if the prediction label is equal to the truth label, and to red if they are not equal.
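
The prediction steps summarized in sections 18:45:00 to 18:55:00 (raw logits, then softmax probabilities, then an argmax label, then a green or red title) can be traced with a few lines of plain Python. This is only a sketch of the arithmetic, not the instructor's PyTorch code; the class names, logits, and truth label below are made-up values for illustration.

```python
import math


def softmax(logits):
    """Turn raw logits into probabilities that sum to 1."""
    largest = max(logits)
    # Subtracting the largest logit keeps exp() numerically stable.
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


class_names = ["T-shirt/top", "Trouser", "Sandal"]  # hypothetical subset of classes
logits = [0.5, 2.0, -1.0]                           # hypothetical model output
truth_label = 1                                     # hypothetical ground truth

probs = softmax(logits)
pred_label = argmax(probs)

# Same coloring rule as the plot titles: green if correct, red otherwise.
title_color = "green" if pred_label == truth_label else "red"
title = f"Pred: {class_names[pred_label]} | Truth: {class_names[truth_label]}"
print(title, title_color)
```

Because softmax preserves the ordering of the logits, taking the argmax of the probabilities gives the same label as taking the argmax of the logits directly.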

Part 20

  • 19:00:00 In this section, the presenter discusses the importance of visualizing machine learning model predictions and shows how to plot the predictions of a trained convolutional neural network (CNN) on randomly selected samples from a test dataset. The presenter demonstrates how to plot images with their predicted and true labels and change the color of the title text depending on whether the prediction is correct or not. By analyzing the predictions, the presenter shows the potential confusions between label classes and suggests that visualizing predictions can provide insights into improving the labels or the model itself. The presenter then introduces the concept of confusion matrix as another way of evaluating model performance by comparing the predicted and true labels of a large set of test samples.

  • 19:05:00 In this section, the instructor discusses how to evaluate a multi-class classification model using a confusion matrix in PyTorch. The confusion matrix is a visual representation that shows the performance of the model on different classes. The instructor explains how to use torchmetrics for evaluation metrics and shows how to access its confusion matrix metric. Additionally, mlxtend is used to plot the confusion matrix. Importantly, the instructor explains that Google Colab does not ship with the required version of mlxtend, and that version 0.19.0 or higher is needed. Finally, the video shows how to make predictions across the test dataset and how to set the model into evaluation mode with torch.inference_mode as the context manager.

  • 19:10:00 In this section, the instructor demonstrates how to iterate through the test data loader to make predictions using PyTorch. The predictions are appended to a list and then concatenated into a tensor using torch.cat. The resulting tensor has one prediction per test sample. The instructor also installs torchmetrics and shows how to use a try/except block to import it. Finally, the required version of mlxtend is checked using an assert statement, since it is needed for the plot confusion matrix function.

  • 19:15:00 In this section, the instructor explains how to install and upgrade packages in Google Colab using the example of installing torchmetrics and upgrading mlxtend to version 0.19.0 or above. The instructor walks through the code and explains how to check whether the installation went well and what to do if errors come up during the process, including how to restart the runtime if necessary. Once the installation is complete, the instructor moves on to explain how to create a confusion matrix with the predictions that were made across the entire test data set in the previous video.

  • 19:20:00 In this section, the instructor explains how to create and plot a confusion matrix to evaluate the performance of a deep learning model. First, the confusion matrix class from torchmetrics and the plot confusion matrix function from mlxtend are imported. Then, a confusion matrix instance is set up by passing the number of classes as the length of the class names list. The confusion matrix tensor is created by passing in predictions and targets on the test dataset. Finally, the confusion matrix is plotted using the plot confusion matrix function by passing in the confusion matrix tensor and class names list. The result is a plot whose diagonal shows correct predictions and whose off-diagonal cells show where the model is making errors.

  • 19:25:00 In this section, the instructor explains the importance of using a confusion matrix to evaluate a classification model's predictions visually, especially when the errors made by the model are significant, such as confusing two similar-looking data classes like shirts and coats. Using a confusion matrix is a powerful way to evaluate a model's performance and can help identify any issues with the existing labels. He also talks about the importance of saving and loading a trained model, especially when the model's performance is satisfactory. By saving the model to a file, it can be used elsewhere or reloaded to ensure it has been saved correctly. The instructor walks through how to create a model directory path and a model save path and then shows how to save the model state dict using the torch.save method.

  • 19:30:00 In this section, the instructor demonstrates how to save and load a PyTorch model. The state dictionary is saved, which represents all the learned parameters of the model after it has been trained. To load the saved model, a new instance is created with the same parameters as the original one. It is important to set up the loaded model with the same parameters as the original to avoid a shape mismatch error. The loaded model is then evaluated to ensure that it produces similar results as the original model. The instructor emphasizes the importance of evaluating a model after saving and loading it to ensure that it has been saved correctly.

  • 19:35:00 In this section, we see that the loaded model produces the same results as the previously trained model before it was saved, and we can use torch.isclose to programmatically check if model results are close to each other. The absolute tolerance level can be adjusted to decide how similar the results must be, and if there are discrepancies, it's recommended to check that the model is saving correctly and that random seeds are set up. The workflow for a computer vision problem is also discussed, from using reference materials and libraries like torchvision to evaluating the model and experimenting with non-linearity and convolutional neural network models to find the best one.

  • 19:40:00 In this section of "PyTorch for Deep Learning & Machine Learning", the instructor encourages viewers to practice what they have learned so far by going to the learnpytorch.io website and completing the exercises provided. The exercises focus on practicing the code and concepts covered in the previous sections, and there is also extra curriculum available for those who want to dive deeper into computer vision. Additionally, the section introduces the topic of PyTorch custom datasets and offers resources for getting help if needed, such as the PyTorch documentation and Stack Overflow.

  • 19:45:00 This section of the PyTorch course discusses how to work with custom datasets, as well as the different domain libraries such as torchvision, torchtext, torchaudio, and TorchRec. Domain libraries contain data loading functions for different data sources and come with built-in datasets; torchvision.datasets, for example, offers pre-built vision datasets like Fashion MNIST as well as tools for loading your own. Each domain library has a 'datasets' module that helps users work with datasets in that domain, so depending on whether you are working with vision, text, audio, or recommendation data, it's recommended to look into the matching PyTorch domain library.

  • 19:50:00 In this section, the instructor discusses how to load custom data sets into PyTorch for use in building a computer vision model. The model they will build is called FoodVision Mini, which will classify images of pizza, sushi, and steak. The instructor covers various steps involved in training a model, such as picking a loss function and an optimizer, building a training loop, and evaluating the model. They also explain how to transform data for use with a model and compare models with and without data augmentation. Finally, they show how to make predictions on custom data and provide a resource for accessing the video notebook in the PyTorch deep learning repo.

  • 19:55:00 In this section, the instructor discusses the process of getting your own data into PyTorch through custom datasets. They stress the importance of using domain libraries for data loading functions and customizable data loading functions and give examples of these libraries for various categories such as vision, text, audio, and recommendation. The instructor also demonstrates how to import the necessary libraries and set up device agnostic code for best practices with PyTorch. They show how to check for available CUDA devices and how to change the runtime type to use GPUs for faster processing. Finally, the instructor hints at obtaining data for working with in the next video.
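
The confusion matrix described in sections 19:05:00 to 19:25:00 is, at its core, a table of counts. The video builds it with torchmetrics and plots it with mlxtend; the sketch below instead counts by hand in plain Python to show what the cells mean. The labels are made-up data, and the layout (rows for true labels, columns for predicted labels) is a convention chosen for this example.

```python
def confusion_matrix(targets, preds, num_classes):
    """Count how often each true class (row) was predicted as each class (column)."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for target, pred in zip(targets, preds):
        matrix[target][pred] += 1
    return matrix


class_names = ["shirt", "coat", "sneaker"]  # hypothetical classes
targets = [0, 0, 0, 1, 1, 2, 2, 2]          # hypothetical true labels
preds = [0, 1, 0, 1, 0, 2, 2, 2]            # hypothetical predictions

matrix = confusion_matrix(targets, preds, num_classes=len(class_names))

# Correct predictions sit on the diagonal.
correct = sum(matrix[i][i] for i in range(len(class_names)))
accuracy = correct / len(targets)

for name, row in zip(class_names, matrix):
    print(f"{name:>8}: {row}")
print("accuracy:", accuracy)
```

In this toy data the off-diagonal counts appear only between "shirt" and "coat", which is the kind of confusion between similar-looking classes that the instructor points out.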

Part 21

  • 20:00:00 In this section of the course, the instructor introduces the Food 101 dataset, which includes 101 different food categories with 101,000 images, split into 750 training images and 250 testing images per class. However, to practice using PyTorch, the instructor has created a smaller subset of this dataset that includes three food categories and only 10% of the images, which works out to about 75 training images and 25 testing images per class. By starting with this smaller dataset, the goal is to speed up experimentation and reduce the time it takes to train models. The instructor provides a notebook on how to create this custom dataset and encourages students to start small and upgrade as necessary.

  • 20:05:00 In this section, the instructor explains the process of downloading and preparing an image dataset for PyTorch. The dataset includes images of pizza, steak, and sushi, which are stored in a folder called data. The instructor uses the Python requests library to download the data, and then unzips it into the data folder. The purpose of this section is to demonstrate how to load image data into PyTorch, which can be applied to any similar project. The instructor emphasizes the importance of having a separate directory for data, which can be located on a local computer or in the cloud.

  • 20:10:00 In this section, the instructor walks through how to use Python's zipfile library to extract data from a zip file. They use the example of extracting a zip file containing images of pizza, steak, and sushi for a machine learning computer vision problem. The instructor demonstrates how to extract the contents of the zip file to a specific file path using the ZipFile.extractall() method. They also address an error in the code that resulted from copying the wrong link address from GitHub, emphasizing the importance of ensuring the correct link is used to download data sets. Overall, the process shown can be used to download and extract any custom data set for use in PyTorch. The next video will explore the data further.

  • 20:15:00 In this section, the instructor discusses the importance of becoming one with the data through data preparation and exploration. He shares a made-up quote attributed to "Abraham Loss Function", emphasizing the need to spend ample time preparing the data set. The instructor then walks through each directory of the downloaded sample data, which is in a standard image classification format. He uses the os.walk function to generate a directory tree, displaying the number of directories and images present in each directory. Finally, the instructor sets up the training and testing paths and demonstrates the standard image classification setup for them.

  • 20:20:00 In this section, the instructor explains the standard image classification data structure, where an overall data set folder contains training and testing folders with class-named subdirectories that contain the respective images. The instructor notes that standardized ways of storing specific types of data exist as a reference for data formats. To prepare image data for use with PyTorch, code is written to convert the data into tensors. The instructor highlights the data format needed to classify dog and cat images, where the training and testing image directories hold the respective class folders. The instructor also outlines a plan to visualize an image: get all image paths, pick a random image path, and get the image class name using the pathlib module.

  • 20:25:00 In this section, the instructor explains how to open and manipulate images using the Python imaging library Pillow. First, they generate a list of all image paths within a specific folder and use Python's random library to randomly select an image from this list. They then open and display the image while also extracting metadata about it. Additionally, the instructor provides an overview of the capabilities of the torchvision library, including methods for loading and processing images.

  • 20:30:00 In this section, the instructor demonstrates how to work with images using the PIL library and how to open and analyze image metadata. The image class is the name of the directory where the image data is stored, and the metadata is obtained using the Print function. The instructor then shows some random food images from the data set, including pizza, sushi, and steak, and explains the importance of visualizing images randomly to become familiar with the data set. The instructor provides a little challenge for the viewers, which is to visualize an image using Matplotlib for the next section.

  • 20:35:00 In this section, the instructor shows how to plot images and data with matplotlib and turn images into arrays with NumPy (np). The importance of understanding the shape of data is emphasized to prevent shape-mismatch issues. The default format for the PIL library and matplotlib is color channels last, but PyTorch defaults to color channels first. The instructor also shows how to visualize different images to become familiar with the data, and explains that the next step is to transform the data by converting it into PyTorch tensors.

  • 20:40:00 In this section, the instructor discusses the process of transforming target data into PyTorch tensors and creating PyTorch data sets and data loaders. Using the PyTorch documentation, the instructor shows how to create data sets with the ImageFolder class and introduces the transform parameter, which allows specific transforms to be applied to the data. The instructor then demonstrates how to create a transform for image data which resizes the images to 64x64 and flips them randomly on the horizontal plane to artificially increase the diversity of the data set. This is done using transforms.Compose, which takes a list of transforms as its argument.

  • 20:45:00 In this section, the instructor explains how to transform an image into a torch tensor using the transforms module in PyTorch. This is done with "transforms.ToTensor()", which converts a PIL image or a NumPy array of shape height x width x color channels in the range 0 to 255 to a torch float tensor of shape color channels x height x width in the range 0 to 1. The instructor suggests trying to pass data through this transform and shows how to change the image shape using "transforms.Resize()". The section concludes with a discussion of the various transforms available in the torchvision library, including data augmentation transforms, and a preview of the upcoming visualization code to explore the transformed images.

  • 20:50:00 In this section, the instructor demonstrates how to randomly sample images from a path, load and transform them, and then compare the original and transformed versions using PyTorch. The code uses random.seed to set the seed for Python's random module and randomly samples k images from the list of image paths. The instructor then uses the matplotlib library to create a subplot with one row and two columns to plot the original and transformed images side by side. The transformed image needs to have its shape changed to fit matplotlib's preferred format of color channels last. Finally, the code sets the titles of the original and transformed images and sets the super title to the class name of the image.

  • 20:55:00 In this section of the video, the instructor demonstrates how to use transforms to manipulate image data for deep learning models using PyTorch. The instructor sets the transform to be equal to the data transform, which means the image will be resized, randomly horizontally flipped, and converted to a tensor. They also show how to use the permute function to rearrange the shape of the data by swapping the order of the axes. The images are then plotted to show the original and transformed versions side-by-side, with the transformed images in tensor format, which is ideal for use with deep learning models. The instructor advises that the size of the image is a hyper parameter that can be adjusted, and encourages viewers to explore the wide array of transforms available in PyTorch.
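
The layout change described in sections 20:35:00 and 20:45:00 (channels-last values in 0 to 255 becoming channels-first values in 0 to 1) can be illustrated without any libraries. The sketch below mimics that conversion on nested Python lists; it is not the torchvision implementation, and the helper names and the tiny 2 x 3 image are invented for the example.

```python
def to_channels_first(image_hwc):
    """Convert an H x W x C nested list of 0-255 ints to C x H x W floats in [0, 1]."""
    height = len(image_hwc)
    width = len(image_hwc[0])
    channels = len(image_hwc[0][0])
    return [
        [[image_hwc[h][w][c] / 255 for w in range(width)] for h in range(height)]
        for c in range(channels)
    ]


def shape(nested):
    """Shape of a regular nested list, as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


# Hypothetical RGB image with height 2 and width 3, stored channels last.
image_hwc = [
    [[255, 0, 0], [0, 255, 0], [0, 0, 255]],
    [[0, 0, 0], [255, 255, 255], [51, 102, 204]],
]

image_chw = to_channels_first(image_hwc)
hwc_shape = shape(image_hwc)
chw_shape = shape(image_chw)
print(hwc_shape, "->", chw_shape)
```

Going back the other way for plotting is the same axis reordering in reverse, which is what the permute call in the 20:55:00 section does on a tensor.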

Part 22

  • 21:00:00 In this section, the instructor explains how to load image data using ImageFolder. They use the torchvision.datasets module to show how to load data stored in the generic image classification format. The prebuilt ImageFolder class is demonstrated, which can be used to load all custom images into tensors with the help of transforms. The instructor then shows how to create the training data set with its transform and how to create a test data set, which is transformed in the same way as the training data set. Finally, they print out the created data sets.

  • 21:05:00 In this section, the instructor explains how to use ImageFolder in PyTorch to load images into tensors and transform them using a pipeline, so they can then be used with a PyTorch model. They demonstrate how to access and utilize various attributes that come with the prebuilt dataset, such as obtaining class names as a list or dictionary and checking the length of the data set. Additionally, they show how to visualize samples and labels from the training data set using indexing.

  • 21:10:00 In this section of the course, the instructor demonstrates how to convert an image dataset into tensor format, which is the default data format for PyTorch. They use an example of an image of pizza and show how to get its associated label and convert it to numeric format. They then print out some important information about the tensor data, like its data type and shape, which will be useful for troubleshooting later. Finally, they plot the image using matplotlib and set the title to the class name, which in this case is pizza. They encourage students to try this out with different images and explore different transforms.

  • 21:15:00 In this section of the video, the instructor discusses the process of converting loaded images into data loaders in PyTorch. A data loader turns a data set into an iterable, allowing users to customize the batch size and work with a specific number of images at a time. This is important because loading all the images at once risks running out of memory, so batching the images keeps memory use within what is available. The instructor goes on to provide a step-by-step guide to creating a train data loader and also introduces the num_workers parameter, which decides how many CPU cores are used to load data.

  • 21:20:00 In this section, the instructor discusses how to create and customize data loaders for training and testing data in PyTorch. He shows how to initialize data sets and customize hyperparameters for the data loaders, such as batch size and number of workers. The instructor also demonstrates how to iterate through the loaders and obtain information about the shape of the images and labels. The section concludes with a summary of the loading process and a recommendation to build a convolutional neural network to identify patterns in image tensors.

  • 21:25:00 In this section, the instructor discusses the process of creating a custom data loading class to load image data into tensor format. The goal is to replicate the functionality of ImageFolder through this custom class, as it is good practice and may be necessary in cases where a prebuilt function does not exist. The instructor lists the steps required for the custom class to load images, get class names as a list, and get classes as a dictionary from the data set. The pros and cons of creating a custom data set are also discussed, including the flexibility to create a data set out of almost anything, but also the potential for errors and performance issues. The instructor then imports the necessary modules to create the custom class, including os and pathlib to work with the file system.

  • 21:30:00 In this section, we are learning how to create a custom dataset in PyTorch and are focused on writing a function to get class names from the target directory. The custom dataset will be used to load images from a file, get class names from the dataset, and get classes as a dictionary from the dataset. The function will use os.scandir to traverse the target directory and get class names, and will raise an error if the class names are not found, indicating an issue with the directory structure. We will later subclass torch.utils.data.Dataset to create a custom dataset.

  • 21:35:00 In this section of the video, the instructor demonstrates how to create a function called "find_classes" that takes in a directory as a string and returns a list of class names and a dictionary that maps the class names to integers. The function uses os.scandir to scan the target directory and get the class names. The instructor also demonstrates how to raise an error if the class names cannot be found. The function can be used for any given directory, and it replicates what was done before for the training directory.

  • 21:40:00 In this section of the video, the instructor explains how to create a custom data set by subclassing torch.utils.data.Dataset. The data set should represent a map from keys to data samples, where keys refer to targets or labels, and data samples in this case are food images. The subclass should override the __getitem__ method, which fetches a data sample for a given key, and can optionally override the __len__ method to return the size of the data set. The instructor walks through the steps to build the custom data set and explains how to use the helper function created in the previous section to map class names to integers.

  • 21:45:00 In this section, the instructor explains how to create a custom data set in PyTorch. To do this, we need to subclass the torch.utils.data.Dataset class and initialize our custom data set by passing the target directory (where the data resides) and a transform (to perform any data transformations). Additionally, we need to create several attributes such as paths, transform, classes, and class_to_idx. Furthermore, we need to create a function to load images, override the __len__ method to return the length of our data set, and override the __getitem__ method to return a given sample when passed an index. Finally, we write the custom dataset using pathlib and pass in target directories like the test or train directories.

  • 21:50:00 In this section, the instructor explains the different steps involved in creating a custom PyTorch data set class. The first step involves getting all of the image paths that follow the correct file name convention. The next step is to set up the image transforms, which are optional. A function to load images is also created, which takes in an index and returns an image. The instructor then shows how to override the __len__ method to return the total number of samples in the data set, which is optional. Lastly, the __getitem__ method is overridden so that it returns a particular sample when passed an index.

  • 21:55:00 In this section, we learn about subclassing torch.utils.data.Dataset to customize the way data is loaded in PyTorch. The class is initialized with a root directory and a dictionary mapping class names to indices. The __len__ method returns the length of the dataset while the __getitem__ allows indexing into the dataset to return a tuple of a torch tensor image and the corresponding integer label. The class also has an optional transform parameter to apply transformations on the image before returning the tuple. The advantages of subclassing torch.utils.data.Dataset are the customization capabilities it offers, but it requires writing significant amounts of code that may be prone to errors.
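
The pieces described in sections 21:30:00 to 21:55:00 fit together roughly as follows. This is a stdlib-only sketch under stated assumptions: `find_classes` follows the behavior summarized above (scan with os.scandir, raise an error if no class folders are found), while `ImagePathDataset` is a hypothetical plain-Python stand-in that returns an image path and label instead of a tensor and does not subclass torch.utils.data.Dataset. The temporary folders and file names are invented for the demonstration.

```python
import os
import tempfile
from pathlib import Path


def find_classes(directory):
    """Return (sorted class names, class-name -> index dict) for a directory."""
    # Each immediate subdirectory is treated as one class.
    classes = sorted(entry.name for entry in os.scandir(directory) if entry.is_dir())
    if not classes:
        raise FileNotFoundError(f"Couldn't find any class folders in {directory}.")
    class_to_idx = {name: i for i, name in enumerate(classes)}
    return classes, class_to_idx


class ImagePathDataset:
    """Map-style dataset sketch: index in, (image path, integer label) out."""

    def __init__(self, target_dir):
        # Collect every .jpg stored as <target_dir>/<class_name>/<file>.jpg.
        self.paths = sorted(Path(target_dir).glob("*/*.jpg"))
        self.classes, self.class_to_idx = find_classes(target_dir)

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index):
        path = self.paths[index]
        # The parent folder name is the class name.
        return path, self.class_to_idx[path.parent.name]


# Build a throwaway directory tree in the standard classification layout.
with tempfile.TemporaryDirectory() as tmp:
    for class_name in ("pizza", "steak", "sushi"):
        class_dir = Path(tmp) / class_name
        class_dir.mkdir()
        (class_dir / "0.jpg").touch()  # empty placeholder file

    dataset = ImagePathDataset(tmp)
    classes, class_to_idx = dataset.classes, dataset.class_to_idx
    n_samples = len(dataset)
    first_path, first_label = dataset[0]
    first_name = first_path.parent.name

print(classes, class_to_idx, n_samples, first_name, first_label)
```

A real PyTorch version would open the image at `path`, apply the optional transform, and return the resulting tensor together with the label.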
PyTorch for Deep Learning & Machine Learning – Full Course
  • 2022.10.06
  • www.youtube.com
Learn PyTorch for deep learning in this comprehensive course for beginners. PyTorch is a machine learning framework written in Python.✏️ Daniel Bourke develo...
 

PyTorch for Deep Learning & Machine Learning – Full Course (description of parts 23-26)


PyTorch for Deep Learning & Machine Learning – Full Course

Part 23

  • 22:00:00 In this section, the instructor shows how to create a custom dataset in PyTorch using the torchvision module. They create a transform to convert raw JPEG images into tensors and set up train and test transforms with transforms.Compose. They then test the custom image folder class on their own data set, checking the length and the classes attribute to ensure it works properly. Finally, they inspect train_data_custom and test_data_custom to verify that everything is working as expected.

  • 22:05:00 In this section of the PyTorch for Deep Learning and Machine Learning full course, the instructor demonstrates how to check for equality between the original ImageFolder dataset and the custom dataset he created in the previous section by comparing the class_to_idx attribute of both datasets. He affirms that we can replicate the main functionality of the ImageFolder dataset class by creating our own custom dataset loading function. The takeaways are that PyTorch provides a base dataset class to inherit from, and that as long as we override the __len__ and __getitem__ methods and return some sort of values, we can create our own data set loading function. The instructor then explains how to create a function that displays random images from train_data_custom to visualize the data.

  • 22:10:00 In this section, the instructor walks through the steps for creating a function called "display random images" that takes in a "data set", "classes", and "n" as parameters. The function displays n number of randomly selected images from the data set and prints out its shape if "display shape" is set to true. The instructor also goes through the implementation details such as adjusting the display if n is greater than 10 and setting the random seed for reproducibility. Additionally, the function loops through the random sample images or indexes and plots them with Matplotlib.

  • 22:15:00 In this section, the instructor continues by setting up the plot and making sure the dimensions of the images line up with what matplotlib expects. They adjust the tensor dimensions for plotting, add a subplot to the matplotlib plot, and set the title of the plot based on the classes variable, a list of classes. They then call the function to display random images, both for the built-in PyTorch ImageFolder dataset and for the custom data set created by the instructor. Finally, they change the seed and plot the images again to check that the plotting code behaves as expected.

  • 22:20:00 In this section of the transcript, the instructor demonstrates how to turn a custom loaded image data set into data loaders, which is a necessary step to batchify the images and use them with a model. Using torchvision.datasets.ImageFolder and torchvision.transforms, the custom data set is turned into a tensor format. The next step is to turn the data set into a data loader using torch.utils.data.DataLoader. The instructor sets the batch size to 32 and the number of workers to 0 for both train_dataloader_custom and test_dataloader_custom. The difference between them is that the train data will be shuffled while the test data will not be shuffled.

  • 22:25:00 In this section, the video covers custom data loaders and transforming data in PyTorch. The instructor first revisits os.cpu_count() and sets num_workers to zero to ensure that the training runs smoothly. After setting up the custom data loader, he prints the shape of a batch to check the batch size of 32 and the image size produced by the transform that was previously set up. The instructor also explains how data augmentation can artificially increase the diversity of the training data set, and demonstrates the various ways in which data can be transformed using the torchvision.transforms module, including resize, center crop, grayscale, random transforms, and RandAugment.

  • 22:30:00 In this section, we learn about data augmentation, which is the process of artificially adding diversity to your training data by applying various image transformations. This helps make the model more generalizable to unseen data. There are many different kinds of data augmentation such as cropping, replacing, shearing, and more. PyTorch's torchvision package includes primitives, or functions, that can help train models to perform well. By using data augmentation and other improvements, PyTorch has been able to train state-of-the-art models with high accuracy, such as the ResNet-50 model.

  • 22:35:00 In this section, the instructor discusses ways to improve model accuracy, such as learning rate optimization, training for longer, and using different augmentation techniques. The instructor focuses on the TrivialAugment technique, which leverages the power of randomness to change images in various ways using a number of magnitude bins. The instructor demonstrates how to implement TrivialAugment using the torchvision.transforms module and provides a link to the paper for those interested in reading more about it. Additionally, the instructor advises trying out different augmentation techniques and experiments to see what works best for individual problems. Finally, the instructor shows how to test the augmentation pipeline by getting all image paths, globbing all the files and folders that match a specific pattern.

  • 22:40:00 In this section, the video demonstrates the use of TrivialAugment, a data augmentation technique, to transform images and artificially add diversity to a training dataset. The power of randomness is harnessed by selecting from different augmentation types and applying them with varying levels of intensity. This section shows how TrivialAugment is applied to randomly selected images, and the results are displayed. The objective is to enable the machine learning model to learn the patterns of the manipulated images and be able to identify them accordingly. The next section focuses on building the first computer vision model without data augmentation using the TinyVGG architecture.

  • 22:45:00 In this section, the presenter goes through the process of creating transforms and loading data for a PyTorch model. The goal is to load images from the data folder, in this case, pizza, steak, and sushi, and turn them into tensors. The transforms include resizing the images to 64x64 and converting them into tensors so that the values are between 0 and 1. The presenter also explains how to create data loaders and adjust the batch size and number of CPU cores dedicated to loading the data. The batch size used in this example is 32.

  • 22:50:00 In this section, the instructor explains how to load in and transform data using PyTorch's DataLoader. The process involves creating a transform and then loading and transforming the data at the same time using the DataLoader. The instructor also provides simple code for building the TinyVGG architecture from scratch, which includes creating the first conv block that consists of layers like Conv2d, ReLU, and MaxPool2d. The model is initialized with input shape, hidden units, and output shape parameters. The instructor encourages learners to experiment with different values for hyperparameters such as kernel size and stride.

  • 22:55:00 In this section, we see the creation of a convolutional neural network using PyTorch. We start by defining the convolutional blocks and max pooling layers for the network. Then, we replicate the same block to create another one and change the input shape to match the output shape. Following that, we create a classifier layer to turn the output of convolutional blocks into a feature vector and pass it through a linear layer to output ten classes. Finally, we override the forward method to pass the data through the convolutional blocks and print out its shape at each step. The forward method could also be rewritten to include operator fusion, which speeds up the GPU computation.
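
The shapes flowing through the convolutional blocks can be checked by hand with the standard output-size formula, floor((n + 2p - k) / s) + 1. The sketch below is only an illustration under stated assumptions: two blocks of conv, conv, max-pool, a 3x3 kernel with stride 1, a 2x2 max pool, and 64x64 inputs with 10 hidden units. The function names are made up for this example.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution or pooling layer on a square input."""
    return (size + 2 * padding - kernel) // stride + 1


def tiny_vgg_flat_features(image_size, hidden_units, padding):
    """Trace a side length through two blocks of conv, conv, max-pool.

    Returns the number of features the classifier's linear layer receives
    after flattening (hidden_units * height * width).
    """
    size = image_size
    for _ in range(2):                                      # two convolutional blocks
        size = conv_out(size, kernel=3, padding=padding)    # first conv in the block
        size = conv_out(size, kernel=3, padding=padding)    # second conv in the block
        size = conv_out(size, kernel=2, stride=2)           # 2x2 max pool halves the size
    return hidden_units * size * size


print(tiny_vgg_flat_features(64, hidden_units=10, padding=1))  # 10 * 16 * 16
print(tiny_vgg_flat_features(64, hidden_units=10, padding=0))  # 10 * 13 * 13
```

With padding of 1 the spatial size only shrinks at the pooling layers (64 to 32 to 16), which gives the 10 x 16 x 16 = 2560 figure that comes up in the shape-error discussion in Part 24. Without padding each 3x3 convolution also trims two pixels, so the flattened size changes.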

Part 24

  • 23:00:00 In this section, the instructor discusses the concept of operator fusion, which is described as the most important optimization in deep learning compilers. They also create a model using the TinyVGG architecture for RGB color images and check its input and output shapes. The instructor highlights the importance of operator fusion for speeding up the computation of large neural networks by avoiding data transfers between memory and compute. They also suggest passing dummy data through the model to troubleshoot it and ensure the forward method is working correctly. Finally, an error message is shown when trying to pass the image batch to the model due to a mismatch in input types.

  • 23:05:00 In this section, the instructor is troubleshooting a shape error for the model. They explain that matrix multiplication rules need to be satisfied when passing tensors through linear layers. The instructor investigates the matrix shapes and determines that 10, which represents the number of hidden units, is causing an issue when multiplied by 2560. They use the previous layer's output shape to determine that 10 should be multiplied by 16x16 to get 2560. After correcting this and verifying that the model's shapes align with the output of the CNN explainer, they move on to further troubleshooting and eventually discover that removing padding from the convolutional layers will align the shapes with the CNN explainer's output.

  • 23:10:00 In this section, the instructor introduces torchinfo, a package that allows users to print out a summary of their PyTorch models. First, the instructor comments out the print statements from the forward method and installs torchinfo in Google Colab using the pip install command. The instructor then imports summary from torchinfo and passes it the model and an input size to get the shapes of the data flowing through the model. The instructor shows how torchinfo prints out a summary of the model, including the layers and their corresponding shapes.

  • 23:15:00 In this section, the speaker discusses the torchinfo package, which is used to give an idea of the input and output shapes of each layer in a PyTorch model. They explain that the package also provides information on the number of parameters in each layer, which can be helpful in determining the model size and storage constraints for future applications. The speaker notes that as a model gets larger and has more layers, it will have more parameters, resulting in a bigger input size and estimated total size. The speaker then moves towards training on a custom data set and creates two functions, train_step and test_step, that are generic and can be used with almost any model and data loader. The train_step function takes in a model, a data loader, a loss function, and an optimizer, and puts the model in train mode while setting up evaluation metrics.

  • 23:20:00 In this section, the speaker discusses setting up train loss and train accuracy values for a train loop function in PyTorch. The data is looped through using the data loader, and for each batch, the forward pass is done to make predictions and calculate the loss. The loss is then backpropagated and the optimizer takes a step. Accuracy is calculated for each batch by getting the predicted class and comparing it to the correct labels. Outside of the batch loop, the accumulated train loss and train accuracy are divided by the number of batches to get the average per epoch. The speaker then challenges viewers to write a test loop function.

  • 23:25:00 In this section, the instructor goes through the process of creating a test step for evaluating the performance of a PyTorch deep learning model on a dataset. The step involves setting up the model in evaluation mode, looping through the batches of the dataset, sending the data to the target device, doing a forward pass, calculating the loss and accuracy per batch, accumulating the loss and accuracy, and adjusting the metrics to get the average values. The instructor then suggests creating a train function to functionize the process of training the model, which will be covered in the next section.

  • 23:30:00 In this section, the instructor walks through the process of creating a train function that combines the train step and test step functions. The idea is to train and evaluate a model with just one function call. The train function takes in a model along with a range of parameters, including the optimizer, data loaders, loss function, and others. The instructor then creates an empty dictionary to help track the model's performance as it trains, including train and test loss and accuracy. They then loop through the epochs, wrapping the loop in tqdm to get a progress bar while the model is training. The train function is a useful tool for reusing existing code instead of rewriting it when training more models.

  • 23:35:00 In this section, the instructor explains the train function which keeps track of the training and testing using train and test step functions respectively. The function will run for a specified number of epochs, and for each epoch, it will print out the train and test loss and accuracy using a fancy print statement. The results will be stored in the results dictionary to be later used for analysis. The train function will leverage the train step function and test step function to update the model and test it respectively. The function will return the results of the epoch.

  • 23:40:00 In this section, the instructor reviews the progress made in the PyTorch workflow, which includes data preparation, building and selecting a model, building a training loop, and now the challenge to create a loss function and an optimizer. Moving onto section 7.7, the instructor walks through how to train and evaluate model zero, the baseline model, on their custom data set. They set the random seeds for reproducibility, instantiate the tiny VGG model with an input shape of three for color images, set the number of hidden units and the output shape to match the number of classes in their training data set. They also select the cross-entropy loss function for multiclass classification and try the Adam optimizer with a learning rate of 0.001.

  • 23:45:00 In this section, the instructor shows how to time the training process of a deep learning model. They first import default_timer from timeit, and then start the timer before training model zero using the train function from a previous video. They then set the train data to train_dataloader_simple and the test data to test_dataloader_simple, as well as the optimizer to the Adam optimizer and the loss function to nn.CrossEntropyLoss. The model is trained for five epochs and the timer is ended to display the total training time. The instructor then shows the accuracy results of the model on the training and test sets, which are around 40% and 50% respectively. They suggest trying different methods to improve the model, such as adding more layers or hidden units, fitting for longer, changing activation functions, and adjusting the learning rate.

  • 23:50:00 In this section, the instructor explains how to plot loss curves to track the progress of the model over time. A loss curve visualizes the loss value on the vertical axis against the steps on the horizontal axis. By plotting the train and test loss and accuracy values of our results dictionary using matplotlib, we can see how our model is performing and evaluate it. The instructor writes a function called plot_loss_curves that takes in a results dictionary whose keys are strings and whose values are lists of floats holding the loss and accuracy values.

  • 23:55:00 In this section, the instructor shows how to create loss curves for both training and testing data using epochs as the metric of time. The plot consists of two subplots, one for loss and another one for accuracy, with labels and titles for each of them. The ideal trend for a loss curve is for the loss to decrease over time, and the accuracy to increase. The instructor encourages the viewer to experiment with additional epochs to see whether the loss reaches the optimal value. The next video will cover different forms of loss curves, and the instructor recommends a guide on interpreting loss curves.
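
The train function and results dictionary described in this part can be sketched in plain Python. This is a structural illustration only, not the course code: train_step and test_step here are hypothetical zero-argument callables that each return a (loss, accuracy) pair for one pass over their data, standing in for the real functions that take a model, data loader, loss function, and optimizer.

```python
def average_over_batches(batch_values):
    """Average a per-batch metric to get one value for the epoch."""
    return sum(batch_values) / len(batch_values)


def train(train_step, test_step, epochs):
    """Call both step functions once per epoch and collect their metrics.

    train_step and test_step are hypothetical callables returning
    (loss, accuracy) for one full pass over their data.
    """
    results = {"train_loss": [], "train_acc": [], "test_loss": [], "test_acc": []}
    for epoch in range(epochs):
        train_loss, train_acc = train_step()
        test_loss, test_acc = test_step()
        print(
            f"Epoch: {epoch} | train_loss: {train_loss:.4f} | train_acc: {train_acc:.4f} | "
            f"test_loss: {test_loss:.4f} | test_acc: {test_acc:.4f}"
        )
        # Keep every epoch's values so loss curves can be plotted afterwards
        results["train_loss"].append(train_loss)
        results["train_acc"].append(train_acc)
        results["test_loss"].append(test_loss)
        results["test_acc"].append(test_acc)
    return results


# Dummy step functions with made-up per-batch numbers, just to exercise the loop
demo_results = train(
    train_step=lambda: (average_over_batches([1.2, 1.0, 0.8]), 0.40),
    test_step=lambda: (average_over_batches([1.1, 1.1]), 0.50),
    epochs=2,
)
```

The returned dictionary maps string keys to lists of floats, one entry per epoch, which is the shape of input a loss-curve plotting function needs.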

Part 25

  • 24:00:00 In this section of the PyTorch for Deep Learning & Machine Learning course, the instructor discusses loss curves and their importance in evaluating a model's performance over time. A loss curve should show a trend of decreasing loss over time and increasing accuracy. There are different forms of loss curves, and an ideal loss curve shows that the training and test loss decrease at a similar rate. Underfitting occurs when the model's loss could be lower, while overfitting occurs when the model learns the training data too well, leading to a lower training loss than the testing loss. The instructor provides extra curriculum from Google's loss curve guide and discusses methods to address overfitting, such as regularization techniques and reducing model complexity.

  • 24:05:00 In this section, the instructor discusses a few ways to reduce overfitting in a deep learning model. Getting more data through data augmentation or better data quality can help your model learn more generalizable patterns. Using transfer learning by taking patterns learned from pre-trained models and applying them to your own data set can also be effective. Simplifying your model by reducing the number of layers or hidden units can also help. Learning rate decay can help by decreasing the learning rate over time, and early stopping can stop the training process before overfitting occurs.

  • 24:10:00 In this section, the concept of early stopping is discussed as a method for dealing with overfitting in machine learning. The model's testing error is tracked, and training is stopped before the testing error starts to increase, or the weights/patterns are saved from the point where the model's loss was the lowest. Different methods to deal with underfitting are also explored, such as adding more layers/units to the model, tweaking the learning rate, training for longer, and using transfer learning. The importance of evaluating a model's performance over time using loss curves is highlighted. Finally, ways to prevent over-regularizing the model and ending up with underfitting are discussed, and the goal of achieving a just-right balance between underfitting and overfitting is emphasized.

  • 24:15:00 In this section of the video on PyTorch for deep learning and machine learning, the instructor discusses the concept of overfitting and underfitting in models, along with ways to deal with them. Data augmentation is one of the methods introduced to deal with overfitting, where images are manipulated to increase the diversity of the training data set. The instructor then goes on to demonstrate how to create a transform with data augmentation and load data using those transforms to create train and test data sets and data loaders. The video emphasizes the importance of trying different models with various tweaks and transformations to find the best fit for a particular problem.

  • 24:20:00 In this section of the video, the instructor walks through the process of creating a data set and data loader using PyTorch transforms and the ImageFolder class. They provide code examples and encourage viewers to test it out on their own if they'd like. The data set is created from images of pizza, steak, and sushi for both the training and testing folders. The instructor also discusses the importance of being clear with variable names when working with similar names throughout the notebook. They set up the data loaders for both the training and testing data sets, with the training data set being augmented with the TrivialAugmentWide transform. The instructor then suggests that viewers construct and train model one using the TinyVGG class and the train function.

  • 24:25:00 In this section of the PyTorch full course, the instructor guides the viewer through the process of creating and training a new model using the same architecture as before, but with augmented training data. The goal is to compare the performance of this model to the baseline model without data augmentation. The instructor uses the class previously created for the tiny VGG model and sets a manual seed for reproducibility. They then define the loss function and optimizer, set the hyperparameters, and start the timer. Finally, the instructor trains the model by calling the previously created train function, passing in the model and data loaders, and evaluating the results.

  • 24:30:00 In this section, the instructor continues the training of the second model with data augmentation and shows that it did not perform as well as the first model without data augmentation due to the fact that the loss was already going down and there wasn't much overfitting. The instructor then introduces a function to plot loss curves and uses it to evaluate the performance of the second model. The loss curve shows that the model is underfitting and possibly overfitting, indicated by the higher test loss compared to the training loss. The instructor then poses the question of what can be done to address both underfitting and overfitting in the model, suggesting options such as getting more data, simplifying the model, using transfer learning, or adding more layers.

  • 24:35:00 In this section, the instructor discusses the importance of comparing model results and mentions some tools for tracking different experiments, such as PyTorch plus TensorBoard and Weights & Biases. However, he emphasizes that this course will focus on just pure PyTorch for now. The instructor then sets up a plot to compare the model results side by side, using data frames for each of the model results. He also suggests trying an experiment to train model zero for a longer duration to see if it improves. Ultimately, comparing different experiments and their metrics against each other visually is crucial to improving models.

  • 24:40:00 In this section, the instructor uses subplots to compare different metrics across two models they experimented with. They begin by creating a range for the number of epochs and then create a plot for train loss by using plt.subplot() and plt.plot(). They do the same for test loss and accuracy for both training and test data. The instructor points out that model one, which implemented data augmentation, seems to be overfitting at this stage, while model zero is performing better in terms of loss. The instructor suggests that if they had more models to compare, they could potentially turn this into a function, but also notes that tools like TensorBoard, Weights & Biases, and MLflow can aid in making sense of these graphs when numerous experiments are conducted.

  • 24:45:00 In this section, the speaker discusses the importance of evaluating models based on how well they perform on the testing dataset as opposed to just the training dataset. They suggest that while metrics on the training dataset are good, the ultimate goal is to have the model perform well on unseen data. The speaker recommends training the models for longer and possibly adding more hidden units to each layer to achieve better results. They then move on to demonstrating how to make predictions on custom images that are not in the training or testing dataset, using a food recognition app as an example. They explain the workflow for downloading a custom image and making a prediction using a trained PyTorch model, but caution that the current model may not have great performance.

  • 24:50:00 In this section, the instructor shows how to download a custom image of a pizza and prepare it for prediction using the model they've trained. The image is downloaded using a raw GitHub URL and saved to the data folder. The instructor notes that the custom image must be in the same format as the data that was used to train the model, specifically tensor form with data type torch.float32 and a shape of 64 by 64 by three. They demonstrate how to load the image into PyTorch using the torchvision package and the read_image function, which reads a JPEG or PNG into a three-dimensional RGB or grayscale tensor.

  • 24:55:00 In this section, the instructor demonstrates how to read a custom image into PyTorch using torchvision.io and convert it to a tensor. He also shows how to get metadata about the image, such as its shape and data type. The instructor notes that before passing the image through a model, it may need to be resized, converted to float32, and put on the right device. In the next section, he plans to demonstrate how to make a prediction on the custom image using a PyTorch model.
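
The early stopping idea from this part (track the test loss, stop once it stops improving, keep the weights from the best epoch) can be sketched as below. This is a plain-Python illustration with made-up loss values. The patience parameter, meaning how many epochs without improvement are tolerated, is a common convention and an assumption here; it is not something the summary above specifies.

```python
def early_stopping_epoch(test_losses, patience=2):
    """Return (best_epoch, stop_epoch) for a sequence of per-epoch test losses.

    Training "stops" once the loss has failed to improve for `patience`
    consecutive epochs; best_epoch is where the weights would be saved.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(test_losses):
        if loss < best_loss:
            # New best: remember it and reset the counter
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch
    # Never ran out of patience: training ran to the last epoch
    return best_epoch, len(test_losses) - 1


# Made-up test losses: improving at first, then rising as overfitting sets in
best, stopped = early_stopping_epoch([1.0, 0.8, 0.7, 0.75, 0.9, 0.6], patience=2)
print(best, stopped)
```

Note that with a patience of 2 the loop stops before ever seeing the final value of 0.6, which shows the trade-off in choosing how long to wait.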

Part 26

  • 25:00:00 In this section, the instructor discusses the importance of data types and shapes in deep learning and how to fix errors related to them. The instructor attempts to make a prediction on an image but runs into errors because the custom data is not of the same data type that the model was originally trained on. They show how to fix the error by recreating the custom image tensor and converting it to torch.float32. The instructor then faces another issue with the shape of the custom image and shows how to fix it by creating a transform pipeline to resize the image to the same size that the model was trained on.

  • 25:05:00 In this section, the instructor shows how to use PyTorch's transforms package to transform an input image and prepare it for use by a deep learning model. They demonstrate how to apply a transformation pipeline to a custom image, which results in the image being compressed and pixelated. The instructor notes that this could potentially impact the accuracy of the model and recommends experimenting with larger image sizes to improve performance. They also discuss the importance of ensuring that tensor dimensions align with the model's requirements, including adding a batch dimension to a custom image before passing it through the model for inference.

  • 25:10:00 In this section of the video, the presenter demonstrates how to make predictions on custom image data using a PyTorch model. They highlight the importance of formatting the data correctly and ensuring that it has the same data type, shape, and device as the model was trained on to avoid errors. The presenter also shows how to convert the raw outputs of the model, or logits, into prediction probabilities using the softmax function. Although the model used in the example doesn't perform well, the process of predicting on custom data is illustrated.

  • 25:15:00 In this section of the video, the instructor shows how to functionize the custom image prediction process. This function takes a PyTorch model, an image path, a list of class names, a transform, and a device as inputs. It loads in the image using TorchVision, formats it, gets the prediction labels, and plots the image with its prediction as the title. The instructor challenges the viewers to try building this function on their own and then goes through a possible implementation in the video. The function is not fully implemented in this section and will be continued in the next video.

  • 25:20:00 In this section, we see how to make a prediction on custom data using PyTorch. First, we need to scale the image data to be between 0 and 1 so that our model can process it properly. Then, we check whether a transform was provided and, if so, pass the image through it. Next, we ensure that the model is on the correct device and put it in inference mode. We also add an extra dimension to the image to reflect the batch size of 1 that our model will predict on. Then we make a prediction, convert the raw logits to prediction probabilities using softmax, and convert those to prediction labels using argmax. Finally, we create a plot of the image alongside its prediction and prediction probability. If a list of class names is provided, the plot shows the class name for the prediction rather than the numeric label.

  • 25:25:00 In this section, the instructor explains how to create a function that can take in images and display their predicted class using a pre-trained PyTorch model. The function can take in a list of class names for labeling and also displays the prediction probability. The instructor then demonstrates using this function on custom images and a pre-trained model, explaining the importance of putting the result on the CPU for compatibility with matplotlib. Despite the poor performance of the model, the instructor emphasizes the power of visualizing results.

  • 25:30:00 In this section, the instructor summarizes the main takeaways from the previous section, which covered how to predict on custom data with PyTorch. The main points to remember are that data must be preprocessed to match the model's expected format, including correct data type, correct device, and correct shape. PyTorch has many built-in functions for handling different data types, and users can write their own custom dataset classes if necessary. Additionally, the instructor highlights the importance of balancing overfitting and underfitting when training models and mentions several resources for further learning and practice, including exercises and extra-curricular materials.

  • 25:35:00 In this section, the instructor encourages learners to go through the PyTorch custom data sets exercises template first and try to fill out all the code on their own. In case they get stuck, they can refer to the example solutions provided by the instructor. The solutions offered are just one way of doing things, and users are free to reference them and compare them with their own implementation. The solutions and errors encountered during the process can also be seen in the live walkthroughs available on YouTube. The instructor reminds users that they have covered a lot of exercises and can check out the extra exercises and solutions in the PyTorch deep learning repo. The instructor concludes by mentioning that there are five more chapters available at learnpytorch.io, which learners can explore to learn more about transfer learning, PyTorch model experiment tracking, PyTorch paper replicating, and PyTorch model deployment.
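
The post-processing steps described in this part (scale pixel values to the 0 to 1 range, turn logits into probabilities with softmax, pick the label with argmax) can be illustrated without PyTorch. This is a plain-Python sketch with made-up numbers; the function names are hypothetical, and in PyTorch the same steps would use tensor operations such as torch.softmax and torch.argmax.

```python
import math


def scale_pixels(pixels):
    """Scale 0-255 integer pixel values into the 0-1 float range."""
    return [p / 255 for p in pixels]


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits, class_names=None):
    """Return (label, probability) for the most likely class."""
    probs = softmax(logits)
    index = max(range(len(probs)), key=probs.__getitem__)  # argmax over probabilities
    label = class_names[index] if class_names else index
    return label, probs[index]


# Made-up logits for one image; the class names follow the food example in the text
label, prob = predict_label([0.1, 2.0, -1.0], class_names=["pizza", "steak", "sushi"])
print(f"Pred: {label} | Prob: {prob:.3f}")
```

When no class names are passed, the function falls back to returning the integer index, which mirrors the behavior of showing a numeric label in the plot title.
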
 

No Black Box Machine Learning Course – Learn Without Libraries



No Black Box Machine Learning Course – Learn Without Libraries

00:00:00 - 01:00:00 In this YouTube video, the instructor presents a No Black Box Machine Learning Course that teaches how to code in machine learning without relying on libraries. The course covers topics related to building a web app that recognizes drawings, including data collection, feature extraction and visualization, and implementing classifiers such as the nearest neighbor and K nearest neighbor. The instructor emphasizes the importance of understanding data in machine learning and suggests resources for those who need to brush up on high school math and programming experience. The video demonstrates the process of creating a web page that acts as a data creator using JavaScript without any external libraries. The presenter also includes instructions on how to create an undo button and a name input field, store drawings in a data object, and save the paths on the user's computer. Finally, the video shows how to create a dataset generator in node.js and generate data associated with each sample using JavaScript.

01:00:00 - 02:00:00 In this YouTube video, the instructor teaches viewers how to create a machine learning dataset and extract features without using libraries. They demonstrate how to store the dataset in a folder that can communicate between node scripts and web apps and create a data viewer app. The instructor also shows how to visualize collected data using Google charts and how to identify and emphasize selected items in the chart and list. Overall, the video provides a comprehensive guide for learners to create machine learning datasets and extract features using only JavaScript.

02:00:00 - 03:00:00 The "No Black Box Machine Learning Course – Learn Without Libraries" video demonstrates how to classify drawings based on their features without using machine learning libraries. The video creator emphasizes the importance of having a fast and responsive system for inspecting data to avoid manual errors. They demonstrate how to add features to the chart, how to hide the background, and how to display predicted labels on screen using dynamic containers with HTML and CSS. The video also covers data scaling techniques such as normalization and standardization. Finally, the video shows how to implement the K nearest neighbors classifier and count the number of each label within the K nearest neighbors.
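
The two scaling techniques named in this summary can be sketched in a few lines. This is only an illustration in Python rather than the JavaScript used in the course; the function names (inv_lerp, normalize_points, standardize_points) and the choice to return new lists instead of modifying the points in place are assumptions, not the instructor's code.

```python
from statistics import mean, pstdev

def inv_lerp(a, b, v):
    """Where v sits between a and b, as a fraction (0 at a, 1 at b)."""
    return 0.0 if b == a else (v - a) / (b - a)

def normalize_points(points, min_max=None):
    """Min-max scaling per feature; returns the scaled points and the (mins, maxs) used."""
    dims = len(points[0])
    if min_max is None:
        mins = [min(p[d] for p in points) for d in range(dims)]
        maxs = [max(p[d] for p in points) for d in range(dims)]
    else:
        mins, maxs = min_max  # reuse values computed earlier, e.g. on other data
    scaled = [[inv_lerp(mins[d], maxs[d], p[d]) for d in range(dims)] for p in points]
    return scaled, (mins, maxs)

def standardize_points(points):
    """Subtract the mean and divide by the (population) standard deviation per feature."""
    dims = len(points[0])
    means = [mean(p[d] for p in points) for d in range(dims)]
    stds = [pstdev(p[d] for p in points) or 1.0 for d in range(dims)]  # guard against 0
    scaled = [[(p[d] - means[d]) / stds[d] for d in range(dims)] for p in points]
    return scaled, (means, stds)

points = [[0, 10], [5, 20], [10, 30]]      # made-up feature values
scaled, min_max = normalize_points(points)
print(scaled)
print(normalize_points([[20, 50]], min_max)[0])  # an outlier lands outside [0, 1]
print(standardize_points(points)[0])
```

Passing min_max explicitly lets a new drawing be scaled with the same minimum and maximum as the stored data, so the values stay comparable.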

03:00:00 - 03:50:00 The YouTube video "No Black Box Machine Learning Course - Learn Without Libraries" covers various topics related to K-nearest neighbor classification, implemented in JavaScript without machine learning libraries and then revisited in Python. The video explains how to split data sets into training and testing sets, handle training and testing samples separately, and normalize the data. The instructor also discusses the importance of decision boundaries in understanding how a classifier operates, demonstrates how to implement a K-nearest neighbor (KNN) classifier in JavaScript, and generates a pixel-based plot without using machine learning libraries. Finally, the video ends with a call for viewers to explore additional capabilities of Python and reflect on what they've learned so far.
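
A rough sketch of the split-and-evaluate idea described here, written in Python for brevity. The sample layout ("point" and "label" keys), the function names, and the toy data are all assumptions for illustration; the classifier used in the demo is a plain nearest-neighbor lookup, not the course's code.

```python
import math

def split_samples(samples, train_fraction=0.5):
    """First part of the list becomes training data, the rest testing data."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]

def accuracy(classify, test_samples):
    """Fraction of test samples whose predicted label matches the true one."""
    correct = 0
    for sample in test_samples:
        truth = sample["label"]                 # kept aside, never given to the classifier
        predicted = classify(sample["point"])   # the classifier only sees the features
        if predicted == truth:
            correct += 1
    return correct / len(test_samples)

samples = [  # made-up feature points
    {"point": [1.0, 1.0], "label": "car"},
    {"point": [9.0, 9.0], "label": "fish"},
    {"point": [1.5, 0.5], "label": "car"},
    {"point": [8.0, 9.5], "label": "fish"},
]
training, testing = split_samples(samples)

def nearest_label(point):
    """Label of the closest training sample; test samples are never consulted."""
    closest = min(training, key=lambda s: math.dist(s["point"], point))
    return closest["label"]

print(accuracy(nearest_label, testing))
```

The point of the structure is that nearest_label only ever looks at training, so the reported accuracy is measured on samples the classifier has not seen.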

Part 1

  • 00:00:00 In this section, the speaker introduces the no black box machine learning course which focuses on coding without relying on libraries. The course covers various topics for building a web app that recognizes drawings, including data collection, feature extraction and visualization, and implementing classifiers such as the nearest neighbor and K nearest neighbor. The speaker emphasizes the importance of understanding data in machine learning and provides a short break for students to focus on homework while also suggesting resources for brushing up on high school math and programming experience. The course then moves to phase two where more advanced methods such as neural networks will be covered. An example of building a drawing app for data collection is also demonstrated with undo and save functionalities.

  • 00:05:00 In this section of the video, the instructor walks through the process of creating a web page that will be used as a data creator for a data set. They begin by creating a new folder named web and inside this folder, they create the first file, a web page called Creator.html. The page includes basic HTML, a title section, and an external style sheet called Styles.css. They also add basic styles for the page, including the font family and background color. The instructor then moves on to implement the sketchpad using an external JavaScript file called sketchpad.js and defines the sketchpad class constructor to hold the canvas element.

  • 00:10:00 In this section, the instructor sets up a canvas using JavaScript and adds an "onmousedown" event listener to detect mouse actions. They obtain the mouse coordinates by getting the rectangle of the canvas bounding area and subtracting the left and top sides respectively. After rounding the coordinates to integers, the instructor creates a path array that contains the mouse coordinates when the canvas is clicked. They also set "drawing" to false and "path" to empty. Another event listener is added for "onmousemove" to continue adding more points to the path array as the mouse is moved.

  • 00:15:00 In this section, the speaker explains how to implement mouse events to draw on a canvas using JavaScript, without the use of libraries. By using event listeners for "onMouseMove" and "onMouseUp," the code tracks mouse movements and adds the location to a path if the user is drawing. Additionally, a new "get mouse" function is created to add the location to the canvas. Finally, the speaker demonstrates how to create a "draw" utility object to clear and draw the path onto the canvas.

  • 00:20:00 In this section, the video instructor continues building a drawing program without any external libraries by addressing some issues with the drawn lines, such as corner appearance and straight line endings. They then proceed to create a function for drawing multiple paths and incorporate it into the program. The instructor encounters some issues when running the program on a mobile device due to the viewport, and they fix it by using a meta tag in the head section of the HTML file.

  • 00:25:00 In this section, the tutorial focuses on making the canvas fit on smaller screens like those of mobile devices by adding specific commands to the viewport meta tag in the HTML code. However, the event listeners for touch are different from those for mouse, necessitating a modification of the sketch pad with event listeners for touch. To further enhance the canvas, an undo button is created, but only when there are paths to undo. The button is disabled when the canvas is empty.

  • 00:30:00 In this section, the video explains how to improve the appearance of the button by changing the style in the CSS file. The narrator adds a hover effect and sets the styles for the disabled state. Next, we learn how to create an input field for users to enter their name and a button to advance to the next drawing. The video also explains how to collect data from these fields and store them in an object with three fields: student, session, and drawings. Finally, the narrator begins to implement the start function which will be used to initiate the drawing process.

  • 00:35:00 In this section of the video, the presenter is showing how to implement a drawing app using JavaScript without using any libraries. They start by defining an index for the labels of the things they want to draw, such as a car, fish, house, etc. They also add a field for instructions and modify the start button to change to "next" after the first drawing. They then implement a function for the "next" button, which increases the index, gets the next label, and updates the instructions. They also store the drawings in a data object for the specific label and add a public method to reset the sketch pad. The presenter tests the app and shows that the data object is collecting the drawings.

  • 00:40:00 In this section, the instructor explains how to save the paths drawn by users locally on their computer. They create an "a" element whose href attribute is a plain-text data URI, with the stringified data passed through encodeURIComponent. The collected data is saved as a JSON string in a file with a unique name generated from a timestamp. Finally, the download action is triggered to download the file. The instructor also adds instructions on what to do with the downloaded file and states that this will make more sense after the next lecture.

  • 00:45:00 In this section, the instructor shows how to fix a potential issue with the sketch pad by adding an event listener on the document instead of the canvas. He also asks viewers to help test the system on different devices and report any issues or propose solutions. The instructor then explains how to process the collected data into a more manageable form using node.js and shows how to navigate to the project directory and create a new folder for storing the data. Finally, he creates a "raw" folder where he pastes all the data collected from almost 500 student submissions, each containing eight different drawings, and explains how he will process these files to create a data set where each sample is a drawing.

  • 00:50:00 +Alt+M and the JSON file will be formatted nicely. In this section, the instructor explains how they will create a dataset generator in Node.js to process the samples and visualize them using two separate folders: one for JSON representations and the other for images. The script will read file names from the raw data directory, extract content from them, and store information about each sample such as its ID, label, student name and student ID, session, and drawing. Finally, the section briefly shows how to run and test the code, resulting in the creation of a samples JSON file in the designated directory.

  • 00:55:00 In this section, the speaker explains how to generate data associated with each sample using JavaScript. This involves writing files into a Json directory and stringifying the drawing of each specific label. The speaker then demonstrates how to generate an image representation of each drawing using a canvas and the 'draw paths' function from a common directory. To do this, the speaker exports the 'draw' object from the 'draw.js' file to be used in the data set generator, and installs the canvas library using the node package manager.
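
The dataset-generator step described at 00:50:00 (turning one file per student session into one record per drawing) can be sketched roughly as follows. This is Python rather than the Node.js used in the video, it works on in-memory JSON strings instead of files, and the field names and build_samples function are assumptions based only on the summary above.

```python
import json

def build_samples(raw_sessions):
    """Flatten per-student sessions into one record per drawing."""
    samples = []
    for raw in raw_sessions:
        session = json.loads(raw)
        for label, paths in session["drawings"].items():
            samples.append({
                "id": len(samples) + 1,          # running sample ID
                "label": label,
                "student": session["student"],
                "session": session["session"],
                "paths": paths,                  # the drawing itself
            })
    return samples

# Two made-up submissions standing in for the files in the raw folder.
raw_sessions = [
    json.dumps({"student": "Ann", "session": 1001,
                "drawings": {"car": [[[0, 0], [5, 5]]], "fish": [[[1, 1], [2, 3]]]}}),
    json.dumps({"student": "Bob", "session": 1002,
                "drawings": {"car": [[[3, 3], [4, 8]]], "fish": []}}),
]
samples = build_samples(raw_sessions)
print(len(samples), samples[0]["label"], samples[-1]["student"])
```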


Part 2

  • 01:00:00 In this section, the instructor shows how to create a canvas, draw paths on it, and store the result as an image, clearing the canvas before each new drawing. After generating the image files, the instructor fixes a problem in the drawing app caused by module not being defined in draw.js. They adopt a structure that will be used throughout the course, keeping constants in a separate file and requiring it. The instructor then adds a progress indicator in a new file called utils by creating the utils object and adding a function called print progress. It uses process.stdout to access the standard output, computes the percentage with a percent-formatting function, and writes it to the standard output to show progress.

  • 01:05:00 In this section, the video creator explains how to store the generated dataset in a way that the browser can read it. He creates a folder called "JS_objects" which will contain files that can communicate between the node scripts and the web apps. A "samples" JavaScript file is created which will initialize a JavaScript object inside the JS_objects folder. The video creator also mentions that he will create a viewer app for the data set and creates an HTML file called "viewer.html" with basic HTML code. The head section of the file includes a meta tag for supporting UTF characters and a title for the page. The body section includes an H1 tag with the title "Data Viewer" and a div with an ID of "container" to hold the data set. The "samples" JavaScript file is included in the HTML file.

  • 01:10:00 In this section, the instructor is working on creating a table with samples grouped by student ID. To do this, they implement a "group by" function in the utils.js file, which groups an array by a given key. Then they log the groups to the console to check if it's working. Next, they create a function called "create row" in a separate display.js file, which takes in a container, a student name, and samples as parameters and creates a row with the name on the left and samples on the right. They create a loop to iterate through each student ID, call the "create row" function, and pass in the necessary parameters to display the data in a table format.

  • 01:15:00 In this section, the instructor is showing how to dynamically create a row of images with labels and align them properly with CSS. They start by looping through a set of image samples, creating an image element and assigning the source and style attributes. The images are then appended to a row while a label div is created and appended to a sample container. The sample container is then wrapped with a div, which is given an ID and a class. The instructor then refines the CSS to center the labels and images and add ellipses to longer names. Finally, they add a white background to the sample drawings by creating a div and appending it after the label.

  • 01:20:00 In this section, the video creator modifies the display of the collected image samples in the web app. The modification entails creating a sample container with a white background, a center-aligned text, a rounded corner, and a one-pixel margin. The thumbnail is set to 100, and the row label has a property that takes up 20 percent of the space, with the remaining eight samples taking ten percent of the space each. The resulting display structure is neat, but some images don't fit perfectly, which isn't a big deal because it's meant for desktop applications. Additionally, the creator adds a blur filter to some drawings made by flagged users using their IDs. Some of the collected drawings are impressive, while some contain misinterpretations that make the data more challenging.

  • 01:25:00 In this section, the YouTuber is taking a look at some drawings in the dataset and commenting on their quality, noting that some are very detailed and must have taken a long time to create. They also mention that their dataset is different from the Quick Draw dataset, as they have an undo button and no time limit, which means their drawings should be of better quality on average. Finally, they make an offhand comment about the organization and styling of the page.

  • 01:30:00 In this section, the instructor explains how to extract features from samples without using any libraries. The functions for extracting path count and point count are implemented in a file named features.js and added to an object called features. Then, in the feature extractor.js file, the samples are read and the features are extracted by looping through all the samples and getting the path count and point count for each of them. These feature values are then combined into an array and written to a new file. Finally, the feature names and samples are combined in another file named features.json. When running the feature extractor script, the log says "extracting features" and at the end, "done". The resulting features in the dataset directory can then be examined.

  • 01:35:00 In this section, the video creator explains how to use a JavaScript object to hold additional data that is not already contained in a feature file. The object can be saved to a separate JavaScript file and used to extract all the data needed for a web application. The creator also demonstrates how to visualize the data using Google charts, where options such as width, height, axis titles, and the core chart package can be defined in an object. A data table is created with two columns for feature values and their corresponding names.

  • 01:40:00 In this section, the video demonstrates how to create a scatter chart with Google Visualization, allowing users to investigate different features of the data more closely by using Explorer actions to zoom in and out. The video also shows how to use different colors for each class and implement transparency for better visualization of density in different parts using a different version of the Google chart library called materials charts.

  • 01:45:00 In this section, the video creator shows how to create a scatter chart using Google charts and then create a new chart with their own JavaScript code. The creator simplifies the options for the chart and changes the color scheme to use emojis instead, which allows for easier recognition of the data points without the need for labels or legends. The transparency of the chart is also adjusted to allow for better visibility of the densely plotted data.

  • 01:50:00 In this section, the instructor adds a callback function to the chart to identify any selected item in the table below. The new function is called "handle click," which adds an "emphasize" class to the selected item and uses "scroll into view" and "block center" to ensure that it is automatically scrolled into the center of the page. The instructor then modifies the page layout so that the chart is on the right-hand side of the page, with the other content on the left-hand side. The chart is also fixed in position so that it does not move when the user scrolls.

  • 01:55:00 In this section of the video, the presenter shows how to select items from the chart and the list, and deselect them as well. They specify a parameter to set whether scrolling should occur and add code to handle the error that occurs when trying to select nothing. Additionally, the presenter adds the ability to emphasize an element through a class, removing the class if it's already emphasized. Finally, they test the chart's functionality and adjust the chart size.
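
Two of the helpers from this part, the "group by" function (01:10:00) and the path count and point count features (01:30:00), are simple enough to sketch. The version below is in Python instead of JavaScript, and the names, the FEATURE_FUNCTIONS list, and the sample layout are assumptions made for the example.

```python
def path_count(paths):
    """Number of separate strokes in a drawing."""
    return len(paths)

def point_count(paths):
    """Total number of points across all strokes."""
    return sum(len(path) for path in paths)

# Feature names paired with the functions that compute them (hypothetical layout).
FEATURE_FUNCTIONS = [("Path Count", path_count), ("Point Count", point_count)]

def extract_features(paths):
    return [fn(paths) for _, fn in FEATURE_FUNCTIONS]

def group_by(items, key):
    """Group a list of dicts by the value stored under `key`."""
    groups = {}
    for item in items:
        groups.setdefault(item[key], []).append(item)
    return groups

drawing = [[[0, 0], [1, 1], [2, 2]], [[5, 5], [6, 6]]]  # two strokes, five points
print(extract_features(drawing))  # [2, 5]

samples = [
    {"id": 1, "student_id": 1001, "label": "car"},
    {"id": 2, "student_id": 1001, "label": "fish"},
    {"id": 3, "student_id": 1002, "label": "car"},
]
for student_id, group in group_by(samples, "student_id").items():
    print(student_id, [s["label"] for s in group])
```

Keeping the feature functions in one list means the same definitions can serve both the extraction script and the viewer, which is the restructuring the later sections move toward.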


Part 3

  • 02:00:00 In this section, the speaker demonstrates the use of a sketch pad as an input to draw something to be classified. They prepare a container for it, add styles to it and fix its position at a certain distance from the right and top of the screen. They also give it a margin for clear visibility and an undo button in the center. They then add a control panel and a button to toggle the input visible or not which works successfully upon testing. The speaker emphasizes the importance of having a fast and responsive system when inspecting data to avoid manual errors that may lead to mistakes.

  • 02:05:00 In this section, the video instructor shows how to add a control panel to the chart and how to hide the background when the input is present. They demonstrate a solution to hide the background in viewer HTML by making the sketchpad canvas outline transparent. They also show how to display features on the chart immediately when something is drawn on the sketchpad by adding an update callback function that extracts features in the same way as the feature extractor. The instructor encounters a problem with conflicting objects called features but resolves it by renaming them to feature functions everywhere.

  • 02:10:00 In this section, the speaker demonstrates a dynamic point that can be added to the chart, which moves as drawing takes place. This is achieved by setting an attribute to a value, and redrawing it. The dynamic point is added to the chart class and draws before the axes are displayed. By dragging, the point moves to different areas, and when drawn with a transparent white point on a black background, it is much more visible. The value needs to be large, as the chart may be zoomed in, and the point remains visible without running over the edges.

  • 02:15:00 In this section, the instructor demonstrates how to trigger an update method in sketchpad.js, which works by hiding the dynamic input and showing the data when the toggle input is pressed. With the trigger update method, the instructor applies the get width and height feature functions in Common feature functions to calculate the width and height of the drawing, which are used to extract new features in the node feature extractor. The instructor suggests that they need to restructure the HTML so that the same resource is used for extracting and displaying data.

  • 02:20:00 In this section, the video creator demonstrates how to remove unnecessary code and replace it with new feature functions, resulting in a more generalized and multi-dimensional point creation process. After regenerating the features and updating the feature names, a few problematic sample points are observed, which is a common issue when working with data. The creator notes that outliers and inconsistencies are to be expected in data and invites viewers to help identify the cause of the issue.

  • 02:25:00 In this section, the instructor explains how to classify a drawing based on its features without using machine learning libraries. They extract features from the input and look at the nearby points to classify the input. To find the nearest point, the instructor copies the get nearest function from the math.js chart and pastes it into their code. They then call the function to identify the label for the drawing and log the result.

  • 02:30:00 In this section, the video creator adds the functionality to display the predicted label on screen using dynamic containers with HTML and CSS. The predicted label is displayed in a white container with a concatenated text displaying whether the object is a car or not. The creator experiments with drawing different objects such as clocks and pencils to test the program's predictive capabilities. The video creator then updates the chart using dynamic points with labels and images and draws lines connecting the nearest samples.

  • 02:35:00 In this section, the speaker discusses the importance of not squishing or stretching data in charts when working on machine learning projects. They demonstrate how the aspect ratio of a chart can affect the way data is interpreted, leading to confusion and errors. To solve this issue, the speaker calculates a delta for the maximum x and y values and adjusts the chart accordingly. Though this creates empty space, it allows for proper visualization of the data and accurate machine learning outcomes.

  • 02:40:00 In this section of the transcript, the video creator emphasizes the importance of data scaling in machine learning to ensure that all features have the same significance in classification. The creator demonstrates how the data can be squished and stretched during feature extraction, resulting in unequal treatment of certain features. To level the playing field, the creator introduces normalization, a common technique for remapping feature values to a range between 0 and 1. The video walks through the implementation of a new function called "normalize points" in the "utils" section to accomplish this remapping.

  • 02:45:00 In this section, the video tutorial shows how to normalize the data by changing the values to be between 0 and 1. The function is initialized to be general enough, and the minimum and maximum values for each feature are computed. The points are modified to be between 0 and 1 by subtracting the minimum value and dividing by the difference. The inverse lerp function is utilized to convert the given value into a percentage for normalizing the features extracted from the drawing. The min-max values are returned from the function and written in one of the JavaScript object files to communicate with the interface. Finally, the data is generated, and the min-max values are included in the JavaScript object files.

  • 02:50:00 In this section, the presenter explains how to normalize points before attempting to classify them using the Utils normalize points function. To do this, the raw feature data is loaded and passed as input to the function. Additionally, a min-max value can be passed in to support normalization without having to calculate them. It is also demonstrated how normalization is sensitive to outlier points and how to deal with them, such as by automatically detecting and removing them or using standardization as a different data scaling.

  • 02:55:00 In this section, the instructor discusses the technique of standardization, which involves computing the mean and standard deviation of each feature and remapping it by subtracting the mean and dividing by the standard deviation. This technique is less sensitive to outliers and can work better in certain cases. The instructor also introduces the K nearest neighbors classifier, where the class is determined based on the majority of the K nearest neighbors. The code is updated to allow for K nearest neighbors search, and the instructor demonstrates how to count the number of each label within the K nearest neighbors.
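
The K nearest neighbors idea introduced at the end of this part (find the K closest samples, count their labels, pick the majority) might look like this in outline. It is a Python sketch under assumed names and data layout, not the course's JavaScript; ties are resolved in favor of the label seen first among the nearest samples, which is a choice made here and not something the summary specifies.

```python
import math
from collections import Counter

def k_nearest(point, samples, k):
    """The k samples closest to `point` by Euclidean distance, nearest first."""
    return sorted(samples, key=lambda s: math.dist(s["point"], point))[:k]

def classify(point, samples, k):
    """Majority label among the k nearest samples, plus those samples."""
    neighbours = k_nearest(point, samples, k)
    counts = Counter(s["label"] for s in neighbours)
    label, _ = counts.most_common(1)[0]
    return label, neighbours

samples = [  # made-up, already-scaled feature points
    {"point": [0, 0], "label": "car"},
    {"point": [1, 0], "label": "car"},
    {"point": [0, 1], "label": "car"},
    {"point": [5, 5], "label": "fish"},
    {"point": [6, 5], "label": "fish"},
]
label, neighbours = classify([5, 4], samples, k=3)
print(label, [n["point"] for n in neighbours])
```

Returning the neighbours alongside the label makes it possible to draw lines to all of them in a chart, not only to the single nearest sample.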


Part 4

  • 03:00:00 In this section, the instructor explains the process of figuring out the majority of a set of samples based on their labels. This involves counting the occurrences of each label in the samples and setting the majority label as the one with the highest count. The instructor makes updates to the code to return all nearest samples and draw lines to them in the chart, rather than just the nearest one. They then demonstrate the functioning of the classifier on various datasets and encourage viewers to share their own implementations of other variants of the nearest neighbor classifiers. Lastly, the instructor stresses the need to split the data into training and testing sets to evaluate the classifier performance objectively.

  • 03:05:00 In this section, the video demonstrates how to split a data set into training and testing sets, write them into files, and define constants for those splits in the code using JavaScript. The training set is set to be 50% of the number of samples, and the video warns about the mistake of testing on the training data. In order to test, the code loops through all of the test samples and stores the value of the label in an attribute called Truth while pretending not to know the label for testing purposes.

  • 03:10:00 In this section, the video goes over how to handle the training and testing samples separately and how to properly normalize the data. The speaker explains that it's important to normalize the data with only the training set since we have no idea what the testing set will be. They also walk through how to properly classify using only information from the training data, and demonstrate how to handle unknown data points by using the classify function.

  • 03:15:00 In this section, the video creator adds a correct attribute to a test label and compares this label with the truth value from earlier to determine accuracy. A subtitle is also added to clarify where the test set starts. The creator then adds a statistics field to calculate and display the accuracy of the K nearest neighbor classifier, which is set to 10 nearest neighbors, resulting in an accuracy of 39.62%. By changing the parameter to one nearest neighbor, the accuracy is actually much worse, showing that considering multiple neighbors was a good idea.

  • 03:20:00 In this section, the instructor refactors the code and discusses the importance of decision boundaries in understanding how a classifier operates. They create a new file called "run evaluation" and load necessary constants and utilities. The instructor explains how to create a classifier and how to obtain training and testing samples to compute a classifier's accuracy. They also introduce decision boundaries, which provide valuable information about how a classifier determines the classification of a data point. The instructor states that decision boundaries are more useful than simply counting the different features of a data point.

  • 03:25:00 In this section, the speaker explains how to implement a K-nearest neighbor (KNN) classifier in JavaScript. The code starts with predicting labels for each testing sample point using the KNN method and calculating the accuracy by checking the correct predictions count. The KN.js file is created to define the class that takes training samples and K to store and predict a given point. The class code for classification is copied from viewer HTML to KN.js and modified to fit the new class. The run evaluation script is updated to use the KNN classifier instead of the old classification method. By refactoring in this way, the code becomes more manageable, and silly mistakes can be avoided.

  • 03:30:00 In this section, the instructor demonstrates how to generate a pixel-based plot without using machine learning libraries by normalizing pixel values and color-coding them based on predicted values. The plot is then saved as a PNG image and set as the background of a chart. The instructor shows how to implement this new feature in the chart.js file by taking the top left coordinate according to the data, getting the pixel value from the data bounds to the pixel bounds, and dividing it according to how the scaling was done using the transformation.

  • 03:35:00 In this section of the video, the presenter discusses the image generated in the previous section and comments on its low resolution and smoothing effect. They then introduce a higher resolution image, which doesn't require the showing of data anymore. They explain that the colored regions tell us about the different labels and how interesting it is to observe the different regions appear. They then challenge the viewers to calculate the accuracy for all possible values of K and create a line chart to determine the best value and also to generate a high-resolution decision boundary plot for the best value.

  • 03:40:00 In this section, the YouTuber explains how to prepare data for Python by writing a function to convert sample data into CSV format using JavaScript. They create a function called toCSV that converts sample data with headers and feature names into CSV format, which is commonly used in Python. They output the CSV with feature names and labels for both training and testing data, and then move on to the Python implementation of K nearest neighbor using libraries. They open the training CSV file, read the lines, and parse the data as an array of rows represented as a string, with the new line character.

  • 03:45:00 In this section, the instructor explains how to prepare the data for K nearest neighbors classification in Python. The data is read from a CSV file and stored in two initially empty lists, X for feature values and Y for labels. The instructor walks through a loop that populates the lists, converts the feature values to floats, and remaps the labels to numbers using a Python dictionary. The data is then fitted to a KNN classifier object with parameters set to emulate the web app, including 250 neighbors, brute force algorithm and uniform weights. The instructor ends by highlighting the importance of indentation in Python and by extracting the code that reads feature data from a file into a function.

  • 03:50:00 In this section, the speaker demonstrates how to pass data to the model and check its accuracy using the score function. They also encourage viewers to explore additional capabilities of Python, such as installing matplotlib to visualize feature values and decision boundaries. The video concludes with a call for viewers to reflect on what they've learned so far and prepare for the next phase of the course.
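
The CSV round trip described at 03:40:00 and 03:45:00 can be sketched end to end in plain Python. In the video the export side is written in JavaScript; here both sides are in Python for compactness, the function names and the class_ids dictionary are assumptions, and the classifier-fitting step is left out because it depends on a third-party library.

```python
def to_csv(headers, samples):
    """One header line, then one line per sample: feature values followed by the label."""
    lines = [",".join(headers)]
    for s in samples:
        lines.append(",".join(str(v) for v in s["point"] + [s["label"]]))
    return "\n".join(lines)

def read_feature_csv(text, class_ids):
    """Parse CSV text into X (lists of floats) and y (numeric class IDs)."""
    X, y = [], []
    for line in text.split("\n")[1:]:       # skip the header row
        row = line.split(",")
        X.append([float(v) for v in row[:-1]])
        y.append(class_ids[row[-1]])        # remap label text to a number
    return X, y

samples = [  # made-up feature values
    {"point": [10, 20], "label": "car"},
    {"point": [30, 5], "label": "fish"},
]
csv_text = to_csv(["width", "height", "label"], samples)
print(csv_text)

class_ids = {"car": 0, "fish": 1}           # hypothetical label-to-number mapping
X, y = read_feature_csv(csv_text, class_ids)
print(X, y)
```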

No Black Box Machine Learning Course – Learn Without Libraries
  • 2023.04.17
  • www.youtube.com
In this No Black Box Machine Learning Course in JavaScript, you will gain a deep understanding of machine learning systems by coding without relying on libra...
 

MIT 6.034 "Artificial Intelligence". Fall 2010. Lecture 1. Introduction and Scope



1. Introduction and Scope

This video is an introduction to the MIT 6.034 course "Artificial Intelligence". The professor explains the definition of artificial intelligence and its importance, and goes on to discuss the models of thinking and representations that are important for understanding the subject. Finally, the video provides a brief overview of the course, including how the grade is calculated and what the quiz and final will entail.

  • 00:00:00 In this video, a professor discusses the definition of artificial intelligence and its importance. He goes on to say that anyone who takes the course will get smarter. The professor also discusses models of thinking and how they are important in order to have a good understanding of the subject. Finally, he talks about the importance of representations in order to make good models.

  • 00:05:00 In this video, the professor explains how gyroscopes work and how to represent a problem in terms of a graph. He then goes on to explain how to solve the farmer Fox goose and grain problem, which is an example that many people may be familiar with from childhood.

  • 00:10:00 The video introduces the concept of artificial intelligence and its various components, including the generate-and-test method. The video then goes on to discuss the Rumpelstiltskin principle, which states that once you can name something, you can get power over it.

  • 00:15:00 This video introduces the notion that simple ideas can be powerful. It then discusses the definition and examples of simple ideas. The main point of the video is that simple ideas are important for building smarter programs, and that scientists and engineers have different motivations for studying them.

  • 00:20:00 This video discusses the history of artificial intelligence, beginning with the work done by Ada Lovelace over a century ago. The modern era of AI was kicked off with the paper written by Marvin Minsky in 1960. In one day, the discussion of artificial intelligence will be encompassed in the course.

  • 00:25:00 The "bulldozer age" refers to the age when people began to see that they had access to unlimited computing power, and began to develop rule-based expert systems.

  • 00:30:00 The video discusses the history of human evolution and the high school idea, which is that humans evolved through gradual and continuous improvement. It goes on to discuss how the changes that led to human intelligence were accidental evolutionary products, and speculates on what these changes might be.

  • 00:35:00 This video provides a brief introduction to Noam Chomsky's ideas on the development of human intelligence. The main points made are that language is at the center of human intelligence, and that the main purpose of the course is to help students develop skills in the area. The video also mentions the importance of recitations and tutorials, which are key aspects of the course.

  • 00:40:00 This video provides a brief overview of the MIT course, including its correlation between attendance at lectures and grades. The video then provides a summary of how the course calculates a student's grade, which includes taking into account the student's performance on quizzes and the final. Finally, the video warns students not to attempt to take all of the final exams, as there would be too much pressure and less opportunity for improvement.

  • 00:45:00 The video introduces the quiz and final and explains how the quiz will work and the final exam format. The video also explains how students will be able to contact the instructor and schedule tutorials.
1. Introduction and Scope
  • 2014.01.10
  • www.youtube.com
MIT 6.034 Artificial Intelligence, Fall 2010View the complete course: http://ocw.mit.edu/6-034F10Instructor: Patrick WinstonIn this lecture, Prof. Winston in...
 

Lecture 2. Reasoning: Goal Trees and Problem Solving



2. Reasoning: Goal Trees and Problem Solving

This video discusses reasoning, goal trees, and problem solving. It introduces a technique called "problem reduction" and explains how it can be used to solve calculus problems. It also discusses how to use heuristic transformations to solve problems, and how knowledge can be used to solve problems in complex domains.

  • 00:00:00 The video introduces problem reduction, which is a common problem-solving technique used by students in calculus. It discusses the educational philosophy behind problem reduction, and provides a list of examples of problem reductions.

  • 00:05:00 The speaker is explaining how problem solving works, and how different transformations can help solve a problem. They go over four safe transformations that are necessary for solving a problem. The first step is to apply all of the safe transformations, and then look in the table to see if the problem has been solved. If the problem has been solved, then the speaker will report success.

  • 00:10:00 The video discusses the concept of goal trees and problem solving, and introduces the idea of heuristic transformations. These transformations, while not always successful, can be useful in certain situations.

  • 00:15:00 The video discusses various heuristic transformations that can be used to solve integration problems. The lecturer shows one member of a family of transformations: the integral of a function of tan x can be rewritten as the integral of a function of y over 1 plus y squared, dy. This transformation from trigonometric form into polynomial form gets rid of the trigonometric terms. A further transformation is prompted by an expression of the form 1 minus x squared: a student suggests the substitution x equals sine y, which goes from polynomial form back into trigonometric form. The lecturer remarks that once such transformations are encoded in a program, nobody has to remember them in order to integrate by hand.

  • 00:20:00 The video discusses reasoning, goal trees, and problem solving. The presenter introduces a problem-solving structure called the "goal tree." This tree shows how goals are related to one another, and it can be helpful in making decisions about which problem to solve. The presenter explains that it is also known as a "problem reduction tree" or an "and-or tree."

  • 00:25:00 This video introduces the concept of goal trees and problem solving, and shows how one can measure the depth of function composition using symbols. The video then demonstrates how one can apply a safe transformation to split an integral into three pieces, and how it works for a particular rational function.

  • 00:30:00 The video discusses the reasoning program, which solves problems by composing safe transformations. It shows how the program stopped short of a solution on a particular problem and went back to work on another problem.

  • 00:35:00 The video discusses the reasoning behind Slagle's model of freshman calculus problems, in which knowledge about transformations, about how goal trees work, and about tables is necessary in order to solve the problems. The video also mentions that the depth of functional composition, a measure that Brett suggested, doesn't actually matter because the tree doesn't grow deep or broad.

  • 00:40:00 The video discusses how knowledge is represented in problem solving, and how certain transformations make the problem simpler. It also discusses how knowledge can be used to solve problems in complex domains.

  • 00:45:00 The speaker demonstrates a program that supposedly demonstrates how computers can be "intelligent." However, the speaker quickly realizes that the program does the same thing as he does, and so the computer is not truly intelligent.
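
The substitutions mentioned at 00:15:00 can be written out as below. This is standard calculus rather than a transcription of the lecture; the symbols f, x and y are assumptions, and the second substitution assumes the passage refers to x = sin y.

```latex
% Trigonometric form to polynomial form: substitute y = tan x
y = \tan x, \qquad dy = (1 + \tan^2 x)\,dx = (1 + y^2)\,dx
\quad\Longrightarrow\quad
\int f(\tan x)\,dx = \int \frac{f(y)}{1 + y^2}\,dy

% Polynomial form back to trigonometric form: substitute x = sin y
x = \sin y, \qquad dx = \cos y\,dy, \qquad 1 - x^2 = \cos^2 y

% Worked example (taking \cos y > 0):
\int \frac{dx}{\sqrt{1 - x^2}}
  = \int \frac{\cos y}{\cos y}\,dy
  = y + C
  = \arcsin x + C
```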
 

Lecture 3. Reasoning: Goal Trees and Rule-Based Expert Systems



3. Reasoning: Goal Trees and Rule-Based Expert Systems

This video explains how a rule-based expert system works. The system is designed to solve problems that are difficult to solve using more traditional methods. The system is composed of several rules that are connected by and gates, enabling the system to recognize a specific animal with certainty.
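
A forward-chaining rule system of the kind summarized here might look like the sketch below. The rules and facts are hypothetical stand-ins for illustration (the lecture's actual rule set is not reproduced in this summary), and forward_chain is an assumed helper name. Each rule fires only when all of its antecedents are known, which plays the role of the AND gate.

```python
# Hypothetical rules: (set of antecedents, consequent).
RULES = [
    ({"has hair"}, "is a mammal"),
    ({"is a mammal", "eats meat"}, "is a carnivore"),
    ({"is a carnivore", "has tawny color", "has dark spots"}, "is a cheetah"),
    ({"is a carnivore", "has tawny color", "has black stripes"}, "is a tiger"),
]


def forward_chain(facts, rules):
    """Fire rules until no rule adds a new fact; return every known fact."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in rules:
            # AND gate: every antecedent must already be known.
            if antecedents <= known and consequent not in known:
                known.add(consequent)
                changed = True
    return known


observed = {"has hair", "eats meat", "has tawny color", "has dark spots"}
conclusions = forward_chain(observed, RULES)
print("is a cheetah" in conclusions)  # True
print("is a tiger" in conclusions)    # False
```

Backward chaining, mentioned at 00:25:00, would instead start from a hypothesis such as "is a cheetah" and recursively check whether its antecedents can be established.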

  • 00:00:00 This video explains how a rule-based expert system (RBS) is built, and it provides an example of how the system works. The RBS is designed to solve problems that are difficult to solve using more traditional methods, such as algebraic equations.

  • 00:05:00 In this video, Professor Patrick Winston explains how reasoning programs, or rule-based expert systems, work. The program's structure is very simple, with four blocks that are executed in an iterative loop in order to achieve a desired outcome. The program is able to solve problems involving simple blocks because it takes hints from questions it has answered in the past and uses recursion to achieve a complex outcome.

  • 00:10:00 The video explains how a goal tree is used to answer questions about how something was done, and how an and-or tree can be used to do this. It also explains that the integration program can use goal trees to answer questions about its own behavior.

  • 00:15:00 This video discusses how complex behavior is the result of the complexity of the environment, not the complexity of the program. Rule-based expert systems were developed in the late '60s as a way to encapsulate knowledge in simple rules, and they are still in use today.

  • 00:20:00 This YouTube video discusses how a forward-chaining rule-based expert system (RBSES) can be used to identify animals in a small zoo. The RBSES is composed of several rules that are connected by and gates, enabling the system to recognize a specific animal with certainty.

  • 00:25:00 This video explains how a rule-based expert system (RBE) works, by going backwards from a hypothesis to determine whether an object is a certain type of animal.

  • 00:30:00 A rule-based expert system was created to design houses similar to those designed by Portuguese architect Siza. The system is capable of translating what a grocery store bagger says into an if-then rule, allowing a knowledge engineer to understand it.

  • 00:35:00 In this video, Professor Patrick Winston discusses knowledge engineering principles, including the need for specific cases and the use of heuristics. He also provides an example of how heuristic number two, the question of whether two objects are the same or different, can be used to solve problems.

  • 00:40:00 The presenter discusses three ways in which human intelligence can be enhanced: by building rule-based systems, by developing goal-driven programs, and by using integration programs. Heuristic number three is that when a rule or goal is not being followed, the system will crack, indicating a need for further knowledge. The presenter demonstrates this by discussing a case in which a program prescribed a barrel of penicillin to a patient.

  • 00:45:00 This video explains how reasoning via goal trees and rule-based expert systems works. In both examples, the system is able to read stories and determine the consequences of actions.
 

Lecture 4. Search: Depth-First, Hill Climbing, Beam



4. Search: Depth-First, Hill Climbing, Beam

In this YouTube video, Patrick Winston discusses different search algorithms, including Depth-first, Hill Climbing, Beam, and Best-first searches. Using a map as an example, he demonstrates the advantages and limitations of each algorithm and how understanding different search methods can improve problem-solving skills. Winston also discusses the application of search algorithms in intelligent systems, using the Genesis system to answer questions about the Macbeth story. He also introduces the concept of a Pyrrhic victory and how search programs can discover such situations by looking through graphs and reporting their findings in English. Overall, the video provides a comprehensive overview of search algorithms and their practical use in real-world scenarios.
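
The queue-based search procedure summarized in the sections below can be sketched as follows: Depth-first and Breadth-first search share one loop and differ only in whether new paths go on the front or the back of the queue. The map GRAPH and the function name search are assumptions made for illustration, not the lecture's example.

```python
from collections import deque

# Hypothetical map: node -> neighbours, listed in lexical order.
GRAPH = {
    "S": ["A", "B"],
    "A": ["B", "D", "S"],
    "B": ["A", "C", "S"],
    "C": ["B", "E"],
    "D": ["A", "G"],
    "E": ["C"],
    "G": ["D"],
}


def search(graph, start, goal, depth_first=True):
    """Return a path from start to goal as a list of nodes, or None."""
    queue = deque([[start]])  # each queue entry is a partial path
    extended = set()          # nodes whose paths have already been extended
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        if node in extended:
            continue          # never extend two paths ending at the same node
        extended.add(node)
        # Extend the path, refusing to revisit a node already on it.
        new_paths = [path + [n] for n in graph[node] if n not in path]
        if depth_first:
            queue.extendleft(reversed(new_paths))  # front of the queue
        else:
            queue.extend(new_paths)                # back of the queue
    return None


print(search(GRAPH, "S", "G", depth_first=True))   # ['S', 'A', 'D', 'G']
print(search(GRAPH, "S", "G", depth_first=False))  # ['S', 'A', 'D', 'G']
```

On this small map both variants happen to return the same path; they differ in the order in which paths are examined.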

  • 00:00:00 In this section, Patrick Winston discusses different search methods and how they relate to our own problem-solving abilities. He demonstrates the importance of a good search algorithm with the example of finding the optimal path from one point to another on a map. He also introduces the concept of a British Museum search, in which every possible path is explored, but notes that this method is not efficient. He goes on to discuss depth-first, hill climbing, and beam search and how they can be used in different scenarios. He emphasizes that understanding different search algorithms can help develop intuition about problem-solving and may give insight into how our brains tackle problems as well.

  • 00:05:00 In this section, Depth-first, Hill Climbing, and Beam search are introduced using the example of a map. The British Museum algorithm is used to illustrate how all possible paths can be found without a path biting its own tail, that is, without revisiting a node already on it. While search is illustrated with maps, it is made clear that it is not limited to them and is really about the choices made when trying to reach a decision. Depth-first search is one of the searches shown, and it consists of barreling ahead in a single-minded manner, choosing a path and backtracking when faced with a dead end. The process of backtracking is also introduced as a way to make the algorithm more efficient.

  • 00:10:00 In this section, the video discusses two main search algorithms: Depth-first Search and Breadth-first Search. Depth-first Search is best used in conjunction with the optional backtracking technique, as it can prevent missing a path that leads to the goal. Breadth-first Search builds a tree level by level and completes a path that leads to the goal. The video then tests both search algorithms on a sample problem, moving the starting position and adjusting the search accordingly. A flowchart is introduced to demonstrate the algorithm for the search, utilizing a queue to represent paths under consideration.

  • 00:15:00 In this section, the speaker explains how the Depth-first Search algorithm works. The algorithm starts with initializing the queue and extending the first path on the queue. After extending s, the speaker gets two paths, s goes to a and s goes to b. For Depth-first Search, the new extended paths are put on the front of the queue so that the algorithm can keep going down into the search tree. The speaker also explains that Breadth-first Search uses the same algorithm as Depth-first Search with one line changed, which is to put the new paths at the back of the queue instead of the front.

  • 00:20:00 In this section, we learn about the limitations of Breadth-First Search and how to improve it. The algorithm is considered inefficient and can't tell if it's getting closer to or further away from the goal. Additionally, it often extends more than one path that ends at the same node, which is wasted work. By amending the algorithm to extend a path only if its final node has not been extended before, we can avoid wasting time on duplicated paths. Using this method, we see a significant improvement in the search efficiency and path quality.

  • 00:25:00 In this section, the video explores Hill Climbing search as a more informed approach to finding the goal node by considering the distance to the goal. Hill Climbing works like Depth-first search, except that instead of ordering the options lexically it orders them by their proximity to the goal node. In the example this results in a straighter path with no backtracking, though it may not always be the optimal path. The video demonstrates that Hill Climbing produces fewer enqueuings and a more direct path compared to Depth-First Search. The video encourages the use of heuristics in search algorithms if available.

  • 00:30:00 In this section, the instructor discusses the technique of Beam Search, a complement or addition to Breadth-first Search that allows for an informed search using heuristics. Beam Search sets a limit on the number of paths to consider at each level, and only keeps the top two paths that can get closest to the goal by taking advantage of extra information or heuristic measurement of distance to the goal. The instructor mentions that Hill Climbing is also an informed search that adds new paths to the front of the queue by considering the distance to the goal, which are sorted to keep everything straight.

  • 00:35:00 In this section, the speaker discusses Beam Search and Best-first Search, and then the behavior of Hill Climbing in continuous spaces such as mountains. Beam Search involves keeping only the w best paths at each level, while Best-first Search involves always working on the leaf node that is closest to the goal, and can skip around in the search tree. Hill Climbing can encounter problems in continuous spaces, such as getting stuck on a local maximum or not being able to move in a flat area. Finally, the speaker illustrates an additional problem with Hill Climbing in high dimensional spaces, where a sharp ridge may be present.

  • 00:40:00 In this section, the video discusses modeling intelligence and the need for search algorithms in building intelligent systems. The speaker uses the example of a topographical map to illustrate how we can get fooled into thinking we're at the top, when we're actually not. This leads to the concept of searching, which is necessary for making plans and evaluating choices. The speaker then demonstrates the use of the Genesis system to answer questions about the Macbeth story using a search algorithm. The system absorbs information, builds an elaboration graph, and searches for patterns in the story related to revenge and other higher level concepts.

  • 00:45:00 In this section, Patrick Winston discusses the concept of a Pyrrhic victory, which is a situation where everything seems to be going well at first but eventually leads to negative consequences. He demonstrates how search programs can discover such information by looking through graphs and can answer questions based on that information. The programs use a combination of explicit statements and if/then rules to build these graphs and report the information in English. Winston also mentions that these programs can generate common sense answers and higher-level thoughts by reporting on the searches that produced the information. Finally, he demonstrates the system's ability to answer questions about Macbeth's character and motivations using language output generated by a parser system.
4. Search: Depth-First, Hill Climbing, Beam
  • 2014.01.10
  • www.youtube.com
MIT 6.034 Artificial Intelligence, Fall 2010View the complete course: http://ocw.mit.edu/6-034F10Instructor: Patrick WinstonThis lecture covers algorithms fo...
 

Lecture 5. Search: Optimal, Branch and Bound, A*



5. Search: Optimal, Branch and Bound, A*

The video discusses several search algorithms for finding the shortest path between two places, focusing on the example of Route 66 between Chicago and Los Angeles. The video introduces the concept of heuristic distance and provides examples of different search algorithms, such as hill climbing, beam search, and branch and bound. The speaker emphasizes the importance of using admissible and consistent heuristics in the A* algorithm to optimize the search. Furthermore, the video notes the effectiveness of using an extended list and airline distances to determine lower bounds on the shortest path. Ultimately, the video concludes with the promise of discussing further refinements of the A* algorithm in the next lecture.

  • 00:00:00 In this section, the professor discusses how to find the shortest path between two places, focusing on the example of Route 66 between Chicago and Los Angeles. He mentions the creation of the interstate highway system by President Eisenhower, who wanted to replicate the German army's ability to move troops around the country quickly. The professor then introduces the concept of heuristic distance and how it can help to find the best path, although it is not always true. He also gives examples of different search algorithms, such as hill climbing and beam search, that aim to find the best path by being close to the destination.

  • 00:05:00 In this section, the professor discusses the concept of heuristic distance and the principle of problem solving by asking someone who knows the answer. Using the example of finding the shortest path on a map, the professor suggests following the path suggested by Juana, but verifies it by checking that all other possible paths end up being longer than the suggested route. The professor elaborates on the process of calculating the path length and choosing the shortest path to extend, until the path length matches the one suggested by Juana.

  • 00:10:00 In this section, the speaker discusses how to find the shortest path without an oracle. The approach involves always extending the shortest path so far until the goal is reached. The speaker provides an example to illustrate the process of finding the shortest path, assuming that all links have non-negative lengths. The approach checks whether any remaining path is still shorter than the one that reached the goal, and if none is, that path is the shortest. The speaker explains that this approach finds a shortest path, but there might be other equally short paths if zero-length links exist.

  • 00:15:00 In this section of the video, the speaker demonstrates using branch and bound for finding the shortest path on a more complicated map. They mention decorating the flowchart and explain the process of initializing the queue, testing the first path on the queue, and extending paths that are not winners. The speaker notes that the branch and bound approach puts many paths onto the queue and extends many paths that are not optimal, but this can be improved by only extending paths that have not been extended before. The speaker emphasizes the importance of using only the extended paths approach for finding optimal paths.

  • 00:20:00 In this section, the concept of an extended list is introduced as an improvement to the branch-and-bound algorithm. The extended list prevents the algorithm from extending paths that end at a node which has already been extended, since such paths are no shorter than the one that reached that node first. By keeping an extended list, vast areas of the tree can be pruned away, reducing the number of extensions needed to reach a solution. Compared to the previous example, the new algorithm only requires 38 extensions instead of 835, resulting in a substantial saving in computational time.

  • 00:25:00 In this section, the concept of using airline distances to determine the lower bound for the shortest possible path is introduced. The accumulated distance and airline distance are added to provide a lower bound on the path. The simulation is then demonstrated with the selection of the path with the shortest potential distance from S to G. In case of a tie score, the path with the lexically least value is chosen.

  • 00:30:00 In this section, the speaker discusses using heuristics to speed up graph search algorithms. A heuristic is admissible when its estimate is guaranteed never to exceed the actual distance. The extended list is more useful than using one of these lower bound heuristics. However, the effectiveness of heuristics depends on the problem, and by changing the placement of the starting position, the results of the search can be altered. Ultimately, it is important to note that using heuristics may not repeat movements through the same node, but it will not necessarily do something essential for an efficient search.

  • 00:35:00 In this section, the video discusses A*, a search algorithm that combines the admissible heuristic with the branch and bound algorithm. By utilizing both techniques, A* can greatly improve upon their individual performance. The admissible heuristic uses a strict goal while the branch and bound algorithm understands the space exploration involved. The video shows how A* can solve problems more efficiently when both techniques are utilized together. However, the video also notes that admissibility may not be enough if the search goes beyond traditional maps. As a result, the admissible heuristic and the A* algorithm might become less effective in finding optimal solutions.

  • 00:40:00 In this section, the professor explains the concept of admissible heuristics in the A* algorithm. He shows an example of a map with odd distances and explains how the use of an admissible heuristic may not always lead to finding the shortest path. The professor emphasizes that an admissible heuristic is only sufficient for maps, and that making the algorithm work in situations that aren't maps requires something stronger than admissibility in the heuristic. The video concludes with the promise of discussing this refinement in the next lecture.

  • 00:45:00 In this section, the lecturer discusses the requirements for a heuristic function to work within the A* algorithm. He introduces the concepts of admissibility and consistency, explaining that a heuristic function must be both admissible and consistent to work in situations where it is not a map. He shows that using an admissible but inconsistent heuristic can cause the algorithm to fail, even in scenarios where a consistent heuristic would have worked. Finally, the lecturer emphasizes the importance of using every advantage available to optimize the A* algorithm, including using an extended list and an appropriate heuristic function.
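
The combination described in this lecture (branch and bound, an extended list, and an airline-distance lower bound) can be sketched as below. The map ROADS, the estimates in AIRLINE and the function a_star are hypothetical; the estimates were chosen to be both admissible and consistent for this map, which is the condition the lecturer gives for using the extended list safely.

```python
import heapq

# Hypothetical map: node -> {neighbour: road length}.
ROADS = {
    "S": {"A": 3, "B": 5},
    "A": {"S": 3, "B": 4, "D": 3},
    "B": {"S": 5, "A": 4, "C": 4},
    "C": {"B": 4, "E": 6},
    "D": {"A": 3, "G": 5},
    "E": {"C": 6},
    "G": {"D": 5},
}
# Hypothetical airline distances to G (admissible and consistent on this map).
AIRLINE = {"S": 9, "A": 7, "B": 8, "C": 10, "D": 4, "E": 14, "G": 0}


def a_star(roads, start, goal, estimate=None):
    """Return (length, path, extensions) for a shortest path, or None.

    With estimate=None every heuristic value is 0, which reduces the
    search to branch and bound with an extended list.
    """
    h = estimate or {}
    # Heap entries: (accumulated length + estimate, accumulated length, path).
    heap = [(h.get(start, 0), 0, [start])]
    extended = set()
    extensions = 0
    while heap:
        _, length, path = heapq.heappop(heap)
        node = path[-1]
        if node == goal:
            # Every path still on the heap has a lower bound at least this long.
            return length, path, extensions
        if node in extended:
            continue
        extended.add(node)
        extensions += 1
        for neighbour, step in roads[node].items():
            if neighbour not in extended:
                new_length = length + step
                bound = new_length + h.get(neighbour, 0)
                heapq.heappush(heap, (bound, new_length, path + [neighbour]))
    return None


print(a_star(ROADS, "S", "G", AIRLINE))  # (11, ['S', 'A', 'D', 'G'], 3)
print(a_star(ROADS, "S", "G"))           # (11, ['S', 'A', 'D', 'G'], 5)
```

Both calls find the same shortest path; on this small map the airline-distance bound cuts the number of extensions from 5 to 3.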