Machine Learning and Neural Networks - page 43

 

How to implement KNN from scratch with Python

Code: https://github.com/AssemblyAI-Examples/Machine-Learning-From-Scratch/tree/main/01%20KNN



How to implement KNN from scratch with Python

In the video titled "How to implement KNN from scratch with Python", the speaker explains how to create a KNN classifier from scratch using Python. They cover the steps involved in implementing the algorithm: calculating the distance between the new data point and the other points in the dataset, selecting the k closest points, and taking the majority label for classification or the average value for regression. The speaker implements the algorithm as a Python class and tests it on the iris dataset, reaching an accuracy of 96%. They also invite viewers to check out the code in their GitHub repository and ask questions in the comments section.

  • 00:00:00 In this section, we learn about the k-Nearest Neighbors (k-NN) algorithm, how it works, and the steps needed to implement it in Python. k-NN is a distance-based algorithm that selects the k data points closest to a new data point. The value of k is chosen by the user, and the algorithm can be used for both regression and classification problems. It starts by calculating the distance between the new data point and every other data point in the dataset. Then the k closest points are chosen, and either the average of their values is taken (regression) or the label with the majority vote is returned (classification). We also see how to implement the algorithm as a Python class with fit and predict functions, plus a helper function that calculates the distance between two points.

  • 00:05:00 In this section, the speaker explains how to create a KNN classifier from scratch using Python. Starting with NumPy's argsort function to get the indices that sort the distance array, they select the k nearest neighbors, look up their class labels, and return the most common label. They then apply this classifier to the iris dataset to classify flower types and achieve an accuracy of 96%, demonstrating a working implementation of KNN. The speaker invites viewers to check the code available in their GitHub repository and ask questions in the comments section.
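The steps summarized above can be sketched roughly as follows. This is a minimal illustration and not the code from the video or the linked repository: the names `KNN`, `euclidean_distance`, and `_predict_one` are assumptions, and Euclidean distance is assumed as the distance measure.

```python
from collections import Counter

import numpy as np


def euclidean_distance(a, b):
    """Straight-line distance between two feature vectors (assumed metric)."""
    return np.sqrt(np.sum((a - b) ** 2))


class KNN:
    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # k-NN has no real training step; it only stores the data.
        self.X_train = np.asarray(X, dtype=float)
        self.y_train = np.asarray(y)

    def predict(self, X):
        return np.array([self._predict_one(x) for x in np.asarray(X, dtype=float)])

    def _predict_one(self, x):
        # 1. Distance from x to every stored training point.
        distances = [euclidean_distance(x, x_train) for x_train in self.X_train]
        # 2. argsort returns the indices that would sort the distances.
        k_indices = np.argsort(distances)[: self.k]
        # 3. Majority vote among the labels of the k nearest neighbors.
        k_labels = self.y_train[k_indices].tolist()
        return Counter(k_labels).most_common(1)[0][0]
```

For regression, the last step would return the mean of the neighbors' values instead of the majority label.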
How to implement KNN from scratch with Python
  • 2022.09.11
  • www.youtube.com
In the first lesson of the Machine Learning from Scratch course, we will learn how to implement the K-Nearest Neighbours algorithm. Being one of the simpler ...
 

How to implement Linear Regression from scratch with Python

Code: https://github.com/AssemblyAI-Examples/Machine-Learning-From-Scratch/tree/main/02%20Linear%20Regression



How to implement Linear Regression from scratch with Python

This video covers the process of implementing linear regression from scratch using Python. The speaker explains how to find the best fitting line using mean squared error and how to calculate the weights and bias with gradient descent. The speaker also discusses how the learning rate affects convergence and demonstrates how to test the model on a dataset from scikit-learn's datasets module. They also fix a typo in the code and adjust the learning rate to improve the fit of the prediction line. The code is shared on GitHub and viewers are invited to ask questions.

  • 00:00:00 In this section, the focus is on linear regression, which involves understanding the pattern in a given dataset and drawing a straight line that fits the data as well as possible. The mean squared error measures the error of the line over all data points, and the best fitting line is found by using gradient descent to find the model parameters (weight and bias) that give the minimum mean squared error. The learning rate controls how fast or slow we move in the direction that gradient descent tells us to go: a low learning rate approaches the minimum error slowly, whereas a high learning rate can jump around the error space and fail to find the minimum. During training, the weight and bias are initialized to zero, the equation is given the data points to predict the result, and the error is calculated; matrix multiplication over all data points makes it easier to calculate the gradients. During testing, the trained model predicts results using the same equation.

  • 00:05:00 In this section, the speaker implements linear regression from scratch with Python. The speaker sets the learning rate, gives the number of iterations a default value, and initializes the weights and bias to zero. The prediction is the dot product of x with the weights, plus the bias. The gradients are then calculated from the differences between the predictions and the actual values: the weight gradient uses the dot product of x with those differences, and the bias gradient uses their sum. The weights and bias are updated with these gradients, and the process is repeated for the chosen number of iterations.

  • 00:10:00 In this section, the speaker discusses how to train the linear regression model and make predictions using the class. The weights and bias are updated by subtracting the learning rate times their respective derivatives. A for loop repeats this update over the dataset for the chosen number of iterations. Finally, the speaker shows how to test the linear regression algorithm on a dataset from scikit-learn's datasets module, by fitting a line and calculating the mean squared error of the predictions. A dimension error caused by an incorrect dot product is fixed by taking the transpose of x.

  • 00:15:00 In this section, the presenter fixes a typo in the code and uses it to create a linear regression model that predicts the y values based on the x values from a given dataset. They then visualize the prediction line and notice that while it fits well, it could be improved. The presenter decides to adjust the learning rate and reruns the model to obtain a better fit. They share the code on GitHub and invite viewers to ask questions if needed.
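As a rough sketch of the training loop described in these sections, the following shows gradient descent on the mean squared error. It is an illustration under assumptions, not the video's code: the class and attribute names are hypothetical, and the gradients are scaled by 1/n (a factor of 2 is sometimes kept and is absorbed by the learning rate).

```python
import numpy as np


class LinearRegression:
    def __init__(self, lr=0.01, n_iters=1000):
        self.lr = lr
        self.n_iters = n_iters
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        n_samples, n_features = X.shape
        # Weights and bias start at zero.
        self.weights = np.zeros(n_features)
        self.bias = 0.0
        for _ in range(self.n_iters):
            y_pred = X @ self.weights + self.bias
            error = y_pred - y
            # Gradients of the MSE; note the transpose of X in dw.
            dw = (1 / n_samples) * (X.T @ error)
            db = (1 / n_samples) * np.sum(error)
            # Step against the gradient, scaled by the learning rate.
            self.weights -= self.lr * dw
            self.bias -= self.lr * db

    def predict(self, X):
        return X @ self.weights + self.bias


def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)
```

With a learning rate that is too small the loop may stop before reaching the minimum, and with one that is too large the updates can diverge, which matches the behavior the presenter observes when tuning it.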
How to implement Linear Regression from scratch with Python
  • 2022.09.13
  • www.youtube.com
In the second lesson of the Machine Learning from Scratch course, we will learn how to implement the Linear Regression algorithm. You can find the code here: ...
 

How to implement Logistic Regression from scratch with Python

Code: https://github.com/AssemblyAI-Examples/Machine-Learning-From-Scratch/tree/main/03%20Logistic%20Regression



How to implement Logistic Regression from scratch with Python

The video explains how to implement logistic regression from scratch with Python, using the sigmoid function to create probabilities and cross-entropy as an error function. The instructor shares step-by-step instructions for calculating predictions and gradients, and for updating the weights and bias over the iterations. They also demonstrate how to load a breast cancer dataset and train the logistic regression classifier to predict whether a tumor is malignant or benign. The video concludes by evaluating the accuracy of the model using a custom function. Overall, the implementation is successful and shows that the logistic regression algorithm works well.

  • 00:00:00 In this section, the video discusses logistic regression and how it produces probabilities instead of specific values by using the sigmoid function. Instead of mean squared error, logistic regression uses cross entropy as its error function. To use gradient descent, the gradient of the error function with respect to the weight and bias needs to be calculated. The learning rate determines how fast to move in the direction given by the gradient. During testing, the probability is calculated, and the label is chosen based on the highest probability. The implementation of logistic regression is similar to that of linear regression, including the initialization of the weights and bias to zero.

  • 00:05:00 In this section of the video, the instructor explains how to implement logistic regression with Python using a sigmoid function to predict the results. The process involves calculating the linear predictions from the product of the weights and the x-values plus the bias, passing them through the sigmoid function to get the results, and calculating the gradients. The instructor demonstrates how to calculate the gradients for the weights and the bias and how to apply the updates over the iterations. The section also covers how to perform inference with logistic regression by getting the probability and choosing the label based on the values of the predictions.

  • 00:10:00 In this section, the instructor demonstrates how to implement logistic regression from scratch using Python. They explain the process step by step, showing how to calculate probabilities and class labels using the sigmoid function, and how to adjust the learning rate to obtain better results. The instructor also loads a breast cancer dataset from Scikit-learn and trains the logistic regression classifier to predict if a tumor is malignant or benign based on the features of the dataset. Lastly, they evaluate the accuracy of the algorithm and demonstrate how to calculate it using a custom function. Overall, the implementation is successful and shows that the homemade algorithm works quite well.
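The procedure outlined above might look something like the sketch below. The names are hypothetical, and the 0.5 decision threshold in `predict` is an assumption for the binary case; the video's exact code may differ.

```python
import numpy as np


def sigmoid(x):
    return 1 / (1 + np.exp(-x))


class LogisticRegression:
    def __init__(self, lr=0.001, n_iters=1000):
        self.lr = lr
        self.n_iters = n_iters
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0.0
        for _ in range(self.n_iters):
            # Linear prediction squashed into a probability.
            probs = sigmoid(X @ self.weights + self.bias)
            # Gradients of the cross-entropy loss.
            dw = (1 / n_samples) * (X.T @ (probs - y))
            db = (1 / n_samples) * np.sum(probs - y)
            self.weights -= self.lr * dw
            self.bias -= self.lr * db

    def predict(self, X):
        probs = sigmoid(X @ self.weights + self.bias)
        # Assumed threshold: probabilities above 0.5 map to class 1.
        return np.where(probs > 0.5, 1, 0)


def accuracy(y_true, y_pred):
    return np.sum(y_true == y_pred) / len(y_true)
```

The gradient expressions have the same form as in linear regression; only the prediction changes, because it passes through the sigmoid.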
How to implement Logistic Regression from scratch with Python
  • 2022.09.14
  • www.youtube.com
In the third lesson of the Machine Learning from Scratch course, we will learn how to implement the Logistic Regression algorithm. It is quite similar to the...
 

How to implement Decision Trees from scratch with Python

Code: https://github.com/AssemblyAI-Examples/Machine-Learning-From-Scratch/tree/main/04%20Decision%20Trees



How to implement Decision Trees from scratch with Python

The video provides a step-by-step guide on building a decision tree from scratch using Python. The speaker explains the concept of decision trees, how they work, and how they are built. They discuss stopping criteria, the grow tree function, the helper functions "most common label," "information gain," "entropy," and "split," as well as the predict function. The speaker also demonstrates how to calculate information gain, weighted entropy, and accuracy. Additionally, they test the decision tree model and provide viewers with a link to their GitHub repository where the code is available.

  • 00:00:00 In this section, we learn about decision trees, how they work and how they are built. A decision tree is built to represent a dataset: the data points are repeatedly divided until they end up in leaf nodes that represent either 'yes' or 'no.' The nodes between the root and the leaf nodes are called branches, and each one splits the data on a feature, such as whether a data point is located in the east or west neighborhood. Information gain is calculated as the entropy of the parent minus the weighted average of the entropy of the children, and the decision tree is trained by choosing the splits with the highest information gain. Lastly, we discuss stopping criteria, which are used to decide when to stop building the decision tree.

  • 00:05:00 In this section, the speaker discusses ways to stop a decision tree before analyzing all possible leaf nodes, including setting a maximum depth or minimum number of samples for a node to have. The speaker then presents two classes that will be used for the implementation of the decision tree: a Node class and a DecisionTree class. The Node class includes information on the feature the node was divided with and the value of the node. The DecisionTree class includes methods for fitting the tree with x and y values, predicting on new data, and setting stopping criteria like minimum number of samples and maximum depth. Overall, the speaker takes a step-by-step approach to outlining the implementation of a decision tree from scratch in Python.

  • 00:10:00 In this section, the speaker discusses the implementation of the grow tree function, which is the main function that builds the decision tree recursively. The function takes in the x and y values, and the fit step makes sure that the number of features to consider does not exceed the number of actual features. The function first checks the stopping criteria and then proceeds to find the best split, create child nodes, and call the grow tree function again. If the stopping criteria are met, the function creates a new leaf node and returns it with the value parameter. The speaker also discusses a helper function called "most common label," which uses the Counter data structure and returns the most common label in the dataset.

  • 00:15:00 In this section, the video discusses how to implement decision trees from scratch with Python. The instructor demonstrates how to create a helper function to find the best threshold and feature to create a new split. This function uses numpy to randomly select a group of features to consider creating a new split. Once the helper function finds the threshold among all possible splits, it calculates the information gain to determine if it's better than the best gain calculated so far. Finally, the best split index and threshold are returned.

  • 00:20:00 In this section of the video, the speaker creates a helper function called "information gain" to calculate the information gain and defines another helper function called "entropy" to calculate the entropy of the parent based on the values passed in. They explain that the entropy of the parent is calculated as the negative of the summation of p(x) times log2 of p(x), and they use a NumPy trick to count the occurrences of each value and divide the counts by the total number of values to get p(x). Next, the speaker creates another helper function called "split" to find which indices go to the left and which go to the right, and demonstrates how NumPy's argwhere works.

  • 00:25:00 In this section of the video, the presenter explains how to calculate the weighted entropy of the children of a decision tree node using Python. After obtaining the length of the y values and the left and right indices, the entropy of the children is calculated as a weighted average. This involves dividing the number of samples in each child node by the total number of samples, multiplying that weight by the entropy of the child, and adding the results together. The information gain is then the parent entropy minus the child entropy, and this value is returned to the function that searches for the best split.

  • 00:30:00 In this section of the video, the presenter explains how to implement the predict function for the decision tree classifier. The traverse_tree helper function is utilized here to recursively traverse the tree and return the value of the leaf node if it is reached. If the value of the feature is smaller or equal to the threshold, the left side of the tree is passed to be traversed, and the right side of the tree is passed to be traversed otherwise. The values are returned and then turned into a numpy array before being outputted. The decision tree classifier is then tested with the breast cancer dataset and the predict function is used to generate predictions which are passed to an accuracy metric.

  • 00:35:00 In this section, the presenter is testing the decision tree model that they built from scratch using Python. They first calculate the accuracy of the model using the predictions and the test data. They also find two errors in the code—one in the initialization of the node and the other in the traverse tree function. After fixing the errors, they run the test data again and get an accuracy of 0.91. They then pass the model different arguments and get a bit better accuracy. Finally, the presenter invites viewers to ask questions and provides a link to their GitHub repository where the code is available.
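The entropy, split, and information gain helpers described above can be sketched as below. This is an illustrative version under assumptions: the function names are hypothetical, `np.bincount` is assumed as the counting trick (so labels must be non-negative integers), and every unique feature value is tried as a threshold.

```python
import numpy as np


def entropy(y):
    # Class proportions p(x); bincount assumes non-negative integer labels.
    ps = np.bincount(y) / len(y)
    return -np.sum([p * np.log2(p) for p in ps if p > 0])


def split(x_column, threshold):
    # Indices of samples going to the left and right child.
    left_idxs = np.argwhere(x_column <= threshold).flatten()
    right_idxs = np.argwhere(x_column > threshold).flatten()
    return left_idxs, right_idxs


def information_gain(y, x_column, threshold):
    left_idxs, right_idxs = split(x_column, threshold)
    if len(left_idxs) == 0 or len(right_idxs) == 0:
        return 0.0
    n = len(y)
    # Weighted average of the children's entropy.
    child_entropy = (len(left_idxs) / n) * entropy(y[left_idxs]) + (
        len(right_idxs) / n
    ) * entropy(y[right_idxs])
    return entropy(y) - child_entropy


def best_split(X, y, feature_idxs):
    best_gain, best_feature, best_threshold = -1.0, None, None
    for feature in feature_idxs:
        column = X[:, feature]
        for threshold in np.unique(column):
            gain = information_gain(y, column, threshold)
            if gain > best_gain:
                best_gain, best_feature, best_threshold = gain, feature, threshold
    return best_feature, best_threshold
```

A recursive grow function would call `best_split` at each node until a stopping criterion (maximum depth, minimum samples, or a pure node) is met.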
How to implement Decision Trees from scratch with Python
  • 2022.09.15
  • www.youtube.com
In the fourth lesson of the Machine Learning from Scratch course, we will learn how to implement Decision Trees. This one is a bit longer due to all the deta...
 

How to implement Random Forest from scratch with Python

Code: https://github.com/AssemblyAI-Examples/Machine-Learning-From-Scratch/tree/main/05%20Random%20Forests



How to implement Random Forest from scratch with Python

This video tutorial teaches how to implement Random Forests from scratch with Python. During training, a random subset of the dataset is selected, and a decision tree is created with this subset. This process is repeated for the number of trees chosen before the algorithm starts. During inference, a prediction is obtained from each tree, and for classification the majority vote of the class labels is taken. The speaker demonstrates the implementation by collecting the predictions of all decision trees in a list, converting them to a NumPy array, and taking the most common label per sample. The accuracy is calculated as the number of correct predictions divided by the total number of test samples. The speaker also mentions that the number of trees, the maximum depth, and the minimum samples per split can be modified to achieve higher accuracy.

  • 00:00:00 In this section, we learn about random forests, which consist of many different decision trees. The process involves introducing some randomness into the equation when creating these trees. During training, a random subset of the dataset is selected, and a decision tree is created with this subset. This process is repeated for the number of trees determined before beginning the algorithm. During inference, the prediction is obtained from each tree, and if it's a classification, the majority vote of the class label is taken. If it's regression, the mean of all predictions is calculated. The implementation uses the decision trees class created in the previous lesson and is initialized by specifying the number of trees, maximum depth, minimum samples for a split, the number of features, and an empty array to hold all the trees in. The class has a fit and predict function, and what's needed is to pass the required parameters as mentioned above.

  • 00:05:00 In this section, the instructor explains how to fit a decision tree on a subset of the samples and append it to the list of trees in a random forest model. A helper function "bootstrap_samples" is created to randomly choose a specified number of samples with replacement from the given data set. The instructor then explains how to predict with the random forest for an input X, which involves iterating over all trees in the random forest and collecting a list of predictions, one inner list per tree. Finally, the instructor introduces NumPy's swapaxes function to rearrange the lists so that each inner list contains the predictions for the same sample from different trees, and a "most_common" helper function that uses the Counter data structure from the collections library to return the most common classification label.

  • 00:10:00 In this section, the speaker explains the process of implementing a random forest from scratch using Python. They use the helper function for prediction: a list comprehension applies it to the predictions of the decision trees for each sample, and the result is converted to a NumPy array that is returned as the predictions. The accuracy is calculated as the number of correct predictions divided by the total number of test samples. The speaker also mentions that the number of trees, the maximum depth, and the minimum samples per split can be adjusted to achieve higher accuracy. The speaker directs viewers to the code in their GitHub repository and welcomes questions in the comment section. Finally, the speaker hands over to Patrick for the remaining part of the tutorial.
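The two pieces specific to the forest, bootstrap sampling and majority voting, can be sketched without a full tree implementation. The snippet below is an assumption-laden illustration: `majority_vote` is a hypothetical name, and the per-tree predictions are passed in directly rather than produced by the decision tree class from the previous lesson.

```python
from collections import Counter

import numpy as np


def bootstrap_samples(X, y, rng):
    # Draw n_samples row indices with replacement.
    n_samples = X.shape[0]
    idxs = rng.choice(n_samples, size=n_samples, replace=True)
    return X[idxs], y[idxs]


def most_common(labels):
    return Counter(labels).most_common(1)[0][0]


def majority_vote(tree_predictions):
    # Input shape: (n_trees, n_samples). After swapaxes each row holds
    # the votes of all trees for one sample.
    per_sample = np.swapaxes(np.asarray(tree_predictions), 0, 1)
    return np.array([most_common(votes.tolist()) for votes in per_sample])
```

In a full forest, each tree would be fitted on the output of `bootstrap_samples`, and `majority_vote` would receive `[tree.predict(X) for tree in trees]`.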
How to implement Random Forest from scratch with Python
  • 2022.09.16
  • www.youtube.com
In the fifth lesson of the Machine Learning from Scratch course, we will learn how to implement Random Forests. Thanks to all the code we developed for Decis...
 

How to implement Naive Bayes from scratch with Python

Code: https://github.com/AssemblyAI-Examples/Machine-Learning-From-Scratch/tree/main/06%20NaiveBayes



How to implement Naive Bayes from scratch with Python

This video tutorial focuses on implementing Naive Bayes from scratch using Python. The instructor provides an overview of Bayes' theorem and the assumption of independence. They explain how to calculate the prior probability and class conditional probability, necessary for training the algorithm. The speaker also introduces the Gaussian distribution as a way to model probabilities. The video demonstrates the training and prediction steps for the algorithm with code. The instructor tests the algorithm on a toy dataset with two classes, achieving an accuracy of 96.5%. Overall, this tutorial is a useful resource for those interested in learning Naive Bayes and implementing it in Python.

  • 00:00:00 In this section, the speaker discusses the theory behind Naive Bayes, a probabilistic classifier that assumes independence between features to predict class labels. They explain Bayes' theorem and the assumption of independence, and how this is used to calculate the posterior probability of each class. The speaker goes on to explain how to calculate the prior probability and class conditional probability, both of which are necessary for training the algorithm. They also introduce the Gaussian distribution as a way to model probabilities. The training and prediction steps are summarized, and the code to implement Naive Bayes is demonstrated. The speaker provides a definition for both the fit and predict methods, and outlines the steps necessary for training and prediction in each.

  • 00:05:00 In this section of the video, the instructor explains how to implement Naive Bayes from scratch using Python. The code assumes that x and y are already in NumPy ndarray format. The instructor shows how to get the number of samples and features from x.shape and how to get the unique classes by using numpy.unique(). The next step is to calculate the mean, the variance, and the prior for each class. This is done by initializing these values with zeros and then calculating them using NumPy functions. The instructor then explains how to calculate the posterior probability for each class by using a helper function and a list comprehension. Finally, the instructor shows how to return the prediction as a NumPy array.

  • 00:10:00 In this section, the speaker discusses the implementation of the Naive Bayes algorithm in Python. They go through the steps of calculating priors, then calculating posterior using a Gaussian distribution and creating a helper function for the probability density, followed by predicting the class with the highest posterior. Finally, they test the algorithm on a toy dataset of 1000 samples and 10 features with two classes, achieving an accuracy of 96.5%. The speaker encourages further exploration of the code and looks forward to the next lesson.
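A compact sketch of the fit and predict steps described above is given below. It is not the video's code: the names are hypothetical, the Gaussian density is evaluated directly in log space, and a small constant is added to the variance; both of these are assumptions made here for numerical stability.

```python
import numpy as np


class NaiveBayes:
    def fit(self, X, y):
        n_samples, n_features = X.shape
        self.classes = np.unique(y)
        n_classes = len(self.classes)
        # Per-class mean, variance and prior, initialized with zeros.
        self.mean = np.zeros((n_classes, n_features))
        self.var = np.zeros((n_classes, n_features))
        self.priors = np.zeros(n_classes)
        for idx, c in enumerate(self.classes):
            X_c = X[y == c]
            self.mean[idx] = X_c.mean(axis=0)
            self.var[idx] = X_c.var(axis=0) + 1e-9  # assumed stabilizer
            self.priors[idx] = X_c.shape[0] / n_samples

    def predict(self, X):
        return np.array([self._predict_one(x) for x in X])

    def _predict_one(self, x):
        # Log prior plus summed log likelihoods for every class.
        posteriors = [
            np.log(self.priors[idx]) + np.sum(self._log_pdf(idx, x))
            for idx in range(len(self.classes))
        ]
        return self.classes[np.argmax(posteriors)]

    def _log_pdf(self, idx, x):
        # Log of the Gaussian density, one value per feature.
        mean, var = self.mean[idx], self.var[idx]
        return -0.5 * np.log(2 * np.pi * var) - (x - mean) ** 2 / (2 * var)
```

Summing log densities corresponds to multiplying the per-feature probabilities, which is where the independence assumption enters.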
How to implement Naive Bayes from scratch with Python
  • 2022.09.17
  • www.youtube.com
In the 6th lesson of the Machine Learning from Scratch course, we will learn how to implement the Naive Bayes algorithm. You can find the code here: https://g...
 

How to implement PCA (Principal Component Analysis) from scratch with Python

Code: https://github.com/AssemblyAI-Examples/Machine-Learning-From-Scratch/tree/main/07%20PCA



How to implement PCA (Principal Component Analysis) from scratch with Python

The video explains the process of implementing Principal Component Analysis (PCA) from scratch using Python and Numpy. PCA is a technique that reduces the dimensionality of a dataset while retaining most of the information. The instructor walks through the steps of creating a Python class with fit and transform methods to perform PCA on a dataset. The fit method first calculates the mean and covariance of the data and extracts the eigenvectors and eigenvalues. The transform method then projects the data onto the principal components. The speaker highlights the importance of subtracting means and sorting eigenvectors in the process. Finally, the implementation is tested on the Iris dataset, resulting in successful dimensionality reduction from four to two dimensions.

  • 00:00:00 In this section, the instructor discusses Principal Component Analysis (PCA), an unsupervised learning method that reduces the dimensionality of a dataset by transforming it into a lower dimensional set that still contains most of the information of the larger set. The instructor explains how PCA finds a transformation such that the transformed features are linearly independent, with the dimensionality reduced by taking only the dimensions with the highest importance. The newly found dimensions should minimize the projection error, and the projected points should have maximum spread, which means maximum variance. The instructor walks through the steps to implement PCA from scratch using Python and NumPy. These steps include subtracting the mean from x, calculating the covariance matrix of x, computing its eigenvectors and eigenvalues, and sorting the eigenvectors according to their eigenvalues in decreasing order.

  • 00:05:00 In this section, the speaker explains the implementation of principal component analysis (PCA) using Python. This involves creating an 'init' function that takes the number of components as input, a 'fit' method that subtracts the mean, calculates the covariance, sorts the eigenvectors, and stores the principal components. The 'transform' method then applies this transformation to new data. The speaker walks through each step of code, highlighting the importance of subtracting means and sorting eigenvectors, and ultimately outputting principal components for dimensionality reduction.

  • 00:10:00 In this section, the speaker demonstrates how to implement PCA (Principal Component Analysis) from scratch in Python. They begin by creating a class with a fit and transform method. The fit method first calculates the mean of the data and centers it around the mean. Then, it computes the covariances of the data and extracts the eigenvectors and eigenvalues. The transform method then projects the data onto the principal components with a dot product. Finally, the speaker tests the implementation with the Iris dataset and successfully reduces the dimensionality of the data from four to two dimensions.
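The fit and transform methods described above could be sketched like this. The names are assumptions, and `np.linalg.eigh` is used here because the covariance matrix is symmetric; the video may use a different eigen-decomposition routine.

```python
import numpy as np


class PCA:
    def __init__(self, n_components):
        self.n_components = n_components
        self.components = None
        self.mean = None

    def fit(self, X):
        # Center the data around the mean.
        self.mean = np.mean(X, axis=0)
        X_centered = X - self.mean
        # Covariance matrix; columns of X are the variables.
        cov = np.cov(X_centered, rowvar=False)
        # eigh returns eigenvalues in ascending order and
        # eigenvectors as columns.
        eigenvalues, eigenvectors = np.linalg.eigh(cov)
        # Sort in decreasing order of eigenvalue.
        order = np.argsort(eigenvalues)[::-1]
        # Keep the top components, stored as rows.
        self.components = eigenvectors[:, order].T[: self.n_components]

    def transform(self, X):
        # Project the centered data onto the principal components.
        return (X - self.mean) @ self.components.T
```

The sign of each component is arbitrary, so projected values may be mirrored between runs or implementations without changing the result's meaning.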
How to implement PCA (Principal Component Analysis) from scratch with Python
  • 2022.09.18
  • www.youtube.com
In the 7th lesson of the Machine Learning from Scratch course, we will learn how to implement the PCA (Principal Component Analysis) algorithm. You can find t...
 

How to implement Perceptron from scratch with Python

Code: https://github.com/AssemblyAI-Examples/Machine-Learning-From-Scratch/tree/main/08%20Perceptron



How to implement Perceptron from scratch with Python

The video tutorial explains the theory behind the Perceptron algorithm, which can learn only linearly separable patterns for binary classification using an activation function, weights, and input. The presenter then outlines the necessary steps for implementing the Perceptron model from scratch in Python by selecting the learning rate and number of iterations for the optimization algorithm and defining the activation function as the unit step function. After initializing the weights and biases, the model learns from the training data by updating the weights and biases according to the Perceptron update rule. Finally, the presenter evaluates the model's accuracy by predicting the class labels for the test data, and the accuracy turns out to be 100%, indicating successful learning of the decision boundary.

  • 00:00:00 In this section, the presenter explains the basic theory behind the Perceptron algorithm and how it is an algorithm for supervised learning of binary classifiers. The Perceptron is a simplified model of a biological neuron and is also known as the prototype for neural networks. The Perceptron algorithm can learn only linearly separable patterns, and it can be seen as a single unit of an artificial neural network. The presenter then explains the mathematical representation of the Perceptron, which includes the weights, input, and activation function, and the binary classifier class labels. The video then explains the Perceptron update rule, which enables the algorithm to update the weights and biases to push them toward the positive or negative target class in case of a misclassification.

  • 00:05:00 In this section, the speaker outlines the steps for implementing a perceptron model from scratch in Python. They begin by selecting the learning rate and number of iterations for the optimization algorithm. Next, the activation function is stored as the unit step function. The weights and bias are set to None in the beginning, and the code moves on to the fit and predict functions. In the fit function, the number of samples and number of features are obtained from the training data, and then the weights and bias are given their initial values. The class labels are adjusted to be 1 or 0. Next, the optimization is performed, where the linear output is calculated for each input. Finally, the predict function is implemented, where the linear model and activation function are used to calculate the predicted output for test data.

  • 00:10:00 In this section, the presenter explains the implementation of the perceptron from scratch with Python. The perceptron update rule is Δw = α · (y − ŷ) · x for the weights and Δb = α · (y − ŷ) for the bias, where α is the learning rate, y the true label, and ŷ the predicted label. The presenter uses this rule to update the weights and bias for each training sample. After explaining the fit method, the presenter moves on to the predict method, where the linear output is calculated and then passed through the activation function to get the predicted y. Finally, the presenter tests this implementation using a helper function for accuracy and a dataset created with scikit-learn's make_blobs with 150 samples and two features, creating a perceptron with a learning rate and number of iterations, fitting it with training data, and predicting with test data. The accuracy turns out to be 100%, indicating successful learning of the decision boundary.
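The update rule and the fit/predict structure described above can be sketched as follows. The names and default values are assumptions, and the unit step is taken to return 1 for strictly positive input, which is one common convention.

```python
import numpy as np


def unit_step(x):
    # Assumed convention: 1 for x > 0, otherwise 0.
    return np.where(x > 0, 1, 0)


class Perceptron:
    def __init__(self, learning_rate=0.01, n_iters=1000):
        self.lr = learning_rate
        self.n_iters = n_iters
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0.0
        # Make sure the labels are 0 or 1.
        y_ = np.where(y > 0, 1, 0)
        for _ in range(self.n_iters):
            for x_i, target in zip(X, y_):
                y_hat = unit_step(np.dot(x_i, self.weights) + self.bias)
                # Perceptron rule: zero when the prediction is correct.
                update = self.lr * (target - y_hat)
                self.weights += update * x_i
                self.bias += update

    def predict(self, X):
        return unit_step(np.dot(X, self.weights) + self.bias)
```

Because the update is zero for correctly classified samples, the weights stop changing once a separating boundary has been found on linearly separable data.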
How to implement Perceptron from scratch with Python
  • 2022.09.19
  • www.youtube.com
In the 8th lesson of the Machine Learning from Scratch course, we will learn how to implement the Perceptron algorithm. You can find the code here: https://gi...
 

How to implement SVM (Support Vector Machine) from scratch with Python

Code: https://github.com/AssemblyAI-Examples/Machine-Learning-From-Scratch/tree/main/09%20SVM



How to implement SVM (Support Vector Machine) from scratch with Python

Support Vector Machines (SVM) aim to find a linear decision boundary that maximizes the separation between classes, with the weights being learned during training. The cost function uses a hinge loss that measures how far we are from the correct side of the decision boundary, with a regularization term added to trade off minimizing the loss against maximizing the margin. Gradients are computed, update rules derived, and weights initialized, while the prediction function is the output of the linear function. The video provides code to implement SVM from scratch in Python with NumPy, uses train_test_split and datasets from scikit-learn for testing, and plots the decision boundary and the two margin hyperplanes to confirm that the implementation works.

  • 00:00:00 In this section, the video discusses Support Vector Machines (SVM), which aim to find a linear decision boundary, or hyperplane, that provides maximum separation between classes. The hyperplane should have the largest possible margin to the nearest points, the support vectors, and the weights (w) and bias (b) that define it are what needs to be learned during training. A loss function is defined that uses a hinge loss, which measures how far a sample is from the correct side of the decision boundary. A regularization term is added to the cost function to trade off minimizing the loss against maximizing the margin on both sides, with a lambda parameter controlling the relative importance of the two parts.

  • 00:05:00 In this section, the process of finding the weights and bias for SVM is discussed. The computation of gradients is explained, and the update rules are derived from the gradient. The initialization of weights is also demonstrated. The class labels are updated to have values of -1 or 1, and the update rules are applied for the specified number of iterations. The prediction function is simply the output of the linear function that we get from the learned weights. By comparing the output with zero, we can decide the class of the given test sample. The code for SVM is written in Python using NumPy and Scikit-learn libraries.

  • 00:10:00 In this section, the presenter explains how to write Python code that implements SVM from scratch. The class consists of two parts, the fit and predict methods. The fit method is the training step, which computes the weights from the given data, while the predict method uses the learned weights to predict the class of new samples. The presenter then explains that the update uses different gradients depending on a condition: whether y_i · (w · x_i − b) is greater than or equal to one, which is checked with numpy.dot. The code then imports train_test_split and datasets from sklearn, along with matplotlib, and creates an example dataset of two blobs with two features, ensures the classes are -1 and +1, splits the data into training and testing sets, and runs the SVM to measure prediction accuracy. The presenter also outlines code for plotting the decision boundary and the two hyperplanes at +1 and -1, which confirms that the implementation works.
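
The training loop with its two gradient cases can be sketched as follows. This is a minimal illustration under stated assumptions, not the presenter's exact code: the class name SVM, the parameter names, and the default values are hypothetical, and the toy data is generated with NumPy rather than sklearn's make_blobs to keep the example self-contained.

```python
import numpy as np


class SVM:
    def __init__(self, learning_rate=0.001, lambda_param=0.01, n_iters=200):
        self.lr = learning_rate
        self.lambda_param = lambda_param
        self.n_iters = n_iters
        self.w = None
        self.b = None

    def fit(self, X, y):
        n_samples, n_features = X.shape
        y_ = np.where(y <= 0, -1, 1)  # labels must be -1 or +1
        self.w = np.zeros(n_features)
        self.b = 0.0

        for _ in range(self.n_iters):
            for idx, x_i in enumerate(X):
                # Margin condition: y_i * (w . x_i - b) >= 1
                if y_[idx] * (np.dot(x_i, self.w) - self.b) >= 1:
                    # Only the regularization term contributes
                    self.w -= self.lr * (2 * self.lambda_param * self.w)
                else:
                    # Hinge loss is active: include its gradient as well
                    self.w -= self.lr * (2 * self.lambda_param * self.w
                                         - y_[idx] * x_i)
                    self.b -= self.lr * y_[idx]
        return self

    def predict(self, X):
        # Compare the linear output with zero to pick the class
        return np.where(np.dot(X, self.w) - self.b >= 0, 1, -1)


# Toy data (assumption): two well-separated Gaussian blobs
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-3.0, 0.5, size=(60, 2)),
               rng.normal(3.0, 0.5, size=(60, 2))])
y = np.array([-1] * 60 + [1] * 60)

# Shuffle, then hold out the last 24 samples for testing
order = rng.permutation(len(X))
X, y = X[order], y[order]
X_train, X_test = X[:96], X[96:]
y_train, y_test = y[:96], y[96:]

clf = SVM().fit(X_train, y_train)
predictions = clf.predict(X_test)
acc = np.sum(predictions == y_test) / len(y_test)
print("SVM accuracy:", acc)
```

The decision boundary is the line where w · x − b = 0, and the two margin hyperplanes mentioned above are the lines where w · x − b equals +1 and −1.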
How to implement SVM (Support Vector Machine) from scratch with Python
  • 2022.09.20
  • www.youtube.com
In the 9th lesson of the Machine Learning from Scratch course, we will learn how to implement the SVM (Support Vector Machine) algorithm. You can find the cod...
 

How to implement K-Means from scratch with Python

Code: https://github.com/AssemblyAI-Examples/Machine-Learning-From-Scratch/tree/main/10%20KMeans



How to implement K-Means from scratch with Python

This video demonstrates how to implement the K-Means clustering algorithm from scratch with Python. K-Means is an unsupervised learning algorithm that clusters unlabeled data into k different clusters by iteratively updating the means, or centroids, until they no longer change. The video covers initializing empty clusters and setting parameters for the number of clusters and iterations, updating cluster labels and centroids, and stopping the optimization loop once there is no change. The speaker also explains how the Euclidean distance is used to find the closest centroid for each sample and provides a pre-written plotting function based on Matplotlib to visualize the clustering process.

  • 00:00:00 In this section, the speaker explains the k-means unsupervised learning method, which clusters a dataset into k different clusters and updates the means, or centroids, in an iterative optimization process until there is no more change. The two steps, updating the cluster labels and updating the cluster centers, are repeated; each sample is assigned to its nearest centroid, with the Euclidean distance between two feature vectors used as the measure of closeness. The speaker then demonstrates an example of finding three clusters in unlabeled data and shows how to implement k-means from scratch in Python, including initializing empty clusters and setting parameters for the number of clusters and iterations.

  • 00:05:00 In this section, the speaker discusses the implementation of K-Means from scratch using Python. They begin by initializing the necessary variables, such as empty lists for each cluster and centroid, and then defining a predict function rather than a fit method. They explain that K-Means is an unsupervised learning technique for unlabeled data. The optimization loop involves assigning samples to centroids before calculating new centroids from the clusters. The speaker notes that helper functions are necessary for creating and getting centroids and clusters. They end by mentioning that stopping the loop earlier than the maximum iterations is possible if there is no more change.

  • 00:10:00 In this section, the speaker explains the implementation of helper functions to update the cluster labels and assign samples to the closest centroids. The function that updates the cluster labels iterates through each cluster and assigns the cluster index as the label of each sample index in it. The speaker also shows the initialization of empty lists for each cluster, which hold the sample indices, and then iterates through each sample to assign it to the closest centroid. Finally, the speaker outlines the steps for plotting the centroids and clusters and adds a check for whether the intermediate steps should be plotted.

  • 00:15:00 In this section, the speaker continues the implementation of the K-Means clustering algorithm from scratch with Python. The algorithm takes a dataset and a specified number of clusters as input and assigns each point to its closest centroid. The speaker introduces helper functions for finding the closest centroid and calculating the Euclidean distance between two points. Another helper function calculates the mean of each cluster, which becomes the new centroid. Finally, the algorithm checks whether the distance between the old and new centroids is zero for every cluster to determine whether it has converged.

  • 00:20:00 In this section, the speaker explains how to implement K-Means clustering from scratch using Python and NumPy. They discuss the importance of measuring Euclidean distance and how to calculate the new centroids and cluster labels. They also provide a pre-written plotting function that uses the Matplotlib library to visualize the clustering process step by step. Finally, they demonstrate the implementation on a sample dataset using sklearn's make_blobs function to create three clusters, showing how the K-Means algorithm successfully groups data points into separate clusters. The speaker encourages viewers to check out the full code on GitHub and to watch the rest of the course for more in-depth explanations of machine learning concepts.
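
The optimization loop and its helper functions can be sketched as follows. This is a minimal illustration, not the presenter's exact code: the method names, the seed parameter, and the choice to keep the previous centroid when a cluster ends up empty are assumptions, plotting is omitted, and the toy data is generated with NumPy instead of sklearn's make_blobs.

```python
import numpy as np


def euclidean_distance(a, b):
    return np.sqrt(np.sum((a - b) ** 2))


class KMeans:
    def __init__(self, K=3, max_iters=100, seed=0):
        self.K = K
        self.max_iters = max_iters
        self.rng = np.random.default_rng(seed)
        self.clusters = [[] for _ in range(K)]  # sample indices per cluster
        self.centroids = None

    def predict(self, X):
        self.X = X
        # Initial centroids: K distinct samples picked at random
        idxs = self.rng.choice(X.shape[0], self.K, replace=False)
        self.centroids = X[idxs].astype(float)

        for _ in range(self.max_iters):
            self.clusters = self._create_clusters(self.centroids)
            old_centroids = self.centroids
            self.centroids = self._get_centroids(self.clusters)
            if self._is_converged(old_centroids, self.centroids):
                break  # no more change, stop early
        return self._get_cluster_labels(self.clusters)

    def _create_clusters(self, centroids):
        # Assign every sample index to the cluster of its closest centroid
        clusters = [[] for _ in range(self.K)]
        for idx, sample in enumerate(self.X):
            distances = [euclidean_distance(sample, c) for c in centroids]
            clusters[int(np.argmin(distances))].append(idx)
        return clusters

    def _get_centroids(self, clusters):
        # New centroid = mean of the cluster's samples
        # (assumption: an empty cluster keeps its previous centroid)
        centroids = np.array(self.centroids, dtype=float)
        for k, cluster in enumerate(clusters):
            if cluster:
                centroids[k] = np.mean(self.X[cluster], axis=0)
        return centroids

    def _is_converged(self, old, new):
        distances = [euclidean_distance(old[k], new[k]) for k in range(self.K)]
        return sum(distances) == 0

    def _get_cluster_labels(self, clusters):
        labels = np.empty(self.X.shape[0], dtype=int)
        for cluster_idx, cluster in enumerate(clusters):
            for sample_idx in cluster:
                labels[sample_idx] = cluster_idx
        return labels


# Toy data (assumption): three Gaussian blobs with two features
rng = np.random.default_rng(1)
X = np.vstack([rng.normal((-5.0, -5.0), 0.5, size=(30, 2)),
               rng.normal((0.0, 5.0), 0.5, size=(30, 2)),
               rng.normal((5.0, -5.0), 0.5, size=(30, 2))])

km = KMeans(K=3, max_iters=100)
labels = km.predict(X)
print("Cluster sizes:", np.bincount(labels, minlength=3))
```

Since the initial centroids are chosen at random, K-Means can settle in a local optimum, so the clusters found are not guaranteed to match the three blobs for every seed; running the algorithm with several seeds is a common way to check this.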
How to implement K-Means from scratch with Python
  • 2022.09.21
  • www.youtube.com
In the 10th lesson of the Machine Learning from Scratch course, we will learn how to implement the K-Means algorithm. You can find the code here: https://gith...