CS480/680 Intro to Machine Learning - Spring 2019 - University of Waterloo
CS480/680 Lecture 1: Course Introduction
This lecture introduces machine learning, a paradigm in computer science in which computers are taught to perform complex tasks without being given explicit step-by-step instructions. The video gives a brief history of machine learning and introduces the three key components of a machine learning problem: the data, the task, and the performance measure.
It also touches on learning settings in which the system receives feedback about how well it is doing, but is not given a set answer for what the right answer is.
CS480/680 Lecture 2: K-nearest neighbors
This video covers the basics of supervised learning, including the difference between classification and regression. It also gives a brief introduction to machine learning, explains how the nearest neighbor algorithm works, and discusses how to evaluate an algorithm using cross-validation and how underfitting can affect machine learning.

The second part of the lecture discusses how to use the k-nearest neighbors algorithm for regression as well as classification, and how to weight the neighbors based on their distance. Cross-validation is used to optimize the hyperparameter k, after which the entire data set is used to train the final model.
The third example is a rainfall prediction problem, where the input is sensor data and satellite imagery and the output is a prediction of whether or not it will rain. The fourth example is a problem where the input is data about a person's sleep habits, and the output is a prediction of whether or not the person will have a good sleep.
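As a concrete illustration of the k-nearest neighbors procedure summarized above, here is a minimal sketch in numpy (the toy data, function name, and use of Euclidean distance are illustrative assumptions, not the course's own code):

```python
import numpy as np

def knn_predict(X_train, y_train, x_query, k=3, weighted=False):
    """Predict the label of x_query from its k nearest training points."""
    # Euclidean distances from the query to every training point
    dists = np.linalg.norm(X_train - x_query, axis=1)
    nearest = np.argsort(dists)[:k]               # indices of the k closest points
    if not weighted:
        # Plain majority vote among the k neighbors
        labels, counts = np.unique(y_train[nearest], return_counts=True)
        return labels[np.argmax(counts)]
    # Distance-weighted vote: closer neighbors count more
    weights = 1.0 / (dists[nearest] + 1e-8)
    votes = {}
    for label, w in zip(y_train[nearest], weights):
        votes[label] = votes.get(label, 0.0) + w
    return max(votes, key=votes.get)

# Toy example: two classes in 2D
X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
y = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(X, y, np.array([4.5, 5.0]), k=3, weighted=True))  # -> 1
```

In practice the value of k would be chosen by cross-validation, as described in the lecture, and the model would then be retrained on the full data set.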
CS480/680 Lecture 3: Linear Regression
The lecture on linear regression starts with an introduction to the problem of finding the line that comes as close as possible to a given set of points. The lecturer explains that linear functions can be represented as a weighted combination of the inputs. Linear regression can be solved via optimization, with the goal of minimizing the squared Euclidean loss over the weight vector; because this objective is convex, the minimization can be done efficiently. Solving the linear regression problem means finding the weights W that give the global minimum of the objective function, which can be done using techniques such as matrix inversion (the normal equations) or iterative methods. The lecture also discusses the importance of regularization in preventing overfitting: a penalty term is added to the objective function to constrain the magnitude of the weights and keep them as small as possible.
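To make the closed-form solution concrete, here is a minimal sketch of regularized linear regression solved via the normal equations (a toy numpy example assuming a squared-error objective with an L2 penalty; the data and variable names are illustrative):

```python
import numpy as np

def ridge_fit(X, y, lam=0.1):
    """Solve min_w ||Xw - y||^2 + lam * ||w||^2 in closed form."""
    n_features = X.shape[1]
    # Normal equations with an L2 penalty: (X^T X + lam I) w = X^T y
    A = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)   # solve() is preferred over explicit inversion

# Toy data: y is roughly 2*x1 - 1*x2 plus noise
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 2 * X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=100)
print(ridge_fit(X, y, lam=0.1))  # roughly [ 2., -1.]
```

Setting the penalty weight to zero recovers ordinary least squares; larger values shrink the weights towards zero, which is the regularization effect described above.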
CS480/680 Lecture 4: Statistical Learning
In this lecture on statistical learning, the professor explains concepts such as the marginalization rule, conditional probability, joint probability, Bayes' rule, and Bayesian learning. These concepts involve placing probability distributions over hypotheses and updating them to reduce uncertainty as data is observed. The lecture emphasizes the importance of understanding these concepts for justifying and explaining many algorithms, and also highlights their limitations, particularly in dealing with large hypothesis spaces. Despite this limitation, Bayesian learning is optimal as long as the prior is correct, and it provides users with meaningful uncertainty information about its predictions.
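A toy numerical illustration of the Bayesian update described above (the hypotheses, prior, and observations here are invented purely for illustration):

```python
import numpy as np

# Three candidate hypotheses about a coin's bias P(heads)
hypotheses = np.array([0.3, 0.5, 0.8])
prior = np.array([1/3, 1/3, 1/3])          # uniform prior over hypotheses

data = [1, 1, 0, 1]                         # observed flips: 1 = heads, 0 = tails

posterior = prior.copy()
for flip in data:
    likelihood = np.where(flip == 1, hypotheses, 1 - hypotheses)
    posterior = posterior * likelihood       # Bayes' rule numerator
    posterior = posterior / posterior.sum()  # normalize (marginalize over hypotheses)

print(posterior)  # belief shifts toward the hypotheses most consistent with the data
```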
In this lecture, the instructor explains approximate Bayesian learning as a solution to the tractability issue with exact Bayesian learning. Maximum likelihood and maximum a posteriori are the most commonly used approximations in statistical learning, but they come with their own weaknesses, such as overfitting and less precise predictions than full Bayesian learning. The lecture also covers the optimization problem that arises from maximizing the likelihood, the amount of data needed for different problems, and the relevance of the next few slides for the course assignment. The instructor concludes by emphasizing that these algorithms converge towards the best hypothesis within the given hypothesis space, even when the true hypothesis is not realizable within that space.
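To make the contrast between maximum likelihood and MAP concrete, here is a small sketch for estimating a coin's bias (the Beta prior and the counts are illustrative assumptions, not values from the lecture):

```python
# Estimating theta = P(heads) from observed flips
heads, tails = 3, 1

# Maximum likelihood: the empirical frequency; prone to overfitting on small samples
theta_ml = heads / (heads + tails)                         # 0.75

# MAP with a Beta(a, b) prior acts like adding pseudo-counts of heads and tails
a, b = 2, 2                                                # mild prior belief in a fair coin
theta_map = (heads + a - 1) / (heads + tails + a + b - 2)  # about 0.667

print(theta_ml, theta_map)
```

The MAP estimate is pulled towards the prior, which is exactly the regularizing effect that mitigates the overfitting of plain maximum likelihood.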
CS480/680 Lecture 5: Statistical Linear Regression
In this lecture on statistical linear regression, the professor covers numerous topics, starting with maximum likelihood and a Gaussian likelihood for noisy, corrupted data. They explain how maximum likelihood techniques are used to find the weights that maximize the probability of all the data points in the dataset. The lecture then delves into maximum a posteriori (MAP) estimation, the spherical Gaussian prior, and the covariance matrix, and discusses the use of prior information and regularization. The expected error in linear regression is then broken down into two terms: one accounting for noise and another dependent on the weight vector W, which can further be decomposed into bias and variance. The lecture ends with a discussion of using Bayesian learning to compute the posterior distribution. Overall, the lecture covers a broad range of topics related to statistical linear regression and provides valuable insights into optimizing models to reduce prediction error.
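The bias/variance decomposition mentioned above can be checked empirically with a small Monte Carlo simulation (the sine target, polynomial models, and sample sizes below are illustrative choices, not the lecture's own setup):

```python
import numpy as np

rng = np.random.default_rng(0)
true_f = lambda x: np.sin(2 * np.pi * x)       # underlying function (illustrative)
noise_std = 0.2
x_test = 0.35                                  # single test input for clarity

def fit_and_predict(degree):
    """Fit a polynomial on a fresh random training set and predict at x_test."""
    x = rng.uniform(0, 1, 20)
    y = true_f(x) + noise_std * rng.normal(size=20)
    return np.polyval(np.polyfit(x, y, degree), x_test)

for degree in [1, 3, 9]:
    preds = np.array([fit_and_predict(degree) for _ in range(2000)])
    bias2 = (preds.mean() - true_f(x_test)) ** 2   # squared bias of the average model
    variance = preds.var()                          # spread across training sets
    print(f"degree {degree}: bias^2 = {bias2:.4f}, variance = {variance:.4f}")
```

Simple models tend to show high bias and low variance, while flexible models show the opposite, with the irreducible noise term fixed at noise_std squared.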
The lecture focuses on Bayesian regression, which estimates a posterior distribution that converges towards the true set of weights as more data points are observed. The prior is a distribution over pairs (w0, w1), i.e., a distribution over lines. After observing a data point, the posterior distribution is computed from the prior and the likelihood, resulting in an updated belief about where the line lies. To make predictions, a weighted combination of the hypotheses' predictions is taken according to the posterior distribution, leading to a Gaussian prediction whose mean and variance are given by specific formulas. To obtain an actual point prediction, one simply takes the mean of this predictive Gaussian.
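A minimal sketch of this posterior and predictive computation for a line, assuming the standard conjugate setup with a zero-mean isotropic Gaussian prior and Gaussian noise (the precision values alpha and beta and the generated data are illustrative):

```python
import numpy as np

alpha, beta = 2.0, 25.0   # prior precision on the weights, noise precision

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 10)
y = -0.3 + 0.5 * x + rng.normal(scale=1 / np.sqrt(beta), size=10)  # toy line

Phi = np.column_stack([np.ones_like(x), x])    # design matrix: intercept + slope

# Posterior over weights: S_N = (alpha I + beta Phi^T Phi)^-1, m_N = beta S_N Phi^T y
S_N = np.linalg.inv(alpha * np.eye(2) + beta * Phi.T @ Phi)
m_N = beta * S_N @ Phi.T @ y

# Predictive distribution at a new input is Gaussian with this mean and variance
x_new = np.array([1.0, 0.7])                   # features for x = 0.7
pred_mean = x_new @ m_N
pred_var = 1 / beta + x_new @ S_N @ x_new

print(m_N)                  # posterior mean; roughly recovers the intercept and slope
print(pred_mean, pred_var)  # the point prediction is the mean of the predictive Gaussian
```

As more points are added, S_N shrinks and the posterior concentrates around the true weights, which is the convergence behaviour described above.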
CS480/680 Lecture 6: Tools for surveys (Paulo Pacheco)
In this video, Paulo Pacheco introduces two academic tools for surveys: Google Scholar and RefWorks. He explains how to search for academic papers and sort them by citations using Google Scholar, and suggests filtering out older papers in favour of more recent ones. Pacheco emphasizes the importance of exporting and managing citations, and introduces RefWorks as a tool for this task. He also provides tips for accessing academic publications, including using creative keyword searches, and notes that access may require being on the university network or using a VPN.
CS480/680 Lecture 6: Kaggle datasets and competitions
The lecture discusses Kaggle, a community where data science practitioners compete in sponsored competitions on provided datasets for cash prizes. Kaggle offers kernels for training machine learning models and extracting features from data, as well as a vast selection of almost 17,000 datasets that can be used when designing algorithms. The lecturer also notes that companies' GitHub repositories can provide valuable datasets, code, and published papers for competitions.
CS480/680 Lecture 6: Normalizing flows (Priyank Jaini)
The video provides an introduction to normalizing flows in deep generative models, a technique that learns a function to transform one distribution to another, with the goal of transforming a known distribution to an unknown distribution of interest. The video also discusses possible research projects related to normalizing flows, including conducting a survey of different papers and advancements related to normalizing flows and analyzing the transformation of a single Gaussian into a mixture of Gaussians. The lecturer encourages exploration of the many different applications of normalizing flows.
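The core mechanism behind normalizing flows is the change-of-variables formula: push samples from a known distribution through an invertible map, and correct the density by the Jacobian of the inverse. A one-dimensional sketch, assuming an illustrative map f(z) = exp(z) rather than anything from the talk:

```python
import numpy as np
from scipy.stats import norm

# Push a standard Gaussian z through the invertible map f(z) = exp(z),
# which turns it into a log-normal distribution.
def flow_log_density(x):
    """log p_x(x) = log p_z(f^-1(x)) + log |d f^-1 / dx| (change of variables)."""
    z = np.log(x)          # inverse transformation f^-1(x)
    log_det = -np.log(x)   # derivative of f^-1 is 1/x, so its log-abs is -log(x)
    return norm.logpdf(z) + log_det

# Sampling is just the forward direction: sample z, then apply f
samples = np.exp(np.random.default_rng(0).normal(size=5))
print(samples)
print(flow_log_density(samples))  # matches scipy's lognorm.logpdf(samples, s=1)
```

Deep normalizing flows stack many such invertible transformations with learnable parameters so that a simple base distribution can be warped into a complex one, as described in the video.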
CS480/680 Lecture 6: Unsupervised word translation (Kira Selby)
The video discusses unsupervised word translation, which involves training a machine learning model to translate to and from a language without any cross-lingual information or dictionary matching. The MUSE model is introduced as an approach that achieves state-of-the-art accuracy on hundreds of languages without any cross-lingual information and comes close to supervised models in performance. Unsupervised word translation employs a matrix that maps the embedding space of one language onto that of another, learned with a generative adversarial network (GAN). By training the generator and discriminator against each other, the model learns to map the two embedding distributions into a shared space, providing better translation results. The models can achieve 82.3% accuracy on word-to-word translations.
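The adversarial training loop itself is too long to sketch here, but the central object, a single linear map W between embedding spaces, can be illustrated with the closed-form orthogonal Procrustes solution that is typically used to refine such a mapping once (pseudo-)paired word vectors are available. The data below is synthetic and purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 50, 200                                        # embedding dim, number of pairs

# Toy "source" and "target" embeddings related by a hidden rotation plus noise
X = rng.normal(size=(n, d))                           # source-language word vectors
R_true, _ = np.linalg.qr(rng.normal(size=(d, d)))     # hidden orthogonal map
Y = X @ R_true + 0.01 * rng.normal(size=(n, d))       # target-language word vectors

# Orthogonal Procrustes: W = U V^T where U S V^T = svd(X^T Y), so that X @ W ~ Y
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

print(np.linalg.norm(X @ W - Y) / np.linalg.norm(Y))  # small relative alignment error
```

In the fully unsupervised setting described in the video, the initial mapping is learned adversarially because no word pairs are given; the GAN's discriminator tries to tell mapped source embeddings apart from real target embeddings while the mapping is trained to fool it.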
CS480/680 Lecture 6: Fact checking and reinforcement learning (Vik Goel)
Computer scientist Vik Goel discusses the application of reinforcement learning to fact-checking online news and proposes using a recommendation system to insert supporting evidence in real time. He suggests using a large corpus of academic papers as a data source to train a classifier to predict where a citation is needed. Additionally, Goel explains how researchers have begun encoding human priors into reinforcement learning models to speed up learning and to recognize different objects in video games. This presents a promising research area where additional priors can improve the learning process.
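A hedged sketch of the kind of citation-need classifier described above (the sentences, labels, and choice of TF-IDF with logistic regression are illustrative assumptions, not the presenter's actual setup):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Sentences from papers, labelled 1 if the original sentence carried a citation
sentences = [
    "Prior work has shown that dropout reduces overfitting",
    "We thank the reviewers for their helpful comments",
    "Transformer models achieve state-of-the-art results on translation",
    "The remainder of this paper is organized as follows",
]
needs_citation = [1, 0, 1, 0]

# TF-IDF features feeding a logistic regression classifier
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(sentences, needs_citation)

print(model.predict(["Convolutional networks outperform older methods on image tasks"]))
```

In the proposed system, a much larger corpus of academic papers would supply the training labels automatically, and the classifier's positive predictions would trigger the recommendation system that surfaces supporting evidence.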