Machine learning in trading: theory, models, practice and algo-trading - page 3317

 
So, at the global minimum on your chart there is huge overfitting, because the validation curve goes as far up as possible.
 
Andrey Dik #:

Do I understand correctly that you think of learning as recovering the value of a function when some of its terms are unknown?

For example, we have observations A+B+C=Y, with A and B as predictors and C unknown. By looking at the answers (the markup) on the sample, optimisation selects a compensation for the value of C, which can be expressed in various ways, including multiplication by coefficients in the case of a neural network, or through mutual rules. In other words, the function is actively investigated in order to substitute for the missing variables and get closer to the correct answer. It is assumed that the results of evaluating the function on the sample are representative enough for the result to be usable on new data.

However, in the case of optimisation in the tester, we can change the values of the variables themselves (essentially treating them as functions), assuming that we do not know them, or do not even know which of them will carry more weight in our compensatory function, i.e. in the rules approximating the values of Y.
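As a rough sketch of the idea above (my own illustration, not something from the thread): if Y = A + B + C but only A and B are available as predictors, a least-squares fit can only "compensate" for C through the coefficients of A and B, and whatever part of C is unrelated to them remains as irreducible error on new data.

```python
# Toy illustration (assumed data): Y = A + B + C, but C is never observed.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
A = rng.normal(size=n)
B = rng.normal(size=n)
C = rng.normal(size=n)                           # the unobserved term of the function
Y = A + B + C

X = np.column_stack([A, B, np.ones(n)])          # observed predictors + intercept
coef, *_ = np.linalg.lstsq(X, Y, rcond=None)     # "compensation" via coefficients
residual = Y - X @ coef

print("fitted coefficients:", np.round(coef, 2))         # ~ [1, 1, 0]
print("residual variance  :", round(residual.var(), 2))  # ~ var(C) = 1, irreducible
```

Here C is independent of A and B, so no choice of coefficients can remove its contribution; the fit recovers the true weights of A and B and leaves the variance of C as noise.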

 

Further questioning of chatGPT.


Max, [26 Oct 2023, 18:59:31]:

How does the bias-variance dilemma relate to finding a global minimum?


ChatGPT & Midjourney | AI bot, [26 Oct 2023, 18:59:38]:

The bias-variance dilemma refers to the model selection problem in machine learning. It says that as model complexity increases (e.g., by adding new features or increasing the number of parameters), the variance (spread) of the model's predictions grows, whereas simpler models tend to have a larger bias (systematic error).


On the other hand, finding the global minimum refers to optimising the parameters of the model. The goal is to select parameter values at which the error function reaches its minimum. However, with a large number of parameters and a complex model, finding the global minimum may be difficult or even impossible.


Thus, the bias-variance dilemma and the search for a global minimum are related in that selecting too complex a model for a machine learning problem can cause overfitting (high variance), while using too simple a model can cause underfitting (high bias). This can make it difficult to find the global minimum and lead to suboptimal results. Therefore, it is important to find a balance between the complexity of the model and its ability to generalise in order to achieve optimal results.
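A minimal sketch of the trade-off described above, on an assumed toy 1-D problem (the data, degrees and noise level are my own choices, not from the thread): refitting polynomial models of growing complexity on many noisy training samples shows low degrees with high bias and low variance, and high degrees with the opposite.

```python
# Estimate bias^2 and variance of polynomial fits over repeated noisy samples.
import numpy as np

rng = np.random.default_rng(1)
true_f = lambda x: np.sin(2 * np.pi * x)          # assumed "true" function
x_test = np.linspace(0, 1, 50)
n_train, n_repeats, noise = 20, 200, 0.3

for degree in (1, 3, 9, 12):
    preds = np.empty((n_repeats, x_test.size))
    for r in range(n_repeats):
        x = rng.uniform(0, 1, n_train)
        y = true_f(x) + rng.normal(scale=noise, size=n_train)
        coeffs = np.polyfit(x, y, degree)          # fit on this noisy sample
        preds[r] = np.polyval(coeffs, x_test)
    bias2 = ((preds.mean(axis=0) - true_f(x_test)) ** 2).mean()
    variance = preds.var(axis=0).mean()
    print(f"degree {degree:2d}  bias^2={bias2:.3f}  variance={variance:.3f}")
```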

 

Max, [26 Oct 2023, 19:03:50]:

Should the optimal model then correspond to the global minimum, or not?


ChatGPT & Midjourney | AI bot, [26 Oct 2023, 19:03:54]:

No, the optimal model does not necessarily correspond to the global minimum. In some cases, especially in the presence of noise in the data, the global minimum can be reached by overfitting the model. Such a model may show high accuracy on the training data but generalise poorly to new examples.


Instead, the optimal model should be near the global minimum but with low variance and bias. Such a model will produce stable and accurate predictions on new data. Therefore, when selecting the optimal model it is important to consider not only its accuracy on the training data but also its ability to generalise to new examples.
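A small illustration of that point under assumed toy data (my own example, not from the thread): with noise present, the exact global minimum of the training error is reached by an interpolating polynomial, yet a simpler fit with a non-zero training error typically generalises better.

```python
# Interpolating fit = global minimum of training error, but poor generalisation.
import numpy as np

rng = np.random.default_rng(2)
true_f = lambda x: np.sin(2 * np.pi * x)

x_tr = np.linspace(0, 1, 10)
y_tr = true_f(x_tr) + rng.normal(scale=0.3, size=x_tr.size)   # noisy training set
x_te = np.linspace(0, 1, 200)
y_te = true_f(x_te) + rng.normal(scale=0.3, size=x_te.size)   # fresh test data

for degree in (3, 9):                 # degree 9 interpolates the 10 training points
    c = np.polyfit(x_tr, y_tr, degree)
    mse_tr = np.mean((np.polyval(c, x_tr) - y_tr) ** 2)
    mse_te = np.mean((np.polyval(c, x_te) - y_te) ** 2)
    print(f"degree {degree}: train MSE={mse_tr:.4f}  test MSE={mse_te:.4f}")
```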

 
Aleksey Vyazmikin #:

Do I understand correctly that you think of learning as recovering the value of a function when some of its terms are unknown?

For example, we have observations A+B+C=Y, with A and B as predictors and C unknown. By looking at the answers (the markup) on the sample, optimisation selects a compensation for the value of C, which can be expressed in various ways, including multiplication by coefficients in the case of a neural network, or through mutual rules. In other words, the function is actively investigated in order to substitute for the missing variables and get closer to the correct answer. It is also assumed that the results of evaluating the function on the sample are representative enough for the result to be usable on new data.

However, in the case of optimisation in the tester, we can change the values of the variables themselves (essentially treating them as functions), assuming that we do not know them, or do not even know which of them will carry more weight in our compensatory function, i.e. in the rules approximating the values of Y.

I don't quite understand the question. Please clarify.

 
Maxim Dmitrievsky chatGPT.


Max, [26 Oct 2023, 18:59:31]:

How does the bias-variance dilemma relate to the search for a global minimum?


ChatGPT & Midjourney | AI bot, [26 Oct 2023, 18:59:38]:

The bias-variance dilemma refers to the model selection problem in machine learning. It says that as model complexity increases (e.g., by adding new features or increasing the number of parameters), the variance (spread) of the model's predictions grows, whereas simpler models tend to have a larger bias (systematic error).

How is it proposed to measure the variance (spread) of the error in classification, as opposed to regression?
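One common answer (an assumption on my part about what is being asked, not a reply from the thread): for classification the spread can be estimated the same way as for regression, by repeating the train/test split and looking at the variance of the 0-1 error across repetitions.

```python
# Spread of the misclassification rate across repeated resamplings.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, n_features=10, random_state=0)

errors = []
for seed in range(100):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=seed)
    clf = DecisionTreeClassifier(random_state=seed).fit(X_tr, y_tr)
    errors.append(np.mean(clf.predict(X_te) != y_te))   # 0-1 error on held-out data

errors = np.array(errors)
print(f"mean error = {errors.mean():.3f}, spread (std) = {errors.std():.3f}")
```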

 
Maxim Dmitrievsky #:

Max, [26 Oct 2023, 19:03:50]:

Should the optimal model then correspond to the global minimum, or not?


ChatGPT & Midjourney | AI bot, [26 Oct 2023, 19:03:54]:

No, the optimal model does not necessarily correspond to the global minimum. In some cases, especially in the presence of noise in the data, the global minimum can be reached by overfitting the model. Such a model may show high accuracy on the training data but generalise poorly to new examples.


Instead, the optimal model should be near the global minimum but with low variance and bias. Such a model will produce stable and accurate predictions on new data. Therefore, when selecting the optimal model it is important to consider not only its accuracy on the training data but also its ability to generalise to new examples.

Ask GPT, when should you stop training? You haven't answered that question yourself yet.

 
Andrey Dik #:

I don't quite understand the question. Please clarify.

So my calculations are not correct and the answer is "no" - you have a different idea of the process.

 
Andrew, [26.10.2023 13:19]
what is a stopping criterion in neural network training in particular and machine learning in general?

GPT Chat bot, [26.10.2023 13:19]
Stopping criteria in neural network training and machine learning in general depend on the task and the learning algorithm. Some common stopping criteria include:

1. Achievement of sufficiently low error: A threshold error value can be defined, below which the model is considered to have achieved sufficient accuracy. Training is stopped when the error on the validation dataset or test dataset reaches this value.

2. Convergence: It is possible to stop training when the model stops improving significantly after each training epoch or after a certain number of iterations. Various criteria can be used here, such as change in error between epochs, change in model weights or change in predicted values.

3. Time Limit Exceeded: Training can be stopped if the training time exceeds a given threshold.

4. Early Stopping: An early stopping strategy can be used in which training is stopped when the model performance on the validation dataset or test dataset starts to deteriorate after a certain number of training epochs.

5. Reaching a maximum number of epochs: Training can be limited to a certain number of epochs, after which the training process is stopped.

6. Specific number of iterations: In some cases, it is possible to specify a certain number of iterations after which the training will be stopped.

7. Loss control on training dataset: The training run can be stopped if the loss function on the training dataset reaches a certain value or stops decreasing.

The choice of stopping criterion depends on the specific task, the available data and the training algorithm. It is often recommended to use a combination of different criteria to achieve the best results.
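A minimal sketch of criterion 4 (early stopping) on an assumed toy regression problem; the data, learning rate and patience value are illustrative choices of mine, not something from the thread.

```python
# Early stopping: halt gradient descent once validation error stops improving.
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(400, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=1.0, size=400)
X_tr, y_tr, X_va, y_va = X[:300], y[:300], X[300:], y[300:]   # train / validation split

w = np.zeros(5)
lr, patience = 0.01, 10
best_mse, best_w, bad_epochs = np.inf, w.copy(), 0

for epoch in range(10_000):
    grad = 2 * X_tr.T @ (X_tr @ w - y_tr) / len(y_tr)   # gradient of training MSE
    w -= lr * grad
    val_mse = np.mean((X_va @ w - y_va) ** 2)
    if val_mse < best_mse - 1e-6:                        # improved on the validation set
        best_mse, best_w, bad_epochs = val_mse, w.copy(), 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:                       # criterion 4: early stopping
            print(f"stopped at epoch {epoch}, best val MSE = {best_mse:.3f}")
            break

w = best_w                                               # keep the best-validation weights
```

The weights that are kept are those of the epoch with the lowest validation error, i.e. the point just before the validation curve starts to turn upwards.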

Andrew, [26.10.2023 13:23]
can we say that learning is essentially optimisation, the goal of which is to reach a global extremum?

GPT Chat bot, [26.10.2023 13:23]
Yes, we can say that learning in machine learning can be considered as an optimisation process, the goal of which is to reach a global extremum. Training a model boils down to finding a combination of parameters (weights or coefficients) that minimises the loss function or maximises the evaluation function.

The loss function is a measure of the discrepancy between the predicted values of the model and the actual values. The goal is to find values of the model parameters at which the loss function reaches a minimum (global extremum) to achieve the best predictive ability of the model.

Optimisation algorithms such as stochastic gradient descent, genetic algorithms or Charged System Search are used to update the model parameters and find the optimal solution. These algorithms seek to find a global extremum or a good local extremum of the loss function to achieve the best model performance.

Thus, machine learning can be viewed as an optimisation, where the goal is to find a combination of model parameters that minimises the loss function and achieves the best model performance.
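Taking that statement literally, here is a minimal sketch (with an assumed toy dataset of my own) in which the model parameters are simply whatever argument minimises the loss function, found with a general-purpose optimiser rather than a dedicated training routine.

```python
# "Learning = optimisation": find the parameters that minimise the loss function.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=200)

def loss(params):                        # measure of discrepancy: mean squared error
    return np.mean((X @ params - y) ** 2)

result = minimize(loss, x0=np.zeros(3))  # search for the minimising parameters
print("fitted parameters:", np.round(result.x, 2))   # ~ [2, -1, 0.5]
print("loss at optimum  :", round(result.fun, 4))
```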