Machine learning in trading: theory, models, practice and algo-trading - page 3339

 
Maxim Dmitrievsky #:
statistical learning

Kozul [their slang for causal inference] is self-promotion, a new sticker on old trousers.

Maxim Dmitrievsky #:
Where is the statistical inference after resampling and CV? And the construction of the final classifier? Take this topic and develop it. This is the basis of kozul.

Tools for Creating Effective Models: comparing multiple models via resampling. Next should be something like statistical inference and unbiased model building.

We need statistical inference. It gives results that can be compared against the likes of RL and other methods.

Search in R for: statistical learning, weakly supervised learning, functional augmentation learning.


This is standard machine learning, and much of the book deals with these very issues, which are many years old and for which many tools have been invented. Part 3 of the book is called Tools for Creating Effective Models, with the following contents:

- 10 Resampling for Evaluating Performance

- 11 Comparing Models with Resampling

- 12 Model Tuning and the Dangers of Overfitting

- 13 Grid Search

- 14 Iterative Search

- 15 Screening Many Models

In addition there is Chapter 20, "Ensembles of Models", which explains how to build the final model.
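Chapters 10–11 in that list are exactly the "statistical inference after resampling" question raised above. The idea fits in a few lines; a minimal sketch in Python/scikit-learn (the book itself is R/tidymodels, so this is just an illustration of the principle, with an invented dataset):

```python
# Compare two candidate models on the same resamples, then check
# whether the mean difference is larger than its own noise.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Use the SAME folds for both models so the scores are paired.
cv = KFold(n_splits=10, shuffle=True, random_state=0)
rf_scores = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=cv)
lr_scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)

diff = rf_scores - lr_scores
print(f"RF mean accuracy: {rf_scores.mean():.3f}")
print(f"LR mean accuracy: {lr_scores.mean():.3f}")
# Paired mean difference and its standard error: the "statistical
# inference after resampling" the thread keeps asking for.
print(f"diff: {diff.mean():.3f} +/- {diff.std(ddof=1) / np.sqrt(len(diff)):.3f}")
```

If the mean difference is small relative to its standard error, the resamples do not support preferring one model over the other.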

We need statistical learning.

Need it? Please: CRAN Task View: Machine Learning & Statistical Learning

10 Resampling for Evaluating Performance | Tidy Modeling with R
  • Max Kuhn and Julia Silge
  • www.tmwr.org
The tidymodels framework is a collection of R packages for modeling and machine learning using tidyverse principles. This book provides a thorough introduction to how to use tidymodels, and an outline of good methodology and statistical practice for phases of the modeling process.
 
Ensembles are already closer to kozul; at least you can even out the bias, at the cost of increased variance.

But you will still have a lot of noise in the predictions (because the variance is larger), so what will you do with it? That is, the TS (trading system) even on the training set will have, say, only 60% profitable trades, and the same or less on the test.

Yeah, you will start stacking to correct this noise... well, try it.
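To put a number on that noise: the standard error of an observed win rate is sqrt(p(1-p)/n), so a "60% profitable trades" figure is only as solid as the trade count behind it. A quick back-of-the-envelope check (plain arithmetic, no trading data involved):

```python
# How noisy is an observed 60% win rate? Standard error of a
# binomial proportion: se = sqrt(p * (1 - p) / n).
import math

p = 0.60
for n in (50, 200, 1000):
    se = math.sqrt(p * (1 - p) / n)
    lo, hi = p - 1.96 * se, p + 1.96 * se
    print(f"n={n:4d} trades: 60% +/- {1.96 * se:.1%} (~{lo:.0%}..{hi:.0%})")
```

With only 50 trades the 95% interval spans well below 50%, i.e., the observed "edge" is indistinguishable from a coin flip.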
 
Maxim Dmitrievsky #:
These are tips for beginners; you need kozul and the ability to think

Here, go to the statistical office, don't crowd the front desk.

Can I get a summary of how to build the final model according to this book? I'm on my phone and can't look at it right now.

A model ensemble, where the predictions of multiple single learners are aggregated to make one prediction, can produce a high-performance final model. The most popular methods for creating ensemble models are bagging (Breiman 1996a), random forest (Ho 1995; Breiman 2001a), and boosting (Freund and Schapire 1997). Each of these methods combines the predictions from multiple versions of the same type of model (e.g., classification trees). However, one of the earliest methods for creating ensembles is model stacking (Wolpert 1992; Breiman 1996b).

Model stacking combines the predictions for multiple models of any type. For example, a logistic regression, classification tree, and support vector machine can be included in a stacking ensemble.

This chapter shows how to stack predictive models using the stacks package. We'll re-use the results from Chapter 15, where multiple models were evaluated to predict the compressive strength of concrete mixtures.

The process of building a stacked ensemble is:

  1. Assemble the training set of hold-out predictions (produced via resampling).
  2. Create a model to blend these predictions.
  3. For each member of the ensemble, fit the model on the original training set.
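Those three steps can be sketched directly. A minimal Python/scikit-learn analogue of what the R stacks package does (member models and dataset invented for the example; a sketch of the idea, not the book's implementation):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
members = [DecisionTreeClassifier(random_state=0),
           SVC(probability=True, random_state=0),
           RandomForestClassifier(random_state=0)]

# Step 1: assemble the training set of hold-out predictions
# (each column is one member's out-of-fold probabilities).
Z = np.column_stack([
    cross_val_predict(m, X, y, cv=5, method="predict_proba")[:, 1]
    for m in members
])

# Step 2: create a model (the meta-learner) to blend these predictions.
blender = LogisticRegression().fit(Z, y)

# Step 3: fit each member of the ensemble on the original training set.
for m in members:
    m.fit(X, y)

# Prediction flows members -> probabilities -> blender.
def stack_predict(X_new):
    z = np.column_stack([m.predict_proba(X_new)[:, 1] for m in members])
    return blender.predict(z)

print("stacked predictions:", stack_predict(X[:5]))
```

The key point is that the blender is trained only on hold-out predictions, so it never sees a member's in-sample (over-optimistic) output.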


20.5 Chapter Summary

This chapter demonstrated how to combine different models into an ensemble for better predictive performance. The process of creating the ensemble can automatically eliminate candidate models to find a small subset that improves performance. The stacks package has a fluent interface for combining resampling and tuning results into a meta-model.
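The "automatically eliminate candidate models" part is the interesting bit: stacks blends with a penalized (lasso-type) meta-model whose coefficients can shrink exactly to zero, pruning weak members out of the ensemble. A toy Python illustration of that principle (synthetic data and invented candidates, not the stacks implementation):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
y = rng.normal(size=300)

# Hold-out predictions from five hypothetical candidate models:
# two informative, three pure noise (all invented for the example).
Z = np.column_stack([
    y + rng.normal(scale=0.3, size=300),  # strong candidate
    y + rng.normal(scale=0.6, size=300),  # weaker candidate
    rng.normal(size=300),                 # noise
    rng.normal(size=300),                 # noise
    rng.normal(size=300),                 # noise
])

# The L1 penalty can set blending weights exactly to zero,
# dropping members from the final ensemble.
blender = Lasso(alpha=0.1).fit(Z, y)
kept = np.flatnonzero(blender.coef_ != 0.0)
print("members kept:", kept, "weights:", np.round(blender.coef_, 3))
```

The noise candidates get weight zero, so the production model only needs to carry the surviving members — relevant to the "too many models in production" complaint below.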



This is the author's view of the problem, but stacks is not the only R package for combining multiple models. For example, caretEnsemble: Ensembles of Caret Models.

20 Ensembles of Models | Tidy Modeling with R
  • Max Kuhn and Julia Silge
  • www.tmwr.org
 
We need ensembling and stacking, i.e., boosting over the classifiers. The ensemble removes bias and stacking removes variance. In theory it can work; in practice I haven't done it. And it will mean a lot of models, which is unpleasant in production.

Because when you get to production, you'll be stuck with a lot of models. And you want one or two.

Plus it doesn't solve the issue that you don't always need to be in the market: the model will be hammering away all the time. Because of these, let's say, nuances, the whole cycle from development to implementation breaks down.
The tester will be slow to test; everything will be slow, cotton-wool.
 
The book also seems to confuse ensemble and stacking. In short, it's a normal approach, but it can be cotton-wool in production. And it doesn't require a mountain of packages.

Oh, it also doesn't solve the most important problem: the markup (labelling).
 
Like the recent link to Vladimir's article. It is an example of the most cotton-wool TS creation: you do a lot of work and transformations, and the output is some model that you could get by random brute force without doing anything. It's interesting, but unproductive.
 
Maxim Dmitrievsky #:
everything will be slow, cotton-wool.
Maxim Dmitrievsky #:
The book also seems to confuse ensemble and stacking. In short, it's a normal approach, but it can be cotton-wool in production.
Maxim Dmitrievsky #:
Like the recent link to Vladimir's article. An example of the most cotton-wool TS creation.

What kind of cottoniness?

 
Forester #:

What kind of cottoniness?

A synonym for slow
 

I suggest we go back to kozul, statistical learning and reliable AI.

P.S.

Figure out the finer details of it.

 
Lorarica #:

I suggest we go back to kozul, statistical learning and reliable AI.

P.S.

Figure out the finer details of it.

If you want to go straight to the details, here's something to read on it
