Machine learning in trading: theory, models, practice and algo-trading - page 302

 
Andrey:

This is an interesting thread. A lot of idle chatter, but some smart thoughts too. Thank you.


+
 
Alexander Ivanov:
))) The main thing is the communication and the process. It seems some people are already building neural bots. I would like to try it.

Unfortunately, the barrier to entry into the subject is very high. The field of ML itself is old enough, and the sheer number of its various branches and methods tends to infinity.

And if you have not dealt with it before, you can easily drown in this sea of information). I do not want to pick things up piecemeal; I still need some systematic approach to the tool.

But so far I have not found any coherent, systematic information.

 
SanSanych Fomenko:


For me the prediction error is not the main problem. For me, the main problem is overfitting the model. Either I have at least faint evidence that the model is NOT overfitted, or the model is not needed at all.

I have written many times in this thread (and others, too) about diagnosing overfitting and the tools for fighting it. In short: it comes down to clearing the input predictors of noise, and the model itself is of secondary importance.

Everything else is of no interest to me, because any result that ignores overfitting is just whatever happened to work now; maybe it works tomorrow, and the day after tomorrow the deposit is blown.

Well, overfitting is when the score diverges between the training set and the test set; if the score on the test (outside the training sample) is good, then everything is OK. In a sense overfitting cannot be avoided at all; it can only be reduced to an acceptable level.


PS: Mr. Mihail Marchukajtes was offered to prove how cool the Reshetov classifier is; you can try it too. I wonder if anyone will manage to squeeze more than 65% accuracy out of this data) ))
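
SanSanych does not post his noise-screening code in this thread; purely as a minimal illustration of the idea (clear the predictor set of noise before worrying about the model), here is one common proxy, sketched in Python with hypothetical names and an arbitrary threshold. It keeps only those predictors whose out-of-sample permutation importance clearly exceeds its own noise band; this is an assumption-laden stand-in, not his actual procedure.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

def screen_noise_predictors(X, y, threshold_sigmas=2.0, seed=0):
    """Boolean mask of predictors whose out-of-sample permutation
    importance stands clearly above zero (the rest are treated as noise)."""
    # Time-ordered split: no shuffling, so the test part is "the future".
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, shuffle=False)
    model = RandomForestClassifier(n_estimators=300, random_state=seed)
    model.fit(X_tr, y_tr)
    imp = permutation_importance(
        model, X_te, y_te, n_repeats=30, random_state=seed)
    # Keep a predictor only if its mean importance exceeds the noise band
    # (threshold_sigmas is a hypothetical cutoff, not a recipe from the thread).
    return imp.importances_mean > threshold_sigmas * imp.importances_std
```

Predictors that fail the mask are dropped before the final model is fitted; the point, as above, is that this filtering matters more than the choice of model.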

 
toxic:

Well, overfitting is when the score diverges between the training set and the test set; if the score on the test (outside the training sample) is good, then everything is OK. In a sense overfitting cannot be avoided at all; it can only be reduced to an acceptable level.


PS: Mr. Mihail Marchukajtes was offered to prove how cool the Reshetov classifier is; you can try it too. I wonder if anyone will manage to squeeze more than 65% accuracy out of this data) ))


The tester is a kind of finishing-touch work. What we actually need are confidence intervals for the trading system's performance.
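
No code for such intervals is given in the thread; a minimal sketch of one standard approach, a percentile bootstrap confidence interval over the per-trade returns of a backtest (the input array and all parameters here are hypothetical), could look like this:

```python
import numpy as np

def bootstrap_ci(trade_returns, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean per-trade return."""
    rng = np.random.default_rng(seed)
    r = np.asarray(trade_returns, dtype=float)
    # Resample the trades with replacement and record each sample's mean.
    means = np.array([rng.choice(r, size=len(r), replace=True).mean()
                      for _ in range(n_boot)])
    lo, hi = np.percentile(means, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi
```

If the whole 95% interval sits above zero, the system's edge is at least not an obvious artifact of a few lucky trades. Note that the plain bootstrap assumes the trades are roughly independent; with strong serial dependence a block bootstrap would be more appropriate.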
Yuriy Asaulenko:

Unfortunately, the barrier to entry into the subject is very high. The field of ML itself is old enough, and the sheer number of its various branches and methods tends to infinity.

And if you have not dealt with it before, you can easily drown in this sea of information). I do not want to pick things up piecemeal; I still need some systematic approach to the tool.

But so far I have not found any coherent, systematic information.


That is not so.

A systematic approach means working through EVERYTHING: preparing the input data, fitting the model(s), and evaluating that model.

To a first approximation, Rattle gives you all of this; you can look at it and play with it. If you take my article, the effort is minimal (a couple of hours for everything), since it contains not only instructions but also data for the exercises.
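
The article itself works in Rattle, a point-and-click R GUI, so there is no code to copy here; purely as a language-neutral illustration of the same three-step discipline (prepare the data, fit the model, evaluate it out of sample), a sketch in Python with hypothetical file and column names might look like this. Comparing the two printed scores is also exactly the train/test divergence check discussed above.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# 1. Prepare the data (hypothetical OHLC history file).
df = pd.read_csv("quotes.csv")
df["ret"] = df["close"].pct_change()
df["target"] = (df["ret"].shift(-1) > 0).astype(int)  # next-bar direction
df = df.dropna()

features = ["ret"]                  # placeholder predictor set
split = int(len(df) * 0.7)          # time-ordered split, no shuffling
train, test = df.iloc[:split], df.iloc[split:]

# 2. Fit the model.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(train[features], train["target"])

# 3. Evaluate: a large gap between the two scores is the overfitting signal.
print("in-sample accuracy:",
      accuracy_score(train["target"], model.predict(train[features])))
print("out-of-sample accuracy:",
      accuracy_score(test["target"], model.predict(test[features])))
```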

 
Yuriy Asaulenko:

Unfortunately, the barrier to entry into the subject is very high. The field of ML itself is old enough, and the sheer number of its various branches and methods tends to infinity.

And if you have not dealt with it before, you can easily drown in this sea of information). I do not want to pick things up piecemeal; I still need some systematic approach to the tool.

But so far I have not found any coherent, systematic information.

Our world is set up so that the profitability of a topic is a monotonic function of the height of its barrier to entry. The higher the barrier (not necessarily conceptual complexity; it can be a money threshold, social connections, geographical location, etc.), the more potentially profitable the business.


What is easy for many to take on, as a rule, costs little and cannot even feed an adult, let alone any kind of extras.

 
toxic:

Our world is set up so that the profitability of a topic is a monotonic function of the height of its barrier to entry. The higher the barrier (not necessarily conceptual complexity; it can be a money threshold, social connections, geographical location, etc.), the more potentially profitable the business.

What is easy for many to take on, as a rule, costs little and cannot even feed an adult, let alone any kind of extras.

This is certainly true. But a high barrier to entry also increases all sorts of risks, not necessarily financial ones.
 
toxic:

What is easy for many to take on, as a rule, costs little and cannot even feed an adult...

++

or, more precisely, a thing that is worth nothing at all.

 
I'm watching this thread and realize that it has gone off track...
 
mytarmailS:
I'm watching this thread and realize that it has gone off track...

The very fact that it exists is surprising)))

The topic is the kind of thing it is harmful to talk about out loud and in detail, so...

 
SanSanych Fomenko:

That is not so.

A systematic approach means working through EVERYTHING: preparing the input data, fitting the model(s), and evaluating that model.

To a first approximation, Rattle gives you all of this; you can look at it and play with it. If you take my article, the effort is minimal (a couple of hours for everything), since it contains not only instructions but also data for the exercises.

By a systematic approach, though, I mean an understanding of what you are doing and, accordingly, the ability to plan and predict the results of your actions.

Thanks for the article. Since I am not familiar with any particular software, it is perfect for a newbie: simple and understandable. The only thing I have not figured out is which method it uses, regression or classification?
Naturally, I immediately began trying it on my own systems. If some question proves difficult, then, God willing, it will be resolved in the course of play.
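
On the regression-vs-classification question: in practice it is mostly a choice of target variable over the same predictors. A minimal sketch (the prices below are made-up placeholders):

```python
import numpy as np
import pandas as pd

# Hypothetical candle closes; in practice these come from your history.
close = pd.Series([1.1000, 1.1012, 1.1005, 1.1021, 1.1018])

ret_next = close.pct_change().shift(-1)   # next candle's return

y_regression = ret_next                   # regression target: how much
y_classification = np.sign(ret_next)      # classification target: which way
# The predictors can be identical; only the target (and hence the model
# type and its error metric) changes.
```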

1. I don't use candlesticks to enter and exit, only the stream of quotes; candlesticks exist only in the history, from the previous candle backwards. For training this is fine, let it learn from candlesticks, but how to make Rattle swallow the quote stream inside the current candle is still a mystery to me. The intra-candle flow has to be analyzed somehow.

2. What to do with predictors that get rebuilt on the fly? For example, regression lines and their sigmas. You cannot even paste them into the history (for training); you need functions that compute them on the fly and remove their traces from the history (see the sketch at the end of this post).

3. Similarly, there are flickering predictors that do not always exist and are built from particular points of the series; in general, they too can be rebuilt in the course of play.

4. And the question of normalizing the predictors from items 2 and 3: it is fundamentally impossible.

And the history for such predictors has to be computed on the fly, both during training and in live operation.

So far, we have nothing but confusion.
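
As a sketch for item 2 above: the regression line and its sigma can be computed strictly causally, from a sliding window that ends at the current bar, so exactly the same function serves both for building the training history and for live operation. Everything here (window length, function and column names) is a hypothetical placeholder, not a recipe from the thread.

```python
import numpy as np
import pandas as pd

def rolling_regression_features(close: pd.Series,
                                window: int = 50) -> pd.DataFrame:
    """Slope and residual sigma of a linear fit over the last `window`
    bars, using only data available at each bar (no look-ahead)."""
    x = np.arange(window)
    slopes, sigmas = [], []
    for i in range(len(close)):
        if i + 1 < window:                      # not enough history yet
            slopes.append(np.nan)
            sigmas.append(np.nan)
            continue
        y = close.iloc[i + 1 - window : i + 1].to_numpy()  # past bars only
        slope, intercept = np.polyfit(x, y, 1)
        resid = y - (slope * x + intercept)
        slopes.append(slope)
        sigmas.append(resid.std())              # sigma around the line
    return pd.DataFrame({"slope": slopes, "sigma": sigmas},
                        index=close.index)
```

Because row i uses only bars up to and including i, nothing is pasted into the history with hindsight: the trace the predictor leaves is the same whether it is rebuilt during training or bar by bar in live trading.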