Machine learning in trading: theory, models, practice and algo-trading - page 3286

 
Maxim Dmitrievsky #:
Above I wrote about mathematical statistics. Before that I wrote about causal inference. Even earlier I wrote about oracle errors (markup, i.e. labeling, errors), where the data is marked up in a way you do not understand. What clearly follows from all of this is that results will vary across different training segments and training lengths. It depends on the data, which has not been provided or described.

That is what the experiment is about: which matters more, data volume or chronology.

The emphasis is specifically on quote data, not on a sample of some other observations that are independent of chronology. If time matters, then sample-splitting methods such as cross-validation should be used with caution.

Time matters if the market changes its behaviour significantly and irreversibly. In that case it should be impossible to obtain an acceptable model when training is validated on a sample that is far removed in chronological time.
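To illustrate the caution about cross-validation on quote data, here is a minimal Python sketch of a split in which the training rows always precede the test rows. It is an illustration only, not the code used in the experiment: the function name `chronological_folds`, the `gap` parameter and the row counts are all hypothetical, and the rows are assumed to be already sorted by time.

```python
from typing import Iterator, Tuple


def chronological_folds(
    n_rows: int, n_folds: int, gap: int = 0
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges where train always precedes test.

    Rows are assumed to be sorted by time. `gap` rows between the end of
    train and the start of test are skipped, so that labels overlapping
    the boundary do not leak into the test block.
    """
    if n_folds < 1 or n_rows < n_folds + 1:
        raise ValueError("need n_folds >= 1 and at least n_folds + 1 rows")
    block = n_rows // (n_folds + 1)
    for k in range(1, n_folds + 1):
        train_end = k * block
        test_start = train_end + gap
        # The last fold absorbs the remainder of the sample.
        test_end = n_rows if k == n_folds else (k + 1) * block
        if test_start >= test_end:
            continue  # the gap swallowed the whole test block
        yield range(0, train_end), range(test_start, test_end)


# Hypothetical sizes, for illustration only.
folds = list(chronological_folds(n_rows=1000, n_folds=4, gap=10))
for train, test in folds:
    print(f"train 0..{train[-1]}  test {test[0]}..{test[-1]}")
```

Unlike shuffled k-fold, no test row here is older than any training row, so the score cannot benefit from information that lies in the future relative to the training data.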

The markup issue itself is of course important, but it can be bracketed out here.

If you are really interested in the markup, it is based on the strategy of the Expert Advisor whose code is in my article.

I used the following settings:

A. Tester settings:

- Symbol: EURUSD

- Time Frame: M1

- Interval: 01.01.2010 to 01.09.2023

B. CB_Exp EA strategy settings:

- Period: 104

- Time Frame: 2 Minutes

- Moving Method: Smoothed

- Price Calculation Base: Close price


Predictors are the same as in that Expert Advisor.

 
Andrey Dik #:

I don't know, but it would be interesting to find out.

I'm very glad. So it's not for nothing that I voice my thoughts out loud here.

 
Aleksey Vyazmikin #:

That is what the experiment is about: which matters more, data volume or chronology.

Which of them matters more is not the problem. The markup is the problem. As long as you bracket that out, I'll bracket out your endeavours.

 
Aleksey Vyazmikin #:

The emphasis is specifically on quote data, not on a sample of some other observations that are independent of chronology. If time matters, then sample-splitting methods such as cross-validation should be used with caution.

I use walk-forward testing for the same reasons.
And yes, there is a strong dependence on the size of the training segment. For example, with 20000 rows something is found on the forward segment, but with 5000 or 100000 rows the result is random.
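A rough Python sketch of what a walk-forward check over several training sizes could look like. The three training sizes are the ones mentioned above; everything else (the function `walk_forward_windows`, the total row count, the forward segment length, the step) is an assumption made for illustration and not the actual setup.

```python
from typing import List, Optional, Tuple


def walk_forward_windows(
    n_rows: int, train_size: int, test_size: int, step: Optional[int] = None
) -> List[Tuple[range, range]]:
    """Rolling walk-forward: a fixed-size train window followed by a forward
    (test) window; both slide along the time-ordered sample by `step` rows."""
    if min(n_rows, train_size, test_size) <= 0:
        raise ValueError("sizes must be positive")
    step = step or test_size  # default: forward segments do not overlap
    windows = []
    start = 0
    while start + train_size + test_size <= n_rows:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        windows.append((train, test))
        start += step
    return windows


N_ROWS = 200_000   # hypothetical sample length
TEST_SIZE = 1_000  # hypothetical forward segment length

# Training sizes taken from the post; the counts show how many forward
# segments each size leaves available on the same sample.
counts = {
    size: len(walk_forward_windows(N_ROWS, size, TEST_SIZE))
    for size in (5_000, 20_000, 100_000)
}
print(counts)
```

Note that different training sizes leave different numbers of forward segments, so a fair comparison between sizes would probably have to be made on the forward segments they share.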

 
Forester #:

I use walk-forward testing for the same reasons.
And yes, there is a strong dependence on the size of the training segment. For example, with 20000 rows something is found on the forward segment, but with 5000 or 100000 rows the result is random.

But you can figure out why that is, can't you? ) Because of incorrect partitioning and randomly landing on the interval where it happens to be most correct :)

The word "random" already hints at mathematical statistics.
 
Maxim Dmitrievsky #:

Which of them matters more is not the problem. The markup is the problem. As long as you bracket that out, I'll bracket out your endeavours.

Everyone sees the problem differently and looks for their own solutions. I am looking for stability in data under any markup; this is a different direction that does not contradict yours.

For me the most important question is how to estimate stability with high probability, or how to monitor it from a small number of accumulated observations. That would make it possible to work with any markup.
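As one possible illustration of monitoring with few observations (a generic statistical check, not a method proposed in this thread), here is a Python sketch of an exact one-sided binomial test: given a hit rate expected from out-of-sample testing, it asks how likely the observed number of wins, or fewer, would be if nothing had changed. The names `binom_tail_le` and `degraded`, the 0.60 rate, the trade counts and the 0.05 threshold are all hypothetical.

```python
from math import comb


def binom_tail_le(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def degraded(wins: int, trades: int, expected_rate: float, alpha: float = 0.05) -> bool:
    """True if so few wins would be unlikely (probability below alpha)
    under the assumption that the true hit rate is still expected_rate."""
    return binom_tail_le(wins, trades, expected_rate) < alpha


# Hypothetical numbers: 60% hit rate expected, 30 new trades observed.
print(degraded(wins=9, trades=30, expected_rate=0.60))
print(degraded(wins=16, trades=30, expected_rate=0.60))
```

The check treats trades as independent and looks at the data once. If it is repeated after every new trade, false alarms accumulate, so a proper sequential procedure would be needed for continuous monitoring.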

 
Aleksey Vyazmikin #:

Everyone sees the problem differently and looks for their own solutions. I am looking for stability in data under any markup; this is a different direction that does not contradict yours.

For me the most important question is how to estimate stability with high probability, or how to monitor it from a small number of accumulated observations. That would make it possible to work with any markup.

It must be a cool direction to mark up cats as camels and look for stability in that.

I've gotten used to any dialogue with you coming to a mental dead end.

 
Maxim Dmitrievsky #:

It must be a cool direction to mark up cats as camels and look for stability in that.

I've gotten used to any dialogue with you coming to a mental dead end.

What difference does it make whether they are cats or camels if you can evaluate their usefulness outside of training?

Strange logic indeed.

 
Aleksey Vyazmikin #:

What difference does it make whether they are cats or camels if you can evaluate their usefulness outside of training?

😀
 
Forester #:

I use walk-forward testing for the same reasons.
And yes, there is a strong dependence on the size of the training segment. For example, with 20000 rows something is found on the forward segment, but with 5000 or 100000 rows the result is random.

If "something" is found, how long does it usually last outside of training?
