Is there a pattern to the chaos? Let's try to find it! Machine learning on the example of a specific sample. - page 20

 
Aleksey Vyazmikin #:

And how do you interpret that - there is a pattern, but you won't find it? Or is the pattern in the randomness?

Just read carefully what it says.

There is nothing between the lines; it is verbatim and quite clear.

For the second time you have formed a question that does not correspond to what I wrote.

 
elibrarius #:

How do you do it without greed? You would have to calculate one more split for each split and select the pair at once, but then your calculation time increases 5000+ times. It's easier to average a hundred models.

I am thinking rather in the direction of quality, i.e. of additional evaluation criteria.

Another approach is weights for predictors that would enforce a consistent order of their application, rather than a rigid tree framework: something like "determine the time first", then "estimate the volatility" and "the current price position".

elibrarius #:

It is the correct way to reduce the influence of randomness. Otherwise you have to average 20-100 models, as in a forest.

They have another trick there, and I don't fully understand the process: at the start a tree is built on a truncated sample (unless it is forced onto the whole sample), and then the results in the leaves are computed on the whole sample. Apparently the splits are chosen on the subsample, while the weights in the leaves are computed on the whole sample.
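That split-on-subsample, leaves-on-full-sample scheme can be sketched in a few lines. Everything here is a hypothetical toy, not CatBoost internals: a single depth-1 "tree" whose threshold is chosen on a random subsample, while the leaf values are averaged over the whole sample.

```python
import random
import statistics

random.seed(0)
# Toy data: one feature x in [0, 1], target y with a step pattern at x = 0.5.
data = [(x, (1.0 if x > 0.5 else -1.0) + random.gauss(0, 0.5))
        for x in (random.random() for _ in range(1000))]

def split_gain(threshold, rows):
    # Variance reduction of a depth-1 split at the given threshold.
    left = [y for x, y in rows if x <= threshold]
    right = [y for x, y in rows if x > threshold]
    if not left or not right:
        return float("-inf")
    total = statistics.pvariance([y for _, y in rows])
    weighted = (len(left) * statistics.pvariance(left)
                + len(right) * statistics.pvariance(right)) / len(rows)
    return total - weighted

# Step 1: choose the split threshold on a truncated subsample only.
subsample = random.sample(data, 200)
best_t = max((t / 100 for t in range(1, 100)),
             key=lambda t: split_gain(t, subsample))

# Step 2: compute the leaf weights on the WHOLE sample.
left_leaf = statistics.mean(y for x, y in data if x <= best_t)
right_leaf = statistics.mean(y for x, y in data if x > best_t)
print(round(best_t, 2), round(left_leaf, 2), round(right_leaf, 2))
```

The split lands near 0.5 even though only a fifth of the data was used to find it, and the leaf weights then reflect the full sample.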

elibrarius #:

I.e. it turns out that the refining trees may be not the best, but randomly worse ones.

Hence the scatter in the models, from losing to profitable.

Nah, training always improves the result at each iteration, and on the train sample the models are always in the plus. Albeit with a spread in the financial result.

elibrarius #:

Judging by the distribution graphs, there are more losing models, i.e. if we average, the average result will be unprofitable.

Yes, but this is more of a special case.

elibrarius #:

Maybe set random-strength = 0? Hopefully changing the Seed will stop changing the model after that. Maybe it will build a model with genuinely better refining trees rather than randomly worse ones. If even the best model loses, then picking the randomly best of 10000 random models on this data is the road to losses on a real account.

I tried it with zero, but as I understand it the Score became the same for all the trees, and so equal ones were still chosen at random :) Or a random number generator is still used somewhere.
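As a toy illustration of that suspicion (purely hypothetical scores, not CatBoost code): with zero score noise, several candidate splits can tie on Score, and a tie must still be broken somehow, so the RNG, and with it the seed, keeps mattering.

```python
import random

def pick_split(scores, seed, noise=0.0):
    # Add optional noise to each candidate's Score (the "random_strength" idea),
    # then break exact ties with the same RNG.
    rng = random.Random(seed)
    noisy = [(s + rng.gauss(0, noise), i) for i, s in enumerate(scores)]
    best_value = max(noisy)[0]
    tied = [i for v, i in noisy if v == best_value]
    return rng.choice(tied)  # the RNG is still consulted on exact ties

scores = [0.7, 0.7, 0.7, 0.5]  # three candidate splits share the best Score
picks = {pick_split(scores, seed, noise=0.0) for seed in range(50)}
print(picks)  # with zero noise, different seeds still pick different ties
```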

elibrarius #:

Or still average several randomly selected models, as in a forest. Because the single best one may be overfitted.

I.e. select models across the three samples and then average them? Perhaps so. I don't want to move to ensembles of models yet; I still need to look into the possibility of improving the construction of the models themselves.

 
Aleksey Vyazmikin #:

Another approach is weights for predictors that would enforce a consistent order of their application, rather than a rigid tree framework: something like "determine the time first", then "estimate the volatility" and "the current price position".

I tried this: first dividing by day of the week and/or hour of the day. The models turned out worse than when the algorithm finds the first splits by itself. You can also do it this way: divide the sample into 5 parts by day of the week and train one model for each day. Or per hour, or whatever you like.
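A minimal sketch of that partitioning (toy data; the per-day "model" is just a class mean standing in for a real training run):

```python
import random
from collections import defaultdict
from statistics import mean

random.seed(1)
# Hypothetical sample: (weekday 0-4, target) pairs.
sample = [(random.randrange(5), random.gauss(0, 1)) for _ in range(500)]

# Split the sample into 5 parts by day of the week.
by_day = defaultdict(list)
for day, y in sample:
    by_day[day].append(y)

# Train one "model" per weekday (here just the mean of its subsample).
models = {day: mean(ys) for day, ys in by_day.items()}

def predict(day):
    return models[day]

print({d: round(m, 3) for d, m in sorted(models.items())})
```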

Aleksey Vyazmikin #:

Nah, training always improves the result at each iteration, and on the train sample the models are always in the plus. Albeit with a spread in the financial result.

It is clear that there will be improvement, but not the best one, given the randomisation of the Score.
I don't look at train at all, so it doesn't distract me. It will always look good.

Aleksey Vyazmikin #: I tried it with zero, but as I understand it the Score became the same for all the trees, and so equal ones were still chosen at random :) Or a random number generator is still used somewhere.

If the Score is now computed without randomisation and the results still differ, it means randomisation is used somewhere else.

Aleksey Vyazmikin #: I.e. select models across the three samples and then average them? Perhaps so. I don't want to move to ensembles of models yet; I still need to look into the possibility of improving the construction of the models themselves.

Not select, but take all the randomly generated ones. And average them. As in a forest: there, too, random trees are averaged. But you can also experiment with the best of the random ones.
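The forest-style recipe could look like this as a sketch (the "model" is a toy function whose random bias stands in for seed-induced variation of a real trained model):

```python
import random
from statistics import mean

def train_model(seed):
    # Stand-in for one randomly seeded training run: the seed leaves
    # a random bias in the resulting model.
    rng = random.Random(seed)
    bias = rng.gauss(0, 1)
    return lambda x: x + bias

# Do not cherry-pick the best seed: take ALL randomly generated models.
models = [train_model(seed) for seed in range(1, 101)]

def ensemble_predict(x):
    # Forest-style averaging over every model.
    return mean(m(x) for m in models)

# Individual models are biased by their seed; the average largely is not.
print(round(ensemble_predict(2.0), 3))
```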

 
elibrarius #:

I tried this: first dividing by day of the week and/or hour of the day. The models turned out worse than when the algorithm finds the first splits by itself. You can also do it this way: divide the sample into 5 parts by day of the week and train one model for each day. Or per hour, or whatever you like.

Yes, I have that option :) I managed to catch interesting results on one of the predictors and build a signalling strategy on it.

However, I'm talking about something else here: the priority with which the model selects predictors during training.

elibrarius #:

If the Score is now computed without randomisation and the results still differ, it means randomisation is used somewhere else.

Obviously :) The developer is still silent on this topic.

elibrarius #:

Not culling, but taking all the randomly generated ones in a row. And averaging them. As in a forest: there, too, random trees are averaged. But you can also experiment with the best of the random ones.

Such things can be done with a large Recall, or by pre-grouping the models by the similarity of the points at which they respond with the positive class; otherwise the averaged recommendation will almost always be negative.
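The pre-grouping step might be sketched like this (hypothetical model names and response sets): measure the overlap of the points where two models answer with the positive class, and only average within a sufficiently similar group.

```python
# Hypothetical models mapped to the sample indices where each one
# responds with the positive class.
responses = {
    "m1": {1, 2, 3, 10},
    "m2": {1, 2, 3, 11},
    "m3": {50, 51, 52},
}

def jaccard(a, b):
    # Overlap of two response sets: 1.0 means identical, 0.0 disjoint.
    return len(a & b) / len(a | b)

# Pair up models whose positive-class responses overlap strongly.
pairs = [(x, y) for x in responses for y in responses if x < y]
similar = [(x, y) for x, y in pairs
           if jaccard(responses[x], responses[y]) > 0.5]
print(similar)  # → [('m1', 'm2')]
```

Averaging m1 with m2 keeps their shared signal; averaging all three would mostly cancel into "no trade".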

 

I reduced the sample of the last experiment (the one with the predictors I selected earlier): I removed two years, 2014-2015, from the train sample.

If last time there were 208 models with a profit over 3000 on the exam sample, now there are 277. Is that a coincidence, or have the samples become more similar?

By the way, last time the average profit balance on the test sample was 982 points, while on the truncated sample it is 2115; on exam it is almost unchanged: -1114 vs -1214.


Any idea how else to improve the result?

 
Good afternoon, can you post the files from the first post, I also want to try an idea.
 
Aleksey Vyazmikin #:

If last time there were 208 models with a profit over 3000 on the exam sample, now there are 277. Is that a coincidence, or have the samples become more similar?

You have an extremely high dependence on Seed, i.e. on the RNG, on new data. With data that contains patterns, the picture would be like on train: everything in the plus with small deviations.
Try changing the initial Seed (try several variants). If the number of successful models varies just as much, then it is as random as the RNG sequence that changes with the seed.

Aleksey Vyazmikin #:

By the way, last time the average profit balance on the test sample was 982 points, while on the truncated sample it is 2115; on exam it is almost unchanged: -1114 vs -1214.

Strange that the average on test turned out to be > 0. Maybe you mean train? Test does not seem to participate in learning, or participates only indirectly, for selecting the model.

Aleksey Vyazmikin #:

Any ideas how to improve the result?

Most likely the model is overfitted or underfitted. And the last option: there are no patterns.
If it is overfitted, try reducing the number of trees, down to 1. If it is underfitted, you can increase the tree depth.
You have probably already tried the variant with more trees.
The lack of patterns is harder. If you didn't find them with 5000+ predictors, I don't even know how else to look for them. I don't know how you came up with those 5000+ either; I haven't dug in that direction yet, since it takes much longer to calculate. But I guess I'll have to, since my results are also about 50/50 on the OOS.
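That tuning direction can be sketched as a small grid search; the validation function here is a made-up stand-in for "train on train, score on test":

```python
import itertools

def validate(trees, depth):
    # Hypothetical stand-in for a real train/test run; by construction
    # it peaks at a mid-size model (20 trees, depth 6).
    return -(trees - 20) ** 2 - 10 * (depth - 6) ** 2

# Scan trees down to 1 (against overfitting) and depth up
# (against underfitting); keep the best validated configuration.
grid = itertools.product([1, 5, 10, 20, 50, 100], [2, 4, 6, 8, 10])
best = max(grid, key=lambda cfg: validate(*cfg))
print(best)  # → (20, 6)
```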

 

By the way, do you build the balance line with time on the horizontal axis, or just with an even step between deals? Judging by the charts, the second.

Here is an example:

The top one has 3000+ trades, the bottom one 600+. If you plot them simply with an even step, you get beautiful trends. But here you can see that over 5 years trading happened on only a few days, and there is no sense in building a robot that will sleep for months or years. You will just switch it off.
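The difference between the two plotting styles can also be checked numerically; a sketch with hypothetical trade dates that exposes the multi-month gap an even-step chart would hide:

```python
from datetime import date, timedelta

# Hypothetical trades: (day offset from 2015-01-01, profit in points).
trades = [(date(2015, 1, 1) + timedelta(days=d), p)
          for d, p in [(0, 10), (1, 12), (2, -5), (900, 8), (901, 11)]]

# Build the balance curve keyed by calendar date, not by trade index.
balance = []
total = 0
for when, profit in trades:
    total += profit
    balance.append((when, total))

# An even-step chart hides this; the time axis exposes it.
gap_days = max((b[0] - a[0]).days for a, b in zip(balance, balance[1:]))
print(gap_days)  # → 898
```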

The picture is right on the topic of black and white swans. As you can see, ML "bites" on them readily and adjusts to them when they are present.
@fxsaber has also researched this matter: https://www.mql5.com/ru/blogs/post/749420
I completely agree with him that the influence of both white and black swans needs to be removed.

While in the optimiser you can apply a custom criterion and somehow select other variants, in ML there are only the standard criteria for selecting splits, and here you can only cut pieces out of the history. The problem is that the moment of a white swan is unknown before the model is trained. And if you cut it out, the model becomes completely different and may have a white swan of its own. We have to think and experiment...

Фильтр белых лебедей (White swan filter).
  • www.mql5.com
In any research, preparation of the source data comes first. On financial markets this is almost always a history of quotes. Depending on the source, it may have certain peculiarities. Today
 
RomFil #:
Good afternoon, can you post the files from the first post, I also want to try an idea.

Hello. Yes, I will try to post it today.

 
elibrarius #:

You have an extremely high dependence on Seed, i.e. on the RNG, on new data. With data that contains patterns, the picture would be like on train: everything in the plus with small deviations.
Try changing the initial Seed (try several variants). If the number of successful models varies just as much, then it is as random as the RNG sequence that changes with the seed.

For each model the Seed changes sequentially, from 1 to 10000; that is the whole point of generating different models. What happens if the Seed is not fixed and is drawn from the whole space (how it is generated is also a question), I don't know; it can be checked.
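The sweep being described might be sketched like this. The "exam profit" of a model is a made-up random stand-in (the real training run is external), which is exactly the null hypothesis under discussion: the seed-to-seed spread comes from the RNG, not from a stable pattern.

```python
import random
from statistics import mean, pstdev

def exam_profit(seed):
    # Hypothetical stand-in for "train a model with this Seed and
    # measure its balance on the exam sample".
    return random.Random(seed).gauss(0, 3000)

# Sweep Seed sequentially from 1 to 10000, one model per seed.
profits = [exam_profit(seed) for seed in range(1, 10001)]
good = sum(1 for p in profits if p > 3000)

# A wide spread with only a minority of "good" seeds and a near-zero
# mean is what pure randomness looks like.
print(good, round(mean(profits)), round(pstdev(profits)))
```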

What is the basis for the statement that the result should be similar to test? I assume the samples are not homogeneous: they do not contain a comparable number of similar examples, and I think the probability distributions over the quanta differ somewhat.

Yes, suppose the model was built by chance; but does that mean it does not describe the regularity it has identified?

elibrarius #:

Strange that the average on test turned out to be > 0. Maybe you mean train? Test does not seem to participate in learning, or participates only indirectly, for selecting the model.

It participates only in controlling the stopping of training: if there is no improvement on test while training on train, training stops and trees are removed back to the point of the last improvement of the model on test.

It can be exactly the case that there is no improvement on test, but the deterioration is small while more examples on the train sample are being generalised, and the learning algorithm orders this to stop. If we disable this feature, another question opens up: how many trees to use in the model. I am thinking about another option: train a fixed number of trees and then truncate the model using the test sample, but that means calculating the balance at every step, which is certainly expensive.

Now I think it would be good to make training stop not on one sample, but on a set of subsamples, which would give a check on the persistence of the pattern over time.
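Both ideas, truncating a fixed-size model by the test metric and judging by several subsamples instead of one, can be sketched together; all metric values below are hypothetical:

```python
# Per-iteration test metric on three time subsamples (hypothetical values):
# row = subsample, column = number of trees so far.
metric_by_subsample = [
    [0.50, 0.55, 0.61, 0.64, 0.63, 0.65, 0.62, 0.60],  # subsample 1
    [0.48, 0.54, 0.60, 0.62, 0.64, 0.58, 0.57, 0.55],  # subsample 2
    [0.51, 0.53, 0.59, 0.63, 0.62, 0.61, 0.60, 0.59],  # subsample 3
]

def worst_case(i):
    # Judge iteration i by its WORST subsample, so the pattern must
    # hold across all time periods, not just one test sample.
    return min(row[i] for row in metric_by_subsample)

# Train a fixed number of trees, then truncate to the best iteration.
best_iter = max(range(len(metric_by_subsample[0])), key=worst_case)
n_trees_kept = best_iter + 1  # keep trees up to that point, drop the rest
print(n_trees_kept)  # → 4
```

A single test sample would have kept 6 trees (the 0.65 peak on subsample 1); the worst-case criterion stops earlier, where all three periods still agree.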

elibrarius #:

Most likely the model is overfitted or underfitted. And the last option is that there are no patterns.

If it is overfitted, try reducing the number of trees, down to 1. If it is underfitted, you can increase the tree depth.
You have probably already tried the variant with a large number of trees.
The lack of patterns is harder. If you didn't find them with 5000+ predictors, I don't even know how else to look for them. I don't know how you came up with those 5000+ either; I haven't dug in that direction yet, since it takes much longer to calculate. But I guess I'll have to, since my results are also about 50/50 on the OOS.

Apparently I did not clearly indicate the sample I used: it is the sixth (last) sample from the experiment described here, so there are only 61 predictors.

As I pointed out above, yes, the models are not fully trained in the sense that they do not describe the entire train sample, and that is generally normal: the market is changing, all combinations simply cannot be present, and each subsequent sample will contain a different number of them, possibly with a different average outcome. We are not working with a representative sample, so we cannot expect a complete description; my goal is to extract a couple of stable patterns.

As for the trees, there is a learning-rate setting (--learning-rate) that is related to the number of trees: the higher the rate, the fewer trees are needed to describe the sample. It turns out that if you increase the rate (to 0.3), more models pass the conditional filtering, sometimes more than twice as many; the latest experiments use exactly these settings, and the average number of trees is 10, with a depth of 6 splits. Trees in CatBoost are somewhat different: a symmetric tree has one split per level, which makes applying them faster than the classic variant, but individually they are less informative. In the latest releases you can use classical trees, but I have no interpreter for such models in MQL5, so I don't use them and don't get upset about it.
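The speed advantage of symmetric (oblivious) trees comes from one split serving a whole level, so a leaf index is just a bit mask of the per-level comparisons; a sketch with hypothetical features and thresholds:

```python
def oblivious_predict(features, level_splits, leaf_values):
    # One (feature, threshold) pair per level of the symmetric tree:
    # each comparison contributes one bit of the leaf index.
    leaf = 0
    for feature_idx, threshold in level_splits:
        leaf = (leaf << 1) | (features[feature_idx] > threshold)
    return leaf_values[leaf]

# Depth 3 -> 2**3 = 8 leaves; splits and leaf values are hypothetical.
splits = [(0, 0.5), (1, 10.0), (0, 0.8)]
leaves = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
print(oblivious_predict({0: 0.9, 1: 5.0}, splits, leaves))  # → 0.6
```

Evaluation is three comparisons and one table lookup with no branching along a path, which is why applying such a model is fast; the price is that all nodes on a level must share one split.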

In general, I can add more predictors: right now they are used on only 3 timeframes, with a few exceptions. I think a couple of thousand more can be added, but whether all of them will be used properly in training is doubtful, given that 10000 seed variants on 61 predictors give such a spread...

And of course you need to pre-screen predictors, which will speed up training.

elibrarius #:

By the way, do you build the balance line with time on the horizontal axis, or just with an even step between trades? Judging by the charts, the second.

Here is an example:

The top one has 3000+ trades, the bottom one 600+. If you plot them simply with an even step, you get beautiful trends. But here you can see that over 5 years trading happened on only a few days, and there is no sense in building a robot that will sleep for months or years. You will just switch it off.

The balance is built sequentially, without taking the calendar chronology into account. Yes, I see what you mean, but in my concept that is a matter for the later stages of preparing patterns for trading.

elibrarius #:

The picture is right on the topic of black and white swans. As you can see, ML "bites" on them readily and adjusts to them when they are present.

It is logical that outliers are deviations; I just think these are inefficiencies that should be learnt by removing the white noise. In other areas simple primitive strategies often work, especially in flat sections of the market.