Machine learning in trading: theory, models, practice and algo-trading - page 3170

 
mytarmailS #:
If you get the same profit on OOS as on training, it means that this effect (a directed loss on OOS) is specific to real markets, and we can build further hypotheses from there

fxsaber, 2023.08.16 11:38 AM

This is the kind of nonsense that happens. On the left, OOS passes; on the right, it does not. And the right side "dives" almost immediately.

Can you see that OOS passes on the left?

 
fxsaber #:

Can you see the OOS going through on the left?

We're talking about the effect on the right

Duplicate the experiment completely, but with synthetic data.


======================================

The OOS on the left is also a fit, just a second-order one


Imagine you have only 1000 variants of a TS (trading system) in total.


your steps 1 and 2

1) You start to optimise/search for a good TS; this is the train data (fitting/searching/optimisation).

Let's say you've found 300 variants where the TS makes money...

2) Now, out of these 300 variants, you look for a TS that will pass OOS; this is the test data. Say you have found 10 TSs that earn both on train and on test (OOS).


So what is step 2?

It is simply a continuation of the fitting. Your search (fitting/searching/optimisation) has only become a little deeper or more complex, because now you have not one optimisation condition (pass train) but two (pass test + pass train).
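
The argument above can be checked with a small simulation. This is only a sketch under my own assumptions: every strategy is a pure coin flip with no edge, each one gets three independent segments (train, test, and a third unseen one), and the name simulate and all parameter values are made up for illustration.

```python
import random

def simulate(n_strategies=1000, n_trades=200, seed=0):
    """Select zero-edge strategies by train profit, then by OOS profit,
    and report how the survivors do on a third, untouched segment."""
    rng = random.Random(seed)

    def segment_pnl():
        # Sum of +1/-1 coin-flip trades: a strategy with no real edge.
        return sum(rng.choice((-1, 1)) for _ in range(n_trades))

    # One (train, test, unseen) result triple per strategy variant.
    results = [(segment_pnl(), segment_pnl(), segment_pnl())
               for _ in range(n_strategies)]

    step1 = [r for r in results if r[0] > 0]  # step 1: profitable on train
    step2 = [r for r in step1 if r[1] > 0]    # step 2: also profitable on test
    mean_unseen = sum(r[2] for r in step2) / len(step2) if step2 else 0.0
    return len(step1), len(step2), mean_unseen

if __name__ == "__main__":
    print(simulate())
```

Plenty of variants survive both steps even though none of them has an edge, and their average result on the unseen segment stays near zero. Passing the test here was just a second selection condition, not evidence of a pattern.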

 
fxsaber #:

This is the picture seen by almost every user of the tester. I am interested in the explanation.

In this picture, the statistical significance is quite high: more than 3000 non-overlapping positions.

I assume that this is the effect of market changes within Sample itself. For example, Sample had a real pattern in the beginning and then nothing. But the fitting happened for the whole Sample.

We should somehow avoid such breakdowns within Sample.


The opposite effect can also happen: on the left OOS goes down, on the right it goes up. I.e. the initial piece of Sample contained no pattern, only fitting.

OOS should always be to the RIGHT.

If the OOS is on the LEFT, there is no way to guarantee that the TS is NOT overfitted and is NOT looking ahead. These are the first major issues to address when testing a TS, BEFORE anything else.


Which one do you have? It makes no difference! It doesn't matter if it is either one of them or both of them. You need to test correctly and basta - OOS on the right.

And it's better to forget about the tester and form files for testing as follows:


We have two files.


The first file is divided randomly (by sampling) into three parts: training, testing and validation. Train on the (random) training sample, then check on the random testing and validation samples; these are all DIFFERENT pieces of the first file. Compare the results. If they are approximately equal, check on the second file, which keeps the "natural sequence". If the results are approximately equal there too, we get the main conclusion: our TS is NOT overfitted and does NOT look ahead. Only with this conclusion in hand does it make sense to talk about anything else: accuracy, profitability and so on, all of which are SECONDARY.
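
A minimal sketch of how such files could be formed. The assumptions are mine: the rows arrive in chronological order, the second file is taken as the most recent stretch (so the sequential check sits on the right), and the function name and split fractions are purely illustrative.

```python
import random

def make_two_files(rows, holdout_frac=0.3, parts=(0.6, 0.2), seed=0):
    """rows must be in chronological order.

    Returns (train, test, validation, sequential):
    the first three are random, non-overlapping pieces of the first file,
    the last one is the second file with its natural order preserved.
    """
    cut = len(rows) - round(len(rows) * holdout_frac)
    first, second = list(rows[:cut]), list(rows[cut:])

    # Shuffle only the first file; the second one is never touched.
    random.Random(seed).shuffle(first)

    a = round(len(first) * parts[0])      # end of the training part
    b = a + round(len(first) * parts[1])  # end of the testing part
    return first[:a], first[a:b], first[b:], second
```

Train on the first piece, compare the metric on the second and third, and only if they roughly agree run the model over the fourth piece in order.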


I note that there are actually no other ways to check for look-ahead and overfitting.

 
fxsaber #:

This is the kind of thing that happens. On the left OOS passes, on the right - not. And on the right side, it literally "dives" immediately.


It happens most of the time.

I.e. a significant dive, literally straight away. The nature of the dive is not clear. I would expect something close to a random walk (SB), but I see this picture too often.


It feels like if you ran the inverted TS after optimisation, you might not even lose.

P-hacking (or data-dredging) is a statistical practice in which a researcher analyses data until he finds a statistically significant result. He or she may change the parameters of the analysis, select only certain data, or make multiple comparisons to find significant relationships or differences in the data. This can lead to false positives and distort scientific conclusions. P-hacking is a form of scientific dishonesty and can lead to incorrect recommendations and decisions based on false premises.
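
To put a rough number on this: if every variant tried is an independent test with a false-positive chance of alpha (independence and the 0.05 default are my simplifying assumptions, and the function name is made up), the chance of at least one spurious "significant" result grows quickly with the number of variants.

```python
def prob_false_discovery(n_variants, alpha=0.05):
    """Chance that at least one of n independent tests passes by luck alone."""
    return 1.0 - (1.0 - alpha) ** n_variants

if __name__ == "__main__":
    for n in (1, 14, 100, 1000):
        print(n, prob_false_discovery(n))
```

Under these assumptions, a little over a dozen variants already push the chance of a lucky pass above one half, and an optimiser goes through far more than that.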


***as you rightly point out, the reverse can also happen
 
fxsaber #:

Can you see the OOS going through on the left?

If the training period is reduced, will the chart trend reversal occur as quickly?

I don't know much about tick strategies, but one of the factors for this behaviour is the lack of comparable data during training, for example - the training was mostly trending down on some TF.

I don't know what training method you are using. If it is tree-based models or filters that simply restrict the range of some indicator (function), it is worth estimating the number of examples that fall into each such range.
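
Counting the examples per range could look roughly like this. It is a sketch only: the range edges are assumed to be known and sorted, and count_per_range is a name I made up.

```python
from bisect import bisect_right
from collections import Counter

def count_per_range(values, edges):
    """Count how many values fall into each range defined by sorted edges.

    Range 0 is everything below edges[0], range i covers
    edges[i-1] <= v < edges[i], and the last range is v >= edges[-1].
    """
    counts = Counter(bisect_right(edges, v) for v in values)
    return [counts.get(i, 0) for i in range(len(edges) + 1)]

if __name__ == "__main__":
    indicator = [1, 5, 5, 12, 30, 70]
    print(count_per_range(indicator, [10, 50]))  # [3, 2, 1]
```

Running it separately on the training period and on the OOS period shows which ranges the model saw only a handful of times.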

One possible cause is data drift: a shift in the distribution of outcome probabilities for the filter/leaf.

For example, when I select quantum segments on a training sample and then estimate their distribution (the percentage of correct and incorrect responses to the 0/1 target) on two other samples, only 25%-30% of them meet the stability criterion on all 3 samples. Clearly, in this case the model is more likely to pick an unstable predictor, which will stop working on one of the sections.
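
One way such a check might be written. The post does not define the stability criterion, so the one here is purely an assumption of mine: a segment counts as stable if its share of target = 1 beats the base rate by at least min_lift on every sample. All names are illustrative.

```python
def hit_rate(sample, lo, hi):
    """Share of target == 1 among rows whose predictor value is in [lo, hi).

    sample is a list of (value, target) pairs; returns None if no row falls
    into the segment.
    """
    hits = [target for value, target in sample if lo <= value < hi]
    return sum(hits) / len(hits) if hits else None

def is_stable(samples, lo, hi, base_rate, min_lift=0.05):
    """True if the segment beats base_rate by min_lift on every sample."""
    rates = [hit_rate(s, lo, hi) for s in samples]
    return all(r is not None and r - base_rate >= min_lift for r in rates)

if __name__ == "__main__":
    train = [(0.1, 1), (0.2, 1), (0.3, 0), (0.9, 0)]
    test = [(0.1, 1), (0.4, 1), (0.8, 0)]
    exam = [(0.2, 0), (0.3, 0), (0.7, 1)]
    print(is_stable([train, test], 0.0, 0.5, base_rate=0.5))        # True
    print(is_stable([train, test, exam], 0.0, 0.5, base_rate=0.5))  # False
```

In the toy data the segment looks fine on train and test and falls apart on exam, which is exactly the kind of predictor that stops working on one of the sections.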

In the end, everything comes down to analysing simple regularities, namely, searching for reasons to consider them as such, rather than random observation of a comet's tail in a telescope.

 
fxsaber #:

Can you see the OOS going through on the left?

How long does the system remain profitable?

I have encountered similar behaviour, where there is a sharp plunge on the OOS on the right. I do not think it is directly connected with a sharp 180-degree reversal of the market patterns that were found: it would be strange, to say the least, for a sharp plunge to always happen right after the end of training, and that would point to causes of a mystical nature (voodoo practices and the like) rather than to real problems such as overfitting or curve fitting. Usually it is due to errors in the code that cause false positives (or false negatives), as Max said above. Correcting them leads to random behaviour on the OOS on the right in the worst case (overfitting), or to a gradual fading of profitability in the best case (the found patterns fade and/or gradually change).

 
Andrey Dik #:

how long does the system stay profitable?

I have encountered similar behaviour, where there is a sharp plunge on the OOS on the right. I do not think it is directly connected with a sharp 180-degree reversal of the market patterns that were found: it would be strange, to say the least, for a sharp plunge to always happen right after the end of training, and that would point to causes of a mystical nature (voodoo practices and the like) rather than to real problems such as overfitting or curve fitting. Usually it is due to errors in the code that cause false positives (or false negatives), as Max said above. Correcting them leads to random behaviour on the OOS on the right in the worst case (overfitting), or to a gradual fading of profitability in the best case (the found patterns fade and/or gradually change).

And if the TS has many parameters or is very tightly fitted, the dips are always sharp, because it was working on a razor-thin margin. A large number of parameters leads to growing errors; they add up. Even if you just coarsen the TS and use fewer parameters, it looks less pretty in the tester, but it collapses more smoothly.

An analogy can be drawn with poker on a martingale. There is a large number of failed positions. Replace that with a large number of failed parameters or something else. The result is the same.

Because p-hacking doesn't fix the problem, it sweeps it under the rug, reducing bias at the price of higher variance or vice versa. The errors are still there, just hidden.
 

I added visualisation of graphs on test and exam subsamples, and cut train - removed the initial piece so that the pictures would be compatible.

In fact, these are time sequential sections of train->test->exam.

After looking at the gif, it becomes clear that on the test and exam samples the amplitude of the oscillation shrank rather than a trend appearing in either direction.

However, if you look closely, you can see that at some iterations there is an improvement on these samples, i.e. we can assume that these are the rules (in the form of quantum segments) that show stability on different samples. It can also be noted that different sections change differently from iteration to iteration, i.e. improvement on test does not have a direct correlation with improvement on exam.

As I wrote above, this is explained by a shift in the class-membership probability of an individual quantum segment.

The quantum segments themselves act as a signal to skip the target signal (i.e. to set it to zero), or in other words to split the sample into two parts, and they are selected by their estimated cost. That is, the cost of reducing erroneous signals is estimated. At each iteration the costs are recalculated and the variant with the lowest cost is removed.

Here is how the cost changes according to one of the calculation methods. Below is a gif where each point is a quantum segment (the x axis is the sequence number).

Will the result on the test/exam samples change significantly if the first iteration picks a quantum segment at random instead of by cost?

 

Between iterations 4 and 5, we can see the test subsample sharply lose correct target responses, which immediately leads to a divergence from the exam sample (the delta increases).


 
Aleksey Vyazmikin #:

Will the result on the test/exam samples change significantly if the first iteration picks a quantum segment at random instead of by cost?

I will answer myself - yes, it will.

I ran the process 1000 times, each time randomly selecting the first quantum segment used to exclude signals (rows).

Here are a couple of example gifs showing how the process went with different random first quantum segments (these could just as well be tree leaves).


And here are static pictures at the moment of intermediate iteration - different stages of selection and randomisation.

What conclusion can be drawn?

1. You might get lucky and randomly find a working pattern :)

2. Without reducing the number of false patterns, it is difficult to build a model using only the greedy principle.

3. You need to develop methods to estimate the regularity observed in a quantum segment or leaf.

4. Randomness does not prove that one is successful in machine learning.

5. A logically valid model is required to be successful.

6. Success on a test sample does not always mean success on an exam sample, and vice versa.

What other conclusions can be drawn?