Machine learning in trading: theory, models, practice and algo-trading - page 375

 

1) Do I correctly understand the purpose of dividing the dataset into training, validation, and test sets?

a) we train on the training set
b) we check the error on the validation set; if it is much higher than on the training set, we go back to step a) until the errors are close (how close is close enough - within 5% of the total error? For example, 15% on training and 20% on validation)
c) we check on the test set - if the error there is similar to that on the first two sets (again, how similar?), then the model is stable and can be run; if not, forget it and look for other predictors, change the filtering, etc.

2) By the way, what error rate should one aim for on training/validation/test? 15/20/20%, or maybe 5/10/15%? Or something else?

3) I don't quite understand why it is recommended to shuffle the training examples - we will process every example anyway.

 
elibrarius:

By the way, what error rate should one aim for on training/validation/test? 15/20/20% or maybe 5/10/15%?

The previous part - yes, roughly like that.

As for the error, it depends on the specifics. If, say, the ML model or neural network only determines entry into a trade, then even a 50% error rate may be enough. For example, if a successful trade yields on average 2-3 points of profit and an unsuccessful one a 1-point loss, then 0.5 is not a bad probability.
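
In code form, that is just an expectancy calculation. A minimal sketch in Python (the 2.5-point average win is simply the midpoint of the 2-3 points mentioned above):

def expectancy(p_win, avg_win, avg_loss):
    # Expected profit per trade, in points.
    return p_win * avg_win - (1.0 - p_win) * avg_loss

# 50% hit rate, 2-3 pt average win (taken as 2.5 pt), 1 pt average loss:
print(expectancy(0.5, 2.5, 1.0))   # 0.75 pt per trade on average

# Break-even hit rate for this payoff: p * 2.5 = (1 - p) * 1.0  ->  p ~ 0.286
print(1.0 / (1.0 + 2.5))

So what matters is the margin over the break-even hit rate for the given payoff, not the hit rate by itself.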

 
Yuriy Asaulenko:

The previous part - yes, roughly like that.

As for the error, it depends on the specifics. If, say, the ML model or neural network only determines entry into a trade, then even a 50% error rate may be enough. For example, if a successful trade yields on average 2-3 points of profit and an unsuccessful one a 1-point loss, then 0.5 is not a bad probability.

0.5 may be a bit low... What values should one aim for, and which can really be reached in practice (not in other NN tasks, but specifically in trading)?
I would like to train down to a 10% error, but if that figure is unrealistic, I will just be wasting my own time and CPU time. Say - what is the best error you have achieved, and at what level can one stop and not look for further improvement?
 
elibrarius:
0.5 may be a bit low... What values should one aim for, and which can really be reached in practice (not in other NN tasks, but specifically in trading)?
I would like to train down to a 10% error, but if that figure is unrealistic, I will just be wasting my own time and CPU time. Say - what is the best error you have achieved, and at what level can one stop and not look for further improvement?

0.5 is not enough? Oh, come on.) I already gave this example: a poker player's probability of winning is 1/9-1/6, yet good players are consistently in profit.

And all my systems worked at a probability of ~0.5 and were always in the black. As far as I know, many trading systems work with a probability close to 0.5 - this was mentioned at the autotrading conference, in particular.

"I would like to train down to 10%, but if that is an unrealistic figure" - whether it is realistic or not depends on the specific task. For example, I trained an NN on MA crossings - there it is almost 100% reliable)).

 
Yuriy Asaulenko:

0.5 is not enough? Oh, come on.) I already gave this example: a poker player's probability of winning is 1/9-1/6, yet good players are consistently in profit.

And all my systems worked at a probability of ~0.5 and were always in the black. As far as I know, many trading systems work with a probability close to 0.5 - this was mentioned at the autotrading conference, in particular.

"I would like to train down to 10%, but if that is an unrealistic figure" - whether it is realistic or not depends on the specific task. For example, I trained an NN on MA crossings - there it is almost 100% reliable)).

True, you can do it without any forecast (50%), you just need to make the take-profit larger than the stop. Actually, you can't forecast anything anyway - nobody knows where the price in Forex will go, only insiders know.

 
Vasily Perepelkin:

In fact, it is impossible to predict anything; no one knows where the price will go in the Forex market - only the insiders, the puppeteers, can know that.

Actually, it is possible. A 0.5 probability with a forecast and a take-profit larger than the stop is far from the same thing as 0.5 without any forecast.) We are flipping a completely different coin.))
 
elibrarius:

1) Do I correctly understand the purpose of dividing the dataset into training, validation, and test sets?

a) we train on the training set
b) we check the error on the validation set; if it is much higher than on the training set, we go back to step a) until the errors are close (how close is close enough - within 5% of the total error? For example, 15% on training and 20% on validation)
c) we check on the test set - if the error there is similar to that on the first two sets (again, how similar?), then the model is stable and can be run; if not, forget it and look for other predictors, change the filtering, etc.

2) By the way, what error rate should one aim for on training/validation/test? 15/20/20% or maybe 5/10/15%?

3) I don't quite understand why it is recommended to shuffle the training examples - we will process every example anyway.


1) Not quite, and the distinction is fundamental.

We take one large file. We divide it into two unequal parts.

We split the larger part the way you described. We get the errors, which should be approximately equal.

Then we test the model on the second part of the file. The error in this part again should not be very different.

This is the most important evidence of the absence of overfitting.


The magnitude of the error? It is a kind of constant that follows from the set of predictors and can be reduced somewhat by selecting the type of model.


For example.

If all four errors are around 35%, then by selecting a model you will, with luck, reduce the error to 30%.


PS.

An error of less than 10% is a clear sign of overfitting. If you get such an error, you should double-check everything.
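
Read literally, the whole check can be sketched roughly as follows (plain Python with scikit-learn; the split ratios, the logistic-regression stand-in model and the 5-percentage-point tolerance are illustrative choices, not values from the post):

from sklearn.linear_model import LogisticRegression

def split_and_check(X, y, oos_fraction=0.25, tol=0.05):
    # 1) Cut off the final part of the file as a separate out-of-sample block.
    n_oos = int(len(y) * oos_fraction)
    X_main, y_main = X[:-n_oos], y[:-n_oos]
    X_oos, y_oos = X[-n_oos:], y[-n_oos:]

    # 2) Split the larger part into train / validation / test (60/20/20 here),
    #    keeping chronological order.
    n = len(y_main)
    i1, i2 = int(0.6 * n), int(0.8 * n)
    model = LogisticRegression(max_iter=1000).fit(X_main[:i1], y_main[:i1])

    errors = {
        "train": 1 - model.score(X_main[:i1], y_main[:i1]),
        "validation": 1 - model.score(X_main[i1:i2], y_main[i1:i2]),
        "test": 1 - model.score(X_main[i2:], y_main[i2:]),
        "out_of_sample": 1 - model.score(X_oos, y_oos),
    }
    # 3) All four errors should be roughly equal; a large spread is the
    #    overfitting signal described above.
    stable = max(errors.values()) - min(errors.values()) <= tol
    return errors, stable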

 

I found early-stopping training with a validation section in ALGLIB:

Neural network training using early stopping (base algorithm - L-BFGS with regularization).
...
The algorithm stops if validation set error increases for a long
enough or step size is small enough (there are task where
validation set may decrease for eternity). In any case solution
returned corresponds to the minimum of validation set error.

Judging by the code, it does not compare the errors on the training and validation sets; it simply searches for the minimum error on the validation set. It stops if it has not found a better one within 30 iterations, or when all iterations have passed.
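
That logic, as I read it, boils down to something like this (a generic sketch, not ALGLIB's actual implementation; the two callbacks are made up for illustration):

import copy

def train_with_early_stopping(model, train_step, val_error, max_iters=1000, patience=30):
    # train_step(model) performs one optimisation step in place;
    # val_error(model) returns the current validation-set error.
    best_err, best_model, since_best = float("inf"), copy.deepcopy(model), 0
    for _ in range(max_iters):
        train_step(model)
        err = val_error(model)
        if err < best_err:
            best_err, best_model, since_best = err, copy.deepcopy(model), 0
        else:
            since_best += 1
            if since_best >= patience:
                break
    # The returned model corresponds to the minimum validation error seen,
    # regardless of where the loop stopped.
    return best_model, best_err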

But I'm not sure if this method is better/more accurate than the usual one... Unless the number of training cycles is several times higher...

Here's what came out:

Average error on the training (80%) section = 0.535 nLearns=200 NGrad=142782 NHess=0 NCholesky=0 codResp=6
Average error on the validation (20%) section = 0.298 nLearns=200 NGrad=142782 NHess=0 NCholesky=0 codResp=6
Full set (training + validation sections):
Average training error = 0.497 nLearns=200 NGrad=142782 NHess=0 NCholesky=0 codResp=6
Average error on the test (20%) section = 0.132 nLearns=200 NGrad=142782 NHess=0 NCholesky=0 codResp=6

It feels like it ended up fitting the validation section. The test section came out well, but it was not used in training and was not compared against, so apparently that is just a coincidence.
The ensemble training uses this same function, with a 2/3 split and everything shuffled between both sections - I'll try doing the same...
Shuffled it:

Average error on the training (60%) section = 0.477 nLearns=10 NGrad=10814 NHess=0 NCholesky=0 codResp=6
Average error on the validation (40%) section = 0.472 nLearns=10 NGrad=10814 NHess=0 NCholesky=0 codResp=6
Full set (training + validation sections):
Average training error = 0.475 nLearns=10 NGrad=10814 NHess=0 NCholesky=0 codResp=6
Average error on the test (20%) section = 0.279 nLearns=10 NGrad=10814 NHess=0 NCholesky=0 codResp=6

Because of the shuffling, the errors even out between the training and validation sections.

Something about this seems wrong to me, because in real trading the bars will arrive in their natural order, not shuffled together with bars from an hour or a day ago.
And if the "nature" of the market changes, that means we have to retrain or look for new NN models.
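
For comparison, the two ways of forming the validation set can be written side by side (a sketch; sklearn's train_test_split is used only for the shuffled variant):

from sklearn.model_selection import train_test_split

def shuffled_split(X, y, val_fraction=1/3, seed=0):
    # Validation examples are drawn from anywhere in the history,
    # interleaved with the training examples (what the ensemble function does).
    return train_test_split(X, y, test_size=val_fraction, shuffle=True, random_state=seed)

def chronological_split(X, y, val_fraction=1/3):
    # Validation examples are strictly the most recent part of the history,
    # which is closer to how bars arrive in real trading.
    cut = int(len(y) * (1 - val_fraction))
    return X[:cut], X[cut:], y[:cut], y[cut:]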

 
Yuriy Asaulenko:
Actually, it is possible. A 0.5 probability with a forecast and a take-profit larger than the stop is far from the same thing as 0.5 without any forecast.)) We are flipping a completely different coin.))
Well, I'm saying it is possible: we open randomly and set a take-profit that is 2x the stop-loss, and that's it - statistically it will be profitable. For example, 100 trades with 10 points of profit and 100 with 5 points of loss: as a result we have 500 points of profit, and no forecast is needed.
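
Whether the 50/50 hit rate actually holds for random entries with an asymmetric take/stop can be checked numerically. A Monte Carlo sketch under strong simplifying assumptions (symmetric 1-point random walk, no spread or commission) - the hit rate and average result come out of the simulation instead of being assumed:

import random

def simulate(n_trades=10000, tp=10, sl=5, seed=1):
    rng = random.Random(seed)
    total_points, wins = 0, 0
    for _ in range(n_trades):
        price = 0
        while True:
            price += rng.choice((-1, 1))   # one 1-point random-walk step
            if price >= tp:                # take-profit hit
                total_points += tp
                wins += 1
                break
            if price <= -sl:               # stop-loss hit
                total_points -= sl
                break
    return wins / n_trades, total_points / n_trades

# Returns (hit rate, average points per trade) under these toy assumptions.
print(simulate())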
 
SanSanych Fomenko:


1) Not quite, and the distinction is fundamental.

We take one big file. Divide it into two unequal parts.

We split the larger part the way you described. We get the errors, which should be approximately equal.

After that, the model is checked on the second part of the file. The error in this part again should not be very different.

This is the most important evidence of the absence of overfitting.

So that makes 4 sections? Training/validation/test1/test2?

How many training/validation cycles does one need to run? I haven't seen any information about that anywhere... Just 1 cycle in total - and right after that we either approve the model or change something in the predictor set or the network architecture? Or more precisely, out of N training cycles we will be shown the single best one.