Machine learning in trading: theory, models, practice and algo-trading - page 3324

 
Aleksey Vyazmikin #:

My point is that the outcome metrics we are used to in trading and machine learning evaluate only part of the quality of the resulting model/tuning/approximation.

What matters is under what conditions we achieved that result, and how much information was required to achieve it. We also need to assess the stability of the observations over time and the contribution of each predictor.

The problem with complex models with a large number of predictors and decision rules (be they trees or neurons) is that they build complex patterns that are unlikely to repeat in their entirety, hence the bias in the probability of assigning a sample to one of the classes. Earlier I posted a picture of "what the trees are buzzing about", which showed that most leaves simply never activate on new data.
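For illustration, a minimal sketch of how one could measure that leaf activation on new data. It uses a random forest purely as a stand-in for any tree ensemble, and the arrays X_train, y_train, X_new are made up; the point is only the counting via apply():

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    # Hypothetical data: X_train/y_train for fitting, X_new for the "new" period.
    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(1000, 20))
    y_train = (X_train[:, 0] > 0).astype(int)
    X_new = rng.normal(size=(500, 20))

    model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

    # apply() returns, for every sample, the index of the leaf it lands in for each tree.
    leaves_train = model.apply(X_train)   # shape (n_samples, n_trees)
    leaves_new = model.apply(X_new)

    activated, total = 0, 0
    for t in range(leaves_train.shape[1]):
        train_leaves = set(leaves_train[:, t])
        new_leaves = set(leaves_new[:, t])
        total += len(train_leaves)
        activated += len(train_leaves & new_leaves)

    print(f"leaves used in training: {total}, of them hit by new data: {activated}")

The lower the second number relative to the first, the more of the model's "patterns" simply never occur again outside the training window.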

All this comes from the fact that we are dealing with a "function" (actually a sum of functions) that cannot be explored fully enough to approximate it. That means we should pay attention only to those regions that are better understood/known. It is better for the model to stay "silent" on new data it is not familiar with than to act on isolated cases from the past.

So the question arises: how do we make the model stay silent when it is not sure, and give a confident signal when the probability of a favourable outcome is high?

We need methods of correcting ready-made models. This can be done by acting on the model after training, or by combining two kinds of models - one of the boosting type and the other of the K-nearest-neighbours type.
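A minimal sketch of how that pairing might look, assuming scikit-learn; all names and thresholds (p_min, dist_factor) are made up for the example. The boosting model gives the class probability, while the distance to the K nearest training neighbours decides whether to answer at all:

    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.neighbors import NearestNeighbors

    def fit_pair(X_train, y_train):
        booster = GradientBoostingClassifier().fit(X_train, y_train)
        knn = NearestNeighbors(n_neighbors=10).fit(X_train)
        # Reference scale: typical distance to the 10 nearest neighbours inside the training set.
        dist, _ = knn.kneighbors(X_train)
        scale = np.median(dist.mean(axis=1))
        return booster, knn, scale

    def predict_or_abstain(booster, knn, scale, X, p_min=0.65, dist_factor=1.5):
        proba = booster.predict_proba(X)[:, 1]
        dist, _ = knn.kneighbors(X)
        familiar = dist.mean(axis=1) <= dist_factor * scale
        signal = np.where(~familiar, 0,                    # unfamiliar region -> stay silent
                 np.where(proba >= p_min, 1,               # confident "up"
                 np.where(proba <= 1 - p_min, -1, 0)))     # confident "down", otherwise silent
        return signal

The thresholds are arbitrary knobs; the idea is just that the nearest-neighbour part vetoes signals in regions the training data never covered, which is one way to get the "silence on unfamiliar data" discussed above.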

I have built models before that simply stopped giving trading signals over time. And yes, this is better than the probability of correct answers drifting towards 50/50 on new data. The essence: during training, get the network's answers to fall into a narrow range of numbers; over time the answers begin to drift out of that range and the signals disappear. It is a very labour-intensive process, and I have not managed to fully automate the training and subsequent trading.

This is one approach; there are probably others - the topic needs studying.

 
Andrey Dik #:

The essence: during training, get the network's answers to fall into a narrow range of numbers; over time the answers begin to drift out of that range and the signals disappear. It is a very labour-intensive process, and I have not managed to fully automate the training and subsequent trading.

I implemented this idea with the help of the MT5 optimiser:

I set a condition for my pseudo-neuron: "Open BUY if the result of the set gives a number in the range Close[1] ± 0.00050."
The input is a single number: Close[2].

The optimiser diligently searches for profitable trades, but I instead sort the resulting sets by the number of trades.

When the optimiser has finished, you pick the most losing set with the largest number of trades - a large count simply means the Expert Advisor "guessed" the greatest number of future prices.

Then I switch to testing mode, where the condition has been changed: "Open BUY if the result of the set gives a number greater than Close[1] by N points."

Voila: the forward period is profitable for a year.

One problem: it only worked on the hourly candle at 2 a.m., closing at the open of the next hour.

I stumbled on that pattern somehow. It also worked on EURUSD, USDCHF and EURGBP, at a different morning hour.
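Not MQL5, but a rough sketch of the two-stage procedure described above as I read it; the linear "pseudo-neuron" a*Close + b, the candidate grid, the band and the N-points threshold are all placeholder assumptions:

    import numpy as np

    def count_band_hits(a, b, close, band=0.00050):
        # "Pseudo-neuron": predict the next close from the previous one.
        pred = a * close[:-1] + b
        return np.sum(np.abs(pred - close[1:]) <= band)

    def pick_set(close, candidates):
        # Stage 1: like the optimiser pass - rank parameter sets by the number of
        # bars where the prediction landed inside the +/- band around the real close.
        return max(candidates, key=lambda ab: count_band_hits(*ab, close))

    def forward_signals(a, b, close, n_points=0.00020):
        # Stage 2: the changed condition - signal only when the prediction exceeds
        # the last known close by at least N points.
        pred = a * close[:-1] + b
        return pred - close[:-1] >= n_points   # True = open BUY on the next bar

Here `candidates` would be something like a grid of (a, b) pairs; the ranking criterion is the number of "band hits", not profit, exactly as with sorting the optimiser's sets by the number of trades.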

 
Andrey Dik #:
the network's answers in a narrow range of numbers

Are we talking about a neural network (NS) or a different kind of net?

Andrey Dik #:
over time the answers begin to drift out of that range and the signals disappear

Is it because the combined values of the predictors changed, or maybe just one of them stopped showing the "desired" result?

In general, the question of why something broke can be very important for further ideas.

 
Aleksey Vyazmikin #:

1. Are we talking about a neural network (NS) or a different kind of net?

2. Is it because the combined values of the predictors changed, or maybe just one of them stopped showing the "desired" result?

In general, the question of why something broke can be very important for further ideas.

1. Yes, of course.

2. Maybe I didn't phrase it well. No, it was actually a positive effect: trading would gradually fade to nothing on new data. As soon as the number of trades per unit of time dropped below a given level, it was time to retrain. That is, the signal to retrain is not a drop in trading efficiency on the OOS, but a drop in the number of trades.

In other words, instead of spouting nonsense and producing losses on the OOS, the network responds to unfamiliar data with silence.
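If it helps to picture that trigger, a tiny sketch; the window length and the floor on the trade count are arbitrary assumptions:

    from collections import deque

    class RetrainTrigger:
        """Retrain when the number of signals per window of bars falls below a floor."""
        def __init__(self, window=500, min_trades=10):
            self.recent = deque(maxlen=window)   # 1 if the model gave a signal on that bar, else 0
            self.min_trades = min_trades

        def update(self, gave_signal: bool) -> bool:
            self.recent.append(1 if gave_signal else 0)
            window_full = len(self.recent) == self.recent.maxlen
            return window_full and sum(self.recent) < self.min_trades   # True -> time to retrain

The point is that the retraining signal is the drying-up of trades themselves, not the OOS profit curve.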

 
Aleksey Nikolayev #:
Your link also talks about tying the "profile" to cross-validation, for which it might be easier to find packages.

I don't see the connection here. Which part of it implies that?

 
Forester #:

The work is experimental. Here's a quote from http://www.ccas.ru/frc/papers/students/VoronKoloskov05mmro.pdf

It's unlikely that a package was created for every experiment.

Also, the experiment is artificial. Noise was added to a data set that is clearly separated into classes, and the clean separation exists along only one feature - the Y axis. If we remove the noise (all data from 0.2 to 0.8), we are left only with examples whose distance to the other class is at least 0.6. I mean the most complicated, third variant in the picture:


Now move to real life and add your 5000 noise predictors to this single working feature. In clustering you compute the total distance between points in this 5001-dimensional space; the working 0.6 gap will never be found in that chaos.

I think any classifier will handle it better: even a plain tree will find this single feature and split on it, first at 0.5 and then reaching splits at 0.2 and 0.8, ending in leaves of 100% purity.
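A quick way to check that claim, scaled down to a few hundred noise features instead of 5000; the data generation is invented purely for the sketch:

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n = 2000

    # One working feature with a clean gap: class 0 in [0, 0.2], class 1 in [0.8, 1.0].
    y = rng.integers(0, 2, n)
    good = np.where(y == 0, rng.uniform(0.0, 0.2, n), rng.uniform(0.8, 1.0, n))

    # Bury it among uniform noise predictors, so pairwise distances are dominated by noise.
    noise = rng.uniform(0, 1, size=(n, 500))
    X = np.column_stack([good, noise])

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

    knn = KNeighborsClassifier(n_neighbors=10).fit(X_tr, y_tr)
    tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)

    print("KNN accuracy: ", knn.score(X_te, y_te))   # roughly 0.5 - distance cannot see the gap
    print("tree accuracy:", tree.score(X_te, y_te))  # close to 1.0 - the split on the good feature is found

The distance-based method drowns in the extra dimensions, while the tree picks out the one informative split, which is exactly the objection above.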

It is claimed that this algorithm made it possible to take first places on Kaggle; I don't think the tasks there were simple...

Shall we try to figure it out? To my great regret, I don't understand the formulas.

 
mytarmailS #:
One of Vladimir Perervenko's articles described this method, and there was an example with code, of course

I learnt about this algorithm from the video; there are some formulas on the slides - it's hard to call that code.

Where did you see example code?

 
Andrey Dik #:

1. Yes, of course.

2. Maybe I didn't phrase it well. No, it was actually a positive effect: trading would gradually fade to nothing on new data. As soon as the number of trades per unit of time dropped below a given level, it was time to retrain. That is, the signal to retrain is not a drop in trading efficiency on the OOS, but a drop in the number of trades.

In other words, instead of spouting nonsense and producing losses on the OOS, the network responds to unfamiliar data with silence.

That's what I understood. I was just asking whether the cause has been identified - not what broke, but why the signals disappear.

 
Aleksey Vyazmikin #:

I learnt about this algorithm from the video; there are some formulas on the slides - it's hard to call that code.

Where did you see example code?

Is this trolling?

 
mytarmailS #:

Is this trolling?

What's trolling?

Here's the video.