Machine learning in trading: theory, models, practice and algo-trading - page 105

 
Dr.Trader:

I predict only 2 classes - "buy" and "sell", that is, I will always have some trade open. I work with one model, I do not see the point of making two models that give just opposite results.

But I would like to move gradually to 3 classes - "buy"/"close everything and do not trade"/"sell". That would make it possible to trade a more complex strategy. I've tried it a couple of times, but I had trouble training models on three classes, especially when the model is a regression whose output is then rounded to classes.
I think it's worth trying two models, where the original 1/0/-1 classes are transformed into 1/0/0 for the first model (buy only) and 0/0/1 for the second model (sell only). This leads to unbalanced classes in each model (one class has far more examples than the other), but I have found good metrics for evaluating models under such conditions - F-score and kappa. I haven't done anything in this direction yet, but the plan looks feasible.
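To make the idea concrete, here is a minimal sketch in plain Python of the label transformation and of the two metrics mentioned. The function names and the toy labels are made up for illustration, not taken from anyone's actual code. It shows why plain accuracy misleads on unbalanced classes: a model that never buys looks 80% accurate, while F-score and kappa both come out as zero.

```python
from collections import Counter

def to_binary_targets(labels):
    """Split 1/0/-1 labels into a buy-only and a sell-only 1/0 target."""
    buy = [1 if y == 1 else 0 for y in labels]
    sell = [1 if y == -1 else 0 for y in labels]
    return buy, sell

def f_score(y_true, y_pred, positive=1):
    """F1 score for the positive class: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 0.0

# Toy labels (assumed): mostly "do nothing", a few buys and one sell.
labels = [1, 0, 0, 0, -1, 0, 0, 1, 0, 0]
buy_target, sell_target = to_binary_targets(labels)

# A useless model that never predicts "buy".
never_buy = [0] * len(labels)
accuracy = sum(t == p for t, p in zip(buy_target, never_buy)) / len(labels)
print(accuracy, f_score(buy_target, never_buy), cohen_kappa(buy_target, never_buy))
```

In practice one would probably take both metrics from a library rather than hand-roll them; they are written out here only so the example has no dependencies.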

The problems with the methods in this thread lie not only in the non-stationarity of time series (the same patterns appear at different scales in price and time), but also in the variability of the patterns themselves. That is, patterns that foretold a rise yesterday may today mean a flat, or even a fall.

Separate training for buying and selling partially solves these problems. As I wrote a few pages ago, I have two networks: one learns buy signals (1;0) - buy or do nothing, and the other learns sell signals (-1;0) - sell or do nothing. At the output of these two networks there is a switch that merges the signals into a (-1;0;1) pattern. The work of such a committee of networks turns out to be very interesting: at first there are good signals, then the number of signals gradually goes down, that is, the output is more and more often 0 (patterns stop being recognized, which is better than incorrect trade signals), and after a while incorrect trade signals begin to appear (the patterns have changed). So a drop in the number of signals below a given threshold can and should already be treated as the point where retraining on fresh data is needed.

But in fact there is no such thing as "training" and "practice". All sorts of cross-validation and OOS checks do not and cannot produce the effect expected of them. The point is that such tricks are nothing but a search for, and then a selection of, those parameter values that work approximately satisfactorily on both the training and the testing sections. Such a parameter set already exists among all the possible variants, so the procedure is equivalent to selecting parameters on the whole history at once.

Nevertheless, the use of two models (in my case, two networks) is, in my opinion, the best that can be done with the currently available methods of "machine learning". It's not training or coaching, it's a way to optimize the model.

There is no real training available today. Recognizing the same or similar patterns is not a result of learning, it is a result of remembering. Training must imply some thinking process (however primitive) that would allow the system to reason and draw conclusions when it receives new information, as well as the ability to generate new information independently. The market requires just such an approach - thinking - which, as far as I know, does not exist today. What we use today is memorization, not thinking, unfortunately.

 
Andrey Dik:

But, in fact, there is no such thing as "training" or "coaching". All sorts of cross-validation and OOS checks do not and cannot give the effect expected of them. The point is that such tricks are nothing but a search for, and then a selection of, those parameter values that work approximately satisfactorily on both the training and the testing sections. Such a parameter set already exists among all the possible variants, so the procedure is equivalent to selecting parameters on the whole history at once.

There are models that pass cross-validation but fail the test on new data. And there are models that pass cross-validation and then somehow manage to trade at a profit. But if a model fails cross-validation, there is no point in even trying to trade with it on new data.
To my mind this is a very good first step in selecting predictors and model parameters. Next comes a roll-forward test to find out whether the model (or the heuristic for choosing model parameters and predictors) can work on new data. If it can't, something has to be changed.
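For reference, a roll-forward test can be sketched roughly like this. It is only an illustration under my own assumptions: the function name, the window sizes and the choice to slide by one test window are not from this thread. Each model is fitted on a training window and checked only on the window that follows it in time.

```python
def walk_forward_splits(n_samples, train_size, test_size):
    """Yield (train_idx, test_idx) index ranges for a roll-forward test.

    Every test window lies strictly after its training window, and the
    whole pair slides forward by one test window at a time.
    """
    start = 0
    while start + train_size + test_size <= n_samples:
        train_idx = range(start, start + train_size)
        test_idx = range(start + train_size, start + train_size + test_size)
        yield train_idx, test_idx
        start += test_size

# Toy sizes (assumed): 10 bars, train on 4, test on the next 2.
splits = list(walk_forward_splits(n_samples=10, train_size=4, test_size=2))
for train_idx, test_idx in splits:
    print(list(train_idx), "->", list(test_idx))
```

The model, or the heuristic that picks its parameters and predictors, is refitted on each training range, and only the results on the test ranges are counted.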

You have two neural networks - you didn't pull their parameters out of thin air, did you? Layers, learning rate, inhibition, weight control, and dozens of other parameters that neural networks have. You must have done some sort of roll-forward test to determine suitable parameters, which is inherently cross-validation too. I don't believe that you can just take two neural networks with default parameters and start trading in profit right away.

 
Dr.Trader:

I do not believe that you can just take two neural networks with default parameters and start trading in profit right away.

Why not?
 
Dr.Trader:

1. There are models that pass cross-validation but fail the test on new data.

2. And there are models that pass cross-validation and then somehow manage to trade at a profit.

3. But if a model fails cross-validation, there is no point in even trying to trade with it on new data.

4. In my opinion this is a very good first step in selecting predictors and model parameters.

5. Next comes a roll-forward test to find out whether the model can handle new data. If it can't, something has to be changed.

6. You have two neural networks - you didn't pull their parameters out of thin air, did you? Layers, learning rate, inhibition, weight control, and dozens of other parameters that neural networks have. You must have done some sort of roll-forward test to determine suitable parameters, which is inherently cross-validation too. I don't believe that you can just take two neural networks with default parameters and start trading in profit right away.

1. Yes. It doesn't mean that the model is bad. It can mean either that the model has "insufficient memory" for the given amount of data, or that the right parameter set has not been found for the whole history together with the training and validation sections. The problem is that there is no way to determine exactly why the test on new data failed.

2. Yes. That means "sufficient memory". You can get the same thing in point 1 just by reducing the amount of training data. The fact that we ended up in profit on unknown data can be considered pure luck: we came across patterns that are the same as or very similar to the ones the system knows, and the market reacted to them exactly as it had reacted to the corresponding patterns before. The problem is that the patterns keep changing, usually gradually and sometimes abruptly, and there are no guarantees for the future.

3. Yes. But that is not much worse than if it had passed, for the reasons given above.

4. Yes, a good one. Not as a source of confidence in the model's future robustness, though, but only as a way to get the validations passed, nothing more.

5. Walk-forward is the only way to validate a system. The only one. In my opinion, if you manage to create a system that is at least 55-60% efficient (that is, a decrease in efficiency of 30% or more compared to the "training" sections), you can consider that a success. But even this only means that a compromise has been reached between the speed at which market patterns change and the system's ability to memorize the necessary amount of information. It does not at all mean that the system has become 30% "dumber"; it means that the system fails to recognize 30% of the new or changed old patterns.

6. The two-network approach I mentioned does not imply "learning" as such. It is the same "memorization", just applied in a clever way. Old patterns are forgotten and new ones go unrecognized, just as in your case, but I do not trade on unfamiliar patterns: my trading dwindles to zero over time, and only then do the false signals start. I do (or rather, used to do) roll-forward testing in which retraining took place whenever the number of trades per unit of time fell below the permissible threshold, which is why the test sections of the roll forward came out with different lengths for "training" sections of the same length. The average efficiency decrease turned out to be the same 30%.
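The retraining rule from point 6 could look roughly like the sketch below. The window length, the threshold and the function name are my own assumptions; the post does not give the actual values. It walks through the committee's outputs after a training run and reports the bar at which the trade count in the most recent window drops below the threshold, which is where the test section ends and retraining would start.

```python
def segment_length_until_retrain(signals, window, min_trades):
    """Return the number of bars traded before retraining is triggered.

    signals    -- committee outputs per bar: -1 (sell), 0 (fence), 1 (buy)
    window     -- number of most recent bars used to count trades (assumed)
    min_trades -- permissible minimum of trades per window (assumed)
    """
    for end in range(window, len(signals) + 1):
        trades = sum(1 for s in signals[end - window:end] if s != 0)
        if trades < min_trades:
            return end  # trade frequency fell too low: retrain here
    return len(signals)  # threshold never breached within this data

# Toy output stream (assumed): signals thin out as the patterns age.
signals = [1, -1, 1, 0, 1, 0, 0, -1, 0, 0, 0, 0]
bars_before_retrain = segment_length_until_retrain(signals, window=4, min_trades=2)
print(bars_before_retrain)
```

Because the trigger depends on how quickly the signals thin out, every test section of the roll forward ends up with its own length, even though the training sections are all the same size.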

 
Combinator:
Why?

A neural network, just like any other model, will simply find some patterns in the available data. Whether these patterns correspond to some internal processes in Forex, or were just picked up by the network on the principle "multiply something by something, as long as the result matches" - no one knows, and there are no guarantees. For the network to be able to detect the internal processes of Forex, its structure (layers, connections) must somehow correspond to Forex; it must be configured specifically for it, and only then will it start to produce consistently good results.

For example, convolutional neural networks are now very popular for image classification. Using them sounds very simple (download a Python library and you're done), but people forget that many universities have spent decades competing in image-recognition contests, fighting for every percentage point. Trendy things like "repaint a picture in the style of Van Gogh" or swapping a face in a photo are the result of decades of work by universities, with all their undergraduate and graduate students (and probably professors) spending a lot of time tuning model parameters or developing new models.
For Forex this is also possible, but it takes just as much effort. Here the prize is real Forex profit, which is why nobody brags about model configurations or makes them public, and each new contestant has to start from scratch.

 
Dr.Trader:

I do not believe that you can just take two neural networks with default parameters and start trading in profit right away.

Of course, I don't believe in that either. Because that would mean creating a thinking machine that only needs to be trained once in its lifetime. Humanity is a long way from building a thinking machine.
 
Andrey Dik:

Separate training for buying and selling partially solves these problems. As I wrote a few pages ago, I have two networks: one learns buy signals (1;0) - buy or do nothing, and the other learns sell signals (-1;0) - sell or do nothing.

My ternary classifier also has two networks, but with different hidden layers, and both of them predict the next price direction, i.e. 1 or -1. There is a switch at the output: if one network outputs 1 and the other -1 (they contradict each other), the switch outputs 0 - do nothing.
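The output switch of such a ternary classifier fits in a few lines. This is only a sketch of the rule as described in the paragraph above, with a hypothetical function name, and not the actual implementation.

```python
def ternary_switch(first_net, second_net):
    """Combine two direction predictions (each 1 or -1) into -1, 0 or 1.

    The direction is passed through when both networks agree;
    a disagreement yields 0 (do nothing).
    """
    for value in (first_net, second_net):
        if value not in (-1, 1):
            raise ValueError("each network must output 1 or -1")
    return first_net if first_net == second_net else 0

print(ternary_switch(1, 1), ternary_switch(-1, -1), ternary_switch(1, -1))
```

Note that 0 never appears as a training label here: it arises only from disagreement between the two networks.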

Everything is clear with the buy and sell values of the dependent variable: they can be determined from the direction of the price movement that followed the pattern. But here a tricky question arises: by what condition do your binary classifiers label the dependent variable as 0 - do nothing?

No such condition exists as a fact in nature, IMHO.

 
Andrey Dik:

5. ...

6. ...

These explanations make it sound realistic. How rare are the patterns you trade? Say the training examples have the classes "open buy position" and "close all trades": what would the percentage ratio of these classes be? I would guess that the "buy" class corresponds to a sharp upward price move of hundreds of points, so "buy" examples make up about 10% of all training examples?
 
Andrey Dik:

1. Yes. This does not mean that the model is bad. It could mean either "insufficient memory" of the model for the amount of data provided,

2. Because we have caught patterns that are the same or very similar to what the system knows and at the same time exactly the same market reactions have happened before the corresponding patterns. The problem is that the patterns are constantly changing gradually and sometimes abruptly and there are no guarantees for the future.

1) There is a network that can keep learning when it receives data it has not seen before. You may find it interesting to read about; the network is called SOINN: https://www.google.com.ua/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=soinn

2) I'm tired of writing about it. Moreover, I have practically proved that the market goes against my own statistics, and even explained the mechanics of why this happens and why none of the classical training methods apply to it. But nobody is interested; everybody keeps doing the same thing.

 
Yury Reshetov:

My ternary classifier also has two networks, but with different hidden layers, and both of them classify the fact of the next price direction, i.e. 1 or -1. There is a switch at the output: if one network gives 1 and the other -1, the switch gives 0 - do nothing.

Everything is clear when it comes to buying and selling, as they can be identified from the direction of the price movement that followed the pattern. But here we face a tricky question: on what condition do your binary classifiers label the dependent variable as 0 - do nothing?

No such condition exists as a fact in nature, IMHO.

SELL | BUY | Interpretation
-1 | 0 | sell
0 | 0 | fence
0 | 1 | buy
-1 | 1 | fence

Here is the switch table. It shows that a trade signal is produced only when one network signals and the other does not contradict it. After training, patterns are recognized and successful trades follow; over time the outputs either begin to contradict each other (output 0 - fence) or both networks stop recognizing patterns (output 0 - fence). So the number of trades decreases over time, following the principle "if you're not sure, don't trade".
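The switch table above can be written down directly as a lookup. The names are hypothetical; only the four rows come from the table.

```python
# (sell network output, buy network output) -> committee signal
SWITCH_TABLE = {
    (-1, 0): -1,  # only the sell network fires: sell
    (0, 0): 0,    # neither network recognizes a pattern: fence
    (0, 1): 1,    # only the buy network fires: buy
    (-1, 1): 0,   # the networks contradict each other: fence
}

def committee_signal(sell_out, buy_out):
    """Merge the sell network (-1 or 0) and the buy network (0 or 1)."""
    return SWITCH_TABLE[(sell_out, buy_out)]

print([committee_signal(s, b) for s, b in SWITCH_TABLE])
```

With this encoding the table happens to coincide with simply adding the two outputs, so `sell_out + buy_out` gives the same result; the explicit table just keeps the four cases visible.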

But your question seems to be about something else: how do you get each network to buy/sell instead of sitting on the fence? The answer is simple: a points system. Points are awarded for correct answers, and penalties are given for incorrect answers and for sitting on the fence. You have to find the right ratio of points, which is itself a very difficult task, because in the end the two networks must work in a coordinated way, but the result is worth the trouble.
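A points system of this kind might be sketched as follows. The point values here are pure placeholders, since the actual ratio is not given in the post (and, as said, finding it is the hard part); the function name is also an assumption.

```python
def score_network(decisions, price_moves, direction,
                  reward=1.0, wrong_penalty=2.0, fence_penalty=0.25):
    """Score one network of the committee with a points system.

    decisions   -- per bar: `direction` (trade) or 0 (sit on the fence)
    price_moves -- per bar: 1 if the price then rose, -1 if it fell
    direction   -- 1 for the buy network, -1 for the sell network
    The three point values are illustrative assumptions only.
    """
    total = 0.0
    for decision, move in zip(decisions, price_moves):
        if decision == 0:
            total -= fence_penalty   # small penalty for not trading
        elif move == direction:
            total += reward          # correct trade
        else:
            total -= wrong_penalty   # wrong trade costs the most
    return total

# Buy network on four bars: right, fence, wrong, right.
buy_score = score_network([1, 0, 1, 1], [1, -1, -1, 1], direction=1)
print(buy_score)
```

With a fence penalty of zero a network can score best by never trading, and with a fence penalty that is too high it is pushed into guessing, which is presumably why the ratio is so hard to pick.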

IMHO, buy and sell patterns are different. That's the idea, and you probably share it.