Machine learning in trading: theory, models, practice and algo-trading - page 414

 
elibrarius:

Why is it not known? The number of clusters to split into is set at startup as an input value: K, the desired number of clusters, K>=1.

Suppose I've split data into 4 groups, what should I do with them?


I meant that I don't know beforehand which class each sample belongs to... What to do with the groups afterwards in terms of trading, I don't know. Maybe look at which cases fall under sell signals and which under buy signals, which of them there are more of, and so on...
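One way to look at "which cases belong to sell signals and which to buy signals" is a plain tally per cluster. This is only a minimal sketch: the cluster ids and the signal labels below are made-up toy data, and the function names are hypothetical, not part of any library mentioned in the thread.

```python
from collections import Counter, defaultdict


def tally_signals_by_cluster(cluster_ids, signals):
    """Count how many buy/sell cases fall into each cluster."""
    counts = defaultdict(Counter)
    for cluster, signal in zip(cluster_ids, signals):
        counts[cluster][signal] += 1
    return counts


def dominant_signal(counts):
    """For every cluster, return the signal that occurs most often in it."""
    return {c: cnt.most_common(1)[0][0] for c, cnt in counts.items()}


# Toy data (assumption): 10 samples already split into K=4 clusters.
cluster_ids = [0, 0, 0, 1, 1, 2, 2, 2, 3, 3]
signals = ["buy", "buy", "sell", "sell", "sell",
           "buy", "sell", "buy", "sell", "sell"]

counts = tally_signals_by_cluster(cluster_ids, signals)
dominant = dominant_signal(counts)
for cluster in sorted(counts):
    print(cluster, dict(counts[cluster]), "->", dominant[cluster])
```

If one signal clearly dominates a cluster, that cluster may be worth treating as a separate trading context; if the split is close to even, the cluster probably tells you little.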
 
Aleksey Terentev:
Unfortunately, I haven't dealt with alglib. All the ML packages I have worked with allowed changing a layer's activation function.
Basically, if you have enough knowledge and the library allows it, you can inherit from the neuron class and define your own activation function there.
But that is an extreme measure.

I once wanted to dive in and write a couple of my own recurrent layers, but it's good that I came to my senses. =)
 
Aleksey Terentev:
In principle, if you have enough knowledge and the library allows it, you can inherit from the neuron class and write your own activation function there.
But that is an extreme measure.

I once wanted to dive in and write a couple of my own recurrent layers, and it's good that I came to my senses. =)
There is just an initial choice of network type by output type; you don't have to rewrite anything (and all inner layers are rigidly defined as nonlinear).
 

I have been working on the signals for a long time now, but it is still losing. I need to work better on the predictors and targets.


 
Guys, about the two outputs with probabilities. I think you are absolutely right: for a buy, one output gives 0.9 while the other (sell) output gives 0.1. But why this is needed is an interesting question. Out of sample, both outputs may give 0.9, and what then? Most likely that is back-and-forth jitter. This also happens in the market when there is uncertainty: the market does not know where to go, yet the signal has already appeared. And, as they say, you get more information...
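A rough sketch of how two probability outputs could be turned into a decision, including the case where both fire at once. The 0.5 threshold and the function name are assumptions for illustration only; nobody in the thread specified them.

```python
def decode_two_outputs(p_buy: float, p_sell: float, threshold: float = 0.5) -> str:
    """Map two network outputs to 'buy', 'sell' or 'unknown'.

    'unknown' covers both ambiguous cases: both outputs above the
    threshold (e.g. 0.9 and 0.9) or both below it.
    """
    buy = p_buy >= threshold
    sell = p_sell >= threshold
    if buy and not sell:
        return "buy"
    if sell and not buy:
        return "sell"
    return "unknown"


print(decode_two_outputs(0.9, 0.1))
print(decode_two_outputs(0.1, 0.9))
print(decode_two_outputs(0.9, 0.9))
```

The extra information from the second output is exactly that third state: a single output cannot distinguish "confident sell" from "no idea".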
 
Do you want me to write a long post on how to recognize the market completely? Anyway, it is an idea you can try, especially since it would go faster on two or three computers. Consider parallel computation; I only have 3 cores...
 

I remember someone scolding me for having to orient my model every morning before working with it further. Here is how my direct model worked today. Bad, you say? Of course it's bad, I'll tell you that myself... Now mirror it in your head and start trading from the third signal. How about now? And you say the orientation method is bullshit...

And no need to coddle Granny!!!! :-)))))

 

Well, since you insist, I'll share one idea about assembling data for processing. It is really very difficult to train a model with a high level of generalization on a sufficiently large section, because the market is a living organism and blah, blah, blah. The longer the training period, the worse the model performs, but the longer it keeps working. Task: make a long-running model. This is splitting, or method two, although it is for those who use a committee of two networks.

We have three states: "Yes", "No", and "Don't know", the last one being when the two networks point in different directions.

We train the network on the whole section, in our case 452 records. The network learned this set at 55-60%. Let's assume that "Don't know" responses on the training sample were 50%, so the network could not learn 226 signals. Okay, now we build a new model ONLY on the "Don't know" states, that is, we try to build a model on those quasi-states that misled the first model. The result is about the same: out of 226 only half will be recognized and the rest will get the "Don't know" state, so we build a model again. The result is 113, then 56, then 28, then 14. On the 14 records that none of the previous models recognized, Jprediction Optimizer will usually reach 100% generalization ability.
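The procedure above can be sketched roughly like this. It is not Jprediction code: `toy_trainer` is a made-up stand-in that simply memorizes half of whatever it is given (and everything once 14 or fewer records remain), just to mimic the 50% "Don't know" rate assumed in the post. A real trainer would fit a committee of two networks instead.

```python
from typing import Callable, List, Optional, Tuple

Row = Tuple[float, ...]
# A committee answers +1 ("Yes"), -1 ("No") or None ("Don't know").
Committee = Callable[[Row], Optional[int]]
Trainer = Callable[[List[Row], List[int]], Committee]


def train_cascade(rows: List[Row], labels: List[int], trainer: Trainer,
                  max_stages: int = 10):
    """Train a chain of models, each one only on the previous 'Don't know' rows."""
    stages, sizes = [], []
    for _ in range(max_stages):
        committee = trainer(rows, labels)
        stages.append(committee)
        sizes.append(len(rows))
        unknown = [i for i, r in enumerate(rows) if committee(r) is None]
        if not unknown or len(unknown) == len(rows):
            break  # everything recognized, or no progress at all
        rows = [rows[i] for i in unknown]
        labels = [labels[i] for i in unknown]
    return stages, sizes


def predict_cascade(stages: List[Committee], row: Row) -> Optional[int]:
    """Ask the models in order; the first one that is sure gives the answer."""
    for committee in stages:
        answer = committee(row)
        if answer is not None:
            return answer
    return None


def toy_trainer(rows: List[Row], labels: List[int]) -> Committee:
    # Hypothetical stand-in: "learns" half of the rows, or all of them
    # once 14 or fewer remain.
    cut = len(rows) if len(rows) <= 14 else (len(rows) + 1) // 2
    memory = dict(zip(rows[:cut], labels[:cut]))
    return lambda row: memory.get(row)


rows = [(float(i),) for i in range(452)]
labels = [1 if i % 2 == 0 else -1 for i in range(452)]
stages, sizes = train_cascade(rows, labels, toy_trainer)
print(sizes)  # [452, 226, 113, 56, 28, 14]
```

Note that this only shows the bookkeeping. Whether such a chain generalizes out of sample is a separate question, since each later stage is fitted on fewer and fewer records.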

As a result, we have a "Pattern System" that recognizes the entire market over a three-month section.

So here is another way, besides "Context of the day", to break the market into subspaces and train so that you end up with exactly a "System of models". Here is an example...

 

To be honest, I partitioned into subspaces a little differently, but the essence remains the same.

There was a general file of 288 lines. I divided it into three samples; the number of records in each training sample is given in the "Total patterns" line.

* Sensitivity of generalization abiliy: 74.07407407407408%
* Specificity of generalization ability: 70.96774193548387%
* Generalization ability: 72.41379310344827%
* TruePositives: 20
* FalsePositives: 7
* TrueNegatives: 22
* FalseNegatives: 9
* Total patterns in out of samples with statistics: 58

The next one:

* Sensitivity of generalization abiliy: 61.904761904761905%
* Specificity of generalization ability: 60.0%
* Generalization ability: 60.869565217391305%
* TruePositives: 39
* FalsePositives: 24
* TrueNegatives: 45
* FalseNegatives: 30
* Total patterns in out of samples with statistics: 138

And the last one.

* Sensitivity of generalization abiliy: 69.04761904761905%
* Specificity of generalization ability: 66.0%
* Generalization ability: 67.3913043478261%
* TruePositives: 29
* FalsePositives: 13
* TrueNegatives: 33
* FalseNegatives: 17
* Total patterns in out of samples with statistics: 92
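The percentages in the three reports can be reproduced from the four counts. The formulas below are inferred from the numbers themselves, not taken from Jprediction documentation: the "sensitivity" figure matches TP/(TP+FP) and the "specificity" figure matches TN/(TN+FN), which in textbook terms are closer to precision and negative predictive value. A small check:

```python
def report(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Recompute the report figures (in percent) from the confusion counts."""
    total = tp + fp + tn + fn
    return {
        "sensitivity": 100.0 * tp / (tp + fp),        # share of correct "Yes" answers
        "specificity": 100.0 * tn / (tn + fn),        # share of correct "No" answers
        "generalization": 100.0 * (tp + tn) / total,  # share of all correct answers
        "total": total,
    }


# Counts copied from the three reports above.
samples = [
    report(tp=20, fp=7, tn=22, fn=9),
    report(tp=39, fp=24, tn=45, fn=30),
    report(tp=29, fp=13, tn=33, fn=17),
]
for s in samples:
    print(s)
print(sum(s["total"] for s in samples))  # 288, the size of the general file
```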

Certainly, each of them should be profitable on its own, but note that the total number of trades on this section is 54 (the basic strategy). And here is what happened when they all worked together at once.


 
This whole chart is out of sample, starting from 05.29 on the 15-minute timeframe. It's already the third week. If it does not earn any more, then the approach is basically worthless, but I believe... :-)