Machine learning in trading: theory, models, practice and algo-trading - page 184

 
Andrey Dik:

I previously described my approach of dividing into three classes (sell, fence, buy). The "fence" class takes in all cases that contradict each other or cannot be cleanly assigned to the buy or sell class; only 3-10% of cases end up in the buy and sell classes. The beauty of this approach is that, on unseen (real) data, over time the net stops recognizing market situations and starts assigning them to the "fence", i.e. it gradually stops trading. That is a hundred times better than making more and more entry mistakes as time goes on.
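A minimal sketch of this labeling idea; the threshold and the rule are illustrative assumptions, not Dik's actual criteria:

```python
# Hypothetical sketch: label a bar buy/sell only when the forward move is
# decisive in one direction; everything ambiguous goes to the "fence".
def label_bar(forward_return, threshold=0.002):
    """Return 'buy', 'sell', or 'fence' for a single bar.

    forward_return: relative price change over the lookahead window.
    threshold: minimum decisive move (an assumed parameter); smaller,
    contradictory moves are fenced off.
    """
    if forward_return > threshold:
        return "buy"
    if forward_return < -threshold:
        return "sell"
    return "fence"

returns = [0.005, -0.0001, 0.001, -0.004, 0.0]
labels = [label_bar(r) for r in returns]
# With a strict threshold, only a small share of bars earn buy/sell labels.
```

With such a rule the buy/sell share shrinks as the threshold tightens, which is exactly the 3-10% effect described above.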

But it is all to no avail; no one needs it, no one listens.

What is the fence? Is it when the buy net says buy and the sell net says sell?

It is the same as Reshetov's ternary.

 
Andrey Dik:

I previously described my approach of dividing into three classes (sell, fence, buy).

This is already implemented in jPrediction. That is why it is called not a binary (two-class), but a ternary (three-class) classifier.

It is implemented in a very trivial way:

  1. Training: train two binary classifiers on two non-intersecting parts of the sample.
  2. Cross-validation: test each binary classifier on the part of the sample it was not trained on.
  3. Classification: if both binary classifiers give the same reading, take the output value of either of them as the result. If their readings differ, the output is: "sit on the fence and smoke bamboo".
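Step 3 above can be sketched as follows (toy stand-in classifiers, not jPrediction's actual code):

```python
def ternary(clf_a, clf_b, x):
    """Combine two binary classifiers: if they agree, take the shared
    answer; if they disagree, 'fence' (no trade)."""
    a, b = clf_a(x), clf_b(x)
    return a if a == b else "fence"

# Toy stand-in classifiers (assumptions for illustration only):
clf_a = lambda x: "buy" if x > 0 else "sell"
clf_b = lambda x: "buy" if x > 0.5 else "sell"

print(ternary(clf_a, clf_b, 1.0))   # both say buy -> buy
print(ternary(clf_a, clf_b, 0.2))   # they disagree -> fence
```

The point of the scheme is that disagreement between the two independently trained halves is itself the "fence" signal.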

Previously the plan was to use the "bicycle" method:

  1. Train the binary classifier on one half of the sample
  2. Test it on the second half of the sample
  3. Using ROC analysis, raise one of the thresholds to increase sensitivity, and lower the second threshold to increase specificity.
  4. Classification: If the pattern to be classified is higher than the sensitivity threshold - buy. If the pattern is below the specificity threshold - sell. If the pattern is between both thresholds - sit on the fence and smoke bamboo.

However, the "bicycle" described above produces more false signals than classification by two binaries, because it lacks cross-validation; on the other hand, it is easier to implement.
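The two-threshold scheme in steps 3-4 amounts to the following (the threshold values here are illustrative, not the result of any real ROC analysis):

```python
def classify(score, sell_thr=0.35, buy_thr=0.65):
    """Two-threshold 'bicycle': one binary classifier score in [0, 1],
    two ROC-tuned cutoffs. Thresholds are assumed for illustration."""
    if score >= buy_thr:
        return "buy"       # above the sensitivity threshold
    if score <= sell_thr:
        return "sell"      # below the specificity threshold
    return "fence"         # between the thresholds: no trade

print(classify(0.9), classify(0.1), classify(0.5))
```

Widening the gap between the two thresholds trades fewer signals for fewer false ones.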

The dumbest and most futile ternary classifier, though the most primitive to implement, is an ANN with three outputs. If each output has its own classification threshold, there are not three but eight potentially possible states, of which only three are unambiguous (a value above the threshold on exactly one of the three outputs) and five are unclear how to interpret (values above the threshold on more than one output, or below the threshold on all three).
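The state count for the three-output ANN can be checked directly:

```python
from itertools import product

# Enumerate the threshold states of a 3-output ANN: each output is either
# above (1) or below (0) its own threshold, giving 2**3 = 8 states.
states = list(product([0, 1], repeat=3))

# Unambiguous: exactly one output fires (three such states).
unambiguous = [s for s in states if sum(s) == 1]
# Ambiguous: no output fires, or more than one does (five such states).
ambiguous = [s for s in states if sum(s) != 1]

print(len(states), len(unambiguous), len(ambiguous))  # 8 3 5
```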

 
mytarmailS:

What is the fence? Is it when the buy net says buy and the sell net says sell?

It is the same as Reshetov's ternary.

No. I won't give you the link, look it up.
 
Yury Reshetov:

This is already implemented in jPrediction. That is why it is called not a binary (two-class), but a ternary (three-class) classifier.

It is implemented in a very trivial way:

  1. Training: train two binary classifiers on two non-intersecting parts of the sample.
  2. Cross-validation: test each binary classifier on the part of the sample it was not trained on.
  3. Classification: if both binary classifiers give the same reading, take the output value of either of them as the result. If their readings differ, the output is: "sit on the fence and smoke bamboo".

Previously the plan was to use the "bicycle" method:

  1. Train the binary classifier on one half of the sample
  2. Test it on the second half of the sample
  3. Using ROC analysis, raise one of the thresholds to increase sensitivity, and lower the second threshold to increase specificity.
  4. Classification: If the pattern to be classified is higher than the sensitivity threshold - buy. If the pattern is below the specificity threshold - sell. If the pattern is between the thresholds - sit on the fence and smoke bamboo.

However, the above "bicycle" gives more false signals than classification by two binaries, but it is easier to implement.

The dumbest and most futile ternary classifier, though the most primitive to implement, is an ANN with three outputs. If each output has its own classification threshold, there are not three but eight potentially possible states, of which only three are unambiguous (a value above the threshold on exactly one of the three outputs) and five are unclear how to interpret (values above the threshold on more than one output, or below the threshold on all three).

There is another way, which you have not considered. The output is still a single one from the neural net, but its range of values is conditionally divided into three regions. I used [-1.5...1.5]; the middle region [-1.0...1.0] is the "fence". It turns out that the more familiar a situation is to the neural network, the more strongly it is excited, and the output values lean toward the extremes. Values outside [-1.0...1.0] are the corresponding Buy and Sell signals.
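A minimal sketch of this single-output, three-region scheme (which tail maps to Buy versus Sell is an assumption here, since the post does not say):

```python
def signal(out):
    """Single net output in [-1.5, 1.5]; the middle band [-1.0, 1.0]
    is the 'fence'. Assumption: positive tail = Buy, negative = Sell."""
    if out > 1.0:
        return "buy"
    if out < -1.0:
        return "sell"
    return "fence"

print(signal(1.3), signal(-1.2), signal(0.4))
```

Unlike the two-threshold ROC version, this band is symmetric around zero and is driven by how strongly the net is excited.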

But people still keep agonizing over binary classification.

 
Andrey Dik:

There is another way, which you have not considered. The output is still a single one from the neural net, but its range of values is conditionally divided into three regions. I used [-1.5...1.5]; the middle region [-1.0...1.0] is the "fence".

I didn't ignore it; you didn't read carefully. See the two-threshold "bicycle" method with one binary classifier, points 3 and 4, which I quote:

Yury Reshetov:
...

Previously the plan was to use the "bicycle" method:

  1. Train the binary classifier on one half of the sample
  2. Test it on the second half of the sample
  3. Using ROC analysis, raise one of the thresholds to increase sensitivity, and lower the second threshold to increase specificity.
  4. Classification: If the pattern to be classified is higher than the sensitivity threshold - buy. If the pattern is below the specificity threshold - sell. If the pattern is between the thresholds - sit on the fence and smoke bamboo.
...
 
Yury Reshetov:

The problem with your approach is that initially (before the ternary filtering of buy/sell signals) your ML models are trained on data of which they can explain maybe 5%, you see? Before the ternary filtering, the models have already been trained on noise, and their outputs are correspondingly noisy.

Andrey Dik:

I think it's the same story here...

==================================

My approach doesn't use ML at all in decision-making and doesn't try to explain the whole sample, only what it considers a strong statistical regularity; if such data make up only 0.01% of all the data, then only they will remain...

 
Andrey Dik:

...

But people still keep agonizing over binary classification.

Because most people find it easier to take a ready-made package with binary classification already implemented than to experiment with ternary classification. Not everyone likes to reinvent the wheel, because not all ideas give good results. Some find it easier to ride a ready-made bicycle, even one with square wheels.

When ternary classification is used at all, most machine learning sources suggest the most unpromising method: training an ANN with three outputs, which is easy to implement but completely unsuitable in practice.

 
Yury Reshetov:

Because most people find it easier to take a ready-made package with binary classification already implemented than to experiment with ternary classification. Not everyone likes to reinvent the wheel, because not all ideas give good results. Some find it easier to ride a ready-made bicycle, even one with square wheels.

When ternary classification is used at all, most machine learning sources suggest the most unpromising method: training an ANN with three outputs, which is easy to implement but completely unsuitable in practice.

Well, yes, I agree, it is.

But one thing is certain (pardon the tautology): binary classification is the worst thing that can be used for the market.

 
mytarmailS:

The problem with your approach is that initially (before the ternary filtering of buy/sell signals) your ML models are trained on data of which they can explain maybe 5%, you see? Before the ternary filtering, the models have already been trained on noise, and their outputs are correspondingly noisy.

Don't talk nonsense. jPrediction implements an algorithm that reduces the number of inputs, so that you don't end up with a model trained on noisy or low-value predictors. That is, it selects from a set of models with different combinations of predictors, keeping only the one with the best generalization ability.
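The idea behind that input-reduction step can be sketched as an exhaustive subset search. This is a toy illustration of the principle, not jPrediction's actual algorithm, and the out-of-sample scores below are made up:

```python
from itertools import combinations

def best_subset(predictors, score):
    """Try every combination of predictors and keep the one whose model
    has the best generalization score (sketch of the principle only)."""
    best, best_score = None, float("-inf")
    for r in range(1, len(predictors) + 1):
        for combo in combinations(predictors, r):
            s = score(combo)
            if s > best_score:
                best, best_score = combo, s
    return best

# Hypothetical out-of-sample accuracy per subset (made-up numbers):
oos = {("p1",): 0.52, ("p2",): 0.55, ("p3",): 0.50,
       ("p1", "p2"): 0.61, ("p1", "p3"): 0.53, ("p2", "p3"): 0.54,
       ("p1", "p2", "p3"): 0.58}

print(best_subset(["p1", "p2", "p3"], lambda c: oos[tuple(sorted(c))]))
# -> ('p1', 'p2')
```

Note that the full set ("p1", "p2", "p3") loses to the pair here: adding a noisy predictor hurts generalization, which is the point of the pruning.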
 
mytarmailS:

You see, we're trying to divide the whole sample into buy and sell classes, and in doing so we want to predict absolutely every market movement; but our predictors are so poor that they can objectively predict only ~3% of all movements. So what do we need? We need to try to capture at least that 3% and simply throw out the rest of the indivisible stuff, because it is the same garbage at the input / noise that needs to be sifted out / a cause of overfitting, etc. Call it what you want, it's all the same...

I see that you understand the cause of the problems, but I'm trying to solve them differently than you suggest.

I tend to follow SanSanych's advice: you need to select predictors and a target that are not garbage. With good predictors, the graph of training examples looks not like the one in my last post, but like Vizard_'s. That is much harder than eliminating conflicting training examples, but I think proper predictor selection will prove more reliable in the end.

I can't say anything about your method; I'm not well versed in it, but I hope you make it work.