Machine learning in trading: theory, models, practice and algo-trading - page 3609

 
Maxim Dmitrievsky #:

No

Very different principle, very different goals.
The whole is greater than the sum of its parts. Don't take it out of context. That way you don't see the elephant.

So we see things differently. Okay. (chuckles)

 
Aleksey Vyazmikin #:

So we see things differently. Okay. (chuckles)

The original point was this: reduce overfitting of the classifiers, which automatically implies better results on new data. This assumes that the data contain at least some regularities that are hidden by overfitting.

Random labeling of trades creates a lot of noise and contradictions. The method of getting rid of contradictions is indicated by Buddha.
You can adjust the settings and see what happens.

If the method shows no improvement at all in levelling the results between train and test, even when the labels in all clusters are corrected, then there is no alpha in the data.

This is roughly the meaning implied.


 
Maxim Dmitrievsky #:
The point is originally written

That idea is clear.

I just see how these probability shifts change over time.

Here is a quantum segment on the training sample

The same segment on validation (which I usually call test).

And the same segment below on the test sample (exam in my terms).

The graphs plot ZZ on the error balance (a weighted class balance: at a 50%/50% class split we add +1 for class "1" and subtract 1 for class "0"). In the dynamics of each sample there is a consistent bias in the pattern, i.e. examples with class "0" predominate, which shows up as a trend. Such a quantum segment (or cluster) will be good for training.
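To make the balance curve concrete, here is a minimal sketch of how such a cumulative class balance could be computed. The function name `class_balance_curve` and the weighting scheme for splits other than 50%/50% are my assumptions; the post only states the +1/-1 case.

```python
from itertools import accumulate


def class_balance_curve(labels, p1=None):
    """Cumulative weighted class balance for a sequence of 0/1 labels.

    p1 is the reference share of class 1. When it is not given, the share
    observed in `labels` is used (an assumption of this sketch).
    The weights are chosen so that the curve has zero expected drift at the
    reference share; at p1 = 0.5 they reduce to +1 and -1.
    """
    if p1 is None:
        p1 = sum(labels) / len(labels)
    if not 0.0 < p1 < 1.0:
        raise ValueError("both classes must be present")
    w1 = 0.5 / p1          # step up for an example of class 1
    w0 = 0.5 / (1.0 - p1)  # step down for an example of class 0
    steps = [w1 if y == 1 else -w0 for y in labels]
    return list(accumulate(steps))


# Hypothetical labels of the examples that fall into one quantum segment.
curve = class_balance_curve([1, 0, 0, 1, 0, 0, 0, 1], p1=0.5)
print(curve)
```

A curve that keeps drifting down means class "0" dominates inside the segment; a flat curve means the segment carries no shift relative to the reference share.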

And here is another variant in the graphs below: three samples in a row (train, test, exam).


On train there was a clear trend and class "1" was more frequent than average. On the "test" sample the situation changed: there is already a shift towards zero, not large in this example, though it can be large. On exam the situation normalised again, with a trend and a probability shift over about 2/3 of the sample, after which it went flat again.

There may be a case when test and exam show the opposite shift of probability throughout the whole sample, i.e. a certain regularity has stopped working.
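A rough way to check this numerically is to compare the probability shift of one segment across the three samples. This is only a sketch: the helper names, the example labels and the `min_abs` tolerance are hypothetical and not taken from the post.

```python
def probability_shift(segment_labels, base_rate):
    """Share of class 1 inside the segment minus the base rate of the sample."""
    return sum(segment_labels) / len(segment_labels) - base_rate


def shift_is_stable(shifts, min_abs=0.02):
    """True when all samples show a shift of the same sign and of at least
    `min_abs` in magnitude (the tolerance is an arbitrary choice)."""
    signs = {s > 0 for s in shifts}
    return len(signs) == 1 and all(abs(s) >= min_abs for s in shifts)


# Hypothetical labels of the examples falling into one quantum segment,
# together with the base rate of class 1 in each sample.
samples = {
    "train": ([1, 1, 0, 1, 1, 0, 1, 1], 0.5),
    "test": ([0, 1, 0, 0, 1, 0, 1, 0], 0.5),
    "exam": ([1, 1, 0, 1, 0, 1, 1, 0], 0.5),
}
shifts = {name: probability_shift(y, base) for name, (y, base) in samples.items()}
stable = shift_is_stable(list(shifts.values()))
print(shifts, stable)
```

In this made-up data the shift is positive on train and exam but negative on test, so the segment would be flagged as unstable.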

 
Aleksey Vyazmikin #:

This idea is understandable.


I wrote: fix the labels for all clusters, say 200 of them, and compare training with OOS. If everything is bad, there is no point in working with such a sample.
If everything is fine, reduce the number of corrections to get a smoother balance curve, as long as the curve on train stays similar to the curve on test.
 
Maxim Dmitrievsky #:
That's what I wrote: fix the labels for all clusters, say 200 of them, and compare training with OOS. If everything is bad, there is no point in working with such a sample.
If everything is fine, reduce the number of corrections to get a smoother balance curve, as long as the curve on train stays similar to the curve on test.

In theory, why would the model work on new data if the probabilities shift? I get great models when I look for stable probabilities; when I don't, there are a lot of false positives and poor training results. I assume you picked predictors that work well under the general partitioning or on a short interval, since you consistently get good models.

In fact, we know that there are correct predictors, but because of the "false" ones, the model breaks down.

So far I have not implemented your approach exactly; I just look at the probability shift after clustering and am not satisfied with the result. Perhaps I have too many predictors and should take not all columns (predictors) but, as you wrote earlier, around ten.

 
Aleksey Vyazmikin #:

In theory, why would the model work on new data if the probabilities shift? I get great models when I look for stable probabilities; when I don't, I get a lot of false positives and poor training results. I assume you picked predictors that work well under the general partitioning or on a short interval, since you consistently get good models.


This is quite a difficult question for me, or I simply haven't thought about it; maybe it needs someone well versed in probability theory.

I never knew it, and have since forgotten it.

Example

Suppose you have three clusters with probabilities of a profitable trade:

  • Probability of a profitable trade for cluster 1: $P(A_1) = 0.6$
  • Probability of a profitable trade for cluster 2: $P(A_2) = 0.5$
  • Probability of a profitable trade for cluster 3: $P(A_3) = 0.4$

Calculation of the final probability

For independent events, the probability that at least one of the events will occur can be calculated as follows:

  1. The probability that none of the events will occur:

    $$P(\text{none of the events will occur}) = \prod_{i=1}^{n} \bigl(1 - P(A_i)\bigr)$$

    where $P(A_i)$ is the probability of a profitable trade for cluster $i$.

    For our example:

    $$P(\text{none of the events will occur}) = (1 - 0.6) \times (1 - 0.5) \times (1 - 0.4) = 0.4 \times 0.5 \times 0.6 = 0.12$$
  2. The probability that at least one of the events will occur:

    $$P(\text{at least one of the events will occur}) = 1 - P(\text{none of the events will occur}) = 1 - 0.12 = 0.88$$

Total

The final probability of a profitable trade, given the probabilities for the three clusters, is 0.88 or 88%.

Explanation

  • The probability that none of the events occurs is calculated as the product of the probabilities of the opposite events (i.e., the probabilities that the trade will not be profitable in each cluster).
  • The probability that at least one of the events will occur is calculated as the complement of the probability that none of the events will occur.

This method makes it possible to take the independence of the events into account correctly and to calculate the final probability of a profitable trade.
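The arithmetic above can be reproduced with a few lines of Python. This is just a sketch; the function name `prob_at_least_one` is mine. Note what the number actually measures: the chance that at least one of three independent trades is profitable, which is not the same as the win probability of a single trade.

```python
import math


def prob_at_least_one(probs):
    """P(at least one event occurs) for independent events with the given
    individual probabilities: 1 minus the product of the complements."""
    return 1.0 - math.prod(1.0 - p for p in probs)


p_any = prob_at_least_one([0.6, 0.5, 0.4])
print(round(p_any, 2))
```

Adding more independent clusters can only push this value up, even if each new cluster has a win probability below 0.5.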

 
P.S. I think the proposed calculation is too optimistic.
We should probably also take into account the probability of each cluster occurring. Or even calculate it differently.
 
Maxim Dmitrievsky #:
P.S. I think the proposed calculation is too optimistic.
We should probably also take into account the probability of each cluster occurring. Or even calculate it differently.

That's better.

If the alternation of cluster occurrences obeys some probability law, then this law must be taken into account when calculating the final probability of a profitable trade. Let's consider an example for three clusters.

Example for three clusters

Suppose you have three clusters with probabilities of a profitable trade:

  • $P(A_1) = 0.6$
  • $P(A_2) = 0.5$
  • $P(A_3) = 0.4$

Also assume that the probabilities of each cluster occurring obey some probability law. For example, they may be as follows:

  • $P(\text{cluster 1}) = 0.3$
  • $P(\text{cluster 2}) = 0.5$
  • $P(\text{cluster 3}) = 0.2$

Calculating the final probability

To calculate the final probability of a profitable trade, we need to combine the probability of each cluster occurring with its probability of a profitable trade.

  1. The probability of a profitable trade for each cluster, weighted by the probability of that cluster occurring:

    $$P(\text{profitable trade for cluster 1}) = P(\text{cluster 1}) \times P(A_1) = 0.3 \times 0.6 = 0.18$$
    $$P(\text{profitable trade for cluster 2}) = P(\text{cluster 2}) \times P(A_2) = 0.5 \times 0.5 = 0.25$$
    $$P(\text{profitable trade for cluster 3}) = P(\text{cluster 3}) \times P(A_3) = 0.2 \times 0.4 = 0.08$$
  2. Final probability of a profitable trade:

    $$P(\text{profitable trade}) = P(\text{profitable trade for cluster 1}) + P(\text{profitable trade for cluster 2}) + P(\text{profitable trade for cluster 3}) = 0.18 + 0.25 + 0.08 = 0.51$$

Total

The final probability of a profitable trade, given the probability of each cluster occurring and its probability of a profitable trade, is 0.51 or 51%.

Explanation

  • The probability of a profitable trade for each cluster is calculated as the product of the probability of that cluster occurring and the probability of a profitable trade in that cluster.
  • The final probability of a profitable trade is calculated as the sum of these products over all clusters.

This method makes it possible to take the probabilistic law of cluster occurrence into account and to calculate the final probability of a profitable trade.
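This is the law of total probability, and it can be sketched as follows. The function name `total_win_probability` is hypothetical, and the sketch assumes the cluster probabilities sum to 1 (exactly one cluster occurs per trade).

```python
def total_win_probability(cluster_probs, win_probs):
    """Law of total probability: sum over clusters of
    P(cluster occurs) * P(profitable trade in that cluster)."""
    if len(cluster_probs) != len(win_probs):
        raise ValueError("one win probability is needed per cluster")
    if abs(sum(cluster_probs) - 1.0) > 1e-9:
        raise ValueError("cluster probabilities must sum to 1")
    return sum(c * w for c, w in zip(cluster_probs, win_probs))


cluster_probs = [0.3, 0.5, 0.2]  # probability of each cluster occurring
win_probs = [0.6, 0.5, 0.4]      # probability of a profitable trade per cluster
p_win = total_win_probability(cluster_probs, win_probs)
print(round(p_win, 2))
```

The result is a weighted average of the per-cluster probabilities, so it always lies between the worst and the best cluster (here between 0.4 and 0.6).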

 

Overall conclusion:

Combining clusters can lead to an improvement in the overall probability of a profitable trade, but this depends on the specific conditions and characteristics of the clusters. Let's look at a few scenarios in which combining clusters can be useful.

Scenario 1: Independent events

If the events (profitable trades) in different clusters are independent, then combining clusters can increase the overall probability of a profitable trade. This is because the probability that at least one of the events will occur increases as the number of independent events increases.

Example

Suppose you have three clusters with probabilities of a profitable trade:

  • $P(A_1) = 0.6$
  • $P(A_2) = 0.5$
  • $P(A_3) = 0.4$

The probability that at least one of the events will occur:

$$P(\text{at least one of the events will occur}) = 1 - P(\text{none of the events will occur})$$
$$P(\text{none of the events will occur}) = (1 - 0.6) \times (1 - 0.5) \times (1 - 0.4) = 0.4 \times 0.5 \times 0.6 = 0.12$$
$$P(\text{at least one of the events will occur}) = 1 - 0.12 = 0.88$$

Thus, under independence, the probability that at least one of the trades is profitable rises to 0.88 or 88%.

Scenario 2: Dependent Events

If the events in different clusters are dependent, then combining clusters may not significantly improve the overall probability of a profitable trade. In this case, conditional probabilities and correlations between events must be considered.

Example

Suppose you have three clusters with probabilities of a profitable trade:

  • $P(A_1) = 0.6$
  • $P(A_2) = 0.5$
  • $P(A_3) = 0.4$

If the events are dependent, the inclusion-exclusion formula must be used to calculate the total probability of a profitable trade:

$$P(A_1 \cup A_2 \cup A_3) = P(A_1) + P(A_2) + P(A_3) - P(A_1 \cap A_2) - P(A_1 \cap A_3) - P(A_2 \cap A_3) + P(A_1 \cap A_2 \cap A_3)$$

If the conditional probabilities and correlations between events are unknown, the computation of the overall probability becomes more complicated.
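To see how much dependence matters, here is a small sketch that applies inclusion-exclusion to an explicit joint distribution. The joint distribution below is made up for illustration: it keeps the same marginals (0.6, 0.5, 0.4) but makes the events fully nested, which is an extreme case of dependence.

```python
from itertools import combinations


def prob_all(joint, idx):
    """P(every event in idx occurs), from a joint pmf over outcome tuples."""
    return sum(p for outcome, p in joint.items() if all(outcome[i] for i in idx))


def union_inclusion_exclusion(joint, n):
    """P(A_1 or ... or A_n) via the inclusion-exclusion formula."""
    total = 0.0
    for k in range(1, n + 1):
        sign = 1.0 if k % 2 == 1 else -1.0
        for idx in combinations(range(n), k):
            total += sign * prob_all(joint, idx)
    return total


# Hypothetical joint distribution: A_3 implies A_2, and A_2 implies A_1.
# Keys are (A_1, A_2, A_3) outcomes; marginals are 0.6, 0.5 and 0.4.
joint = {
    (1, 1, 1): 0.4,
    (1, 1, 0): 0.1,
    (1, 0, 0): 0.1,
    (0, 0, 0): 0.4,
}
union = union_inclusion_exclusion(joint, 3)
direct = 1.0 - joint[(0, 0, 0)]  # complement of "no event occurs"
print(round(union, 2), round(direct, 2))
```

With these nested events the union is only 0.6, compared with 0.88 under independence, so the same marginals can give very different combined probabilities.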

Scenario 3: Probabilistic law of cluster occurrence

If the alternation of clusters obeys some probabilistic law, then combining clusters can improve the overall probability of a profitable trade, provided the cluster occurrence probabilities and the per-cluster probabilities of a profitable trade are distributed so that they increase the overall probability.

Example

Suppose you have three clusters with probabilities of a profitable trade:

  • $P(A_1) = 0.6$
  • $P(A_2) = 0.5$
  • $P(A_3) = 0.4$

And the probabilities of each cluster occurring:

  • $P(\text{cluster 1}) = 0.3$
  • $P(\text{cluster 2}) = 0.5$
  • $P(\text{cluster 3}) = 0.2$

Probability of a profitable trade for each cluster, weighted by the probability of that cluster occurring:

$$P(\text{profitable trade for cluster 1}) = 0.3 \times 0.6 = 0.18$$
$$P(\text{profitable trade for cluster 2}) = 0.5 \times 0.5 = 0.25$$
$$P(\text{profitable trade for cluster 3}) = 0.2 \times 0.4 = 0.08$$

The final probability of a profitable trade:

$$P(\text{profitable trade}) = 0.18 + 0.25 + 0.08 = 0.51$$

Thus, with the clusters combined, the overall probability of a profitable trade is 0.51 or 51%.

Conclusion

Combining clusters can improve the overall probability of a profitable trade if the events are independent, or if the cluster occurrence probabilities and the per-cluster probabilities of a profitable trade are distributed in a way that increases the overall probability. In the case of dependent events, conditional probabilities and correlations between events must be taken into account.

 
Maxim Dmitrievsky #:

Overall conclusion:


What is a "profitable trade"? What probability is that: 0.5, 0.6...?

Where is the threshold that should formalise the notion of a "profitable trade"? The problem is that the calculated threshold must still hold as a threshold at least at the next prediction step.
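One way to probe this is to pick the threshold on one window and see whether it still separates profitable trades on the next one. The sketch below is only an illustration under my own assumptions: the helper names, the candidate thresholds, the `min_trades` filter and all the numbers are hypothetical.

```python
def win_rate_above(probs, outcomes, threshold):
    """Share of profitable trades among those whose predicted probability
    is at or above the threshold; None when no trade qualifies."""
    picked = [o for p, o in zip(probs, outcomes) if p >= threshold]
    return sum(picked) / len(picked) if picked else None


def pick_threshold(probs, outcomes, candidates, min_trades=3):
    """Candidate threshold with the highest win rate on the fitting window,
    ignoring candidates that leave fewer than `min_trades` trades."""
    best, best_rate = None, -1.0
    for t in candidates:
        n = sum(p >= t for p in probs)
        rate = win_rate_above(probs, outcomes, t)
        if n >= min_trades and rate is not None and rate > best_rate:
            best, best_rate = t, rate
    return best, best_rate


# Hypothetical predicted probabilities and trade outcomes (1 = profitable).
fit_probs = [0.45, 0.55, 0.62, 0.70, 0.58, 0.66, 0.40, 0.73]
fit_outcomes = [0, 0, 1, 1, 0, 1, 0, 1]
next_probs = [0.61, 0.65, 0.52, 0.71, 0.68]
next_outcomes = [0, 1, 1, 0, 1]

threshold, fit_rate = pick_threshold(fit_probs, fit_outcomes, [0.5, 0.6, 0.7])
next_rate = win_rate_above(next_probs, next_outcomes, threshold)
print(threshold, fit_rate, next_rate)
```

In this made-up data the threshold that looks perfect on the fitting window gives only a coin-flip win rate on the next window, which is exactly the failure the question is about.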