Z-Score Computation

 

In this article, https://www.mql5.com/en/articles/1492, the author presents the following method of computing the Z-score:

Z=(N*(R-0.5)-P)/((P*(P-N))/(N-1))^(1/2)

where:
N - total amount of trades in a series;
R - total amount of series of profitable and losing trades;
P = 2*W*L;
W - total amount of profitable trades in the series;
L - total amount of losing trades in the series.

I'm sure some of you are familiar with this, as I've seen this (very informative) article referenced several times.

I would like to implement this, but the definitions of W, L and N seem rather vague:

If a series is (as defined above) a sequence of only +'s or only -'s (+ meaning a profitable trade, - meaning a loss), then what does it mean to say that "W is the total amount of profitable trades in the series"? That is already determined by the type of series we have (+ or -, win or loss).

For any given series, if it is a "+" series, W will be the total number of trades in that series (defined above as N), since all of the trades in a "+" series are wins. The instant there is a loss, that series is over and a new series, one of -'s, begins. L for a "+" series will always be 0, and vice versa for a series of losses.

Thus, if the definitions above are taken as read (or at least as I understand them) the following should be true:

For any "+" series: W=N, L=0, P=0 !!

For any "-" series: W=0, L=N, P=0 !!

((P*(P-N))/(N-1))^(1/2): in either case P = 2*W*L is zero, so the denominator is always zero and the function is undefined!

Clearly I am missing something about the way W and L are defined, but as it is worded I struggle to find a different interpretation.

Does it mean that W is simply the number of winning trades, L the number of losing trades, and N the total number of trades overall (over the whole testing interval)? If so, why define them in terms of a "series" at all? Perhaps W is the average number of winning trades per series of wins, L the average number of losing trades per series of losses, and N the total number of trades. That would make more sense to me, but it doesn't work: eventually N becomes greater than P and you end up taking the square root of a negative value.

Has anyone successfully implemented this? If so, what did you take for W & L?

 

http://books.google.com/books?id=iUSzmtLJ3AgC&pg=PA8&lpg=PA8&dq=trading+Dependence+z-score&source=bl&ots=Ztgo5Eap9y&sig=YFBSCNgkBwrDBj6NN7NT70Kzvik&hl=en&ei=ZjKqTcibBcLpgAey6IT0BQ&sa=X&oi=book_result&ct=result&resnum=8&ved=0CEkQ6AEwBw#v=onepage&q=trading%20Dependence%20z-score&f=false

Sorry, it wouldn't let me link, so you will have to copy and paste. Since you're mathematically inclined, could you post the code that does the calculations if you write it? You may also want to private message Rosh, who wrote the article, and ask him why he used that shorthand.

 

After reading the article and the comments, if I understand correctly it should state:

W - total amount of profitable trades in all profitable series;
L - total amount of losing trades in all losing series.

The article is ambiguous on this point, but there is a somewhat clearer comment:

"Win trades plus Loss trades must be eqauls to Total trades: W+L=N "

o/
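A minimal sketch of the W + L = N reading, written in Python rather than MQL so it can be run anywhere. It assumes W and L are the win and loss counts over the whole history and R is the number of series; the function name z_score_from_counts is made up for illustration and does not come from the article.

```python
import math

def z_score_from_counts(wins, losses, series):
    """Z-score from whole-history counts; returns None when it is undefined."""
    n = wins + losses              # N: total trades, using W + L = N
    p = 2.0 * wins * losses        # P = 2*W*L
    if n < 2:
        return None
    variance = p * (p - n) / (n - 1)
    if variance <= 0:              # e.g. all trades are wins or all are losses
        return None
    return (n * (series - 0.5) - p) / math.sqrt(variance)

# 10 trades that strictly alternate: 5 wins, 5 losses, 10 series
z_alternating = z_score_from_counts(5, 5, 10)
# 10 trades in two streaks (5 wins, then 5 losses): 2 series
z_streaky = z_score_from_counts(5, 5, 2)
print(z_alternating, z_streaky)
```

With these inputs the alternating sequence gives a positive Z and the streaky one a negative Z, which is the sign convention the article describes.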

 
I have seen this in the comments as well: W is simply wins, L is losses, N is the number of trades. It seems to give results that are more or less on the right scale but, from what I observed, are inaccurate. It is very difficult to get negative values for the Z-score, so the result rarely implies a "positive dependency", even when I applied it to a strategy that wins 99% of the time (TP at 1 pip, SL at 1000). Clearly, for such a case Z should be strongly negative, but it is not. Something still seems off. I did PM Rosh but have had no answer for several days, which is why I took to the forums. I will post the code that gathers all the quantities for this tomorrow.
 
I'm looking forward to the answer :)))
 
Might also check out The Trade Encouragement Factor
 

Thanks, I think I may have come up with something similar. Actually I came up with an alternative method for computing Z-score of a trading system.

The following code builds two arrays, one of series of wins and one of series of losses (denoted pSeries & mSeries). Each element holds the length of one series, and the number of series of each kind is stored in W & L, respectively.

         static int pSeries[], mSeries[]; bool new_series=false;
         //--- LastProfit is basically OrderProfit()
         static int tik; static double profit[2];   // doubles, so fractional profits are not truncated to 0
         tik++; if(tik>1) tik=0; profit[tik]=LastProfit; 
         if((profit[0]>=0 && profit[1]<=0) || (profit[0]<=0 && profit[1]>=0)) new_series=true;

         if(LastProfit > 0){
            if(new_series) ArrayResize(pSeries,ArraySize(pSeries)+1);
            pSeries[ArraySize(pSeries)-1]++;}
         else if(LastProfit < 0){
            if(new_series) ArrayResize(mSeries,ArraySize(mSeries)+1);
            mSeries[ArraySize(mSeries)-1]++;}
         
         double W = ArraySize(pSeries);
         double L = ArraySize(mSeries);

Also, let's define "Wins" as the total number of wins over your entire backtest (also the sum of the pSeries array), "Losses" as the number of losses over your backtest (sum of the mSeries array), and N as the total number of trades (N=Wins+Losses).

Suppose now that you take the following ratios:

WR = W / Wins

-and-

LR = L / Losses

..what do these quantities tell us?

Suppose that you have a perfectly negatively dependent system (a win is ALWAYS followed by a loss, and vice versa). What would the series array of such a system look like? It would be [1,1,1,1,1.....]: each series consists of just one win or one loss.

What about a perfectly positively dependent system? What would its series array look like? It would just be [Wins] or [Losses]: only one entry, its value equal to the number of wins or losses in your testing period (since they all occurred consecutively). Any trading system must therefore fall between these two extremes.

In terms of WR & LR, the two extremes can be represented as:

WR = 1, LR = 1 for perfect negative dependency (since the size of each array is equal to the total number of wins or losses)

WR = 1/Wins, LR = 1/Losses for a perfect positive dependency (arrays are of size 1, Wins/Losses are whatever they are)
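The two extremes can be checked with a small Python sketch (Python rather than MQL so it runs outside MetaTrader). The helper names series_lengths and ratios are hypothetical, and trades are represented simply as True for a win and False for a loss.

```python
from itertools import groupby

def series_lengths(outcomes):
    """Split a list of booleans (True = win) into win-series and loss-series lengths."""
    p_series, m_series = [], []
    for is_win, run in groupby(outcomes):
        (p_series if is_win else m_series).append(len(list(run)))
    return p_series, m_series

def ratios(outcomes):
    """Return (WR, LR): number of series divided by number of trades of that kind."""
    p_series, m_series = series_lengths(outcomes)
    wins, losses = sum(p_series), sum(m_series)
    wr = len(p_series) / wins if wins else 0.0
    lr = len(m_series) / losses if losses else 0.0
    return wr, lr

alternating = [True, False] * 4          # + - + - + - + -
streaky = [True] * 4 + [False] * 4       # + + + + - - - -
print(series_lengths(alternating), ratios(alternating))
print(series_lengths(streaky), ratios(streaky))
```

The alternating sequence yields arrays of all ones with WR = LR = 1, while the streaky one yields a single entry per array with WR = 1/Wins and LR = 1/Losses.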

Simply adding these two quantities will give us an overall picture of how each array contributes to the overall system.

WR + LR = Overall dependency factor (F)

F falls within the range [2/N, 2] for any given system (2/N being perfectly positive, 2 being perfectly negative); hence F can be expressed as a percentage of this range. Let's call this P.

P = (WR + LR) / (2 - 2/N). If P is taken as a normally distributed random variable, we can compute its mean as mu = sum(P)/N.

All that remains is to calculate the standard deviation of P over your testing period. The following code does this:

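         static double varP;                  // running variance term, kept separate from its root
         varP += MathPow(P-mu,2)/n;
         double sigmaP = MathSqrt(varP);      // root taken on a copy so the accumulator is not overwritten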

And Z can be computed using the standard formula

Z = (P - mu) / sigmaP

I have tested this method using several trading systems and the results seem reasonable and as expected. A system where TP = 1, SL = 100 yields a Z-score of -3.24, which is to be expected since many consecutive wins are separated by one (huge) loss. The strategy I use for live trading has Z = -2.48, a very believable value. Other strategies that alternate wins/losses give positive values, and a strategy I put together based on the random number function MathRand() gave values between -1 and 1 (indeterminate dependency).

I invite your feedback on this method.

 

Here's the code for the whole thing. It's taken out of context, so you'll need to adapt it to your EA in order to implement it. It's meant to be run either at the closing of an order or inside deinit() over OrdersHistoryTotal().

         //--- Z-Score
         static int pSeries[], mSeries[]; bool new_series=false;
         static int tik; static double profit[2]; tik++;   // doubles, so fractional profits are not truncated to 0
         if(tik>1) tik=0; profit[tik]=LastProfit; 
         if((profit[0]>=0 && profit[1]<=0) || (profit[0]<=0 && profit[1]>=0)) new_series=true;
         if(LastProfit > 0){
            if(new_series) ArrayResize(pSeries,ArraySize(pSeries)+1);
            pSeries[ArraySize(pSeries)-1]++;}
         else if(LastProfit < 0){
            if(new_series) ArrayResize(mSeries,ArraySize(mSeries)+1);
            mSeries[ArraySize(mSeries)-1]++;}
         double W=ArraySize(pSeries);   // number of winning series
         double L=ArraySize(mSeries);   // number of losing series
         //--- Wins, Loss and n (total trades) are counters maintained elsewhere in the EA
         double W_Ratio=0, L_Ratio=0, P=0, Z=0;   // declared up front; dotted names such as W.Ratio are rejected by current compilers
         if(Wins>0) W_Ratio=W/Wins;   //.. 1 = strongly negative correlation (+-+-+-), 1/Wins = strongly positive (++++++)
         if(Loss>0) L_Ratio=L/Loss;   //.. 1 = strongly negative correlation (+-+-+-), 1/Loss = strongly positive (------)
         if(n>1) P = (W_Ratio+L_Ratio)/(2-(2.0/n));   // 2.0 avoids integer division when n is an int
         static double mu, varP; mu += P/n;
         varP += MathPow(P-mu,2)/n;                   // accumulate the variance term
         double sigmaP = MathSqrt(varP);              // root taken on a copy so the accumulator is not overwritten
         if(sigmaP>0) Z = (P-mu)/sigmaP;
 

I created an MQL5 solution to the formula of the OP (although the code would probably work in MT4 w/o much modification).

I'm adding my code to clarify the discussion and how the formula can be calculated.

You will need to incorporate the member variables into your own solution.

I tested my version against the Z-Score given by the MT5 strategy tester, and the results are identical.

The Ralph Vince book, The Mathematics of Money Management, gives a pretty good description of how the value is calculated with an example.

The only value that is non-obvious is m_series. To get this, count the first trade as the start of the first series, then increment this member variable whenever you switch from profit to loss or from loss to profit. So, using +/- to indicate profits/losses:

+ - + + - + + -

gives an m_series of 6.

+ + + - - - +

gives an m_series of 3.
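The two examples above can be reproduced with a short Python sketch (not MQL, so it can be run outside the terminal). The function name count_series is made up for illustration, and it assumes the count starts at 1 for the first trade, which is what makes both examples come out as stated.

```python
def count_series(signs):
    """Count runs of identical symbols in a string such as '+ - + +'."""
    symbols = [s for s in signs if s in "+-"]    # ignore the spaces
    if not symbols:
        return 0
    series = 1                                   # the first trade opens the first series
    for prev, cur in zip(symbols, symbols[1:]):
        if cur != prev:                          # profit -> loss or loss -> profit
            series += 1
    return series

print(count_series("+ - + + - + + -"))
print(count_series("+ + + - - - +"))
```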

/* 
    • Positive Z (negative dependence):
        ◦ a profit will be followed by a loss, whereas a loss will be followed by a profit.
    • Negative Z (positive dependence):
        ◦ a profit will be followed by a profit, and a loss will be followed by a loss.
        
   A Z-score below -3 indicates positive dependence (a win followed by a win, a loss
   followed by a loss) at the 3 sigma confidence level (99.73%).
   
   Note: there is no dependence upon the amount won; simply # of wins/losses 
   and their sequence.
 */
double CMoneyManagement::GetZScore(void)
{
   // Ralph Vince talks about 30. That may be a bit much.
   const int Z_SCORE_WINDOW = 10;
   const double Z_SCORE_DEFAULT_VALUE = 0.0;
 
   // Return value
   double Z = Z_SCORE_DEFAULT_VALUE;
  
   if ( m_totalTrades < Z_SCORE_WINDOW )
   {
      Print("Z-Score not primed");
      return Z;
   }

   // P equals 2 x W x L
   //     • W is the total number of winning trades during a series
   //     • L is the total number of losing trades during a series
   const double W = m_profitTrades;
   const double L = m_lossTrades;
   const double P=2.0*W*L;
   Print(StringFormat("Z P [%.2f]", P));

   // Z = [N x (R - 0.5) - P] / [(P x (P - N)) / (N - 1)]^(1/2)
   //     • N is the total number of trades during a series;
   //     • R is the total number of series of winning and losing trades;
   const double N = m_totalTrades;
   const double R = m_series;

   const double numerator = (N*(R-0.5)-P);
   const double denominator = MathSqrt((P*(P-N))/(N-1));
   if ( denominator != 0.0 )
   {
      Z=numerator/denominator;   
   }

   Print(StringFormat("Z Z [%.2f]", Z));
   
   return Z;
}
 

Anthony Garot:

I tested my version against the Z-Score given by the MT5 strategy tester, and the results are identical.

The Ralph Vince book, The Mathematics of Money Management, gives a pretty good description of how the value is calculated with an example.

I found a discrepancy on how Z-Score is calculated based upon what constitutes a "win."

Specifically:

(1)

In Ralph Vince's book (as mentioned above), he gives an example:

-3 +2 +7 -4 +1 -1 +1 +6 -1 0 -2 +1

which—as he states—yields a Z-score of .9082951063. As Vince goes through the math, he specifically counts the "0" as a minus (a loss).

He says, "Note that a trade with a P&L of 0 is regarded as a loss."

Thus, in my code:

if (profitLoss > 0) win++;

(2)

MT5 counts a gain of $0.00 (with $0.00 swap) to be a profit trade, not a loss. See: TesterStatistics(STAT_PROFIT_TRADES) with $0 trade for evidence.

That is to say, if I want my Z-Score to match MT5's Z-Score, I need to use:

if (profitLoss >= 0) win++;


Ergo, a discrepancy.
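To see how much the choice matters, here is a Python sketch (not MQL) that runs the formula from the original post over Vince's sample under both rules. The function z_score and its zero_is_win flag are hypothetical names, and the sketch assumes the sample contains at least one win and one loss so the denominator is non-zero.

```python
import math

def z_score(pnl, zero_is_win):
    """Z-score of a list of trade results, using Z = (N*(R-0.5)-P)/sqrt(P*(P-N)/(N-1))."""
    flags = [(x >= 0) if zero_is_win else (x > 0) for x in pnl]
    n = len(flags)                                   # N: total trades
    w = sum(flags)                                   # W: winning trades
    l = n - w                                        # L: losing trades
    r = 1 + sum(a != b for a, b in zip(flags, flags[1:]))   # R: number of series
    p = 2.0 * w * l
    return (n * (r - 0.5) - p) / math.sqrt(p * (p - n) / (n - 1))

sample = [-3, 2, 7, -4, 1, -1, 1, 6, -1, 0, -2, 1]
z_vince = z_score(sample, zero_is_win=False)   # "if (profitLoss > 0) win++;"
z_mt5 = z_score(sample, zero_is_win=True)      # "if (profitLoss >= 0) win++;"
print(z_vince, z_mt5)
```

With the 0 counted as a loss there are 6 wins and 8 series, which reproduces Vince's 0.908. With the 0 counted as a win there are 7 wins and 10 series, and the same formula gives roughly 2.29, so a single break-even trade changes the result considerably on a sample this small.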