Why is Strategy Tester so unreliable?

 

I had abandoned Strategy Tester for more than a year.

Then I decided to try it again.

Terrible experience just like in the old times.

I write an EA that is an absolute winner in the backtests, then I put it to work on a live account (demo, of course) for three days and it loses money for three days straight like a drunken sailor.

That's happened with half a dozen EAs I have written. Apparently, Strategy Tester results are completely meaningless.

What is Strategy Tester ever good for?

 
It is very useful for checking whether your EA does what you designed it to do: whether it buys, sells, and closes when it should (as designed).
 

You should use at most 80-90% of the available historical data to build and test your strategy or entry points, then use the remaining 10-20% to verify them. Most of my ideas failed the verification phase. Most ideas do not work at the very beginning. After adding some filters or optimization, an idea may seem to work, but it only works against the historical data it was tuned on. Unfortunately we don't know the future, so we don't know when the exact same setups will repeat themselves.
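A minimal sketch of that split, assuming the history is addressed by calendar dates and that the verification part is always the most recent slice. The function name `split_history` and the 0.8 default are only illustrative; the resulting date ranges would be entered by hand as the tester's test period.

```python
from datetime import date, timedelta


def split_history(first_day, last_day, in_sample_fraction=0.8):
    """Split a date range chronologically into build and verification parts.

    The verification (out-of-sample) part is the most recent data, so the
    strategy is checked on prices it was never tuned against.
    """
    if not 0.0 < in_sample_fraction < 1.0:
        raise ValueError("in_sample_fraction must be between 0 and 1")
    total_days = (last_day - first_day).days
    cut = first_day + timedelta(days=int(total_days * in_sample_fraction))
    in_sample = (first_day, cut)
    out_of_sample = (cut + timedelta(days=1), last_day)
    return in_sample, out_of_sample


# Hypothetical example: 11 months of history.
build_range, verify_range = split_history(date(2023, 1, 1), date(2023, 11, 30))
print("build/optimize on:", build_range)
print("verify on:        ", verify_range)
```

The key point is that nothing in the verification range is looked at, or optimized against, until the strategy is finished.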

Another issue is network latency if your strategy is heavily OnTick-based.

 


I don't understand what you mean by "verification" and that 80-90% separation. Would you like to elaborate a little? Please?

 

It's about over-optimization. Holding back part of the history lets you test your strategy on out-of-sample historical data. If the strategy or entry point never really worked, the curve will change dramatically on that data. Machine learning practice commonly uses 80% of the data for model learning and 20% for model verification in order to avoid or mitigate over-fitting or over-optimization. 20% is not a rule; just make sure that you have enough historical data for verification. It saves you some time on live-market verification, because you know that an idea doesn't work before putting it in the live market. Of course live-market verification is still needed, but you can try many more ideas this way. What you test in the live market is then a possible gem rather than an illusion.

We all tend to chase as high a profit factor and as smooth a curve as possible. If your broker could give you 5 years of 5-minute data, I guess you wouldn't need the verification process, but I have 11 months of 5-minute data in MT4 (I guess it's the same for all brokers). I can make a very good-looking curve and profit factor in the Strategy Tester, but the EA turned out to lose money for several days without a single winning trade in the live market.
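To make the verification step concrete, here is a rough sketch that compares the profit factor of trades from the optimized period with trades from the held-back period. The trade lists are made-up numbers, and treating "profit factor above 1.0 on the held-back data" as passing is my own assumption, not a rule.

```python
def profit_factor(trade_results):
    """Gross profit divided by gross loss for a list of per-trade results."""
    gross_profit = sum(r for r in trade_results if r > 0)
    gross_loss = -sum(r for r in trade_results if r < 0)
    if gross_loss == 0:
        # No losing trades: infinite if anything was won, otherwise zero.
        return float("inf") if gross_profit > 0 else 0.0
    return gross_profit / gross_loss


def compare_periods(in_sample_trades, out_of_sample_trades):
    """Report the profit factor for both periods side by side."""
    pf_in = profit_factor(in_sample_trades)
    pf_out = profit_factor(out_of_sample_trades)
    # Assumption: a strategy "holds up" if it is still net profitable
    # on data that was never used for optimization.
    return {"in_sample": pf_in, "out_of_sample": pf_out, "holds_up": pf_out > 1.0}


# Hypothetical per-trade results exported from two tester runs.
optimized_period = [50, 40, -20, 60, -10, 30]
held_back_period = [-20, 10, -30, -15, 5]
print(compare_periods(optimized_period, held_back_period))
```

A large gap between the two numbers, as in this made-up case, is the typical signature of a curve that was fitted to the history rather than to anything repeatable.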

 


Sorry, but I still have no idea what you are talking about.
It seems you expanded on the benefits of the procedure but didn't make me understand what that procedure is, how it works, what it entails.

 

That is why my opinion is that Strategy Tester is only for testing the functionality of an EA, not its profitability. It is very easy to see the difference: run the EA on a demo or live account for 1-2 weeks, then test it in Strategy Tester for the same period and under the same conditions.
You will see that the results are different, and that the entries/exits are different.
 
It all depends on which EA you are going to try.

If you trade around midnight, the tester will always be unreliable, especially on MT4 and especially with data provided by the platform, MetaQuotes, or brokers.

The spread in the tester (on MT4) is fixed; in real time it is always floating, and it is huge during those hours.

That is just one example.
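A toy illustration of the spread point, with made-up numbers: the same trades are costed once with a constant spread, as a fixed-spread tester would do, and once with a spread that widens around midnight. The spread values and the trade list are hypothetical, not measurements from any broker.

```python
def net_result(trades, spread_for_hour):
    """Sum trade results after charging the spread that applies at entry.

    trades: list of (entry_hour, gross_points) tuples.
    spread_for_hour: function mapping an hour (0-23) to a spread in points.
    """
    return sum(gross - spread_for_hour(hour) for hour, gross in trades)


def fixed_spread(hour):
    """Hypothetical constant spread, as used by a fixed-spread backtest."""
    return 2.0


def floating_spread(hour):
    """Hypothetical live spread that widens around the daily rollover."""
    return 15.0 if hour in (23, 0) else 2.0


# Hypothetical EA that trades mostly around midnight (broker time).
trades = [(23, 5.0), (0, 6.0), (0, 4.0), (10, 5.0)]
print("tester-style result:", net_result(trades, fixed_spread))
print("live-style result:  ", net_result(trades, floating_spread))
```

With these assumed numbers the identical entries flip from a gain to a loss purely because of the spread model, which is the kind of gap that shows up between tester and live results.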

Tester is reliable for some things and not for others. It depends on a lot of different factors.

If you know exactly how your chosen asset works and exactly how the tester works, you will understand which things can be tested reliably and which cannot.
 
Fabio Cavalloni #: If you trade around midnight tester will be always unreliable,

By “midnight” he means 5PM Eastern time, the start of the FX day.

  1. How can MetaQuotes know all brokers' (they come and go daily) Time zone and Daylight savings time (if they use it and including historical changes for back testing)? Do you have that information for just you and your broker? Only then, with code, can you convert session times to broker's time to UTC to local time. You can use offset inputs but then you must maintain them correctly, through all three DST changes when they occur.
              When is the time zone problem going to be fixed? - General - MQL5 programming forum (2020)

  2. Foreign Exchange (FX) market opens 5 PM New York (NY)/Eastern Time (ET) Sunday and ends 5 PM NY Friday. Some brokers start after (6 PM is common) and end before (up to 15 minutes) due to low volatility.
              Checking for Market Closed - Expert Advisors and Automated Trading - MQL5 programming forum

    Swap is computed 5 PM ET. No swap if no open orders at that time.

  3. Brokers use a variety of time zones. Their local time, with or without Daylight Savings Time (DST), London, UTC, London+2, UTC+2, NY+7.

    Only with NY+7 does the broker's 00:00 equal 5 PM ET, so that the start of a daily bar (and H4) is the start of a new FX day.

    With GMT/BST brokers there is a 1 or 2 hour D1/H4 bar on Sunday (depending on NY DST), and a short Friday bar. (Problems with indicators based off bars.)

    GMT+2 is close but doesn't adjust for NY DST.

    EET is closer, except when its DST doesn't match NY's: Europe switches on the last Sunday of March and back at 1:00 on the last Sunday of October, whereas the US switches on the second Sunday in March and returns at 2:00 AM EDT to 1:00 AM EST on the first Sunday in November.

  4. With a non-NY+7 broker, the chart's daily bar overlaps the start of the FX day, and converting broker time to NY time requires broker to UTC to NY timezone conversions.


  5. If you search the web, you will find differing answers. Those are all wrong (half the year) because they do not take DST into account (or that it changed for the US in 2007 [important when testing history.])


  6. Then there are (non-24 hour markets) with H4 candles that start on odd hours.
              Why My XAUUSD 4H candles start with 1 hour shift? - Currency Pairs - General - MQL5 programming forum (2019)
              H4 first opened candle - MT5 - General - MQL5 programming forum (2020)

    And H1 on the half hour.

  7. See also Dealing with Time (Part 1): The Basics - MQL5 Articles (21.10.01)
    and Dealing with Time (Part 2): The Functions - MQL5 Articles (21.10.08)
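As a rough illustration of points 3 and 4, the sketch below uses Python's standard `zoneinfo` module to show where 5 PM New York lands on different server clocks. `Europe/Helsinki` is only my stand-in for a broker running on EET/EEST, and the NY+7 helper models a hypothetical broker whose clock is always New York wall time plus seven hours. It needs the IANA time zone data to be available on the machine.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

NY = ZoneInfo("America/New_York")
EET = ZoneInfo("Europe/Helsinki")  # assumption: stand-in for an EET/EEST broker


def ny_close_in(tz, year, month, day):
    """Return 5 PM New York on the given date, expressed in time zone tz."""
    ny_close = datetime(year, month, day, 17, 0, tzinfo=NY)
    return ny_close.astimezone(tz)


def ny_plus_7(year, month, day):
    """Server clock of a hypothetical NY+7 broker at 5 PM New York."""
    ny_close = datetime(year, month, day, 17, 0, tzinfo=NY)
    # NY+7 follows New York wall time, so DST is handled automatically.
    return ny_close.replace(tzinfo=None) + timedelta(hours=7)


# Winter, the US-only DST gap in March 2021, and after Europe has switched too.
for y, m, d in [(2021, 1, 15), (2021, 3, 15), (2021, 3, 29)]:
    print(y, m, d,
          "NY+7:", ny_plus_7(y, m, d).strftime("%H:%M"),
          "EET:", ny_close_in(EET, y, m, d).strftime("%H:%M"))
```

For the March 15 date, which falls between the US and European switches, the EET clock reads 23:00 at the FX day start instead of 00:00, while the NY+7 clock stays at midnight. That is the mismatch described above, and it is why a single fixed offset input is wrong for part of the year.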

 

What indicators are you using?  What is the basis of your strategy?


Things to keep in mind


  1. Market conditions change. Your EA needs to determine which conditions to trade, or you need to research which market days and hours are best and set a schedule for it to run.
  2. Uptrends are different from downtrends, and a trading signal with the trend will be different from one against the trend.
  3. Understand the indicators that you use for confluence. If two indicators are based on the same principles, you will get false confirmations. For example, using RSI and stochastic to confirm each other may be a double false signal. Understand which indicators are lagging and which are leading.
  4. Price action is King.  Utilize it.
  5. Spread will change the results drastically. You can prove it: run the Strategy Tester over a period in which your strategy performed well, then change the default spread and run the same period again. The changes can be drastic.
  6. Identify whether your strategy is making bad entries or not exiting properly.
  7. Be sure you have reliable tick data and do not use your broker's data.  I use tickstory.com


 

Daniel Cioca #:

It is very useful for checking whether your EA does what you designed it to do: whether it buys, sells, and closes when it should (as designed).

100% correct. I think the OP needs to review the purpose of a backtest. In short, it is there so you can make sure that what you want the code to do is in fact what happens in the test. So if you code a rule to buy 1 lot when a 5-period moving average crosses above the 30-period moving average, is the backtest doing this on each trigger without error?
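One way to do that check outside the tester, sketched under my own assumptions: recompute the 5/30 simple moving average crossovers from the closing prices and compare the bar numbers with the buys in the tester's report. The helper names and the idea of identifying trades by bar index are hypothetical, and the price series is synthetic.

```python
def sma(values, period):
    """Simple moving average; None until enough bars exist."""
    out = []
    for i in range(len(values)):
        if i + 1 < period:
            out.append(None)
        else:
            window = values[i + 1 - period : i + 1]
            out.append(sum(window) / period)
    return out


def cross_above_bars(closes, fast=5, slow=30):
    """Indices of bars where the fast SMA crosses above the slow SMA."""
    f, s = sma(closes, fast), sma(closes, slow)
    bars = []
    for i in range(1, len(closes)):
        if None in (f[i - 1], s[i - 1], f[i], s[i]):
            continue  # not enough history for both averages yet
        if f[i - 1] <= s[i - 1] and f[i] > s[i]:
            bars.append(i)
    return bars


def missing_and_extra(expected_bars, logged_buy_bars):
    """Crossovers without a buy, and buys without a crossover."""
    expected, logged = set(expected_bars), set(logged_buy_bars)
    return sorted(expected - logged), sorted(logged - expected)


# Synthetic closes: 40 falling bars followed by 20 rising bars.
closes = [100 - i for i in range(40)] + [60 + 2 * i for i in range(1, 21)]
expected = cross_above_bars(closes)
# Hypothetical tester log in which the EA never bought.
print(missing_and_extra(expected, []))
```

If both lists come back empty for a real report, the EA is doing what was coded. That says nothing about whether the rule makes money.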

What a backtest cannot do is tell you whether a strategy is guaranteed to be profitable. You must understand the why behind a strategy before you implement it. Otherwise you are just "getting lucky" for a while until it stops working in your favor. Most people quit in protest because they do not understand why they are profitable, which usually means there is no fundamental basis for their profitability.

whoowl:

....I write an EA that is an absolute winner in the backtests, then I put it to work on a live account (demo, of course) for three days and it loses money for three days straight like a drunken sailor.

That's happened with half a dozen EAs I have written. Apparently, Strategy Tester results are completely meaningless.

What is Strategy Tester ever good for?


OP gave us no details as to what the strategy does, how it trades, etc. So how can we objectively evaluate his blanket statements?

What is a sufficient number of years for an algorithmic trading backtest to be trustworthy?
  • www.quora.com
Answer (1 of 4): It really depends upon your algorithm and the time horizon over which it works. Years may not be necessary if you’re trading intraday at the second or decisecond level of granularity. If you’re trading at the millisecond level of granularity you don’t need help from Quora. More ...