If you replace

static const double Points[] = {1e-0, 1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6, 1e-7, 1e-8};

with the switch variant, you can see the quality of the switch implementation in numbers.
Consider the cleaned-up version of the script with NormalizeDouble:
#define EPSILON   (1e-7 + 1e-13)
#define HALF_PLUS (0.5 + EPSILON)
//+------------------------------------------------------------------+
//|                                                                  |
//+------------------------------------------------------------------+
double MyNormalizeDouble(const double Value,const int digits)
  {
   static const double Points[]={1e-0,1e-1,1e-2,1e-3,1e-4,1e-5,1e-6,1e-7,1e-8};

   return((int)((Value>0) ? Value/Points[digits]+HALF_PLUS
                          : Value/Points[digits]-HALF_PLUS)*Points[digits]);
  }
//+------------------------------------------------------------------+
//|                                                                  |
//+------------------------------------------------------------------+
ulong BenchStandard(const int Amount=1e8)
  {
   double Price=1.23456;
   const double point=0.00001;
   const ulong StartTime=GetMicrosecondCount();
//---
   for(int i=0; i<Amount; i++)
      Price=NormalizeDouble(Price+point,5);

   Print("Result: ",Price); // print the result on purpose so the loop is not optimized away
//---
   return(GetMicrosecondCount()-StartTime);
  }
//+------------------------------------------------------------------+
//|                                                                  |
//+------------------------------------------------------------------+
ulong BenchCustom(const int Amount=1e8)
  {
   double Price=1.23456;
   const double point=0.00001;
   const ulong StartTime=GetMicrosecondCount();
//---
   for(int i=0; i<Amount; i++)
      Price=MyNormalizeDouble(Price+point,5);

   Print("Result: ",Price); // print the result on purpose so the loop is not optimized away
//---
   return(GetMicrosecondCount()-StartTime);
  }
//+------------------------------------------------------------------+
//|                                                                  |
//+------------------------------------------------------------------+
void OnStart(void)
  {
   Print("Standard: ",BenchStandard()," msc");
   Print("Custom:   ",BenchCustom()," msc");
  }
Results:
Custom:   1110255 msc   Result: 1001.23456
Standard: 1684165 msc   Result: 1001.23456
Immediate remarks and explanations:
- static is necessary here so that the compiler moves this array out of the function and doesn't construct it on the stack every time the function is called. The C++ compiler does the same.
- To prevent the compiler from throwing the loop away because it is useless, we should use the results of calculations. For example, Print the variable Price.
- There is an error in your function: the bounds of digits are not checked, which can easily lead to an array overrun.
For example, call it as MyNormalizeDouble(Price+point,10) and catch the error:

array out of range in 'BenchNormalizeDouble.mq5' (19,45)
Speeding code up by skipping checks is acceptable, but not in our case: we must handle any erroneous input.
- Let's add a simple check for an index greater than 8. To keep the code to a single comparison, change the type of the digits variable to uint, so one >8 test replaces the pair of checks <0 and >8.
//+------------------------------------------------------------------+
//|                                                                  |
//+------------------------------------------------------------------+
double MyNormalizeDouble(const double Value,uint digits)
  {
   static const double Points[]={1e-0,1e-1,1e-2,1e-3,1e-4,1e-5,1e-6,1e-7,1e-8};
//---
   if(digits>8)
      digits=8;
//---
   return((int)((Value>0) ? Value/Points[digits]+HALF_PLUS
                          : Value/Points[digits]-HALF_PLUS)*Points[digits]);
  }
- Let's run the code and... We are surprised!
Custom:   1099705 msc   Result: 1001.23456
Standard: 1695662 msc   Result: 1001.23456
Your code has overtaken the standard NormalizeDouble function even more!
Moreover, adding the condition even reduced the time (actually, the change is within the error margin). Why is there such a difference in speed?
- It all comes down to a standard mistake made by performance testers.
When writing tests, you should keep in mind the full list of optimizations the compiler can apply. You need to be clear about what input data you are using and how it degenerates when you reduce it to a simplified sample test.
Let's evaluate and apply, step by step, the whole set of optimizations our compiler performs.
- Let's start with constant propagation - this is one of the important mistakes you made in this test.
Half of your input data are constants. Let's rewrite the example with their propagation in mind.

ulong BenchStandard(void)
  {
   double Price=1.23456;
   const ulong StartTime=GetMicrosecondCount();
//---
   for(int i=0; i<1e8; i++)
      Price=NormalizeDouble(Price+0.00001,5);

   Print("Result: ",Price);
//---
   return(GetMicrosecondCount()-StartTime);
  }

ulong BenchCustom(void)
  {
   double Price=1.23456;
   const ulong StartTime=GetMicrosecondCount();
//---
   for(int i=0; i<1e8; i++)
      Price=MyNormalizeDouble(Price+0.00001,5);

   Print("Result: ",Price," ",1e8);
//---
   return(GetMicrosecondCount()-StartTime);
  }
After launching it, nothing has changed - as it should be.
- Next, inline your code (our NormalizeDouble cannot be inlined).
This is what your function will really look like after the inevitable inlining: the call is saved, the array fetches are saved, and the checks are removed by constant analysis.

ulong BenchCustom(void)
  {
   double Price=1.23456;
   const ulong StartTime=GetMicrosecondCount();
//---
   for(int i=0; i<1e8; i++)
     {
      //--- this code is cut out entirely, since digits is known to be the constant 5
      //if(digits>8)
      //   digits=8;
      //--- propagate the variables and aggressively substitute the constants
      if((Price+0.00001)>0)
         Price=int((Price+0.00001)/1e-5+(0.5+1e-7+1e-13))*1e-5;
      else
         Price=int((Price+0.00001)/1e-5-(0.5+1e-7+1e-13))*1e-5;
     }
   Print("Result: ",Price);
//---
   return(GetMicrosecondCount()-StartTime);
  }
I didn't fold the pure constants myself, so as not to waste time: they are all guaranteed to collapse at compile time.
Run the code and get the same time as in the original version:

Custom:   1149536 msc
Standard: 1767592 msc
Don't mind the jitter in the numbers: at the microsecond level, with timer error and the floating load on the computer, it is within normal limits; the proportion is fully maintained.
- Look at the code you actually ended up testing because of the fixed source data.
Since the compiler applies very powerful optimizations, your test was effectively simplified away.
- So how should you test for performance?
By understanding how the compiler works, you need to prevent it from applying pre-optimizations and simplifications.
For example, let's make the digits parameter variable:#define EPSILON (1.0 e-7 + 1.0 e-13) #define HALF_PLUS (0.5 + EPSILON) //+------------------------------------------------------------------+ //| | //+------------------------------------------------------------------+ double MyNormalizeDouble(const double Value,uint digits) { static const double Points[]={1.0 e-0,1.0 e-1,1.0 e-2,1.0 e-3,1.0 e-4,1.0 e-5,1.0 e-6,1.0 e-7,1.0 e-8}; //--- if(digits>8) digits=8; //--- return((int)((Value > 0) ? Value / Points[digits] + HALF_PLUS : Value / Points[digits] - HALF_PLUS) * Points[digits]); } //+------------------------------------------------------------------+ //| | //+------------------------------------------------------------------+ ulong BenchStandard(const int Amount=1.0 e8) { double Price=1.23456; const double point=0.00001; const ulong StartTime=GetMicrosecondCount(); //--- for(int i=0; i<Amount;i++) { Price=NormalizeDouble(Price+point,2+(i&15)); } Print("Result: ",Price); // специально выводим результат, чтобы цикл не оптимизировался в ноль //--- return(GetMicrosecondCount() - StartTime); } //+------------------------------------------------------------------+ //| | //+------------------------------------------------------------------+ ulong BenchCustom(const int Amount=1.0 e8) { double Price=1.23456; const double point=0.00001; const ulong StartTime=GetMicrosecondCount(); //--- for(int i=0; i<Amount;i++) { Price=MyNormalizeDouble(Price+point,2+(i&15)); } Print("Result: ",Price); // специально выводим результат, чтобы цикл не оптимизировался в ноль //--- return(GetMicrosecondCount() - StartTime); } //+------------------------------------------------------------------+ //| | //+------------------------------------------------------------------+ void OnStart(void) { Print("Standard: ",BenchStandard()," msc"); Print("Custom: ",BenchCustom()," msc"); }
Run it and... we get the same speed ratio as before: your code still gains about 35%.
- So why is that?
We still cannot escape the optimization due to inlining. Saving 100 000 000 calls that pass data through the stack into our NormalizeDouble, which is similar in implementation, may well account for the same speed increase.
There is another suspicion: our NormalizeDouble may not be routed through the direct_call mechanism when the function relocation table of the MQL5 program is loaded.
We'll check in the morning, and if so, we'll move it to direct_call and check the speed again.
Here is a study of NormalizeDouble.
Our MQL5 compiler has beaten our system function, which shows its adequacy when compared to the speed of C++ code.
You are confusing direct indexed access to a static array by a constant index (which degenerates into a constant from a field) and switch.
Switch can't really compete with such a case. Switch has several frequently used optimizations of the form:
- "deliberately ordered and short value sets are put into a static array and indexed" - the simplest and fastest; it can compete with the static array, but not always
- "several arrays over ordered and adjacent chunks of values, with zone boundary checks" - this already carries a penalty
- "a handful of values checked through if chains" - no speed at all, but that is the programmer's own fault for using switch inappropriately
- "a very sparse ordered table with binary search" - very slow in the worst cases
In fact, the best strategy for switch is when the developer deliberately tried to make a compact set of values in the lower set of numbers.
An alternative way to handle the bounds is to fold the clamp into the selection of the point value:

double MyNormalizeDouble(const double Value,const uint digits)
  {
   static const double Points[]={1e-0,1e-1,1e-2,1e-3,1e-4,1e-5,1e-6,1e-7,1e-8};

   const double point=(digits>8) ? 1e-8 : Points[digits];

   return((int)((Value>0) ? Value/point+HALF_PLUS
                          : Value/point-HALF_PLUS)*point);
  }
This is just such a case of ordering.
I tried it on a 32-bit system. Changing to switch in the example above caused a serious slowdown. I haven't checked it on the new machine.
There are actually two compiled programs inside every MQL5 program: a simplified one for 32 bits and a maximally optimized one for 64 bits. In 32-bit MT5 the new optimizer is not applied at all, and the generated 32-bit code is as simple as MQL4 code in MT4.
All the power of the compiler, which can generate code up to ten times faster, shows only when running in the 64-bit version of MT5: https://www.mql5.com/ru/forum/58241
We are fully focused on 64-bit versions of the platform.
On the subject of NormalizeDouble there is this nonsense
Forum on trading, automated trading systems and strategy testing
How do I go through an enumeration consistently?
fxsaber, 2016.08.26 16:08
There is this note in the function description
This is only true for symbols whose minimum price step is 10^N, where N is an integer and non-positive. If the minimum price step has a different value, then normalizing price levels before OrderSend is a meaningless operation, which in most cases will result in OrderSend returning false.
NormalizeDouble is completely discredited. Not only is the implementation slow, it is also meaningless on many exchange symbols (e.g. RTS, MIX, etc.).
double CTrade::CheckVolume(const string symbol,double volume,double price,ENUM_ORDER_TYPE order_type)
  {
//--- check
   if(order_type!=ORDER_TYPE_BUY && order_type!=ORDER_TYPE_SELL)
      return(0.0);

   double free_margin=AccountInfoDouble(ACCOUNT_FREEMARGIN);
   if(free_margin<=0.0)
      return(0.0);
//--- clean
   ClearStructures();
//--- setting request
   m_request.action=TRADE_ACTION_DEAL;
   m_request.symbol=symbol;
   m_request.volume=volume;
   m_request.type  =order_type;
   m_request.price =price;
//--- action and return the result
   if(!::OrderCheck(m_request,m_check_result) && m_check_result.margin_free<0.0)
     {
      double coeff=free_margin/(free_margin-m_check_result.margin_free);
      double lots=NormalizeDouble(volume*coeff,2);
      if(lots<volume)
        {
         //--- normalize and check limits
         double stepvol=SymbolInfoDouble(symbol,SYMBOL_VOLUME_STEP);
         if(stepvol>0.0)
            volume=stepvol*(MathFloor(lots/stepvol)-1);
         //---
         double minvol=SymbolInfoDouble(symbol,SYMBOL_VOLUME_MIN);
         if(volume<minvol)
            volume=0.0;
        }
     }
   return(volume);
  }
Well, you can't write it so clumsily! It could be many times faster if you forget about NormalizeDouble.
double NormalizePrice(const double dPrice,double dPoint=0)
  {
   if(dPoint==0)
      dPoint=::SymbolInfoDouble(::Symbol(),SYMBOL_TRADE_TICK_SIZE);

   return((int)((dPrice>0) ? dPrice/dPoint+HALF_PLUS
                           : dPrice/dPoint-HALF_PLUS)*dPoint);
  }
And for the same volume then do
volume = NormalizePrice(volume, stepvol);
For prices do
NormalizePrice(Price, TickSize)
It seems correct to add something similar as an overload of the standard NormalizeDouble, where the second parameter "digits" is a double instead of an int.
By 2016, most C++ compilers have arrived at the same levels of optimisation.
MSVC makes you wonder at the improvements with every update, while Intel C++ as a compiler has fallen behind - it still hasn't recovered from its "internal error" failures on large projects.
Another of our improvements in the compiler in the 1400 build is that it is faster at compiling complex projects.
On topic. You have to create alternatives to the standard functions, because they sometimes give wrong output. Here's an example of an alternative to SymbolInfoTick:
// Get the tick that actually triggered the latest NewTick event
bool MySymbolInfoTick(const string Symb,MqlTick &Tick,const uint Type=COPY_TICKS_ALL)
  {
   MqlTick Ticks[];
   const int Amount=::CopyTicks(Symb,Ticks,Type,0,1);
   const bool Res=(Amount>0);

   if(Res)
      Tick=Ticks[Amount-1];

   return(Res);
  }

// Returns exactly what SymbolInfoTick does
bool CloneSymbolInfoTick(const string Symb,MqlTick &Tick)
  {
   MqlTick TickAll,TickTrade,TickInfo;
   const bool Res=(MySymbolInfoTick(Symb,TickAll) &&
                   MySymbolInfoTick(Symb,TickTrade,COPY_TICKS_TRADE) &&
                   MySymbolInfoTick(Symb,TickInfo,COPY_TICKS_INFO));

   if(Res)
     {
      Tick=TickInfo;

      Tick.time    =TickAll.time;
      Tick.time_msc=TickAll.time_msc;
      Tick.flags   =TickAll.flags;

      Tick.last  =TickTrade.last;
      Tick.volume=TickTrade.volume;
     }

   return(Res);
  }
You might call SymbolInfoTick on each NewTick event in the tester and sum the volume field to compute the turnover. But no, you can't! You have to write the much more logical MySymbolInfoTick.
You can optimize everything around it.
This is an endless process, but in 99% of cases it is economically unjustified.
The result is 1123275 vs 1666643 in favour of MyNormalizeDouble (Optimize=1). Without optimization it is four times faster (from memory).