Comments in a source file change the runtime of functions

 

A short overview:

This issue was reported in "Metatrader Questions and Answers" at the end of November 2022.

This is the original description I posted back then:

"

Alexey, I have found a very strange behaviour concerning the compiler's optimization.

I hope you can give some insight into this, as to my understanding it is completely erratic. So that you can reproduce it, the attached archive contains a project concerning RadixSort, which is taken from the codebase and had some issues that I reported to the developer (amrali). But while working with the code and "drag-racing" it to squeeze more performance out of it, I came across a very strange behaviour of the compiler.

The file RadixSort.mqh contains three functions, all having (about) the same functionality. But when I change code in one function, one of the other functions suddenly performs up to 30% better, which I don't understand at all...

To reproduce, compile and run the script as is to get the better results, then go into the file RadixSort.mqh and remove lines 458 to 465 and line 479. Then compile the project and run it again. In the output at the end, you will see that the performance of the function idx_RadixSort() has degraded a lot, even though the code change was made in the function opt_RadixSort.

This does not make any sense to me, could you maybe shed some light on this?
"



Note that the code itself did not even change: the #define SOME_STRANGE_COMPILER_BEHAVIOUR is defined and thus includes the same code snippet, yet there is a huge difference in performance in a totally different function.

Additionally, is there a way to understand how, what and why the compiler optimizes? Is there some manual on how to write good code for the optimizer, so we can get the most out of our code and the translation process?

I am asking because I noticed that the compiler apparently treats some variables in loops as (C++) volatile, at least that's what it seems, and some functions seem to be implicitly declared __inline...

It would be nice to have some understanding of what's going on behind the curtain, especially concerning the optimizations, as I have stumbled across some strange behaviours before. The forum is no help here; it seems to be a rather unknown issue that the optimizer sometimes really does not optimize for the better.

Maybe in some future version you could include a compiler directive along the lines of "#nooptimization_begin" and "#nooptimization_end" so that a good coder would be able to exclude some code from optimizations. Also, the introduction of the keywords __inline (probably not so much, because implicit) and volatile would be really nice to have, especially for experienced programmers/coders.

Yes, I know, I keep asking for ridiculous stuff, but in fact MQL has reached a level of versatility where it begins to compete with grown-ups like C/C++, and that will be the new level of expectation...

I personally think the compiler is really some good work, especially the optimizer, considering the amount of bad code it gets to see and how good the results are in the end.

Keep up the good work, and please provide some insight into how more advanced coders could utilize the optimizer to their advantage.



I would appreciate some insights into this issue. The build version back then was 3521 on MT5. I have not checked whether the behaviour is still the same, but I guess it is.

Additionally, there is a huge difference when dealing with different datatypes.

RadixSort1 is related to the comment "issue" in the source code.

RadixSort2 and RadixSort3 are related to different datatypes and their impact on execution speeds.


Here is a screenshot of what I have found:


The original messages can be read in this thread, if anyone needs to reference the original report of the issue.

https://www.mql5.com/en/messages/01eaf69eb021d801

You will need to scroll back to around 25. Nov 2022.


Thank you to anyone who is willing to look into this.

Files:
RadixSort1.zip  32 kb
RadixSort2.zip  83 kb
RadixSort3.zip  75 kb
 
Actually, I have come to a similar conclusion before. I noticed that the MQL5 compiler sometimes performed inconsistent optimizations, which were related to the structure of the source code (even adding a new function to the file).

My solution was to benchmark the final code edit to make sure runtime performance is still OK.
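
A minimal sketch of such a re-benchmarking step, written in Python purely for illustration (in MQL5 the timing call would be GetMicrosecondCount()). The names benchmark and regressed, the default of 10 runs and the 10% threshold are assumptions made for this example; none of them come from the RadixSort project.

```python
import random
import statistics
import time


def benchmark(func, make_input, runs=10):
    """Time func(make_input()) `runs` times; return the timings in microseconds.

    make_input is called outside the timed region, so every run works on
    fresh data and only the call to func is measured.
    """
    timings = []
    for _ in range(runs):
        data = make_input()
        start = time.perf_counter_ns()
        func(data)
        timings.append((time.perf_counter_ns() - start) / 1000.0)
    return timings


def regressed(baseline_us, candidate_us, threshold=0.10):
    """True if the candidate median is more than `threshold` slower than the baseline median."""
    base = statistics.median(baseline_us)
    cand = statistics.median(candidate_us)
    return (cand - base) / base > threshold


if __name__ == "__main__":
    # Hypothetical usage: time the same function twice and compare the runs.
    def make_input():
        return [random.random() for _ in range(100_000)]

    before = benchmark(sorted, make_input)
    after = benchmark(sorted, make_input)
    print("regression detected:", regressed(before, after))
```

Keeping the timings of the previous build as the baseline and re-running this after each edit makes it visible when a change in one function shifts the runtime of another.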
 
amrali #:
Actually, I have come to a similar conclusion before. I noticed that the MQL5 compiler sometimes performed inconsistent optimizations, which were related to the structure of the source code (even adding a new function to the file).

My solution was to benchmark the final code edit to make sure runtime performance is still OK.
It is really strange, especially because the layout of the source code should not have any influence on the resulting binary.

Comments should not even be part of the input to the compiler in the first place. I would have expected them to be removed in the preprocessor stage.

Conspiracy on! Maybe the binaries include more than we would expect... At least this could be a hint to such.

But I would guess it has to do with memory paging and the functions used to protect the binary code from extraction. Who knows what these protection mechanisms do to the binary.

After all, somehow the code needs to be "scrambled" so you cannot read it from a memory dump.

I don't know what they are using, byte-shifting, LSB/MSB conversion, on the fly decryption...

But there are surely some security measures to protect the byte code, I guess.

Probably we will never know.



 

Build 3683.

Original code (RadixSort1) run :

#################
Average results over 10 runs:
ArraySort() -> 1021755 microsec
amrali_RadixSort() -> 99400 microsec
RadixSort() -> 121404 microsec
IDX RadixSort() -> 203097 microsec

After the change requested :

Run 1

#################
Average results over 10 runs:
ArraySort() -> 1033613 microsec
amrali_RadixSort() -> 99617 microsec
RadixSort() -> 119855 microsec
IDX RadixSort() -> 211775 microsec

Run 2

#################
Average results over 10 runs:
ArraySort() -> 1019371 microsec
amrali_RadixSort() -> 96409 microsec
RadixSort() -> 118986 microsec
IDX RadixSort() -> 201510 microsec

With SOME_STRANGE_COMPILER_BEHAVIOUR commented out.

#################
Average results over 10 runs:
ArraySort() -> 1027501 microsec
amrali_RadixSort() -> 97276 microsec
RadixSort() -> 117883 microsec
IDX RadixSort() -> 205760 microsec

Nothing noticeable.

 
Alain Verleyen #:

Build 3683.

Original code (RadixSort1) run :

#################
Average results over 10 runs:
ArraySort() -> 1021755 microsec
amrali_RadixSort() -> 99400 microsec
RadixSort() -> 121404 microsec
IDX RadixSort() -> 203097 microsec

After the change requested :

Run 1

#################
Average results over 10 runs:
ArraySort() -> 1033613 microsec
amrali_RadixSort() -> 99617 microsec
RadixSort() -> 119855 microsec
IDX RadixSort() -> 211775 microsec

Run 2

#################
Average results over 10 runs:
ArraySort() -> 1019371 microsec
amrali_RadixSort() -> 96409 microsec
RadixSort() -> 118986 microsec
IDX RadixSort() -> 201510 microsec

With SOME_STRANGE_COMPILER_BEHAVIOUR commented out.

#################
Average results over 10 runs:
ArraySort() -> 1027501 microsec
amrali_RadixSort() -> 97276 microsec
RadixSort() -> 117883 microsec
IDX RadixSort() -> 205760 microsec

Nothing noticeable.

Can be considered within margin of error. I agree.

May I ask which CPU you are using? I suspect hardware to be the cause.

I will verify on my systems and post results later as well.
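
To put "within margin of error" into numbers, here is a small Python sketch (Python is used only for illustration) that computes, per function, the spread between the four averages posted for build 3683. The helper name spread_percent is made up for this example; the figures are copied from the output above.

```python
# Averages in microseconds from the build 3683 post, in this order:
# original code, after the change (run 1), after the change (run 2),
# SOME_STRANGE_COMPILER_BEHAVIOUR commented out.
RESULTS = {
    "ArraySort()": [1021755, 1033613, 1019371, 1027501],
    "amrali_RadixSort()": [99400, 99617, 96409, 97276],
    "RadixSort()": [121404, 119855, 118986, 117883],
    "IDX RadixSort()": [203097, 211775, 201510, 205760],
}


def spread_percent(values):
    """Return the gap between slowest and fastest timing, in percent of the fastest."""
    low, high = min(values), max(values)
    return (high - low) / low * 100.0


for name, values in RESULTS.items():
    print(f"{name}: {spread_percent(values):.1f}% spread")
```

For these figures the largest spread is about 5% (IDX RadixSort()), far from the difference of up to 30% described in the original report.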



 
Dominik Christian Egert #:
Can be considered within margin of error. I agree.

May I ask which CPU you are using? I suspect hardware to be the cause.

I will verify on my systems and post results later as well.



Windows 10 build 19044, 12 x Intel Core i7-9750H  @ 2.60GHz, AVX, 7 / 15 Gb memory, 19 / 279 Gb disk