Alternative implementations of standard functions/approaches

 
Nikolai Semko:
Can you please send me the link?

I didn't keep it. It was mentioned here on the forum. I also went through the search engines myself.

A question to the programming community about authorship
  • 2017.11.24
  • www.mql5.com
General discussion: A question to the programming community about authorship
 
fxsaber:

I didn't keep it. It was mentioned here on the forum. I also went through the search engines myself.

I've seen it. It's all very primitive, with no pixel colour blending.
It's just that everything I came across on the forums was kindergarten level. And I'm already in the 5th grade.
 
Nikolai Semko:
It's just that everything I've encountered on the forums has been at kindergarten level. And I'm already in the 5th grade.

Obviously, all these wheels have been reinvented many times over. Books were even published, right down to asm implementations.

Nowadays the fundamentals are hard to find, since almost everyone uses ready-made APIs for every occasion.

So you just have to register on forums and ask around.

 
fxsaber:

Obviously, all these wheels have been reinvented many times over. Books were even published, right down to asm implementations.

Nowadays the fundamentals are hard to find, since almost everyone uses ready-made APIs for every occasion.

So you just have to register on forums and ask around.

That's the thing: it's difficult. Anyway, I couldn't find it. Perhaps I wasn't looking hard enough. On the forums, everyone sends you to the standard closed libraries and wonders why you need this when everything is already available. Of course, I wouldn't bother with it if I were writing in Java, JavaScript and the like, or if I didn't need the Market.
OK, I'm already used to being proudly alone in this matter for now. I'll carry on. Besides, I have practically no blank spots left in understanding almost any implementation in this area. And on the other hand, I've acquired some unique skills.
 
pavlick_:

Why don't you use LONG_MAX/MIN? It would look nicer somehow. I ran your tests with gcc (with minimal modifications, of course; the compiler is very old, 5.4.0, it's what I had at hand):


Well, yes, it's not pretty. But LONG_MAX = 9223372036854775807 is larger than 9007199254740992. And the compiler complains about the hexadecimal form of this number, 0x20000000000000, because such a literal has to be of type ulong. I don't know how to make it any clearer. I don't want to write (ulong)(1<<53), because that is a time-consuming operation.

A double is guaranteed to have no fractional part not starting from LONG_MAX, but once it exceeds what the mantissa can hold. The mantissa has 53 bits, i.e. 2^53 = 9007199254740992.
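The 2^53 limit can be checked directly. Below is a minimal C++ sketch (C++ rather than MQL5, since the thread also runs the tests under gcc). The names kMantissaBits, kIntegerLimit, isWhole and spacingAbove are made up for this illustration and are not from TestRound.mq5.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Number of mantissa bits in a double (53 for IEEE 754 binary64).
constexpr int kMantissaBits = std::numeric_limits<double>::digits;

// 2^53: every double at or above this magnitude is a whole number,
// and above it not every whole number is representable any more.
constexpr std::int64_t kIntegerLimit = std::int64_t{1} << kMantissaBits;

// True when x has no fractional part.
inline bool isWhole(double x) { return std::trunc(x) == x; }

// Distance from x to the next representable double above it.
inline double spacingAbove(double x) {
    return std::nextafter(x, std::numeric_limits<double>::infinity()) - x;
}
```

On a platform with IEEE 754 doubles, spacingAbove(9007199254740992.0) is 2.0, so from 2^53 upwards a double cannot carry a fractional part, while LONG_MAX lies far beyond that point.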

pavlick_:

The timing in your code is off: the output is in milliseconds (not nano), and I still don't understand why we need to subtract t0.

t0 is the time of a full cycle of 1000000 passes that just sums plain double values,

while t is the time of the same cycle summing the same double values after passing them through the functions ceil, floor, round, etc.

My reasoning was that the difference (t - t0) is the net time spent in these functions.

Of course, a more objective result can only be obtained by taking several measurements.

- The nanosecond figure is the time for a single function call, i.e. the total divided by the 1,000,000 calls. So nanoseconds is exactly right.
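The (t - t0) scheme described above can be sketched in C++ roughly as follows. This is only an illustration under assumptions: Timing, timedSum and netTimePerCall are hypothetical names, std::chrono::steady_clock stands in for whatever timer TestRound.mq5 uses, and the input vector is assumed to be non-empty.

```cpp
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

struct Timing {
    double nsPerCall;  // (t - t0) / n, in nanoseconds
    double checksum;   // sum of f(x); using it keeps the loop from being dropped
};

// Time one pass that sums f(v[i]); returns elapsed nanoseconds, writes the sum.
template <typename F>
static double timedSum(const std::vector<double>& v, F f, double& sum) {
    const auto start = std::chrono::steady_clock::now();
    double s = 0.0;
    for (double x : v) s += f(x);
    const auto stop = std::chrono::steady_clock::now();
    sum = s;
    return std::chrono::duration<double, std::nano>(stop - start).count();
}

// Net cost of f per call: loop with f minus the same loop without it.
template <typename F>
Timing netTimePerCall(const std::vector<double>& v, F f) {
    double baseSum = 0.0, sum = 0.0;
    const double t0 = timedSum(v, [](double x) { return x; }, baseSum);  // plain sum
    const double t  = timedSum(v, f, sum);                               // sum of f(x)
    return { (t - t0) / static_cast<double>(v.size()), sum };
}
```

A single run is noisy, and nsPerCall can even come out negative for very cheap functions, which is why several measurements are needed before drawing conclusions.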

pavlick_:

I ran your tests with gcc (with minimal modifications, of course; the compiler is very old, 5.4.0, it's what I had at hand):

1. Compiled with -O3.

2. Compiled with -Ofast.

So it turns out that compiled MQL5 code runs faster than even -Ofast? It's hard to believe. You must have had a 32-bit compiler there.
 
Nikolai Semko:

I don't want to write (ulong)(1<<53), as that is a time-consuming operation.

This operation costs no time, like all operations on constants, including strings.

input long l = (ulong)1 << 53;
input string s = (string)__DATETIME__ + __FILE__;
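The same point can be illustrated in C++, where constant expressions are also evaluated by the compiler. This is a hedged sketch, not MQL5 code: kLimit and exceedsLimit are hypothetical names chosen for the example.

```cpp
#include <cstdint>

// Evaluated entirely by the compiler: no shift is executed at run time.
constexpr std::uint64_t kLimit = std::uint64_t{1} << 53;

// static_assert only accepts compile-time constants, so this line shows
// that kLimit is known before the program ever runs.
static_assert(kLimit == 9007199254740992ULL, "2^53 is folded at compile time");

// Note the order: the 1 is widened to 64 bits first and shifted afterwards.
// In C++, shifting a 32-bit int literal by 53 bits would be undefined.
inline bool exceedsLimit(double x) {
    return x > static_cast<double>(kLimit);
}
```

The comparison inside exceedsLimit therefore works against a ready-made constant, which is the behaviour relied on in the rounding functions that follow.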
 
fxsaber:

This operation costs no time, like all operations on constants, including strings.

Wow, cool! Thanks. I thought it was computed every time. Well, it's logical: it can already be calculated at compile time.
In that case:

double Ceil (double x) { return double((x>(long)1 << 53 || x<-(long)1 << 53 )?x:(x-(long)x>0)?(long)x+1:(long)x);}
double Round(double x) { return double((x>(long)1 << 53 || x<-(long)1 << 53 )?x:(x>0)?(long)(x+0.5):(long)(x-0.5));}
double Floor(double x) { return double((x>(long)1 << 53 || x<-(long)1 << 53 )?x:(x>0)?(long)x:((long)x-x>0)?(long)x-1:(long)x);}
2018.08.26 18:04:07.638 TestRound (EURUSD,M1)   Cycle time without rounding = 1.302 nanoseconds, sum = 115583114403605978808320.00000000
2018.08.26 18:04:07.642 TestRound (EURUSD,M1)   Execution time of function ceil =  2.389 nanoseconds, checksum = 1.15583114403606 e+23
2018.08.26 18:04:07.644 TestRound (EURUSD,M1)   Execution time of function Ceil =  0.223 nanoseconds, checksum = 1.15583114403606 e+23
2018.08.26 18:04:07.648 TestRound (EURUSD,M1)   Execution time of function floor = 2.884 nanoseconds, checksum = 1.15583114403606 e+23
2018.08.26 18:04:07.649 TestRound (EURUSD,M1)   Execution time of function Floor = 0.122 nanoseconds, checksum = 1.15583114403606 e+23
2018.08.26 18:04:07.654 TestRound (EURUSD,M1)   Execution time of function round = 3.413 nanoseconds, checksum = 1.15583114403606 e+23
2018.08.26 18:04:07.656 TestRound (EURUSD,M1)   Execution time of function Round = 0.222 nanoseconds, checksum = 1.15583114403606 e+23
2018.08.26 18:04:07.656 TestRound (EURUSD,M1)   Running an endless search for discrepancies on random double values ... Stop the script when you get tired of waiting

However, it would be more correct to write DBL_MANT_DIG instead of 53:

double Ceil (double x) { return double((x>(long)1 << DBL_MANT_DIG || x<-(long)1 << DBL_MANT_DIG )?x:(x-(long)x>0)?(long)x+1:(long)x);}
double Round(double x) { return double((x>(long)1 << DBL_MANT_DIG || x<-(long)1 << DBL_MANT_DIG )?x:(x>0)?(long)(x+0.5):(long)(x-0.5));}
double Floor(double x) { return double((x>(long)1 << DBL_MANT_DIG || x<-(long)1 << DBL_MANT_DIG )?x:(x>0)?(long)x:((long)x-x>0)?(long)x-1:(long)x);}
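For readers who want to try the idea outside MetaTrader, here is a rough C++ port of the three functions above. It is a sketch, not the author's code: kLimit and inRange are hypothetical helpers, and the NaN handling is an addition, because converting NaN to an integer is undefined in C++.

```cpp
#include <cfloat>
#include <cmath>
#include <cstdint>

// 2^DBL_MANT_DIG = 2^53: beyond this magnitude a double is already whole.
constexpr double kLimit = static_cast<double>(std::int64_t{1} << DBL_MANT_DIG);

// False for values outside [-2^53, 2^53] and for NaN (all comparisons fail),
// so those inputs are returned unchanged instead of being cast.
inline bool inRange(double x) { return x >= -kLimit && x <= kLimit; }

double Ceil(double x) {
    if (!inRange(x)) return x;
    const std::int64_t i = static_cast<std::int64_t>(x);  // truncates toward zero
    return static_cast<double>(x > static_cast<double>(i) ? i + 1 : i);
}

double Floor(double x) {
    if (!inRange(x)) return x;
    const std::int64_t i = static_cast<std::int64_t>(x);  // truncates toward zero
    return static_cast<double>(x < static_cast<double>(i) ? i - 1 : i);
}

double Round(double x) {
    if (!inRange(x)) return x;
    // Half away from zero, as in the thread. When x + 0.5 is not exactly
    // representable the sum itself is rounded, so the result can differ
    // from std::round for such inputs.
    return static_cast<double>(static_cast<std::int64_t>(x > 0 ? x + 0.5 : x - 0.5));
}
```

For example, Ceil(1.2) gives 2.0, Floor(-1.2) gives -2.0 and Round(-2.5) gives -3.0, while a value such as 1e300 takes the early return and comes back unchanged.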

The case of minimal gain, when all the double values are fractional:

2018.08.26 18:20:35.408 TestRound (EURUSD,M1)   Execution time of function sqrt = 1.083 nanoseconds, sum = 81969849.90928555
2018.08.26 18:20:35.413 TestRound (EURUSD,M1)   Execution time of function ceil =  3.579 nanoseconds, checksum = 5250492895.0
2018.08.26 18:20:35.416 TestRound (EURUSD,M1)   Execution time of function Ceil =  1.249 nanoseconds, checksum = 5250492895.0
2018.08.26 18:20:35.422 TestRound (EURUSD,M1)   Execution time of function floor = 3.931 nanoseconds, checksum = 5249492896.0
2018.08.26 18:20:35.424 TestRound (EURUSD,M1)   Execution time of function Floor = 0.513 nanoseconds, checksum = 5249492896.0
2018.08.26 18:20:35.427 TestRound (EURUSD,M1)   Execution time of function round = 1.519 nanoseconds, checksum = 5249992896.0
2018.08.26 18:20:35.429 TestRound (EURUSD,M1)   Execution time of function Round = 0.571 nanoseconds, checksum = 5249992896.0
Files:
TestRound.mq5  11 kb
 
Nikolai Semko:
So it turns out that compiled MQL5 code runs faster than even -Ofast? I find that hard to believe. You must have had a 32-bit compiler there.

I removed the subtraction of t0 everywhere (I thought it was some kind of error), and my output timed the whole loop rather than a single pass. Converted to your output format, nanoseconds per iteration (the first line, "Cycle time without rounding", is counted the same way in both), we get:

-O3
Cycle time without rounding = 1.099 nanoseconds, sum = 1.15583114 e+23
-Ofast
Cycle time without rounding = 0.552 nanoseconds, sum = 1.15583114 e+23

There is not much of a speedup on gcc (and with -Ofast it is even slower). On MQL there is a significant speedup judging by your test, but:

in your test 985'651 out of 1'000'000 iterations, i.e. almost all of them, satisfy the condition x < MIN || x > MAX.


-Ofast disables all inf/nan checks and errno setting, i.e. only the bare rounding on the FPU is left. And that bare rounding cannot be beaten even by the simple comparison x < MIN || x > MAX.
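Before reading much into such a benchmark, it can help to check how the input data splits between the two branches. A minimal C++ sketch under stated assumptions: countEarlyReturns is a hypothetical helper, and kLimit is assumed to be the 2^53 cutoff that plays the role of MIN/MAX in the functions discussed.

```cpp
#include <cstddef>
#include <vector>

// Cutoff assumed from the thread's Ceil/Floor/Round: 2^53.
constexpr double kLimit = 9007199254740992.0;

// Number of inputs for which the replacement functions return x immediately,
// so that what gets measured is one comparison rather than a rounding.
std::size_t countEarlyReturns(const std::vector<double>& values) {
    std::size_t n = 0;
    for (double x : values)
        if (x < -kLimit || x > kLimit) ++n;
    return n;
}
```

If nearly every sample takes the early return, as in the 985'651 out of 1'000'000 case, the timings mostly reflect the comparison, and a run on purely fractional values is the more telling one.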

 
pavlick_:

There is not much of a speedup on gcc (and with -Ofast it is even slower). On MQL it's significant.

However, it's hard to say. We subtracted t0 to get nice figures and ended up with a 20-fold difference. Even minimal extra code in the form of the loop itself (adding t0 back) turns a beautiful result of several tens of times into a much less attractive factor of about two. And what if it's not just a loop but a real algorithm doing something useful? The difference won't be visible at all: it will sit somewhere far after the decimal point and is unlikely to become a bottleneck. In a real application, taking a mutex, CPU barriers and memory allocation cost far more than rounding. All in all, it's not worth the effort, imho.

 
pavlick_:

The difference won't be visible at all: it will sit somewhere far after the decimal point and is unlikely to become a bottleneck. In a real application, taking a mutex, CPU barriers and memory allocation cost far more than rounding. All in all, it's not worth the effort, imho.

This is true 99% of the time, yes.

Before you optimise, you should make sure you have something to optimise.

In my own practice I remember only one case where my own implementation of atof really helped. Or at least it seemed to me that it did.

And you should keep in mind that any optimization (except ***) is not free.