Can you please send me the link?
Didn't keep it. It was mentioned here on the forum. I searched through the search engines myself.
Obviously, all these wheels have been reinvented many times over. Books have even been published, right down to asm implementations.
Nowadays the basics are hard to find, since almost everyone uses ready-made APIs for every occasion.
So you just have to register on forums and ask around.
Why don't you use LONG_MAX/MIN? It would look nicer, I think. I ran your tests with gcc (with minimal modifications, of course; the compiler is quite old, 5.4.0, which is what I had at hand):
Well, yes, it's not pretty. But LONG_MAX = 9223372036854775807 is greater than 9007199254740992. And the hexadecimal form of that number, 0x20000000000000, is rejected because it is treated as a ulong-only constant. I don't know how to make it any clearer. And I can't write (ulong)(1<<53), because that is a time-consuming operation.
The double type begins to hold only integers, with no fractional part, not from the LONG_MAX value but from the mantissa limit: 53 bits are allotted to the mantissa, i.e. 2^53 = 9007199254740992.
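To make that concrete, here is a minimal C++ sketch (my own illustration, not code from the thread; the name fast_round is hypothetical): any double with magnitude at or beyond 2^53 is already integral, which is exactly what the x < MIN || x > MAX guard discussed below exploits.

#include <cmath>
#include <cstdio>

// 2^53: from here on a double has no room left for a fractional part.
const double MAX =  9007199254740992.0;
const double MIN = -9007199254740992.0;

// Hypothetical helper: skip the library call when rounding cannot change x.
double fast_round(double x)
{
    if (x < MIN || x > MAX)
        return x;                        // already integral, nothing to round
    return std::round(x);
}

int main()
{
    double x = MAX + 0.3;                // the 0.3 cannot be stored
    printf("%d\n", x == std::floor(x));  // prints 1: x is already integral
    printf("%.1f\n", fast_round(123.7)); // 124.0
}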
Your code's timing is off - the output is in milliseconds (not nano), and I still don't understand why we need the minus t0.
t0 is the time of a full loop of 1,000,000 passes summing plain double values,
while t is the time of the same loop summing the same double values, but passed through ceil, round, etc.
I proceeded from the logic that the difference (t - t0) is the net time spent in these functions.
Of course, more objectivity could only be gained by taking several measurements.
As for nanoseconds: I compute them from the time of a single call out of the 1,000,000, so nanoseconds is exactly the right unit.
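For reference, a minimal C++ sketch of that measurement scheme (my own reconstruction, assuming std::chrono for timing; the test values and names are invented):

#include <chrono>
#include <cmath>
#include <cstdio>
#include <vector>

int main()
{
    const int N = 1000000;
    std::vector<double> v(N);
    for (int i = 0; i < N; ++i)
        v[i] = i * 0.37;                 // arbitrary fractional test data

    using clk = std::chrono::steady_clock;
    double s0 = 0, s1 = 0;

    auto a = clk::now();                 // t0: bare summation loop
    for (int i = 0; i < N; ++i) s0 += v[i];
    auto b = clk::now();                 // t: same loop through round()
    for (int i = 0; i < N; ++i) s1 += std::round(v[i]);
    auto c = clk::now();

    double t0 = std::chrono::duration<double, std::nano>(b - a).count();
    double t  = std::chrono::duration<double, std::nano>(c - b).count();
    printf("sums: %f %f\n", s0, s1);     // use the sums so the loops are not optimized away
    printf("round(): %.2f ns/call\n", (t - t0) / N);
}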
pavlick_:
I ran your tests with gcc (with minimal modifications, of course; the compiler is quite old, 5.4.0, which is what I had at hand):
1. Compiled with -O3.
2. Compiled with -Ofast.
As for not writing (ulong)(1<<53) because it is supposedly a time-consuming operation:
This operation is not time-consuming, like all operations on constants, including strings.
Wow, cool! Thanks. And here I thought it was computed every time. Well, yes, it's logical: it can already be computed at compile time.
Well, that's it then:
However, it would be more correct to write DBL_MANT_DIG instead of 53.
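Something like this, as a C++ sketch (the constant names are my own; DBL_MANT_DIG comes from <cfloat> and is 53 for an IEEE-754 double):

#include <cfloat>   // DBL_MANT_DIG == 53 for IEEE-754 double

// The shift is folded at compile time, so this costs nothing at run time.
constexpr double MAX =  (double)(1ULL << DBL_MANT_DIG);
constexpr double MIN = -MAX;
static_assert(MAX == 9007199254740992.0, "2^53, the mantissa limit");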
That is the case of minimal gain, when all the double values are fractional.
So it turns out that the compiled MQL5 code works faster than even -Ofast? I find it hard to believe; you must have had a 32-bit compiler.
I took the minus t0 out everywhere (I thought it was some kind of error), and my output measures the whole loop, not a single pass. Converting to your form of output, nanoseconds per iteration (in the first line, "Cycle time without rounding", we count the same way), we get:
There is not much acceleration on gcc (and it is even slower with -Ofast). On MQL there is a significant speedup judging by your test, but:
you have 985'651 out of 1'000'000, i.e. almost all iterations satisfy the condition x < MIN || x > MAX;
-Ofast disables all inf/nan checks and errno setting, i.e. only the bare rounding on the FPU is left. And that bare rounding cannot be beaten by a simple x < MIN || x > MAX comparison.
However, it's hard to say. We threw out t0 for the sake of nice figures and got a 20x difference. Even minimal additional code in the form of the loop itself (+t0) turns a beautiful result of several tens of times into a less attractive roughly two times. And what if it's not just a loop but a real algorithm doing something useful? Then the difference won't be visible at all; it will hang somewhere far after the decimal point and is unlikely to become a bottleneck. In a real application, taking a mutex, CPU barriers, and memory allocation are much more costly than rounding. All in all, the game is not worth the candle, imho.
This is true 99% of the time, yes.
Before you optimise, you should make sure you have something to optimise.
In my practice I remember only one case where my own implementation of atof really helped. Or at least it seemed to me that it did.
And you should keep in mind that any optimization (except ***) is not free.