OpenCL: internal implementation tests in MQL5 - page 44

 
joo:

It looks like it does.

I'm not sure, because I try to avoid tricky constructions (they are hard to read), though perhaps I shouldn't avoid them, since they can speed up code.

Your code should probably be slower, because the b variable is redeclared at each loop iteration.

My code is shorter and faster anyway. You can check it for both speed and equivalence:
"__kernel void MFractal(                                    \r\n"
"                       __global int *out                   \r\n"
"                      )                                    \r\n"
"  {                                                        \r\n"
"   int i = get_global_id(0);                               \r\n"
"   out[i]= i;                                              \r\n"
"  }                                                        \r\n"; //;-))
;-)
 
MetaDriver: My code is shorter and faster anyway. You can check it for both speed and equivalence ;-)
No, that's not fair. You should take Andrei's code and speed it up.
 
Mathemat:
No, that's not fair. You should take Andrei's code and speed it up.

And what did I do?

What is required of an optimiser? Reducing overhead while keeping the results equivalent! So check it: everything matches to the penny.

:)

// By the way!!! This code does not crash my driver even with #define BUF_SIZE 1024*1024*4 !!!

// This is a breakthrough!

;))))

 
MetaDriver:

And what did I do?

What is required of an optimiser? Reducing overhead while keeping the results equivalent! So check it: everything matches to the penny.

:)

Did it match? How did you check? :O
 
joo:
Did it match? How did you check? :O
How else? With a calculator! I ran eight of them in parallel and checked.
 
MetaDriver:
How else? With a calculator! I ran eight of them in parallel and checked.

Your calculator is lying. :)

The loop adds 0+1+2+3+...+99999999. And after every 10000 steps the value is 0.

And what did you do? You assigned the fly number (the work-item index) and that's it. How can my results be the same as yours?

 

I did a bit of digging and suspect that this variant would be faster still:

"__kernel void MFractal(                                    \r\n"
"                       __global int *out                   \r\n"
"                      )                                    \r\n"
"  {                                                        \r\n"
"   out[get_global_id(0)]= get_global_id(0);                \r\n"
"  }                                                        \r\n";

The thing is, get_global_id() is most likely not a real function call but a super-fast register read.

If anyone is interested enough to test this, please post the results here. It would be useful.
 
MetaDriver:

What is required of an optimiser? Reducing overhead while keeping the results equivalent! So check it: everything matches to the penny.

// By the way!!! This code does not crash my driver even with #define BUF_SIZE 1024*1024*4 !!!

I thought about it a bit longer and got upset. On the other hand, if the driver doesn't crash, then the results are already not fully equivalent.

Shit... What a bummer. :(

 
MetaDriver: I thought about it a bit longer and got upset. On the other hand, if the driver doesn't crash, then the results are already not fully equivalent.

Nah, you didn't think it through well enough.

1^3 + 2^3 + 3^3 + ... + 1000000000^3 = 1000000000^2 * (1000000000 + 1)^2 / 4

The left and right sides are the same thing: full equivalence.

But with the left-hand expression you will keep the CPU hot for quite a long time, while the right-hand one is computed almost instantly; the processor won't even notice and will stay cold.
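To make the point concrete, here is a minimal sketch in plain C (not code from this thread) comparing the two sides of the identity. The function names `sum_cubes_loop` and `sum_cubes_closed` are made up for illustration. Note the assumption: with 64-bit integers both sides overflow long before n = 1000000000, so the sketch is only valid for n up to roughly 90000.

```c
#include <stdint.h>

/* Left-hand side: add the cubes one by one, O(n) work. */
uint64_t sum_cubes_loop(uint64_t n)
{
    uint64_t total = 0;
    for (uint64_t k = 1; k <= n; k++)
        total += k * k * k;
    return total;
}

/* Right-hand side: n^2 * (n + 1)^2 / 4, written as (n * (n + 1) / 2)^2
   so that the intermediate value stays small. O(1) work.
   Assumption: n is small enough (about 90000 or less) to fit in 64 bits. */
uint64_t sum_cubes_closed(uint64_t n)
{
    uint64_t half = n * (n + 1) / 2;
    return half * half;
}
```

For example, both `sum_cubes_loop(10)` and `sum_cubes_closed(10)` give 3025, but only the first one needs a loop.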

P.S. I upgraded to OpenCL 1.2 (it's a beta). Note the small addition after the version number: sse2.

2012.03.22 07:43:28     Terminal        CPU: GenuineIntel  Intel(R) Pentium(R) CPU G840 @ 2.80 GHz with OpenCL 1.2 (2 units, 2793 MHz, 7912 Mb, version 2.0 (sse2))

I can't say it has improved dramatically, but some tests got faster. For example, Tast_Mand_ (well, you're a pervert, Andrewha) sped up by 5% to 10%. Not much, but nice.

 
joo:

Your calculator is lying. :)

The loop adds 0+1+2+3+...+99999999. And after every 10000 steps the value is 0.

And what did you do? You assigned the fly number (the work-item index) and that's it. So how can my results be the same as yours?

I gave you almost 24 hours to come to your senses. You still insist? :)

Let's see:

// This is your code (the original)
"  {                                \r\n"
"   int i = get_global_id(0);       \r\n"
"   for(int u=0;u<100000000;u++)   \r\n"
"   {                           \r\n"
"    out[i]+=u;              \r\n"
"    if(out[i]>10000)     \r\n"
"      out[i]=0;         \r\n" // after the ten-thousandth iteration this statement runs on every pass of the loop.
"   }                    \r\n" // i.e. on leaving the loop we will have out[i] = 0 no matter what;
"   out[i]+= i;          \r\n" // if you add the fly's number to zero you get... work it out yourself... on a calculator... :)
"  }                     \r\n";// there is another, more reliable option: print the results and compare.  ;-)))
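For anyone who wants to check the argument without a GPU, here is a minimal host-side sketch in plain C. It is not the kernel itself: it emulates the quoted loop for a single work-item. The names `emulate_original` and `emulate_short` and the `initial` and `iterations` parameters are hypothetical, added only for illustration, and `initial` is assumed to be a small non-negative buffer value so that the `int` arithmetic does not overflow.

```c
/* Emulates the quoted kernel body for one work-item with index i.
   initial    - assumed starting value of out[i] (hypothetical parameter)
   iterations - loop bound, 100000000 in the quoted code */
int emulate_original(int i, int initial, int iterations)
{
    int out = initial;
    for (int u = 0; u < iterations; u++) {
        out += u;
        if (out > 10000)   /* once u > 10000 this fires on every pass */
            out = 0;
    }
    out += i;              /* 0 + i when the loop ran long enough */
    return out;
}

/* Emulates the short kernel: out[i] = i. */
int emulate_short(int i)
{
    return i;
}
```

With the full 100000000 iterations, `emulate_original(5, 0, 100000000)` returns 5, the same as `emulate_short(5)`. With a short loop the two differ: `emulate_original(0, 0, 100)` returns 4950, because the sum never exceeds 10000 and is never reset. So the equivalence depends on the loop running past u = 10000.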