OpenCL: internal implementation tests in MQL5 - page 58

 
Mathemat:

Well, finally, we've been waiting for you. If you are ready to experiment, install the Intel OpenCL Runtime (link, instructions).

Run the script without changing anything in it and post the log from the "Experts" tab, just as your colleagues did before you. The script will take about 3 minutes to run on your gem of a machine, so please be patient. At the same time we will find out how OpenCL runs on the top Sandy Bridge CPU on its own.

If you don't have the patience or if you think it's beyond your abilities, that's OK, no offence taken.

P.S. The purpose of this script is to see what a bare CPU can do without any discrete video dragons. I suspect that with the Intel Runtime properly installed, this script will show a speedup of around 200x or even slightly higher.

There is some doubt that Intel OpenCL Runtime is installed correctly.

2012.04.23 00:17:51 ParallelTester_00-01x__3 (EURUSD,H1) CpuTime/GpuTime = 1347.164383561644

2012.04.23 00:17:51 ParallelTester_00-01x__3 (EURUSD,H1) Result on Cpu МахResult==0.9316 at 10253 pass

2012.04.23 00:17:51 ParallelTester_00-01x__3 (EURUSD,H1) Соunt indicators = 16; Count history bars = 144000; Count pass = 12800

2012.04.23 00:17:51 ParallelTester_00-01x__3 (EURUSD,H1) CPU time = 295029 ms

2012.04.23 00:12:56 ParallelTester_00-01x__3 (EURUSD,H1) Result on Gpu МахResult==0.9316 at 10253 pass

2012.04.23 00:12:56 ParallelTester_00-01x__3 (EURUSD,H1) Соunt indicators = 16; Count history bars = 144000; Count pass = 12800

2012.04.23 00:12:56 ParallelTester_00-01x__3 (EURUSD,H1) GPU time = 219 ms

2012.04.23 00:12:56 ParallelTester_00-01x__3 (EURUSD,H1) OpenCL init OK!

2012.04.23 00:12:56 ParallelTester_00-01x__3 (EURUSD,H1) CLGetInfoInteger() returned 4


 
casinonsk:

There are doubts that Intel OpenCL Runtime is installed correctly.

2012.04.23 00:17:51 ParallelTester_00-01x__3 (EURUSD,H1) CPU time = 295029 ms

2012.04.23 00:12:56 ParallelTester_00-01x__3 (EURUSD,H1) Result on Gpu МахResult==0.9316 at 10253 pass

2012.04.23 00:12:56 ParallelTester_00-01x__3 (EURUSD,H1) Соunt indicators = 16; Count history bars = 144000; Count pass = 12800

2012.04.23 00:12:56 ParallelTester_00-01x__3 (EURUSD,H1) GPU time = 219 ms

You probably set the CLContextCreate() argument to something other than zero. I did ask you not to change anything! We already know the capabilities of your two-headed video dragon, and it seems you even have several of them.

According to my rough estimate, with CLContextCreate(0) you should get figures of about CPU time = 180000 ms and GPU time = 900 ms. Your first run time is strangely high for such a CPU. Maybe it was overloaded with other tasks?

Can you just run the script as I attached it - without changing anything in the code, not a single character?

P.S. Of course, maybe device = 0 corresponds not to the CPU but to something else. Well, then experiment (from 0 to 3). The highest GPU time should in theory correspond to the bare CPU, i.e. host.

 
Mathemat:

You have not set zero as an argument of CLContextCreate() but something else. Well, I asked you not to change anything! We already know the capabilities of your two-headed video dragon. But you seem to have several of them.

According to my rough estimate, at CLContextCreate(0) you should have figures on the order of CPU time = 180000 ms and GPU time = 900 ms (approximately). Your first runtime is strangely high for such a CPU. Maybe it was overloaded with other tasks?

Can you just run the script as I attached it - without changing anything, not a single character?

P.S. Of course, maybe device = 0 corresponds not to the CPU but to something else. Well, then experiment (from 0 to 3). The highest GPU time should in theory correspond to the bare CPU, i.e. host.

I didn't change the parameters! Ran it as is.

According to the video, yes it is 2x590.

I tried it again with CLContextCreate(0), and then with 1, 2 and 3; the result is the same as before.

The problem may be with the Intel OpenCL Runtime.

 
casinonsk:

Didn't change the parameters! Started as is.

Ran it again with CLContextCreate(0), and then with 1, 2 and 3; the result is the same as before.

All the results are the same? I don't believe it... Well, this script can't have GPU time = 219 ms on a bare CPU. At the very best, it would be about 800 ms, but not about 200 ms.

Here's my typical result (I have Pentium G840 CPU):

2012.04.22 22:23:09    ParallelTester_00-01x (EURUSD,H1)    CpuTime/GpuTime = 88.40817091454272
2012.04.22 22:23:09    ParallelTester_00-01x (EURUSD,H1)    Result on Cpu МахResult==1.05116 at 7785 pass
2012.04.22 22:23:09    ParallelTester_00-01x (EURUSD,H1)    Соunt indicators = 16; Count history bars = 144000; Count pass = 12800
2012.04.22 22:23:09    ParallelTester_00-01x (EURUSD,H1)    CPU time = 235873 ms
2012.04.22 22:19:13    ParallelTester_00-01x (EURUSD,H1)    Result on Gpu МахResult==1.05116 at 7785 pass
2012.04.22 22:19:13    ParallelTester_00-01x (EURUSD,H1)    Соunt indicators = 16; Count history bars = 144000; Count pass = 12800
2012.04.22 22:19:13    ParallelTester_00-01x (EURUSD,H1)    GPU time = 2668 ms
2012.04.22 22:19:10    ParallelTester_00-01x (EURUSD,H1)    OpenCL init OK!
2012.04.22 22:19:10    ParallelTester_00-01x (EURUSD,H1)    CLGetInfoInteger() returned 1
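The CpuTime/GpuTime line in the log above appears to be simply the plain CPU time divided by the OpenCL time. A minimal Python sketch of that arithmetic, using the figures from this log; the function name `speedup` is made up for illustration and is not part of the script:

```python
# Times copied from the Pentium G840 log above.
cpu_time_ms = 235873  # "CPU time" line: the plain x86 pass
gpu_time_ms = 2668    # "GPU time" line: the OpenCL pass


def speedup(cpu_ms: float, device_ms: float) -> float:
    """Return how many times faster the OpenCL pass was than the plain CPU pass."""
    if device_ms <= 0:
        raise ValueError("device time must be positive")
    return cpu_ms / device_ms


ratio = speedup(cpu_time_ms, gpu_time_ms)
print(f"CpuTime/GpuTime = {ratio:.2f}")
```

The same division reproduces the ratios in the other logs in this thread, for example 295029 ms / 219 ms.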

OK, let's move on.

 
Mathemat:

Are all the results the same? I don't believe it... Well, this script can't have GPU time = 219 ms on a bare CPU. At best it will be about 800 ms, but not about 200 ms.

Ok, forget it.

I just tried again with CLContextCreate(0):

 2012.04.23 01:27:15 ParallelTester_00-01x__3 (EURUSD,H1) CpuTime/GpuTime = 1265.405982905983

2012.04.23 01:27:15 ParallelTester_00-01x__3 (EURUSD,H1) Result on Cpu МахResult==1.48772 at 2051 pass

2012.04.23 01:27:15 ParallelTester_00-01x__3 (EURUSD,H1) Соunt indicators = 16; Count history bars = 144000; Count pass = 12800

2012.04.23 01:27:15 ParallelTester_00-01x__3 (EURUSD,H1) CPU time = 296105 ms

2012.04.23 01:22:19 ParallelTester_00-01x__3 (EURUSD,H1) Result on Gpu МахResult==1.48772 at 2051 pass

2012.04.23 01:22:19 ParallelTester_00-01x__3 (EURUSD,H1) Соunt indicators = 16; Count history bars = 144000; Count pass = 12800

2012.04.23 01:22:19 ParallelTester_00-01x__3 (EURUSD,H1) GPU time = 234 ms

2012.04.23 01:22:18 ParallelTester_00-01x__3 (EURUSD,H1) OpenCL init OK!

2012.04.23 01:22:18 ParallelTester_00-01x__3 (EURUSD,H1) CLGetInfoInteger() returned 5


 
casinonsk: Just tried again with CLContextCreate(0)

It's obviously on a discrete card, not on CPU: such speedups on emulation are hardly possible. And the number of devices you have is already 5, it's creepy.

If you don't mind, please run this slightly modified code and post the result here. In it, the calculations for the various OpenCL devices are put into a loop (they should be fast), while the x86 calculation, the longest one, is executed only once. It will take a while, but you only have to run the script once.
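The structure described above (time every OpenCL device in a loop, then run the slow x86 pass a single time) can be sketched roughly as follows. This is Python, not the MQL5 script itself; `run_benchmark`, `fake_work` and the device labels are hypothetical stand-ins for illustration only.

```python
import time
from typing import Callable, Dict, List, Tuple

calls: List[str] = []  # records the order in which the stand-in workloads run


def fake_work(label: str, n: int) -> int:
    """Stand-in workload: logs its label and burns some CPU time."""
    calls.append(label)
    return sum(i * i for i in range(n))


def run_benchmark(
    device_runs: Dict[int, Callable[[], int]],
    cpu_run: Callable[[], int],
) -> Tuple[Dict[int, float], float]:
    """Time each device run in a loop, then the reference CPU run exactly once."""
    device_ms: Dict[int, float] = {}
    for device_id, run in device_runs.items():
        start = time.perf_counter()
        run()
        device_ms[device_id] = (time.perf_counter() - start) * 1000.0

    start = time.perf_counter()
    cpu_run()  # the long pass is not repeated per device
    cpu_ms = (time.perf_counter() - start) * 1000.0
    return device_ms, cpu_ms


device_ms, cpu_ms = run_benchmark(
    {
        0: lambda: fake_work("device 0", 20_000),
        1: lambda: fake_work("device 1", 2_000),
    },
    lambda: fake_work("cpu", 200_000),
)
for device_id, ms in device_ms.items():
    print(f"Device number = {device_id}: {ms:.1f} ms")
print(f"CPU time = {cpu_ms:.1f} ms")
```

Running the reference pass once keeps the total run time close to that of the original script, however many devices are present.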

I realise that I am already boring you. But in any case it would be good info for the Support Team.
 

I have an interesting result )) Got it: the value is not the total number of devices, it's the number of the current device.

2012.04.22 22:02:51     ParallelTester_00-01 x_cycle (EURUSD,M30)        OpenCL init OK! Device number = 0

Although calculations are in progress, I'll post them when ready.

By the way, neither CLGetInfoInteger() nor CL_DEVICE_COUNT is in the help.

PS. result

2012.04.22 22:02:51     ParallelTester_00-01 x_cycle (EURUSD,M30)        OpenCL init OK! Device number = 0
2012.04.22 22:03:03     ParallelTester_00-01 x_cycle (EURUSD,M30)        GPU time = 11357 ms
2012.04.22 22:03:03     ParallelTester_00-01 x_cycle (EURUSD,M30)        Соunt indicators = 16; Count history bars = 144000; Count pass = 12800
2012.04.22 22:03:03     ParallelTester_00-01 x_cycle (EURUSD,M30)        Result on Gpu МахResult==1.68487 at 9198 pass
2012.04.22 22:03:03     ParallelTester_00-01 x_cycle (EURUSD,M30)        OpenCL init OK! Device number = 1
2012.04.22 22:03:04     ParallelTester_00-01 x_cycle (EURUSD,M30)        GPU time = 998 ms
2012.04.22 22:03:04     ParallelTester_00-01 x_cycle (EURUSD,M30)        Соunt indicators = 16; Count history bars = 144000; Count pass = 12800
2012.04.22 22:03:04     ParallelTester_00-01 x_cycle (EURUSD,M30)        Result on Gpu МахResult==1.68487 at 9198 pass
2012.04.22 22:10:13     ParallelTester_00-01 x_cycle (EURUSD,M30)        CPU time = 428706 ms
2012.04.22 22:10:13     ParallelTester_00-01 x_cycle (EURUSD,M30)        Соunt indicators = 16; Count history bars = 144000; Count pass = 12800
2012.04.22 22:10:13     ParallelTester_00-01 x_cycle (EURUSD,M30)        Result on Cpu МахResult==1.68487 at 9198 pass
2012.04.22 22:10:13     ParallelTester_00-01 x_cycle (EURUSD,M30)        CpuTime/GpuTime = 429.565130260521
 
fyords: By the way, there is no CLGetInfoInteger() or CL_DEVICE_COUNT in the help.

PS. result

1. Update your help; yours is outdated.

2. You have

2012.03.04 22:27:16     Terminal        GPU: NVIDIA Corporation GeForce GT 440 with OpenCL 1.1 (2 units, 1660 MHz, 1024 Mb, version 295.73)
2012.03.04 22:27:16     Terminal        CPU: AuthenticAMD AMD Athlon(tm) II X4 630 Processor with OpenCL 1.1 (4 units, 2812 MHz, 2048 Mb, version 2.0)

Very likely the first number, 11357 ms, refers to the host (bare CPU), and the second, 998 ms, refers to the graphics card. The speedup on the host (428706 ms / 11357 ms, about 38 times on a 4-core Athlon II) is roughly consistent with what AMD OpenCL achieves. It is actually a bit low, though; it should be more, somewhere close to 50-60. It is quite possible that your memory is very slow.

The CpuTime/GpuTime figure, of course, is computed only for the last device processed.
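To make the reasoning above concrete, here is a small Python sketch that computes the speedup per device from the x_cycle log and applies the rule of thumb from earlier in the thread (the device with the highest GPU time is probably the host CPU). The variable names are illustrative, and the host guess is only a heuristic, not something the log states.

```python
# Figures copied from the x_cycle log above.
cpu_time_ms = 428706                   # plain x86 pass, executed once
device_time_ms = {0: 11357, 1: 998}    # "GPU time" per "Device number"

# Speedup of each OpenCL device relative to the plain x86 pass.
speedups = {dev: cpu_time_ms / ms for dev, ms in device_time_ms.items()}

# Heuristic from this thread: the slowest OpenCL device is probably the host CPU.
likely_host = max(device_time_ms, key=device_time_ms.get)

# The script logs CpuTime/GpuTime only for the last device in the loop.
last_device = list(device_time_ms)[-1]

for dev, s in speedups.items():
    tag = " (probably the host CPU)" if dev == likely_host else ""
    print(f"device {dev}: speedup {s:.1f}x{tag}")
print(f"logged CpuTime/GpuTime belongs to device {last_device}")
```

With these figures, device 0 comes out at roughly 38x and device 1 at roughly 430x, and the latter matches the CpuTime/GpuTime value in the log.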

 
Mathemat:

1. Update the help; yours is clearly out of date.

2. The CpuTime/GpuTime figure, of course, is only calculated for the last device calculated.

1. I updated it manually and everything is there, thank you. But isn't the help updated along with the terminal and MetaEditor?

2. Yes, but still, nice ).

 
papaklass:

With your hardware it's fairly clear: the host is your only OpenCL device. On the other hand, it's strange how you got such a high result earlier (page 51):

2012.04.08 13:28:01     ParallelTester_00-02-s16x7x3k (EURUSD,H1)       OpenCL init OK!
2012.04.08 13:28:08     ParallelTester_00-02-s16x7x3k (EURUSD,H1)       GPU time = 7145 ms
2012.04.08 13:28:08     ParallelTester_00-02-s16x7x3k (EURUSD,H1)       Соunt inticators = 16; Count history bars = 50000; Count pass = 4096
2012.04.08 13:28:08     ParallelTester_00-02-s16x7x3k (EURUSD,H1)       Result on Gpu МахResult==3.86669 at 1682 pass
2012.04.08 13:35:11     ParallelTester_00-02-s16x7x3k (EURUSD,H1)       CPU time = 422888 ms
2012.04.08 13:35:11     ParallelTester_00-02-s16x7x3k (EURUSD,H1)       Соunt inticators = 16; Count history bars = 50000; Count pass = 4096
2012.04.08 13:35:11     ParallelTester_00-02-s16x7x3k (EURUSD,H1)       Result on Cpu МахResult==3.86669 at 1682 pass
2012.04.08 13:35:11     ParallelTester_00-02-s16x7x3k (EURUSD,H1)       CpuTime/GpuTime = 59.18656403079076

For some reason I can't get the cursor out of the inserted code block. The same thing happens when I quote someone. Is it a forum bug?

Most likely it's a bug in the forum engine, though it doesn't always happen. I usually switch to the HTML tab and manually type a couple of letters after the last tag. Then I go back to the visual editing mode of the post.