Evaluating CPU cores for optimisation - page 3

 
Aleksey Vyazmikin:

The idea is that you could get a faster processor (buy it on eBay or Ali) and it would be fine, but it's not fast enough.

No, the processor is soldered in. It is really a media player hooked up to a projector, for watching all sorts of nonsense before sleep. The point of the experiment was to test on the weakest hardware available that MetaTrader will still run on.

Added:

A good start:

code generated          
0 error(s), 0 warning(s), 344528 msec elapsed   
 
Serhii Shevchuk:

No, the processor is soldered in. It is really a media player hooked up to a projector, for watching all sorts of nonsense before sleep. The point of the experiment was to test on the weakest hardware available that MetaTrader will still run on.

Ah, so that's it. It was the 8 GB of RAM that confused me!

Serhii Shevchuk:

A good start:

And if you move the calculations to the body rather than to a function, it takes hours to compile...

 
Aleksey Vyazmikin:

Please test this version on FX with 4 and 8 agents.

So.
Agents: 4. Optimisation passes: 8. Results:

2019.08.10 01:31:47.465 Core 4  pass 2 returned result 1001000.00 in 0:02:54.814
2019.08.10 01:31:48.782 Core 1  pass 0 returned result 1001000.00 in 0:02:56.136
2019.08.10 01:31:49.263 Core 2  pass 4 returned result 1001000.00 in 0:02:56.557
2019.08.10 01:31:50.412 Core 3  pass 6 returned result 1001000.00 in 0:02:57.711
2019.08.10 01:34:45.025 Core 4  pass 3 returned result 1001000.00 in 0:02:57.560
2019.08.10 01:34:45.482 Core 2  pass 5 returned result 1001000.00 in 0:02:56.218
2019.08.10 01:34:45.637 Core 1  pass 1 returned result 1001000.00 in 0:02:56.850
2019.08.10 01:34:49.330 Core 3  pass 7 returned result 1001000.00 in 0:02:58.916
2019.08.10 01:34:49.330 Tester  optimization finished, total passes 8
2019.08.10 01:34:49.341 Statistics      optimization done in 5 minutes 58 seconds
2019.08.10 01:34:49.341 Statistics      shortest pass 0:02:54.814, longest pass 0:02:58.916, average pass 0:02:56.845

Agents: 8. Optimisation passes: 8. Results:

2019.08.10 01:41:03.259 Core 2  pass 1 returned result 1001000.00 in 0:04:22.641
2019.08.10 01:41:07.004 Core 8  pass 7 returned result 1001000.00 in 0:04:25.297
2019.08.10 01:41:07.715 Core 7  pass 4 returned result 1001000.00 in 0:04:26.076
2019.08.10 01:41:08.051 Core 1  pass 0 returned result 1001000.00 in 0:04:27.445
2019.08.10 01:41:09.096 Core 6  pass 5 returned result 1001000.00 in 0:04:27.458
2019.08.10 01:41:09.459 Core 4  pass 3 returned result 1001000.00 in 0:04:28.851
2019.08.10 01:41:09.695 Core 3  pass 2 returned result 1001000.00 in 0:04:29.082
2019.08.10 01:41:09.966 Core 5  pass 6 returned result 1001000.00 in 0:04:28.213
2019.08.10 01:41:09.966 Tester  optimization finished, total passes 8
2019.08.10 01:41:09.977 Statistics      optimization done in 4 minutes 29 seconds
2019.08.10 01:41:09.977 Statistics      shortest pass 0:04:22.641, longest pass 0:04:29.082, average pass 0:04:26.882

Added:

So you're right - the FX processor smells like hyperthreading (although I took it as a full 8-core).

And the agent manager sees it as 8-core too (with the existing 7 agents it offers to add 1 more, see image):

[image: agents]

P.S. Now this is interesting!

I deleted all the agents and decided to add them again. And then, for the first time in I don't know how many years, it offered to add 4 agents instead of the usual 8:

[image: agents2019]

So something has changed in the tester with regard to ancient AMD processors, and the change is very recent.

P.P.S.

Yes, build 2085 still offered to add 8 agents. So this is a very recent innovation:

[image: 2085]

As time goes on, everything is shrinking. There used to be eight agents, now there are four.
 
Serhii Shevchuk:

So.
Agents: 4. Optimization passes: 8. Results:

Agents: 8. Optimisation passes: 8. Results:

Added:

So you're right - the FX processor smells like hyperthreading (although I took it as a full 8-core).

And the agent manager sees it as 8-core too (with the existing 7 agents it offers to add 1 more, see image):


P.S. Now this is interesting!

I deleted all the agents and decided to add them again. And then, for the first time in I don't know how many years, it offered to add 4 agents instead of the usual 8:

So something has changed in the tester with regard to ancient AMD processors, and the change is very recent.

P.P.S.

Yes, build 2085 still offered to add 8 agents. So this is a very recent innovation:


As time goes on, everything is shrinking. There used to be eight agents, now there are four.

I wasn't aware that the update to build 2097 was already rolling out. I have been using that build since mid-July (as a tester), so I knew about this approach of disabling questionable agents, i.e. those on hyperthreading and on AMD designs like this one. For now all agents can still be used on the local machine, and hopefully that won't change. As for why it is done, I suspect it is for a fair valuation of resources when they are sold in the cloud.

Still, it's interesting that the gain from 8 threads is quite decent - about 25% - so it's worth loading all 8 agents.
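
The 25% figure can be read two ways, so here is a minimal Python sketch of the arithmetic, using the totals from the FX logs quoted above (5 minutes 58 seconds with 4 agents, 4 minutes 29 seconds with 8 agents, 8 passes in both runs). The function names are illustrative only and are not part of MetaTrader.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a 'N minutes M seconds' total from the tester log to seconds."""
    return minutes * 60 + seconds


def compare(base_s: float, new_s: float) -> tuple:
    """Return (fraction of wall-clock time saved, fractional throughput gain).

    Both runs are assumed to perform the same number of passes.
    """
    time_saved = (base_s - new_s) / base_s
    throughput_gain = base_s / new_s - 1.0
    return time_saved, throughput_gain


four_agents = to_seconds(5, 58)   # "optimization done in 5 minutes 58 seconds"
eight_agents = to_seconds(4, 29)  # "optimization done in 4 minutes 29 seconds"

saved, gain = compare(four_agents, eight_agents)
print(f"time saved: {saved:.1%}, throughput gain: {gain:.1%}")
```

With these inputs the time saving comes out at about 24.9% and the throughput gain at about 33.1%, so the 25% corresponds to the reduction in wall-clock time.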

On this AMD architecture, every two integer cores (ALUs), which handle addition, subtraction, comparison logic and maybe more, share one FPU, which handles division and multiplication, i.e. floating-point operations. So it is not hyperthreading in the pure sense.

 

Got my hands on a netbook, and it turned out to be faster than the Xeon, which surprised me.

2019.08.05 22:37:53.817 Terminal        Windows 7 Service Pack 1 (build 7601), IE 11, UAC, Intel Atom  N570 @ 1.66 GHz, Memory: 578 / 2035 Mb, Disk: 56 / 280 Gb, GMT+3

2019.08.10 23:58:27.648 Core 1  pass 0 returned result 1001000.00 in 0:09:33.408
2019.08.10 23:58:28.363 Core 2  pass 2 returned result 1001000.00 in 0:09:34.188
2019.08.11 00:08:04.213 Core 1  pass 1 returned result 1001000.00 in 0:09:36.913
2019.08.11 00:08:05.355 Core 2  pass 3 returned result 1001000.00 in 0:09:37.257
2019.08.11 00:08:05.355 Tester  optimization finished, total passes 4
2019.08.11 00:08:05.366 Statistics      optimization done in 19 minutes 11 seconds
2019.08.11 00:08:05.366 Statistics      shortest pass 0:09:33.408, longest pass 0:09:37.257, average pass 0:09:35.441
 

Got a chance to test a machine with an FX-8350, at 4000 MHz with no overclocking.

2019.08.11 10:41:32.541 Terminal        Windows 7 Service Pack 1 (build 7601) x64, IE 10, AMD FX-8350 Eight-Core Processor , Memory: 21877 / 24533 Mb, Disk: 51 / 499 Gb, GMT+3

Tree_Brut_TestPL - 4 agents

2019.08.11 10:57:01.270 Core 3  pass 2 returned result 1001000.00 in 0:01:21.389
2019.08.11 10:57:01.466 Core 1  pass 0 returned result 1001000.00 in 0:01:21.616
2019.08.11 10:57:01.851 Core 4  pass 6 returned result 1001000.00 in 0:01:21.950
2019.08.11 10:57:03.201 Core 2  pass 4 returned result 1001000.00 in 0:01:23.292
2019.08.11 10:58:21.943 Core 3  pass 3 returned result 1001000.00 in 0:01:20.680
2019.08.11 10:58:22.763 Core 1  pass 1 returned result 1001000.00 in 0:01:21.304
2019.08.11 10:58:23.899 Core 4  pass 7 returned result 1001000.00 in 0:01:22.056
2019.08.11 10:58:26.569 Core 2  pass 5 returned result 1001000.00 in 0:01:23.375
2019.08.11 10:58:26.569 Tester  optimization finished, total passes 8
2019.08.11 10:58:26.579 Statistics      optimization done in 2 minutes 47 seconds
2019.08.11 10:58:26.579 Statistics      shortest pass 0:01:20.680, longest pass 0:01:23.375, average pass 0:01:21.957

Tree_Brut_TestPL - 8 agents

2019.08.11 11:11:21.820 Core 7  pass 5 returned result 1001000.00 in 0:02:03.874
2019.08.11 11:11:22.139 Core 2  pass 2 returned result 1001000.00 in 0:02:04.354
2019.08.11 11:11:24.113 Core 3  pass 6 returned result 1001000.00 in 0:02:06.141
2019.08.11 11:11:24.195 Core 4  pass 1 returned result 1001000.00 in 0:02:06.470
2019.08.11 11:11:24.394 Core 6  pass 0 returned result 1001000.00 in 0:02:06.539
2019.08.11 11:11:24.917 Core 5  pass 7 returned result 1001000.00 in 0:02:06.966
2019.08.11 11:11:28.852 Core 1  pass 3 returned result 1001000.00 in 0:02:11.027
2019.08.11 11:11:30.336 Core 8  pass 4 returned result 1001000.00 in 0:02:12.302
2019.08.11 11:11:30.336 Tester  optimization finished, total passes 8
2019.08.11 11:11:30.346 Statistics      optimization done in 2 minutes 13 seconds
2019.08.11 11:11:30.346 Statistics      shortest pass 0:02:03.874, longest pass 0:02:12.302, average pass 0:02:07.209

Tree_Brut_TestPL_F - 4 agents

2019.08.11 11:15:56.836 Core 4  pass 0 returned result 1001000.00 in 0:02:03.360
2019.08.11 11:15:57.088 Core 3  pass 6 returned result 1001000.00 in 0:02:03.567
2019.08.11 11:15:57.744 Core 2  pass 4 returned result 1001000.00 in 0:02:04.248
2019.08.11 11:15:58.259 Core 1  pass 2 returned result 1001000.00 in 0:02:04.762
2019.08.11 11:17:53.839 Core 2  pass 5 returned result 1001000.00 in 0:01:56.106
2019.08.11 11:17:55.203 Core 3  pass 7 returned result 1001000.00 in 0:01:58.126
2019.08.11 11:17:55.210 Core 4  pass 1 returned result 1001000.00 in 0:01:58.387
2019.08.11 11:17:55.615 Core 1  pass 3 returned result 1001000.00 in 0:01:57.366
2019.08.11 11:17:55.615 Tester  optimization finished, total passes 8
2019.08.11 11:17:55.625 Statistics      optimization done in 4 minutes 03 seconds
2019.08.11 11:17:55.625 Statistics      shortest pass 0:01:56.106, longest pass 0:02:04.762, average pass 0:02:00.740

Tree_Brut_TestPL_F - 8 agents

2019.08.11 11:24:05.758 Core 6  pass 6 returned result 1001000.00 in 0:03:30.450
2019.08.11 11:24:06.511 Core 1  pass 2 returned result 1001000.00 in 0:03:32.370
2019.08.11 11:24:07.029 Core 4  pass 3 returned result 1001000.00 in 0:03:32.860
2019.08.11 11:24:08.345 Core 2  pass 1 returned result 1001000.00 in 0:03:34.210
2019.08.11 11:24:08.447 Core 5  pass 7 returned result 1001000.00 in 0:03:33.167
2019.08.11 11:24:08.482 Core 3  pass 0 returned result 1001000.00 in 0:03:34.280
2019.08.11 11:24:08.768 Core 8  pass 4 returned result 1001000.00 in 0:03:33.688
2019.08.11 11:24:10.260 Core 7  pass 5 returned result 1001000.00 in 0:03:35.018
2019.08.11 11:24:10.260 Tester  optimization finished, total passes 8
2019.08.11 11:24:10.270 Statistics      optimization done in 3 minutes 37 seconds
2019.08.11 11:24:10.270 Statistics      shortest pass 0:03:30.450, longest pass 0:03:35.018, average pass 0:03:33.255

And yet, there is a clear performance gain from using 8 agents vs. 4 agents, although the speed per agent is significantly higher when using 4 agents.
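
For anyone who wants to compare per-agent speed across these runs without reading the logs by eye, here is a rough Python sketch that pulls the pass durations out of pasted tester log lines. The regular expression is an assumption based only on the log format shown in this thread, and the two sample lines are copied from the 4-agent Tree_Brut_TestPL run above; the names `PASS_RE` and `pass_durations` are made up for the example.

```python
import re

# Matches e.g. "pass 2 returned result 1001000.00 in 0:01:21.389"
PASS_RE = re.compile(
    r"pass\s+(\d+)\s+returned result\s+([\d.]+)\s+in\s+(\d+):(\d{2}):(\d{2}\.\d+)"
)


def pass_durations(log_text: str) -> list:
    """Return the duration of every pass found in the log text, in seconds."""
    durations = []
    for match in PASS_RE.finditer(log_text):
        hours = int(match.group(3))
        minutes = int(match.group(4))
        seconds = float(match.group(5))
        durations.append(hours * 3600 + minutes * 60 + seconds)
    return durations


SAMPLE = """\
2019.08.11 10:57:01.270 Core 3  pass 2 returned result 1001000.00 in 0:01:21.389
2019.08.11 10:57:01.466 Core 1  pass 0 returned result 1001000.00 in 0:01:21.616
"""

durations = pass_durations(SAMPLE)
average = sum(durations) / len(durations)
print(f"{len(durations)} passes, average {average:.3f} s")
```

Pasting a full run into `SAMPLE` should give an average close to the "average pass" line the tester prints, which makes it easy to put the 4-agent and 8-agent figures side by side.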

 

The next computer is an older Phenom II X6 1065T running at 2900 MHz.

2019.08.11 12:08:13.905 Terminal        Windows 7 (build 7600) x64, IE 8, AMD Phenom II X6 1065 T Processor, Memory: 5996 / 8191 Mb, Disk: 79 / 224 Gb, GMT+3

Tree_Brut_TestPL - 6 agents

2019.08.11 12:11:06.766 Core 3  pass 0 returned result 1001000.00 in 0:02:16.413
2019.08.11 12:11:07.043 Core 2  pass 4 returned result 1001000.00 in 0:02:16.593
2019.08.11 12:11:08.079 Core 4  pass 2 returned result 1001000.00 in 0:02:17.538
2019.08.11 12:11:08.277 Core 6  pass 3 returned result 1001000.00 in 0:02:17.698
2019.08.11 12:11:16.133 Core 1  pass 1 returned result 1001000.00 in 0:02:25.720
2019.08.11 12:11:17.239 Core 5  pass 5 returned result 1001000.00 in 0:02:26.692
2019.08.11 12:11:17.239 Tester  optimization finished, total passes 6
2019.08.11 12:11:17.249 Statistics      optimization done in 2 minutes 27 seconds
2019.08.11 12:11:17.249 Statistics      shortest pass 0:02:16.413, longest pass 0:02:26.692, average pass 0:02:20.109

Tree_Brut_TestPL - 3 agents

2019.08.11 12:16:01.529 Core 2  pass 4 returned result 1001000.00 in 0:02:17.960
2019.08.11 12:16:01.530 Core 1  pass 2 returned result 1001000.00 in 0:02:17.960
2019.08.11 12:16:01.787 Core 3  pass 0 returned result 1001000.00 in 0:02:18.219
2019.08.11 12:18:19.602 Core 2  pass 5 returned result 1001000.00 in 0:02:18.073
2019.08.11 12:18:19.630 Core 1  pass 3 returned result 1001000.00 in 0:02:18.100
2019.08.11 12:18:20.100 Core 3  pass 1 returned result 1001000.00 in 0:02:18.311
2019.08.11 12:18:20.100 Tester  optimization finished, total passes 6
2019.08.11 12:18:20.110 Statistics      optimization done in 4 minutes 37 seconds
2019.08.11 12:18:20.110 Statistics      shortest pass 0:02:17.960, longest pass 0:02:18.311, average pass 0:02:18.103

I also checked it on 3 agents to see the architectural difference from the FX: here there are honestly 6 FPUs.

Tree_Brut_TestPL_F - 6 agents

2019.08.11 12:23:16.283 Core 1  pass 0 returned result 1001000.00 in 0:03:39.626
2019.08.11 12:23:16.652 Core 5  pass 4 returned result 1001000.00 in 0:03:39.614
2019.08.11 12:23:16.861 Core 3  pass 1 returned result 1001000.00 in 0:03:40.286
2019.08.11 12:23:17.968 Core 2  pass 2 returned result 1001000.00 in 0:03:41.294
2019.08.11 12:23:30.936 Core 4  pass 5 returned result 1001000.00 in 0:03:53.860
2019.08.11 12:23:32.878 Core 6  pass 3 returned result 1001000.00 in 0:03:55.949
2019.08.11 12:23:32.878 Tester  optimization finished, total passes 6
2019.08.11 12:23:32.888 Statistics      optimization done in 3 minutes 57 seconds
2019.08.11 12:23:32.888 Statistics      shortest pass 0:03:39.614, longest pass 0:03:55.949, average pass 0:03:45.104

Only 20 seconds behind the FX, which isn't bad considering the 1100 MHz frequency difference!
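
The two totals are not directly comparable, since the FX run completed 8 passes and the Phenom run 6. As a rough sketch only, the Python below normalises the Tree_Brut_TestPL_F totals from the logs above (3 minutes 37 seconds for 8 passes on the FX, 3 minutes 57 seconds for 6 passes on the Phenom) to passes per minute, and then per GHz using the clock speeds mentioned in this post (4000 and 2900 MHz). Dividing by clock speed is a simplifying assumption that ignores memory, cache and the different agent counts.

```python
def passes_per_minute(passes: int, minutes: int, seconds: int) -> float:
    """Whole-machine throughput for one optimisation run."""
    total_seconds = minutes * 60 + seconds
    return passes * 60.0 / total_seconds


# Totals taken from the "optimization done in ..." lines above.
fx = passes_per_minute(8, 3, 37)      # FX-8350, 8 agents
phenom = passes_per_minute(6, 3, 57)  # Phenom II X6 1065T, 6 agents

# Clock speeds as quoted in the post, in GHz.
fx_per_ghz = fx / 4.0
phenom_per_ghz = phenom / 2.9

print(f"FX:     {fx:.2f} passes/min, {fx_per_ghz:.3f} per GHz")
print(f"Phenom: {phenom:.2f} passes/min, {phenom_per_ghz:.3f} per GHz")
```

On these numbers the per-GHz figures land close together (roughly 0.55 for the FX against 0.52 for the Phenom).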

 

My result (Intel Core i7-8700, 3.2 GHz, 16 GB, hyperthreading enabled; there are 12 agents, but only six were working, matching the number of physical cores):

2019.08.11 12:35:27.825 Core 04 pass 1 returned result 1001000.00 in 0:01:10.378
2019.08.11 12:35:27.841 Core 01 pass 0 returned result 1001000.00 in 0:01:10.241
2019.08.11 12:35:28.620 Core 06 pass 2 returned result 1001000.00 in 0:01:11.130
2019.08.11 12:35:28.626 Core 03 pass 5 returned result 1001000.00 in 0:01:11.036
2019.08.11 12:35:28.704 Core 12 pass 3 returned result 1001000.00 in 0:01:11.100
2019.08.11 12:35:29.296 Core 02 pass 4 returned result 1001000.00 in 0:01:11.706
2019.08.11 12:35:29.296 Tester  optimization finished, total passes 6
2019.08.11 12:35:29.307 Statistics      optimization done in 1 minutes 13 seconds
2019.08.11 12:35:29.307 Statistics      shortest pass 0:01:10.241, longest pass 0:01:11.706, average pass 0:01:10.931
It seems to me that for hyperthreading it is very important that the data fits into the CPU cache. Virtual cores help precisely when the processor does not need to go to main memory because all the data is in cache. Accordingly, if large arrays are being processed (a couple of years of real ticks), hyperthreading will be of little use. With relatively small data, however (in my experience, about a year to a year and a half of M1 OHLC), hyperthreading gives quite a noticeable performance gain.
 
Georgiy Merts:

My result (Intel Core i7-8700, 3.2 GHz, 16 GB, hyperthreading enabled; there are 12 agents, but only six were working, matching the number of physical cores):

It seems to me that for hyperthreading it is very important that the data fits into the CPU cache. Virtual cores help precisely when the processor does not need to go to main memory because all the data is in cache. Accordingly, if large arrays are being processed (a couple of years of real ticks), hyperthreading will be of little use. With relatively small data, however (in my experience, about a year to a year and a half of M1 OHLC), hyperthreading gives quite a noticeable performance gain.

The current suggestion is to post test results for Tree_Brut_TestPL and Tree_Brut_TestPL_F (stating which Expert Advisor each result is for) and, if possible, to give figures for both variants: with hyperthreading (all agents) and without. So far there is no objective understanding of whether it is needed or not.

 
Aleksey Vyazmikin:

The current suggestion is to post test results for Tree_Brut_TestPL and Tree_Brut_TestPL_F (stating which Expert Advisor each result is for) and, if possible, to give figures for both variants: with hyperthreading (all agents) and without. So far there is no objective understanding of whether it is needed or not.

The result above is for Tree_Brut_TestPL_F.

Here is the second one, in the same configuration:

2019.08.11 14:32:43.819 Core 07 pass 4 returned result 1001000.00 in 0:00:33.157
2019.08.11 14:32:44.209 Core 04 pass 5 returned result 1001000.00 in 0:00:33.494
2019.08.11 14:32:44.291 Core 03 pass 3 returned result 1001000.00 in 0:00:33.664
2019.08.11 14:32:44.415 Core 01 pass 1 returned result 1001000.00 in 0:00:33.846
2019.08.11 14:32:44.568 Core 02 pass 0 returned result 1001000.00 in 0:00:34.031
2019.08.11 14:32:44.683 Core 05 pass 2 returned result 1001000.00 in 0:00:34.082
2019.08.11 14:32:44.683 Tester  optimization finished, total passes 6
2019.08.11 14:32:44.693 Statistics      optimization done in 0 minutes 34 seconds
2019.08.11 14:32:44.693 Statistics      shortest pass 0:00:33.157, longest pass 0:00:34.082, average pass 0:00:33.712

The same Expert Advisor (without F) running on 12 virtual cores:

2019.08.11 14:36:39.685 Core 05 pass 4 returned result 1001000.00 in 0:01:43.939
2019.08.11 14:36:39.843 Core 03 pass 2 returned result 1001000.00 in 0:01:44.102
2019.08.11 14:36:40.327 Core 04 pass 3 returned result 1001000.00 in 0:01:44.581
2019.08.11 14:36:40.394 Core 07 pass 5 returned result 1001000.00 in 0:01:44.650
2019.08.11 14:36:40.514 Core 10 pass 7 returned result 1001000.00 in 0:01:44.215
2019.08.11 14:36:40.707 Core 06 pass 6 returned result 1001000.00 in 0:01:44.454
2019.08.11 14:36:40.732 Core 09 pass 9 returned result 1001000.00 in 0:01:44.367
2019.08.11 14:36:40.885 Core 02 pass 1 returned result 1001000.00 in 0:01:45.143
2019.08.11 14:36:41.253 Core 01 pass 0 returned result 1001000.00 in 0:01:45.512
2019.08.11 14:36:41.727 Core 12 pass 11 returned result 1001000.00 in 0:01:45.325
2019.08.11 14:36:41.786 Core 11 pass 10 returned result 1001000.00 in 0:01:45.407
2019.08.11 14:36:41.899 Core 08 pass 8 returned result 1001000.00 in 0:01:45.563
2019.08.11 14:36:41.899 Tester  optimization finished, total passes 12
2019.08.11 14:36:41.909 Statistics      optimization done in 1 minutes 46 seconds
2019.08.11 14:36:41.909 Statistics      shortest pass 0:01:43.939, longest pass 0:01:45.563, average pass 0:01:44.771

Expert "with F" when running 12 virtual cores:

2019.08.11 14:48:04.349 Core 09 pass 10 returned result 1001000.00 in 0:03:16.005
2019.08.11 14:48:06.012 Core 02 pass 0 returned result 1001000.00 in 0:03:18.194
2019.08.11 14:48:06.269 Core 11 pass 4 returned result 1001000.00 in 0:03:18.152
2019.08.11 14:48:06.902 Core 01 pass 3 returned result 1001000.00 in 0:03:18.869
2019.08.11 14:48:06.925 Core 10 pass 8 returned result 1001000.00 in 0:03:18.590
2019.08.11 14:48:06.958 Core 05 pass 5 returned result 1001000.00 in 0:03:18.816
2019.08.11 14:48:07.269 Core 07 pass 7 returned result 1001000.00 in 0:03:19.044
2019.08.11 14:48:07.460 Core 04 pass 2 returned result 1001000.00 in 0:03:19.470
2019.08.11 14:48:07.818 Core 06 pass 6 returned result 1001000.00 in 0:03:19.634
2019.08.11 14:48:08.151 Core 12 pass 11 returned result 1001000.00 in 0:03:19.777
2019.08.11 14:48:08.463 Core 03 pass 1 returned result 1001000.00 in 0:03:20.563
2019.08.11 14:48:09.072 Core 08 pass 9 returned result 1001000.00 in 0:03:20.736
2019.08.11 14:48:09.072 Tester  optimization finished, total passes 12
2019.08.11 14:48:09.083 Statistics      optimization done in 3 minutes 23 seconds
2019.08.11 14:48:09.083 Statistics      shortest pass 0:03:16.005, longest pass 0:03:20.736, average pass 0:03:18.987

I'm going to turn off hyperthreading now...

Expert "without F", hyperthreading disabled, six passes:

2019.08.11 14:56:16.918 Core 5  pass 4 returned result 1001000.00 in 0:01:09.291
2019.08.11 14:56:16.984 Core 1  pass 1 returned result 1001000.00 in 0:01:09.452
2019.08.11 14:56:17.228 Core 3  pass 2 returned result 1001000.00 in 0:01:09.635
2019.08.11 14:56:17.797 Core 6  pass 5 returned result 1001000.00 in 0:01:10.163
2019.08.11 14:56:18.164 Core 2  pass 0 returned result 1001000.00 in 0:01:10.725
2019.08.11 14:56:18.284 Core 4  pass 3 returned result 1001000.00 in 0:01:10.675

Expert "with F", hyperthreading disabled, six passes:

2019.08.11 15:01:43.644 Core 3  pass 2 returned result 1001000.00 in 0:01:10.138
2019.08.11 15:01:43.984 Core 1  pass 0 returned result 1001000.00 in 0:01:10.494
2019.08.11 15:01:44.121 Core 5  pass 4 returned result 1001000.00 in 0:01:10.585
2019.08.11 15:01:44.310 Core 2  pass 1 returned result 1001000.00 in 0:01:10.868
2019.08.11 15:01:44.318 Core 4  pass 3 returned result 1001000.00 in 0:01:10.782
2019.08.11 15:01:44.397 Core 6  pass 5 returned result 1001000.00 in 0:01:10.859
2019.08.11 15:01:44.397 Tester  optimization finished, total passes 6
2019.08.11 15:01:44.407 Statistics      optimization done in 1 minutes 12 seconds
2019.08.11 15:01:44.407 Statistics      shortest pass 0:01:10.138, longest pass 0:01:10.868, average pass 0:01:10.621
2019.08.11 15:01:44.407 Statistics      6000 frames (2.36 Mb total, 412 bytes per frame) received