评估CPU内核的优化 - 页 5

 
Vladimir Pastushak:

2990WX则不同。它由四个Zeppelin芯片组成,有32个处理核心。在X399平台上,AMD对这款处理器施加了一些限制,以便不损害EPYC服务器芯片的销售。

这些限制中最主要的是只有四个内存控制器。虽然还有两个Zeppelin芯片,但AMD称它们为计算芯片。这意味着他们不能访问本地PCIe或DRAM,为此他们必须通过Infinity Fabric来处理I/O组件。由于晶体的数量是原来的两倍,因此Infinity结构的带宽是原来的两倍,如果使用DDR4-3200内存,带宽大约为25Gb/s。

在这里,如果有主动的内存处理,那也只是偶尔的读取--从内存中读取64倍的EA代码是非常快的,显然不需要100秒的时间

2990WX 在同等核数的负载下,不可能比FX-8350慢!

此外,你和我在R中加载了这个处理器,在那里,性能明显好于FX-8350,每个线程吃了大约100兆字节。

看起来编译器是针对英特尔处理器的一些特殊性进行调整的。

虽然,可能是代理是相互独立的--就像不同的程序,然后他们可以用不断的数据过载来加载总线,以获得每个代理的新工作(执行的代码块),但我显然不是这方面的专家。

如果这是真的,那么现在是时候改变代理思想,使程序(EA)对所有核心通用,并同步执行代码本身--在更多的核心上,这可能比目前的异步执行要快。

 

建设2085年
蝶变9版葡萄酒4.0.1
华硕P8P67PRO
英特尔@酷睿i7-3770K CPU @ 3.50GHz
RAM 4x4 16Gb


树_Brut_TestPL

Pass: 8, Agent: 2

2019.08.12 07:30:47.921 Core 2  pass 4 returned result 1001000.00 in 0:01:37.923
2019.08.12 07:30:48.917 Core 1  pass 0 returned result 1001000.00 in 0:01:39.007
2019.08.12 07:32:28.151 Core 2  pass 5 returned result 1001000.00 in 0:01:40.231
2019.08.12 07:32:28.161 Core 1  pass 1 returned result 1001000.00 in 0:01:39.245
2019.08.12 07:34:07.317 Core 1  pass 2 returned result 1001000.00 in 0:01:39.156
2019.08.12 07:34:08.936 Core 2  pass 6 returned result 1001000.00 in 0:01:40.786
2019.08.12 07:35:46.231 Core 1  pass 3 returned result 1001000.00 in 0:01:38.914
2019.08.12 07:35:51.699 Core 2  pass 7 returned result 1001000.00 in 0:01:42.764
2019.08.12 07:35:51.699 Tester  optimization finished, total passes 8
2019.08.12 07:35:51.709 Statistics      optimization done in 6 minutes 42 seconds
2019.08.12 07:35:51.710 Statistics      shortest pass 0:01:37.923, longest pass 0:01:42.764, average pass 0:01:39.753
2019.08.12 07:35:51.710 Statistics      8000 frames (3.14 Mb total, 412 bytes per frame) received
2019.08.12 07:35:51.710 Statistics      local 8 tasks (100%), remote 0 tasks (0%), cloud 0 tasks (0%)

Pass: 8, Agent: 4

2019.08.12 07:39:22.201 Core 1  pass 0 returned result 1001000.00 in 0:01:38.523
2019.08.12 07:39:25.351 Core 4  pass 6 returned result 1001000.00 in 0:01:41.332
2019.08.12 07:39:27.966 Core 2  pass 2 returned result 1001000.00 in 0:01:44.256
2019.08.12 07:39:28.480 Core 3  pass 4 returned result 1001000.00 in 0:01:44.641
2019.08.12 07:41:00.476 Core 1  pass 1 returned result 1001000.00 in 0:01:38.275
2019.08.12 07:41:06.496 Core 4  pass 7 returned result 1001000.00 in 0:01:41.146
2019.08.12 07:41:09.869 Core 2  pass 3 returned result 1001000.00 in 0:01:41.903
2019.08.12 07:41:10.728 Core 3  pass 5 returned result 1001000.00 in 0:01:42.248
2019.08.12 07:41:10.729 Tester  optimization finished, total passes 8
2019.08.12 07:41:10.739 Statistics      optimization done in 3 minutes 27 seconds
2019.08.12 07:41:10.739 Statistics      shortest pass 0:01:38.275, longest pass 0:01:44.641, average pass 0:01:41.540
2019.08.12 07:41:10.739 Statistics      8000 frames (3.14 Mb total, 412 bytes per frame) received
2019.08.12 07:41:10.739 Statistics      local 8 tasks (100%), remote 0 tasks (0%), cloud 0 tasks (0%)

Pass: 8, Agent: 8

2019.08.12 07:47:10.314 Core 3  pass 2 returned result 1001000.00 in 0:03:45.744
2019.08.12 07:47:10.573 Core 8  pass 7 returned result 1001000.00 in 0:03:44.805
2019.08.12 07:47:15.145 Core 5  pass 4 returned result 1001000.00 in 0:03:50.281
2019.08.12 07:47:15.701 Core 7  pass 6 returned result 1001000.00 in 0:03:50.128
2019.08.12 07:47:15.765 Core 2  pass 1 returned result 1001000.00 in 0:03:51.302
2019.08.12 07:47:16.624 Core 6  pass 5 returned result 1001000.00 in 0:03:51.547
2019.08.12 07:47:17.686 Core 4  pass 3 returned result 1001000.00 in 0:03:53.025
2019.08.12 07:47:30.052 Core 1  pass 0 returned result 1001000.00 in 0:04:05.750
2019.08.12 07:47:30.052 Tester  optimization finished, total passes 8
2019.08.12 07:47:30.062 Statistics      optimization done in 4 minutes 07 seconds
2019.08.12 07:47:30.062 Statistics      shortest pass 0:03:44.805, longest pass 0:04:05.750, average pass 0:03:51.572
2019.08.12 07:47:30.062 Statistics      8000 frames (3.14 Mb total, 412 bytes per frame) received
2019.08.12 07:47:30.062 Statistics      local 8 tasks (100%), remote 0 tasks (0%), cloud 0 tasks (0%)


树_Brut_TestPL_F

Pass: 8, Agent: 2

2019.08.12 08:01:23.565 Core 1  pass 0 returned result 1001000.00 in 0:03:41.797
2019.08.12 08:01:28.112 Core 2  pass 4 returned result 1001000.00 in 0:03:46.278
2019.08.12 08:05:03.684 Core 1  pass 1 returned result 1001000.00 in 0:03:40.121
2019.08.12 08:05:13.202 Core 2  pass 5 returned result 1001000.00 in 0:03:45.092
2019.08.12 08:08:43.180 Core 1  pass 2 returned result 1001000.00 in 0:03:39.499
2019.08.12 08:08:56.696 Core 2  pass 6 returned result 1001000.00 in 0:03:43.497
2019.08.12 08:12:23.381 Core 1  pass 3 returned result 1001000.00 in 0:03:40.204
2019.08.12 08:12:38.250 Core 2  pass 7 returned result 1001000.00 in 0:03:41.557
2019.08.12 08:12:38.250 Tester  optimization finished, total passes 8
2019.08.12 08:12:38.260 Statistics      optimization done in 14 minutes 58 seconds
2019.08.12 08:12:38.260 Statistics      shortest pass 0:03:39.499, longest pass 0:03:46.278, average pass 0:03:42.255
2019.08.12 08:12:38.260 Statistics      8000 frames (3.14 Mb total, 412 bytes per frame) received
2019.08.12 08:12:38.260 Statistics      local 8 tasks (100%), remote 0 tasks (0%), cloud 0 tasks (0%)


Pass: 8, Agent: 4

2019.08.12 08:26:59.764 Core 1  pass 0 returned result 1001000.00 in 0:03:52.901
2019.08.12 08:27:00.641 Core 2  pass 2 returned result 1001000.00 in 0:03:53.639
2019.08.12 08:27:01.711 Core 3  pass 4 returned result 1001000.00 in 0:03:54.624
2019.08.12 08:27:02.128 Core 4  pass 6 returned result 1001000.00 in 0:03:54.908
2019.08.12 08:30:49.743 Core 2  pass 3 returned result 1001000.00 in 0:03:49.105
2019.08.12 08:30:50.377 Core 3  pass 5 returned result 1001000.00 in 0:03:48.668
2019.08.12 08:30:51.670 Core 1  pass 1 returned result 1001000.00 in 0:03:51.908
2019.08.12 08:30:54.910 Core 4  pass 7 returned result 1001000.00 in 0:03:52.785
2019.08.12 08:30:54.911 Tester  optimization finished, total passes 8
2019.08.12 08:30:54.921 Statistics      optimization done in 7 minutes 49 seconds
2019.08.12 08:30:54.921 Statistics      shortest pass 0:03:48.668, longest pass 0:03:54.908, average pass 0:03:52.317
2019.08.12 08:30:54.921 Statistics      8000 frames (3.14 Mb total, 412 bytes per frame) received
2019.08.12 08:30:54.921 Statistics      local 8 tasks (100%), remote 0 tasks (0%), cloud 0 tasks (0%)

Pass: 8, Agent: 8

2019.08.12 08:38:39.221 Core 8  pass 7 returned result 1001000.00 in 0:06:25.500
2019.08.12 08:38:51.812 Core 6  pass 5 returned result 1001000.00 in 0:06:38.644
2019.08.12 08:38:55.103 Core 2  pass 1 returned result 1001000.00 in 0:06:42.620
2019.08.12 08:39:04.616 Core 7  pass 6 returned result 1001000.00 in 0:06:51.090
2019.08.12 08:39:04.697 Core 4  pass 3 returned result 1001000.00 in 0:06:51.862
2019.08.12 08:39:07.278 Core 3  pass 2 returned result 1001000.00 in 0:06:54.651
2019.08.12 08:39:13.762 Core 1  pass 0 returned result 1001000.00 in 0:07:01.299
2019.08.12 08:39:19.159 Core 5  pass 4 returned result 1001000.00 in 0:07:06.182
2019.08.12 08:39:19.159 Tester  optimization finished, total passes 8
2019.08.12 08:39:19.169 Statistics      optimization done in 7 minutes 08 seconds
2019.08.12 08:39:19.169 Statistics      shortest pass 0:06:25.500, longest pass 0:07:06.182, average pass 0:06:48.981
2019.08.12 08:39:19.169 Statistics      8000 frames (3.14 Mb total, 412 bytes per frame) received
2019.08.12 08:39:19.169 Statistics      local 8 tasks (100%), remote 0 tasks (0%), cloud 0 tasks (0%)
 
Roman:
Debian9 Wine 4.0.1。
华硕P8P67PRO
英特尔@酷睿i7-3770K CPU @ 3.50GHz

RAM 4x4 16Gb


树_Brut_TestPL


树_Brut_TestPL_F

谢谢你的测试--更新了评级。

在这里你可以看到,如果没有超交易,它一点也不好 - 比赛扬G3900还慢...

我认为任何比赛扬更慢的东西都应该被改变...
 
Aleksey Vyazmikin:

谢谢你的测试--更新了评级。

在这里你可以看到,如果没有超交易,它一点也不好 - 比赛扬G3900还慢...

我认为比赛扬更慢,最好是已经改变了...

也许是因为Wine的原因,测试不正确。
因为在Wine下,代理显示Intel Pentium 4 2.40 GHz
目前还不清楚实际使用的是什么配置。
也许我们应该在评级中加入葡萄酒的修正。
我稍后将尝试在Windows10的虚拟机 上做一个测试。

 
Roman:

有可能是因为Wine的原因,测试结果不正确。
在Wine下,代理显示Intel Pentium 4 2.40 GHz
而且也不清楚实际使用的是什么配置。
也许我们应该在评级中加入葡萄酒的修正。
我稍后将尝试在Windows10的虚拟机 上做一个测试。

你就不能在Windows 7/10中尝试一下,不使用虚拟机吗?

 
Aleksey Vyazmikin:

你就不能在Windows 7/10中尝试一下,不使用虚拟机吗?

我有linux作为我的主要系统,现在因为测试要重新安装操作系统,因为它不是kamilfo))。
我可能会在一段时间后尝试,当我把它重新安装回Windows时,因为我确信Wine不适合mt5。

 
Roman:

我的主要系统是linux,我不太愿意为测试目的重新安装操作系统))。
我可能做了一段时间后,重新安装系统后,我应该换回wine。 我确信wine不适合mt5。

如果测试结果更好的话,这就是拥有一个Windows系统的好理由,至少是为了测试的目的......

 

虚拟机 并没有帮助。
最有可能的问题是虚拟化,包括Wine和VM上的虚拟化。
因为i7的4个核心,比赛扬的2个核心要差,这有点奇怪。

2093年建成
Windows10虚拟机VirtualBox
华硕P8P67PRO
英特尔@酷睿i7-3770K CPU @ 3.50GHz
4x4 16Gb RAM

树_Brut_TestPL

Pass: 8, Agent: 2

2019.08.12 09:26:18.494 Core 2  pass 4 returned result 1001000.00 in 0:01:45.727
2019.08.12 09:26:23.425 Core 1  pass 0 returned result 1001000.00 in 0:01:50.722
2019.08.12 09:28:03.437 Core 2  pass 5 returned result 1001000.00 in 0:01:45.554
2019.08.12 09:28:11.791 Core 1  pass 1 returned result 1001000.00 in 0:01:49.402
2019.08.12 09:29:47.937 Core 2  pass 6 returned result 1001000.00 in 0:01:44.503
2019.08.12 09:30:00.442 Core 1  pass 2 returned result 1001000.00 in 0:01:48.654
2019.08.12 09:31:33.388 Core 2  pass 7 returned result 1001000.00 in 0:01:45.454
2019.08.12 09:31:49.437 Core 1  pass 3 returned result 1001000.00 in 0:01:48.999
2019.08.12 09:31:49.437 Tester  optimization finished, total passes 8
2019.08.12 09:31:49.448 Statistics      optimization done in 7 minutes 17 seconds
2019.08.12 09:31:49.448 Statistics      shortest pass 0:01:44.503, longest pass 0:01:50.722, average pass 0:01:47.376
2019.08.12 09:31:49.448 Statistics      8000 frames (3.14 Mb total, 412 bytes per frame) received
2019.08.12 09:31:49.448 Statistics      local 8 tasks (100%), remote 0 tasks (0%), cloud 0 tasks (0%)

Pass: 8, Agent: 4

2019.08.12 09:36:41.044 Core 1  pass 2 returned result 1001000.00 in 0:01:49.154
2019.08.12 09:36:44.487 Core 3  pass 6 returned result 1001000.00 in 0:01:52.522
2019.08.12 09:36:44.793 Core 4  pass 0 returned result 1001000.00 in 0:01:52.905
2019.08.12 09:36:46.034 Core 2  pass 4 returned result 1001000.00 in 0:01:54.096
2019.08.12 09:38:31.290 Core 1  pass 3 returned result 1001000.00 in 0:01:50.251
2019.08.12 09:38:37.438 Core 3  pass 7 returned result 1001000.00 in 0:01:52.956
2019.08.12 09:38:39.069 Core 4  pass 1 returned result 1001000.00 in 0:01:54.280
2019.08.12 09:38:41.761 Core 2  pass 5 returned result 1001000.00 in 0:01:55.731
2019.08.12 09:38:41.761 Tester  optimization finished, total passes 8
2019.08.12 09:38:41.772 Statistics      optimization done in 3 minutes 50 seconds
2019.08.12 09:38:41.772 Statistics      shortest pass 0:01:49.154, longest pass 0:01:55.731, average pass 0:01:52.736
2019.08.12 09:38:41.772 Statistics      8000 frames (3.14 Mb total, 412 bytes per frame) received
2019.08.12 09:38:41.772 Statistics      local 8 tasks (100%), remote 0 tasks (0%), cloud 0 tasks (0%)

Pass: 8, Agent: 8

2019.08.12 09:45:29.276 Core 3  pass 1 returned result 1001000.00 in 0:04:06.742
2019.08.12 09:45:29.448 Core 7  pass 7 returned result 1001000.00 in 0:04:06.761
2019.08.12 09:45:29.760 Core 4  pass 5 returned result 1001000.00 in 0:04:07.075
2019.08.12 09:45:30.929 Core 6  pass 3 returned result 1001000.00 in 0:04:08.325
2019.08.12 09:45:30.963 Core 8  pass 4 returned result 1001000.00 in 0:04:08.323
2019.08.12 09:45:30.972 Core 2  pass 2 returned result 1001000.00 in 0:04:08.400
2019.08.12 09:45:31.038 Core 1  pass 0 returned result 1001000.00 in 0:04:08.553
2019.08.12 09:45:31.677 Core 5  pass 6 returned result 1001000.00 in 0:04:08.990
2019.08.12 09:45:31.677 Tester  optimization finished, total passes 8
2019.08.12 09:45:31.687 Statistics      optimization done in 4 minutes 09 seconds
2019.08.12 09:45:31.687 Statistics      shortest pass 0:04:06.742, longest pass 0:04:08.990, average pass 0:04:07.896
2019.08.12 09:45:31.687 Statistics      8000 frames (3.14 Mb total, 412 bytes per frame) received
2019.08.12 09:45:31.688 Statistics      local 8 tasks (100%), remote 0 tasks (0%), cloud 0 tasks (0%)


树_Brut_TestPL_F

Pass: 8, Agent: 2

2019.08.12 10:11:35.102 Core 1  pass 0 returned result 1001000.00 in 0:03:59.375
2019.08.12 10:11:38.365 Core 2  pass 4 returned result 1001000.00 in 0:04:02.605
2019.08.12 10:15:34.255 Core 1  pass 1 returned result 1001000.00 in 0:03:59.164
2019.08.12 10:15:39.553 Core 2  pass 5 returned result 1001000.00 in 0:04:01.198
2019.08.12 10:19:31.585 Core 1  pass 2 returned result 1001000.00 in 0:03:57.340
2019.08.12 10:19:39.795 Core 2  pass 6 returned result 1001000.00 in 0:04:00.252
2019.08.12 10:23:29.253 Core 1  pass 3 returned result 1001000.00 in 0:03:57.677
2019.08.12 10:23:39.829 Core 2  pass 7 returned result 1001000.00 in 0:04:00.043
2019.08.12 10:23:39.829 Tester  optimization finished, total passes 8
2019.08.12 10:23:39.840 Statistics      optimization done in 16 minutes 05 seconds
2019.08.12 10:23:39.840 Statistics      shortest pass 0:03:57.340, longest pass 0:04:02.605, average pass 0:03:59.706
2019.08.12 10:23:39.840 Statistics      8000 frames (3.14 Mb total, 412 bytes per frame) received
2019.08.12 10:23:39.840 Statistics      local 8 tasks (100%), remote 0 tasks (0%), cloud 0 tasks (0%)


Pass: 8, Agent: 4

2019.08.12 10:01:30.501 Core 4  pass 2 returned result 1001000.00 in 0:04:07.769
2019.08.12 10:01:31.482 Core 1  pass 4 returned result 1001000.00 in 0:04:08.725
2019.08.12 10:01:33.679 Core 3  pass 6 returned result 1001000.00 in 0:04:10.886
2019.08.12 10:01:33.751 Core 2  pass 0 returned result 1001000.00 in 0:04:11.076
2019.08.12 10:05:39.244 Core 4  pass 3 returned result 1001000.00 in 0:04:08.754
2019.08.12 10:05:40.932 Core 1  pass 5 returned result 1001000.00 in 0:04:09.460
2019.08.12 10:05:43.819 Core 3  pass 7 returned result 1001000.00 in 0:04:10.149
2019.08.12 10:05:44.517 Core 2  pass 1 returned result 1001000.00 in 0:04:10.777
2019.08.12 10:05:44.518 Tester  optimization finished, total passes 8
2019.08.12 10:05:44.528 Statistics      optimization done in 8 minutes 23 seconds
2019.08.12 10:05:44.528 Statistics      shortest pass 0:04:07.769, longest pass 0:04:11.076, average pass 0:04:09.699
2019.08.12 10:05:44.528 Statistics      8000 frames (3.14 Mb total, 412 bytes per frame) received
2019.08.12 10:05:44.528 Statistics      local 8 tasks (100%), remote 0 tasks (0%), cloud 0 tasks (0%)


Pass: 8, Agent: 8

2019.08.12 09:54:56.856 Core 1  pass 2 returned result 1001000.00 in 0:06:44.190
2019.08.12 09:54:58.155 Core 5  pass 3 returned result 1001000.00 in 0:06:45.405
2019.08.12 09:54:58.173 Core 7  pass 7 returned result 1001000.00 in 0:06:45.282
2019.08.12 09:55:00.715 Core 3  pass 1 returned result 1001000.00 in 0:06:48.091
2019.08.12 09:55:01.192 Core 6  pass 6 returned result 1001000.00 in 0:06:48.373
2019.08.12 09:55:02.774 Core 4  pass 4 returned result 1001000.00 in 0:06:50.014
2019.08.12 09:55:02.917 Core 8  pass 5 returned result 1001000.00 in 0:06:50.123
2019.08.12 09:55:02.977 Core 2  pass 0 returned result 1001000.00 in 0:06:50.408
2019.08.12 09:55:02.977 Tester  optimization finished, total passes 8
2019.08.12 09:55:02.988 Statistics      optimization done in 6 minutes 51 seconds
2019.08.12 09:55:02.988 Statistics      shortest pass 0:06:44.190, longest pass 0:06:50.408, average pass 0:06:47.735
2019.08.12 09:55:02.988 Statistics      8000 frames (3.14 Mb total, 412 bytes per frame) received
2019.08.12 09:55:02.988 Statistics      local 8 tasks (100%), remote 0 tasks (0%), cloud 0 tasks (0%)
 

我决定研究一下处理器之间的指令区别--好吧,i7-8700不可能突然有这样的性能提升,所以为了比较,我把2990WX、FX-8350、E5-2670拿出来。

这里有一张地图上的说明。

灰色--说明都在那里。

绿色------说明不适合所有的人

粉红色--类似的技术/指令

蓝色 - 独特的处理器指令

黄色--显示与i7-8700相比缺乏指示




信息来源。

我们看到2990WX拥有FX-8350和i7-8700中的所有指令,这意味着对于相同的任务,核心的性能应该是相当的(由于频率的原因,可能会慢一点,但这是在理论上,如果我们不考虑微处理器构造的进展,纯粹是由逻辑的可用性)。同时,FX-8350的指令在2990WX中被取消了,也许没有被取消,只是给了另一个名字(营销)--对于那些理解的人来说,这是很好的检查。

此外,让我们比较一下i7-8700和E5-2670,并注意其指令的存在和相对于FX-8350而言--我们看到其他处理器没有指令BMI1、F16C、FMA3--它们负责什么,它们的缺席是否关键--这就是问题所在!


Сравнение процессоров
  • chaynikam.info
  • www.chaynikam.info
Особенности работы с таблицей В таблицу можно добавить не более 6 процессоров (кнопка "Добавить процессор"). Для ускорения поиска интересующего процессора пользуйтесь фильтром. Процессоры в таблице можно менять местами, перетаскивая их в нужное место с помощью мышки. "Ухватить" процессор для перетаскивания можно за ячейку с его названием...
 
Roman:

虚拟机 没有帮助。
最有可能的是,问题出在虚拟化上,包括Wine和VM。
因为i7的4个核心,比赛扬的2个核心要差,这似乎很奇怪。

是的,这里有一些奇怪的现象--我们需要获得更多的统计数据来评估情况。

i7-3770K 没有BMI1、FMA3指令--也许这就是原因。