Compilation of MQL5 programmes with AVX / AVX2 + FMA3 / AVX512 + FMA3 instruction set from build 3902 - page 3

 

As an aside: people don't understand why Windows 11 requires modern processors. They get sidetracked by the TPM discussion.

In fact, Microsoft engineers dream of no longer compiling the operating system kernel and applications for ancient hardware, and of switching to AVX at the very least. That would make it possible to improve speed and capabilities.

But they haven't switched yet; they are afraid of incompatibilities and are dragging it out.

 

I simply missed the information that the AVX2 terminal will be released later.

We will be releasing the third version of the terminal soon, built using AVX2 and FMA3.

 
Renat Fatkhullin #:

As an aside: people don't understand why Windows 11 requires modern processors. They get sidetracked by the TPM discussion.

In fact, Microsoft engineers dream of no longer compiling the operating system kernel and applications for ancient hardware, and of switching to AVX at the very least. That would make it possible to improve speed and capabilities.

But they haven't switched yet; they are afraid of incompatibilities and are dragging it out.

I think everyone here is far more concerned about the execution speed of EX5 files and of the terminal internals, including the tester.

 
fxsaber #:

I think everyone here is far more concerned about the execution speed of EX5 files and of the terminal internals, including the tester.

How well Windows itself is optimised underlies the performance of every program running on it.

That is because all programs make heavy use of the Windows API, which knows nothing about AVX/AVX2 and so on, even though in some places the operating system could deliver results much faster.

 

So as not to make unsubstantiated claims: I build the official open-source Intel IAVF driver myself, for the latest versions of very modern NICs, on Ubuntu 22.04:

make install
....

/tmp/iavf-4.9.1/src/.iavf.mod.o.cmd
cmd_/tmp/iavf-4.9.1/src/iavf.mod.o := gcc -Wp,-MMD,/tmp/iavf-4.9.1/src/.iavf.mod.o.d -nostdinc
-isystem /usr/lib/gcc/x86_64-linux-gnu/11/include -I./arch/x86/include -I./arch/x86/include/generated -I./include -I./arch/x86/include/uapi
-I./arch/x86/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/compiler-version.h -include ./include/linux/kconfig.h
-I./ubuntu/include -include ./include/linux/compiler_types.h
-D__KERNEL__
-fmacro-prefix-map=./= -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE
-Werror=implicit-function-declaration -Werror=implicit-int -Werror=return-type -Wno-format-security
-std=gnu89 -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -fcf-protection=none -m64 -falign-jumps=1 -falign-loops=1 -mno-80387 -mno-fp-ret-in-387
-mpreferred-stack-boundary=3 -mskip-rax-setup -mtune=generic -mno-red-zone
-mcmodel=kernel -DCONFIG_X86_X32_ABI
-Wno-sign-compare -fno-asynchronous-unwind-tables -mindirect-branch=thunk-extern -mindirect-branch-register -mindirect-branch-cs-prefix
-mfunction-return=thunk-extern -fno-jump-tables -mharden-sls=all -fno-delete-null-pointer-checks
-Wno-frame-address -Wno-format-truncation -Wno-format-overflow -Wno-address-of-packed-member
-O2 -fno-allow-store-data-races -Wframe-larger-than=1024 -fstack-protector-strong -Wimplicit-fallthrough=5 -Wno-main -Wno-unused-but-set-variable
-Wno-unused-const-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls -fno-stack-clash-protection -g -gdwarf-5 -pg
-mrecord-mcount -mfentry -DCC_USING_FENTRY -Wdeclaration-after-statement -Wvla -Wno-pointer-sign -Wno-stringop-truncation -Wno-zero-length-bounds
-Wno-array-bounds -Wno-stringop-overflow -Wno-restrict -Wno-maybe-uninitialized -Wno-alloc-size-larger-than -fno-strict-overflow -fno-stack-check
-fconserve-stack -Werror=date-time -Werror=incompatible-pointer-types -Werror=designated-init -Wno-packed-not-aligned -I/tmp/iavf-4.9.1/src
-fsanitize=bounds -fsanitize=shift -fsanitize=bool -fsanitize=enum -DMODULE -DKBUILD_BASENAME='"iavf.mod"' -DKBUILD_MODNAME='"iavf"' -D__KBUILD_MODNAME=kmod_iavf
-c -o /tmp/iavf-4.9.1/src/iavf.mod.o /tmp/iavf-4.9.1/src/iavf.mod.c

Explicitly disabled: SSE, SSE2, MMX, 3DNow!, AVX.

Sanitizer checks are enabled (-fsanitize=bounds, shift, bool, enum; these are undefined-behaviour checks rather than Address Sanitizer), which slows down the whole code. At least -O2 optimisation is enabled.

How can I expect Intel X710/810 NICs to work efficiently and with minimal latency? The whole operating system requires the kernel and drivers to be built for the lowest common denominator.

 
Renat Fatkhullin #:

So as not to make unsubstantiated claims

Share a link to a resource that publishes comparative performance figures before and after recompiling software, including the operating system.

 

I will have to compile EAs on the same server where I run them, just in case. Otherwise the AVX level on my laptop may turn out to be the wrong one for the target system)

And with MetaQuotes servers it can get quite fun).

 
Aleksey Nikolayev #:

I will have to compile EAs on the same server where I run them, just in case. Otherwise the AVX level on my laptop may turn out to be the wrong one for the target system)

And with MetaQuotes servers it can get quite fun).

In our VPS network every server supports at least AVX. Most of them support AVX2.

 
fxsaber #:

Share a link to a resource that publishes comparative performance figures before and after recompiling software, including the operating system.

Everything is on Google.

Synthetic tests are not very revealing, especially those from hardware manufacturers. To draw a conclusion, you need to check many benchmarks rather than trust a single suggested one.

You need to test everything yourself on your own programmes, with an understanding of your own case. Float/double-heavy maths is accelerated well. And the compiler has to be the right one: only Clang.
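For those who want to measure it on their own code, here is a minimal sketch of such a test, not an official benchmark. It assumes you compile the same script once per instruction-set setting and compare the timings on the same machine. The array length, the number of passes and the multiply-add loop are arbitrary choices for illustration; put your own maths in their place.

```mql5
// Sketch: time a double-precision multiply-add loop.
// Compile once per instruction-set setting and compare the printed timings.
void OnStart()
  {
   const int n=1000000;       // array length (arbitrary, for illustration)
   const int passes=200;      // number of repetitions (arbitrary)
   double a[],b[],c[];
   ArrayResize(a,n);
   ArrayResize(b,n);
   ArrayResize(c,n);
//--- fill the input arrays with some non-trivial values
   for(int i=0;i<n;i++)
     {
      a[i]=MathSin(i*0.001);
      b[i]=MathCos(i*0.001);
      c[i]=0.0;
     }
//--- timed section: element-wise c = a*b + c
   ulong start=GetMicrosecondCount();
   for(int p=0;p<passes;p++)
      for(int i=0;i<n;i++)
         c[i]=a[i]*b[i]+c[i];
   ulong elapsed=GetMicrosecondCount()-start;
//--- print the result so the loop cannot be discarded as unused
   Print("EX5 architecture: ",__CPU_ARCHITECTURE__);
   Print("Elapsed, microseconds: ",elapsed,"  check value: ",c[n-1]);
  }
```

A single run proves little. Repeat it several times, and remember that a loop like this only shows the trend for simple float/double arithmetic, not for your real strategy.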

 

In the current beta build 3905, available on MetaQuotes-Demo, we have extended the information available to MQL5:

void OnStart()
  {
   Print("CPU name:         ",TerminalInfoString(TERMINAL_CPU_NAME));
   Print("CPU cores:        ",TerminalInfoInteger(TERMINAL_CPU_CORES));
   Print("CPU architecture: ",TerminalInfoString(TERMINAL_CPU_ARCHITECTURE));   // new
   Print("");
   Print("EX5 architecture: ",__CPU_ARCHITECTURE__);                            // new
  }

CPU name:         12th Gen Intel Core i9-12900K
CPU cores:        24
CPU architecture: AVX2 + FMA3

EX5 architecture: AVX

You can detect on the fly the capabilities of the processor the terminal is running on.

Using the string macro __CPU_ARCHITECTURE__, you can find out and check which instruction set the EX5 file was built for.


If loading fails, the terminal reports:

your CPU architecture does not allow to run the file 'test.ex5', AVX512 required, you have AVX2
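
As a rough illustration of how the two values can be combined, the sketch below compares the build level of the EX5 file with what the processor reports. ArchRank is a hypothetical helper, not part of MQL5. It assumes the architecture strings contain "AVX512", "AVX2" or "AVX", as in the sample output above, and treats anything else as the baseline build.

```mql5
// Hypothetical helper: maps an architecture string to a comparable rank.
// Assumption: the strings contain "AVX512", "AVX2" or "AVX" as in the sample output.
int ArchRank(const string arch)
  {
   if(StringFind(arch,"AVX512")>=0) return(3);
   if(StringFind(arch,"AVX2")>=0)   return(2);
   if(StringFind(arch,"AVX")>=0)    return(1);
   return(0);   // anything else is treated as the baseline build
  }

void OnStart()
  {
   string cpu=TerminalInfoString(TERMINAL_CPU_ARCHITECTURE);   // what the processor supports
   string ex5=__CPU_ARCHITECTURE__;                            // what this EX5 file was built for
   if(ArchRank(ex5)<ArchRank(cpu))
      Print("EX5 is built for ",ex5,", but this CPU supports ",cpu,
            "; recompiling for the higher instruction set may be worth testing");
   else
      Print("EX5 build (",ex5,") already uses what this CPU offers (",cpu,")");
  }
```

With the values from the sample output above (an EX5 built for AVX on an AVX2 + FMA3 processor), the first branch is taken. The opposite case never reaches this code, because the terminal refuses to load the file, as the message above shows.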