OpenCL: real challenges

 

There are plenty of threads about OpenCL, but the example tasks in them are too far removed from trading.

So what can OpenCL give to traders?

Admittedly, I have not studied OpenCL yet, so I would like to clarify the main points first:

  1. Is it a separate program that receives input and returns output data? So there is no interaction from MQL while it runs?
  2. Is it worth implementing an array search in OpenCL when all you need is confirmation that a match exists?
  3. During optimization, does each thread have to perform its own OpenCL initialization, with no way to connect to an already active one?
  4. During optimization, won't the graphics card be slowed down by multiple OpenCL threads running simultaneously?
 
Roffild:

There are plenty of threads about OpenCL, but the example tasks in them are too far removed from trading.

So what can OpenCL give traders?

Whatever you are able to take from it, that is what it will give.

Yes, I have not studied OpenCL yet so I want to learn and clarify the main points:

Well, study then. The Internet will help. Don't get carried away with questions on the forum, or rather with entry-level questions. All the information on "how it works" is on the Internet and in the articles on the forum. I, for example, want to help when I can see that a person is digging into the subject on their own and has already reached a visible level of mastery. I do not want to help freeloaders who take the stance of "here I am, teach me already!" and want everything delivered "straight into the brain". :)

I will make an exception today (I am in a good mood). However, if I see obvious freeloading at the initial stage, I will most likely ignore the "baby talk" and answer only specific programming questions, provided they are not too dumb.

1. It is a separate program that receives input and returns output data, isn't it? So there is no interaction from MQL at all?

While the kernel is running, there is no interaction. The interaction is similar to a function call: 1. set the input parameters, 2. run the calculation, 3. read back the result.

2. Is it even worth bringing array search into OpenCL if all you want is to get confirmation that a match exists?

I don't know. It depends on the task at hand. Maybe you don't have to. Or maybe you should. You know where the Telepath Club is? :)

3. During optimization, each thread has to do its own OpenCL initialization and there is no way to connect to the active thread?

I don't quite understand the question. If you mean optimization in the terminal's strategy tester/optimizer, then I think that is how it has to be. I have not tried running OpenCL from the optimizer. I have run it in the tester, but there everything is sequential and a single kernel can be reused many times, which is obvious without any explanation.

4. When optimizing, is the graphics card not slowed down by multiple OpenCL threads simultaneously?

I haven't tried running it in the optimizer, but if several calls overlap (I have done this by launching several OpenCL indicators and Expert Advisors simultaneously), it will of course slow down.

// It can't conjure resources out of thin air, can it?

If the video card's memory is overloaded (for example, if several processes try to load arrays into video memory in parallel and exceed its capacity), you can "crash the driver": the video card and driver go through a full restart (reset), followed by a message about the driver crash. At least, this has happened repeatedly with my card/driver. It does not cause irreversible damage, but the programs that caused the crash usually hang and have to be restarted. The terminal itself used to hang in such cases, but not lately.

// Then again, I haven't crashed the driver with immodest memory requests for a long time; I have roughly worked out the "limits of luxury". :)

 
Roffild: There are plenty of threads about OpenCL, but the example tasks in them are too far removed from trading.

It's very simple: take a task close to a trading one (say, analysis of quote history) and try to solve it using OpenCL. After a few unsuccessful attempts, more reading of the literature and new attempts, it will work, I guarantee. But only on condition that you can digest English-language literature and have a bit of persistence as well as a minimum amount of brains.

OpenCL drivers are already quite well optimized both for video cards and for emulation on processors (in the latter case, Intel processors are probably much stronger). So you would have to try hard to end up with no positive result at all.

 
Roffild:

So what can OpenCL give traders?

Points 1-4 have already been answered, so I will venture to answer your main question (this is only my point of view, of course): the vast majority of traders will get nothing from OpenCL; they had better leave this "bread" to the programmers.
 
Folks, who knows this stuff? Is it possible to move the solution of a system of linear equations to OpenCL? The system can be really big, and there are other nuances.
 
TheXpert:
Folks, who knows this stuff? Is it possible to move the solution of a system of linear equations to OpenCL? The system can be really big, and there are other nuances.

Solving SLAEs in OpenCL

This is a good pdf that answers your question.

 

Nikolai, thank you for your responsiveness. There's no CUDA and no code.

Oh, I forgot to mention one more thing: building the matrix takes longer than solving it :) so the construction itself may need to be parallelized.

 
TheXpert:

Nikolai, thank you for your responsiveness. There's no CUDA and no code.

Oh, I forgot to mention one more thing: building the matrix takes longer than solving it :) so the construction itself may need to be parallelized.

What's the source data? // format, data structure

I mean, what do we build the matrix from? A bunch of buffers? A tree? From [...] ?

 
TheXpert:

Nikolai, thank you for your responsiveness. There is no CUDA and no code.

Oh, I forgot to mention one more thing: building the matrix takes longer than solving it :) so the construction itself may need to be parallelized.

I meant the scheme, not the implementation. Of course, CUDA is different, but the general scheme is the same.

I agree with Vladimir: you have not given enough information for anyone to help you.

I don't think anyone else will join in, so if you don't want to make it public, you may send it to any of the participants in private.

 
MetaDriver:

I mean, what do we build the matrix from? A bunch of buffers? A tree? From [...] ?

Roughly speaking, there is a space of huge dimensionality (10 to 1000 and more), and we need to solve a least squares problem for it.

Solving the least squares problem boils down to:

(1) constructing the equations from the derivatives

(2) solving the system of equations obtained in (1)

Right now (1) takes the lion's share of the solution time, and the larger the dimensionality, the larger its share.
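
For illustration only, here is a minimal Python sketch of the two steps. It assumes the task is an ordinary linear least squares fit, which the post does not actually state. In that case, setting the derivatives of the squared error to zero gives the normal equations (A^T A) x = A^T b. Each entry of A^T A is an independent dot product, which is why step (1) looks like a natural candidate for parallelization. The function names `build_normal_equations` and `solve_gauss` and the sample data are hypothetical, not taken from the thread.

```python
def build_normal_equations(A, b):
    """Step (1): form N = A^T A and r = A^T b for min ||A x - b||^2."""
    m, n = len(A), len(A[0])
    # Every N[i][j] depends only on columns i and j of A, so all n*n entries
    # could be computed by independent work-items on a GPU.
    N = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
         for i in range(n)]
    r = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]
    return N, r


def solve_gauss(N, r):
    """Step (2): solve N x = r by Gaussian elimination with partial pivoting."""
    n = len(N)
    M = [row[:] + [r[i]] for i, row in enumerate(N)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("normal matrix is singular or ill-conditioned")
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, n):
            f = M[row][col] / M[col][col]
            for k in range(col, n + 1):
                M[row][k] -= f * M[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


# Fit y = c0 + c1 * t to points that lie exactly on y = 1 + 2t.
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
b = [1.0, 3.0, 5.0, 7.0]
N, r = build_normal_equations(A, b)
coeffs = solve_gauss(N, r)
```

With m observations and n unknowns, step (1) costs on the order of m*n^2 multiplications while step (2) costs on the order of n^3, so when m is much larger than n the construction dominates. That would be consistent with what the post describes, but it is only a guess about the actual problem.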

 

Parallel algorithms have only two patterns that give them an advantage over sequential ones.

The first is the "comb", where each tooth takes its own strand and pulls it along its entire length.

The second is the pyramid fold/unfold. Unfolding is less common; it is mostly folding.

If the problem statement contains neither of these patterns, a parallel solver will give no advantage, and more often it will be slower because of the cost of loading data into memory.
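
As a rough illustration, the two patterns can be sketched in plain Python. The sketch reads the "comb" as an element-wise map and the "pyramid fold" as a pairwise tree reduction; that reading is an interpretation, and the names `comb` and `pyramid_fold` and the sample prices are hypothetical. The example applies both to the array search from question 2: test every element independently, then fold the flags into a single "match exists" answer.

```python
def comb(func, data):
    """'Comb' pattern: each element is processed independently of the others.
    On a GPU every element would get its own work-item; here it is a plain loop."""
    return [func(x) for x in data]


def pyramid_fold(op, data):
    """'Pyramid fold': combine neighbours pairwise, level by level.
    The pairs within one level are independent of each other, so n items
    need about log2(n) levels instead of n - 1 sequential steps."""
    if not data:
        raise ValueError("cannot fold an empty sequence")
    level = list(data)
    levels = 0
    while len(level) > 1:
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:  # an odd element is carried up unchanged
            nxt.append(level[-1])
        level = nxt
        levels += 1
    return level[0], levels


prices = [1.10, 1.12, 1.09, 1.15, 1.11, 1.13, 1.08, 1.14]
target = 1.15

# Comb: test every element against the target independently.
flags = comb(lambda p: p == target, prices)
# Fold: reduce the flags with logical OR to a single "match exists" answer.
found, depth = pyramid_fold(lambda a, b: a or b, flags)
# The same fold with max() gives the highest price.
highest, _ = pyramid_fold(max, prices)
```

This runs sequentially, so it only shows the shape of the computation. On a real device the data still has to be copied to the card and the result copied back, which is the memory cost mentioned above.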