Machine learning in trading: theory, models, practice and algo-trading - page 1033

 
Roffild:

I am a programmer, not a telepath. If you have any questions, I'll give you the answers...

Answer: 42 :D

If you are ready to answer as a programmer, here is a small question, a programmer's task that also checks the skills you insist on.

The attached file contains a template of EURUSD H1 signals for an Expert Advisor; the task is to determine the algorithm by which they were formed.

If you like, you can post the solution as an EA and demonstrate in action the power of predictor brute-forcing and your machine learning library.

I invite everyone interested in machine learning to join; I am also ready to use ML to solve the proposed problem, or any other presented in the form of a template.

Perhaps in this correspondence mode we will be able to work out at least some common approaches and formats)).

Files:
EA_EURUSD_H1.tpl  130 kb
 

Before saving the template, I should have removed all the indicators. Though perhaps it is not because of the indicator with name=main that the data is not displayed.

And where is the guarantee that the strategy is profitable? Maybe it is just a lucky stretch of history...

It seems that no one has read my library, because there are no questions about it. Everyone wants to get the grail without understanding the means of finding it.

 
Roffild:

Before saving the template, I should have removed all the indicators. Though perhaps it is not because of the indicator with name=main that the data is not displayed.

And where is the guarantee that the strategy is profitable? Maybe it is just a lucky stretch of history.

It seems that no one has read my library, because there are no questions about it. Everyone wants to get the grail without understanding the means of finding it.

The "main" record is present in all templates and does not interfere; the data there consists only of graphical objects: arrows, blue for BUY and red for SELL.

Open the EURUSD H1 chart, load the file (menu Charts\Template\Load Template...), and look at the Objects List in the context menu.


And no one is asking you for a grail, just to solve the problem and confirm in practice what you say and how the library works.

 
Aleksey Terentev:
If you take an interest in advanced neural network architectures, very interesting ideas come up. Of course, it is hard to get into the details: you need experience with deep-learning frameworks and an understanding of vector mathematics in general.
But it is worth it.
I don't have much to show on the market, the market is ***inok) It takes a lot of time.
Come join us on Discord, it's quiet and cozy there) I'll describe and show by example how to prepare a deep network.

Don't do it, he'll fuck your brains out

 
Aleksey Terentev:
If you take an interest in advanced neural network architectures, very interesting ideas come up. Of course, it is hard to get into the details: you need experience with deep-learning frameworks and an understanding of vector mathematics in general.
But it is worth it.
I don't have much to show on the market, the market is ***inok) It takes a lot of time.
Come join us on Discord, it's quiet and cozy there) I'll describe and show by example how to prepare a deep network.

I communicate on the forum and do not support sects

)

 
Roffild:

I am a programmer, not a telepath. If you have any questions, I'll give you the answers...

Answer: 42 :D

I am interested in the concept of why it should work, not in what connects to what and how fast.

I would like a theoretical explanation of the approach; I could not work it out from the code, and it is no fun to install Java, Spark, etc. just to understand it.

That is, how you see ML and how you work with it; the depth of your understanding, so to speak.

If you answer 43, I won't ask again :)

 

The part of my library that is in MQL5 is not directly tied to Apache Spark. There is a separate Java module that converts the data for use in Spark. That module should be ported to Python.
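To illustrate the kind of conversion such a module performs, here is a minimal, hypothetical Python sketch that turns CSV rows (label plus predictors) into the LibSVM text format, which Spark MLlib can read natively; the real Java module's input and output layout may well differ:

```python
import csv
import io


def csv_to_libsvm(csv_text: str, label_col: int = 0) -> str:
    """Convert CSV rows (label + predictors) to LibSVM lines.

    Hypothetical sketch: assumes the label sits in column `label_col`
    and all other columns are numeric predictors.
    """
    out = []
    for row in csv.reader(io.StringIO(csv_text)):
        label = row[label_col]
        feats = [v for i, v in enumerate(row) if i != label_col]
        # LibSVM uses 1-based indices and omits zero-valued features.
        pairs = " ".join(
            f"{i + 1}:{v}" for i, v in enumerate(feats) if float(v) != 0.0
        )
        out.append(f"{label} {pairs}".rstrip())
    return "\n".join(out)
```

For example, `csv_to_libsvm("1,0.5,0,2.0")` yields `1 1:0.5 3:2.0`: indices are 1-based and zero features are dropped, which keeps sparse predictor sets compact on disk.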

Apache Spark is a distributed big-data processing system with a module for Random Forests. It can process data on 1000 servers (Facebook ran into that threshold).

Big data is when the file being processed does not fit into RAM.

Given: 800 predictors over 2 years, 5 GB of data.

Task: use a few cheap Amazon servers to build 250 trees in 1-2 hours.

Solution: AWS EMR + Apache Spark.

Is there a way to solve this problem without using Spark?

 

Apache Spark lets you forget about the lack of RAM.

I created a Random Forest of 500 trees with 7000 predictors and 30 GB of data. Amazon ran for 15 hours on two servers with 16 CPUs.

 
Roffild:

Apache Spark lets you forget about the lack of RAM.

I created a Random Forest of 500 trees with 7000 predictors and 30 GB of data. Amazon ran for 15 hours on two servers with 16 CPUs.

And what is the point of 7000 predictors in a random forest? It will overfit anyway. I took about 30-40 predictors and trained the forest. Then I ran each one separately, and that is how I selected and kept 4 predictors.

The forest trained on four predictors turned out a little better than with 30-40, but not by much. Quotes, forex in particular, are closer to random data, and I get something like +5% (55% correct predictions) for the desired class relative to the negative one.

Maybe it is possible to isolate some component of the price series that separates the classes better, but I have not managed to do that yet.

That is my point: there is no need to multiply predictors. There is little sense in it; the forest will only overfit even faster.
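The selection procedure described above (train on many predictors, then keep only the most important) can be sketched with scikit-learn; the data is synthetic, and sklearn rather than Spark is used purely for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 40))                 # 40 candidate predictors
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # only 2 actually matter

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Rank predictors by impurity-based importance and keep the top 4.
ranked = np.argsort(forest.feature_importances_)[::-1]
top4 = ranked[:4]
print([int(i) for i in top4])
```

A stricter variant retrains the forest on each candidate subset and compares out-of-sample accuracy, which is closer to the "ran each one separately" step described above; impurity-based importance alone can be misled by correlated predictors.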

 
Roffild:

Apache Spark lets you forget about the lack of RAM.

I created a Random Forest of 500 trees with 7000 predictors and 30 GB of data. Amazon ran for 15 hours on two servers with 16 CPUs.

So you decided to cram in the uncrammable and multiplied the predictors for no good reason.

Where did you get so many predictors? What is their importance? A third of them are probably not even included in the training, and 95% have low importance. And what kind of response time does the system have with so many predictors now, 3 hours per forecast? )