Discussing the article: "Neural networks made easy (Part 67): Using past experience to solve new tasks"

 

Check out the new article: Neural networks made easy (Part 67): Using past experience to solve new tasks.

In this article, we continue discussing methods for collecting data into a training set.

Reinforcement learning is built on maximizing the reward received from the environment during interaction with it. Obviously, the learning process requires constant interaction with the environment. However, situations differ: when solving some tasks, we can encounter various restrictions on such interaction. A possible solution in these situations is to use offline reinforcement learning algorithms. They allow you to train models on a limited archive of trajectories collected during preliminary interaction with the environment, while it was still available.
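Conceptually, offline RL swaps live interaction for sampling from a fixed archive of transitions. Below is a minimal MQL5-style sketch of that idea only; the STransition structure and the CModel class are hypothetical placeholders, not the classes used in the article.

//--- one stored environment transition (s, a, r, s')
struct STransition
  {
   double            state[];        // observed state description
   double            action[];       // action taken by the behavior policy
   double            reward;         // reward received from the environment
   double            next_state[];   // resulting state
  };
//--- hypothetical stand-in for a trainable model
class CModel
  {
public:
   void              Backprop(STransition &tr) { /* one gradient step on (s, a, r, s') */ }
  };
//--- offline training: sample the fixed archive, never query the environment
void TrainOffline(STransition &buffer[], CModel &model, int iterations)
  {
   int total = ArraySize(buffer);
   if(total <= 0)
      return;
   for(int i = 0; i < iterations; i++)
     {
      int idx = MathRand() % total;   // random stored transition (MathRand is enough for a sketch)
      model.Backprop(buffer[idx]);
     }
  }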

Of course, offline reinforcement learning has its drawbacks. In particular, the problem of exploring the environment becomes even more acute, since we deal with a limited training sample that cannot capture all the diversity of the environment. This is especially true in complex stochastic environments. In the previous article, we discussed one of the options for solving this task (the ExORL method).

However, sometimes restrictions on interaction with the environment can be critical. The process of exploring the environment can be accompanied by positive and negative rewards, and negative rewards can be highly undesirable when they come with financial losses or some other unacceptable cost. But tasks rarely appear out of nowhere. Most often we optimize an existing process, and in our age of information technology one can almost always find prior experience of interacting with the environment being explored, gathered while solving similar tasks. Such data from real interaction with the environment can, to one degree or another, cover the required space of states and actions. Experiments using this kind of experience to solve new tasks when controlling real robots are described in the paper "Real World Offline Reinforcement Learning with Realistic Data Source". Its authors propose a new framework for training models: Real-ORL.
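In the spirit of the ExORL approach mentioned above, one way to make such foreign experience serve a new task is to relabel the stored transitions with the new task's reward before offline training. A hypothetical sketch, reusing the STransition structure from the sketch above (NewTaskReward is a placeholder, not a function from the article):

//--- hypothetical reward function of the NEW task
double NewTaskReward(const double &state[], const double &action[])
  {
   return(0.0);   // placeholder: e.g. the new strategy's profit contribution
  }
//--- relabel archived transitions so that old experience serves the new task
void RelabelTrajectories(STransition &buffer[])
  {
   for(int i = 0; i < ArraySize(buffer); i++)
      buffer[i].reward = NewTaskReward(buffer[i].state, buffer[i].action);
  }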

Author: Dmitriy Gizlyk

 
THIS IS GENIUS WORK, Dmitriy! I love this! 🥰🥰🥰
 
If you know the topic, write an article about using Google Colab + TensorFlow. I can give a real trading task and calculate the input data.
 
Alexey Volchanskiy #:
If you know the topic, write an article about using Google Colab + TensorFlow. I can give a real trading task and calculate the input data.

I don't know how well that would fit the subject of this site.

 

Hi @Dmitriy Gizlyk

First of all, hats off to your efforts on this wonderful series on AI and ML.

I have gone through all the articles from 1 to 30 in a row in a single day. Most of the files you provided worked without any problems.

However, I jumped to article 67 and tried to run 'ResearchRealORL'. I am getting the following errors.

2024.04.21 17:59:59.935 Tester  "NNME\Part67\RealORL\ResearchRealORL.ex5" 64 bit
2024.04.21 18:00:00.133 Experts optimization frame expert ResearchRealORL (EURUSD,H1) processing started
2024.04.21 18:00:00.156 Tester  Experts\NNME\Part67\RealORL\ResearchRealORL.ex5 on EURUSD,H1 from 2023.01.01 00:00 to 2023.07.31 00:00
2024.04.21 18:00:00.157 Tester  EURUSD: history data begins from 2002.09.03 00:00
2024.04.21 18:00:00.157 Tester  EURUSD: history data begins from 2002.09.03 00:00
2024.04.21 18:00:00.157 Tester  complete optimization started
2024.04.21 18:00:00.168 Core 1  agent process started on 127.0.0.1:3000
2024.04.21 18:00:00.178 Core 2  agent process started on 127.0.0.1:3001
2024.04.21 18:00:00.188 Core 3  agent process started on 127.0.0.1:3002
2024.04.21 18:00:00.200 Core 4  agent process started on 127.0.0.1:3003
2024.04.21 18:00:00.213 Core 5  agent process started on 127.0.0.1:3004
2024.04.21 18:00:00.225 Core 6  agent process started on 127.0.0.1:3005
2024.04.21 18:00:00.237 Core 7  agent process started on 127.0.0.1:3006
2024.04.21 18:00:00.271 Core 8  agent process started on 127.0.0.1:3007
2024.04.21 18:00:00.886 Core 4  connecting to 127.0.0.1:3003
2024.04.21 18:00:00.897 Core 4  connected
2024.04.21 18:00:00.911 Core 4  authorized (agent build 4260)
2024.04.21 18:00:00.928 Core 4  common synchronization completed
2024.04.21 18:00:01.062 Core 2  connecting to 127.0.0.1:3001
2024.04.21 18:00:01.070 Core 2  connected
2024.04.21 18:00:01.081 Core 2  authorized (agent build 4260)
2024.04.21 18:00:01.096 Core 2  common synchronization completed
2024.04.21 18:00:01.110 Core 1  connecting to 127.0.0.1:3000
2024.04.21 18:00:01.118 Core 1  connected
2024.04.21 18:00:01.131 Core 1  authorized (agent build 4260)
2024.04.21 18:00:01.131 Core 4  pass 0 tested with error "OnInit returned non-zero code 1" in 0:00:00.152
2024.04.21 18:00:01.131 Core 4  pass 1 tested with error "OnInit returned non-zero code 1" in 0:00:00.006
2024.04.21 18:00:01.146 Core 1  common synchronization completed
2024.04.21 18:00:01.146 Core 4  pass 6 tested with error "OnInit returned non-zero code 1" in 0:00:00.004
2024.04.21 18:00:01.146 Core 4  pass 7 tested with error "OnInit returned non-zero code 1" in 0:00:00.003
2024.04.21 18:00:01.162 Core 4  pass 8 tested with error "OnInit returned non-zero code 1" in 0:00:00.004
...

2024.04.21 18:00:01.454 Statistics      optimization done in 0 minutes 01 seconds
2024.04.21 18:00:01.454 Statistics      shortest pass 0:00:00.000, longest pass 0:00:00.000, average pass 0:00:00.000
2024.04.21 18:00:01.454 Statistics      local 20 tasks (100%), remote 0 tasks (0%), cloud 0 tasks (0%)
2024.04.21 18:00:01.454 Core 1  connection closed
2024.04.21 18:00:01.455 Core 2  connection closed

Could you please point out where I am going wrong?

Regards and thanks a lot for all your efforts to teach us ML in MQL5.

 
Anil Varma #:

Hi @Dmitriy Gizlyk

First of all, hats off to your efforts on this wonderful series on AI and ML.

I have gone through all the articles from 1 to 30 in a row in a single day. Most of the files you provided worked without any problems.

However, I jumped to article 67 and tried to run 'ResearchRealORL'. I am getting the following errors.

Could you please point out where I am going wrong?

Regards and thanks a lot for all your efforts to teach us ML in MQL5.

I also ran into this error. It seems to be a failure to read the sample file during initialization. I have been studying this error for a long time.
Reason:
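A minimal sketch of the suspected failure point, assuming the EA loads its trajectory archive in OnInit (the file name "Trajectories.bin" below is a placeholder, not the article's actual file):

int OnInit()
  {
   //--- each tester pass runs in an isolated agent; the trajectory file must
   //--- be reachable by the agent (typically via Common\Files), otherwise the
   //--- load fails and the pass aborts with "OnInit returned non-zero code"
   int handle = FileOpen("Trajectories.bin", FILE_READ | FILE_BIN | FILE_COMMON);
   if(handle == INVALID_HANDLE)
     {
      PrintFormat("Failed to open the trajectory file, error %d", GetLastError());
      return(INIT_FAILED);   // non-zero return code => the pass is not executed
     }
   //--- ... read and deserialize the trajectory archive here ...
   FileClose(handle);
   return(INIT_SUCCEEDED);
  }

If this is indeed the cause, a reasonable first check is that the downloaded trajectory files are located where the EA expects them (for tester runs, usually the terminal's Common\Files directory) before starting the optimization.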