Discussing the article: "Neural networks made easy (Part 40): Using Go-Explore on large amounts of data"

 

Check out the new article: Neural networks made easy (Part 40): Using Go-Explore on large amounts of data.

This article discusses applying the Go-Explore algorithm over a long training period, where a purely random action selection strategy becomes less likely to produce a profitable pass as training time increases.

Extending the training period of the Go-Explore algorithm brings several difficulties, including:

  1. Curse of dimensionality: As the training period increases, the number of states an agent can visit grows exponentially, making it more difficult to find the optimal strategy.

  2. Environmental change: As the training period increases, the environment itself may change in ways that affect the agent's learning outcomes. This can cause a previously successful strategy to become ineffective or even impossible.

  3. Difficulty in selecting actions: As the training period increases, the agent may need to consider the broader context of the task to make informed decisions. This complicates the choice of the optimal action and may require more complex optimization methods.

  4. Increased training time: As the training period increases, the time required to collect enough data and train the model also increases. This can reduce the efficiency and speed of agent training.

A longer training period also enlarges the state space that needs to be explored. This is the "curse of dimensionality": the number of possible states grows exponentially with dimensionality, which makes exploration difficult and can cause the algorithm to spend too much time on irrelevant states.
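
As a rough illustration of that exponential growth, the Python sketch below counts how many distinct action sequences exist for a given number of steps, and how many cells a discretized state space contains for a given number of features. The function names and the figures of 4 actions per step and 10 bins per feature are assumptions chosen for illustration, not values taken from the article.

```python
def action_sequences(n_actions: int, n_steps: int) -> int:
    """Count distinct action sequences of length n_steps."""
    return n_actions ** n_steps


def state_cells(bins_per_feature: int, n_features: int) -> int:
    """Count cells when every feature is split into the same number of bins."""
    return bins_per_feature ** n_features


if __name__ == "__main__":
    # Assumed setup: 4 possible actions on every step.
    for n_steps in (5, 10, 20):
        print(n_steps, "steps ->", action_sequences(4, n_steps), "sequences")

    # Assumed setup: each state feature discretized into 10 bins.
    for n_features in (2, 4, 8):
        print(n_features, "features ->", state_cells(10, n_features), "cells")
```

Doubling the number of steps squares the number of possible sequences, which is why purely random exploration covers a shrinking share of the space as the training period grows.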

To check the quality and efficiency of the trained model, we test it on both the training and test samples. Notably, the model was able to make a profit on historical data for the first week of May 2023, which was not included in the training set but directly followed it.
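
A minimal sketch of such an out-of-sample check, written in Python and assuming the price history is available as a list of (timestamp, value) pairs. The helper name `chronological_split`, the `test_days` parameter and the generated sample data are hypothetical and do not come from the article; only the idea of testing on the week that directly follows the training period does.

```python
from datetime import datetime, timedelta


def chronological_split(bars, cutoff, test_days=7):
    """Split (timestamp, value) pairs into a training part before `cutoff`
    and a test part covering the `test_days` days that directly follow it."""
    test_end = cutoff + timedelta(days=test_days)
    train = [bar for bar in bars if bar[0] < cutoff]
    test = [bar for bar in bars if cutoff <= bar[0] < test_end]
    return train, test


if __name__ == "__main__":
    # Hypothetical hourly history; the values are placeholders, not market data.
    start = datetime(2023, 4, 24)
    bars = [(start + timedelta(hours=i), float(i)) for i in range(21 * 24)]

    train, test = chronological_split(bars, cutoff=datetime(2023, 5, 1))
    print("training bars:", len(train))
    print("test bars:", len(test))
```

Splitting by time rather than at random keeps every test bar later than every training bar, so the test result is not influenced by data the model has already seen.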

Figure: Test sample (May 2023)

Author: Dmitriy Gizlyk