Discussing the article: "Neural networks made easy (Part 39): Go-Explore, a different approach to exploration"

Check out the new article: Neural networks made easy (Part 39): Go-Explore, a different approach to exploration.

We continue to examine environment exploration in reinforcement learning models. In this article, we will look at another algorithm, Go-Explore, which enables effective exploration of the environment during the model training stage.

The main idea of Go-Explore is to remember promising states and return to them later. This is fundamental for effective operation when rewards are sparse. The idea is flexible and broad enough to be implemented in a variety of ways.
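
To make the "remember promising states" idea concrete, here is a minimal sketch of such a state archive in Python. The `cell()` down-sampling, the `CellRecord` fields, and the `remember()` helper are illustrative assumptions for this post, not the implementation from the article (the article's code is written in MQL5).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

def cell(state: Tuple[float, ...]) -> Tuple[int, ...]:
    # Illustrative down-sampling: coarse rounding groups similar states
    # into a single discrete cell (the real mapping is problem-specific).
    return tuple(int(round(s)) for s in state)

@dataclass
class CellRecord:
    best_reward: float                                   # best return seen on arrival
    trajectory: List[int] = field(default_factory=list)  # actions that led here

archive: Dict[Tuple[int, ...], CellRecord] = {}

def remember(state, trajectory, reward):
    """Keep a newly visited state, or update a known cell when this visit
    reached it with a higher reward ("remember promising states")."""
    key = cell(state)
    rec = archive.get(key)
    if rec is None or reward > rec.best_reward:
        archive[key] = CellRecord(reward, list(trajectory))
```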

Unlike most reinforcement learning algorithms, Go-Explore does not focus on directly solving the target problem, but rather on finding states and actions in the state space that can lead to the target state. To this end, the algorithm operates in two main phases: search and reuse.

In the first phase, the algorithm traverses the state space and records each visited state in a state "map" (archive). It then studies each visited state in more detail, collecting information about the actions that can lead to other interesting states.
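
Below is a hedged sketch of this first phase, building on the archive above. The environment interface (`reset()`, `step()`, `sample_action()`) and the uniform cell selection are assumptions made for illustration; Go-Explore typically weights rarely visited or high-reward cells when choosing where to return.

```python
import random

def explore_phase(env, iterations=1000, horizon=50):
    """Phase 1 sketch: pick a stored cell, return to it by replaying its
    trajectory, then explore onward with random actions, recording every
    state reached in the archive."""
    remember(env.reset(), [], 0.0)        # seed the archive with the start state
    for _ in range(iterations):
        rec = random.choice(list(archive.values()))   # cell to return to
        state, total_reward, trajectory = env.reset(), 0.0, []
        for action in rec.trajectory:     # deterministic return ("Go")
            state, r, done = env.step(action)
            trajectory.append(action)
            total_reward += r
        for _ in range(horizon):          # random exploration ("Explore")
            action = env.sample_action()  # assumed environment helper
            state, r, done = env.step(action)
            trajectory.append(action)
            total_reward += r
            remember(state, trajectory, total_reward)
            if done:
                break
```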

In the second phase, previously learned states and actions are reused to find new solutions. The algorithm stores the most successful trajectories and uses them to generate new states that can lead to even better solutions.
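
As a rough illustration of the reuse step, the sketch below simply extracts the best stored trajectories from the archive; how they are then exploited (as demonstrations for imitation learning, or replayed with small perturbations to search for better solutions) is a design choice, and the `top_k` parameter is an assumption for this example.

```python
def reuse_phase(top_k=10):
    """Phase 2 sketch: return the most successful stored trajectories so
    they can be reused as starting points for new, better solutions."""
    best = sorted(archive.values(), key=lambda rec: rec.best_reward,
                  reverse=True)
    return [rec.trajectory for rec in best[:top_k]]
```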

Author: Dmitriy Gizlyk