Discussion of article "Neural networks made easy (Part 34): Fully Parameterized Quantile Function"

 

New article Neural networks made easy (Part 34): Fully Parameterized Quantile Function has been published:

We continue studying distributed Q-learning algorithms. In previous articles, we considered the distributed and quantile Q-learning algorithms. In the first, we trained the probabilities of given ranges of values; in the second, we trained the value ranges for given probabilities. In both cases, one of the two distributions was fixed a priori while the other was trained. In this article, we will consider an algorithm that allows the model to learn both distributions.
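To make the distinction concrete, here is a minimal NumPy sketch of the three parameterizations; all array names and values are illustrative placeholders rather than anything taken from the article's code:

```python
import numpy as np

num_atoms = 8

# Distributed Q-learning (C51-style): the value support is fixed a priori,
# the model learns the probability of each value atom.
fixed_support = np.linspace(-10.0, 10.0, num_atoms)        # given ranges of values
learned_probs = np.full(num_atoms, 1.0 / num_atoms)        # trained output (softmax in practice)

# Quantile Q-learning (QR-DQN-style): the probabilities are fixed a priori,
# the model learns the value at each quantile fraction.
fixed_taus = (np.arange(num_atoms) + 0.5) / num_atoms      # given probabilities (quantile midpoints)
learned_values = np.zeros(num_atoms)                       # trained output

# Fully parameterized quantile function (FQF-style): the model learns both
# the quantile fractions and the values at those fractions.
learned_taus = np.sort(np.random.uniform(0.0, 1.0, num_atoms))  # fraction-proposal output
learned_values_at_taus = np.zeros(num_atoms)                    # quantile-value output
```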

This approach makes the trained model less sensitive to the 'number of quantiles' hyperparameter. Sampling the quantiles randomly also extends the range of functions that can be approximated to include non-uniformly distributed ones.

Before the data is input into the model, an embedding of randomly generated quantiles is created according to the formula below.
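The formula image does not carry over into this discussion copy; the sketch below assumes the standard cosine quantile embedding used in the IQN and FQF papers, φ_j(τ) = ReLU(Σ_{i=0}^{N-1} cos(π·i·τ)·w_ij + b_j), with purely illustrative parameter values:

```python
import numpy as np

def quantile_embedding(taus, embedding_dim=64, weights=None, bias=None):
    # Cosine embedding of sampled quantile fractions:
    # phi_j(tau) = ReLU( sum_i cos(pi * i * tau) * w_ij + b_j )
    i = np.arange(embedding_dim)                                # cosine basis indices 0..N-1
    cos_features = np.cos(np.pi * i[None, :] * taus[:, None])   # (num_quantiles, N)
    if weights is None:                                         # stand-ins for learnable parameters
        weights = np.random.randn(embedding_dim, embedding_dim) * 0.1
    if bias is None:
        bias = np.zeros(embedding_dim)
    return np.maximum(0.0, cos_features @ weights + bias)       # ReLU

# Randomly generated quantile fractions in (0, 1)
taus = np.sort(np.random.uniform(0.0, 1.0, size=8))
emb = quantile_embedding(taus)
print(emb.shape)    # (8, 64)
```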

There are different options for combining the resulting embedding with the tensor of the original data: either a simple concatenation of the two tensors or a Hadamard (element-wise) multiplication of the two matrices. Both options are sketched below.
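A minimal sketch of both combination options, assuming a 64-element feature vector and 8 sampled quantiles (shapes and names are hypothetical):

```python
import numpy as np

state_features = np.random.randn(1, 64)     # feature tensor produced by the base network
quantile_emb = np.random.randn(8, 64)       # embeddings of 8 sampled quantiles (see sketch above)

# Option 1: Hadamard (element-wise) product, broadcasting the state over the quantiles
combined_hadamard = state_features * quantile_emb                    # shape (8, 64)

# Option 2: concatenation along the feature axis
combined_concat = np.concatenate(
    [np.repeat(state_features, quantile_emb.shape[0], axis=0), quantile_emb],
    axis=1,
)                                                                    # shape (8, 128)
```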

Below is a comparison of the considered architectures, presented by the authors of the article.


The model's effectiveness is confirmed by tests carried out on 57 Atari games. Below is a comparison table from the original article [8].


Hypothetically, given an unlimited model size, this approach allows learning any distribution of the predicted reward.

Author: Dmitriy Gizlyk

 
Is the nn architecture similar to the one from the previous article except for the last layer?
 
happy side:
Is the nn architecture similar to the one from the previous article except for the last layer?

yes