Discussing the article: "Role of random number generator quality in the efficiency of optimization algorithms" - page 4

 
Andrey Dik #:

Yes, of course, making changes, making improvements. So a codebase would be a great solution.

Git would be better. You can see the version history, easily see the changes in each file, report problems or comment on a particular line of code, and much more.

 

IMHO, when evaluating the effectiveness of optimisation, what matters most is how the found parameters perform on an OOS sample (say, k-fold cross-validation), and I haven't noticed this topic yet. Maybe I missed it somewhere.
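
A minimal sketch of the OOS check mentioned above, under loud assumptions: the "system" is a toy that takes a trade when a signal exceeds a threshold, the data are synthetic, and `fitness` and `k_fold_oos` are hypothetical names made up for illustration. Note also that plain k-fold lets the training folds include data that comes after the test fold, so for real price series a walk-forward split is often preferred.

```python
import random


def fitness(threshold, trades):
    """Hypothetical FF: total outcome of the trades whose signal exceeds the threshold."""
    return sum(outcome for signal, outcome in trades if signal > threshold)


def k_fold_oos(trades, candidates, k):
    """For each fold: pick the best threshold on the other folds, then score it on the held-out fold."""
    fold_size = len(trades) // k
    results = []
    for i in range(k):
        test = trades[i * fold_size:(i + 1) * fold_size]
        train = trades[:i * fold_size] + trades[(i + 1) * fold_size:]
        best = max(candidates, key=lambda t: fitness(t, train))
        results.append((best, fitness(best, train), fitness(best, test)))
    return results


rng = random.Random(1)
# Synthetic data (an assumption): the outcome is weakly related to the signal, plus noise.
trades = []
for _ in range(500):
    signal = rng.random()
    trades.append((signal, (signal - 0.5) + rng.gauss(0, 1)))

thresholds = [i / 10 for i in range(10)]
for best, in_sample, oos in k_fold_oos(trades, thresholds, k=5):
    print(f"threshold={best:.1f}  in-sample FF={in_sample:8.2f}  OOS FF={oos:8.2f}")
```

Comparing the in-sample and OOS columns fold by fold gives a rough idea of how well the "best" parameter survives on data it was not chosen on.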

Usually the problem is not to find the most meticulous optimisation algorithm, but to ensure that the system continues to perform as well as it does. I have a feeling that the parameters that are most optimal on the backtest do not live long enough on the forward test, and therefore the efficiency should be evaluated in some other way, for example, by the size of continuous smooth clusters of parameters giving the highest average FF value.

By the way, I didn't see any speed comparison in the tables.

 
Stanislav Korotky #:

IMHO, when evaluating the efficiency of optimisation, how the found parameters perform on an OOS sample (say, k-fold cross-validation) is of primary importance, and I haven't noticed this topic yet. Maybe I missed it somewhere.

Usually the problem is not to find the most meticulous optimisation algorithm, but to make the system continue to perform as well as possible. I have a feeling that the parameters that are most optimal on the backtest do not live long enough on the forward test, and therefore the efficiency should be evaluated in some other way, for example, by the size of continuous smooth clusters of parameters giving the highest average FF value.

By the way, I didn't see any speed comparison in the tables.

You touch upon the issue of robustness of trading systems; this issue has no direct relation to the optimisation algorithms themselves.

This series of articles considers the optimisation algorithms themselves, as tools, and compares them with each other.

The execution time of the algorithms' own code is negligible compared to the calculation of the FF in practical tasks, so there is little point in measuring it.

Roughly speaking, we are comparing shovels: which ones cut the ground better and to what depth. Where they will be used, to dig holes for potatoes or for cucumbers, is irrelevant, and the influence of shovels on how fast the vegetables ripen is not investigated.


P.S. Optimisation and robustness of system parameters are different things, which for some reason are erroneously linked together. To identify robust parameters, other methods should be used (either as post-processing of the results or in parallel with optimisation), such as identifying clusters of parameters with similar system behaviour.

 
Andrey Dik #:

You touch on the issue of robustness of trading systems; this issue has nothing directly to do with the optimisation algorithms themselves.

But you decided to delve into the influence of RNG (random number generator) quality on optimisation (i.e. you deviated from what is directly related to the optimisation algorithm). This is commendable. But IMHO, the stability of the found solution is more important, and it is probably achieved by different ways of evaluating "better" in the FF space. This is directly related to the optimisation algorithm and its practical applicability.

 
Stanislav Korotky #:

1. You have decided to delve into the influence of RNG quality on optimisation (i.e. you have deviated from what is directly related to the optimisation algorithm). This is commendable.

2. But imho, more important is the stability of the solution found, which is probably achieved by different ways of estimating "betterness" in the FF space. This is directly related to the optimisation algorithm and its practical applicability.

1. The question of the influence of RNG quality is a legitimate one, since all the algorithms considered so far are stochastic in nature (except the Nelder-Mead algorithm). Therefore, RNGs are directly related to optimisation algorithms.

2. Answered in the post above. I will add a little: the tests use test functions that play the role of the criterion we need. This is a very important point to grasp. Imagine being able to perform ALL possible runs with all possible parameter sets, i.e. a complete enumeration. Now imagine that you need to select some of them, right? How do you select them? There are some criteria by which you will choose among all the parameter sets. If you know which parameters (solutions) to choose from all those obtained by a complete search, then you know which criteria to apply in optimisation. The FF plays the role of these criteria.

Which FF should be applied to ensure robustness of the system? Optimisation algorithms do not answer this question and never have. They help you find something only if you know what to look for.

The responsibility for choosing an FF lies entirely with the researcher.

The question of how RNG quality influences the results of optimisation algorithms has come up repeatedly on this forum and on others. I had not seen an answer to it in the public domain; now there is an answer, and tools for verification are provided.
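
A rough sketch of how such a check could be set up, not the test bench from the article: a plain random search is run with two interchangeable generators, Python's built-in Mersenne Twister and a deliberately weak 16-bit LCG. `TinyLCG`, `sphere`, `random_search` and `mean_best` are hypothetical names, and the toy FF is an assumption.

```python
import random


class TinyLCG:
    """Deliberately low-quality generator: a 16-bit linear congruential generator
    (period at most 65536). Hypothetical, for illustration only."""

    def __init__(self, seed):
        self.state = seed % 65536

    def random(self):
        self.state = (25173 * self.state + 13849) % 65536
        return self.state / 65536


def sphere(point):
    """Toy FF to maximise: the best possible value is 0, at the origin."""
    return -sum(x * x for x in point)


def random_search(rng, evaluations, dims=2, low=-5.0, high=5.0):
    """Simplest stochastic optimiser; the RNG is injected so it can be swapped."""
    best = float("-inf")
    for _ in range(evaluations):
        point = [low + (high - low) * rng.random() for _ in range(dims)]
        best = max(best, sphere(point))
    return best


def mean_best(make_rng, runs=20, evaluations=1000):
    """Average of the best FF value found over several independently seeded runs."""
    return sum(random_search(make_rng(seed), evaluations) for seed in range(runs)) / runs


print("Mersenne Twister:", mean_best(random.Random))
print("16-bit LCG:      ", mean_best(TinyLCG))
```

The same harness can be pointed at any other generator that exposes a `random()` method returning a value in [0, 1).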
 

Optimisation is shrouded in many myths, abuses and misconceptions, so I'd like to reiterate:

Let's conduct a thought experiment. All possible variants of the system parameters are obtained by a complete enumeration. In this case the optimisation algorithm is excluded completely; there is no optimisation algorithm at all. With all possible parameter variants in hand, the question arises: which parameters to choose? Knowing the answer to this question, one can describe the selection criteria in the form of a fitness function, which can then be applied in the process of optimisation. The optimisation algorithm will then make it possible to get the desired result, the same as with a full search, but faster. After this thought experiment it becomes clear that the optimisation algorithm is not "guilty" in any way of what it has found; the fault lies with the user who has not accurately described the desired criteria.

And this is important.
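
The thought experiment can be sketched in a few lines. Everything here is an assumption made for illustration: the FF is a made-up smooth function with a single maximum, and a simple hill climb stands in for an optimisation algorithm (it is not one of the algorithms from the articles, and it only works this cleanly because the toy FF has one hill).

```python
GRID = range(0, 21)  # each of the two parameters takes integer values 0..20


def fitness(x, y):
    """Hypothetical FF: the researcher's selection criterion written as a function."""
    return -((x - 7) ** 2) - (y - 3) ** 2


def full_enumeration():
    """No optimiser at all: evaluate every parameter set and apply the criterion."""
    return max(((x, y) for x in GRID for y in GRID), key=lambda p: fitness(*p))


def hill_climb(start=(0, 0)):
    """Stand-in optimiser: move to the best neighbour until nothing improves.
    Returns the point found and the number of FF evaluations spent."""
    current, current_score = start, fitness(*start)
    evaluations = 1
    while True:
        x, y = current
        neighbours = [(x + dx, y + dy)
                      for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                      if x + dx in GRID and y + dy in GRID]
        scored = [(fitness(*n), n) for n in neighbours]
        evaluations += len(scored)
        best_score, best = max(scored)
        if best_score <= current_score:
            return current, evaluations
        current, current_score = best, best_score


exhaustive = full_enumeration()
found, cost = hill_climb()
print("full enumeration:", exhaustive, "evaluations:", len(GRID) ** 2)
print("hill climb:      ", found, "evaluations:", cost)
```

Both routes are driven by the same `fitness`; the optimiser only changes how many parameter sets have to be evaluated, not what counts as "best".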

 
Andrey Khatimlianskii #:

Git would be better. You can see the version history, easily see the changes in each file, report problems or comment on a particular line of code, and much more.

Glad to see you among the commenters, Andrey.

It's not easy with github, problems with motivation and time)).

 
Andrey Dik #:

Good to see you among the commenters, Andrey.

It's not easy with github, problems with motivation and time)).

Interesting topic!

With GitHub it takes a day at most to learn and experiment, and then it's pure enjoyment. I'm ready to help.


Andrey Dik #:

Optimisation is shrouded in a lot of myths, abuses and misconceptions, so I would like to repeat:

Let's conduct a thought experiment. All possible variants of the system parameters are obtained by a complete enumeration. In this case the optimisation algorithm is excluded completely; there is no optimisation algorithm at all. With all possible parameter variants in hand, the question arises: which parameters to choose? Knowing the answer to this question, one can describe the selection criteria in the form of a fitness function, which can then be applied in the process of optimisation. The optimisation algorithm will then make it possible to get the desired result, the same as with a full search, but faster. After this thought experiment it becomes clear that the optimisation algorithm is not "guilty" of what it has found; the fault lies with the user who has not accurately described the desired criteria.

And this is important.

To support Stanislav's point.

The FF cannot describe the space around it; it has no information from outside.

And the task of a researcher is to find not a single sharp peak around which the results are jagged and bad, but a more or less extensive plateau, where the results are not very different.

How can the FF determine its position in space? It can't; it only knows its absolute coordinate. So we need an algorithm (don't call it optimisation, if the word grates on your ear) that searches not for the maximum (albeit global), but for the best plateau. The definition of "best" is, of course, a separate interesting question, but it is obvious that parameters from the centre of the plateau will be much more stable on the forward test than parameters from a sharp peak surrounded by a moat.
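
One possible reading of "search for the best plateau", sketched on a made-up 1-D landscape: each grid point is scored by the mean FF over its neighbourhood rather than by its own value. The window radius, the use of the mean, and the names `neighbourhood_score`, `best_peak` and `best_plateau` are assumptions, not a definition of "best".

```python
# Toy FF values over an integer parameter grid (made-up numbers):
# a sharp isolated peak at index 3 and a broad hill around indices 8..12.
ff = [1, 1, 2, 10, 1, 1, 2, 4, 6, 7, 8, 7, 6, 4, 2, 1]


def neighbourhood_score(values, i, radius):
    """Mean FF over the window [i - radius, i + radius], clipped to the grid."""
    lo = max(0, i - radius)
    hi = min(len(values), i + radius + 1)
    window = values[lo:hi]
    return sum(window) / len(window)


def best_peak(values):
    """Index of the raw global maximum."""
    return max(range(len(values)), key=lambda i: values[i])


def best_plateau(values, radius=2):
    """Index whose neighbourhood has the highest mean FF."""
    return max(range(len(values)),
               key=lambda i: neighbourhood_score(values, i, radius))


print("raw maximum at index:  ", best_peak(ff))
print("best plateau centre at:", best_plateau(ff))
```

On this toy data the raw maximum sits on the isolated spike, while the neighbourhood score picks the centre of the broad hill. The same scoring could be applied either as post-processing of a full set of results or folded into the criterion itself.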

 
Andrey Khatimlianskii #:

Supporting Stanislav's thought.

The FF cannot describe the space around it; it has no information from outside.

And the researcher's task is to find not a single sharp peak around which the results are jagged and bad, but a more or less extensive plateau, where the results are not very different.

How can the FF determine its position in space? It can't; it only knows its absolute coordinate. That's why we need an algorithm (don't call it optimisation, if the word grates on your ear) that searches not for the maximum (albeit global), but for the best plateau. The definition of "best" is, of course, a separate interesting question, but it is obvious that parameters from the centre of the plateau will be much more stable on the forward test than parameters from a sharp peak surrounded by a moat.

What Stanislav is talking about and what you are talking about now has nothing to do with optimisation. And you misunderstand some aspects of the fitness function. Still, it is already good that you agree to separate optimisation from what is highlighted in colour.

It seems a separate article is needed to cover these issues.

But let's start by talking here. So, again, let's pretend that we are able to do a full sweep of the system parameters. We run the system on history with every parameter set that is possible in principle. Now, please answer the question: "Do you have a way to select, from all possible parameters, one set (or several) that you are willing to use to run the system on new data?" This is a question not only for you, but for everyone who is willing to answer it. As we can see, we are no longer talking about any optimisation algorithm at all, only about choosing a set of parameters to run the system on unknown data.

 
Andrey Dik #:

Let's imagine that we are able to perform a complete search of the system parameters. We run the system on history with every parameter set that is possible in principle. Now, please answer the question: "Do you have a way to select, from all possible parameters, one set (or several) that you are ready to use for running the system on new data?"

I would choose only the tops of high local hills (not cliffs).