Discussing the article: "Role of random number generator quality in the efficiency of optimization algorithms" - page 5

 

There are two approaches to solving the problem of evaluating the stability of a solution found during optimisation: theoretical and practical.

The theoretical one implies diving into the mathematical derivations of papers that usually have the words "stability" and "optimisation" in their titles.

The practical one can be implemented in different ways, but it boils down to the research work done by the authors of the articles from the first point. For example, by making N redundant FF calls in the vicinity of each visited point to estimate the local derivatives and, depending on them, adjusting the score by some penalty criterion. Or you could add noise to the coordinates and/or the FF values during the optimisation process - this is the cheapest way (nothing needs to be changed in the optimisation algorithms themselves). There is also the idea of clustering the optimisation results and identifying the points that maximise the conditional formula (FF value) / (FF spread indicator) * (minimum dispersion over the parameters). A point into which a single good result falls will have zero spread and should be treated as unsatisfactory.
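A minimal sketch of the clustering option in Python with NumPy, purely to make the formula concrete. The function name cluster_scores, the crude grid-bucket "clustering" and the grid_step parameter are all illustrative assumptions, not anything from the article; the essential point is that a single-point cluster must be rejected rather than rewarded by its zero spread:

```python
import numpy as np

def cluster_scores(params, ff, grid_step=0.1):
    """Score clusters of optimisation results by
    (mean FF) / (FF spread) * (minimum parameter dispersion).

    params : (n_points, n_dims) array of parameter sets
    ff     : (n_points,) array of their FF values
    Clustering here is crude grid bucketing, purely for illustration.
    """
    keys = np.floor(params / grid_step).astype(int)
    scores = {}
    for key in {tuple(k) for k in keys}:
        mask = np.all(keys == key, axis=1)
        if mask.sum() < 2:
            scores[key] = 0.0   # a single good result: zero spread -> unsatisfactory
            continue
        spread = ff[mask].std()
        if spread == 0.0:
            scores[key] = 0.0   # degenerate cluster, same reasoning
            continue
        min_disp = params[mask].std(axis=0).min()
        scores[key] = ff[mask].mean() / spread * min_disp
    return scores
```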

 
Stanislav Korotky #:

...clustering the optimisation results and identifying the points that maximise the conditional formula (FF value) / (FF spread indicator) * (minimum dispersion over the parameters). A point into which a single good result falls will have zero spread and should be treated as unsatisfactory.

Your post is getting a bit ahead of itself. That's not a bad thing, but it doesn't help answer some of the questions that have arisen here in the discussion.

fxsaber has already given his answer to my question; please give yours too, Stanislav, and let's wait for Andrey's answer. Let me remind you of the question:

Andrey Dik #:

Let's imagine that we have the opportunity to perform a complete enumeration of the system's parameters. We make a run over the history with all parameters that are possible in principle. Now, please answer the question: "Do you have a way to choose, from all the possible parameters, one set (or several) that you are ready to use for running the system on new data?"

Then we will be able to reveal the problematic areas in these questions most fully.

Please note: in this formulation of the question there is no mention of optimisation algorithms.

 
Andrey Dik #:

fxsaber has already given his answer to my question; please give yours too, Stanislav, and let's wait for Andrey's answer. Let me remind you of the question:

IMHO, I have just answered this question - three options at a quick glance (someone may think of more): estimating derivatives (*/**), clustering (**), noise (*) - one asterisk means during the optimisation process, two asterisks mean on the results of the optimisation. Any of these options results in adjustments to the FF scores of the points. Then select the best set as usual, only it will no longer be simply the maximum FF, but the maximum with a stability correction.
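Of the three, the noise option is the easiest to show in code, because it only wraps the FF and changes nothing in the optimiser itself. A minimal sketch, where noisy_ff, the Gaussian jitter and both sigma defaults are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)

def noisy_ff(ff, coord_sigma=0.01, value_sigma=0.0):
    """Wrap a fitness function so that every evaluation is jittered.

    A parameter set that only scores well at one exact point will see its
    score collapse under the jitter; a set on a wide, gentle hill will not.
    """
    def wrapped(x):
        x = np.asarray(x, dtype=float)
        x_jittered = x + rng.normal(0.0, coord_sigma, size=x.shape)
        value = ff(x_jittered)
        if value_sigma > 0.0:
            value += rng.normal(0.0, value_sigma)
        return value
    return wrapped

# usage: pass noisy_ff(my_ff) to any optimiser instead of my_ff
```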

 
Andrey Dik #:

But let's start by talking here. So, once again, let's imagine that we have access to a complete enumeration of the system's parameters. We make a run over the history with all parameters that are possible in principle. Now, please answer the question: "Do you have a way to select, from all the possible parameters, one set (or several) that you are willing to use to run the system on new data?" This is a question not only for you, but for everyone who is willing to answer it. As we can see, we are now not talking about any optimisation algorithm at all, only about choosing a set of parameters to run the system on unknown data.

Let it be the top of one of the hills. But I immediately stipulated that "the definition of 'bestness' is a separate interesting question".

The final result should take into account the results of passes within a certain radius.

On a 3D chart, this would be the top of a hill that is higher than the others, or slightly lower but with gentler slopes.

From the point of view of the trading system (TS), it will be the aggregated result of the TS over some range of values of each of its parameters.
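If the full-enumeration results are already at hand, that criterion can be written down directly: score every point by the aggregated FF of all passes within a radius and take the maximum. A sketch under the assumption that the results fit in memory as NumPy arrays; robust_best, the mean as the aggregate and the Euclidean radius are illustrative choices:

```python
import numpy as np

def robust_best(params, ff, radius):
    """Pick the parameter set whose whole neighbourhood performs well.

    params : (n, d) array of enumerated parameter sets
    ff     : (n,) array of the corresponding FF values
    radius : neighbourhood size after normalising each parameter to [0, 1]
    """
    span = np.ptp(params, axis=0).astype(float)
    span[span == 0.0] = 1.0                      # guard against constant parameters
    p = (params - params.min(axis=0)) / span
    best_idx, best_score = -1, -np.inf
    for i in range(len(p)):
        mask = np.linalg.norm(p - p[i], axis=1) <= radius
        score = ff[mask].mean()                  # a tall, narrow peak scores poorly here
        if score > best_score:
            best_idx, best_score = i, score
    return best_idx, best_score
```

Note that this brute-force scan is quadratic in the number of passes, which is one reason why, at the scale of a full search, finding such hills becomes an optimisation problem in its own right.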

 
The goal is to find this region without a complete search.
 
Stanislav Korotky #:

IMHO, I have just answered this question - three options at a quick glance (someone may think of more): estimating derivatives (*/**), clustering (**), noise (*) - one asterisk means during the optimisation process, two asterisks mean on the results of the optimisation. Any of these options results in adjustments to the FF scores of the points. Then select the best set as usual, only it will no longer be simply the maximum FF, but the maximum with a stability correction.

The first line in your post:

Stanislav Korotky #:

There are two approaches to solving the problem of evaluating the stability of a solution found during optimisation: theoretical and practical.

So far we are not doing optimisation; we have done a complete enumeration of the parameters.

 
Andrey Khatimlianskii #:

Let it be the top of one of the hills. But I immediately stipulated that "the definition of 'bestness' is a separate interesting question".

The final result should take into account the results of passes within a certain radius.

On a 3D chart, this would be the top of a hill that is higher than the others, or slightly lower but with gentler slopes.

From the point of view of the trading system (TS), it will be the aggregated result of the TS over some range of values of each of its parameters.

Andrey Khatimlianskii #:
The goal is to find this region without a complete search.

So, is there a way to select, from the results of a full search over all possible parameters, the set that we will use on unfamiliar data? We have done a full search; there is no optimisation here.

Now it is very important to answer this question.

 
Andrey Khatimlianskii #:
The goal is to find this region without a complete search.

I am sure that such a problem is often posed and therefore has several published solutions.

By the way, if the FF values have already been calculated at trillions of points (a full search), then finding multidimensional hills among them is itself an optimisation problem.

So it comes down to optimisation in any case.


I proposed an iterative approach: before each iteration, the neighbourhoods of the vertices found at the previous iteration are punched out of the search space, and another GA run is performed, as sketched below.
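A skeleton of that iterative scheme. Everything here is hypothetical: run_ga stands in for whatever GA run is actually used, and the punch-out is done with a simple penalty wrapper that returns minus infinity inside the excluded neighbourhoods:

```python
import numpy as np

def punched_ff(ff, found, exclusion_radius):
    """Return -inf inside the punched-out neighbourhoods of earlier vertices."""
    def wrapped(x):
        x = np.asarray(x, dtype=float)
        for v in found:
            if np.linalg.norm(x - v) <= exclusion_radius:
                return -np.inf
        return ff(x)
    return wrapped

def iterate_ga(ff, run_ga, n_iterations, exclusion_radius):
    """run_ga(ff) -> best parameter vector of one GA run (hypothetical signature)."""
    found = []
    for _ in range(n_iterations):
        vertex = run_ga(punched_ff(ff, found, exclusion_radius))
        found.append(np.asarray(vertex, dtype=float))
    return found   # one vertex per iteration, each from a different hill
```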

 
Thank you for your comments, dear fxsaber, Stanislav and Andrey. Respecting your vast experience in development and other fields, let me express my point of view on aspects related to the notion of "optimisation". By no means is my goal to "teach"; I want to convey my vision as someone who has taken apart hundreds of other people's optimisation algorithms and designed several of my own.

If I can convey my thought, it will allow you to look at optimisation from a slightly different angle, which will undoubtedly enrich your already vast experience and knowledge, and will also add to my own stock of experience in the process of the discussion.

"If" - because there are external influences on this article discussion thread that prevent this from happening. So far these influences have been successfully bought down, but that is no guarantee that this will always be the case.
 
Andrey Dik #:

The first line in your post:

So far we are not doing optimisation; we have done a complete enumeration of the parameters.

Is this some kind of terminological game? I suggested three ways to choose the best set - they are also suitable for the case of a full run over the history with all possible combinations.

For example, consider a well-known problem: there is a neural network (say, one trading on price increments), and optimisation is used to find the weights of this network. If we apply your algorithms head-on, we will get an overfitted network that will not be able to work on new data. Therefore, it is critical to split the original dataset in two and, while performing the optimisation on one half, control the quality on the second half.
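A minimal sketch of that split, with hypothetical names throughout: evaluate stands in for whatever quality measure drives the optimisation, candidates for the weight sets the optimiser produced on the first half, and the final choice is controlled on the half the optimiser never saw:

```python
def select_on_held_out(candidates, evaluate, data):
    """candidates : weight sets produced by the optimiser (hypothetical)
    evaluate   : evaluate(weights, data_slice) -> quality score (hypothetical)
    data       : full historical dataset, split chronologically in half
    """
    half = len(data) // 2
    train, test = data[:half], data[half:]
    # the optimiser is assumed to have maximised evaluate(w, train);
    # the set we actually deploy is chosen on the half it never saw
    scored = [(evaluate(w, test), evaluate(w, train), w) for w in candidates]
    best_test, best_train, best_w = max(scored, key=lambda t: t[0])
    return best_w, best_train, best_test   # a large train/test gap signals overfitting
```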