What to feed to the input of the neural network? Your ideas...

 
Ivan Butko #:

Please clarify in the context of a regular MT5 optimiser and a regular EA.

What would it look like?

Take two sets from the optimisation list (uncorrelated), combine them and run? Or do you have something else in mind?

From the results you obtained some coefficients; write them into your EA's code ahead of the code that works with the variables being optimised.

As a result, you get a filter from the first pass, and the second network learns not from all examples, but only from those passed by the first network.

The goal is to increase the mathematical expectation by reducing false entries - judging by the graph it is very low.

Or, as I have already suggested earlier, try to train on the filtered results using other methods - CatBoost, for example.
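A minimal sketch of that two-pass filter in Python (scikit-learn is used purely for illustration; the synthetic data, network sizes and the "profitable entry" label are placeholder assumptions, not the actual EA):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Hypothetical data: X = features of a candidate entry, y = 1 if the trade paid off
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 10))
y = (X[:, 0] + 0.5 * rng.normal(size=5000) > 0).astype(int)

X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)

# Pass 1: train the filter network, then freeze its coefficients
filter_net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
filter_net.fit(X_tr, y_tr)

# Keep only the training examples the first network lets through
mask = filter_net.predict(X_tr) == 1
X_flt, y_flt = X_tr[mask], y_tr[mask]

# Pass 2: the second network learns only from the filtered examples
second_net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=1)
second_net.fit(X_flt, y_flt)

# A live entry must pass both stages, which is what should cut the false entries
keep = (filter_net.predict(X_va) == 1) & (second_net.predict(X_va) == 1)
print(f"entries kept on validation: {keep.mean():.1%}")
```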

 
Andrey Dik #:
Specialists from the machine learning (MO) thread say

That's not the point - you can search for anything, however you want; it's just that the optimum found on the search data does not guarantee the same result on data outside that search (as applied to the market).

If there is an algorithm that not only searches for parameters that are optimal from the mathematical point of view on the FF (fitness function), but also takes into account the structure of the data, making it possible to see a dependence that stays stable across two samples after training, then it is really valuable for application to the market. And it is always possible to simply find something by chance, as I have shown in the MO thread in my recent research.

 
Aleksey Vyazmikin #:

From the results you obtained some coefficients; write them into your EA's code ahead of the code that works with the variables being optimised.

As a result, you get a filter from the first pass, and the second network learns not from all examples, but only from those passed by the first network.

The goal is to increase the mathematical expectation by reducing false entries - judging by the graph it is very low.

Or, as I have already suggested earlier, try to train on the filtered results using other methods - CatBoost, for example.

Ah, got it

Thanks for the idea

 
Ivan Butko #:

Ah, got it

Thanks for the idea

You're welcome! Share your results.

 
Aleksey Vyazmikin #:

You're welcome! Share your results.

and bots.

Is that from the perceptron-bot article? Then don't.
 
Aleksey Vyazmikin #:

1. That's not the point - you can search for anything, however you want,

2. it's just that the optimum found on the search data does not guarantee the same result on data outside that search (as applied to the market).

3. If there is an algorithm that not only searches for parameters that are optimal from the mathematical point of view on the FF, but also takes into account the structure of the data, making it possible to see a dependence that stays stable across two samples after training, then it is really valuable for application to the market.

4. and it is always possible to find something by chance, as I have shown in the MO thread in my recent research.

1. What is the point, then? Not that you can search for anything, but what exactly is the point of the search?

2. And what does guarantee it?

3. I haven't seen a single complete algorithm that can find a "dependence stable across two samples after training", but there are logical considerations and a set of techniques for how this can be achieved (or at least for understanding which way to go).

4. Why look for something random when you can look for something non-random?

 
Andrey Dik #:

1. What is the point, then? Not that you can search for anything, but what exactly is the point of the search?

2. And what does guarantee it?

3. I haven't seen a single complete algorithm that can find a "dependence stable across two samples after training", but there are logical considerations and a set of techniques for how this can be achieved (or at least for understanding which way to go).

4. Why look for something random when you can look for something non-random?

1. The point is that not only the algorithm matters, but also the nature/origin of the data - we are talking about both the (non)stationarity of the processes and the non-representativeness of the sample, if we are discussing the admissibility of different optimisation methods.

2. I don't have an answer - I'm looking for one.

3. I would be interested to hear about it.

4. Here I wrote that if there is no algorithm from point 2, then all approaches show their efficiency on new data essentially at random - better or worse is decided not by the logic of the algorithm (although I admit there are heuristics that improve the result) but by the randomness of the data obtained.

When you study a known function, it (its constants) does not change with the input variables: the more examples you have, the better your chances of picking its coefficients correctly. With market data there is a limit on how many examples can be obtained, and there is the problem that many such number generators are working at once, each with its own function (even if they are large participants whose rules of behaviour are approved and published, as with our Central Bank). So ideally one can fit to the similar behaviour rules of different participants, but the result will be a noisy function that describes only part of the market - that is how I see the situation. And meanwhile, participants can drift from one behaviour function to another...

The optimisation algorithms from the articles are very interesting in themselves. Try to consider the admissibility of applying them head-on; maybe also take a hundred different functions mixed together and try to describe at least one of them? In general, make the experiment more complicated.
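A hedged sketch of that complication (the hundred hidden linear "participants" and all numbers here are invented for illustration): data comes from many functions at once, and a single fitted function recovers only a noisy average of them, not any one participant.

```python
import numpy as np

rng = np.random.default_rng(42)

# A hundred hidden "participants", each with its own linear rule y = a*x + b
coeffs = rng.normal(size=(100, 2))

# Each observation is produced by a randomly chosen participant; we never see which one
x = rng.uniform(-1, 1, size=10_000)
who = rng.integers(0, 100, size=10_000)
y = coeffs[who, 0] * x + coeffs[who, 1]

# Fitting a single line to the mixture recovers only an "average participant" -
# a noisy function describing part of the market at best
a_hat, b_hat = np.polyfit(x, y, deg=1)
print(f"fitted:                    a={a_hat:.3f}, b={b_hat:.3f}")
print(f"mean of true coefficients: a={coeffs[:, 0].mean():.3f}, b={coeffs[:, 1].mean():.3f}")
```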

 
Aleksey Vyazmikin #:

1. The point is that not only the algorithm matters, but also the nature/origin of the data - we are talking about both the (non)stationarity of the processes and the non-representativeness of the sample, if we are discussing the admissibility of different optimisation methods.

2. I don't have an answer - I'm looking for one.

3. I would be interested to hear about it.

4. Here I wrote that if there is no algorithm from point 2, then all approaches show their efficiency on new data essentially at random - better or worse is decided not by the logic of the algorithm (although I admit there are heuristics that improve the result) but by the randomness of the data obtained.

When you study a known function, it (its constants) does not change with the input variables: the more examples you have, the better your chances of picking its coefficients correctly. With market data there is a limit on how many examples can be obtained, and there is the problem that many such number generators are working at once, each with its own function (even if they are large participants whose rules of behaviour are approved and published, as with our Central Bank). So ideally one can fit to the similar behaviour rules of different participants, but the result will be a noisy function that describes only part of the market - that is how I see the situation. And meanwhile, participants can drift from one behaviour function to another...

The optimisation algorithms from the articles are very interesting in themselves. Try to consider the admissibility of applying them head-on; maybe also take a hundred different functions mixed together and try to describe at least one of them? In general, make the experiment more complicated.

Further on in the text: for each point, two quotations, and after them my text.

1.

That's not the point - you can search for anything, however you want,

The point is that not only the algorithm matters, but also the nature/origin of the data - we are talking about both the (non)stationarity of the processes and the non-representativeness of the sample, if we are discussing the admissibility of different optimisation methods.

On the first point, I did not say anything about optimisation algorithms. The essence of any search is the optimum described by the user. Whatever optimisation algorithm (AO) is used, an AO only lets you avoid doing an exhaustive search. If an exhaustive search is feasible, then an AO is not needed, but the question of "what exactly is needed" remains, and it does not matter whether the process is stationary or not.

2.

it's just that the optimum found on the search data does not guarantee the same result on data outside that search (as applied to the market).

I don't have an answer - I'm looking for one.

No one has a clear answer to this question. If you do not know what to search for, then no one and nothing will do it for you, because it is not known what exactly is being sought; and since that is the case, no result can be classified as "what was being sought".

3.

I haven't seen a single complete algorithm that can find a "dependence stable across two samples after training", but there are logical considerations and a set of techniques for how this can be achieved (or at least for understanding which way to go).

I'd be interested to hear about that.

And I, under favourable circumstances, would be happy to talk about it (if you know what I mean).

4.

And it is always possible to simply find something by chance, as I have shown in the MO thread in my recent research.

Here I wrote that if there is no algorithm from point 2, then all approaches show their efficiency on new data essentially at random - better or worse is decided not by the logic of the algorithm (although I admit there are heuristics that improve the result) but by the randomness of the data obtained.

Exactly, if you don't know what to look for, it is impossible to find it.

When you study a known function, it (its constants) does not change with the input variables: the more examples you have, the better your chances of picking its coefficients correctly. With market data there is a limit on how many examples can be obtained, and there is the problem that many such number generators are working at once, each with its own function (even if they are large participants whose rules of behaviour are approved and published, as with our Central Bank). So ideally one can fit to the similar behaviour rules of different participants, but the result will be a noisy function that describes only part of the market - that is how I see the situation. And meanwhile, participants can drift from one behaviour function to another...

This is what the discussion is about.

The optimisation algorithms from the articles are very interesting in themselves. Try to consider the admissibility of applying them head-on; maybe also take a hundred different functions mixed together and try to describe at least one of them? In general, make the experiment more complicated.

Let me put it this way: optimisation algorithms (understanding their internal logic and search methods) open your eyes to many things and open up ways of seeing search paths (be it optimisation in the general case or learning as a particular case). I'm planning an article with such an experiment, with a mix of features.

What do you mean by "admissibility of head-on application"? I don't understand this formulation of the question.

Just the other day I started a client's project - an EA with a neural network (with a choice of internal optimisation algorithm: SGD and many others), training, validation and everything as it should be; the project is several dozen lines of dense code... So what am I getting at? The network shows new results every time on the same training data. ))) The network performs poorly on new data? Well, it shows different results on the same data, so what new data can we even talk about? But the network itself is not the problem: a network is just a static formula; the weights and biases are adjusted, but the network does not change. Then what is the matter, why do you get different results? The matter is elementary: SGD (and the others) gets stuck, mind you, on the loss function. ))) That is, not only is it unclear what the network is being trained on, but the internal optimisation algorithm cannot pull it out either.

I want to do an experiment: compare the results of the same network trained with gradient methods and with classical optimisation algorithms. I have probed this question and found no research on the topic; I only see and hear, as dogma, "you should use gradient descent and so on", with no clear answer to the question "why you should". I asked various GPT-like assistants: one gives random articles from a well-known scientific resource, another gives no links but stubbornly insists that "it's necessary", and a third admitted that it was taught to say so and has no proof of its words, though it would be glad to learn of such proof.
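A rough outline of what such a head-to-head might look like (a deliberately toy model: one linear "neuron" on synthetic data, plain gradient descent against a naive random search; all of it is an assumption for illustration, not the planned article):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=200)

def loss(w):
    """Mean squared error of the linear 'neuron' with weights w."""
    return np.mean((X @ w - y) ** 2)

# (a) plain gradient descent on the loss
w = np.zeros(3)
for _ in range(500):
    grad = 2 * X.T @ (X @ w - y) / len(y)
    w -= 0.05 * grad
print("gradient descent:", loss(w))

# (b) a derivative-free "classical" search: random sampling around the best point so far
best = np.zeros(3)
for _ in range(5000):
    cand = best + rng.normal(scale=0.1, size=3)
    if loss(cand) < loss(best):
        best = cand
print("random search:  ", loss(best))
```

Note the budget difference: the gradient method uses 500 steps, the derivative-free one needs 5000 evaluations to get close, which hints at why gradient descent is the default where gradients are available.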

 
Andrey Dik #:

On the first point, I didn't say anything about optimisation algorithms. The essence of any search is the optimum described by the user. Whatever optimisation algorithm (AO) is used, an AO only lets you avoid doing an exhaustive search. If an exhaustive search is feasible, then an AO is not needed, but the question of "what exactly is needed" remains, and it does not matter whether the process is stationary or not.

1. And I thought you were referring to the MO thread, because some participants' opinion of the algorithms reviewed in your articles was critical. I just decided to clarify what, in my view, is the substantive reason for the disagreement. I don't want to bring personalities into it.

Andrey Dik #:
No one has a clear answer to this question. If you do not know what to search for, then no one and nothing will do it for you, because it is not known what exactly is being sought; and since that is the case, no result can be classified as "what was being sought".

2. I know what to look for - statistically stable dependencies on quantised segments of the predictors; but which attributes point to them, I don't know yet. When building a model, you need a high probability of making the right step at each new iteration - that's all :)
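One way to read "statistically stable dependencies on quantised segments" as code (this is my interpretation only; the synthetic data, the quartile edges and the 0.05 stability threshold are placeholder assumptions): bin a predictor on the first sample and check whether each bin's target rate holds up on the second.

```python
import numpy as np

rng = np.random.default_rng(7)

def target_rate_per_bin(x, y, edges):
    """Mean target value inside each quantised segment of predictor x."""
    idx = np.digitize(x, edges)
    return np.array([y[idx == i].mean() if np.any(idx == i) else np.nan
                     for i in range(len(edges) + 1)])

# Two samples assumed to come from the same process
x1, x2 = rng.normal(size=5000), rng.normal(size=5000)
y1 = (x1 + rng.normal(size=5000) > 0).astype(float)
y2 = (x2 + rng.normal(size=5000) > 0).astype(float)

edges = np.quantile(x1, [0.25, 0.5, 0.75])   # quantise on sample 1 only
r1 = target_rate_per_bin(x1, y1, edges)
r2 = target_rate_per_bin(x2, y2, edges)

# A segment is "stable" if its deviation from the base rate has the same sign
# in both samples and the rates themselves are close
stable = (np.sign(r1 - y1.mean()) == np.sign(r2 - y2.mean())) & (np.abs(r1 - r2) < 0.05)
print("per-segment rates, sample 1:", np.round(r1, 3))
print("per-segment rates, sample 2:", np.round(r2, 3))
print("stable segments:", stable)
```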

Andrey Dik #:

And I, under favourable circumstances, would be happy to talk about it (if you know what I mean).

3. I don't get it at all, to be honest.

Andrey Dik #:

Exactly, if you do not know what exactly to look for, it is impossible to find it.

4. Still, you do not understand what I mean. To simplify: imagine that at each iteration of the optimisation algorithm you could forcibly check every outcome of the variants you have to choose between, and make the choice that brings the model closer to the given goal on all available samples. Then you would have in your pocket a model (or coefficients - call it what you like) that is good on the available data, though not necessarily the best option. In other words, the algorithm peeks at what the FF will be on the final data when choosing the step at each iteration. My point is that when you don't peek, you can get a good result by accident - one that depended on random steps. We are talking about market data, and such a result can turn bad on new data. In short, it is not so easy to understand whether you got a good model by chance or thanks to the algorithm, given an unrepresentative subsample.

Andrey Dik #:

This is what the discussion is about.

Andrey Dik #:

Let me put it this way: optimisation algorithms (understanding their internal logic and search methods) open your eyes to many things and open up ways of seeing search paths (be it optimisation in the general case or learning as a particular case). I'm planning an article with such an experiment, with a mix of features.

I would be interested to read about such an experiment and how well it reflects the way the market is formed.

Andrey Dik #:

What do you mean by "admissibility of head-on application"? I don't understand this formulation of the question.

It means using them in the same way as the standard optimiser with its genetic algorithm and standard FFs. Admissibility here refers to the probability of obtaining a model (settings) that remains stable on new data. So that it is clear what will be searched for and found...

By the way, have you evaluated how algorithms cope with categorical features?

Andrey Dik #:
Then what is the matter, why do you get different results?

Many algorithms use randomisation for variability - haven't you come across this outside of machine learning? If you want repeatable results, fix the seed.
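A minimal sketch of what "fix the seed" means in practice (NumPy shown; the toy training stub is invented, but the same idea applies in whatever framework the EA uses):

```python
import numpy as np

def train_once(seed):
    """Toy stand-in for a training run whose result depends on random initialisation."""
    rng = np.random.default_rng(seed)
    w0 = rng.normal(size=5)           # random initial weights
    return float(np.sum(w0 ** 2))     # stand-in for the final loss

print(train_once(1), train_once(2))   # different seeds -> different "results"
print(train_once(1), train_once(1))   # same seed -> identical, repeatable result
```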

Andrey Dik #:
I want to do an experiment: compare the results of the same network trained with gradient methods and with classical optimisation algorithms.

You need to compare not just one model but at least a hundred, by some descriptive statistical criteria. Only then can you estimate the probability of choosing the right settings (model) for new data...
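A sketch of that kind of comparison (the training run here is a stub returning a made-up out-of-sample score; the point is the descriptive statistics over a hundred runs, not any single model):

```python
import numpy as np

def out_of_sample_score(seed):
    """Placeholder: train a model with this seed, return its score on new data."""
    rng = np.random.default_rng(seed)
    return rng.normal(loc=0.02, scale=0.1)   # pretend distribution of OOS returns

scores = np.array([out_of_sample_score(s) for s in range(100)])

# Judge the approach by the distribution, not by the single best run
print(f"mean:   {scores.mean():.4f}")
print(f"median: {np.median(scores):.4f}")
print(f"std:    {scores.std():.4f}")
print(f"share of runs profitable on new data: {(scores > 0).mean():.0%}")
```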

Andrey Dik #:
I have probed this question and found no research on the topic

Gradient descent is used because it makes efficient use of computational resources. There are more sophisticated and heavier methods, as a lecturer mentioned in one video, but I did not remember them - only the rationale is clear.

 
This is how threads get spammed...