What to feed to the input of the neural network? Your ideas... - page 39

 
Aleksey Vyazmikin #:
This is how threads get spammed....

I don't mind

You're constructive.

 
Alexey Volchanskiy #:

Considering I have a scalper running with a 1 Hz ask/bid sampling rate, trading on monthly bars seems like a mental aberration to me. Without the slightest bit of trolling.

Recalling some DSP: in the local sandbox, ticks arrive at a frequency of 3-5 Hz... which is extremely unpleasant for 1 Hz sampling.

 
Andrey Dik #:


... tens of thousands* of dense lines of code....


Alexei, I'll answer later.
 
I've been thinking

Why is the output of a neural network only BUY and SELL? Well, plus HOLD, say.

After all, the same SoftMax can choose... anything, there are no limits to the flight of fancy.

For example, we take two different strategies: one for flat markets, the other for trends. We feed the input as usual - whatever we fed before. At the output, we decide which strategy will trade now (or whose signals to monitor).

Say SoftMax picks the flat strategy: that strategy then checks for a signal, TP, SL and so on according to its ready-made formalised rules.


Then again: analysis of the input data. The NN decides that the chart now suits a trend strategy better and hands control to it.

UPD

I'll make something simple. If there is something interesting, I will post it.
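The idea above - a SoftMax head that picks a strategy rather than a direction - can be roughly sketched as follows. Everything here is hypothetical: the strategy names, the raw scores and the helper functions stand in for whatever the real network would produce.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

# Hypothetical strategy labels: each strategy has its own formalised
# rules (signal check, TP, SL) and just needs to be told "you trade now".
STRATEGIES = ["flat", "trend"]

def pick_strategy(logits):
    """Return the strategy with the highest softmax probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return STRATEGIES[best], probs

# Example: suppose the network's output layer produced these raw scores.
name, probs = pick_strategy([0.2, 1.3])
print(name)  # the trend strategy gets control
```

The point is only that the output layer's classes are arbitrary labels: nothing forces them to be BUY/SELL/HOLD, so they can just as well index whole sub-strategies.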
 
Ivan Butko #:
I've been thinking

Why is the output of a neural network only BUY and SELL? Well, plus HOLD, say.

After all, the same SoftMax can choose... anything, there are no limits to the flight of fancy.

For example, we take two different strategies: one for flat markets, the other for trends. We feed the input as usual - whatever we fed before. At the output, we decide which strategy will trade now (or whose signals to monitor).

Say SoftMax picks the flat strategy: that strategy then checks for a signal, TP, SL and so on according to its ready-made formalised rules.


Then again: analysis of the input data. The NN decides that the chart now suits a trend strategy better and hands control to it.

UPD

I'll make something simple. If there is something interesting, I will post it.

The outputs could also be split into pullback (limit) and breakout (stop) entries, but as a rule the network is too... Either add neurons until you end up with a GPT, or, on the contrary, simplify the set of possible actions of the network. The first option was voiced by me several years ago, but I was pelted with tomatoes in the ML thread, although the same people who opposed increasing the number of neurons are now even trying to master LLMs as applied to markets.

In short, you should try things and not listen to anyone. You can listen to me (a nod to the famous phrase 😊).

 
Andrey Dik #:

The outputs could also be split into pullback (limit) and breakout (stop) entries, but as a rule the network is too... Either add neurons until you end up with a GPT, or, on the contrary, simplify the set of possible actions of the network. The first option was voiced by me several years ago, but I was pelted with tomatoes in the ML thread, although the same people who opposed increasing the number of neurons are now even trying to master LLMs as applied to markets.

In short, you should try things and not listen to anyone. You can listen to me (a nod to the famous phrase 😊).

Thanks for the idea.

Limit and stop are essentially different directions. Either within one strategy, or again as two: one trades at the best price, the other on a breakout.


Andrey Dik #:

The first option was voiced by me several years ago, but I was pelted with tomatoes in the ML thread, although the same people who opposed increasing the number of neurons are now even trying to master LLMs as applied to markets.


Well done. Good luck to them, and perhaps they will share the grail with us once the chatbot writes it for them.

I am stubbornly reluctant to draw the grail.

 
Aleksey Vyazmikin #:

1. And I thought you were referring to the ML thread, because some participants had critical opinions about the algorithms discussed in your articles. I just wanted to clarify what, as I understood it, is the substantive reason for the disagreement. I don't want to bring up personalities.

2. I know what to look for - statistically stable dependencies on quantum segments of predictors, but I don't know yet what signs tell about it. When building a model, it is necessary to have a high probability of making the right step at a new iteration - that's all :)

3. ...

4. Still, you don't understand what I mean. To simplify, imagine that, within an iterative approach, the optimisation algorithm could forcibly check all outcomes of the variants from which a choice must be made, and that choice brings the model closer to a particular goal on all available samples. Then you have in your pocket a model (or coefficients - call it what you like) that is good on the available data, but not necessarily the best option. I.e. the algorithm peeks at what the FF will be on the final data when choosing the step to take at an iteration. My point is that when you don't peek, you can accidentally get a good result that depended on random steps. We are talking about market data, and the result can turn bad when fed new data. Anyway, my point is that it is not easy to tell whether you got a good model by chance or thanks to the algorithm, on an unrepresentative subsample.

5. So the use is the same as the standard optimiser with its genetics and standard FFs. Admissibility here refers to the probability of obtaining a stable model (settings) on new data. It is clear what will be searched for and found....

6. By the way, have you evaluated how algorithms cope with categorical features?

7. Many algorithms use randomisation for variability - haven't you come across this outside of ML? If you want repeatability of the result, fix the seed.

It is necessary to compare not just one model but at least a hundred of them, by some descriptive statistical criteria. The probability of choosing the right settings (model) can then be estimated on new data...

8. Gradient descent is used because it makes efficient use of computational resources. There are more complicated and heavier methods, as a lecturer mentioned in one video, but I didn't memorise them - only the rationale stuck.

1. The disagreement, as it seems to me, is that opponents of optimisation algorithms deny their applicability to market data, while actively failing to notice (or pretending not to notice) that AOs are present in one form or another in those same neural networks and other ML tools.

2. What degree of robustness is required? A little bit, or a little bit more? It is required to achieve the maximum possible stability on new data, this is the fitness function that needs to be maximised.

3. ...

4. The optimisation algorithm is not the only component of the optimisation process (for some reason everyone forgets this). The AO itself can be compared to petrol for a car: the petrol doesn't care where the car goes; if the petrol is bad the car may not go at all, and the better the petrol, the faster the car can go (i.e. reach the goal sooner). I have already given an example several times, with a diagram, to explain the role of the AO in the optimisation process. Let me repeat it: imagine it were possible to run an exhaustive search of the parameters (whether of a simple TS or of ML methods) and obtain the fitness-function value for each of them. In this example the optimisation algorithm is not involved at all, yet one still has to choose from all the parameters of the exhaustive search. So the AO always acts only as an accelerator of the result; it plays no part in the correctness of the selection (the fitness function is merely an external parameter to the AO). Only the fitness function determines the correctness of the selection (correctness can mean anything, including the ability of the TS to operate successfully on new data). Therefore, when talking about robustness, or the ability of a system to work on new data, one should look not at the AO but at the fitness function: what it consists of, and everything that precedes its evaluation (how decisions are scored).

5. See previous paragraphs.

6. I am in the process of researching this topic. An article on it is currently under review. I'm thinking of continuing to explore it (the topic).

7. The initial state of the system parameters is randomised not to get random results at the output, but simply to cover as large an area of possible parameter variants as possible. The output should not be random but a quite definite result (in terms of robustness: the maximum of a fitness function that itself incorporates a robustness measure). Here it is convenient to use the method of limits: the first iteration gives random parameters, the last iteration the required parameters. Between them lies the range of fitness-function values, and where a run lands in that range shows the efficiency of the optimisation algorithm: the closer to the random end, the worse the AO; the closer to the required maximum of the optimal result (I repeat, the optimum satisfying the maximum possible robustness of the system), the more efficient the AO. If the neural network shows different results with a large spread, it means the algorithm used inside it simply got stuck somewhere in a local extremum of the fitness function (the loss function, in this case).

8. Did the lecturer happen to tell you that the algorithms commonly used for training networks get stuck trivially easily? Probably not; he emphasised, I suppose, that they are very fast. Yes, they are fast because they have no population and therefore need fewer runs over the training data - but that is what they were designed for, to be fast, and convergence suffers for it (nothing in this world is free).
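The claim in point 8 - that fast gradient-style methods can get stuck in a local extremum, while restarts or populations trade speed for coverage - can be illustrated with a toy sketch. The function, learning rate and restart count here are invented purely for illustration:

```python
import random

def f(x):
    # Toy fitness landscape with two minima: a local one near x ≈ 1.1
    # and the global one near x ≈ -1.3.
    return x**4 - 3*x**2 + x

def grad(x):
    # Analytic derivative of f.
    return 4*x**3 - 6*x + 1

def gradient_descent(x, lr=0.01, steps=2000):
    """Plain gradient descent from a single starting point."""
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# A single fast run from one start: lands in the local minimum.
x_single = gradient_descent(2.0)

# A "population" of random restarts: more runs over the data, but one
# of them falls into the global basin and wins on fitness.
random.seed(0)
starts = [random.uniform(-2, 2) for _ in range(10)]
x_multi = min((gradient_descent(s) for s in starts), key=f)

print(round(x_single, 2), round(x_multi, 2))
```

The single run is cheaper, but only the multi-start version finds the lower value of f, which is the speed-versus-convergence trade-off the point describes.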

Alexey, I hope you will now look at the topic from a slightly different angle than the one accepted in the ML thread and, in general, in many other places. ML folk are very much like believers who take many things on faith (that is neither bad nor good, it just sometimes prevents looking at things from the standpoint of logic), or like fanatical alchemists combining ML methods in the hope of obtaining the philosopher's stone - an ML system that works on OOS. I am not an opponent of ML, but I always try to take things apart to understand the impact of every cog in the machine on the result.
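The exhaustive-search thought experiment from point 4 can also be sketched in code: no optimisation algorithm is involved at all, and the fitness function alone decides which parameters win. The two-parameter "trading system" and its fitness function below are invented for illustration:

```python
from itertools import product

def fitness(fast, slow):
    # Stand-in fitness function for a hypothetical two-parameter TS.
    # In reality this would be a full backtest score; here it is just
    # a smooth function whose best point is (10, 30) by construction.
    return -((fast - 10)**2 + (slow - 30)**2)

# Exhaustive search over the whole parameter grid: every candidate is
# evaluated, so no optimisation algorithm is needed. The selection of
# the winner is done entirely by the fitness function.
grid = product(range(1, 21), range(20, 41))
best = max(grid, key=lambda p: fitness(*p))
print(best)  # (10, 30)
```

An AO could only reach this same answer with fewer evaluations; it cannot change which answer the fitness function prefers, which is the point of the petrol-and-car analogy.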

 
Andrey Dik #:

1. The disagreement, as I see it, is that opponents of optimisation algorithms deny their applicability to market data, while actively failing to notice (or pretending not to notice) that AOs are present in one form or another in those same neural networks and other ML tools.

2. What degree of robustness is required? A little bit, or a little bit more? It is required to achieve the maximum possible stability on new data, this is the fitness function that needs to be maximised.

3. ...

4. The optimisation algorithm is not the only component of the optimisation process (for some reason everyone forgets this). The AO itself can be compared to petrol for a car: the petrol doesn't care where the car goes; if the petrol is bad the car may not go at all, and the better the petrol, the faster the car can go (i.e. reach the goal sooner). I have already given an example several times, with a diagram, to explain the role of the AO in the optimisation process. Let me repeat it: imagine it were possible to run an exhaustive search of the parameters (whether of a simple TS or of ML methods) and obtain the fitness-function value for each of them. In this example the optimisation algorithm is not involved at all, yet one still has to choose from all the parameters of the exhaustive search. So the AO always acts only as an accelerator of the result; it plays no part in the correctness of the selection (the fitness function is merely an external parameter to the AO). Only the fitness function determines the correctness of the selection (correctness can mean anything, including the ability of the TS to operate successfully on new data). Therefore, when talking about robustness, or the ability of a system to work on new data, one should look not at the AO but at the fitness function: what it consists of, and everything that precedes its evaluation (how decisions are scored).

5. See previous paragraphs.

6. I am in the process of researching this topic. An article on it is currently under review. I think I will continue to explore it (the topic).

7. The initial state of the system parameters is randomised not to get random results at the output, but simply to cover as large an area of possible parameter variants as possible. The output should not be random but a quite definite result (in terms of robustness: the maximum of a fitness function that itself incorporates a robustness measure). Here it is convenient to use the method of limits: the first iteration gives random parameters, the last iteration the required parameters. Between them lies the range of fitness-function values, and where a run lands in that range shows the efficiency of the optimisation algorithm: the closer to the random end, the worse the AO; the closer to the required maximum of the optimal result (I repeat, the optimum satisfying the maximum possible robustness of the system), the more efficient the AO. If the neural network shows different results with a large spread, it means the algorithm used inside it simply got stuck somewhere in a local extremum of the fitness function (the loss function, in this case).

8. Did the lecturer happen to tell you that the algorithms commonly used for training networks get stuck trivially easily? Probably not; he emphasised, I suppose, that they are very fast. Yes, they are fast because they have no population and therefore need fewer runs over the training data - but that is what they were designed for, to be fast, and convergence suffers for it (nothing in this world is free).

Alexey, I hope you will now look at the topic from a slightly different angle than the one accepted in the ML thread and, in general, in many other places. ML folk are very much like believers who take many things on faith (that is neither bad nor good, it just sometimes prevents looking at things from the standpoint of logic), or like fanatical alchemists combining ML methods in the hope of obtaining the philosopher's stone - an ML system that works on OOS. I am not an opponent of ML, but I always try to take things apart to understand the impact of every cog in the machine on the result.

1. I have already written above about head-on application: the articles are read not as abstract algorithms but as a replacement for the standard optimiser's algorithm, to be used with standard FFs. And that approach is not very effective, as many have already found out. This happens because in every article, by any author, the reader tries to find something useful for trading. This simply has to be taken into account, without taking offence at the readers. Maybe you should give examples of FFs in the articles that take into account not only the metrics describing the financial result, but also the other factors affecting it, which are implied but not named?

2. Stability in this context is a binary variable obtained by measuring the bias of the probability of encountering one of the classes on a quantised predictor segment, relative to the number of all representatives of that class in the sample. When subsamples change, the probability bias should not change - that would be the stability. It is like finding stationarity in non-stationary processes. The model is then built on these data, and the more correctly such quantum segments are detected, the higher the probability of choosing one at each step of the model-building algorithm, and hence the higher the probability of building the model that is needed. Naturally, the validation section is not evaluated initially. As a result, there is a goal and there is a metric for evaluation, but what affects the result is not completely clear - additional evaluation metrics are needed.

Below is a graph showing the probability (percentage) of selecting a stable quantum segment from the pool at each iteration of the model building algorithm for each of the two classes.

4. As I already wrote in the first point, and will only repeat here: people try to understand why they need it, and come to see it as an alternative to the standard optimiser with its genetics. What other parameters, far removed from the market, should be put into the FF is unclear to most people.

7. I didn't write that the goal is to get random results at the output. The goal is to consider different ways of finding a solution, including changing the abstract landscape from different dimensions.

8. I think it is wrong to assume that people who are knowledgeable in their field lack the necessary knowledge and experience. This applies to the abstract lecturer as well as to many forum members. Sometimes, before proving your position, you should understand your opponent's logic. We are working in a developing field; there can be different points of view, and they may change, so thinking in categorical terms is unproductive. And if you claim something with reference to your experience - "I did such and such, but the result was sad; I think it's because of that..." - then maybe someone will suggest a solution, or share their result on similar initial data.

I have read everything you have written here about your work, and your articles, as I think have many others who commented on them. It is the assumption that people are underdeveloped, which shows through in your convictions, that leads to conflicts with you. The ML thread is simply an example of a place where any assertion or approach is questioned and a reason is sought for why a method is ineffective, even if it seems effective. That is why I see bias in your judgements of it. Many participants in the ML thread do not argue their assertions, but that is not always because they lack the arguments. I think it is an occupational deformation. Yes, it can be frustrating.

If you think you have deeper knowledge of the issue, understand the mathematics of the process well, and want to benefit people, then look at the approach of missionaries in ancient times: find common ground and build the dialogue from there. And if you don't want to, you can simply ignore other people with their views and beliefs. Those who need to will read your clever thoughts and draw the necessary conclusions for themselves.

In general, I have tried to show you the other side of the cause of the conflict, in the hope that it will stop, and that its participants will hear each other and start treating criticism adequately, without mutual insults.

 
Aleksey Vyazmikin #:

8.

Don't you realise that with this message you are not extinguishing the conflict, but adding fuel to the fire?

If you did not do it intentionally, then I suggest that everyone pretend that point 8 in Aleksey Vyazmikin's post simply does not exist.

 
I like Alexei's interpretation; it is close to the real state of affairs. But one could add: you should not aggressively teach what you yourself have not yet fully understood, especially if you have no supporting results. You can argue your point with references to authoritative scientific papers (as is done in the ML thread) or in other ways that summarise the essence of the argument, without subjective dogma like "I've been doing neural networks for 20 years, so now I'm the smartest."