What to feed to the input of the neural network? Your ideas... - page 62

 
Aleksey Nikolayev #

In the context of trading, I would also recall fxsaber's approach, in which one searches for a plateau rather than a peak. Here, too, it turns out that the problem is not cleanly formalised as an optimisation problem.

This falls under the branch of optimisation that deals with minimising a noisy function. "Noisy function optimisation", or something like that; I am writing from memory.
In fact, everything has already been invented; if a trader comes up with some "know-how" here, it is more likely out of ignorance.

And fxsaber simply did early stopping in a banal, primitive way; there was hardly any real plateau there, more just peace of mind.
 

Back to the technical implementation...

Optimisation is present when selecting a split in a tree; there are different formulas for it.
If the tree is trained to 100%, this optimisation determines only the path to complete memorisation of the data. It does not affect the quality of learning, which is 100% by construction. If true learning is understood only as 100% learning, then learning (exact memorisation of what was taught) != optimisation.

But if we undertrain, i.e. stop splitting before full memorisation, we can stop at different points along this path. Then the model and the quality of learning (the degree of undertraining) will differ with different split-selection algorithms, different tree depths, and different minimum numbers of examples per leaf.
Undertraining is a bad idea when the data to be learnt are unambiguous/exact (the multiplication table, Ohm's law, etc.). There, the more examples you give to memorise/learn, the more accurate the answers will be on new data.

But with market data, in order not to memorise the noise, you have to stop earlier, then evaluate and select one of these undertrained models.
So it turns out that optimisation and evaluation are needed precisely for undertrained/imperfect models. A perfectly accurate database needs no evaluation: it already contains everything it was meant to learn.
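The early-stopping idea above can be sketched with a toy 1-D regression tree: grow trees to different depths on noisy data, then pick the depth by validation error rather than training fit. This is an illustrative sketch, not code from the thread; the step function, the depths tried, and the `fit_tree`/`predict` helpers are all invented for the example.

```python
import random

random.seed(0)

def true_fn(x):
    # The hidden "law" the model should recover: a simple step
    return 1.0 if x > 0.5 else 0.0

def make_data(n, noise=0.5):
    xs = [random.random() for _ in range(n)]
    return [(x, true_fn(x) + random.gauss(0.0, noise)) for x in xs]

def fit_tree(data, depth):
    """Greedy 1-D regression tree; a leaf is just the mean of its examples."""
    ys = [y for _, y in data]
    mean = sum(ys) / len(ys)
    if depth == 0 or len(data) < 2:
        return mean
    data = sorted(data)
    best = None
    for i in range(1, len(data)):
        t = (data[i - 1][0] + data[i][0]) / 2.0
        left = [y for x, y in data if x <= t]
        right = [y for x, y in data if x > t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((y - ml) ** 2 for y in left)
               + sum((y - mr) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, t)
    t = best[1]
    return (t,
            fit_tree([p for p in data if p[0] <= t], depth - 1),
            fit_tree([p for p in data if p[0] > t], depth - 1))

def predict(node, x):
    while isinstance(node, tuple):
        t, left, right = node
        node = left if x <= t else right
    return node

def mse(node, data):
    return sum((predict(node, x) - y) ** 2 for x, y in data) / len(data)

train, valid = make_data(80), make_data(200)
# Several "undertrained" models plus one near-memorizer (depth 20)
scores = {d: mse(fit_tree(train, d), valid) for d in (1, 3, 6, 20)}
best_depth = min(scores, key=scores.get)  # chosen by validation, not training fit
```

On this noisy data the deep tree memorises the training noise, so a shallower, undertrained tree wins on the validation set, which is exactly why the evaluation step is needed.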

 

Learning is a process, not an outcome.

There is no such thing as 100% or 50% learning; such categories do not exist.

The quality of learning is checked only by validation and testing: how well the student has learnt the lessons. It is not checked by the student simply repeating after the teacher or reading back their notes.

The capacity to learn and to memorise differs from model to model; don't reduce everything to trees and forests.

And this is where the magic happens: the dumber student (model) often turns out to be a better predictor than the smarter one. Just like in life. And there is a rationale for it.
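The "dumber student wins" point is the bias-variance trade-off in disguise, and a toy sketch can show it: a 1-nearest-neighbour "memorizer" copies the noisy answer of its single nearest training point, while a dumber 25-neighbour vote averages the noise away. Everything here (the data, the 30% noise rate, the `knn_predict` helper) is invented for illustration.

```python
import random

random.seed(3)

NOISE = 0.3  # 30% of training labels are flipped

def clean_label(x):
    return 1 if x > 0.5 else 0

def noisy_label(x):
    flipped = random.random() < NOISE
    return 1 - clean_label(x) if flipped else clean_label(x)

train = [(x, noisy_label(x)) for x in (random.random() for _ in range(200))]
test_xs = [random.random() for _ in range(500)]

def knn_predict(x, k):
    # Majority vote among the k training points closest to x
    neighbours = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(lbl for _, lbl in neighbours)
    return 1 if 2 * votes > k else 0

def accuracy(k):
    hits = sum(knn_predict(x, k) == clean_label(x) for x in test_xs)
    return hits / len(test_xs)

acc_memorizer = accuracy(1)   # "smart" student: recalls every noisy answer
acc_smoother = accuracy(25)   # "dumber" student: averages over neighbours
```

The memorizer's test accuracy is capped near the label noise level, while the cruder averaging model recovers the underlying rule much more reliably.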

All this is written up in machine learning theory, which no one here seems to have even heard of, although it is the basis of the basics.

There is real magic going on in the world of ML, just not all this nonsense of yours.
 
mytarmailS #:
This falls under the branch of optimisation that deals with minimising a noisy function. "Noisy function optimisation", or something like that; I am writing from memory.
In fact, everything has already been invented; if a trader comes up with some "know-how" here, it is more likely out of ignorance.

And fxsaber simply did early stopping in a banal, primitive way; there was hardly any real plateau there, more just peace of mind.
One example of noisy optimisation

As I understand it, the general philosophy of the approach is as follows:
1. A model learns the noise of the optimised function (the noise differs from region to region) and predicts it.
2. A kind of average of the value + the predicted noise is calculated.

3. The result is a so-called plateau, not a single absolute peak.
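The steps above can be sketched in a few lines (this is my toy version, not fxsaber's actual code; the objective function and the neighbourhood radius are invented): compare picking the single best noisy evaluation against picking the point whose whole neighbourhood averages best.

```python
import random

random.seed(1)

def objective(x):
    # A broad plateau of good parameter values around 30..50, buried in noise
    base = 1.0 if 30 <= x <= 50 else 0.0
    return base + random.gauss(0.0, 0.6)

xs = list(range(100))
ys = [objective(x) for x in xs]

# Naive choice: the single best noisy evaluation (may be a lucky noise spike)
best_raw = max(xs, key=lambda x: ys[x])

def smoothed(x, radius=5):
    # Average a point with its neighbours: a point wins only if its
    # whole neighbourhood is good, i.e. it sits on a plateau
    lo, hi = max(0, x - radius), min(len(xs) - 1, x + radius)
    window = ys[lo:hi + 1]
    return sum(window) / len(window)

# Plateau-seeking choice: the best *neighbourhood average*
best_plateau = max(xs, key=smoothed)
```

The neighbourhood average stands in for step 2's "value + predicted noise"; a fancier version would fit a model of the local noise instead of a plain window mean.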
 
Forester #:

Back to the technical implementation...

Optimisation is present when selecting a split in a tree; there are different formulas for it.
If the tree is trained to 100%, this optimisation determines only the path to complete memorisation of the data. It does not affect the quality of learning, which is 100% by construction. If true learning is understood only as 100% learning, then learning (exact memorisation of what was taught) != optimisation.

But if we undertrain, i.e. stop splitting before full memorisation, we can stop at different points along this path. Then the model and the quality of learning (the degree of undertraining) will differ with different split-selection algorithms, different tree depths, and different minimum numbers of examples per leaf.
Undertraining is a bad idea when the data to be learnt are unambiguous/exact (the multiplication table, Ohm's law, etc.). There, the more examples you give to memorise/learn, the more accurate the answers will be on new data.

But with market data, in order not to memorise the noise, you have to stop earlier, then evaluate and select one of these undertrained models.
So it turns out that optimisation and evaluation are needed precisely for undertrained/imperfect models. A perfectly accurate database needs no evaluation: it already contains everything it was meant to learn.

Expand on the following thought:

Task 1:
There is arithmetic, and one of its operations is multiplication. And there are the numbers 0 to 9. Learn the rule of multiplication, build a multiplication table by multiplying the numbers by each other, and learn the table.

Task 2:
There is a multiplication table. Here it is
...
...
...
Learn it.


In the second variant, the learner does not know the multiplication rule but knows the correct answers.


Are both of these activities learning?

If not, how would you categorise (describe) such activities?
 
Maxim Dmitrievsky #:

Well, it isn't. The accuracy of answers on new data (and by new data we mean data other than the training data) will depend on the properties of each particular model, not on the number of training examples.

If we consider the case of pattern-governed data (the multiplication table, for example): the more examples you give, the more accurate the answers will be on new data.
The new data should not be completely different, but should lie between the training examples; then interpolation will go more or less well. A single tree will return the nearest training example. If by "other data" you mean data outside the boundaries of the training set, that is already extrapolation: the tree will return the boundary example, because it is the closest one.

If we consider market data, then with a large amount of noise any peak from the true pattern will be mixed in with noise peaks, and we somehow need to pick the true peak rather than a noise peak.
Your statements are correct here.
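The interpolation-vs-extrapolation point can be sketched with a nearest-example lookup standing in for a single tree leaf (a toy illustration; the held-out cell and the `nearest` helper are invented): inside the training range the nearest memorised example is roughly right, but outside it the model can only return a boundary example.

```python
# Memorised multiplication table with one interior cell held out
train = [((a, b), a * b) for a in range(10) for b in range(10)
         if (a, b) != (4, 6)]

def nearest(q):
    # Answer with the closest memorised example, like a tree leaf would
    dist = lambda item: (item[0][0] - q[0]) ** 2 + (item[0][1] - q[1]) ** 2
    return min(train, key=dist)[1]

inside = nearest((4, 6))     # interpolation: a neighbour of the missing cell
outside = nearest((12, 12))  # extrapolation: nearest point is the corner (9, 9)
```

`inside` comes out as a neighbouring product near the true 24, while `outside` is 81 (the 9x9 corner) against a true 144: extrapolation just repeats the boundary example.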

 
Forester #:

If we consider the case of pattern-governed data (the multiplication table, for example): the more examples you give, the more accurate the answers will be on new data.
The new data should not be completely different, but should lie between the training examples; then interpolation will go more or less well. A single tree will return the nearest training example. If by "other data" you mean data outside the boundaries of the training set, that is already extrapolation: the tree will return the boundary example, because it is the closest one.

If we consider market data, then with a large amount of noise any peak from the true pattern will be mixed in with noise peaks, and we somehow need to pick the true peak rather than a noise peak.
Your statements are correct here.

We know nothing about the absence or presence of patterns. We are simply doing the training and discussing exactly that.

We are talking about the general approach and what it involves (the "magic"). The magic of learning is counter-intuitive to the layman, because people don't grasp it :)

Why it is important not to overtrain, why it is important not to undertrain, why it is important to reduce the number of features and parameters, and so on.
 
Ivan Butko #:
Expand on the following thought:

Task 1:
There is arithmetic, and one of its operations is multiplication. And there are the numbers 0 to 9. Learn the rule of multiplication, build a multiplication table by multiplying the numbers by each other, and learn the table.

Task 2:
There is a multiplication table. Here it is
...
...
...
Learn it.


In the second variant, the learner does not know the multiplication rule but knows the correct answers.


Are both learning?

If not, how would you categorise (describe) such activities?

Both are learning. In the first case, a rule/law is learnt. In the second case, the answers from the first are memorised.
Naturally, learning rules, formulas and laws is more effective, because with one small formula you can obtain millions of answers without memorising them.
There have been threads on this forum about laws of the market and grails. Maybe there is a law, but the noise drowns it out :(
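Forester's point about one small formula yielding millions of answers is easy to make concrete (a trivial sketch, with the `rule`/`lookup` names invented for the example): the student who learnt the rule can answer outside the table, while the student who only memorised the table cannot.

```python
def rule(a, b):
    # The "rule" student: knows the multiplication procedure itself
    return a * b

# The "table" student: memorised 100 answers and nothing else
table = {(a, b): a * b for a in range(10) for b in range(10)}

def lookup(a, b):
    return table.get((a, b))  # None for anything not in the table

assert rule(12, 12) == 144     # the rule extends beyond the training range
assert lookup(12, 12) is None  # the memoriser has no answer at all
```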

 
Forester #:

Both are learning. In the first case, a rule/law is learnt. In the second case, the answers from the first are memorised.
Naturally, learning rules, formulas and laws is more effective, because with one small formula you can obtain millions of answers without memorising them.
There have been threads on this forum about laws of the market and grails. Maybe there is a law, but the noise drowns it out :(

Noise again.

Everyone talks about noise.

But how can we define noise if we do not know the rules and laws?

What if every tick is a component of those rules and laws, and the problem is the inability of our architectures to decipher the "code" of the chart?

It seems to be taken as a postulate here (the idea that a price chart contains noise).

 

Learning requires correct training material. Modern ML models are too good to complain about.

Supervised learning does not involve searching for sparse patterns in the data at all. It requires them to be present in meaningful quantities in the training sample.