What to feed to the input of the neural network? Your ideas... - page 58
Generalisation is more like under-learning, i.e. it was memorised, but not with absolute accuracy (the neighbours got dragged into it too...). Almost like a schoolboy with a C grade))
But if we memorise something governed by a law (for example Ohm's law), there will be no overlearning; it is easier to end up under-learned when only a few examples are given out of the infinitely many possible ones.
For trading, where patterns are almost non-existent and noisy, absolutely accurate memorisation of the data together with the noise will result in a loss. For some reason this has been called overlearning. Accurate memorisation is not harmful in itself, as in the case of learning a law-governed pattern; memorising noise/rubbish, on the other hand, brings no benefit.
What is wrong with the usual definition of learning - assigning specific values to the model parameters?
The usual definition of learning as assigning specific values to model parameters may be insufficient for several reasons:
Incompleteness of the process description: Model training involves not only the assignment of values to parameters, but also the process of optimising these parameters based on the data. This process may include selection of an optimisation algorithm, tuning of hyperparameters, selection of a loss function and other aspects that are not covered by simple value assignment.
Ignoring learning dynamics: Model training is a dynamic process that may involve many iterations and steps. Simple value assignment does not capture this iterative nature, where parameters are gradually adjusted to minimise error.
Lack of data context: Model training is data-driven and the training process involves analysing and interpreting that data. Simply assigning values does not take into account how the data is used to train the model and how it affects the final parameters.
Failure to account for generalisation: The goal of model training is not only to minimise error on the training data, but also to give the model the ability to generalise its knowledge to new, unseen data. Simply assigning values does not capture this aspect of generalisation.
Ignoring validation and testing: The training process also involves validating and testing the model to evaluate its performance and avoid overtraining. Simply assigning values does not account for these important steps.
Thus, a more complete definition of model learning should include a data-driven parameter optimisation process, taking into account learning dynamics, data context, generalisation ability and validation and testing steps.
About the training...
A couple of years ago I came across this expression on an ordinary (non-technical) site: databases based on neural networks.
On the whole I accepted this term for myself. I work with trees myself - a tree-based database is just as applicable a term.
1 leaf in a tree = 1 row in a database.
Differences:
Advantages of trees over databases: generalisation and quick search for the required leaf - no need to go through a million rows, the leaf can be reached through several splits.
Clustering generalises too: k-means by the proximity of examples to the cluster centre, other methods in other ways. You can also split with the maximum number of clusters = the number of examples, and you get an analogue of a database/leaves without generalisation.
Bottom line: tree learning = memorising/recording the examples, just like a database. If you stop the splitting/learning before the most accurate memorisation possible, you memorise with generalisation. Neural networks are harder to understand and grasp, but in essence they are also a database, just not as obvious as leaves and clusters.
Andrew of course wants to bring it round to the idea that learning is optimisation. No - it is memorisation. Optimisation is also present: you can optimise over variants of learning depth, split methods, etc., and each optimisation step will train a different model. But learning is not optimisation; it is memorisation.
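To make the "tree = database" and "clusters = rows" analogy concrete, here is a minimal sketch in Python with scikit-learn (the toy data, sizes and parameters are my own assumptions, not anything from the thread): a tree split to the very end recalls the training rows exactly, roughly one leaf per row, while a depth-limited tree merges neighbouring rows into shared leaves.
```python
# Sketch: a tree grown to the last split "memorises" training rows like database rows,
# a depth-limited tree merges neighbouring rows into shared leaves (generalisation).
# The toy data and parameters here are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))            # 200 "rows" with 2 features
y = X[:, 0] * X[:, 1] + rng.normal(0, 1, 200)    # target with a little noise

full_tree = DecisionTreeRegressor(random_state=0).fit(X, y)                 # split to the very end
short_tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)   # stop splitting early

# The full tree ends up with about one leaf per row: exact recall of the training set.
print("leaves (full): ", full_tree.get_n_leaves(), " train R2:", full_tree.score(X, y))
# The shallow tree has far fewer leaves: each leaf averages a group of neighbouring rows.
print("leaves (short):", short_tree.get_n_leaves(), " train R2:", short_tree.score(X, y))

# Clustering analogue: number of clusters = number of examples is again a lookup table.
km = KMeans(n_clusters=len(X), n_init=1, random_state=0).fit(X)
print("inertia with clusters == examples:", km.inertia_)   # ~0: every point is its own centre
```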
Maximum training quality will be achieved with absolutely accurate memorisation, i.e. when all the data is recorded in the database in full, when a tree is trained down to the very last possible split, or when clustering uses a number of clusters equal to the number of examples.
Trees that stop splitting earlier, or clustering with fewer clusters, will generalise and merge data in the leaves/clusters. These will be undertrained models, but in the presence of noise they may be more successful than models with exact recall.
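A rough toy illustration of that point (the data generator and tree depths below are my own assumptions): when the label is mostly noise with a weak pattern, a tree split to the end scores perfectly on the training set but loses out of sample, while an early-stopped tree holds up better on new data.
```python
# Sketch: with noisy labels, exact memorisation wins in-sample and loses out-of-sample.
# The data generator and depths are illustrative assumptions.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)

def make_noisy(n):
    X = rng.normal(size=(n, 5))
    signal = (X[:, 0] > 0).astype(int)                 # weak pattern in one feature
    noise = rng.integers(0, 2, n)
    y = np.where(rng.random(n) < 0.4, noise, signal)   # 40% of labels are pure noise
    return X, y

X_tr, y_tr = make_noisy(1000)
X_te, y_te = make_noisy(1000)

full = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)                 # split to the end
short = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_tr, y_tr)   # undertrained

print("full tree  train/test:", full.score(X_tr, y_tr), full.score(X_te, y_te))
print("short tree train/test:", short.score(X_tr, y_tr), short.score(X_te, y_te))
```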
There was an example at the beginning of the ML thread of teaching a random forest the multiplication table. Since it was not fed every possible combination for training, the forest sometimes gives exact answers, but mostly approximate ones. Obviously it is undertrained, but it is able to generalise by finding and averaging the individual trees' answers closest to the correct one.
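That multiplication-table experiment is easy to reproduce in the same spirit; a rough sketch of what I assume the setup looked like (the original thread's exact code is not given here): train a forest on only part of the table and ask it about pairs it has never seen.
```python
# Sketch of the multiplication-table experiment: the forest has only seen part of the
# table, so on unseen pairs it can only average the nearest memorised answers.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)

# Full table 1..20 x 1..20, keep a random half of it for training.
pairs = np.array([(a, b) for a in range(1, 21) for b in range(1, 21)])
answers = pairs[:, 0] * pairs[:, 1]
mask = rng.random(len(pairs)) < 0.5

forest = RandomForestRegressor(n_estimators=200, random_state=0)
forest.fit(pairs[mask], answers[mask])

# A few pairs the forest never saw: answers are approximate, rarely exactly a*b.
for a, b in pairs[~mask][:3]:
    print(f"{a} x {b} = {a * b}, forest says {forest.predict([[a, b]])[0]:.1f}")
```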
With learning in noise it is hard to evaluate quality, especially if the noise is much stronger than the patterns, as in trading.
For this purpose evaluation on validation and test samples, cross-validation, walk-forward testing and so on were invented.
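And for the walk-forward part, a minimal sketch using scikit-learn's TimeSeriesSplit (the synthetic "returns" series and the model are placeholders I made up): each fold trains on the past and is scored on the following, never-seen segment, which is the only score that says anything on noisy, market-like data.
```python
# Sketch: walk-forward style evaluation - train on the past, test on the next segment.
# The synthetic "returns" series and the model are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import TimeSeriesSplit

rng = np.random.default_rng(7)
returns = rng.normal(0, 1, 2000)                     # mostly noise, like market returns

# Features: the last 5 returns; target: direction of the next return.
X = np.column_stack([np.roll(returns, k) for k in range(1, 6)])[5:-1]
y = (returns[6:] > 0).astype(int)

scores = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[test_idx], y[test_idx]))   # accuracy on the future segment

print("walk-forward accuracies:", np.round(scores, 3))      # ~0.5 on pure noise
```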
What is wrong with the usual definition of learning - assigning specific values to model parameters?
It doesn't capture the essence. You can assign any kind of gibberish and nonsense.
If we start from the opposite (memorisation/remembering), then learning is the identification of certain patterns through which you can create or identify new knowledge.
As an example: ChatGPT writes poems on an arbitrary topic.
Maximising the quality of training means maximising the quality of predictions on new data. No one is interested in predictions on the training sample - they are already known. That is not learning, it is approximation, and approximation is not called learning.