What to feed to the input of the neural network? Your ideas... - page 58

 
Forester #:

Generalisation is more like under-learning: the examples were remembered, but not with absolute accuracy (the neighbouring examples got mixed in too...). Almost like a schoolboy with a C grade ))

But if we memorise something governed by a law (Ohm's law, for example), there will be no over-learning; it is easier to end up under-learned when only a few of the infinitely many possible examples are available.

In trading, where patterns are weak and noisy, absolutely accurate memorisation together with the noise will result in a loss.
For some reason this has been called over-learning. Accurate memorisation is not harmful in itself, as in the case of learning a real pattern. Memorising noise/rubbish, on the other hand, brings no benefit.
Generalisation is a balance between under- and over-learning :) A rough example from life: you learnt Maxwell's formula well but could not apply it in practice, that is over-learning. You knew Maxwell's formula exists, did not remember how it is written, but in practice remembered about it, looked it up again and applied it. That is generalisation (learning), not wasted years at university.
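That balance can be shown with a minimal sketch (my own illustration, not from the post above; the sine "law" and the noise level are assumptions): a fully grown tree memorises the noise along with the pattern and loses on fresh data to a deliberately under-learned one.

```python
# Illustrative sketch: "memorising noise" vs generalising.
# The pattern is a sine wave; the rest is noise. A fully grown tree reproduces the
# noisy training set exactly, a depth-limited tree usually does better on new data.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X_train = rng.uniform(0, 10, size=(200, 1))
y_train = np.sin(X_train[:, 0]) + rng.normal(0, 0.5, size=200)
X_new = rng.uniform(0, 10, size=(200, 1))
y_new = np.sin(X_new[:, 0]) + rng.normal(0, 0.5, size=200)

full = DecisionTreeRegressor().fit(X_train, y_train)                # exact memorisation, noise included
shallow = DecisionTreeRegressor(max_depth=3).fit(X_train, y_train)  # stops before memorising noise

print("full tree   : train R^2 =", round(full.score(X_train, y_train), 3),
      " new data R^2 =", round(full.score(X_new, y_new), 3))
print("shallow tree: train R^2 =", round(shallow.score(X_train, y_train), 3),
      " new data R^2 =", round(shallow.score(X_new, y_new), 3))
```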
 
What's wrong with the usual definition of learning - assigning specific values to model parameters?
 
Aleksey Nikolayev #:
What is wrong with the usual definition of learning - assigning specific values to the model parameters?
You can ask the model itself :)

The usual definition of learning as assigning specific values to model parameters may be insufficient for several reasons:

  1. Incompleteness of the process description: Model training involves not only the assignment of values to parameters, but also the process of optimising these parameters based on the data. This process may include selection of an optimisation algorithm, tuning of hyperparameters, selection of a loss function and other aspects that are not covered by simple value assignment.

  2. Ignoring learning dynamics: Model training is a dynamic process that may involve many iterations and steps. Simple value assignment does not capture this iterative nature, where parameters are gradually adjusted to minimise error.

  3. Lack of data context: Model training is data-driven and the training process involves analysing and interpreting that data. Simply assigning values does not take into account how the data is used to train the model and how it affects the final parameters.

  4. Failure to account for generalisation: The goal of model training is not only to minimise error on the training data, but also to enable the model to generalise its knowledge to new, unseen data. Simply assigning values does not capture this aspect of generalisation.

  5. Ignoring validation and testing: The training process also involves validating and testing the model to evaluate its performance and avoid overtraining. Simply assigning values does not account for these important steps.

Thus, a more complete definition of model learning should include a data-driven parameter optimisation process, taking into account learning dynamics, data context, generalisation ability and validation and testing steps.
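For concreteness, a minimal sketch of what hides behind "assigning values to parameters" (an assumed toy setup, not code from the thread): a loss function, an optimisation loop and a validation check.

```python
# Minimal sketch: training is iterative optimisation of parameters against a loss
# on data, with a validation check - not a one-off assignment of values.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + rng.normal(0, 0.1, size=500)

X_train, y_train = X[:400], y[:400]
X_val, y_val = X[400:], y[400:]

w = np.zeros(3)                      # parameters start from arbitrary values
lr = 0.1                             # hyperparameter chosen by the practitioner
for step in range(200):              # learning dynamics: many small corrections
    grad = 2 * X_train.T @ (X_train @ w - y_train) / len(y_train)
    w -= lr * grad                   # gradient step on the squared-error loss
    if step % 50 == 0:
        val_loss = np.mean((X_val @ w - y_val) ** 2)   # generalisation check
        print(f"step {step}: validation MSE = {val_loss:.4f}")

print("learned parameters:", w)
```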

 
In general, I wonder why respected experts start discussing a complex and interesting topic without being properly oriented in it :)
 
Forester #:

About the training...

A couple of years ago I came across this expression on an ordinary (non-technical) site: databases based on neural networks.
On the whole I accepted this term for myself. I work with trees myself, and a tree-based database is an equally applicable term.
1 leaf in a tree = 1 row in a database.

Differences:

1 row in the database contains only 1 example from the data stored in the database.

1 leaf contains:

1) 1 example and all exactly identical examples (when the tree is split as deeply as possible, down to the last distinguishing difference)

or

2) 1 example and the exactly identical examples + the most similar examples, if splitting stops earlier. This is called generalisation of examples.
Which examples count as similar is defined differently by different algorithms when selecting tree splits.

Advantages of trees over databases: generalisation and quick search for the required leaf - no need to go through a million rows, the leaf can be reached through several splits.

Clustering generalises too. K-means does it by the proximity of examples to the cluster centre, other methods do it differently. You can also set the maximum number of clusters = number of examples and you will get an analogue of the database/leaves without generalisation.

Neural networks are harder to understand and interpret, but in essence they are also a database, just not as obviously as leaves and clusters.

Bottom line: training a tree = memorising/recording examples, just like in a database. If you stop splitting/training before the most accurate possible memorisation, you memorise with generalisation.

Andrey of course wants to bring up the point that learning is optimisation. No, it is memorisation. But optimisation is also present: you can optimise over variants of training depth, split methods, etc. Each optimisation step will train a different model. But learning is not optimisation. It's memorisation.
If you only knew how much nonsense you've been saying with a smart look.

But I don't have the time or inclination to explain it.
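Forester's tree-as-database analogy is easy to make concrete. A minimal sketch (my illustration, not his code; the toy data is an assumption): a tree grown until every leaf holds one example behaves like a lookup table, and K-means with as many clusters as examples does the same.

```python
# Illustration of the "tree as database" analogy.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeRegressor

X = np.array([[1.], [2.], [3.], [4.], [5.]])
y = np.array([10., 20., 30., 40., 50.])

# Split down to the last difference: every leaf stores exactly one example,
# so predicting a training point is just looking up its "row".
lookup_tree = DecisionTreeRegressor(min_samples_leaf=1, max_depth=None).fit(X, y)
print(lookup_tree.predict(X))         # [10. 20. 30. 40. 50.] - exact recall

# Stop splitting earlier: similar examples are merged in one leaf ("generalisation").
merged_tree = DecisionTreeRegressor(max_depth=1).fit(X, y)
print(merged_tree.predict(X))         # neighbouring rows are averaged together

# K-means with number of clusters = number of examples is the same database analogue.
km = KMeans(n_clusters=len(X), n_init=10, random_state=0).fit(X)
print(km.cluster_centers_.ravel())    # each centre is just one stored example
```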
 
Forester #:

About the training...

A couple of years ago I came across this expression on an ordinary (non-technical) site: databases based on neural networks.
On the whole I accepted this term for myself. I work with trees myself, and a tree-based database is an equally applicable term.
1 leaf in a tree = 1 row in a database.

Differences:

1 row in the database contains only 1 example from the data stored in the database.

1 leaf contains:

1) 1 example and all exactly identical examples (when the tree is split as deeply as possible, down to the last distinguishing difference)

or

2) 1 example and the exactly identical examples + the most similar examples, if splitting stops earlier. This is called generalisation of examples.
Which examples count as similar is defined differently by different algorithms when selecting tree splits.

Advantages of trees over databases: generalisation and fast search for the required leaf - no need to go through a million rows, the leaf can be reached through several splits.

Clustering generalises too. K-means does it by the proximity of examples to the cluster centre, other methods do it differently. You can also set the maximum number of clusters = number of examples and you will get an analogue of the database/leaves without generalisation.

Neural networks are harder to understand and interpret, but in essence they are also a database, just not as obviously as leaves and clusters.

Bottom line: training a tree = memorising/recording examples, just like in a database. If you stop splitting/training before the most accurate possible memorisation, you memorise with generalisation.

Andrey of course wants to bring up the point that learning is optimisation. No, it is memorisation. But optimisation is also present: you can optimise over variants of training depth, split methods, etc. Each optimisation step will train a different model. But learning is not optimisation. It is memorisation.

and how is the quality of learning determined?
 
Andrey Dik #:

and how is the quality of learning determined?
The maximum quality of learning will be with absolutely accurate memorisation, i.e. with a complete record of all the data in the database, or with training a tree down to the very last possible split, or with clustering where the number of clusters = the number of examples.

Trees where splitting stops earlier, or clustering with fewer clusters, will generalise and merge data in the leaves/clusters. These are undertrained models, but in the presence of noise they may be more successful than models with exact recall.

There was an example at the beginning of the ML thread of teaching a forest the multiplication table. Since it was not fed every possible example for training, the forest sometimes gives exact answers, but mostly approximate ones. Clearly it is undertrained. But it is able to generalise, finding and averaging the individual trees' answers that are closest to the correct one.

With learning in noise it is hard to assess the quality, especially if the noise is much stronger than the patterns, as in trading.

For this purpose, evaluation on validation and test samples, cross-validation, walking forward, etc. were invented.
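A rough reconstruction of that multiplication-table experiment (my sketch; the original code from that thread is not shown here, so the split and the forest settings are assumptions): a forest trained on part of the table answers unseen products only approximately, by averaging its trees.

```python
# Sketch: a random forest "learns" the multiplication table from a subset of it
# and answers the held-out products approximately rather than exactly.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
a, b = np.meshgrid(np.arange(1, 11), np.arange(1, 11))
X = np.column_stack([a.ravel(), b.ravel()]).astype(float)
y = (a * b).ravel().astype(float)

idx = rng.permutation(len(X))
train, test = idx[:70], idx[70:]          # hold out part of the table

forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X[train], y[train])
for xi, yi, pred in list(zip(X[test], y[test], forest.predict(X[test])))[:5]:
    print(f"{int(xi[0])} x {int(xi[1])} = {yi:.0f}, forest says {pred:.1f}")
```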
 
Forester #:
The maximum quality of learning will be with absolutely accurate memorisation, i.e. with a complete record of all the data in the database, or with training a tree down to the very last possible split, or with clustering where the number of clusters = the number of examples.

Trees where splitting stops earlier, or clustering with fewer clusters, will generalise and merge data in the leaves/clusters. These are undertrained models, but in the presence of noise they may be more successful than models with exact recall.

There was an example at the beginning of the ML thread of teaching a forest the multiplication table. Since it was not fed every possible example for training, the forest sometimes gives exact answers, but mostly approximate ones. Clearly it is undertrained. But it is able to generalise, finding and averaging the individual trees' answers that are closest to the correct one.

With learning in noise it is hard to evaluate the quality, especially if the noise is much stronger than the patterns, as in trading.

Maximising the quality of training means maximising the quality of predictions on new data. No one is interested in forecasts on the training sample, because the answers there are already known. That is no longer learning, it is approximation. You don't call approximation learning.

For example, a two-layer MLP is a universal approximator that can approximate an arbitrary function to any accuracy. Does that mean it is trained to the maximum quality? Of course not. Otherwise nobody would keep inventing other neural network architectures that, for specific tasks, are better precisely at learning rather than at fitting.

Weak, although you seem to have been on the topic for a long time.
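The point about approximation versus learning can be shown with a minimal sketch (my illustration, not Maxim's code; the pure-noise target is an assumption): a flexible MLP can approximate the training sample closely, yet on new data from the same source its predictions are worthless.

```python
# Sketch: near-perfect approximation of the training sample says nothing about
# prediction quality on new data. The target here is pure noise, so there is
# nothing to learn - only something to fit.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(3)
X_train = rng.uniform(-3, 3, size=(100, 1))
y_train = rng.normal(0, 1, size=100)
X_new = rng.uniform(-3, 3, size=(100, 1))
y_new = rng.normal(0, 1, size=100)

mlp = MLPRegressor(hidden_layer_sizes=(200,), solver="lbfgs",
                   max_iter=10000, random_state=0).fit(X_train, y_train)

print("R^2 on the training sample:", round(mlp.score(X_train, y_train), 3))  # typically high
print("R^2 on new data:           ", round(mlp.score(X_new, y_new), 3))      # near zero or negative
```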
 
Aleksey Nikolayev #:
What is wrong with the usual definition of learning - assigning specific values to model parameters?

It doesn't capture the essence. You can assign any kind of gibberish and nonsense.

If we start from the opposite (memorisation/remembering), then learning is the identification of certain patterns through which you can create or identify new knowledge.

As an example: ChatGPT writes poems on an arbitrary topic.

 
Maxim Dmitrievsky #:
Maximising the quality of training means maximising the quality of predictions on new data. No one is interested in forecasts on the training sample, because the answers there are already known. That is no longer learning, it is approximation. You don't call approximation learning.

For example, a two-layer MLP is a universal approximator that can approximate an arbitrary function to any accuracy. Does that mean it is trained to the maximum quality? Of course not. Otherwise nobody would keep inventing other neural network architectures that, for specific tasks, are better precisely at learning rather than at fitting.
So you've got to make up your mind.

Approximation is not learning, but a neural network is an approximator...

So a neural network doesn't learn?


One thinks a database is a classifier, the other gets confused with approximation...

What kind of experts are you? 😀