Machine learning in trading: theory, models, practice and algo-trading - page 2827

 
mytarmailS #:
https://youtu.be/4yrp71kAZxU

So what's so interesting about it?

 
When training neural nets, how sure are you that it isn't stuck in some local optimum?
 
Andrey Dik #:
When training neural nets, how sure are you that it isn't stuck in some local optimum?

In neural nets, splitting the data into batches helps to avoid that;

with other optimisation algorithms it also helps to run the optimisation several times, or likewise to split the data into batches, for example,

plus adjusting the gradient step (learning rate) and other tricks.

You should still explore the neighbourhood of the optimum afterwards, by varying the hyperparameters, to see how robust the system is.
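For illustration, a minimal Python sketch of the "run the optimisation several times" idea: restart a local optimiser from several random points and keep the best result. The toy objective, the bounds and the use of scipy.optimize.minimize are illustrative assumptions, not anything from this thread.

```
# Multi-start optimisation: restart a local optimiser from several random
# points and keep the best result, reducing the chance of reporting a
# local optimum as the answer.
import numpy as np
from scipy.optimize import minimize

def objective(x):
    # Toy multi-modal function with many local minima (placeholder).
    return np.sum(x**2) + 3.0 * np.sum(np.sin(3.0 * x))

rng = np.random.default_rng(0)
best = None
for _ in range(20):                      # 20 random restarts
    x0 = rng.uniform(-5.0, 5.0, size=2)  # random starting point
    res = minimize(objective, x0, method="L-BFGS-B")
    if best is None or res.fun < best.fun:
        best = res

print("best value:", best.fun, "at", best.x)
```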
 
Splitting an optimisation algorithm into batches????? )
 
Maxim Dmitrievsky #:

In neural nets, splitting the data into batches helps to avoid that;

with other optimisation algorithms it also helps to run the optimisation several times, or likewise to split the data into batches, for example,

plus adjusting the gradient step (learning rate) and other tricks.

You should still explore the neighbourhood of the optimum afterwards, by varying the hyperparameters, to see how robust the system is.

I'm embarrassed to ask, what are batches?

No, I mean, how can you be sure that the net doesn't get stuck somewhere? Is it tested for resistance to getting stuck?

 
Andrey Dik #:

I'm embarrassed to ask, what are batches?

No, I mean, how can you be sure that the net doesn't get stuck somewhere? Is it tested for resistance to getting stuck?

It's tested on new data, through early stopping, for example:

when the error keeps dropping on the training data but starts growing on the new data. As long as it isn't growing on the new data, the model isn't stuck yet.
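For illustration, a minimal Python sketch of early stopping on a toy linear model; the data, the model and the patience threshold are illustrative assumptions.

```
# Early stopping: keep training while the error on held-out ("new") data
# keeps improving; stop once it has not improved for `patience` epochs.
import numpy as np

rng = np.random.default_rng(1)

# Toy data: noisy linear relationship, split into training and validation sets.
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=1.0, size=200)
X_tr, y_tr, X_val, y_val = X[:150], y[:150], X[150:], y[150:]

w = np.zeros(5)
lr, patience = 0.01, 10
best_err, best_w, bad_epochs = np.inf, w.copy(), 0

for epoch in range(1000):
    grad = 2.0 * X_tr.T @ (X_tr @ w - y_tr) / len(y_tr)  # gradient step on training data
    w -= lr * grad
    val_err = np.mean((X_val @ w - y_val) ** 2)          # error on "new" data
    if val_err < best_err:
        best_err, best_w, bad_epochs = val_err, w.copy(), 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:                       # validation error stopped improving
            print(f"early stop at epoch {epoch}, best validation MSE {best_err:.3f}")
            break

w = best_w  # keep the weights from the best validation point
```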

Batches are chunks of data for training: you don't have to train on the whole dataset at once, you can split the data into chunks and train on one chunk at each iteration.

Since the optimum of each chunk is slightly different, the average comes out as something less than optimal for any single one of them.
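And a minimal sketch of mini-batch training in the same spirit: split the dataset into chunks and take one gradient step per chunk each epoch. The toy data, batch size and learning rate are illustrative assumptions.

```
# Mini-batch training: split the dataset into chunks ("batches") and take
# one gradient step per chunk instead of one step on the whole dataset.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=1000)

w = np.zeros(3)
lr, batch_size = 0.05, 32

for epoch in range(20):
    idx = rng.permutation(len(y))              # reshuffle the rows each epoch
    for start in range(0, len(y), batch_size):
        b = idx[start:start + batch_size]      # one batch of rows
        grad = 2.0 * X[b].T @ (X[b] @ w - y[b]) / len(b)
        w -= lr * grad                         # step towards this batch's optimum

print("estimated weights:", w)
```

Each batch pulls the weights towards its own optimum, so the end result is a compromise between them, which is the averaging effect described above.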

It's kind of hard to judge what the optimum even is in principle. And if the sample is shifted, what do you do? How do you find that global optimum in a shifted subsample?

Well, if the question is about optimisation algorithms in principle, you're right to ask. But then you start shifting samples, which creates more problems than that question,

if you go from theory to practice a little bit :)

 
Maxim Dmitrievsky #:

It's tested on new data, through early stopping, for example:

when the error keeps dropping on the training data but starts growing on the new data. As long as it isn't growing on the new data, the model isn't stuck yet.

Batches are chunks of data for training: you don't have to train on the whole dataset at once, you can split the data into chunks and train on one chunk at each iteration.

Since the optimum of each chunk is slightly different, the average comes out as something less than optimal for any single one of them.

It's kind of hard to judge what the optimum even is in principle. And if the sample is shifted, what do you do? How do you find that global optimum in a shifted subsample?

Well, if the question is about optimisation algorithms in principle, you're right to ask. But then you start shifting samples, which creates more problems than that question,

if you go from theory to practice a little bit :)

Ah, so I understood your first answer correctly: there is no way to test resistance to getting stuck.

What you describe, "the error drops on the training data and starts to grow on the new data", is not a test for getting stuck, just a criterion for stopping training.

My question is about the optimisation algorithms people here use to train neural nets, not about improving the stability of the nets on new data, which is the second stage. Well, the first stage hasn't been discussed here at all yet))))

 
Andrey Dik #:

Ah, so I understood your first answer correctly: there is no way to test resistance to getting stuck.

What you describe, "the error drops on the training data and starts to grow on the new data", is not a test for getting stuck, just a criterion for stopping training.

My question is about the optimisation algorithms people here use to train neural nets, not about improving the stability of the nets on new data, which is the second stage. Well, the first stage hasn't been discussed here at all yet))))

Well, sort of, yes, it hasn't been discussed. The most popular one in neural nets is the Adam optimisation algorithm. Maybe you can test it somehow too.
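For reference, a minimal Python sketch of the Adam update rule on a toy quadratic objective; the objective, learning rate and step count are illustrative assumptions, and the beta/epsilon values are the commonly used defaults.

```
# Adam: per-parameter first and second moment estimates of the gradient,
# with bias correction, used to scale each update step.
import numpy as np

def grad(x):
    return 2.0 * (x - np.array([1.0, -3.0]))  # gradient of ||x - target||^2

x = np.zeros(2)
m = np.zeros(2)            # first moment (running mean of gradients)
v = np.zeros(2)            # second moment (running mean of squared gradients)
lr, beta1, beta2, eps = 0.1, 0.9, 0.999, 1e-8

for t in range(1, 501):
    g = grad(x)
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g**2
    m_hat = m / (1 - beta1**t)                # bias correction
    v_hat = v / (1 - beta2**t)
    x -= lr * m_hat / (np.sqrt(v_hat) + eps)

print("converged to:", x)                     # should end up near [1, -3]
```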
 
Maxim Dmitrievsky #:
Well, sort of, yes, it hasn't been discussed. The most popular one in neural nets is the Adam optimisation algorithm. Maybe you can test it somehow too.

There you go, so it hasn't been discussed at all.

In practice this means the net will end up undertrained, i.e. the error on new data will start growing earlier than it would with an optimisation algorithm that is more resistant to getting stuck.

 
Yeah, I'll have a look at Adam at my leisure and run some tests.