How do you practically assess the contribution of a "specific" input to the NS? - page 3

 
alexeymosc:
There's also the opposite situation: theoretically there may be two inputs with high informativeness and one with low. Logically you would want to remove the third one, but if you do, the complex four-way relationship (three inputs plus the output) is destroyed, and the two remaining inputs are no longer as informative.

So I do it on a ready-made example, so to speak, and immediately see how the result of the NS changes both on the training sample and outside it. If something collapses, it should show up in the final result. I removed one input and there was no degradation; I removed another and the result degraded by less than 1%; I removed a third and it degraded by 10%. Then the same with combinations of 2 inputs, 3 inputs, and so on.

I have only been doing this for a couple of hours, but I have already found one blank input that completely duplicated another (the result of a mistake), and two inputs whose impact is at most a few tenths of a percent. I think these three inputs are genuinely unnecessary.

I also found two inputs whose exclusion does not merely leave the result unchanged, which would be understandable, but actually improves it, which is unexpected. I should experiment with them further: these inputs are clearly not empty, and their influence on the result, even if it works in the opposite direction, proves that.

Thanks to all of you, I have really received some useful advice.
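A minimal sketch of the leave-one-out / leave-group-out procedure described above, assuming a hypothetical train_and_score(X, y) helper that retrains the network from scratch on the given input columns and returns an out-of-sample score (the helper name, the data layout, and the scoring convention are assumptions, not something from this thread):

```python
import itertools

def ablation_report(X, y, train_and_score, max_group_size=2):
    """Retrain the net with input columns removed, singly and in small groups,
    and report how much the out-of-sample score degrades versus the baseline."""
    n_inputs = X.shape[1]
    baseline = train_and_score(X, y)          # score with all inputs present
    report = {}
    for k in range(1, max_group_size + 1):
        for dropped in itertools.combinations(range(n_inputs), k):
            keep = [i for i in range(n_inputs) if i not in dropped]
            score = train_and_score(X[:, keep], y)
            report[dropped] = baseline - score  # positive = removal hurts
    return baseline, report

# Usage (hypothetical data and model wrapper):
# baseline, report = ablation_report(X, y, train_and_score, max_group_size=2)
# for dropped, degradation in sorted(report.items(), key=lambda kv: kv[1]):
#     print(dropped, f"{degradation:+.4f}")
```

Inputs whose removal leaves the score unchanged (or improves it) are the candidates for throwing out, exactly as described in the post above.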

 
Figar0:

So I do it on a ready-made example, so to speak, and immediately see how the result of the NS changes both on the training sample and outside it. If something collapses, it should show up in the final result. I removed one input and there was no degradation; I removed another and the result degraded by less than 1%; I removed a third and it degraded by 10%. Then the same with combinations of 2 inputs, 3 inputs, and so on.


This is the most reliable way of selecting inputs - brute force. It's hard, but it's honest. Good luck!
 

For 20 inputs, a clean sweep is something like 2^20 combinations, i.e. a million.

Information theory comes to mind again, but I'm not going to advise anything.

 
Mathemat:

For 20 inputs, a clean sweep is something like 2^20 combinations, i.e. a million.

So you can keep going through them "from here until dinner" or "until you get bored"...
And after that, perhaps, a genetic search run.
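A rough sketch of such a genetic run over binary input masks, reusing the same hypothetical train_and_score(X, y) wrapper as in the ablation sketch above (population size, mutation rate and generation count are arbitrary choices, not anything recommended in the thread):

```python
import random

def genetic_input_search(X, y, train_and_score,
                         pop_size=20, generations=30,
                         mutation_rate=0.05, seed=0):
    """Evolve 0/1 on-off masks over the input columns.
    Fitness is whatever train_and_score returns (higher is better).
    Note: fitness retrains the net, so real use would cache results per mask."""
    rng = random.Random(seed)
    n_inputs = X.shape[1]

    def fitness(mask):
        keep = [i for i in range(n_inputs) if mask[i]]
        if not keep:
            return float("-inf")              # an empty input set is useless
        return train_and_score(X[:, keep], y)

    # random initial population of binary masks
    population = [[rng.randint(0, 1) for _ in range(n_inputs)]
                  for _ in range(pop_size)]

    for _ in range(generations):
        scored = sorted(population, key=fitness, reverse=True)
        parents = scored[:pop_size // 2]       # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_inputs)
            child = a[:cut] + b[cut:]          # one-point crossover
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            children.append(child)
        population = parents + children

    best = max(population, key=fitness)
    return best, fitness(best)
```

With 20 inputs this evaluates a few hundred masks instead of the full 2^20, at the cost of not guaranteeing the optimum.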
 

It is possible to determine how "unneeded" an input is. The closer a neuron's weight is to 0, the more "unnecessary" it is. Essentially, the neuron's value is multiplied by 0, and whatever it holds turns into 0, i.e. the input is as if it does not exist at all.

The main disadvantage of keeping such an unnecessary neuron is the needlessly increased training time.

But such "unnecessary" neurons may occur not only in the input layer of the grid, but in any of its layers.

The search for unnecessary neurons can be automated after a trial training: take the absolute value of each weight and, if it is below some threshold, set it to zero. Then see which neurons are left with only zero weights, exclude them from the network and retrain; training will be noticeably faster and the result will be practically the same. And, of course, use this thinned-out network from then on.
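A rough sketch of that thresholding step for a single fully connected layer, assuming the weights are stored as a NumPy matrix of shape (n_inputs, n_neurons); the threshold value here is arbitrary:

```python
import numpy as np

def prune_small_weights(W, threshold=1e-3):
    """Zero every weight whose magnitude is below the threshold and report
    which input rows ended up with no non-zero outgoing weights at all."""
    W_pruned = np.where(np.abs(W) < threshold, 0.0, W)
    dead_inputs = np.flatnonzero(np.all(W_pruned == 0.0, axis=1))
    return W_pruned, dead_inputs

# Usage (hypothetical first-layer weight matrix W1):
# W1_pruned, dead = prune_small_weights(W1, threshold=1e-3)
# print("inputs that can be removed before retraining:", dead)
```

The same check can be repeated layer by layer to find "dead" neurons deeper in the network.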

 
joo:

It is possible to determine how "unneeded" an input is. The closer a neuron's weight is to 0, the more "unnecessary" it is. Essentially, the neuron's value is multiplied by 0, and whatever it holds turns into 0, i.e. the input is as if it does not exist at all.


This is true. But where would an input that always equals zero come from? There can be no such thing.

Most likely, in this case we are talking about a signal that is incommensurably smaller than the other signals. That is easily corrected by scaling the signals.

The signal from SSI will be millions of times larger than the signal from OsMA. Such signals are incommensurable and cannot be used without bringing them to the same scale.
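A minimal example of bringing incommensurable signals onto one scale before feeding them to the network: simple per-column min-max scaling to [-1, 1], with the scaling parameters taken from the training sample only (the variable names are placeholders, not anything from the thread):

```python
import numpy as np

def fit_minmax(X_train):
    """Remember the per-column minimum and maximum from the training sample."""
    lo = X_train.min(axis=0)
    hi = X_train.max(axis=0)
    return lo, hi

def apply_minmax(X, lo, hi):
    """Map each column linearly to [-1, 1]; zero-range columns land at -1."""
    span = np.where(hi > lo, hi - lo, 1.0)    # guard against division by zero
    return 2.0 * (X - lo) / span - 1.0

# Usage: scale training and out-of-sample data with the same parameters,
# so that a price-scale indicator and an oscillator become comparable.
# lo, hi = fit_minmax(X_train)
# X_train_scaled = apply_minmax(X_train, lo, hi)
# X_test_scaled  = apply_minmax(X_test, lo, hi)
```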

 
joo:


But, such "unnecessary" neurons can end up not only in the input layer of the mesh, but in any layer of the mesh at all.


This cannot happen if the transformations in the neurons are non-linear.
 
mersi:

This is true. But where would an input that always equals zero come from? There can be no such thing.

Most likely, in this case we are talking about a signal that is incommensurably smaller than the other signals. That is easily corrected by scaling the signals.

The signal from SSI will be millions of times larger than the signal from OsMA. Such signals are incommensurable and cannot be used without bringing them to the same scale.

I thought that for every neural network practitioner, bringing (scaling) the signals into a single range suitable for feeding to the network was as familiar as the "Our Father", but I see I was wrong. :)

So, the signals are scaled and vary, say, in the range [-1.0, 1.0]. But one of the input neurons has a weight of 0. What does that mean? It means that the network does not care what value arrives at this input; the network's result does not depend on it.

mersi:
This cannot happen if the transformations in the neurons are non-linear.

It can very well happen. And it often does when the inner layers contain more neurons than are needed to solve the problem.

 
joo:

I thought that for every neural network practitioner, bringing (scaling) the signals into a single range suitable for feeding to the network was as familiar as the "Our Father", but I see I was wrong. :)

So, the signals are scaled and vary, say, in the range [-1.0, 1.0]. But one of the input neurons has a weight of 0. What does that mean? It means that the network does not care what value arrives at this input; the network's result does not depend on it.

It can very well happen. And it often does when the inner layers contain more neurons than are needed to solve the problem.

At first glance this assertion seems sound.

However, the data from input Xi is fed to several neurons at once, and not all of their synapses are necessarily zero, so excluding input Xi may completely change the network's output.
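That objection can be checked directly: an input is only a removal candidate if all of its outgoing first-layer weights are near zero, not just one of them. A small sketch, again assuming a (n_inputs, n_neurons) weight matrix:

```python
import numpy as np

def input_relevance(W1):
    """Largest absolute outgoing weight per input column of the first layer.
    Only inputs whose value here is near zero are candidates for removal."""
    return np.abs(W1).max(axis=1)

# Example with a hypothetical 3-input, 2-neuron layer:
# W1 = np.array([[0.0,  0.9],    # input 0: one synapse is large -> keep
#                [0.0,  0.0],    # input 1: all synapses zero -> removable
#                [0.4, -0.3]])   # input 2: keep
# print(input_relevance(W1))     # [0.9, 0.0, 0.4]
```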

--------------

The more neurons in the network, the more accurate and complex the problems the neural network is able to solve.

Network developers themselves limit the number of neurons, trading sufficient accuracy of the result against acceptable training time, because the number of epochs required to train the network grows as a power of the number of neurons.

 
Figar0:

Not quite Friday, but ...

There is an NS, any NS, with an input A = {A1, A2, ..., A20}. We train the NS and get a satisfactory result. How do we practically evaluate the contribution of each element A1, A2, ..., A20 of the input to this result?

The options off the top of my head are:

1) Somehow sum up and account for all the weights with which the element passes through the network. It's not quite clear to me how to do this; I would have to dig into the network's internals and compute some coefficients, etc.

2) Try to "zero out" somehow, or e.g. reverse an element of input vector and see how it affects the final result. So far I've settled on it.

But before implementing this second option I decided to ask for advice first. Perhaps someone has thought about this for longer than I have? Maybe someone can recommend a book or an article?
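A minimal sketch of option 2, assuming an already trained network exposed through a hypothetical predict(X) function and an error(y_true, y_pred) metric: each input column is zeroed (or sign-inverted) in turn and the change in error is recorded, with no retraining at all.

```python
import numpy as np

def input_sensitivity(X, y, predict, error, mode="zero"):
    """Perturb one input column at a time on a fixed, already trained network
    and measure how much the error grows compared to the unperturbed data."""
    base_err = error(y, predict(X))
    effects = []
    for i in range(X.shape[1]):
        X_mod = X.copy()
        if mode == "zero":
            X_mod[:, i] = 0.0            # "zero out" the input
        else:
            X_mod[:, i] = -X_mod[:, i]   # or invert its sign
        effects.append(error(y, predict(X_mod)) - base_err)
    return base_err, np.array(effects)   # large positive value = important input
```

Unlike the retraining ablation discussed earlier in the thread, this measures sensitivity of the fixed network, so it is much cheaper but can understate inputs that the network would compensate for after retraining.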

Applying a solid, science-based apparatus outside its native econometric context raises some naive questions.

Doing a regression:

Profit = s(1)*A(1) + ... + s(n)*A(n)

We estimate the coefficients of this regression.

Immediately we get:

the probability that a specific coefficient is equal to zero - such an input gets deleted;

the probability that all the coefficients taken together are equal to zero;

from the ellipses (correlation ellipses), the correlation coefficients;

a test for redundant inputs;

a test for missing inputs;

a test for the stability of the coefficient values (an assessment of their randomness).
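A compact version of that econometric check using ordinary least squares, assuming the candidate inputs sit in a matrix A and the target is the profit series; the per-coefficient p-values answer "is this coefficient zero?", the F-test answers "are they all zero together?", and the correlation matrix flags redundant inputs (the use of statsmodels is an assumption here, the thread names no library):

```python
import numpy as np
import statsmodels.api as sm

def regression_screen(A, profit):
    """OLS regression of profit on the candidate inputs plus a constant."""
    X = sm.add_constant(A)                 # adds the intercept column
    fit = sm.OLS(profit, X).fit()
    print(fit.summary())                   # per-coefficient t-tests and p-values
    print("F-test p-value (all coefficients zero together):", fit.f_pvalue)
    print("correlation matrix of the inputs (redundancy check):")
    print(np.corrcoef(A, rowvar=False))
    return fit

# Inputs whose coefficient has a high p-value (e.g. > 0.05) are candidates
# for deletion; strongly correlated columns point at redundant inputs.
```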