Machine learning in trading: theory, models, practice and algo-trading - page 1956

 

By the way, who's been to the ICP webinar? Did I post a link to it here earlier?

Funny, they want to develop a device and software for typing by thought by December. Overall I liked what they talked about, and it was interesting! Who was there?

 
Renat Akhtyamov:

I can't say for sure, but I suspect that an analog number is immediately a value, while a digital number must first be converted into one.

The only thing is that the processor runs warm, and I'm afraid their analog values will drift over time (introduce error), and on top of that there will be a lot of noise.

Actually, analog computation allows errors and noise to be reduced to zero.

A simple example: high-end audio equipment is precisely tube technology, chosen to avoid sampling and, consequently, loss of accuracy.

 
Mihail Marchukajtes:

By the way, who's been to the ICP webinar? Did I post a link to it here earlier?

Funny, they want to develop a device and software for typing by thought by December. Overall I liked what they talked about, and it was interesting! Who was there?

We have already invented a cool motor for electric bicycles here and sold the patent abroad.

All the more so since typing by thought and chipping are satanic and against the church; you could get burned at the stake for that.

 
Aleksey Vyazmikin:

I wanted to point out that it is not the data that will differ, but the outcome. But now I'm thinking: what if we actually add, to the samples, information about the surrounding data that was not involved in building the model? Take a tree as an example. Before the last split, we look at the statistics of the other predictors, identify those that show a statistically positive classification variant, and check that, in the overall sample, these predictors are not correlated with the predictor used in the last split. Then we have additional information about market conditions that is not taken into account when selecting the last split. Trees, after all, are built on the greedy principle, and the last split is essentially a competition among the predictors.

Then our split will be not just A > X, but A > X1 && B > X2 && C > X3, i.e. we will take the surrounding information into account.

At each split, not just the last one, the best split over all the predictors is chosen: an outer loop runs through the predictors, a nested loop divides each of them in different ways and remembers how much cleaner the data became with each split. Then the best one is taken.
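As a rough illustration of that greedy search (a minimal sketch of my own, not taken from any post here; the function name and the choice of Gini impurity as the "cleanliness" measure are assumptions):

```python
import numpy as np

def best_split(X, y):
    """Exhaustive greedy split search: try every predictor and every
    threshold, keep the split that most reduces Gini impurity."""
    def gini(labels):
        if len(labels) == 0:
            return 0.0
        p = np.bincount(labels) / len(labels)
        return 1.0 - np.sum(p ** 2)

    n, m = X.shape
    parent = gini(y)
    best = (None, None, 0.0)           # (predictor index, threshold, gain)
    for j in range(m):                 # outer loop: predictors
        for t in np.unique(X[:, j]):   # nested loop: candidate thresholds
            mask = X[:, j] <= t
            left, right = y[mask], y[~mask]
            if len(left) == 0 or len(right) == 0:
                continue
            # how much "cleaner" the data became with this split
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            if parent - child > best[2]:
                best = (j, t, parent - child)
    return best
```

For example, on X = np.random.rand(200, 5) with integer labels y = (X[:, 2] > 0.5).astype(int), it should recover predictor 2 with a threshold near 0.5.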

On the article: the key phrase there is: "'Analog' technology allows you to get almost the same result when performing matrix-vector multiplication, with the assumption of less accuracy than when using the data as digital 0s and 1s."
Accuracy will drift depending on the ambient temperature (tundra vs. desert), on how hot the chip itself gets after power-up, and over time the parameters of some elements drift as well. Plus, noise will change the values: out in a field there is little of it, under power lines some, in a city more, near radio transmitters more still, etc.
For rough calculations ±20% may be fine.
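To get a feel for that figure, here is a quick simulation (my own sketch; the drift levels are assumptions, not numbers from the article) of how per-element drift in an "analog" weight matrix propagates to the output of a matrix-vector multiply:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))   # the "analog" weight matrix
x = rng.standard_normal(64)
exact = W @ x                       # exact digital reference

# crude analog model: every element of W drifts by a few percent
for drift in (0.01, 0.05, 0.20):
    noisy = (W * (1.0 + rng.normal(0.0, drift, W.shape))) @ x
    rel_err = np.linalg.norm(noisy - exact) / np.linalg.norm(exact)
    print(f"element drift {drift:.0%} -> output error ~{rel_err:.1%}")
```

With independent per-element drift, the output error stays roughly on the order of the drift itself, which is consistent with the "fine for rough calculations" verdict above.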

 
Aleksey Vyazmikin:
What are the analog computations in the article? Did anyone understand?

These are not computations but measurements of analog signals right on the board: the signals add and subtract by themselves, and then you can multiply and divide, which is of course faster than counting, because it is a single operation; then comes digitization. I think the structure is: analog inputs, then an analog adder or divider, and at the end a meter in a certain cell, with many such cells and access to them. Like charge-coupled devices: the matrices in cameras, laser rangefinders. They receive an analog signal from a photocell at the input, measure the signal, so to speak, and output a digital one.

Correct: the cell ends with an ADC and begins with a DAC. A CCD outputs a 0 or 1 from the cell, so it is not quite true that they measure something.

 
elibrarius:

At each split, not just the last one, the best split over all the predictors is chosen: an outer loop runs through the predictors, a nested loop divides each of them in different ways and remembers how much cleaner the data became with each split. Then the best one is taken.

Still, it's interesting how hard it can be to get an idea across :) Of course, everyone here knows what you have written; it's not news. I'm talking specifically about keeping measurements of parallel predictor values at the last split: not all of them, but those whose results are close to the selected one while not correlating with each other on the overall sample. In that case the decision in the subarea will be made not on the basis of one split, but taking into account the other splits that support it. I do something similar now when grouping leaves, but my candidates are uncontrolled, whereas here it would be done deliberately for all leaves.


elibrarius:

On the article: the key phrase there is: "'Analog' technology allows you to get almost the same result when performing matrix-vector multiplication, with the assumption of less accuracy than when using the data as digital 0s and 1s."

Accuracy will drift depending on the ambient temperature (tundra vs. desert), on how hot the chip itself gets after power-up, and over time the parameters of some elements drift as well. Plus, noise will change the values: out in a field there is little of it, under power lines some, in a city more, near radio transmitters more still, etc.
For rough calculations ±20% may be fine.

The temperature can be stabilized, and if you know the dependence on temperature changes you can apply corrections to the intermediate digital results and at the output.

 
Andrey Dik:

Something like tube transistors, it seems, only very small.

Some kind of operation on waves, in essence? The incoming data is converted to a polynomial, then the polynomial is converted to a wave, and the waves somehow "collide/merge"?

 
Valeriy Yastremskiy:

These are not computations but measurements of analog signals right on the board: the signals add and subtract by themselves, and then you can multiply and divide, which is of course faster than counting, because it is a single operation; then comes digitization. I think the structure is: analog inputs, then an analog adder or divider, and at the end a meter in a certain cell, with many such cells and access to them. Like charge-coupled devices: the matrices in cameras, laser rangefinders. They receive an analog signal from a photocell at the input, measure the signal, so to speak, and output a digital one.

Correct: the cell ends with an ADC and begins with a DAC. A CCD outputs a 0 or 1 from the cell, so it is not quite true that they measure something.

Earlier I posted here about a neural network in the form of a transparent plate with refractions inside: light enters from one side, is redistributed according to the laws of optics, and exits in another place. I think something similar should be at work here, which would really give a higher data-processing speed.

But how to do analog computations involving raising to a power is not clear to me with the ADC...

 
Aleksey Vyazmikin:

Still, it's interesting how hard it can be to get an idea across :) Of course, everyone here knows what you have written; it's not news. I'm talking specifically about keeping measurements of parallel predictor values at the last split: not all of them, but those whose results are close to the selected one while not correlating with each other on the overall sample. In that case the decision in the subarea will be made not on the basis of one split, but taking into account the other splits that support it. I do something similar now when grouping leaves, but my candidates are uncontrolled, whereas here it would be done deliberately for all leaves.

That's right. It would be good to describe the sequence of actions right away...
Having thought about your description again, I assume the following sequence:

1. Calculate the correlation of all predictors on the training set.
2. Build the tree.
3. At the last split, remember, say, the 100 best splits. Reserve up to 100 so there is plenty to choose from.
4. From those 100, choose 5 that are uncorrelated with the predictor of the best split and uncorrelated with each other.

Beyond that, it is not clear which of these 5 different splits to choose.
If at random, it will be an analogue of random forest, which gives each tree random predictors and builds the tree on them.
If by averaging, then again it is analogous to random forest: the forest takes the arithmetic mean of the final prediction from the random trees' forecasts.
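A minimal sketch of steps 1 and 4 of this sequence (my own illustration; the function name, the 0.5 correlation threshold, and the candidate format are assumptions):

```python
import numpy as np

def pick_supporting_splits(X, candidates, best_j, corr_thresh=0.5, k=5):
    """From the reserved candidate splits, keep up to k whose predictors
    are weakly correlated with the best split's predictor and with each
    other; the threshold is illustrative, not prescribed."""
    corr = np.corrcoef(X, rowvar=False)     # step 1: predictor correlations on train
    chosen = []
    for j, threshold, gain in candidates:   # candidates sorted by gain, best first
        if abs(corr[j, best_j]) >= corr_thresh:
            continue                        # too correlated with the best split
        if any(abs(corr[j, c]) >= corr_thresh for c, _, _ in chosen):
            continue                        # too correlated with an already chosen one
        chosen.append((j, threshold, gain))
        if len(chosen) == k:
            break
    return chosen   # the B > X2, C > X3, ... parts of the composite rule
```

Combining the chosen splits as a conjunction (A > X1 && B > X2 && ...) is what would distinguish the idea from simply averaging them, which, as noted above, collapses into a random-forest analogue.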

 
Aleksey Vyazmikin:

Earlier I posted here about a neural network in the form of a transparent plate with refractions inside: light enters from one side, is redistributed according to the laws of optics, and exits in another place. I think something similar should be at work here, which would really give a higher data-processing speed.

But how to do analog computations involving raising to a power is not clear to me with the ADC...

No, the analog operations are addition, subtraction, multiplication, division, and possibly more complex logarithmic and power relationships. And these are not computations but analog gauges in every cell. The DACs and ADCs are the input and output; they take no part in the computation, they provide the digital side.

In the von Neumann architecture both procedures and data are stored in memory, and there is no parallel access to procedures and data: you access the data, then the procedure, then the data again, hence the limits on data processing. Here, the procedure is stored in each cell by a small device, so it is accessed at once, together with the data.