Machine learning in trading: theory, models, practice and algo-trading - page 1963

 
Maxim Dmitrievsky:

several D-neurons (like a grid)

error, % = 45.10948905109489

goodbye )

I sent the author of the grid my cuts and my indignation by mail.
What did you determine? The authenticity of the banknotes?
 
Valeriy Yastremskiy:
What did you determine? The authenticity of the banknotes?

Yes

 
Maxim Dmitrievsky:

yes

Flawed logic.
 
Valeriy Yastremskiy:
Flawed logic.

There may be pitfalls. For example, you can't use negative values in the features, because he uses binarized ones in his microtests. The scanty description says nothing about this, and it doesn't report any errors.

 
Maxim Dmitrievsky:

There may be pitfalls. For example, you can't use negative values in the features, because he uses binarized ones in his microtests. The scanty description says nothing about this, and it doesn't report any errors.

Flaws often arise from some implicit features, and detecting them in seemingly correct logic is quite a problem.
 
Weights on one side, binary features on the other. That's what it comes down to.
 

An interesting neural-network approach to collaborative filtering

You can take trading instruments and strategies instead of user and movie IDs, and some metric (expectation, etc.) instead of ratings. Then compute latent variables for the instrument and the strategy. After that, anything you like: select a system for an instrument, generate one on the fly with the required characteristics, build synthetics for a system...
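For illustration, the latent variables for an instrument/strategy matrix could be fitted with plain matrix factorization. Everything below is a made-up sketch, not the poster's setup: the data is synthetic, the rank and learning rate are arbitrary, and the metric is assumed to be something like expectation.

```python
import numpy as np

# Hypothetical setup: rows are trading instruments, columns are strategies,
# entries are an observed quality metric (e.g. the strategy's expectation on
# that instrument); NaN marks untested pairs. All values are illustrative.
rng = np.random.default_rng(0)
M = rng.normal(size=(6, 2)) @ rng.normal(size=(2, 5))  # true rank-2 metric matrix
R = M.copy()
R[rng.random(M.shape) < 0.3] = np.nan                  # hide ~30% of the pairs

k = 2                                            # number of latent factors
P = rng.normal(scale=0.1, size=(R.shape[0], k))  # instrument factors
Q = rng.normal(scale=0.1, size=(R.shape[1], k))  # strategy factors
lr, reg = 0.05, 0.01
obs = np.argwhere(~np.isnan(R))                  # observed (instrument, strategy) cells

for epoch in range(1000):                        # plain SGD on observed cells only
    for i, j in obs:
        err = R[i, j] - P[i] @ Q[j]
        P[i] += lr * (err * Q[j] - reg * P[i])
        Q[j] += lr * (err * P[i] - reg * Q[j])

pred = P @ Q.T   # reconstructs the metric, including for untested pairs
```

The filled-in `pred` matrix is what would let you "select a system for an instrument": for a given instrument row, pick the strategy column with the best predicted metric.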

 
Maxim Dmitrievsky:
I sent the author of the grid my cuts and my indignation by mail

I wonder what he wrote back.

 
mytarmailS:

I wonder what he wrote back.

Nothing so far. There has to be some regularity in the samples, that's the whole point. It's a different approach. I think it should be trained on regular (ordered) sets. That is, the lower the entropy in the series, the better the result, while in that dataset the samples are randomly shuffled. With non-stationary data, it is not so much the patterns themselves that matter as their sequence.
 
elibrarius:
We mix the cleanest split with the less clean ones. That is, we worsen the result on the train set, which in principle is not important for us. But it is also not a given that it will improve the result on the test set, i.e. generalizability. Someone should try it... Personally, I don't think the generalization will be any better than a random forest's.

It's much easier to limit the depth of the tree and not do the last split, stopping at the previous one. We end up with the same less pure leaf as if we had done the extra split. Your option will give something in between doing the split and not doing it. That is, with your method you effectively average a leaf at the 7th depth level; it will be slightly purer than a leaf at the 6th depth level. I don't think generalization will change much from this, and it's a lot of work to test the idea. You could also average a few trees with depth levels 6 and 7; you'd get about the same as with your methodology.
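The "average a few trees with depth levels 6 and 7" alternative can be sketched roughly like this with scikit-learn. The data is synthetic and all parameters are illustrative, not the poster's actual setup:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary task; in the thread this would be price-based features.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# A depth-7 tree has purer leaves than a depth-6 one; averaging their
# probabilities gives an "in between" leaf estimate, roughly what mixing
# the last split with the previous one would produce.
t6 = DecisionTreeClassifier(max_depth=6, random_state=0).fit(X, y)
t7 = DecisionTreeClassifier(max_depth=7, random_state=0).fit(X, y)

proba = (t6.predict_proba(X) + t7.predict_proba(X)) / 2
pred = (proba[:, 1] > 0.5).astype(int)
```

Whether this averaging actually generalizes better than a plain random forest is exactly the open question in the discussion above.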

I probably didn't clarify earlier that a leaf should contain at least 1% of the observations on small samples, and at least 100 on large ones, so of course the split won't go all the way to zero error in the leaf on any class.

You seem to misunderstand the last step. I see it as a statistical evaluation on the remaining 1% of the sample: on this sample we observe how the result improves with splits by different predictors, and we obtain information about the subspace, for example:

If A > x1, then target 1 is correctly predicted 40% of the time, covering 60% of the subsample

If B > x2, then target 1 is correctly predicted 55% of the time, covering 45% of the subsample

If A <= x1, then target 1 is correctly predicted 70% of the time, covering 50% of the subsample

Each such split has a significance coefficient (I haven't decided how to compute it yet), and so does the last split.

And so on, say up to 5-10 predictors. Then, at application time, if we reach the last split, we sum the coefficients (or use a more sophisticated summation method), and if the sum exceeds a threshold, the leaf is classified as 1, otherwise as 0.
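The coefficient-summing step might look like the sketch below. The rules, their significance coefficients, and the threshold are all made-up placeholders, since the post deliberately leaves the coefficient formula open:

```python
# Each rule: (predictor index, direction, split value, significance coefficient).
# Values are illustrative; the post does not fix how coefficients are computed.
rules = [
    (0, ">",  1.0, 0.40),   # "if A > x1 ..." style rule on predictor 0
    (1, ">",  2.0, 0.55),   # "if B > x2 ..." on predictor 1
    (0, "<=", 1.0, 0.70),   # "if A <= x1 ..." on predictor 0
]

def classify(sample, rules, threshold=0.8):
    """Sum the coefficients of the rules the sample satisfies;
    label 1 if the sum exceeds the threshold, else 0."""
    total = 0.0
    for idx, op, split, coef in rules:
        v = sample[idx]
        if (op == ">" and v > split) or (op == "<=" and v <= split):
            total += coef
    return 1 if total > threshold else 0
```

For example, a sample with A = 1.5 and B = 2.5 triggers the first two rules (0.40 + 0.55 = 0.95 > 0.8), so it is classified as 1, while a sample with A = 0.5 and B = 0.0 triggers only the third rule (0.70) and stays at 0.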


A simple way to implement this is to forcibly build the forest up to the penultimate split, and then exclude the already-selected predictors from the sample so that new ones get chosen. Or, simply after building the tree, filter the sample by leaf and go through each predictor on its own in search of the best split that meets the completeness and accuracy criteria.

Also, the result on the training sample will improve if the other class, "0", means no action rather than an opposite entry; otherwise there can be either improvement or deterioration.