Machine learning in trading: theory, models, practice and algo-trading - page 3400
how do you figure that out?
Like finding out which method of feature selection is best?
In fact, there are only two types: exhaustive search and heuristic search methods (discrete optimisation).
Full search is always better, but not always possible if there are a lot of features. Besides, we are looking not for the single best feature but for the best subset of features; on a more or less normal dataset a complete search is impossible because of the combinatorial explosion, so we use heuristics - discrete optimisation (with no guarantee that the best solution has been found).
There is one good package. I haven't tested it deeply and I don't know the mathematics, but the authors claim that the algorithm finds the best subset in polynomial time (very fast), i.e. it is neither a complete search nor a heuristic. I've used it a little, and I think the package does what they say. So basically it is the leader among selection methods.
https://abess-team.github.io/abess/
I think there's one for Python, too.
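For reference, a minimal sketch of how such a best-subset call might look in Python, assuming the abess package exposes a scikit-learn-style LogisticRegression estimator (the class name and import path are an assumption here and may differ between versions, so check the docs):

import numpy as np
from abess import LogisticRegression  # assumed import path; older versions name the class differently

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 100))                  # 100 candidate features
y = (X[:, 0] - 2.0 * X[:, 1] > 0).astype(int)    # only features 0 and 1 matter

model = LogisticRegression()                     # best-subset selection for a binary target
model.fit(X, y)

selected = np.flatnonzero(np.ravel(model.coef_))  # non-zero coefficients = chosen subset
print("selected features:", selected)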
====================================
And the point is not even efficient selection of features (although that is necessary), but the generation of candidate features. That is the most important thing: they are the sensors, the eyes and ears of the trading system.
This abess is rubbish.
It gives the best set of features on the history by the criterion of minimising the classification error.
And the problem is the predictive ability of the features, not the classification error. The available set of features has a certain predictive ability, and a certain classification error corresponds to it; that error is a given for that set of features. If you want to reduce the classification error, look for another set of features with higher predictive ability.
how do you figure that out?
Like finding out which method of feature selection is best?
That's right. Take a dozen samples and check the efficiency of building models on selected predictors for each sample.
Full search is always better, but not always possible if there are a lot of features.
This is of course obvious. That's why heuristics are interesting.
There is one good package. I haven't tested it deeply and I don't know the mathematics, but the authors claim that the algorithm finds the best subset in polynomial time (very fast), i.e. it is neither a complete search nor a heuristic. I've used it a little, and I think the package does what they say. So basically it is the leader among selection methods.
I've run it on a sample - it's been five hours already - I'm tired of waiting. As far as I understand, it is suited more to regression (including logistic regression for classification) and is not universal.
GPT suggests code like this in R for selecting and saving the excluded predictors. Here I limited the number of predictors to 50 and decided to wait again.
It excluded all the columns :) Either that's how the method works, or there's a bug in the code... Or maybe the predictors really are that bad - I'll add more.
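The code itself is not shown in the thread; roughly, the idea of fitting abess and saving the excluded (zero-coefficient) predictors could be sketched like this, in Python rather than R, with the abess estimator, file name and column names all assumed for illustration:

import numpy as np
import pandas as pd
from abess import LogisticRegression  # assumed scikit-learn-style estimator; check the abess docs

data = pd.read_csv("train.csv")               # hypothetical file and column names
y = data["target"].to_numpy()
X = data.drop(columns=["target"])

model = LogisticRegression()
model.fit(X.to_numpy(), y)

coefs = np.ravel(model.coef_)                 # zero coefficient = predictor excluded
excluded = X.columns[coefs == 0]
kept = X.columns[coefs != 0]

pd.Series(excluded).to_csv("excluded_predictors.csv", index=False)
print(f"kept {len(kept)} predictors, excluded {len(excluded)}")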
how many columns are in the data?
A little over 2000
I ran it on a sample - it's been five hours - tired of waiting.
Is it so difficult to read the code example on the method's page? Why mess around with GPT without understanding what is being done?
It should run in a minute, not 5 hours.
Send me the sample data and I'll take a look.
And it is suitable for both regression and classification - it says so, and there are examples... what is it with people.
It's all written down.
What's wrong with the code? It hangs for me even on the help example in the studio :)
Even the screenshot says "logistic regression" - of course I looked at the examples, which for some reason are already in Python.
Try this one:
binary classification
1,000 rows
50,000 features/columns
The best feature subset was found in less than 3 seconds.
All the features relevant to the target were found, and none of the 50,000 noisy ones.
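The script behind that test is not posted; a sketch of a similar synthetic check, under the same assumed abess API as above, might look like this (the numbers mirror the description, not the original code):

import numpy as np
from abess import LogisticRegression  # assumed import, as above

rng = np.random.default_rng(42)
n, n_noise = 1000, 50_000
signal = rng.normal(size=(n, 5))                        # 5 informative features
noise = rng.normal(size=(n, n_noise))                   # 50,000 irrelevant columns (~0.4 GB as float64)
X = np.hstack([signal, noise])
y = (signal @ np.array([2.0, -1.5, 1.0, 0.5, -0.5]) > 0).astype(int)

model = LogisticRegression()
model.fit(X, y)

picked = np.flatnonzero(np.ravel(model.coef_))          # indices of the selected subset
print("picked columns:", picked)                        # ideally only 0..4 survive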
1) What is wrong with the code? It hangs for me even on the help example in the studio :)
2) It says "logistic regression" even on the screenshot - of course I looked at the examples, which for some reason are already in Python.
1) EVERYTHING is wrong. What is the lasso doing there? That is regression, and you fed it data for classification.
2) It says CLASSIFICATION: Titanic data, binary target, survivor/non-survivor.
Logistic regression is a classification algorithm, e.g. it is used to classify texts.
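To illustrate the distinction with plain scikit-learn (not abess): for a 0/1 target you fit a classifier such as logistic regression, whereas a lasso is a penalised linear regression that treats the labels as continuous numbers.

import numpy as np
from sklearn.linear_model import Lasso, LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = (X[:, 0] > 0).astype(int)             # 0/1 target -> a classification problem

clf = LogisticRegression().fit(X, y)      # classifier: predicts class labels / probabilities
reg = Lasso(alpha=0.1).fit(X, y)          # lasso: penalised linear regression, treats 0/1 as numbers

print(clf.predict(X[:3]))                 # class labels (0 or 1)
print(reg.predict(X[:3]))                 # real-valued predictions, not labels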