Machine learning in trading: theory, models, practice and algo-trading - page 3622

 
I don't know the first thing about graphs, unfortunately.
 
mytarmailS #:
I don't know the first thing about graphs, unfortunately.

reciprocally )

 
Maxim Dmitrievsky #:

reciprocally )

You wrote that you need to find correlation in the dataset.
When I tried to train models based on different regressions, I noticed that the higher the correlation between the features, the better the model trained.
And preprocessing the dataset by centring the data reduced the error.
You also write that the features still have to be found. Maybe a search for features by correlation could be added somehow?
And the PCA algorithm seems to reduce a large sample by selecting the principal components.
Just sharing a thought, just in case.
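
(A minimal sketch of the idea above, assuming the features sit in a pandas DataFrame X with a target Series y; the 0.1 threshold is arbitrary and every name here is illustrative, not the contest code.)

import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Screen features by absolute correlation with the target
corr = X.corrwith(y).abs()
selected = corr[corr > 0.1].index
X_sel = X[selected]

# Centring only (no scaling), as mentioned above
X_centered = StandardScaler(with_std=False).fit_transform(X_sel)

# PCA keeps the principal components explaining 95% of the variance
X_pca = PCA(n_components=0.95).fit_transform(X_centered)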

 
Roman #:

You wrote that you need to find correlation in the dataset.
When I tried to train models based on different regressions, I noticed that the higher the correlation between the features, the better the model trained.
And preprocessing the dataset by centring the data reduced the error.
You also write that the features still have to be found. Maybe a search for features by correlation could be added somehow?
And the PCA algorithm seems to reduce a large sample by selecting the principal components.
Just sharing a thought, just in case.

https://en.m.wikipedia.org/wiki/Correlation_does_not_imply_causation

That's what this contest is about.

Correlation gives at most 0.36 accuracy on new data. You can get 1.0 on the training data.
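
(For illustration only: a toy sketch of that gap, on purely synthetic data with random labels, so there is nothing real to learn; all names here are made up.)

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 20))      # synthetic features
y = rng.integers(0, 2, size=2000)    # random labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print(model.score(X_tr, y_tr))   # close to 1.0: the training set is memorised
print(model.score(X_te, y_te))   # near 0.5: nothing generalises to new data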
 
Maxim Dmitrievsky #:

Correlation gives at most 0.36 accuracy on new data.

I get a maximum of 0.274; I'm ashamed to submit such a result )))

On average, how much time do you spend on one training cycle?

 
Evgeni Gavrilovi #:

I get a maximum of 0.274; I'm ashamed to submit such a result )))

How much time do you spend on one training cycle?

The slowest part there is the markup; training itself is fast. I mark up 1/10 of the dataset first, for speed, and see what comes out :)

There's no point in submitting anything below 0.5; it no longer gets into the top 10.
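
(A sketch of marking up only a tenth of the data first; df and mark_up are hypothetical placeholders, not the actual markup code.)

# Hypothetical: df holds the raw dataset, mark_up() produces the labels
subset = df.iloc[: len(df) // 10]   # first tenth only, for a fast iteration
labels = mark_up(subset)            # inspect the result before labelling everything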
 
Maxim Dmitrievsky #:
The slowest part there is the markup; training itself is fast.

Is "fast" 5-10 minutes? For some people even an hour would seem fast )

 
Evgeni Gavrilovi #:

Is "fast" 5-10 minutes? For some people even an hour would seem fast )

Less than a minute in Colab, with CatBoost.
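
(A minimal CatBoost sketch of that setup; the parameters are illustrative, and X_train / y_train / X_test are assumed to come from the notebook discussed below.)

from catboost import CatBoostClassifier

# A few hundred iterations over a modest dataset typically finish
# well under a minute on a Colab instance
model = CatBoostClassifier(iterations=500, depth=6, verbose=False)
model.fit(X_train, y_train)
preds = model.predict(X_test)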

 
Evgeni Gavrilovi #:

Select the part of the dataset for markup and training, at the very end of the Computing section (X_train, y_train):

print(f"Creating X_y_group_train from {len(names_datasets_train)} datasets and graphs")
MAX_SAMPLES = 1000
#  Получаем первые MAX_SAMPLES ключей
first_keys_f = list(names_datasets_train.keys())[:MAX_SAMPLES]
#  Создаем новый словарь с первыми MAX_SAMPLES записями
first_dict_f = {k: names_datasets_train[k] for k in first_keys_f}
first_keys_l = list(names_graphs_train.keys())[:MAX_SAMPLES]
#  Создаем новый словарь с первыми MAX_SAMPLES записями
first_dict_l = {k: names_graphs_train[k] for k in first_keys_l}

X_y_group_train = create_all_columns(
    {
        pearson_correlation: first_dict_f,
        #  enhanced_pearson_correlation: first_dict_f,
        #  fast_regression_analysis: first_dict_f,
        #  ttest: first_10_dict_f,
        #  mutual_information: first_10_dict_f,  #  uncomment this line to add features but at high computational cost
        label: first_dict_l,
    },
    n_jobs=-1,
)
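
(One possible tidier variant of the key slicing above, using itertools.islice; the behaviour is the same.)

from itertools import islice

first_dict_f = dict(islice(names_datasets_train.items(), MAX_SAMPLES))
first_dict_l = dict(islice(names_graphs_train.items(), MAX_SAMPLES))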
 
Maxim Dmitrievsky #:

Select a part of the dataset for markup and training

Thanks, hadn't even thought of that