Discussing the article: "Quantization in machine learning (Part 1): Theory, sample code, analysis of implementation in CatBoost"

 

Check out the new article: Quantization in machine learning (Part 1): Theory, sample code, analysis of implementation in CatBoost.

The article covers the theory behind quantization in the construction of tree-based models and examines the quantization methods implemented in CatBoost. No complex mathematical equations are used.

So what is quantization and why is it used? Let's figure it out!

First, let's talk a little about the data. To create models (that is, to train them), we need data carefully collected into a table. The source of such data can be any information capable of explaining the target variable (the one determined by the model, for example, a trading signal). Data sources go by different names - predictors, features, attributes or factors. The frequency of data rows is determined by how often an observation of the phenomenon occurs - the process about which information is being collected and which will be studied using machine learning. The collected data as a whole is called a sample.

A sample can be representative - when the observations recorded in it describe the entire process of the phenomenon under study - or non-representative, when there is only as much data as it was possible to collect, allowing merely a partial description of that process. As a rule, when dealing with financial markets, we are dealing with non-representative samples, because everything that could happen has not yet happened. For this reason, we do not know how a financial instrument will behave in case of new events (ones that have not occurred before). However, everyone knows the saying "history repeats itself". It is this observation that the algorithmic trader relies on in their research, hoping that among the new events there will be some similar to previous ones, and their outcomes will be similar with the identified probability.
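To make the idea of quantization concrete before diving into the article: quantization replaces raw feature values with bin indices defined by a set of borders. Below is a minimal illustrative sketch of uniform (equal-width) binning in pure Python; the function names are my own, not CatBoost's API.

```python
from bisect import bisect_left

def uniform_borders(values, n_bins):
    """Split the value range into n_bins equal-width intervals and
    return the n_bins - 1 interior borders (illustrative sketch,
    similar in spirit to a 'Uniform' border-selection mode)."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_bins
    return [lo + step * i for i in range(1, n_bins)]

def quantize(value, borders):
    """Map a raw feature value to its bin index via the borders."""
    return bisect_left(borders, value)

# Toy price series quantized into 4 bins.
prices = [1.05, 1.07, 1.10, 1.21, 1.30, 1.42, 1.55, 1.60]
borders = uniform_borders(prices, 4)       # 3 interior borders
bins = [quantize(p, borders) for p in prices]
```

The model then works with the small set of bin indices instead of the raw continuous values, which shrinks memory use and limits the number of candidate splits a tree has to consider.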

Author: Aleksey Vyazmikin

 

Typos:

3. Saving quantization tables to the specified file - key "--input-borders-file"

4. Loading quantisation tables from the specified file - key "--output-borders-file"

These are swapped - the keys should be the other way round.
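For context, the two keys exchange quantization (border) tables between runs: one writes the computed borders to a file, the other reads them back so a later run reuses the same bins. A minimal sketch of round-tripping such a table, assuming a tab-separated "feature_index&lt;TAB&gt;border_value" layout (the exact file format is my assumption here, not taken from the article):

```python
import io

# Hypothetical borders table: feature index -> list of border values.
borders = {0: [1.1875, 1.325], 2: [0.5]}

# Write it out, one "feature<TAB>border" pair per line,
# as a save-borders key would.
buf = io.StringIO()
for feat, vals in borders.items():
    for b in vals:
        buf.write(f"{feat}\t{b}\n")

# Read it back, as a load-borders key would.
loaded = {}
for line in buf.getvalue().splitlines():
    feat, b = line.split("\t")
    loaded.setdefault(int(feat), []).append(float(b))
```

Reusing a saved table guarantees that training and later experiments quantize features identically, which is the point of having both keys.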

 
Quantisation in machine learning is not a quantum neural network (nor is it quantum neural network training).
 
Stanislav Korotky #:

Typos:

These are swapped - the keys should be the other way round.

Thank you!

 
Sergey Pavlov #:
Quantisation in machine learning is not a quantum neural network (nor is quantum neural network training).

Where is this asserted? Does the word "quantisation" seem to mislead and distort expectations?

 
Thanks for the article, interesting!
 
Andrey Dik #:
Thanks for the article, interesting!

Very glad to hear it!

 
Very interesting article! Can I add you as a friend? I am new to ML. I try to code models and save them in ONNX, but I get complete nonsense or just simple memorisation of historical data(
 
Yevgeniy Koshtenko #:
Very interesting article!

Thank you!

Yevgeniy Koshtenko #:
Can I add you as a friend? I am new to ML. I try to code models and save them in ONNX, but I get complete nonsense or just simple memorisation of historical data(

Added you, although anyone can write to me - there are no restrictions on my profile.
