Discussing the article: "Ordinal Encoding for Nominal Variables"

 

Check out the new article: Ordinal Encoding for Nominal Variables.

In this article, we discuss and demonstrate how to convert nominal predictors into numerical formats that are suitable for machine learning algorithms, using both Python and MQL5.

Nominal variables represent categorical data where no inherent order or ranking exists between the categories. Examples specific to financial time series datasets might include:

  • Price bar types (e.g., pin bar, spinning top, hammer)
  • Days of the week (e.g., Monday, Tuesday, Wednesday)

These variables are purely qualitative, meaning there is no implied hierarchy or sequence among the categories. For instance, a pin bar formation is not inherently superior to a spinning top, nor is a bullish bar better than a bearish bar.

In numerical computing, it is common practice to assign arbitrary integers to distinct categories. However, if these integers are used as inputs to a machine learning algorithm, there is a risk that the assigned values may distort the information conveyed by the original data. The algorithm might incorrectly infer that larger values imply a certain relationship or ranking, even if none was intended.


Author: Francis Dube