Comparing, sorting, and searching in arrays
The MQL5 API contains several functions that allow comparing and sorting arrays, as well as searching for the maximum, minimum, or any specific value in them.
int ArrayCompare(const void &array1[], const void &array2[], int start1 = 0, int start2 = 0, int count = WHOLE_ARRAY)
The function returns the result of comparing two arrays of built-in types or structures with fields of built-in types, excluding strings. Arrays of class objects are not supported. Also, you cannot compare arrays of structures that contain dynamic arrays, class objects, or pointers.
By default, the comparison is performed for entire arrays but, if necessary, you can specify parts of arrays, for which there are parameters start1 (starting position in the first array), start2 (starting position in the second array), and count.
Arrays can be fixed or dynamic, as well as multidimensional. During comparison, multidimensional arrays are represented as equivalent one-dimensional arrays (for example, for two-dimensional arrays, the elements of the second row follow the elements of the first, the elements of the third row follow the second, and so on). For this reason, the parameters start1, start2, and count for multidimensional arrays are specified through element numbering, and not an index along the first dimension.
Using various start1 and start2 offsets you can compare different parts of the same array.
Arrays are compared element by element until the first discrepancy is found or the end of one of the arrays is reached. The relationship between two elements (which are in the same positions in both arrays) depends on the type: for numbers, the operators '>', '<', '==' are used, and for strings, the StringCompare function is used. Structures are compared byte by byte, which is equivalent to executing the following code for each pair of elements:
uchar bytes1[], bytes2[];
|
Based on the ratio of the first differing elements, the result of bulk comparison of the arrays array1 and array2 is obtained. If no differences are found, and the length of the arrays is equal, then the arrays are considered the same. If the length is different, then the longer array is considered greater.
The function returns -1 if array1 is "less than" array2, +1 if array1 is "greater than" array2, and 0 if they are "equal".
In case of an error, the result is -2.
Let's look at some examples in the script ArrayCompare.mq5.
Let's create a simple structure for filling the arrays to be compared.
struct Dummy
|
The class fields are filled with random numbers (each time the script is run, we will receive new values).
In the OnStart function, we describe a small array of structures and compare successive elements with each other (as moving neighboring fragments of an array with the length of 1 element).
#define LIMIT 10
|
Below are the results for one of the array options (for the convenience of analysis, the column with the signs "greater than" (+1) / "less than" (-1) is added directly to the right of the contents of the array):
[x] [y] // result
|
Comparing the two halves of the array to each other gives -1:
// compare first and second half
|
Next, we will compare arrays of strings with predefined data.
string s[] = {"abc", "456", "$"};
|
Finally, let's check the ratio of array fragments:
PRT(ArrayCompare(s0, s1, 1, 1, 1)); // second elements (with index 1) are equal
|
bool ArraySort(void &array[])
The function sorts a numeric array (including possibly a multidimensional array) by the first dimension. The sorting order is ascending. To sort an array in descending order, apply the ArrayReverse function to the resulting array or process it in reverse order.
The function does not support arrays of strings, structures, or classes.
The function returns true if successful or false in case of error.
If the "timeseries" property is set for an array, then the elements in it are indexed in the reverse order (see details in section Array indexing direction as in timeseries), and this has an "external" reversal effect on the sorting order: when you process such an array directly, you will get descending values. At the physical level, the array is always sorted in ascending order, and that is how it is stored.
In the script ArraySort.mq5 a 10 by 3, 2-dimensional array is generated and sorted using ArraySort:
#define LIMIT 10
|
According to the log, the first column is sorted in ascending order (specific numbers will vary due to random generation):
Before sort
|
The values in the following columns have moved synchronously with the "leading" values in the first column. In other words, the entire rows are permuted, despite the fact that only the first column is the sorting criterion.
But what if you want to sort a two-dimensional array by a column other than the first one? You can write a special algorithm for that. One of the options is included in the file ArraySort.mq5 as a template function:
template<typename T>
|
The given function only works with dynamic arrays because the size of array is doubled to assemble intermediate results in the second half of the array, and finally, the first half (original) is removed with ArrayRemove. That is why the original test array in the OnStart function was distributed through ArrayResize.
We encourage you to study the sorting principle on your own (or turn over a couple of pages).
Something similar should be implemented for arrays with a large number of dimensions (for example, array[][][]).
Now recall that in the previous section, we raised the issue of sorting an array of structures by an arbitrary field. As we know, the standard ArraySort function is not able to do this. Let's try to come up with a "bypass route". Let's take the class from the ArraySwapSimple.mq5 file from the previous section as a basis. Let's copy it to ArrayWorker.mq5 and add the required code.
In the Worker::process method, we will provide a call to the auxiliary sorting method arrayStructSort, and the field to be sorted will be specified by number (how it can be done, we will describe below):
...
|
Now it becomes clear why all the previous modes (values of the mode parameter) in the process method were negative: zero and positive values are reserved for sorting and correspond to the "column" number.
The idea of sorting an array of structures is taken from sorting a two-dimensional array. We only need to somehow map a single structure to a one-dimensional array (representing a row of a two-dimensional array). To do this, firstly, you need to decide what type the array should be.
Since the worker class is already a template, we will add one more parameter to the template so that the array type can be flexibly set.
template<typename T, typename R>
|
Now, let's get back to associations, which allow you to overlay variables of different types on top of each other. Thus, we get the following tricky construction:
union Overlay
|
In this union, the type of the structure is combined with an array of type R, and its size is automatically calculated by the compiler based on the ratio of the sizes of two types, T and R.
Now, inside the arrayStructSort method, we can partially duplicate the code of two-dimensional array sorting.
bool arrayStructSort(const int field)
|
Instead of an array with the original structures, we prepare the temp[][2] array of type R, extend it to the number of records in array, and write the following in the loop: the "display" of the required field field from the structure at the 0th index of each row, and the original index of this element at the 1st index.
The "display" is based on the fact that fields in structures are usually aligned in some way since they use standard types. Therefore, with a properly chosen R type, it is possible to provide full or partial hitting of fields in the array elements in the "overlay".
For example, in the standard structure MqlRates the first 6 fields are 8 bytes in size, and therefore map correctly onto the array double or long (these are R template type candidates).
struct MqlRates
|
With the last two fields, the situation is more complicated. If the field spread still can be reached using type int as R, then the field real_volume turns out to be at an offset that is not a multiple of its own size (due to the field type int, i.e. 4 bytes, before it). These are problems of a particular method. It can be improved, or another method can be invented.
But let's go back to the sorting algorithm. After the array temp is populated, it can be sorted with the usual function ArraySort, and then the original indexes can be used to form a new array with the correct structure order.
...
|
Before exiting the function, we use ArraySwap again, in order to replace the contents of an intra-object array array in a resource-efficient way with something new and ordered, which is received in the local array result.
Let's check the class worker in action: in the function OnStart let's define an array of structures MqlRates and ask the terminal for several thousand records.
#define LIMIT 5000
|
The CopyRates function will be described in a separate section. For now, it's enough for us to know that it fills the passed array rates with quotes of the symbol and timeframe of the current chart on which the script is running. The macro LIMIT specifies the number of requested bars: you need to make sure that this value is not greater than your terminal's setting for the number of bars in each window.
To process the received data, we will create an object worker with types T=MqlRates and R=double:
Worker<MqlRates, double> worker(rates); |
Sorting can be started with an instruction of the following form:
worker.process(offsetof(MqlRates, open) / sizeof(double)); |
Here we use the offsetof operator to get the byte offset of the field open inside the structure. It is further divided by the size double and gives the correct "column" number for sorting by the open price. You can read the sorting result element by element, or get the entire array:
Print(worker[i].open);
|
Note that getting an array by the method get moves it out of the inner array array to the outer one (passed as an argument) with ArraySwap. So, after that the calls worker.process() are pointless: there is no more data in the object worker.
To simplify the start of sorting by different fields, an auxiliary function sort has been implemented:
void sort(Worker<MqlRates, double> &worker, const int offset, const string title)
|
It outputs a header and the first and last elements of the sorted array to the log. With its help, testing in OnStart for three fields looks like this:
void OnStart() { ... Worker<MqlRates, double> worker(rates); sort(worker, offsetof(MqlRates, open) / sizeof(double), "Sorting by open price..."); sort(worker, offsetof(MqlRates, tick_volume) / sizeof(double), "Sorting by tick volume..."); sort(worker, offsetof(MqlRates, time) / sizeof(double), "Sorting by time..."); } |
Unfortunately, the standard function print does not support printing of single structures, and there is no built-in function StructPrint in MQL5. Therefore, we had to write it ourselves, based on ArrayPrint: in fact, it is enough to put the structure in an array of size 1.
template<typename S>
|
As a result of running the script, we can get something like the following (depending on the terminal settings, namely on which symbol/timeframe it is executed):
Sorting by open price...
|
Here is the data for EURUSD,M15.
The above implementation of sorting is potentially one of the fastest because it uses the built-in ArraySort.
If, however, the difficulties with aligning the fields of the structure or the skepticism towards the very approach of "mapping" the structure into an array force us to abandon this method (and thus, the function ArraySort), the proven "do-it-yourself" method remains at our disposal.
There are a large number of sorting algorithms that are easy to adapt to MQL5. One of the quick sorting options is presented in the file QuickSortStructT.mqh attached to the book. This is an improved version QuickSortT.mqh, which we used in the section String comparison. It has the method Compare of the template class QuickSortStructT which is made purely virtual and must be redefined in the descendant class to return an analog of the comparison operator '>' for the required type and its fields. For the user convenience, a macro has been created in the header file:
#define SORT_STRUCT(T, A, F) \
|
Using it, to sort an array of structures by a given field, it is enough to write one instruction. For example:
MqlRates rates[];
|
Here the rates array of type MqlRates is sorted by the high price.
int ArrayBsearch(const type &array[], type value)
The function searches a given value in a numeric array. Arrays of all built-in numeric types are supported. The array must be sorted in ascending order by the first dimension, otherwise the result will be incorrect.
The function returns the index of the matching element (if there are several, then the index of the first of them) or the index of the element closest in value (if there is no exact match), ti.e., it can be an element with either a larger or smaller value than the one being searched for. If the desired value is less than the first (minimum), then 0 is returned. If the searched value is greater than the last (maximum), its index is returned.
The index depends on the direction of the numbering of the elements in the array: direct (from the beginning to the end) or reverse (from the end to the beginning). It can be recognized and changed using the functions described in the section Array indexing direction as in timeseries.
If an error occurs, -1 is returned.
For multidimensional arrays, the search is limited to the first dimension.
In the script ArraySearch.mq5 one can find examples of using the function ArrayBsearch.
void OnStart()
|
For three predefined arrays (one of them is empty), the following statements are executed:
PRTS(ArrayBsearch(array, -1)); // 0
|
Further, in the populateSortedArray helper function, the numbers array is filled with random values, and the array is constantly maintained in a sorted state using ArrayBsearch.
void populateSortedArray(const int limit)
|
Each new value goes first into a one-element array element, because this way it's easier to insert it into the resulting array numbers using the function ArrayInsert.
ArrayBsearch allows you to determine where the new value should be inserted.
The result of the function is displayed in the log:
void OnStart()
|
int ArrayMaximum(const type &array[], int start = 0, int count = WHOLE_ARRAY)
int ArrayMinimum(const type &array[], int start = 0, int count = WHOLE_ARRAY)
The functions ArrayMaximum and ArrayMinimum search a numeric array for the elements with the maximum and minimum values, respectively. The range of indexes for searching is set by start and count parameters: with default values, the entire array is searched.
The function returns the position of the found element.
If the "serial" property ("timeseries") is set for an array, the indexing of elements in it is carried out in the reverse order, and this affects the result of this function (see the example). Built-in functions for working with the "serial" property are discussed in the next section. More details about "serial" arrays will be discussed in the chapters on timeseries and indicators.
In multidimensional arrays, the search is performed on the first dimension.
If there are several identical elements in the array with a maximum or minimum value, the function will return the index of the first of them.
An example of using functions is given in the file ArrayMaxMin.mq5.
#define LIMIT 10
|
The script will log something like the following set of strings (due to random data generation, each run will be different):
22242 5909 21570 5850 18026 24740 10852 2631 24549 14635
|