Generic Class Library - bugs, description, questions, usage and suggestions - page 35

 
Renat Fatkhullin:

Almost impossible and if you run into one, access is still super effective.

Collision at any level of efficiency is bad. Thanks for the reply. I'll keep that in mind.

 
fxsaber:

Collision at any level of efficiency is bad. Thank you for your reply. I'll keep that in mind.

Hashmaps are by definition not collision-free.

 
fxsaber #:

Question on collisions. Is it possible to run into a collision in such a case?

If 27,000 records have been produced.

You will constantly get collisions, especially when the collection grows (resizes). Since the hash is additionally truncated to the gang number, and there is always a limited number of them (ideally equal to one).

Conclusions:

1. It's not the crypto stability of the hash function that matters to map, it's its speed. The hash function just needs to provide a good randomness and that's it.

2. Resize is best avoided by all means. It is resizing that causes the main drawdown, not collisions of the hash function. Whenever possible, always initialize the collection with a preset number of items. Even if you don't know them exactly, give an approximate number - it will greatly help the algorithm.

3. Collisions are a regular compaction mechanism. Through collisions, it is possible to store three records with id, say, 159, 4'569'209, 29'459'876 in array of length three, rather than length 30 000 000, even if the last two values both fall into the gang with index 2.

 
Vasiliy Sokolov #:

2. Resizing is best avoided by all means. It is resizing, not hash function collisions, which is the main drawback. Whenever possible, initialize the collection with a preset number of items. Even if you don't know them exactly, give an approximate number - it will be a great help to the algorithm.

Please show by example.

I only use this so far.

//+------------------------------------------------------------------+
//| Adds the specified key and value to the map.                     |
//+------------------------------------------------------------------+
template<typename TKey,typename TValue>
bool CHashMap::Add(TKey key,TValue value);
//+------------------------------------------------------------------+
//| Gets the value associated with the specified key.                |
//+------------------------------------------------------------------+
template<typename TKey,typename TValue>
bool CHashMap::TryGetValue(TKey key,TValue &value);
 
fxsaber #:

Please show me an example.

I only use this so far.

When you create a CHashMap collection, use one of the constructors that accepts the capacity parameter. See my article about CDicionary for details on performance during resizing. There's a complete walkthrough of this algorithm.

 

I found a very unusual behaviour in CHashMap.

// Демонстрация невозможности доступа по ключу после первого добавления.

#include <Generic\HashMap.mqh>

void OnStart()
{
  // Отсортированный массив неповторяющихся цен.
  const double Array[] =
{1.70682, 1.70691, 1.70700, 1.70709, 1.70710, 1.70717, 1.70721, 1.70742, 1.70749, 1.70752, 1.70757, 1.70766, 1.70768, 1.70769, 1.70771, 1.70773, 1.70774, 1.70777,
 1.70782, 1.70785, 1.70786, 1.70797, 1.70804, 1.70808, 1.70813, 1.70850, 1.70868, 1.70879, 1.70881, 1.70883, 1.70895, 1.70905, 1.70918, 1.70921, 1.70936, 1.70962,
 1.70966, 1.71022, 1.71027, 1.71047, 1.71056, 1.71080, 1.71081, 1.71093, 1.71096, 1.71109, 1.71112, 1.71119, 1.71120, 1.71127, 1.71128, 1.71135, 1.71136, 1.71138,
 1.71145, 1.71147, 1.71148, 1.71150, 1.71155, 1.71159, 1.71162, 1.71164, 1.71168, 1.71172, 1.71203, 1.71204, 1.71210, 1.71269, 1.71289, 1.71293, 1.71299, 1.71334,
 1.71337, 1.71343, 1.71357, 1.71358, 1.71367, 1.71371, 1.71374, 1.71380, 1.71402, 1.71410, 1.71414, 1.71425, 1.71426, 1.71427, 1.71429, 1.71430, 1.71446, 1.71474,
 1.71477, 1.71518, 1.71521, 1.71558, 1.71587, 1.71588, 1.71617, 1.71620, 1.71624, 1.71626, 1.71629, 1.71635, 1.71644, 1.71654, 1.71655, 1.71658, 1.71661, 1.71664,
 1.71689, 1.71700, 1.71712, 1.71718, 1.71727, 1.71752, 1.71769, 1.71775, 1.71780, 1.71782, 1.71786, 1.71792, 1.71795, 1.71797, 1.71798, 1.71801, 1.71815, 1.71827,
 1.71832, 1.71835, 1.71841, 1.71869, 1.71895, 1.71908, 1.71951, 1.71984, 1.71992, 1.72008, 1.72086, 1.72132, 1.72147, 1.72204, 1.72208, 1.72212, 1.72215, 1.72227,
 1.72230, 1.72233, 1.72234, 1.72336, 1.72374, 1.72378, 1.72381, 1.72405, 1.72454, 1.72464, 1.72477, 1.72488, 1.72541, 1.72553, 1.72562, 1.72588, 1.72625, 1.72651,
 1.72653, 1.72671, 1.72679, 1.72685, 1.72720, 1.72783, 1.72849, 1.72852, 1.72854, 1.72925, 1.72927, 1.72938, 1.72953, 1.72956, 1.72960, 1.72970, 1.72975, 1.72979,
 1.72996, 1.73014, 1.73028, 1.73034, 1.73050, 1.73056, 1.73084, 1.73085, 1.73119, 1.73137, 1.73176, 1.73278, 1.73328, 1.73337, 1.73344, 1.73345, 1.73362, 1.73377,
 1.73385, 1.73399, 1.73401, 1.73405, 1.73425, 1.73432, 1.73445, 1.73446, 1.73447, 1.73451, 1.73458, 1.73468, 1.73491, 1.73512, 1.73515, 1.73516, 1.73537, 1.73545,
 1.73553, 1.73580, 1.73581, 1.73598, 1.73611, 1.73630, 1.73635, 1.73643, 1.73658, 1.73678, 1.73694, 1.73695, 1.73698, 1.73712, 1.73713, 1.73721, 1.73746, 1.73748,
 1.73750, 1.73751, 1.73781, 1.73782, 1.73789, 1.73821, 1.73843, 1.73845, 1.73854, 1.73857, 1.73858, 1.73861, 1.73874, 1.73892, 1.74003, 1.74060, 1.74062};

  CHashMap<double, int> Index; // Будем по double-ключу хранить int-значения.
  
  for (int i = ArraySize(Array) - 1; i >= 0; i--)
    Index.Add(NormalizeDouble(Array[i], 5), 0); // Каждую цену массива используем в качестве ключа.
    
  // Все это было подготовкой, чтобы продемонстрировать поведение ниже.
  
  const double Price = 1.75141; // Берем новую цену.
  
  if (Index.Add(Price, 0)) // Если получилось добавить ее в качестве ключа.
  {
    int Value = 0;

    Print(Index.TryGetValue(Price, Value)); // (false) Проверяем, получается ли посмотреть данные по этому ключу.
    
    if (Index.Add(Price, 0)) // Пробуем повторно добавить в качестве ключа
      Print(Index.TryGetValue(Price, Value)); // (true) Проверяем, получается ли посмотреть данные по этому ключу.
  }
}


false
true

It appears that a key is supposedly added, but nothing can be read from it. However, as soon as I add the key again, it becomes readable!

Is this a BAG?


If you replace the price1.75141 in the example with another one, it will work correctly.


It was hellishly difficult to detect this behavior in a huge code without debugging and create such a concise example.

 
Vasiliy Sokolov #:

Always initialise the collection with a predetermined number of elements whenever possible. Even if you don't know them exactly, give an approximate number - it will be a great help to the algorithm.

In the example above, I tried setting capacity via the constructor. From a certain value of capacity the example starts working correctly.

But I think this is a coincidence.

 
fxsaber #:

I found a very unusual behaviour in CHashMap.


It appears that a key is supposedly added, but nothing can be read from it. However, as soon as the key is added again, it's readable!

Is this a BAG?


If you replace the price1.75141 in the example with another one, it will work correctly.


It was hellishly difficult to detect this behavior in a huge code without debugging and create such a concise example.

Fixed, will be in today's release.
 
MetaQuotes #:
Corrected, will be in today's release.

Thank you! I see the edit.


 
fxsaber #:

In the example above, I tried to set capacity via the constructor. From a certain value of capacity the example starts working correctly.

But I think this is a coincidence.

Yes, there is a glitch upon glitch. I would strongly advise against using Generic.