Clearing an array of defined element(s) - page 29

 
Sergey Dzyublik:
Essentially you are using a static HashSet data structure with an initial array of data to resolve collisions.
But the implementation is just eye-opening...

Instead of calling a function ("FindValueInSortArray") with 100-500 unnecessary parameters, one normally uses a class where those parameters become fields (you save on parameter passing, unless the compiler managed to inline the call implicitly).
And if you need a pair of arrays of the same size and purpose (int p1[]; int p2[];), one normally uses an array of structures (a single indexed access reaches both fields, with a lower chance of a cache miss).

Yes, I know. I agree - I'm scared of my own implementation myself. But my goal was the algorithm, not the implementation. As they say, it was "knocked together on my knees" (in about an hour and a half) just to check the correctness and speed of the algorithm. Recall that this thread discusses (and I quote) "the fastest way to clear an array of junk values".

Although I have plenty of complaints about my own algorithm, because it is far from universal. I can see ways to improve it, but the code would become much more complex. A pity about the time.

P.P.S. This is the first time I've heard of HashSet, as I'm not strong on theory. I'm well aware that I haven't invented anything new and that everything was invented before me. :)) The advantage of this algorithm shows when the array of items to be deleted is uniformly distributed. If the distribution is far from uniform, this implementation may lose in speed to others already posted in this thread.

 
Taras Slobodyanik:

changed the checksum calculation to CRC32 )

Thank you. Of course it's more correct.

But as far as I understand, switching to CRC32 didn't reveal any errors. :)

 

Let me tell you, I'm personally impressed by the capabilities of modern computers, and of MQL5 in particular.

After all, it takes only 1/40 of a second(!!!) to find about a thousand different values in an array of a million elements, delete them (about 100,000 cells) and repack the cleaned array.

 
optimised
Files:
 
nicholi shen :

It's all brilliantly simple! :))

However, this algorithm consumes memory uncontrollably.

So, for example, if you replace

arr[i] = rand() % 10000;
....
Value2[i] = rand() % 10000;

with

arr[i] = rand() * rand();
....
Value2[i] = rand() * rand();

the situation is completely different:

2018.11.19 23:25:57.593 ArrayDeleteValue21 (NZDUSD,D1)  nicholi shen variant: Checksum = 3902843984.813025; elements - 999983; execution time - 603551 microseconds
2018.11.19 23:25:57.980 ArrayDeleteValue21 (NZDUSD,D1)  Nikitin variant     : Checksum = 3902843984.813025; elements - 999983; execution time - 383847 microseconds
2018.11.19 23:25:58.043 ArrayDeleteValue21 (NZDUSD,D1)  fan9 variant        : Checksum = 3902843984.813025; elements - 999983; execution time -  61414 microseconds
2018.11.19 23:25:58.074 ArrayDeleteValue21 (NZDUSD,D1)  Semko variant       : Checksum = 3902843984.813025; elements - 999983; execution time -  28430 microseconds
Files:
 
Gave editing rights on the ArrayDeleteValue project to a couple more people - those I had missed before.
 
Nikolai Semko:

Thank you. This is certainly the right way to go.

But as far as I understand, switching to CRC32 didn't reveal any errors. :)

Yes, but now those who have seen with their own eyes the result of only one algorithm can verify that the others produce the same result. :)

 
Please keep to the subject. No flooding or swearing. Thank you.
 
nicholi shen:

Not a bad idea, but if there are negative values in the array or the vector, you will get an array-out-of-range error.

Besides, the array of filtering flags may need up to 2 147 483 647 elements, which is too much.

Of course, you can reduce the size 32-fold by replacing the 32 separate bool flags with a bitmask, like this:

int array_filter(int &a[], const int &b[]) // nicholishen
  {
   int e = ArraySize(a);
   int d = b[ArrayMaximum(b)] + 1;      // upper bound of the values to delete
   int c[];                             // packed bitmask: 1 bit per value
   ArrayResize(c, (d - 1) / 32 + 1);
   ArrayInitialize(c, 0);
   // mark every value from b[] in the bitmask
   for(int i = 0; i < ArraySize(b); i++)
     {
      int ii = b[i] / 32;
      c[ii] |= (1 << (b[i] - ii * 32));
     }
   // compact a[] in place, keeping only unmarked values
   int h = 0;
   for(int g = 0; g < e; g++)
     {
      int f = a[g];
      int ii = f / 32;
      if(f >= d || !(c[ii] & (1 << (f - ii * 32))))
         a[h++] = f;
     }
   return ArrayResize(a, h);
  }


but the table would still be too big, and the speed would drop by about half.

 
Nikolai Semko:

It's all brilliantly simple! :))

But, however, this algorithm consumes memory uncontrollably.

So, for example, if you replace

with

the situation is quite different:

Your version is also good, but unfortunately it too has a weak spot: under "everyday" conditions, i.e. small arrays of ~100 elements and a small vector of ~5-10 values, your algorithm is slower than all the others presented here - although the difference is so insignificant in microseconds that it probably doesn't matter - while on large arrays it is the best.

P.S. I think the function should be universal: it ought to combine several different algorithms, selected according to the input data.