Generic Class Library - bugs, description, questions, usage and suggestions - page 6

 
Vasiliy Sokolov:

It happens that different words begin with the same letter of the alphabet. If we put the word "apple" in our previous dictionary, and then try to put "application" there, it won't work. Index 0 will be already occupied by the word apple. In this case they talk about a hash function collision. Our hash function is very simple - it returns the number of the first character of the word, that's why collisions in this function are very frequent. In order to store different words starting with the same letter, we make an addition: we will store elements not in an array, but in an array of arrays. Then index 0 will contain another array with two words: "apple" and "application":

Now our dictionary stores words even starting with the same letter. But the cost of accessing words with the same first letter increases. Because now we need a full search of all the words beginning with 'a' to find out if the word 'application' is in the dictionary or not. If our hash function would produce different numbers even if the words differed by one character, then the probability of trying words with the same indexes would tend to zero, and access to an arbitrary element would tend to o(1). This is exactly what happens in real dictionaries and functions used there are collision immune, so we can say with certainty that access to items in these collections is very fast and almost no search.

I will try to provide my solution to this example. Without pointers. A little bit later.
 

I recently read a book on the subject. It's called"Grokking Algorithms". Everything is very clearly laid out, with examples.

 
Vasiliy Sokolov:

In the next example we will improve it so that it can store words that begin with the same letter.

Please write concisely, without any spoilers in the form of hats and unnecessary entities.

For example, this

bool Contains(string word)
{
   uchar index = (uchar)StringGetCharacter(word, 0)-97;
   return words[index] != NULL;
}

could be written much clearer

bool Contains(string word)
{
   return words[word[0]-'a'] != NULL;
}


And still this code may work differently for MT4/5 because in MT4 the array is initialized with NULL values, and in MT5 it may be trash.

 
fxsaber:

Please write succinctly, with no tangles in the form of hats or extra entities.

...


Categorically AGAINST! It is desirable to write the code in full. Look at all of the MQ codes - there are "caps" everywhere. This is the standard.


fxsaber:

...

And yet this code for MT4/5 may work differently, because in MT4 the array is initialized with NULL values, and in MT5 it may be trash.


What does this have to do with the old terminal? Who is lazy and is still sitting on the old one - this is only his personal problem. The community should not slow down because of such lazybones.

 
Vladimir Karputov:

Look at all the MQ codes - there are "caps" everywhere. This is the standard.

Screw the standard, the essence is important here, not the style occupying 50% of the code.

 
fxsaber:

Fuck the standard, the essence is important here, not the style that takes 50% of the code.


The main task of the forum is EDUCATION. That's why the code should be expanded and understandable with maximum approximation to the standard.

 
Vasiliy Sokolov:

This is exactly what happens in real dictionaries; the functions used there are resistant to collisions, so we can definitely say that accessing the items in these collections happens very quickly and almost without overshoot.

So for every task it's necessary to find the right balance between RAM and CPU hash complexity.

 
Vladimir Karputov:

The main task of the forum is EDUCATION. Therefore, the code should be expanded and clear

it is enough.

With the closest approximation to the standard.

You can insert your own headers. A100 hundreds of reports in the CD without caps - that's the standard! Because it's the essence, not the tinsel, that matters.

 
There is a solution. However, to temporarily preserve the intrigue, I would like to put here the exerter. Next, skilled people will compare the performance of my solution and the solution provided by the author above. I wonder which works faster.
Files:
Dictionary.ex5  10 kb
 
Tag Konow:
There is a solution. However, to temporarily keep the intrigue, I would like to post the exerpt here. Next, the skillful people will compare the performance of my solution and the solution provided by the author above. I wonder which works faster.
You need to run the executable. You will see an input field. Next, you can enter the different words. If there is a match, you will be notified by the printer that the word is in the dictionary. If there is no word, you will get a notice that the word is added to the dictionary.