HashTrie data structureThe existing implementations of hash-tables have the important disadvantage - fixed size. As a corollary, if the table is filled more than 90 % search and insert operation become very slow. HashTrie represents new efficient data structure. It combines in herself properties of the hash-tables and trie (digital-trees). As against the usual hash-tables the size HashTrie is not fixed, that allows to work with a unknown beforehand amount of datas. Structure of HashTrieHashTrie is very like to tree. Each node is array of pointers. (In our implementation array size is 256). Each pointer may be reference to 1) nil; 2) overflow list; 3) other node (subtree).
Initially HashTrie contains one item (TTreeItem) and behaves as hash-table. The conflicts of duties are allowed through the list of overflow. As hashing function we shall use CRC32 (hereinafter we shall return to a problem of choice hashing function). Is important that outcome of calculation hashing function it 32-bit unsigned integer. For addressing in the table (at the first level) we use a low byte. For example we add in HashTrie string "ABCDE" so:
In the following figure you can see a graphics image of the obtained structure Choice of hashing functionWe have tested with HashTrie some different hashing functions:
2) CRC32 After addition in HashTrie of 30 thousand words from the English language dictionary we have following results:
*) in array of TTreeItem nodes It is visible, that at usage of simple hashing function we receive poor outcomes. CRC32 and the method of Bob Jenkins is displayed approximately identical outcomes, thus CRC32 is calculated faster. |