'Custom Hashing Algorithm

Out of sheer boredom, I decided to write a hashing algorithm in python. It all works, however I've got a couple problems:

Results are similar (e.g. hash(1234) has one character difference to hash(4321))
There is a repeating sequence forming (more noticeable on smaller inputs)

Most of the values I've used are random, I have no experience writing a hash algorithm but I wanted to give it a shot. Is there anything I can do to resolve the 2 problems I'm having?

def mash(n):
    l = [int(x) for x in list(str(n))]
    m = []
    for i in range(len(l)):
        if pow(len(l) >> 1, (i + 1) << 3) % (i + 1) == 0 and i != 0:
            m[-1] = int(str(m[-1]) + str(l[i]))
        else:
            m.append(l[i])
    return m

def hash(content, key=None):
    """Hashes the content (either a string or number"""
    o = [0] * 128
    if type(content) is float:
        while content % 1 != 0:
            content = content * 10
        content = int(content)
    if type(content) is str:
        content = int(''.join([str(ord(x)) for x in list(content)]))
    if type(content) is int:
        content = mash(content)
        _ = []
        for i in range(len(content)):
            _.append((chr(((content[i] * (i << 3)) % 94) + 33)))
            # 33 - 126
        content = ''.join(_)
    elif type(content) is not str:
        raise Exception
    z = [ord(x) for x in list(content)]
    for i in range(len(z)):
        if z[i] % 2 == 0:
            n = next((i for i, x in enumerate(o) if not x and i % 2 == 0), 0)
        else:
            n = next((i for i, x in enumerate(o) if not x and i % 2 == 1), 0)
        o[n] = content[i]
    o = [str(x) for x in o]
    o.reverse()
    init_n = o.count('0')
    while o.count('0') > 5:
        n = next((i for i, x in enumerate(o) if x == '0'), None)
        o[n] = chr(((pow(init_n >> 1, o.count('0'))) % 94) + 33)
    o.reverse()
    return ''.join(o)

Note that the key parameter has not yet been implemented. Wanted to resolve existing issues before attempting another feature.

Solution 1:^[1]

I know that this question is around four years old at this point, but I still wanted to give an answer to this Stack Overflow post.

Recall with Hash Tables, our goal is "near" constant time searching for data. We have the option of using a binary search tree for a faster O(log n), though it must be stored in sorted order and maintain balance.

We can do better, by using memory for a decent tradeoff in speed! This is a common pattern we see in Computer Science.

Hashing functions are created to convert data of various sizes to a fixed "address" that can be identified with a key. Key features include this algorithm must be done quickly, minimal collisions with existing keys, and repeatable. Collisions will happen so we need a strategy to handle them.

class BasicHashTable:
    """ Simple HashTable implementation """
    class HashableItem:
        def __init__(self, key=None, value=None):
            self.key = key
            self.value = value

    def __init__(self):
        self.capacity = 4
        self.size = 2**self.capacity - 1
        self.slot = [None for _ in range(self.size)]
        self.count = 0

    def _hash(self, s):
        i = 1
        hash_value = 0
        for char in string:
            h += ord(char) * i
            i += 1
        return h % self.size

Here I have a hashing algorithm that is based on the ordinal value of a character and the position in the string. I also return the value as it is confined within the BasicHashTable size constraint.

Recall that this algorithm fulfills constraint outlined earlier:

Quick
Minimal Collisions
Repeatable

I omitted two key interfaces put and get. Since they can vary and out of the scope of this post. Though, they will handle collisions and resizing the BasicHashTable as it exceeds the capacity.

Cheers!

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	kev-odin

'Custom Hashing Algorithm

Solution 1:[1]

Sources

Related Questions

Solution 1:^[1]