'How can I convert C++ code of a CRC16-CCITT algorithm to Python code?

I have an example code for CRC16-CCITT algorithm written in C++ and I need help converting it to Python.

Example C++ code:

#include<iostream>

using namespace std;

unsigned short calculateCRC(unsigned char data[], unsigned int length)
{
        unsigned int i;
        unsigned short crc = 0;

        for(i=0; i<length; i++){
                crc = (unsigned char)(crc >>8) | (crc<<8);
                crc ^= data[i];
                crc ^= (unsigned char)(crc & 0xff) >> 4;
                crc ^= crc << 12;
                crc ^= (crc & 0x00ff) << 5;
        }

        return crc;
}

int main()
{
        unsigned int length;
        length = 15;

        unsigned char data[length] = {0x01,0x08,0x00,0x93,0x50,0x2e,0x42,0x83,0x3e,0xf1,0x3f,0x48,0xb5,0x04,0xbb};
        unsigned int crc;
        crc =  calculateCRC(data, length);
        cout<< std::hex << crc << '\n';
}

This code gives 9288 as output which is correct.

I tried the following in Python:

#!/usr/bin/env python3

def calculateCRC(data):
    crc = 0

    for dat in data:
        crc = (crc >> 8) or (crc << 8)
        crc ^= dat
        crc ^= (crc and 0xff) >> 4
        crc ^= crc << 12
        crc ^= (crc and 0x00ff) << 5
    crc = hex(crc)
    return (crc)


data = [0x01,0x08,0x00,0x93,0x50,0x2e,0x42,0x83,0x3e,0xf1,0x3f,0x48,0xb5,0x04,0xbb]
print(calculateCRC(data))

This outputs 0xf988334b0799be2081.

Could you please help me understand what I am doing wrong? Thank you.



Solution 1:[1]

Python's int type is unbounded, but C / C++ unsigned short values are represented in 2 bytes so overflow when you shifted to the left. You need to add masking in Python to achieve the same effect, where you remove any bits higher than the 16th most-significant bit. This is only needed where values are shifted to the left as right-shifting already drops the right-most rotated bits.

Next, you are confusing the | and & bitwise operators with or and and boolean logical operators. The C++ code uses bitwise operators, use the same operators in Python.

Last but not least, leave conversion to hex to the caller, don't do this in the CRC function itself:

UNSIGNED_SHORT_MASK = 0xFFFF  # 2 bytes, 16 bits.

def calculateCRC(data):
    crc = 0
    for dat in data:
        crc = (crc >> 8) | (crc << 8 & UNSIGNED_SHORT_MASK)
        crc ^= dat
        crc ^= (crc & 0xff) >> 4
        crc ^= crc << 12 & UNSIGNED_SHORT_MASK
        crc ^= (crc & 0x00ff) << 5
    return crc

Now you get the same output:

>>> print(format(calculateCRC(data), '04x'))
9288

I used the format() function rather than hex() to create hex output without a 0x prefix.

As Mark Adler rightly points out, we don't need to mask for every left-shift operation; just because the C / C++ operations naturally would result in a masked value, doesn't mean we need to do this as often here. Masking once per iteration is enough:

def calculateCRC(data):
    crc = 0
    for dat in data:
        crc = (crc >> 8) | (crc << 8)
        crc ^= dat
        crc ^= (crc & 0xFF) >> 4
        crc ^= crc << 12
        crc ^= (crc & 0x00FF) << 5
        crc &= 0xFFFF
    return crc

There may be more short-cuts we could apply to shave off operations and so speed up operations, but if speed really is an issue, I'd re-implement this in Cython or C or another natively-compiled option, anyway.

Also note you can use a bytes object, you don't have to use a list of integers:

data = b'\x01\x08\x00\x93\x50\x2e\x42\x83\x3e\xf1\x3f\x48\xb5\x04\xbb'

Looping over a bytes object still gives you integers between 0 and 255, just like the char array in C++.

Finally, you don't actually have to translate the code yourself, you could just use an existing project like crccheck, which implements this specific CRC16 variant as well as many others:

>>> from crccheck.crc import CrcXmodem
>>> print(format(CrcXmodem.calc(data), '04x'))
9288

crccheck is written in pure Python. For native implementations, there is crcmod. This library's documentation is a little lacking, but it is also very flexible and powerful, and actually includes predefined functions:

>>> from crcmod.predefined import mkPredefinedCrcFun
>>> xmodem = mkPredefinedCrcFun('xmodem')
>>> print(format(xmodem(data), '04x'))
9288

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1