Android Bluetooth (BLE) corrupted data on onCharacteristicChanged

My app does the following:

  1. It sends a command with onDescriptorWrite to the BT device.

  2. As soon as the BT device gets this command, it starts transferring data to the Android phone.

  3. Android's onCharacteristicChanged is catching all the data sent from the BT device.

  4. After all the data is transferred the Android app writes it to a file.

I've tested it and everything works perfectly on a Samsung (Android 11), a OnePlus (Android 11), and a Xiaomi (Android 9), but the data coming from onCharacteristicChanged gets corrupted on a Nokia (Android 11).

This test example shows the checksum of the transferred data after it was written to a file. As you can see, I get the very same bytes every time on the Xiaomi, but on the Nokia the data is sometimes corrupted.

Tested on Xiaomi MiA1 (Android 9)

File_MD5: The right file MD5 -> AE36F08213B25B5E0EE19425257D0D85

File_MD5: Measurement file MD5: AE36F08213B25B5E0EE19425257D0D85 (ok)
File_MD5: Measurement file MD5: AE36F08213B25B5E0EE19425257D0D85 (ok)
File_MD5: Measurement file MD5: AE36F08213B25B5E0EE19425257D0D85 (ok)
File_MD5: Measurement file MD5: AE36F08213B25B5E0EE19425257D0D85 (ok)
File_MD5: Measurement file MD5: AE36F08213B25B5E0EE19425257D0D85 (ok)
File_MD5: Measurement file MD5: AE36F08213B25B5E0EE19425257D0D85 (ok)
File_MD5: Measurement file MD5: AE36F08213B25B5E0EE19425257D0D85 (ok)
File_MD5: Measurement file MD5: AE36F08213B25B5E0EE19425257D0D85 (ok)
File_MD5: Measurement file MD5: AE36F08213B25B5E0EE19425257D0D85 (ok)


Tested on Nokia 5.4 (Android 11)

File_MD5: The right file MD5 -> BCF704DD811A760B5602C20DEDB61AF8

File_MD5: Measurement file MD5: BCF704DD811A760B5602C20DEDB61AF8 (ok)
File_MD5: Measurement file MD5: E65A5D38EB3D8BF4E1AF5240DFBE1840 (ERROR)
File_MD5: Measurement file MD5: BCF704DD811A760B5602C20DEDB61AF8 (ok)
File_MD5: Measurement file MD5: BCF704DD811A760B5602C20DEDB61AF8 (ok)
File_MD5: Measurement file MD5: BCF704DD811A760B5602C20DEDB61AF8 (ok)
File_MD5: Measurement file MD5: BCF704DD811A760B5602C20DEDB61AF8 (ok)
File_MD5: Measurement file MD5: 0D3A577631A115FBAF3324A9B09244A8 (ERROR)
File_MD5: Measurement file MD5: A6FB1334D7AA1520F105ACB1EC1324C5 (ERROR)
File_MD5: Measurement file MD5: BCF704DD811A760B5602C20DEDB61AF8 (ok)
File_MD5: Measurement file MD5: BCF704DD811A760B5602C20DEDB61AF8 (ok)

I'm completely baffled by this issue, yet another one from Android's fragmented and unreliable ecosystem.


I've tried the following:

  1. Setting my BluetoothGatt instance to CONNECTION_PRIORITY_HIGH mode with requestConnectionPriority

  2. Using a synchronized data container in onCharacteristicChanged, in case the error is caused by multiple threads writing to the same container.

Neither of them helped.

Any insights?

EDIT:

My coworker wrote a firmware that simply sends incrementing numbers. As you can see in the red area, the data is corrupted, even with Nordic's own nRF Toolbox app.

Should I be worried?

Does that mean this Nokia simply doesn't work, and that is it? Can this be a hardware issue affecting only this device?

[Image: screenshot of the received incrementing numbers, with the corrupted data marked in red]



Solution 1:[1]

It seems there are some differences in the implementation on that smartphone. Try the following:

  1. Put each chunk of data (retrieved from onCharacteristicChanged) into a thread-safe collection. It is better to wrap each chunk in a structure ({data: ByteArray, device: *, something else...}).
  2. Handle each chunk on a background thread to avoid freezing the UI. The main reason to do this is that some devices actively use internal byte caches, and that could corrupt the data. We should copy the bytes from onCharacteristicChanged as soon as possible and then handle them in a convenient way on our own thread.
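
The two steps above can be sketched in plain Java roughly as follows. This is only a minimal illustration, not a definitive implementation: `NotificationBuffer`, `Chunk`, `onChunk`, `take` and `drainAll` are hypothetical names that do not come from the Android API, and the sketch assumes the device address is enough to identify the sender.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical helper: hands notification chunks over to a worker thread. */
final class NotificationBuffer {

    /** One received chunk plus the address of the device it came from. */
    static final class Chunk {
        final String deviceAddress;
        final byte[] data;

        Chunk(String deviceAddress, byte[] data) {
            this.deviceAddress = deviceAddress;
            this.data = data;
        }
    }

    // Thread-safe FIFO queue: the callback thread adds, the worker removes.
    private final BlockingQueue<Chunk> queue = new LinkedBlockingQueue<>();

    /** Intended to be called from the notification callback. */
    void onChunk(String deviceAddress, byte[] value) {
        // Defensive copy, in case the caller's array is reused later.
        byte[] copy = Arrays.copyOf(value, value.length);
        queue.add(new Chunk(deviceAddress, copy));
    }

    /** Blocks until a chunk is available; meant for the worker thread. */
    Chunk take() throws InterruptedException {
        return queue.take();
    }

    /** Removes and returns everything queued so far, in arrival order. */
    List<Chunk> drainAll() {
        List<Chunk> out = new ArrayList<>();
        queue.drainTo(out);
        return out;
    }
}
```

In an Android app, onCharacteristicChanged would do nothing except call `onChunk` with the device address and the characteristic value, and a background thread would loop on `take()` to assemble the data and write the file.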

Solution 2:[2]

Yes, byte corruption can and does happen. I have observed it on various devices, in particular cheap phones (e.g. ZTE Blade 512, Vivo Y20).
For my analysis I usually compare the btsnoop_hci.log (from Android) with a sniffer log (I use a Frontline BPA).
When I run a test, I do an echo test with incrementing numbers between the phone and the peripheral. In our use case the connection is always encrypted via BLE Enhanced Security.

Issues I saw:

  1. The notification payload is corrupted with random values.
  2. The notification payload is appended to the wrong HCI packets.

Number 1:
This is the issue you are facing. It must happen somewhere in the stack, and it is usually already visible in hci.log.
It must happen inside the phone after the packet was received, because BLE has a MIC for the encrypted part and a CRC for the payload. If the corruption happened over the air, these two checks would already fail. Since the packet is correct in the sniffer log and corrupted in hci.log, we can conclude that the issue is inside the phone.
Number 2:
Sometimes I see the actual payload of a notification appended to an ATT_WRITE_RESPONSE. In this case, after a Write_Request I saw two write responses where there should have been only one write response plus one notification. The extra write response carried the data payload of the notification, which was expected and visible in the sniffer log.
Again, MIC and CRC should have taken care of this.

Possible workarounds:
Implement CRC checks for notifications at the application level, so that a notification includes not only the status value but also a CRC.
Make the characteristic Read/Notify (this may require firmware changes on your peripheral), so that if the CRC of a notification is wrong you can still read the value.
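
An application-level CRC could look roughly like the sketch below. It assumes a frame layout of payload followed by a 4-byte little-endian CRC-32. That layout and the names `CrcFrame`, `wrap` and `unwrap` are my own choices for illustration, and the peripheral firmware would have to use the same scheme.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;
import java.util.zip.CRC32;

/** Hypothetical frame format: payload followed by a little-endian CRC-32. */
final class CrcFrame {
    private static final int CRC_LEN = 4;

    /** Sender side: appends the CRC-32 of the payload to the payload. */
    static byte[] wrap(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(payload.length + CRC_LEN)
                .order(ByteOrder.LITTLE_ENDIAN)
                .put(payload)
                .putInt((int) crc.getValue())
                .array();
    }

    /** Receiver side: returns the payload, or null if the CRC does not match. */
    static byte[] unwrap(byte[] frame) {
        if (frame.length < CRC_LEN) {
            return null; // too short to even contain a CRC
        }
        int payloadLen = frame.length - CRC_LEN;
        CRC32 crc = new CRC32();
        crc.update(frame, 0, payloadLen);
        int received = ByteBuffer.wrap(frame, payloadLen, CRC_LEN)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getInt();
        if ((int) crc.getValue() != received) {
            return null; // corrupted somewhere between sender and app
        }
        return Arrays.copyOf(frame, payloadLen);
    }
}
```

When `unwrap` returns null, the app can fall back to reading the characteristic. Note that the 4 CRC bytes come out of the notification payload, so with small MTUs a shorter checksum such as a CRC-16 may be the better trade-off.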

If you are using notifications as the RX channel of a message protocol, implement a high-level integrity check plus retransmission.
Yes, this may be a hard sell to your PM, since BLE should already have this, but the stack in between does not.
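
As a rough sketch of the receiving side of such a protocol, assume every frame carries a one-byte sequence number and that the app can ask the peripheral to resend from a given sequence number (for example by writing to a control characteristic). Both are assumptions of this example, and `SequencedReceiver` and its methods are hypothetical names.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical receiver: accepts frames in order, otherwise asks for a resend. */
final class SequencedReceiver {
    /** Returned by onFrame when the frame was accepted. */
    static final int ACCEPTED = -1;

    private int expectedSeq = 0;
    private final ByteArrayOutputStream assembled = new ByteArrayOutputStream();

    /**
     * @param seq     sequence number carried by the frame (0..255)
     * @param payload frame payload without sequence number and checksum
     * @param intact  result of the application-level integrity check
     * @return ACCEPTED, or the sequence number the peripheral should resend from
     */
    int onFrame(int seq, byte[] payload, boolean intact) {
        if (!intact || seq != expectedSeq) {
            // Corrupted or out-of-order frame: drop it and request a resend.
            return expectedSeq;
        }
        assembled.write(payload, 0, payload.length);
        expectedSeq = (expectedSeq + 1) & 0xFF; // one-byte counter wraps at 256
        return ACCEPTED;
    }

    /** All payload bytes accepted so far, in order. */
    byte[] assembled() {
        return assembled.toByteArray();
    }
}
```

This is a simple go-back style scheme: every frame after a bad one is dropped until the expected frame arrives again, which keeps the receiver small at the cost of some repeated traffic.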

Could you specify your Nokia phone? I might want to add it to my testing setup. I saw issues with Nokia phones but never of this kind.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Maxim Firsoff
Solution 2