'Case sensitive search in Django, but ignored in Mysql
I have a field in a Django Model for storing a unique (hash) value. Turns out that the database (MySQL/inno) doesn't do a case sensitive search on this type (VARCHAR), not even if I explicitly tell Django to do a case sensitive search Document.objects.get(hash__exact="abcd123")
. So "abcd123" and "ABcd123" are both returned, which I don't want.
class document(models.Model):
filename = models.CharField(max_length=120)
hash = models.CharField(max_length=33 )
I can change the 'hash field' to a BinaryField , so in the DB it becomes a LONGBLOB , and it does do a case-sensitive search (and works). However, this doesn't seem very efficient to me. Is there a better way (in Django) to do this, like adding 'utf8 COLLATE'? or what would be the correct Fieldtype in this situation? (yes, I know I could use PostgreSQL instead..)
Solution 1:[1]
As @dan-klasson mentioned, the default non-binary string comparison is case insensetive by default; notice the _ci
at the end of latin1_swedish_ci
, it stands for case-insensetive.
You can, as Dan mentioned, create the database with a case sensitive collation and character set.
You may be also interested to know that you can always create a single table or even set only a single column to use a different collation (for the same result). And you may also change these collations post creation, for instance per table:
ALTER TABLE documents__document CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
Additionally, if you rather not change the database/table charset/collation, Django allows to run a custom query using the raw method. So you may be able to work around the change by using something like the following, though I have not tested this myself:
Document.objects.raw("SELECT * FROM documents__document LIKE '%s' COLLATE latin1_bin", ['abcd123'])
Solution 2:[2]
The default collation for character set for MySQL is latin1_swedish_ci, which is case insensitive. Not sure why that is. But you should create your database like so:
CREATE DATABASE database_name CHARACTER SET utf8;
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | tutuDajuju |
Solution 2 | dan-klasson |