I am studying AVX by writing AVX code with inline assembly. In this case, I tried to implement AVX in a simple function. The function name I made is lower_all_c
I have recently discovered that AVX2 doesn't have a popcount for __m256i and the only way I found to do something similar is to follow the Wojciech Mula algori