A developer claims to have reverse-engineered the NeuralHash algorithm used in Apple’s CSAM detection. Conflicting views have been expressed about whether this would enable the child sexual abuse material detection system to be defeated…
Developer Asuhariet Ygvar posted the code to GitHub, outlining the steps the script performs to compute a hash (a rough code sketch follows the list):
- Convert image to RGB.
- Resize image to 360x360.
- Normalize RGB values to [-1, 1] range.
- Perform inference on the NeuralHash model.
- Calculate dot product of a 96x128 matrix with the resulting vector of 128 floats.
- Apply binary step to the resulting 96 float vector.
- Convert the vector of 1.0 and 0.0 to bits, resulting in 96-bit binary data.
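Taken together, those steps amount to a fairly compact pipeline. The sketch below is an illustrative reconstruction, not Ygvar's actual script: the file names for the converted ONNX model and the 96x128 projection matrix are placeholders, and the matrix file is assumed to be a raw dump of float32 values.

```python
# Illustrative sketch of the NeuralHash pipeline described above; not the
# author's actual script. Model and matrix file names are placeholders.
import sys

import numpy as np
import onnxruntime
from PIL import Image


def neuralhash(image_path: str, model_path: str, matrix_path: str) -> str:
    # Convert the image to RGB and resize it to 360x360.
    img = Image.open(image_path).convert("RGB").resize((360, 360))

    # Normalize RGB values to the [-1, 1] range and lay out as 1x3x360x360.
    arr = np.asarray(img, dtype=np.float32) / 255.0 * 2.0 - 1.0
    arr = arr.transpose(2, 0, 1)[np.newaxis, :]

    # Run inference on the NeuralHash model to get a 128-float descriptor.
    session = onnxruntime.InferenceSession(model_path)
    input_name = session.get_inputs()[0].name
    descriptor = session.run(None, {input_name: arr})[0].reshape(128)

    # Dot product of a 96x128 matrix with the descriptor, then a binary step:
    # positive values become 1, the rest 0. (Assumes the matrix file holds
    # 96*128 raw float32 values.)
    projection = np.fromfile(matrix_path, dtype=np.float32).reshape(96, 128)
    bits = (projection @ descriptor >= 0).astype(np.uint8)

    # Pack the 96 ones and zeros into 96-bit binary data, shown here as hex.
    return "{:024x}".format(int("".join(map(str, bits)), 2))


if __name__ == "__main__":
    print(neuralhash(sys.argv[1], sys.argv[2], sys.argv[3]))
```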
A commenter in his Reddit thread asked how he could be sure it was correct, and Ygvar outlined the evidence.
First of all, the model files have prefix NeuralHashv3b-, which is the same term as in Apple’s document.
Secondly, in this document Apple described the algorithm details in Technology Overview -> NeuralHash section, which is exactly the same as what I discovered. For example, in Apple’s document:
“The descriptor is passed through a hashing scheme to convert the N floating-point numbers to M bits. Here, M is much smaller than the number of bits needed to represent the N floating-point numbers.”
And as you can see from here and here N=128 and M=96.
Moreover, the hash generated by this script almost doesn’t change if you resize or compress the image, which is again the same as described in Apple’s document.
He also explains why the hashes generated by the script can be off by a few bits from those produced on other hardware.
It’s because neural networks are based on floating-point calculations. The accuracy is highly dependent on the hardware. For smaller networks it won’t make any difference. But NeuralHash has 200+ layers, resulting in significant cumulative errors. In practice it’s highly likely that Apple will implement the hash comparison with a few bits tolerance.
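If Apple does compare hashes with a few bits of tolerance, the natural way to express that is a Hamming-distance check. The snippet below is a hypothetical illustration of the idea only; Apple has not published how (or whether) it does this, and the threshold and hash values are made up.

```python
# Hypothetical illustration of comparing two 96-bit hashes with a small
# bit tolerance. The threshold and the example hex values are arbitrary.
def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count differing bits between two hashes given as hex strings."""
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def hashes_match(hash_a: str, hash_b: str, tolerance: int = 3) -> bool:
    return hamming_distance(hash_a, hash_b) <= tolerance


# The same photo hashed on two devices might differ by a bit or two
# (made-up values for illustration).
print(hashes_match("ab14febaa837b6d1484c35e8", "ab14febaa837b6d1484c35e9"))  # True
```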
Some are suggesting that knowing the algorithm would allow people to generate both false negatives – CSAM images which would not be detected, despite being in the database – and false positives, which could flood Apple’s human reviewers with innocent images.
However, others say that the blinding system used by Apple would make both impossible.
There is one important step where Apple uses a blinding algorithm to alter the hash. In order to train a decoder to do this, you would need access to the blinding algorithm, which only Apple has access to.
No doubt security experts will weigh in soon.