This work raises serious privacy concerns. Jan Kautz, vice president of learning and perception research at Nvidia, said: “The artificial intelligence community has a misleading sense of security when sharing trained deep neural network models.”
In theory, this attack could be applied to other data tied to individuals, such as biometric or medical data. On the other hand, Webster pointed out that people could also use the technique to check whether their data has been used to train an AI without their consent.
Artists could check whether their work has been used to train a GAN in a commercial tool, he said: “You can use methods like ours as proof of copyright infringement.”
The same process could also be used to ensure that a GAN never exposes private data in the first place: before releasing a GAN, its creators could use the researchers’ technique to check whether its output resembles real examples from the training data.
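As a rough illustration of what such a pre-release check might look like, the toy sketch below compares each generated sample against its nearest neighbor in the training set and flags anything suspiciously close. This is my own hypothetical NumPy illustration, not the researchers’ actual method; the random feature vectors stand in for images, and the `threshold` value is an assumed parameter.

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-ins for image feature vectors: 1000 training examples, 10 GAN samples.
train = rng.normal(size=(1000, 64))
generated = rng.normal(size=(10, 64))

# Plant one near-memorized sample: a training example plus tiny noise.
generated[0] = train[42] + 0.01 * rng.normal(size=64)

def nearest_distance(sample, data):
    """Distance from one generated sample to its closest training example."""
    return np.min(np.linalg.norm(data - sample, axis=1))

threshold = 2.0  # assumed cutoff: flag samples this close to training data
flags = [nearest_distance(g, train) < threshold for g in generated]
print(flags)  # only the planted near-copy should be flagged
```

In practice such checks would run on learned perceptual features rather than raw pixels, but the principle is the same: memorized outputs sit unusually close to individual training examples.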
However, Kautz said, this assumes you can get hold of the training data. He and his colleagues at Nvidia have come up with a different way to expose private data, including images of faces and other objects as well as medical data, without needing access to the training data at all.
Instead, they developed an algorithm that can recreate the data a trained model has been exposed to by reversing the steps the model performs when processing that data. Take a trained image-recognition network: to recognize what is in an image, the network passes it through a series of layers of artificial neurons, each extracting a different level of information, from abstract edges, to shapes, to more recognizable features.
Kautz’s team discovered that they could interrupt a model midway through these steps and reverse its direction, recreating the input image from the model’s internal data. They tested the technique on a variety of common image-recognition models and GANs. In one test, they showed that they could accurately reconstruct images from ImageNet, one of the best-known image-recognition datasets.
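To see why a model’s internal data can give the input away, consider a toy one-layer example. The sketch below is my own NumPy illustration of the general idea, not Nvidia’s algorithm: for a random overcomplete ReLU layer, every unit that fired yields one exact linear equation about the input, and with enough such equations the input is recovered by least squares.

```python
import numpy as np

rng = np.random.default_rng(0)

# One "internal" layer of a network: 32 hidden units reading an 8-d input.
d_in, d_hid = 8, 32
W = rng.normal(size=(d_hid, d_in))

x_true = rng.normal(size=d_in)       # the private input (think: an image)
a = np.maximum(W @ x_true, 0.0)      # ReLU activations an attacker intercepts

# Every unit that fired gives one exact equation: W[j] . x = a[j].
fired = a > 0
x_rec, *_ = np.linalg.lstsq(W[fired], a[fired], rcond=None)

print(np.linalg.norm(x_rec - x_true))  # reconstruction error: essentially zero
```

Inverting a real deep network is harder and is typically done with gradient-based optimization rather than a closed-form solve, but the principle is the same: intermediate activations can over-determine the input.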
As in Webster’s work, the recreated images closely resemble the real ones. “We were surprised by the final quality,” Kautz said.
The researchers believe this attack is not merely hypothetical. Smartphones and other small devices are starting to use more artificial intelligence. Because of battery and memory limitations, an AI model sometimes performs only part of its processing on the device itself and sends the half-processed data to the cloud for the final computation, an approach known as split computing. Most researchers assume that split computing will not reveal any private data from a person’s phone, Kautz said, because only the AI model’s internal data is shared. But his attack shows that this is not the case.
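To make the split-computing attack surface concrete, here is a hypothetical sketch: the device runs the first layers and ships the intermediate activations, not the raw input, to the cloud. The layer shapes and function names are illustrative only, not taken from any real deployment.

```python
import numpy as np

rng = np.random.default_rng(1)
W_device = rng.normal(size=(32, 8))  # layers that run on the phone
W_cloud = rng.normal(size=(4, 32))   # layers that run in the cloud

def device_half(x):
    """Runs locally; only its output ever leaves the device."""
    return np.maximum(W_device @ x, 0.0)

def cloud_half(a):
    """Runs remotely, on the intermediate activations."""
    return W_cloud @ a

x = rng.normal(size=8)   # private data, e.g. a photo on the phone
a = device_half(x)       # this tensor is what crosses the network
y = cloud_half(a)        # final prediction computed in the cloud
print(a.shape, y.shape)
```

The assumption Kautz questions is that `a` is safe to transmit; his attack reconstructs the original input from exactly this kind of intermediate data.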
Kautz and his colleagues are now working on ways to prevent models from leaking private data. We want to understand the risks so that we can minimize vulnerabilities, he said.
Although the two teams used very different techniques, he believes his work and Webster’s complement each other. Webster’s team showed that private data can be found in a model’s output; Kautz’s team showed that private data can be revealed by working in reverse, recreating the input. “Exploring both directions is important to better understand how to prevent attacks,” Kautz said.