Transfer learning is a method of training neural networks in which knowledge learned by a network on one task is transferred to another task. This two-stage method is widely used in computer vision for medical images.

Neural networks are used in many fields, including medicine. Google AI researchers examined how important and effective transfer learning actually is for medical problems. According to their results, models with randomly initialized weights perform as well as models pre-trained on ImageNet. In transfer learning, the neural network is trained in two stages:

First, pre-training: the network is trained on a large dataset with a wide variety of classes, such as ImageNet. Second, fine-tuning: the pre-trained model is retrained on the data of the target task.

Pre-training allows the neural network to reuse the knowledge learned in the first stage to solve the target task. In the context of transfer learning, standard architectures developed for ImageNet, together with their pre-trained weights, are retrained for medical tasks. Medical tasks in computer vision range from analyzing chest X-rays to recognizing eye diseases.
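To make the two stages concrete, here is a minimal PyTorch/torchvision sketch of this scheme. The ImageNet pre-training stage is replaced by simply loading pre-trained ResNet-50 weights, and the fine-tuning stage retrains the network on the target medical data. The `medical_loader` DataLoader and the five target classes are hypothetical placeholders, not details from the study.

```python
# Minimal sketch of the two-stage transfer learning scheme (PyTorch / torchvision).
# `medical_loader` and the number of target classes (5) are hypothetical placeholders.
import torch
import torch.nn as nn
from torchvision import models

# Stage 1: "pre-training" is already done for us -- load ResNet-50 weights learned on ImageNet.
model = models.resnet50(pretrained=True)

# Stage 2: fine-tuning -- replace the ImageNet head with one for the target task
# and continue training on the medical dataset.
model.fc = nn.Linear(model.fc.in_features, 5)  # e.g. 5 chest X-ray pathologies

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for images, labels in medical_loader:  # hypothetical DataLoader over the medical data
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```

Training from random initialization, the baseline the researchers compare against, differs only in the first step: `models.resnet50(pretrained=False)`.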

Despite the widespread use of this method on medical data, the effect of the transfer learning approach had not previously been investigated in detail. The researchers analyzed and evaluated the hidden representations of neural networks on several tasks from the medical field.

Prediction quality of pre-trained models


In the first experiment, the researchers studied how pre-training affects a model's prediction quality. They compared models with randomly initialized weights to models with weights pre-trained on ImageNet. Diagnosing diabetic retinopathy from retinal images and recognizing five diseases from chest X-rays were chosen as the test tasks.

The tested models included ResNet-50, Inception-v3, and simple convolutional neural networks with four or five convolution-batchnorm-ReLU layers.
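The text does not give the exact configuration of the small baseline, so the sketch below only illustrates its structure: a stack of convolution-batchnorm-ReLU blocks followed by global pooling and a linear classifier. The block widths and the number of classes are assumptions for illustration, not the values used in the study.

```python
# Sketch of the "small" baseline: stacked convolution-batchnorm-ReLU blocks (PyTorch).
# Widths and num_classes are illustrative assumptions.
import torch.nn as nn

def conv_bn_relu(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class SmallCNN(nn.Module):
    def __init__(self, num_classes=5, widths=(32, 64, 128, 256)):
        super().__init__()
        layers, in_ch = [], 3
        for out_ch in widths:  # 4 blocks here; pass 5 widths for the 5-block variant
            layers.append(conv_bn_relu(in_ch, out_ch))
            in_ch = out_ch
        self.features = nn.Sequential(*layers)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Linear(in_ch, num_classes)

    def forward(self, x):
        x = self.pool(self.features(x)).flatten(1)  # (N, C, 1, 1) -> (N, C)
        return self.classifier(x)
```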

Comparison results:


Pre-training does not significantly affect the quality of neural network predictions on medical problems, and the smaller models produce results similar to those of the standard ImageNet architectures. Moreover, because medical datasets are much smaller than ImageNet, pre-training can even hurt prediction quality on a medical task for large architectures with many parameters.

Analysis of hidden representations

The researchers then tested how the features learned by randomly initialized and pre-trained models differ. To do this, they compared the models' hidden representations. To make the comparison valid, they used singular vector canonical correlation analysis (SVCCA), which computes a similarity metric between hidden representations from different models. The results show that for the large models (ResNet-50 and Inception-v3), the hidden representations of randomly initialized models are more similar to each other than they are to those of the pre-trained models.
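SVCCA works by reducing each set of activations with an SVD and then running canonical correlation analysis on the reduced representations; the mean canonical correlation serves as the similarity score. The NumPy sketch below is a simplified version of that idea: `acts1` and `acts2` are activation matrices (datapoints x neurons) collected from the two models, and the variance threshold and preprocessing are assumptions rather than the study's exact settings.

```python
# Simplified SVCCA sketch: SVD-reduce each activation matrix, then compute
# canonical correlations between the reduced representations (NumPy).
import numpy as np

def svcca_similarity(acts1, acts2, keep_var=0.99):
    """Mean canonical correlation between SVD-reduced activations (datapoints x neurons)."""
    def svd_reduce(acts):
        acts = acts - acts.mean(axis=0, keepdims=True)        # center each neuron
        u, s, _ = np.linalg.svd(acts, full_matrices=False)
        k = np.searchsorted(np.cumsum(s**2) / np.sum(s**2), keep_var) + 1
        return u[:, :k] * s[:k]                                # keep top singular directions

    x, y = svd_reduce(acts1), svd_reduce(acts2)
    # Canonical correlations are the singular values of Qx^T Qy for orthonormal bases Qx, Qy.
    qx, _ = np.linalg.qr(x)
    qy, _ = np.linalg.qr(y)
    corrs = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return corrs.mean()                                        # higher => more similar representations
```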

The similarity of representations between neural networks: the higher the metric, the more similar the representations.