Two Stage Fine Tuning In Transfer Learning
Previously, I wrote an app that classifies objects using transfer learning, and I trained the model on the CIFAR-10 dataset myself.
Next, I moved on to a more structured training strategy: 2-stage fine-tuning, and I applied it consistently across all four backbones—MobileNet V2, MobileNet V3, ResNet-18, and ResNet-34.
Why 2-Stage Fine-tuning Makes Sense
The main reason for this approach was simple: with only 100 samples per class, fully unfreezing a pretrained network from the start is almost guaranteed to cause overfitting.
The lower layers of these models already contain very general visual knowledge—edges, corners, textures, and basic shapes—learned from massive datasets. Letting those layers change too early would erase useful information and hurt generalization. A staged approach lets the model adapt slowly and safely, keeping what already works while only adjusting what really needs to change.
Stage 1: Train the Top, Freeze the Rest
In Stage 1, I froze most of the network and only unfroze the final block and the classifier head. For the ResNet models, this meant unfreezing layer4 along with the fully connected (fc) layer.
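To make this concrete, here is a minimal sketch of the Stage 1 setup for ResNet-18 in PyTorch. The torchvision weight enum, the new 10-class head, and everything outside the freezing logic are assumptions about the setup, not the exact code from the app:

```python
import torch.nn as nn
from torchvision import models

# Stage 1 sketch for ResNet-18: freeze everything, then unfreeze only
# layer4 and a fresh 10-class classifier head for CIFAR-10.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 10)

for param in model.parameters():
    param.requires_grad = False          # freeze the whole backbone

for param in model.layer4.parameters():
    param.requires_grad = True           # final residual block stays trainable
for param in model.fc.parameters():
    param.requires_grad = True           # classifier head stays trainable
```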
For MobileNet V2 and V3, the idea was the same even though the internal structure is different: only the last high-level feature layers and the classifier were allowed to update. This stage is mainly about alignment. The classifier learns how to map pretrained features to CIFAR-10 classes, while the top layers slightly adjust to differences in image size, texture, and visual style between CIFAR-10 and ImageNet. Because only a small part of the network is trainable, training stays stable and overfitting is kept under control.
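The MobileNet side follows the same pattern with different attribute names. The sketch below uses torchvision's MobileNet V2, and treating features[-1] as the "last high-level feature layer" is my reading of how the idea maps onto that architecture:

```python
import torch.nn as nn
from torchvision import models

# Stage 1 sketch for MobileNet V2: same idea, different layer names.
model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.IMAGENET1K_V1)
model.classifier[-1] = nn.Linear(model.classifier[-1].in_features, 10)

for param in model.parameters():
    param.requires_grad = False          # freeze the backbone

for param in model.features[-1].parameters():
    param.requires_grad = True           # last high-level feature block
for param in model.classifier.parameters():
    param.requires_grad = True           # classifier head
```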
Stage 2: Carefully Unfreezing Deeper Layers
After Stage 1 converged and validation accuracy started to level off, I moved on to Stage 2.
Here, I unfroze one additional block, layer3 in the ResNet models. This part of the network sits in a middle zone: it’s not learning simple edges anymore, but it’s also not fully specialized for ImageNet classes.
By unfreezing it, I allowed the model to gently reshape its mid-level features, such as object parts, textures, and class-specific patterns, so they better matched CIFAR-10. Importantly, the lower layers remained frozen, which helped preserve strong general-purpose features.
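Continuing the ResNet sketch from Stage 1, Stage 2 only needs to flip layer3 back on; the check at the end is just a convenient way to confirm which blocks are trainable:

```python
# Stage 2 sketch for the ResNet models: additionally unfreeze layer3.
# The stem, layer1, and layer2 stay frozen to preserve general features.
for param in model.layer3.parameters():
    param.requires_grad = True

# Sanity check: which top-level blocks now have trainable parameters?
trainable = {name.split(".")[0] for name, p in model.named_parameters() if p.requires_grad}
print(sorted(trainable))  # expected: ['fc', 'layer3', 'layer4']
```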
Why This Strategy Worked So Well
From a technical standpoint, staged fine-tuning also made it easier to control learning rates. Newly unfrozen layers could learn faster, while earlier layers were either frozen or updated very slowly. This reduced noisy updates and helped the model converge smoothly instead of forgetting what it had already learned.
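One way to express that kind of control in PyTorch is with optimizer parameter groups, where each unfrozen block gets its own learning rate and frozen parameters are simply not passed in. The optimizer choice and the rates below are illustrative placeholders, not the values used in these experiments:

```python
import torch

# Per-group learning rates for Stage 2 on the ResNet sketch above.
# Frozen blocks (stem, layer1, layer2) are excluded entirely, so they
# receive no updates; the rates are placeholders and their ratios are
# a tuning choice, not something fixed by the method.
optimizer = torch.optim.SGD(
    [
        {"params": model.layer3.parameters(), "lr": 1e-4},
        {"params": model.layer4.parameters(), "lr": 1e-3},
        {"params": model.fc.parameters(), "lr": 1e-3},
    ],
    momentum=0.9,
)
```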
Across MobileNet V2, MobileNet V3, ResNet-18, and ResNet-34, this 2-stage approach consistently performed better than unfreezing everything at once. It struck a clean balance between flexibility and restraint, which turned out to be critical when training with such a small dataset.
Any comments? Feel free to join the discussion in the Facebook comment section below.