Fine-tune a pretrained model in PyTorch with a deliberate layer-freezing strategy

Produces a transfer-learning script that swaps the right head, freezes the right layers, and uses distinct learning rates so you adapt a backbone instead of nuking its pretrained weights.


Prompt

You are a senior ML engineer who fine-tunes pretrained models without destroying what pretraining bought.

Goal: adapt a pretrained backbone for a new task.
- Backbone: [e.g. 'resnet50 from torchvision', 'vit-base-patch16 from timm', 'a Hugging Face checkpoint']
- New task and head: [e.g. '10-class image classification', 'regression on a single scalar', 'token-level tagging']
- Target dataset size and domain gap: [N LABELED SAMPLES — e.g. '2k images, same domain as pretraining' / '500 rows, quite different domain']
- Runtime: [single GPU / CPU]. PyTorch [2.x].

Produce a single runnable finetune.py with:
1. Backbone loading that visibly freezes the right layers by default: print which params are frozen vs trainable and their parameter counts. Make the freeze policy a named strategy (e.g. 'freeze-all-then-head', 'freeze-bn-only', 'unfreeze-last-block') chosen via a flag.
2. A replacement head matching the new task, initialized sensibly (no large random weights). Preserve the backbone's expected preprocessing and normalization.
3. Distinct learning rates: a higher learning rate for the new head, a lower one for any unfrozen backbone params, implemented as optimizer parameter groups. No single flat learning rate across everything.
4. A train and eval loop with running loss and the right metric for the task; zero gradient leaks in eval (model.eval() plus torch.no_grad()); tqdm progress.
5. Early stopping on the validation metric with patience [N] and best-weight checkpointing to disk.
6. An argparse CLI exposing backbone, freeze strategy, both learning rates, batch size, epochs, and seed. Fixed seed for reproducibility.

Guardrail: this is an engineering scaffold, not research advice. Do not claim a specific accuracy or SOTA result. Flag plainly that small datasets can overfit and that the freeze strategy is a starting point to validate, not a guaranteed answer.

Output the full script in one fenced block, then a 4-item runbook: how to pick the freeze strategy from dataset size, the two learning rates to try first, how to confirm the head is learning while the backbone stays stable, and how to read the frozen-vs-trainable counts.

Success signal: the output is good only if it uses separate parameter groups with different learning rates, prints the frozen-vs-trainable split explicitly, and eval has no gradient leaks.
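For reference, the three success signals above can be checked on a toy model. This is a minimal sketch, not the script the prompt produces: `ToyNet`, `apply_freeze`, `count_params`, `build_optimizer`, and `evaluate` are hypothetical names, the two-block `nn.Sequential` stands in for a real pretrained backbone, and AdamW with these two learning rates is an assumed choice for illustration only.

```python
import torch
from torch import nn


class ToyNet(nn.Module):
    """Hypothetical stand-in for a pretrained backbone plus a new task head."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(32, 64), nn.ReLU(),  # "early" block
            nn.Linear(64, 64), nn.ReLU(),  # "last" block
        )
        self.head = nn.Linear(64, num_classes)
        # Small-variance init so the new head does not start with huge logits.
        nn.init.normal_(self.head.weight, std=0.01)
        nn.init.zeros_(self.head.bias)

    def forward(self, x):
        return self.head(self.backbone(x))


def apply_freeze(model, strategy):
    """Freeze the whole backbone, then optionally unfreeze its last block."""
    for p in model.backbone.parameters():
        p.requires_grad = False
    if strategy == "unfreeze-last-block":
        for p in model.backbone[2].parameters():  # index 2 is the second Linear
            p.requires_grad = True
    elif strategy != "freeze-all-then-head":
        raise ValueError(f"unknown freeze strategy: {strategy}")


def count_params(model):
    """Return (frozen, trainable) parameter counts."""
    frozen = sum(p.numel() for p in model.parameters() if not p.requires_grad)
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return frozen, trainable


def build_optimizer(model, head_lr, backbone_lr):
    """One parameter group for the head, one for any unfrozen backbone params."""
    groups = [{"params": list(model.head.parameters()), "lr": head_lr}]
    backbone_params = [p for p in model.backbone.parameters() if p.requires_grad]
    if backbone_params:
        groups.append({"params": backbone_params, "lr": backbone_lr})
    return torch.optim.AdamW(groups)


def evaluate(model, inputs):
    """Forward pass in eval mode with gradient tracking disabled."""
    model.eval()
    with torch.no_grad():
        return model(inputs)


torch.manual_seed(0)
model = ToyNet()
apply_freeze(model, "unfreeze-last-block")
frozen, trainable = count_params(model)
print(f"frozen={frozen:,} trainable={trainable:,}")

optimizer = build_optimizer(model, head_lr=1e-3, backbone_lr=1e-5)
for i, group in enumerate(optimizer.param_groups):
    print(f"group {i}: lr={group['lr']}")

logits = evaluate(model, torch.randn(4, 32))
print("eval output tracks gradients:", logits.requires_grad)
```

With a real backbone the block index would differ (for example, the last stage of a ResNet), so the `model.backbone[2]` lookup is specific to this toy layout.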

Use case

Use when you have a pretrained backbone and a small target dataset and want a principled fine-tune instead of unfreezing everything at one learning rate.

When to use this

After choosing a backbone and gathering a labeled target set. Specify dataset size and whether it is close to or far from the pretraining domain.

Follow-up prompts

Add a two-stage schedule: train the head with the backbone frozen, then unfreeze the backbone and continue at a lower learning rate.
Swap the backbone for a timm model and expose a single flag to change it.
Add per-layer learning-rate scaling with parameter groups and log which layers learn fastest.
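As a rough illustration of the per-layer scaling follow-up, one common scheme (an assumption here, not something the prompt prescribes) is geometric decay: the block nearest the output gets the base learning rate, and each earlier block gets a fixed fraction of the next one's rate. The helper name `layerwise_lrs`, the block names, and the decay value below are hypothetical.

```python
def layerwise_lrs(base_lr: float, decay: float, num_blocks: int) -> list[float]:
    """Return one learning rate per block, ordered from input side to output side.

    The last block gets base_lr; each earlier block is scaled by `decay`
    once more than the block after it.
    """
    if not 0.0 < decay <= 1.0:
        raise ValueError("decay must be in (0, 1]")
    return [base_lr * decay ** (num_blocks - 1 - i) for i in range(num_blocks)]


# Hypothetical block names, ordered from earliest backbone block to the head.
block_names = ["block1", "block2", "block3", "head"]
lrs = layerwise_lrs(base_lr=1e-3, decay=0.5, num_blocks=len(block_names))

for name, lr in zip(block_names, lrs):
    print(f"{name}: lr={lr:.2e}")
```

Each value would then be passed as the `lr` of the optimizer parameter group holding that block's parameters; a decay of 1.0 reduces to a single flat learning rate.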

#pytorch #transfer-learning #fine-tuning #machine-learning #python

Source: promptfork seed
License: CC-BY-4.0
Published: 6/22/2026
