A Practical Guide to Fine-Tuning Language Models

This post is a hands-on tutorial that walks you through fine-tuning a pre-trained language model for a custom text classification task, making powerful AI accessible for your specific needs.

Leeds AI Society · Updated: 3 Dec 2025

The "Why": The Power of Transfer Learning

Fine-tuning is a form of transfer learning. The core idea is simple and powerful: a model that has been pre-trained on a massive, general dataset (like Wikipedia and large web crawls) has already learned the fundamentals of language—grammar, syntax, and a vast amount of world knowledge.

Instead of training a new model from scratch, which is computationally expensive, we can take this pre-trained model and train it a little more on our own smaller, specific dataset. This "fine-tuning" process adjusts the model's weights to make it an expert on our particular task, saving an immense amount of time and resources.
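The idea can be sketched in plain PyTorch. Note this uses a hypothetical miniature backbone and head, not DistilBERT itself; it only illustrates the principle of reusing learned weights:

```python
import torch
from torch import nn

# A hypothetical tiny "pre-trained" backbone and a fresh task head,
# illustrating the transfer-learning idea: reuse the learned weights,
# and train only what is new (or everything, at a small learning rate,
# as full fine-tuning does).
backbone = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 32))
head = nn.Linear(32, 4)  # a new head for our 4 news categories

# Feature extraction: freeze the backbone so only the head trains
for param in backbone.parameters():
    param.requires_grad = False

# The optimiser only receives the head's (trainable) parameters
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)

trainable = sum(p.numel() for p in head.parameters() if p.requires_grad)
frozen = sum(p.numel() for p in backbone.parameters() if not p.requires_grad)
print(f"trainable params: {trainable}, frozen params: {frozen}")
```

Full fine-tuning, which is what we do below with the Hugging Face Trainer, simply leaves all weights trainable and relies on a small learning rate to nudge them rather than overwrite them.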

Setting Up Your Local Environment

This guide assumes you will be running the code locally. First, you need Python installed. You can check by opening a terminal (Command Prompt or PowerShell on Windows, Terminal on macOS/Linux) and running python --version or python3 --version.

Next, create a dedicated project folder and set up a virtual environment to keep your dependencies clean.

1. Create a Project Directory:

bash
mkdir my-finetuning-project
cd my-finetuning-project

2. Create a Virtual Environment:

On Windows:

bash
python -m venv .venv

On macOS / Linux:

bash
python3 -m venv .venv

3. Activate the Virtual Environment:

On Windows (PowerShell):

powershell
.venv\Scripts\Activate.ps1

On Windows (Command Prompt):

cmd
.venv\Scripts\activate.bat

On macOS / Linux:

bash
source .venv/bin/activate

Your terminal prompt should now be prefixed with (.venv).

4. Install Libraries:

With your virtual environment active, install the necessary libraries.

bash
pip install transformers datasets torch

The "How": Fine-Tuning for News Classification

Our task will be to fine-tune a model to classify news headlines into one of four categories: World, Sports, Business, and Sci/Tech. We'll use the ag_news dataset for this.

Create a new Python file (e.g., finetune.py) and add the following code blocks.

1. Load Data and Tokenizer

First, we load the ag_news dataset and the tokenizer that corresponds to our base model, distilbert-base-uncased.

python
from datasets import load_dataset
from transformers import AutoTokenizer

# Load the dataset
dataset = load_dataset("ag_news")

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

# Create a preprocessing function
def preprocess_function(examples):
    return tokenizer(examples["text"], truncation=True, padding="max_length")

# Apply the tokenizer to the whole dataset
tokenized_datasets = dataset.map(preprocess_function, batched=True)
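If you're curious what the tokenizer actually produces, here's an illustrative toy version. This uses a hypothetical word-level vocabulary, not DistilBERT's real WordPiece one, purely to show the shape of the output: input_ids plus an attention_mask, padded to a fixed length just as padding="max_length" does:

```python
# Illustrative only: a toy tokenizer with a made-up word-level vocab.
# Real tokenizers use a learned subword vocabulary, but the output
# structure is the same.
toy_vocab = {"[PAD]": 0, "[CLS]": 101, "[SEP]": 102,
             "stocks": 7, "rally": 8, "on": 9, "earnings": 10}

def toy_tokenize(text, max_length=8):
    # Wrap the token ids in [CLS] ... [SEP], as BERT-style models expect
    ids = [101] + [toy_vocab[w] for w in text.lower().split()] + [102]
    mask = [1] * len(ids)  # 1 = real token, 0 = padding
    while len(ids) < max_length:
        ids.append(0)
        mask.append(0)
    return {"input_ids": ids, "attention_mask": mask}

print(toy_tokenize("Stocks rally on earnings"))
# {'input_ids': [101, 7, 8, 9, 10, 102, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 0, 0]}
```

The attention_mask tells the model which positions are real tokens and which are padding to ignore.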

2. Load the Model

We load the distilbert-base-uncased model with a classification head configured for our 4 labels.

python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=4)

3. Fine-Tune the Model

The Hugging Face Trainer class simplifies the training process. We'll use a small subset of the data for a quick demonstration.

python
from transformers import TrainingArguments, Trainer

# Select smaller subsets for a quick training run
small_train_dataset = tokenized_datasets["train"].shuffle(seed=42).select(range(1000))
small_eval_dataset = tokenized_datasets["test"].shuffle(seed=42).select(range(1000))

# Define training arguments
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=2,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    weight_decay=0.01,
    evaluation_strategy="epoch",  # renamed to eval_strategy in newer transformers releases
)

# Create the Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=small_train_dataset,
    eval_dataset=small_eval_dataset,
)

# Start fine-tuning
trainer.train()
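One thing the Trainer above doesn't do is report accuracy—by default it only logs loss during evaluation. A minimal compute_metrics function you could pass in via Trainer(..., compute_metrics=compute_metrics) looks like this, sketched with fake logits so it runs standalone:

```python
import numpy as np

# A compute_metrics function for Trainer: it receives (logits, labels)
# for the whole eval set and returns a dict of named metrics.
def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)  # highest-scoring class per example
    return {"accuracy": float((predictions == labels).mean())}

# Tiny worked example: fake logits for 3 examples over 4 classes
logits = np.array([[0.1, 2.0, 0.3, 0.1],   # predicted class 1
                   [1.5, 0.2, 0.1, 0.4],   # predicted class 0
                   [0.1, 0.2, 0.3, 3.0]])  # predicted class 3
labels = np.array([1, 0, 2])
print(compute_metrics((logits, labels)))  # accuracy = 2/3
```

With this wired into the Trainer, each evaluation pass reports accuracy alongside the loss.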

4. Test Your Fine-Tuned Model

Now for the exciting part. Let's use our new, specialised model to classify some headlines.

python
from transformers import pipeline

# The model in the trainer is our fine-tuned model
# You can also save it with trainer.save_model("my_news_classifier")
classifier = pipeline("text-classification", model=trainer.model, tokenizer=tokenizer)

# Create a mapping from label ID to label name
label_map = {0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"}

# Test with some headlines
headlines = [
    "Manchester United wins the FA Cup.",
    "New study shows promise for renewable energy.",
    "Global markets react to latest inflation data."
]

for text in headlines:
    prediction = classifier(text)
    label_id = int(prediction[0]['label'].split('_')[1])
    print(f"Headline: '{text}' -> Prediction: {label_map[label_id]}")

You should see the model classify most, if not all, of these headlines correctly—something the base model, with its randomly initialised classification head, could not do before fine-tuning.

Conclusion

You've just successfully fine-tuned a powerful language model for a custom task. This process—taking a pre-trained foundation and moulding it to your specific needs—is a cornerstone of modern applied AI. From here, you can explore different models, datasets, and more advanced techniques like Parameter-Efficient Fine-Tuning (PEFT).

This topic was the focus of our recent 'Introduction to Fine-Tuning' workshop at the Leeds AI Society.