Getting Started with Local AI: An Introduction to Ollama CLI and API
Explore Ollama: a comprehensive guide detailing installation, fundamental CLI commands, API usage, and a practical API demonstration for building chatbots, emphasizing local AI's capabilities and privacy.
Getting started with Ollama
What is Ollama?
Ollama is a lightweight, extensible framework that dramatically simplifies the process of downloading, setting up, and running LLMs on your local machine. It bundles model weights, configurations, and data into a single package, managed by a Modelfile. With Ollama, you can be up and running with open-source models like Llama, Mistral, and Phi in a matter of minutes (depending on your download speed, but you only need to download once).
Installation
Ollama provides a simple, one-click installation for macOS and Windows, and a single command for Linux.
On macOS and Windows, download the installer from the official Ollama website: https://ollama.com/download.
On Linux, run the following command in your terminal:
curl -fsSL https://ollama.com/install.sh | sh
Once installed, Ollama runs in the background, ready to receive commands.
Ollama CLI
The Ollama Command Line Interface (CLI) is the primary way to interact with the framework.
Accessing the CLI
After installation, you can interact with Ollama through your system's command line interface.
On Windows, open Command Prompt or PowerShell. You can find either by searching for it in the Start Menu.
On macOS, open the Terminal app. You can find it by searching for it with Spotlight.
If you're using Linux, you should already know how to open your terminal. You can also enable and start Ollama as a systemd service for automatic startup and background operation:
sudo systemctl enable ollama --now
Or start the service only when you want to with:
sudo systemctl start ollama
Once you have your terminal open, you can start using the ollama commands. Here are the essential commands to get you started.
Pulling a Model
Before you can use a model, you need to download it from the Ollama library. Let's pull phi, a good small model for getting started.
ollama pull phi
You can find a list of available models on the official Ollama website: https://ollama.com/search
Running a Model
To start a chat session with your downloaded model, use the run command.
ollama run phi
You can now chat with the model directly in your terminal. To exit, type /bye.
Listing Models
To see all the models you have downloaded, use the `list` command (or its shorthand, `ls`).
ollama ls
This will show you a table of your local models, their size, and when they were last modified.
Ollama API
While the interactive chat is useful, the real power of Ollama comes from its ability to be integrated into other applications. Ollama provides a local REST API that runs automatically on port 11434.
We can use a simple curl command to send a request to the API for a one-off text generation task.
A Note on `curl` for different Operating Systems
On Linux and macOS, the following command should work as is in your terminal.
On Windows, you may need to escape the double quotes within the JSON payload, or save the JSON to a file and pass it to curl instead.
curl http://localhost:11434/api/generate -d '{
"model": "phi",
"prompt": "What is the capital of the United Kingdom?",
"stream": false
}'
This command sends a request to the /api/generate endpoint with a simple prompt. stream: false tells Ollama to wait for the full response before returning it. You will see a JSON response containing the generated text.
This is interesting, because if we can send an API request from our terminal, we can do the same from our applications! (We will see an example below.)
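To illustrate, here is a minimal Python sketch (standard library only) that sends the same request to the local /api/generate endpoint. The function names are my own; the endpoint, payload fields, and model name mirror the curl example above.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model: str, prompt: str, stream: bool = False) -> bytes:
    """Serialise the request body exactly as in the curl example."""
    return json.dumps(
        {"model": model, "prompt": prompt, "stream": stream}
    ).encode("utf-8")

def generate(model: str, prompt: str) -> str:
    """Send a one-off generation request to the local Ollama server."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # The non-streaming response is a single JSON object whose
        # "response" field holds the generated text.
        return json.loads(resp.read())["response"]

# Usage (requires Ollama running locally with phi pulled):
# print(generate("phi", "What is the capital of the United Kingdom?"))
```

Because the API is just HTTP and JSON, the same pattern works from any language or framework you already use.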
Customising Models with Modelfile
One of Ollama's most significant capabilities is letting you customise and create your own models using a Modelfile. A Modelfile is essentially a blueprint that defines how a model should be built or modified, allowing for significant flexibility.
Modelfile Basics
A Modelfile is a simple text file, similar to a Dockerfile, that uses a set of instructions to define your model. Here are some common instructions:
FROM: Defines the base model to use when creating a model.
PARAMETER: Sets model parameters like temperature, top_k, top_p, etc.
MESSAGE: Allows you to specify a message history for the model to use when responding. Use multiple MESSAGE instructions to build up a conversation that will guide the model to answer in a similar way.
SYSTEM: Specifies the system message to be used in the template, if applicable.
ADAPTER: (Advanced) Specifies a LoRA adapter to apply to the base model.
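To make this concrete, here is a minimal Modelfile sketch. The base model matches the phi model pulled earlier; the parameter value and system message are illustrative choices, not requirements.

```
# Build on the phi model pulled earlier
FROM phi

# Lower temperature for more focused, deterministic answers
PARAMETER temperature 0.3

# Give the model a persistent persona
SYSTEM You are a concise assistant who always answers in British English.
```

You can then build and run the customised model with ollama create my-phi -f Modelfile, followed by ollama run my-phi.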
I'll be delivering an interactive session at the Leeds Artificial Intelligence Society soon, where we’ll get to explore many of these topics hands-on.