Getting Started with Local AI: An Introduction to Ollama CLI and API
Explore Ollama: a comprehensive guide detailing installation, fundamental CLI commands, API usage, and a practical API demonstration for building chatbots, emphasizing local AI's capabilities and privacy.
Getting started with Ollama
What is Ollama?
Ollama is a lightweight, extensible framework that dramatically simplifies the process of downloading, setting up, and running LLMs on your local machine. It bundles model weights, configurations, and data into a single package, managed by a Modelfile. With Ollama, you can be up and running with open-source models like Llama, Mistral, and Phi in a matter of minutes (depending on your download speed, but you only need to download once).
Installation
Ollama provides a simple, one-click installation for macOS and Windows, and a single command for Linux.
```bash
curl -fsSL https://ollama.com/install.sh | sh
```
Once installed, Ollama will run in the background, ready to receive commands.
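To confirm the install succeeded, you can ask the CLI for its version (the exact output depends on the release you installed):
```bash
# Print the installed Ollama version
ollama --version
```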
Ollama CLI
Accessing the CLI
If you're using Linux, you should already know how to open your terminal. You can enable and start Ollama as a systemd service for automatic startup and background operation.
```bash
sudo systemctl enable ollama --now
```
Or only start the service when you want to with:
```bash
sudo systemctl start ollama
```
Once you have your terminal open, you can start using the ollama commands. Here are the essential commands to get you started.
Pulling a Model
To download a model, use the pull command. We'll use phi, a good small model for getting started.
```bash
ollama pull phi
```
You can find a list of available models on the official Ollama website: https://ollama.com/search
Running a Model
To start an interactive chat with a model, use the run command.
```bash
ollama run phi
```
You can now chat with the model directly in your terminal. To exit, type /bye.
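Besides the interactive chat, run also accepts a prompt as an argument, which is handy for quick one-off questions or scripts (the prompt here is just an example):
```bash
# Ask a single question without entering the interactive session
ollama run phi "Explain in one sentence what a large language model is."
```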
Listing Models
```bash
ollama ls
```
This will show you a table of your local models, their size, and when they were last updated.
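One related command worth knowing, although not strictly needed to get started, is ollama show, which prints details about a local model. With the --modelfile flag it shows the Modelfile the model was built from, which is useful context for the Modelfile section below:
```bash
# Inspect a local model and print the Modelfile it was built from
ollama show phi --modelfile
```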
Ollama API
By default, the Ollama server listens locally on port 11434. We can use a simple curl command to send a request to the API for a one-off text generation task.
A Note on `curl` for different Operating Systems
On macOS and Linux, curl is usually available out of the box. In Windows PowerShell, curl may resolve to an alias for Invoke-WebRequest, so the example below may not work as written; use curl.exe instead, or run the command from WSL or Git Bash.
```bash
curl http://localhost:11434/api/generate -d '{
  "model": "phi",
  "prompt": "What is the capital of the United Kingdom?",
  "stream": false
}'
```
This command sends a request to the /api/generate endpoint with a simple prompt. Setting "stream": false tells Ollama to wait for the full response before returning it. You will see a JSON response containing the generated text.
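If you have jq installed, you can pipe the response through it to pull out just the generated text (the -s flag simply silences curl's progress output):
```bash
# Same request as above, but extract only the "response" field from the JSON
curl -s http://localhost:11434/api/generate -d '{
  "model": "phi",
  "prompt": "What is the capital of the United Kingdom?",
  "stream": false
}' | jq -r .response
```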
This is interesting because, if we can send an API request from our terminal, we can do the same from our applications! (We will see an example below.)
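Before we get there, it's worth knowing that alongside /api/generate, Ollama also exposes a /api/chat endpoint that accepts a message history, which is the natural building block for a chatbot. A minimal sketch (the conversation content is just illustrative):
```bash
# Send a short conversation to the chat endpoint; the reply continues the dialogue
curl http://localhost:11434/api/chat -d '{
  "model": "phi",
  "messages": [
    {"role": "user", "content": "What is the capital of the United Kingdom?"},
    {"role": "assistant", "content": "The capital of the United Kingdom is London."},
    {"role": "user", "content": "Roughly how many people live there?"}
  ],
  "stream": false
}'
```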
Customising Models with Modelfile
One of Ollama's most powerful features is the Modelfile. A Modelfile is essentially a blueprint that defines how a model should be built or modified, allowing for significant flexibility.
Modelfile Basics
A Modelfile is a simple text file, similar to a Dockerfile, that uses a set of instructions to define your model. Here are some common instructions:
- FROM: Defines the base model to use when creating a model.
- PARAMETER: Sets model parameters like temperature, top_k, top_p, etc.
- MESSAGE: Allows you to specify a message history for the model to use when responding. Use multiple MESSAGE instructions to build up a conversation which will guide the model to answer in a similar way.
- SYSTEM: Specifies the system message to be used in the template, if applicable.
- ADAPTER: (Advanced) Specifies LoRA adapters for fine-tuning.
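As a rough sketch (the model name phi-concise and the system prompt are purely illustrative), here is a minimal Modelfile that builds on phi, written out with a heredoc and then built with ollama create:
```bash
# Write an illustrative Modelfile that customises phi
cat > Modelfile <<'EOF'
FROM phi
PARAMETER temperature 0.7
SYSTEM """You are a concise assistant that always answers in British English."""
EOF

# Build a new local model from the Modelfile, then try it out
ollama create phi-concise -f Modelfile
ollama run phi-concise
```
Once created, the new model behaves like any other local model: it shows up in ollama ls and can be run with ollama run.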
I'll be delivering an interactive session at the Leeds Artificial Intelligence Society soon, where we’ll get to explore many of these topics hands-on.