
These are my notes on setting up llama.cpp on Ubuntu 24.04. To generate these notes, I started with two fresh installations of Ubuntu 24.04.1 on identical hardware (Dell laptops). I worked through issues on one laptop, then tested on the other to refine my notes. The goal was a repeatable process for setting up and testing llama.cpp on other Ubuntu 24.04 hosts.
Host configuration
sudo apt update
sudo apt install git cmake g++ python3 python3-venv plocate gedit openssh-server -y
sudo systemctl enable ssh
sudo systemctl start ssh
sudo ufw allow ssh
sudo ufw enable
git config --global credential.helper store
Notes: git is needed to download the llama.cpp repo. cmake and g++ are needed to build the llama.cpp library. The Python packages are dependencies for the Hugging Face CLI, which is used to download the LLMs used by llama.cpp. plocate is installed for updatedb, which is used for file searching. gedit is installed as a more familiar simple text editor. openssh-server is installed to allow remote access.
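Optional sanity check: the following should confirm the SSH service is running and the firewall rule took effect (expect "active" from the first command and an ALLOW rule for ssh from the second):
systemctl is-active ssh
sudo ufw status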
Llama.cpp download & installation
mkdir -p ~/github.com && cd ~/github.com
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir -p build && cd build
cmake ..
make
sudo make install
sudo ldconfig
Notes: ldconfig did not need to be run on the initial host setup but was needed on the test host. Without running this command, I got an error later that the llama library could not be found. It's likely that the initial host was rebooted between the library being installed and it being referenced later. I've run into errors like this on Red Hat (with which I am more familiar), but never with common library directories. This might be a nuance of the /usr/local/lib directory being used for the first time on a fresh system installation, but I did not investigate further.
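If in doubt, the linker cache can be checked directly (this assumes the default /usr/local install prefix used by make install above):
ldconfig -p | grep -E 'libllama|libggml'
Notes: if this prints nothing, the libraries are not in the cache yet and sudo ldconfig should be re-run.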
Setup a temporary project space for testing
mkdir -p ~/tmp/llamacpptest && cd ~/tmp/llamacpptest
Create & activate Python virtual environment
python3 -m venv venv
source venv/bin/activate
Notes: the first command only needs to be run once. The second command is run whenever the environment needs to be activated. This is just a test space to help verify that llama.cpp is installed and runs correctly (with the intention of setting up more interesting projects later).
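For later sessions, re-entering and leaving the environment looks like this:
cd ~/tmp/llamacpptest
source venv/bin/activate
deactivate
Notes: deactivate just returns the shell to the system Python; nothing in the venv is removed.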
Install module dependencies
pip install 'huggingface_hub[cli]' llama-stack torch numpy transformers sentencepiece protobuf tokenizers
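A quick way to confirm the CLI installed into the venv rather than a system location (the quotes around huggingface_hub[cli] above just keep shells such as zsh from treating the brackets as a glob pattern):
which huggingface-cli
Notes: this should print a path under ~/tmp/llamacpptest/venv/bin.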
Log into HuggingFace
huggingface-cli login
Notes: this will prompt you for a Hugging Face access token.
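For unattended setups, the token can also be passed non-interactively with the --token flag; <your-token> below is a placeholder for an access token created on huggingface.co:
huggingface-cli login --token <your-token>
Notes: be aware the token will end up in shell history when passed this way.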
Tiny Llama download
huggingface-cli download TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf --local-dir .
Notes: tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf is a 638 MB download, much smaller than the LLMs that will likely be used for more interesting and capable testing and projects. Tiny Llama is used here mainly for quick environment setup and verification.
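To confirm the download completed, check the file size against the figure above:
ls -lh tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf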
Tiny Llama test
llama-cli -m tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf -c 2048 -n 128 --repeat_penalty 1.2 --interactive
Notes: this uses llama-cli, which was installed along with the llama library earlier. This just helps verify that at least one of the installed tools runs and can use the downloaded LLM.
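A non-interactive variant is handy for scripted smoke tests; -p supplies a one-shot prompt instead of dropping into interactive mode:
llama-cli -m tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf -n 64 -p "Once upon a time"
Notes: the model should print a short continuation and exit.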
Compilation test
cp ~/github.com/llama.cpp/examples/simple/simple.cpp llamacpptest.cpp
g++ llamacpptest.cpp -o llamacpptest -I. -L. -Wl,--copy-dt-needed-entries -lggml -lllama -std=c++17
./llamacpptest -m tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf once upon a time
Notes: this just copies the example source code from simple.cpp and performs a quick compile/link/run test. If this runs without error, then you should be able to build and run more complex llama.cpp examples and custom projects as well.
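As an untested alternative, recent llama.cpp sources ship a pkg-config file (llama.pc), so the compiler and linker flags could be queried rather than hard-coded, assuming /usr/local/lib/pkgconfig is on the pkg-config search path:
g++ llamacpptest.cpp -o llamacpptest $(pkg-config --cflags --libs llama) -std=c++17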
Another LLM download example
huggingface-cli download meta-llama/Llama-2-7b-hf --cache-dir models/raw/llama-2-7b-hf
python3 ~/github.com/llama.cpp/convert_hf_to_gguf.py models/raw/llama-2-7b-hf/models--meta-llama--Llama-2-7b-hf/snapshots/01c7f73d771dfac7d292323805ebc428287df4f9/ --outfile llama-2-7b.gguf
llama-cli -m llama-2-7b.gguf -c 2048 -n 128 --repeat_penalty 1.2 --interactive
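Two caveats on this example: meta-llama/Llama-2-7b-hf is a gated repo, so access must be granted on huggingface.co before the download will work, and convert_hf_to_gguf.py has its own Python dependencies, which the llama.cpp repo pins in a requirements file:
pip install -r ~/github.com/llama.cpp/requirements.txt
Notes: run this inside the activated venv before invoking the conversion script.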
DB update for file searching
sudo updatedb
Notes: this rebuilds the file index so that everything created above can be found with the locate command.
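For example, once the index is rebuilt, the converted model from the previous section can be found with:
locate llama-2-7b.gguf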