Micro WakeWord Training Playbook Ubuntu 24.04 Cuda

Specs used for Benchmarks (both using a 1 Gbit/s connection)

The times below include downloading dependencies, 500 samples, normalisation, audio subset presets, negative datasets and the training itself:

Ryzen 9 7945HX 16C/32T

64GB DDR5 / RTX 3090 FE (24GB GDDR6X VRAM)

Training Time: 2 hours


AMD EPYC 9575F 64C/128T

192GB DDR5 ECC / Nvidia L40S (48GB GDDR6 VRAM)

Training Time: 26 minutes

CUDA Toolkit 13.1

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda-toolkit-13-1
sudo apt-get install -y cuda-drivers
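
After the driver packages install, a reboot is normally needed before the new kernel module loads. The check below is a minimal sketch for confirming that the driver and toolkit are visible afterwards. The /usr/local/cuda path is an assumption based on the default location used by NVIDIA's packages, and gpu_status and nvcc_status are illustrative variable names only.

```shell
# Run after rebooting so the newly installed driver is loaded.
if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
    gpu_status="ok"
    # Print the GPU name, driver version and total VRAM as CSV.
    nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv
else
    gpu_status="missing"
    echo "nvidia-smi failed: driver not installed or reboot pending"
fi

# nvcc is assumed to live under /usr/local/cuda/bin, which may not be on PATH.
nvcc_path="/usr/local/cuda/bin/nvcc"
if [ -x "$nvcc_path" ]; then
    nvcc_status="ok"
    "$nvcc_path" --version
else
    nvcc_status="missing"
    echo "nvcc not found at $nvcc_path"
fi
```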

Install Docker

sudo apt install curl software-properties-common ca-certificates apt-transport-https -y

wget -O- https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor | sudo tee /etc/apt/keyrings/docker.gpg > /dev/null

echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu noble stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

sudo apt update

apt-cache policy docker-ce

sudo apt install docker-ce -y

sudo systemctl status docker

sudo apt-get install -y docker-compose-plugin

sudo apt-get install -y git
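
The docker pull and docker run commands later in this playbook are run without sudo, which only works if the current user can reach the Docker socket. The snippet below is a small sketch that checks for docker group membership and prints the usual fix if it is missing; in_docker_group and current_user are illustrative names.

```shell
# Check whether the current user belongs to the "docker" group.
current_user="$(id -un)"
if id -nG "$current_user" | tr ' ' '\n' | grep -qx docker; then
    in_docker_group="yes"
    echo "$current_user is in the docker group"
else
    in_docker_group="no"
    # Adding the user to the group takes effect at the next login.
    echo "Run: sudo usermod -aG docker $current_user"
    echo "Then log out and back in (or prefix docker commands with sudo)."
fi
```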

Installing the NVIDIA Container Toolkit

sudo apt-get update && sudo apt-get install -y --no-install-recommends \
   curl \
   gnupg2

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update

export NVIDIA_CONTAINER_TOOLKIT_VERSION=1.18.1-1

sudo apt-get install -y \
    nvidia-container-toolkit=${NVIDIA_CONTAINER_TOOLKIT_VERSION} \
    nvidia-container-toolkit-base=${NVIDIA_CONTAINER_TOOLKIT_VERSION} \
    libnvidia-container-tools=${NVIDIA_CONTAINER_TOOLKIT_VERSION} \
    libnvidia-container1=${NVIDIA_CONTAINER_TOOLKIT_VERSION}
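
Installing the packages does not by itself make GPUs available to containers; Docker usually also has to be told about the NVIDIA runtime before --gpus all works. The following is a sketch of that step using the nvidia-ctk tool that ships with the toolkit. The ubuntu image used for the smoke test is an arbitrary choice, and toolkit_status is an illustrative variable.

```shell
if command -v nvidia-ctk >/dev/null 2>&1; then
    # Register the NVIDIA runtime in Docker's daemon configuration,
    # restart Docker to load it, then run nvidia-smi inside a container.
    if sudo nvidia-ctk runtime configure --runtime=docker \
        && sudo systemctl restart docker \
        && sudo docker run --rm --gpus all ubuntu nvidia-smi; then
        toolkit_status="configured"
    else
        toolkit_status="failed"
        echo "GPU smoke test failed; check the driver and Docker logs"
    fi
else
    toolkit_status="not-installed"
    echo "nvidia-ctk not found; install the toolkit packages first"
fi
```

If the smoke test prints the same GPU table as nvidia-smi on the host, the container runtime is ready for the trainer.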

Deploy Microwakeword Trainer

mkdir /home/user/wakeword

cd /home/user/wakeword

docker pull ghcr.io/tatertotterson/microwakeword:latest

docker run --rm -it \
    --gpus all \
    -p 8888:8888 \
    -v "$(pwd)":/data \
    ghcr.io/tatertotterson/microwakeword:latest

Go to your server's IP address on port 8888 (http://YOUR-SERVER-IP:8888) to access the JupyterLab workbook.
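
If you are unsure of the address, the sketch below builds the URL from the machine's first reported IP and checks whether anything is answering on the port. It assumes the first address from hostname -I is an IPv4 address reachable from your browser, which may not hold on multi-homed hosts; server_ip and jupyter_url are illustrative names. Run it in a second terminal on the server while the container is running.

```shell
# Take the first address reported by hostname -I; fall back to localhost.
server_ip="$(hostname -I 2>/dev/null | awk '{print $1}')"
jupyter_url="http://${server_ip:-localhost}:8888"
echo "Open $jupyter_url in a browser"

# Optional reachability check; any HTTP response means the port is being served.
if command -v curl >/dev/null 2>&1 && curl -s -o /dev/null --max-time 5 "$jupyter_url"; then
    echo "JupyterLab is responding"
else
    echo "No response yet; check that the container is still running"
fi
```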

In the first cell, set the TARGET_WORD variable to your wake word:

TARGET_WORD = "YOURWORD"

Example:

TARGET_WORD = "hey_emma"

Then, in the top menu, choose Run > Run All Cells.

The workbook will now run through all the steps. When it finishes, your trained JSON and TFLite files are shown at the bottom of the workbook.
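
To find the files on the host afterwards, something like the helper below may help. It is only a sketch: it assumes the workbook writes its outputs somewhere under the container's /data directory, which the docker run command above maps to /home/user/wakeword, and the source does not confirm the exact output path. list_wakeword_outputs is a hypothetical helper name.

```shell
# List .tflite and .json files modified in the last 24 hours under a directory.
# Usage: list_wakeword_outputs [directory]
list_wakeword_outputs() {
    out_dir="${1:-/home/user/wakeword}"
    if [ ! -d "$out_dir" ]; then
        echo "Directory not found: $out_dir" >&2
        return 1
    fi
    find "$out_dir" -type f \( -name '*.tflite' -o -name '*.json' \) -mtime -1
}

list_wakeword_outputs /home/user/wakeword || true
```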

If you are using this for Home Assistant Voice PE, see this article next: https://chris-potter.co.uk/home-assistant-voice-pe-micro-wakeword/