120W to 61W: Half the Power, Zero Compromises

How I cut my server's idle draw in half without sacrificing a single feature - and made it silent in the process.

System: i7-8700K (delidded, 5GHz all-core OC) | GTX 1080 Ti 11GB | 32GB DDR4 | Samsung 970 EVO 1TB + Crucial P3 4TB NVMe | MSI A1000G 1000W PSU | Ubuntu 24.04.4 LTS

I inherited this gaming PC. Six cores of overclocked silicon, a beefy 1080 Ti, a kilowatt PSU - it was built to push frames, not serve containers. But here it was, running Docker, Ollama, Tailscale, and pulling 120+ watts doing absolutely nothing. That's roughly $130/year at my electricity rates, just to idle.
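That $130/year figure is just arithmetic on the wall draw. Here's the back-of-envelope math, assuming roughly $0.125/kWh - the rate my numbers imply; plug in your own:

```shell
# idle watts -> estimated annual electricity cost
# rate is an assumed $0.125/kWh, not a universal figure
awk -v watts=120 -v rate=0.125 \
    'BEGIN { printf "~$%.0f/year\n", watts / 1000 * 24 * 365 * rate }'
```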

I decided to see how low I could take it without compromising a single capability. No underclocking. No removing hardware. No turning things off. Just eliminating waste.

Here's every phase of the journey.

Phase 1: GNOME Compositing Overhead

Minimal Impact

First thing I noticed: Blur My Shell was installed. This GNOME extension does real-time Gaussian blur on the desktop background, which means it's constantly running GPU compositing passes - on a server that nobody is looking at 99% of the time.

I reverted GNOME to its default theme and removed the extension. The impact on wall power was negligible (maybe 1-2W), but it was the observation that started me down the rabbit hole. If a desktop blur effect was running on a headless server, what else was wasting cycles?

Phase 2: Service Cleanup

~175MB RAM freed

I audited every running service. The usual suspects were all there:

systemd service cleanup
# Modem manager on a desktop that has no modem
$ sudo systemctl disable --now ModemManager.service

# Print server on a machine with no printer
$ sudo systemctl disable --now cups.service cups-browsed.service

# Snapd - not using any snaps
$ sudo systemctl disable --now snapd.service snapd.socket

# Ollama doesn't need to autostart - call it when needed
$ sudo systemctl disable ollama.service

About 175MB of RAM freed. No power impact worth measuring, but it set the stage for the discovery that actually mattered.

Phase 3: The Big Win - Headless Nvidia

-47W GPU idle

This is where the numbers got interesting.

I ran nvidia-smi and saw something that made me pause:

nvidia-smi - BEFORE (display driver)
$ nvidia-smi
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.86.16     Driver Version: 570.86.16     CUDA Version: 12.8               |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|=========================================+========================+======================|
| 0    NVIDIA GeForce GTX 1080 Ti     Off | 00000000:01:00.0    On |                  N/A |
| 23%   36C    P2             55W / 250W |      830MiB / 11264MiB |      2%      Default |
+-----------------------------------------+------------------------+----------------------+

55 watts. At idle. The GPU was sitting in the P2 performance state because it was driving the display output. Even with nothing on screen, the Nvidia driver was keeping the GPU clocks elevated and the VRAM active for desktop compositing. 830MB of VRAM allocated just to show a desktop nobody was watching.

The i7-8700K has an Intel UHD 630 iGPU. It's perfectly capable of driving displays - even dual 4K. The 1080 Ti only needs to wake up for CUDA compute (Ollama inference, mostly). So I made the switch:

switching to headless nvidia driver
# Remove the full display driver
$ sudo apt remove nvidia-driver-570

# Install headless (compute-only) driver
$ sudo apt install nvidia-headless-570 nvidia-utils-570

# Enable iGPU in BIOS as primary display adapter
# Plug monitors into motherboard DisplayPort/HDMI

$ sudo reboot

After reboot, the difference was immediate:

nvidia-smi - AFTER (headless driver)
$ nvidia-smi
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.86.16     Driver Version: 570.86.16     CUDA Version: 12.8               |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|=========================================+========================+======================|
| 0    NVIDIA GeForce GTX 1080 Ti     Off | 00000000:01:00.0   Off |                  N/A |
|  0%   32C    P8              8W / 250W |       3MiB / 11264MiB |      0%      Default |
+-----------------------------------------+------------------------+----------------------+

8 watts. P8 power state. 3MB VRAM. Fans off. The GPU was effectively asleep, but still fully available for CUDA workloads the moment Ollama or anything else needs it. When a model loads, it wakes up, does its work, and goes right back to sleep.

The takeaway: If you have an Nvidia GPU in a Linux server and you're not actively using it for display, switch to nvidia-headless-*. This single change saved 47W - nearly 80% of the total reduction.
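That "nearly 80%" share is easy to verify from the wall numbers - keeping in mind the ~120W starting point is approximate:

```shell
# GPU savings (55W -> 8W = 47W) as a share of the total wall
# reduction (~120W -> 60.9W); the "before" figure is approximate
awk 'BEGIN { printf "%.0f%%\n", 47 / (120 - 60.9) * 100 }'
```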

Phase 4: Background Process Cleanup

~456MB RAM freed

With the big win secured, I went hunting for memory waste. A quick look at the top consumers revealed some surprises:

background process cleanup
# GNOME Software was using 307MB sitting in the background
# On a server. Checking for app store updates. For nobody.
$ sudo apt remove gnome-software

# Tracker file indexer - indexing files for desktop search
$ systemctl --user mask tracker-miner-fs-3.service
$ systemctl --user mask tracker-miner-fs-control-3.service
$ tracker3 reset -rs

# Docker: switch to socket activation
# Only starts the daemon when something actually calls it
$ sudo systemctl disable docker.service
$ sudo systemctl enable docker.socket

307MB for GNOME Software alone. On a headless server. It was auto-launched by the desktop session, silently sitting there refreshing package metadata for an app store GUI that would never be opened. Tracker was doing the same thing - crawling the filesystem to build a search index for a desktop that's accessed via RDP maybe once a week.

The Docker socket activation change was elegant: instead of Docker always running and consuming resources, the socket just listens. The first docker command wakes it up on demand. In practice, Docker is always running anyway (I have persistent containers), but it's the right architecture for a system that might not need Docker on every boot.

Phase 5: Fan Curve Optimization

Audible → Silent

With power draw halved, thermal output dropped significantly. The existing fan curves were way too aggressive for the new power profile. The system was cooling 60W of heat with curves designed for 250W+ gaming loads.

Two changes:

BIOS Smart Fan 5 curves: Adjusted the motherboard fan headers to start ramping much later. At 60W total system power, the CPU barely gets warm - even at 5GHz.

Commander Core XT (AIO pump/fans): Corsair's iCUE doesn't run on Linux, but liquidctl does. I set the AIO fans to a fixed 20% duty cycle via a systemd service that runs at boot:

liquidctl fan control
# /etc/systemd/system/liquidctl-fans.service
[Unit]
Description=Set AIO fan speed via liquidctl
After=multi-user.target

[Service]
Type=oneshot
ExecStart=/usr/bin/liquidctl initialize --match "Commander Core"
ExecStart=/usr/bin/liquidctl set fan1 speed 20 --match "Commander Core"
ExecStart=/usr/bin/liquidctl set fan2 speed 20 --match "Commander Core"
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
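One step the unit file doesn't show: it still has to be enabled. Assuming the file is saved at the path above:

```shell
$ sudo systemctl daemon-reload
$ sudo systemctl enable --now liquidctl-fans.service

# Confirm the fans picked up the new speed
$ liquidctl status --match "Commander Core"
```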

At 20% duty cycle, the fans are inaudible from two feet away. CPU temps still sit comfortably in the low 40s at idle. The 240mm AIO is wildly overkill for 60W of total system power, which means the fans barely have to work.

Phase 6: Final Cuts

Last few watts
final optimizations
# Disable WiFi hardware (hardwired, WiFi is just wasting power)
$ sudo rfkill block wifi

# NVMe power management - let drives sleep when idle
$ echo "auto" | sudo tee /sys/block/nvme0n1/device/power/control
$ echo "auto" | sudo tee /sys/block/nvme1n1/device/power/control

# Mask services that keep respawning
$ sudo systemctl mask snapd.service snapd.seeded.service
$ systemctl --user mask evolution-data-server.service
$ systemctl --user mask tracker-miner-fs-3.service

Small individually, but they add up. The WiFi radio alone was probably 1-2W. NVMe power management lets the drives enter lower power states during idle periods. And masking (not just disabling) the persistent services prevents them from being socket-activated or dependency-pulled back to life.

Results

Metric               Before      After      Change
Wall power (idle)    ~120W       60.9W      -50%
GPU idle draw        55W (P2)    8W (P8)    -85%
GPU VRAM used        830MB       3MB        -99.6%
System RAM used      ~7.7GB      ~3.5GB     -55%
Fan noise            Audible     Silent
Est. annual cost     ~$131       ~$66       -$65/yr
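The annual-cost row comes straight from the wall numbers, at the same assumed ~$0.125/kWh rate as before:

```shell
# kWh and dollars saved per year from the measured wall-power drop
awk -v before=120 -v after=60.9 -v rate=0.125 'BEGIN {
  kwh = (before - after) / 1000 * 24 * 365
  printf "%.0f kWh/year, ~$%.0f/year saved\n", kwh, kwh * rate
}'
```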

50% power reduction with 0% performance loss. Every watt saved was pure waste eliminated - services running for nobody, a GPU driving a display nobody watches, fans cooling heat that no longer exists.

What Was Kept

This is the important part. Nothing was sacrificed.

The machine does everything it did before. It just stopped doing things nobody asked it to do.

Closing Thoughts

The single biggest lesson: check your GPU. If you have an Nvidia card in a Linux box and it's driving a display you rarely look at, you're probably burning 40-50W for nothing. The headless driver exists specifically for this use case, and the switch takes five minutes.

The rest - service cleanup, background process removal, fan tuning - is standard Linux housekeeping. But that GPU change alone accounts for nearly 80% of the total savings. It's the kind of thing that seems obvious in hindsight but you'd never notice unless you actually looked at nvidia-smi and asked "why is this in P2?"

60.9W for a delidded, overclocked 8700K with a 1080 Ti and 5TB of NVMe storage. I'll take it.