Infrastructure 2026-03-05

25.5TB of Irreplaceable Data and No Real Backup Plan


unraid backup preservation nas homelab

I have 78,398 photos on my NAS. Family photos, concert shots, travel photos going back to 2010. RAW files from my Canon 6D. Every photo I've ever cared about keeping lives on this machine. And until last week, I had never seriously thought about what happens when the machine fails.

Not if it fails. When.

I already know this firsthand. Last September, one of my array disks started throwing errors. It took eleven days of escalating warnings before I swapped it out and parity rebuilt the data onto a replacement. The parity disk caught it. Everything survived. But it got me thinking about the scenarios parity can't save you from.

The Audit

I ran a deep analysis of every share on my Unraid server. A UGREEN DXP4800+ with three data disks, a parity disk, and a 2TB NVMe cache. 25.5TB used out of 50TB total array capacity. Here's what I found.

$ df -h /mnt/disk* /mnt/cache
Filesystem      Size  Used Avail Use%  Mounted on
/mnt/disk1      15T   11T  3.2T  79%  /mnt/disk1
/mnt/disk2      15T   9.8T  4.7T  68%  /mnt/disk2
/mnt/disk3      20T   5.2T   15T  26%  /mnt/disk3
/mnt/cache     1.8T   268G  1.6T  14%  /mnt/cache

The numbers looked fine on the surface. Parity is healthy. Cache drive has tons of room. Disk 3 is mostly empty. But the numbers tell you nothing about how protected the data is. Or how findable it would be in twenty years. Or what happens if someone breaks into my house and takes the whole machine.

I graded every dimension of the archive's health. The results were not great.

Domain                  Grade  Key Issue
Storage tiering         B-     Music cache-only, orphaned data, Documents on spinning rust
Backup & redundancy     D      Zero offsite backup anywhere
Integrity verification  F      Zero checksums on 25.5TB of XFS
Organization            C-     10TB of unsorted drive dumps and recordings
Format longevity        B      RAW photos preserved, FLAC excellent
Workflow & ingest       C      Multiple tools deployed but unused
Searchability           D+     Only 15-20% of archive is indexed
Disaster recovery       C-     Good single-failure protection, nothing beyond that

An F in integrity verification. A D in backup. Let me walk through the scariest things I found.

The Things That Kept Me Up at Night

My entire music library has zero redundancy

This one made my stomach drop. My Music share is set to cache=prefer, which in Unraid means the data lives only on the NVMe cache drive. The mover never copies it to the array. I verified it: du -sh /mnt/disk*/Music/ returns nothing. 123GB of ripped FLACs, my entire iTunes library, all of it sitting on a single Samsung 990 PRO with no parity protection, no backup, no second copy anywhere on earth.

One NVMe failure and it's gone. Not recoverable. Just gone.

Music share: cache=prefer

123GB exists only on NVMe. Zero copies on array. Zero copies offsite. A single drive failure means permanent loss of the entire library.

No Lightroom catalog on the server

I have 64,790 RAW photo files on the NAS. CR2s from my Canon 6D, ARWs from a Sony, some newer CR3s. But the Lightroom catalog that contains every crop, exposure adjustment, color grade, and metadata edit I've ever made? That lives on my MacBook. It's not on the server. It's not backed up to the server. If the MacBook's SSD fails, a decade of editing work disappears.

Only 760 out of 64,790 photos have XMP sidecar files. That means 98.8% of my edit history is locked inside a single .lrcat database file on a single laptop.

Zero offsite backup

This is the big one. I have parity. I have a UPS. I have monthly appdata backups. But every single bit of protection lives inside the same metal box, in the same room, in the same house. A fire takes everything. A flood takes everything. A theft takes everything. Ransomware encrypts the array and parity syncs the encrypted data, so parity actually makes it worse.

Scenario              Current Outcome
Single disk failure   Covered (parity rebuild, 24-48 hours)
NVMe cache failure    Music GONE, 30 days of appdata lost
Double disk failure   TOTAL LOSS (25.5TB)
Fire / theft / flood  PERMANENT LOSS
Ransomware            TOTAL LOSS (parity syncs encrypted data)

I kept telling myself single-disk parity was "good enough." It's not. It protects against exactly one failure mode.

25.5TB with zero integrity checks

My array uses XFS. XFS has no built-in checksumming. That means if a bit flips on disk, nothing detects it. Nothing alerts me. The file just quietly becomes corrupted, and the next time parity recalculates, the corruption becomes the new truth.

The September 2025 disk failure had an eleven-day window of escalating errors. Any file read from that disk during those eleven days might be silently corrupted. I have no way to know. I never generated checksums. Not for the photos, not for the music, not for anything.

25.5 terabytes. Zero verification. For years.

The Organizational Debt

Beyond the survival-level problems, there's a less dramatic but equally important issue: I can't find anything. About 40% of my data is organizational debt.

I deployed Paperless-ngx for document management. It's running, it's healthy, and it has zero documents in it. Meanwhile, 1,199 PDFs are scattered across the server, completely unsearchable. I set up Calibre for ebooks. Also empty. Tdarr for video transcoding. Zero libraries configured, zero files processed.

I installed the tools. I never actually used them.

What I'm Protecting

Before building a plan, I needed to categorize what actually matters. Not all 25.5TB is created equal.

data value tiers

IRREPLACEABLE                    HARD TO REPLACE              REPLACEABLE
Photos: 78K files, 1.7TB         Music/FLAC: 12G ripped       Videos: 1.8TB
Documents: 19GB legal/personal   iTunes: 108G library         Data: 9.5TB
Personal archives                Home Assistant config        Downloads: transient
Lightroom catalog*               PostgreSQL databases         Ollama models: 11GB
Family film scans                n8n workflows                Stash metadata: 25GB
Google Takeout exports           Roon database: 6.6GB         App installers

* NOT ON SERVER - critical gap

The irreplaceable column is about 2.7TB. That's the stuff I cannot re-download, re-purchase, or re-create. Family photos from before my kids could walk. Legal documents. Scanned film negatives from my parents. Every photo I took at every concert, every trip, every moment I thought was worth preserving.

$15 a month for Backblaze B2 would protect all of it. I've been paying $0 and risking everything.

The Plan

I organized the fix into three layers. Prevention, protection, and organization. Each one independent. Each one valuable on its own. No layer depends on the others being complete.

Layer 1: Prevention

Stop problems before they start.

verification schedule
# Weekly (Sunday 2 AM)
b2sum --check /mnt/user/Photos/.b2sum-manifest    # 1.7TB, ~4-6 hours
b2sum --check /mnt/user/Music/.b2sum-manifest     # 123GB
b2sum --check /mnt/user/Documents/.b2sum-manifest # 19GB
b2sum --check /mnt/user/Books/.b2sum-manifest     # 5.8GB

# Weekly (Sunday 3 AM)
btrfs scrub start /mnt/cache                       # ~60 seconds on NVMe

# Monthly (1st of month, 4 AM)
b2sum --check /mnt/user/*/.b2sum-manifest          # All 25.5TB, 24-48 hours
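The schedule above only verifies manifests that already exist; each share needs one generated first. A minimal sketch for one share, using absolute paths in the manifest so `b2sum --check` works from whatever working directory cron happens to use:

```shell
# Build (or refresh) the BLAKE2 manifest for the Photos share.
# The manifest itself is excluded so it never checksums its own,
# constantly-changing contents.
find /mnt/user/Photos -type f ! -name '.b2sum-manifest' -print0 \
  | xargs -0 b2sum > /mnt/user/Photos/.b2sum-manifest
```

The manifest has to be regenerated after any intentional edit to the share, otherwise every changed file shows up in the weekly check as a "failure."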

Layer 2: Protection

Survive failures when they happen.

The immediate fixes took less than an hour:

  1. Fix Music share. Change cache=prefer to cache=yes and run the mover. Five minutes of work to eliminate the single biggest risk on the entire server.
  2. Copy the Lightroom catalog from my MacBook to the NAS. Export XMP sidecars for all 64,790 photos.
  3. Purge 179GB of Recycle Bin data to free space on Disk 1 (79% full).
  4. Move 54GB of orphaned data off the NVMe cache. Archive and isos are set to cache=no, which means the mover never touches those shares at all, so the stranded files have to be moved to the array by hand.
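After running the mover, the Music fix can be confirmed with the same `du` check from the audit. Before the fix it printed nothing, because the share was cache-only:

```shell
# Verify the mover actually copied Music onto the array disks.
# At least one disk should now report a nonzero size.
du -sh /mnt/disk*/Music/ 2>/dev/null

# And confirm the user-share total still matches the expected ~123GB.
du -sh /mnt/user/Music/
```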

The longer-term protection layer:

Target backup coverage

Photos (1.7TB), Documents (19GB), Music (123GB), and Backups (1.1TB) all protected by parity locally and rclone to B2 offsite. Appdata weekly to array, then synced to B2. PostgreSQL dumped daily.
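The offsite leg could look something like the sketch below. The `b2:` remote and bucket names are assumptions (the remote is configured once with `rclone config`); `--backup-dir` keeps server-side copies of anything a sync would overwrite or delete, which is also the ransomware hedge, since a sync of encrypted files can't silently destroy the prior versions:

```shell
# Nightly offsite sync of the irreplaceable tiers to Backblaze B2.
# Overwritten/deleted files are parked under old/ instead of vanishing.
rclone sync /mnt/user/Photos    b2:nas-offsite/Photos \
  --backup-dir b2:nas-offsite/old/Photos --fast-list --transfers 8
rclone sync /mnt/user/Documents b2:nas-offsite/Documents \
  --backup-dir b2:nas-offsite/old/Documents --fast-list
rclone sync /mnt/user/Music     b2:nas-offsite/Music \
  --backup-dir b2:nas-offsite/old/Music --fast-list
```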

Layer 3: Organization

Find things in twenty years.

I'm consolidating from 17 shares down to 11. Consistent Title-Case-Hyphen naming everywhere. No spaces in folder names (they break scripts). No ALL_CAPS. Date-prefixed filenames so chronological sorting is built in.
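The date-prefixing can be scripted. A sketch for one folder, assuming GNU `date` on the server (its `-r FILE` flag reads a file's modification time):

```shell
# Prefix files with their modification date so chronological
# sorting is built into the name. mv -n refuses to clobber if the
# target name already exists.
for f in *.pdf; do
  [ -e "$f" ] || continue          # guard against an empty glob
  d=$(date -r "$f" +%Y-%m-%d)
  mv -n "$f" "${d}-${f}"
done
```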

The tools I already deployed but never used are finally getting put to work: Paperless-ngx ingests the 1,199 scattered PDFs for OCR and full-text search, Calibre takes over the ebooks, Tdarr gets video libraries configured, and Immich indexes the photo archive.

The ugly part is the 10TB of organizational debt. I'm scheduling that as weekend projects across several months. One legacy drive dump per weekend. Review, sort, delete. The 3.6TB of OBS recordings alone could probably be cut in half, but I need to actually watch them to decide.

Format Longevity

Will my files still be readable in 2045? I looked at every format in the archive.

Format                    2045 Readability  Action Needed
FLAC                      Certain           None
CR2 / ARW / CR3 (RAW)     Probable          Convert to DNG as insurance copy
JPG                       Certain           None
MKV / MP4 (H.264/H.265)   Very likely       None
M4P (DRM-protected)       Uncertain         Upgrade 34 files now
ENEX (Evernote export)    Uncertain         Convert to Markdown
Markdown                  Certain           Already ideal
PDF                       Certain           Ingest into Paperless-ngx for OCR

The good news: most of my archive is in open, well-supported formats. FLAC will outlive me. JPEG is eternal. H.264/H.265 video is safe for decades.

The bad news: 34 M4P files from the iTunes DRM era could become unplayable if Apple ever drops FairPlay support. And my Evernote exports are in a proprietary XML format that nobody else reads. Both are small problems. Easy to fix now, potentially impossible later.

The ambiguous case is camera RAW. Canon CR2 and Sony ARW are proprietary formats. They'll probably be readable in 2045, but "probably" isn't the word I want for 64,790 irreplaceable photos. Adobe DNG is the open alternative. I plan to batch-convert everything to DNG as a parallel archive, keeping the originals.
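Adobe's free DNG Converter has a command-line mode, so the conversion can run as a batch job from the MacBook. The install path and the `-c` (compressed) and `-d` (output directory) flags below are assumptions from that CLI mode; verify them against the installed version before trusting a full run:

```shell
# Batch-convert CR2 files to DNG on the MacBook, writing the DNG
# copies to a parallel archive and leaving the originals untouched.
DNG="/Applications/Adobe DNG Converter.app/Contents/MacOS/Adobe DNG Converter"
mkdir -p /Volumes/nas/Photos-DNG
"$DNG" -c -d /Volumes/nas/Photos-DNG /Volumes/nas/Photos/*.CR2
```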

What Gets Automated vs. What Stays Manual

I was tempted to automate everything. But some decisions require a human. Specifically, they require me. Only I know whether an OBS recording from 2023 is something I'll ever watch again. Only I know which folder a random PDF belongs in.

Fully automated: integrity checks, BTRFS scrubs, parity checks, PostgreSQL dumps, offsite sync, recycle bin purging, DS_Store cleanup, docker image cleanup.

Manual forever: Lightroom editing, inbox file categorization, archive triage, transcode quality review, family photo naming, backup restore drills.

Semi-automated: Phone photos upload via Immich (automatic), but organization is manual. Documents drop into Paperless-ngx (OCR is automatic), but I pick categories. An n8n workflow could auto-sort inbox files by extension and send me a daily digest, but I make the final call.

The Cost of Doing This

Item                               One-Time  Monthly
Backblaze B2 (~2.7TB)              $0        ~$15
24TB drive (dual parity or spare)  ~$375     $0
rclone, b2sum, scripts             $0        $0
Immich, Paperless-ngx (Docker)     $0        $0
Total                              ~$375     ~$15

$375 up front. $15 a month. That's it.

My entire photo archive from 2010 to today. A decade of legal documents. Every piece of music I've ripped from CD. Family film scans from the 1970s. Protected from fire, theft, disk failure, and ransomware for the price of a couple of takeout meals per month.

I spent more than that on the NAS enclosure. I spent more than that on a single hard drive. But I spent $0 per month on making sure the data on those drives would survive a house fire.

The Timeline

Week 1: Stop the bleeding
Fix the Music share, copy the Lightroom catalog to the NAS and export XMP sidecars, purge the Recycle Bin, move the orphaned cache data.

Month 1: Build the safety net
Create the B2 bucket, run the first full offsite sync of the irreplaceable tiers, generate the checksum manifests, and schedule the weekly verification jobs.

Quarter 1: Harden
Add the 24TB drive for dual parity or a cold spare, generate PAR2 repair files for the photo archive, upgrade the 34 M4P files, convert the Evernote exports to Markdown, and run the first backup restore drill.

Months 3-12: Organize
One legacy drive dump per weekend. Consolidate 17 shares down to 11, ingest the PDFs into Paperless-ngx, and batch-convert the RAW archive to DNG.

The 2045 Test

The question I keep asking myself: can I meaningfully access this archive in twenty years?

Before this plan, the honest answer was "maybe, if nothing goes wrong." After implementation, the answer should be yes. Family photos readable in open formats, findable through Immich, protected by parity and offsite backup and PAR2 repair files. Documents OCR'd and full-text searchable. Music in FLAC, which will outlast every streaming service. Edit history preserved in XMP sidecars alongside the originals, not locked inside a proprietary database on a single laptop.
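The PAR2 repair files come from par2cmdline, which is the same kind of mature, boring tool as the rest of the stack. A sketch for one year's photos, with 10% redundancy as a judgment call rather than a recommendation:

```shell
# Create repair blocks covering the 2010 photo directory. If the
# checksum pass later flags corruption, `par2 repair` can rebuild
# the damaged files from these blocks.
cd /mnt/user/Photos/2010
par2 create -r10 photos-2010.par2 ./*.CR2
# Later, after corruption is detected:
# par2 repair photos-2010.par2
```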

The tools are all mature and boring. rclone has been around for a decade. b2sum is part of GNU coreutils. FLAC is an open standard. Backblaze B2 exposes an S3-compatible API, so even the storage backend is swappable. Nothing here is going to stop working because a startup ran out of funding.

The cost of protecting a lifetime of memories: $375 once + $15/month.

The cost of losing them: incalculable.

I should have done this years ago. But the best time to start protecting irreplaceable data is before you lose it, and the second best time is right now.