Small
Plates

Short stories and tech experiments.

Contact
Alex Lance
hello@alexlance.blog


About
Web consultant and founder of Dibs On Stuff and TF State

Filesystem Tagging
by Alex Lance
1248 words, non-fiction, ©2025


I wanted a simple and portable way to tag my old digital photo albums. No magic database files, opaque storage formats, or loose index files floating around. A tagging solution that could be backed up with my normal rsync processes and had widespread support.

It turns out that most modern filesystems support something called extended attributes (xattrs), and you can use these to stick interesting metadata onto a file. In my case, arbitrary tags. Join me as I take you into the exciting nightmare that has been me categorizing 23 years of photos. I'm not a hoarder, you're a hoarder.

Let's jump in with a code example. There are two main commands that I've found: getfattr ("get file attribute") and setfattr ("set file attribute"). They can be installed on Debian by installing the attr package. Here's how they run:

# Eg: display any tags that currently exist in a file "photo.jpg"
$ getfattr -d photo.jpg

# No output, this file doesn't have any tags

# Try and set a tag named "colour" to the value "red" for photo.jpg
$ setfattr -n colour -v red photo.jpg
setfattr: photo.jpg: Operation not supported

# It didn't like using a top-level name like "colour" without a namespace prefix

# There are four top-level namespaces (security, system, trusted and user); "user" is the one for our diabolicalness
$ setfattr -n user.colour -v red photo.jpg

# No output, it ran successfully

# Now let's check it for any tags...

$ getfattr -d photo.jpg
# file: photo.jpg
user.colour="red"

# Hooray now it has a tag
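The same "Operation not supported" error also turns up when the filesystem itself can't store user attributes, so it can be worth probing a directory before tagging thousands of files in it. Here's a minimal sketch of such a probe. The function name supports_user_xattr and the user.probe attribute are made up for illustration, and it assumes mktemp plus setfattr from the attr package are installed.

```shell
# Return 0 if files in the given directory accept user.* attributes,
# 1 otherwise (including when setfattr is not installed).
# Hypothetical helper, not part of the attr package.
supports_user_xattr() {
    local dir="${1:-.}"
    local probe
    local rc=1
    # Create a throwaway file inside the directory being tested.
    probe=$(mktemp "$dir/.xattr-probe.XXXXXX") || return 1
    if setfattr -n user.probe -v 1 "$probe" 2>/dev/null; then
        rc=0
    fi
    rm -f "$probe"
    return "$rc"
}
```

Something like `supports_user_xattr ~/photos && echo "tags will stick"` would then tell you whether tagging can work there at all.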

(spoiler: I wrote a wrapper script to help automate this en masse)

The problem
That was one piece of the pie. But there are more pieces. One might suggest too many pieces.

I've been backing up my files for decades now. But they're a mess. Different files named in different ways, stored in different folders. Some of it collected into folders. A floordrobe of files.

The first step, break some eggs: One new folder per year, eg:

# Make some folders

$ for y in $(seq 2003 2026); do mkdir $y; done

# One for each year

$ ls
2003 2005 2007 2009 2011 2013 2015 2017 2019 2021 2023 2025
2004 2006 2008 2010 2012 2014 2016 2018 2020 2022 2024 2026
Then some malarkey to move all the untagged files into their correct year folder. Note: this required a lot of manual intervention. Here's a script that generates mv commands.

#!/bin/bash
set -euxo pipefail

SCRIPT_DIR=$(dirname "$(readlink -f "$0")")
cwd=$(dirname "${SCRIPT_DIR}")

# Eg 2025

year=$1
plus="$(( $year + 1 ))"

# Ignore these paths

IGNORE_PATHS=(
"("
   -path "$cwd/Media/*" -o
   -path "$cwd/music/*" -o
   -path "$cwd/movie/*" -o
   -iregex ".*\.kde.*" -o
   -iregex ".*backups.*" -o
   -iregex ".*/\.local/.*"
")"
)

# Look for files with the right suffix, that have a modified year within the right time range

find "${cwd}" \
   "${IGNORE_PATHS[@]}" \
   -prune -o \
   -type f \
   -iregex ".*\.\(jpg\|jpeg\|png\|mpg\|mov\|mp4\|mkv\|avi\|mpeg\)" \
   -newermt "${year}-01-01" ! -newermt "${plus}-01-01" \
   -exec echo "mv \"{}\" ./${year}/" \;

# and look for files with the year embedded in the filename

find "${cwd}" \
   "${IGNORE_PATHS[@]}" \
   -prune -o \
   -type f \
   -iregex ".*[_-]+${year}.*\.\(jpg\|jpeg\|png\|mpg\|mov\|mp4\|mkv\|avi\|mpeg\)" \
   -exec echo "mv \"{}\" ./${year}/" \;
You can see that the script has two main considerations: Files that were modified (i.e. likely created) in the particular year, and also files that looked like they had a naming convention (thank goodness most cameras do this) of putting the year in the filename.
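Those two checks can also be tried on a single file, which is handy when working out why something landed in the wrong year. This is only a sketch: the function names name_has_year and mtime_year are hypothetical, date -r FILE assumes GNU date, and the name check looks at the file name only, whereas find's -iregex matches against the whole path.

```shell
# Succeed if the file name contains YEAR straight after a "_" or "-",
# roughly what the [_-]+YEAR pattern in the find command looks for.
name_has_year() {
    local name
    local year="$2"
    name=$(basename "$1")
    case "$name" in
        *[_-]"$year"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the four-digit year of the file's modification time (GNU date).
mtime_year() {
    date -r "$1" +%Y
}
```

Note that `name_has_year IMG_20251.jpg 2025` also succeeds even though 20251 is probably a sequence number rather than a date, which is one reason the generated commands need a manual review.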

Invoked like:

# Generate a script containing a list of mv commands

$ move.sh 2025 | sort | uniq > mv-commands-2025.sh

# Btw manually review the file before executing

$ chmod a+x mv-commands-2025.sh
$ ./mv-commands-2025.sh
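One thing worth looking for during that manual review is two different source files with the same file name, because the second mv into the year folder would overwrite the first. A rough sketch of a check follows. The function name colliding_names is hypothetical, and it assumes every line has exactly the form the script above echoes: mv "SOURCE" ./YEAR/

```shell
# Read a file of generated lines of the form:  mv "SOURCE" ./YEAR/
# and print each file name that occurs under more than one source path.
colliding_names() {
    sed -n 's/^mv "\(.*\)" \.\/[0-9]*\/$/\1/p' "$1" |
        sed 's|.*/||' |
        sort |
        uniq -d
}
```

Running `colliding_names mv-commands-2025.sh` prints nothing when every file name is unique. Swapping mv for mv -n in the generated file is another option, since -n refuses to overwrite an existing destination.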


Tagging with Gwenview
Ok! So now we've got millions of files sitting in per-year folders. Let the tagging commence. Just one problem, how to tag each file?

The open source application Gwenview allows you to select multiple files and then edit the tags for all of those photos in one hit. It stores the values for the tags in the extended file attribute name user.xdg.tags. Which is sort of an unofficial convention - but it'll do!
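Because the tags live in an ordinary extended attribute, they can be read and written from the shell too. The sketch below assumes user.xdg.tags holds a plain comma-separated list, which is how it usually looks when dumped with getfattr. The function names merge_tags and add_tag are made up here, and this is not the wrapper script described later in this post.

```shell
# Print the comma-separated tag list $1 with tag $2 appended,
# unless that tag is already in the list.
merge_tags() {
    local existing="$1"
    local new="$2"
    case ",$existing," in
        *",$new,"*) printf '%s\n' "$existing" ;;
        ",,")       printf '%s\n' "$new" ;;
        *)          printf '%s\n' "$existing,$new" ;;
    esac
}

# Add one tag to one file by rewriting user.xdg.tags.
# Needs getfattr and setfattr; a missing attribute counts as an empty list.
add_tag() {
    local tag="$1"
    local file="$2"
    local current
    current=$(getfattr --only-values -n user.xdg.tags -- "$file" 2>/dev/null) || current=""
    setfattr -n user.xdg.tags -v "$(merge_tags "$current" "$tag")" -- "$file"
}
```

For example, `add_tag camping photo.jpg` would leave an existing "mum,dad" list as "mum,dad,camping".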

Useful Gwenview shortcuts include:

Ctrl-click -> select individual files
Shift-click -> select runs of sequential files
Ctrl-t -> open the tagging dialog

And as you'll see below I also wrote a script to edit tags in bulk too.

Supporting Software
Couple of tricks with other tools that are worth mentioning when working with files that have tags.

rsync has 2 handy flags:
--xattrs ensures the attributes are synced around too when moving or backing up files.
--checksum is useful when moving files from your phone (with its filesystem that does not support extended file attributes) to your archive. It helps ensure you don't overwrite a file you've already tagged previously, by only copying the file over if its contents have changed.
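As a sketch, those two uses might look like the functions below. The function names, the DRY_RUN switch and the choice of --archive are assumptions for illustration rather than the exact commands used for this archive. The phone import leaves --xattrs out, since that side has no attributes to send.

```shell
# Copy the archive to a backup, carrying extended attributes along.
# With DRY_RUN=1 the command is printed instead of run.
backup_with_tags() {
    local src="$1"
    local dest="$2"
    set -- rsync --archive --xattrs "$src" "$dest"
    if [ "${DRY_RUN:-0}" = 1 ]; then printf '%s\n' "$*"; else "$@"; fi
}

# Pull files in from a phone, comparing contents rather than size and time.
import_from_phone() {
    local src="$1"
    local dest="$2"
    set -- rsync --archive --checksum "$src" "$dest"
    if [ "${DRY_RUN:-0}" = 1 ]; then printf '%s\n' "$*"; else "$@"; fi
}
```

The trailing slash on the source matters to rsync: `backup_with_tags photos/ /mnt/backup/photos/` copies the contents of photos, while leaving the slash off copies the folder itself into the destination.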

vim has a configuration option which you must set if you're editing a previously tagged file and you don't want to lose the tags when you hit save: set backupcopy=yes.

Tag: a wrapper script
Lastly, I ended up writing a wrapper around getfattr and setfattr for working with multiple files. It's just called "tag" and you can grab it from over here: Tag.

tag -v or --view fires up Gwenview to see images that have a particular tag:

# Show photos from Xmas 2025

$ tag --view xmas2025

# Show Xmas 2025 that don't have Steve in them

$ tag --view xmas2025 -steve

# Show all photos in current directory that don't have any tags at all

# VERY useful for making sure you have tagged everything

$ tag --view
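Going by the examples above, a plain word means "must have this tag" and a leading minus means "must not have it". That matching rule can be sketched as below, assuming the tags come as a comma-separated list. The function name tags_match is hypothetical, the real script may well do it differently, and the no-arguments "untagged files" case is not covered.

```shell
# tags_match TAGLIST TERM...
# Succeed if the comma-separated TAGLIST contains every plain TERM
# and none of the TERMs that start with "-".
tags_match() {
    local tags=",$1,"
    local term
    shift
    for term in "$@"; do
        case "$term" in
            -*)
                # Excluded tag: fail if it is present.
                case "$tags" in *",${term#-},"*) return 1 ;; esac
                ;;
            *)
                # Required tag: fail if it is absent.
                case "$tags" in *",$term,"*) ;; *) return 1 ;; esac
                ;;
        esac
    done
    return 0
}
```

So `tags_match "xmas2025,mum" xmas2025 -steve` succeeds, while the same query against "xmas2025,steve" fails.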
tag -s or --search takes the same tag syntax as --view, but instead of opening Gwenview it just prints the list of matching files, eg:

# Handy for command substitution:

# Eg: mv $(tag -s roof) roof-folder/

$ tag -s roof
PXL_20250114_014413164.jpg
PXL_20250114_014355184.jpg
PXL_20250114_014423667.jpg
PXL_20250114_014432786.jpg
PXL_20250114_014431971.jpg
PXL_20250114_014418299.jpg
PXL_20250114_014420316.jpg
PXL_20250114_014422828.jpg
PXL_20250114_014354245.jpg
tag -d or --dump dumps out all the tags that are in use in the current directory, along with their tallies:

$ tag -d
122 gumtree
74  theatre
53  camping
47  dibs
28  therats
27  art
18  mum
18  dad
9   roof
9   car
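A tally like that can be approximated with standard tools once the tag lists are on stdin, one comma-separated list per line. This is a sketch only: tally_tags is a hypothetical name, not the real --dump implementation, and it prints a tab between the count and the tag.

```shell
# Read comma-separated tag lists on stdin, one list per line, and print
# "COUNT<TAB>TAG" lines, most used first (ties in alphabetical order).
tally_tags() {
    tr ',' '\n' |
        awk 'NF { count[$0]++ } END { for (t in count) print count[t] "\t" t }' |
        sort -t "$(printf '\t')" -k1,1nr -k2,2
}

# Possible usage, assuming getfattr is installed and tags live in user.xdg.tags:
#   for f in *; do
#       getfattr --only-values -n user.xdg.tags -- "$f" 2>/dev/null
#       echo
#   done | tally_tags
```

The echo inside the loop is there because --only-values prints the raw value without a trailing newline.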
tag -e or --edit to actually add or remove tags from some files. Eg:

# Add the tag xmas2025 to all the files in the "xmas" folder

$ tag -e xmas2025 -- xmas/*

# Or remove a tag (i.e. a tag typo in this example)

$ tag -e -xmaaaas -- xmas/*


In conclusion
Some tags that I've found useful: the street names and suburbs of places I've lived. People's names or surnames, and whether a particular photo is sensitive/personal. Also whether something is a photo of a document, whether it's related to a workplace, or whether it was an artwork. Oh and lots of photos of my soulful whippet, who is missed.

Hopefully this is useful for you. If I had to do it over again I definitely would. But it was a lot of work. The whole process has simplified my backups though, and made it possible to quickly find files that are useful, and also delete a bunch of stuff that didn't need to be kept.

Now that so many things are tagged - I wonder if I could use that info to train a not-internet-connected thingo to tag things automatically in the future...

