I wanted a simple and portable way to tag my old digital photo albums. No magic database files, opaque storage formats, or loose index files floating around. A tagging solution that could be backed up with my normal rsync processes and had widespread support.
It turns out that most modern filesystems support something called extended attributes (xattrs), and that you can use them to stick interesting metadata onto a file. In my case, arbitrary tags. Join me as I take you into the exciting nightmare that has been me categorizing 23 years of photos. I'm not a hoarder, you're a hoarder.
Let's jump in with a code example. There are two main commands that I've found: getfattr ("get file attribute") and setfattr ("set file attribute"). They can be installed on Debian by installing the attr package. Here's how they run:
Eg: display any tags that currently exist in a file "photo.jpg"
$ getfattr -d photo.jpg
# No output, this file doesn't have any tags
Try and set a tag named "colour" to the value "red" for photo.jpg
$ setfattr -n colour -v red photo.jpg
setfattr: photo.jpg: Operation not supported
# It didn't like using a top-level name like "colour" without a namespace prefix
There are four top-level namespaces on Linux (security, system, trusted and user); one of them, "user", is for our diabolicalness
$ setfattr -n user.colour -v red photo.jpg
# No output, it ran successfully
Now let's check it for any tags...
$ getfattr -d photo.jpg
# file: photo.jpg
user.colour="red"
# Hooray now it has a tag
(spoiler: I wrote a wrapper script to help automate this en masse)
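This isn't the wrapper itself, just a minimal sketch of the same idea for a whole pile of files at once. The function names (qualify_attr_name, set_attr_on_files) are made up for illustration; the only real commands are setfattr and its -n and -v flags from above, and it assumes a filesystem that accepts user.* attributes.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# Add the "user." prefix unless the name already carries a Linux xattr namespace.
qualify_attr_name() (
    case "$1" in
        user.*|trusted.*|security.*|system.*) printf '%s\n' "$1" ;;
        *) printf 'user.%s\n' "$1" ;;
    esac
)

# Usage: set_attr_on_files NAME VALUE FILE...
set_attr_on_files() (
    name=$(qualify_attr_name "$1")
    value=$2
    shift 2
    for f in "$@"; do
        # Keep going if one file fails, but say which one it was.
        setfattr -n "$name" -v "$value" -- "$f" \
            || printf 'could not tag %s\n' "$f" >&2
    done
)

# Example: set_attr_on_files colour red holiday/*.jpg
```

Prefixing the name automatically sidesteps the "Operation not supported" error shown earlier when the namespace is forgotten.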
I've been backing up my files for decades now. But they're a mess. Different files named in different ways, stored in different folders. Some of it collected into folders. A floordrobe of files.
The first step, break some eggs: One new folder per year, eg:
# Make some folders
$ for y in $(seq 2003 2026); do mkdir $y; done
# One for each year
$ ls
2003  2005  2007  2009  2011  2013  2015  2017  2019  2021  2023  2025
2004  2006  2008  2010  2012  2014  2016  2018  2020  2022  2024  2026
Then some malarkey to move all the untagged files into their correct year folder. Note: this required a lot of manual intervention. Here's a script that generates mv commands.
#!/bin/bash
set -euxo pipefail
# Directory containing this script, and its parent (the root that gets searched)
SCRIPT_DIR=$(dirname "$(readlink -f "$0")")
cwd=$(dirname "${SCRIPT_DIR}")
# Eg 2025
year=$1
plus="$(( $year + 1 ))"
# Ignore these paths
IGNORE_PATHS=(
"("
-path "$cwd/Media/*" -o
-path "$cwd/music/*" -o
-path "$cwd/movie/*" -o
-iregex ".*\.kde.*" -o
-iregex ".*backups.*" -o
-iregex ".*/\.local/.*"
")"
)
# Look for files with the right suffix, that have a modified year within the right time range
find "${cwd}" \
"${IGNORE_PATHS[@]}" \
-prune -o \
-type f \
-iregex ".*\.\(jpg\|jpeg\|png\|mpg\|mov\|mp4\|mkv\|avi\|mpeg\)" \
-newermt "${year}-01-01" ! -newermt "${plus}-01-01" \
-exec echo "mv \"{}\" ./${year}/" \;
# and look for files with the year embedded in the filename
find "${cwd}" \
"${IGNORE_PATHS[@]}" \
-prune -o \
-type f \
-iregex ".*[_-]+${year}.*\.\(jpg\|jpeg\|png\|mpg\|mov\|mp4\|mkv\|avi\|mpeg\)" \
-exec echo "mv \"{}\" ./${year}/" \;
You can see that the script has two main considerations: files that were modified (i.e. likely created) in the particular year, and files that look like they follow a naming convention (thank goodness most cameras do this) of putting the year in the filename.
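To sanity check the filename half of that before moving anything, here's a rough sketch that goes the other way: given a filename, print the year it seems to contain. year_from_name is a made-up helper, and it assumes years look like 19xx or 20xx and follow a "_" or "-", roughly mirroring the [_-]+ part of the regex above.

```shell
#!/bin/sh
# Hypothetical helper: print the first 19xx/20xx year that follows "_" or "-"
# in a filename, or nothing at all if there isn't one.
year_from_name() (
    printf '%s\n' "$1" | grep -oE '[_-](19|20)[0-9]{2}' | head -n 1 | tr -d '_-'
)

year_from_name PXL_20250114_014413164.jpg   # prints 2025
year_from_name holiday-2019-beach.png       # prints 2019
```

Digits are only digits, so a time stamp such as _201955 can look like the year 2019 to a pattern like this. That's one reason the generated mv commands still deserve a manual review.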
Invoked like:
# Generate a script containing a list of mv commands
$ move.sh 2025 | sort | uniq > mv-commands-2025.sh
# Btw manually review the file before executing
$ chmod a+x mv-commands-2025.sh
$ ./mv-commands-2025.sh
The open source application Gwenview allows you to select multiple files and then edit the tags for all of those photos in one hit. It stores the tags in the extended file attribute user.xdg.tags, which is sort of an unofficial convention, but it'll do!
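As far as I can tell the value is just a comma-separated list of tag names, so it's easy to poke at from the shell. A minimal sketch, assuming that format; split_tags and file_tags are made-up names, and getfattr's --only-values and -n flags do the real work.

```shell
#!/bin/sh
# Hypothetical helpers for reading the user.xdg.tags attribute.

# Split a comma-separated tag string into one tag per line, dropping blanks.
split_tags() (
    printf '%s\n' "$1" | tr ',' '\n' | sed '/^$/d'
)

# Print the tags on one file, one per line (prints nothing if it is untagged).
file_tags() (
    split_tags "$(getfattr --only-values -n user.xdg.tags -- "$1" 2>/dev/null)"
)

split_tags "xmas2025,steve"   # prints xmas2025 and steve on separate lines
# Example: file_tags photo.jpg
```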
Useful Gwenview shortcuts include:
Ctrl-click -> select individual files
Shift-click -> select runs of sequential files
Ctrl-t -> open the tagging dialog
And as you'll see below, I also wrote a script to edit tags in bulk.
rsync has 2 handy flags:
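The one that makes or breaks a tag backup is -X (--xattrs), because the usual -a (--archive) does not copy extended attributes on its own. A minimal sketch; backup_with_tags and the paths in the comments are made up, and it assumes the destination filesystem supports user.* attributes too.

```shell
#!/bin/sh
# Hypothetical wrapper; the paths in the examples below are made up.
# --archive alone does not copy extended attributes, --xattrs (-X) adds them.
backup_with_tags() (
    rsync --archive --xattrs "$1" "$2"
)

# Example: backup_with_tags ~/photos/ /mnt/backup/photos/
# Spot check afterwards: getfattr -d /mnt/backup/photos/2025/some-photo.jpg
```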
vim has a configuration option which you must set if you're editing a previously tagged file and you don't want to lose the tags when you hit save: set backupcopy=yes.
tag -v or --view fires up Gwenview to see images that have a particular tag:
# Show photos from Xmas 2025
$ tag --view xmas2025
# Show Xmas 2025 that don't have Steve in them
$ tag --view xmas2025 -steve
# Show all photos in current directory that don't have any tags at all
# VERY useful for making sure you have tagged everything
$ tag --view
tag -s or --search takes the same tag syntax as --view, but instead of opening Gwenview it just prints the list of files that match, e.g.:
# Handy for command substitution:
# Eg: mv $(tag -s roof) roof-folder/
$ tag -s roof
PXL_20250114_014413164.jpg
PXL_20250114_014355184.jpg
PXL_20250114_014423667.jpg
PXL_20250114_014432786.jpg
PXL_20250114_014431971.jpg
PXL_20250114_014418299.jpg
PXL_20250114_014420316.jpg
PXL_20250114_014422828.jpg
PXL_20250114_014354245.jpg
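The matching rules for --view and --search are simple enough to sketch. This is not my actual tag script, just a hypothetical minimal version of the idea, assuming tags live in user.xdg.tags as a comma-separated list: a plain term must be present, a term with a leading "-" must be absent, and no terms at all means "untagged files only".

```shell
#!/bin/sh
# Hypothetical matcher for illustration only.
# Usage: tags_match "comma,separated,tags" [TERM...]
#   TERM   the tag must be present
#   -TERM  the tag must be absent
#   no terms at all: succeed only when the tag list is empty
tags_match() (
    tags=$1
    shift
    if [ "$#" -eq 0 ]; then
        [ -z "$tags" ] && exit 0
        exit 1
    fi
    for term in "$@"; do
        case "$term" in
            -*) case ",$tags," in *",${term#-},"*) exit 1 ;; esac ;;
            *)  case ",$tags," in *",$term,"*) ;; *) exit 1 ;; esac ;;
        esac
    done
    exit 0
)

# Print every regular file in the current directory whose tags satisfy the terms.
search_tags() (
    for f in *; do
        [ -f "$f" ] || continue
        tags=$(getfattr --only-values -n user.xdg.tags -- "$f" 2>/dev/null)
        if tags_match "$tags" "$@"; then printf '%s\n' "$f"; fi
    done
)

# Example: search_tags xmas2025 -steve
```

Wrapping the list in commas before comparing means "xmas" does not accidentally match "xmas2025".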
tag -d or --dump dumps out all the tags that are in use in the current directory, along with their tallies:
$ tag -d
122 gumtree
74 theatre
53 camping
47 dibs
28 therats
27 art
18 mum
18 dad
9 roof
9 car
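A tally like that is mostly a sort and uniq pipeline. Here's a hypothetical sketch of the idea (tally_tags and dump_tags are made-up names, not the real script), again assuming comma-separated tags in user.xdg.tags.

```shell
#!/bin/sh
# Hypothetical tally helper: reads comma-separated tag lists on stdin (one line
# per file) and prints a count next to each tag, most used first.
tally_tags() (
    tr ',' '\n' | sed '/^$/d' | sort | uniq -c | sort -rn
)

# Tally the tags on every regular file in the current directory.
dump_tags() (
    for f in *; do
        [ -f "$f" ] || continue
        getfattr --only-values -n user.xdg.tags -- "$f" 2>/dev/null
        echo   # separate one file's value from the next
    done | tally_tags
)

# Example: dump_tags
```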
tag -e or --edit actually adds or removes tags from some files, e.g.:
Add the tag xmas2025 to all the files in the "xmas" folder
$ tag -e xmas2025 -- xmas/*
Or remove a tag (i.e. a tag typo in this example)
$ tag -e -xmaaaas -- xmas/*
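Under the hood an edit is just "read the list, change it, write it back". A hypothetical minimal version, not my real script: apply_tag_edit and edit_tags are made-up names, and the comma-separated user.xdg.tags format is assumed.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.
# Usage: apply_tag_edit "comma,separated,tags" EDIT
#   EDIT "name" adds a tag (without duplicating it), "-name" removes it.
apply_tag_edit() (
    tags=$1
    edit=$2
    case "$edit" in
        -*) printf '%s\n' "$tags" | tr ',' '\n' | sed '/^$/d' \
                | grep -vxF -- "${edit#-}" | paste -sd, - ;;
        *)  printf '%s\n%s\n' "$tags" "$edit" | tr ',' '\n' | sed '/^$/d' \
                | awk '!seen[$0]++' | paste -sd, - ;;
    esac
)

# Usage: edit_tags EDIT FILE...
edit_tags() (
    edit=$1
    shift
    for f in "$@"; do
        old=$(getfattr --only-values -n user.xdg.tags -- "$f" 2>/dev/null)
        new=$(apply_tag_edit "$old" "$edit")
        if [ -n "$new" ]; then
            setfattr -n user.xdg.tags -v "$new" -- "$f"
        elif [ -n "$old" ]; then
            # The last tag was removed, so drop the attribute entirely.
            setfattr -x user.xdg.tags -- "$f"
        fi
    done
)

# Examples: edit_tags xmas2025 xmas/*
#           edit_tags -xmaaaas xmas/*
```

Keeping the list manipulation in its own function means it can be tested without touching any real files.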
Hopefully this is useful for you. If I had to do it over again I definitely would. But it was a lot of work. The whole process has simplified my backups though, and made it possible to quickly find files that are useful, and also delete a bunch of stuff that didn't need to be kept.
Now that so many things are tagged - I wonder if I could use that info to train a not-internet-connected thingo to tag things automatically in the future...