Filenames by Design, Part Two
Continuing our series on how to take full advantage of your filesystem with tips and tricks for the newbie and old pro alike.
Sunday, November 2nd, 2008
Extended Sidebar: Details of the File-Renaming Loops
Earlier in the article I showed two shell loops that rename an ordered list of files (read from /tmp/files). Each loop builds a series of shell commands to rename the files with a prefix of either three decimal digits or a variable number of hexadecimal digits. The prefix ensures the sorting order of the files for shell wildcards and ls(1). Each loop writes a shell script to /tmp/renamer that renames the files.
This “sidebar” page has more information about those two loops.
First loop: numeric prefixes
$ let i=0
$ while read -r oldfile
> do
> printf -v prefix '%03d' $((i++))
> echo "mv -i '$oldfile' '${prefix}_$oldfile'"
> done </tmp/files >/tmp/renamer
Some of the techniques above might not be familiar. Here’s a rundown:
- The > prompt is a Bourne shell secondary prompt. The shell is waiting for you to complete something it's missing: in this case, the rest of the while statement. If your shell is set up to edit multi-line input, you should be able to edit any line in the loop before you press ENTER on the done line. (If you aren't comfortable typing multiple lines at a shell prompt, you can make a throw-away script file instead and run it with bash scriptname.)
- The variable i stores the file prefix number. (Note that with a redirected-I/O loop like this one, some shells will increment variables within the loop but not pass any changes out of the loop. That is, after the loop runs, $i may still be 0.)
- The while loop reads existing filenames one by one from /tmp/files and stores each name in the variable oldfile. (Notice the redirection at the end of the loop. That's explained here and here.) The -r option keeps read from treating backslashes in the input as escape characters, so each name is stored unchanged; -r isn't needed on some versions of read.
- The bash operator $((i++)) does the arithmetic operation between the double parentheses and returns the result. Here we're using the postfix operator ++ to increment i after passing its value to printf. The formatting specification %03d prints this value in a 3-character field, left-padded with one or two zeroes if needed.
We’re using a version of printf that’s built into bash. From other shells, you can use the external version of printf(1):
prefix=`printf '%03d' "$i"`
If you don’t have printf and/or the $((i++)) operator, you can use a case statement and expr. This also lets you be sure $i isn’t empty and hasn’t overflowed:
case "$i" in
?) prefix=00$i ;;
??) prefix=0$i ;;
???) prefix=$i ;;
*) echo "case: bad length: '$i'" 1>&2; break ;;
esac
i=`expr "$i" + 1`
- The echo outputs an mv command line to rename $oldfile to a new name starting with $prefix and an underscore. These command lines are collected from echo's standard output into the file /tmp/renamer.
Notice the quoting: outer double quotes (") allow variable substitution between them but suppress the special meaning of single quotes ('). So $oldfile and ${prefix} are expanded as part of two single-quoted arguments. Those single quotes protect the filenames stored in /tmp/renamer from interpretation by the shell that reads them. The one exception is a filename that itself contains a single quote, which would end the quoting early. (You can see the result below, at the less command.)
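The earlier caution about $i not surviving the loop is easiest to see by comparing two ways of feeding a loop. The sketch below is only an illustration, assuming bash with its default options; the temporary file and the variable names piped and redirected are made up for the demonstration. In bash, the loop on the receiving end of a pipe runs in a subshell, so its change to the counter is lost, while the loop with redirected input runs in the current shell.

```shell
# Made-up stand-in for /tmp/files: three names, one per line.
tmpfile=$(mktemp)
printf '%s\n' foo bar data > "$tmpfile"

# Loop fed by a pipe: bash runs it in a subshell, so the count is lost.
piped=0
cat "$tmpfile" | while read -r name; do piped=$((piped + 1)); done

# Loop fed by redirection: bash runs it in the current shell.
redirected=0
while read -r name; do redirected=$((redirected + 1)); done < "$tmpfile"

echo "after pipe: $piped, after redirection: $redirected"
rm -f "$tmpfile"
```

In bash this prints "after pipe: 0, after redirection: 3". Other shells can behave differently: ksh and zsh, for example, run the last command of a pipeline in the current shell.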
Here’s a look at the two files we made, /tmp/files and /tmp/renamer:
$ head -3 /tmp/files
foo
bar
data
$ less /tmp/renamer
mv -i 'foo' '000_foo'
mv -i 'bar' '001_bar'
mv -i 'data' '002_data'
...
mv -i 'a file' '124_a file'
mv -i 'How
many
people?' '125_How
many
people?'
Note that the single-quoting even protects filenames containing spaces, newlines, and wildcard characters like ?. (Yes, a multi-line filename is legal on Linux filesystems.) This keeps the shell from breaking the last two filenames into pieces and from interpreting the question mark (?) as a wildcard.
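One case the single quotes can't handle is a filename that contains a single quote of its own. As a possible workaround (not what the loops in the article do), bash's built-in printf has a %q format that quotes its argument so the shell can read it back unchanged. The filename and prefix below are made up for the example, and the sketch assumes both the loop and /tmp/renamer are run by bash.

```shell
# Assumes bash: printf -v and the %q format are bash features.
oldfile="Jerry's notes"      # made-up filename containing a single quote
prefix=007                   # made-up prefix
# %q quotes each argument so that bash reads it back as one word.
printf -v line 'mv -i %q %q' "$oldfile" "${prefix}_$oldfile"
echo "$line"
```

The escaping style that %q picks (backslashes or $'...' quoting) depends on the characters in the name, so the lines in /tmp/renamer may look less tidy than the single-quoted ones shown above.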
(You might want to modify the script that builds /tmp/renamer, or simply edit /tmp/renamer by hand, to make a more “agreeable” new filename without spaces or special characters, for instance 125_How_many_people.)
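If you'd rather have the loop do that cleanup, one possible approach is bash's pattern substitution. This is a sketch only: the variable name clean and the choice of "safe" characters (letters, digits, dot, underscore, hyphen) are my assumptions, and you may want a different set.

```shell
# Assumes bash: ${var//pattern/replacement} and $'...' are bash features.
oldfile=$'How\nmany\npeople?'            # the multi-line name from the listing
# Replace every character outside the assumed safe set with an underscore.
clean=${oldfile//[^A-Za-z0-9._-]/_}
# Trim any underscores left at the end (here, from the question mark).
while [[ $clean == *_ ]]; do clean=${clean%_}; done
echo "$clean"
```

For this name the result is How_many_people. Inside the loop you would keep $oldfile as the first argument of mv and use "${prefix}_$clean" as the second. Two different old names can collapse to the same clean name, which is one more reason to keep the -i option on mv.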
Second loop: hexadecimal prefixes
file_count=$(wc -l < /tmp/files)
max_hex=$(echo -e "obase=16\n${file_count}-1" | bc -q)
prefix_width=${#max_hex}
for prefix in $(jot -w "%0${prefix_width}x" "$file_count" 0)
do
read -r oldfile
echo "mv -i '$oldfile' '${prefix}_$oldfile'"
done </tmp/files >/tmp/renamer
Here’s the rundown:
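The loop above depends on jot(1) and bc(1), which aren't installed everywhere. As a rough sketch of the same idea using only bash built-ins plus wc, the function below works out the hexadecimal width and prints the mv commands itself. The function name make_hex_renamer is hypothetical, and the sketch assumes the list file holds at least one filename.

```shell
# make_hex_renamer is a hypothetical name. It prints one mv command per
# filename listed (one per line) in the file given as its argument.
make_hex_renamer() {
    local list=$1 file_count max_hex prefix_width prefix oldfile i=0
    file_count=$(wc -l < "$list")
    # Largest prefix needed is file_count-1; its hex length sets the width.
    printf -v max_hex '%x' "$((file_count - 1))"
    prefix_width=${#max_hex}
    while read -r oldfile
    do
        # Zero-padded hexadecimal prefix of the computed width.
        printf -v prefix "%0${prefix_width}x" "$i"
        echo "mv -i '$oldfile' '${prefix}_$oldfile'"
        i=$((i + 1))
    done < "$list"
}
```

You would call it as make_hex_renamer /tmp/files >/tmp/renamer. With 17 files, for example, the largest prefix is 10 (hexadecimal for 16), so every prefix is two digits wide, from 00 through 10.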
Comments on "Filenames by Design, Part Two"
Hello, this is a great article and in an attempt to share more knowledge about command line tips, here’s this one:
With Bash (this tip does not work with Csh, for example), instead of using {1,2,3,4,5,6,7,8,9} you can simply type {1..9}.
Give it a try in a Bash shell with:
echo {1..9}
Great article, Jerry. Learned some new tricks!
Maximd,
Interesting..your shortcut didn’t work in my Bash shell on my MacBook:
MACLT:~/Documents$ echo $SHELL
/bin/bash
MACLT:~/Documents$ echo {1,9}{a,b,c}
1a 1b 1c 9a 9b 9c
MACLT:~/Documents$ echo {1..9}
{1..9}
scott
Commenting on maximd’s comment:
> With Bash (this tip does not work with Csh, for example), instead of using {1,2,3,4,5,6,7,8,9} you can simply type {1..9}.
This is correct, but it only works for Bash 3.0 and above.
Thanks, maximd and wirawan0. I need to take a closer look at bash 3.
I’m guessing that bash got the {1..9} expansion from the Z Shell, which has had it for quite a while. Here’s some info from the zshexpn(1) manpage:
Jerry
The seq command is also nice to generate sequences. Instead of typing {1,2,3,4,5,6,7,8,9} , you can type $(seq 1 9)