If you’ve been dealing with files for a while, you will have noticed that there is a slight semantic gap between how humans see files and how computers do. If you’ve ever seen a file list like this, you know what I mean:
Lecture10.pdf
Lecture11.pdf
Lecture12.pdf
Lecture1.pdf
Lecture2.pdf
…
Numbering these files was done in good faith, and a user understands what it means, but the computer doesn’t get it. Sorting in dictionary order produces the wrong order as far as the user is concerned. The reason is that the digits in these filenames are not compared as integers but character by character, as strings. (Actually, "." comes before "0" in ASCII, so what’s going on here?)
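To make the string-versus-integer difference concrete, here is a small Python sketch. The filenames are the ones from the list above; the helper lecture_number is a hypothetical name introduced only for this illustration.

```python
import re

names = ["Lecture10.pdf", "Lecture11.pdf", "Lecture12.pdf",
         "Lecture1.pdf", "Lecture2.pdf"]

# String comparison goes character by character, so "10" sorts before "2"
# because "1" sorts before "2"; the numeric value is never considered.
print("10" < "2")   # comparing strings
print(10 < 2)       # comparing integers

# Sorting the names as plain strings leaves Lecture2.pdf after Lecture12.pdf.
by_string = sorted(names)


def lecture_number(name):
    """Hypothetical helper: the first run of digits in the name, as an int."""
    return int(re.search(r"[0-9]+", name).group())


# Sorting by the integer value gives the order a human expects.
by_number = sorted(names, key=lecture_number)

print(by_string)
print(by_number)
```

Python’s sorted() compares raw character codes, so here Lecture1.pdf does land first, as the "." before "0" observation predicts. That suggests the listing above came from a tool that collates differently (locale-aware sorting is one possibility). Either way, Lecture2.pdf ends up after Lecture12.pdf.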
While we’re not expecting our computers to wise up about this anytime soon, there is the obvious fix:
Lecture01.pdf
Lecture02.pdf
…
Lecture10.pdf
Lecture11.pdf
Lecture12.pdf
You’ve probably done this by hand once or twice, while cursing.
On the upside, this is very easy to fix with a few lines of code:
#!/usr/bin/env python
#
# Author: Martin Matusiak <numerodix@gmail.com>
# Licensed under the GNU Public License, version 3.
#
# revision 1 - support multiple digit runs in filenames

import glob
import os
import re
import sys


def renseq():
    if len(sys.argv) != 2:
        print("Usage:\t" + sys.argv[0] + " <num_digits>")
    else:
        ren_seq_files(sys.argv[1])


def ren_seq_files(num_digits):
    files = glob.glob("*")
    for filename in files:
        # Split off the extension at the last dot so it is never rewritten.
        m = re.search(r"(.*)(\..*)", filename)
        ext = ""
        if m:
            (filename, ext) = m.groups()

        # Locate every run of digits in the name (extension excluded).
        spans = [run.span() for run in re.finditer(r"[0-9]+", filename)]
        if spans:
            # Replace from right to left so the earlier spans stay valid
            # while the name grows.
            spans.reverse()
            arr = list(filename)
            for (s, e) in spans:
                arr[s:e] = str(int(filename[s:e])).zfill(int(num_digits))
            os.rename(filename + ext, "".join(arr) + ext)


if __name__ == "__main__":
    renseq()
Download this code: renseq.py
This works on all the files in the current directory. Pass an integer to renseq.py and it will rewrite every number in a filename (if there are any) as the same number, padded with leading zeros if it has fewer digits than the count you asked for. So, for the example above,
renseq.py 2
will turn the first list into the second list.
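As a dry run of that renaming, the padding rule can be sketched as a pure function that touches no files. This is only an illustration, not the script itself: pad_numbers is a hypothetical name, and it uses re.sub and os.path.splitext where the script uses its own span handling and regular expression.

```python
import os
import re


def pad_numbers(name, width):
    """Zero-pad every run of digits in the stem of `name` to `width` digits."""
    # Assumption: os.path.splitext stands in for the script's own regex split.
    stem, ext = os.path.splitext(name)
    padded = re.sub(r"[0-9]+",
                    lambda m: str(int(m.group())).zfill(width),
                    stem)
    return padded + ext


before = ["Lecture10.pdf", "Lecture11.pdf", "Lecture12.pdf",
          "Lecture1.pdf", "Lecture2.pdf"]
after = [pad_numbers(name, 2) for name in before]

# Show the mapping old name -> new name without renaming anything.
for old, new in zip(before, after):
    print(old, "->", new)

# Once padded, plain string sorting agrees with numeric order.
print(sorted(after))
```

Printing the mapping before renaming anything is a cheap way to check what a given width would do to a directory.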
If, say, some filenames contain three-digit numbers and you pass 2 to renseq.py, those numbers are preserved (so it’s not a destructive rename); you’ll just be back to the incorrect ordering you had in the beginning.
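The reason nothing is lost is that zero-filling only ever pads. A quick check with Python’s built-in str.zfill, using made-up lecture names for the sorting part:

```python
# str.zfill pads on the left up to the requested width; it never shortens.
short = "7".zfill(2)     # fewer digits than the width: padded
exact = "42".zfill(2)    # exactly the width: unchanged
longer = "100".zfill(2)  # more digits than the width: unchanged, not cut
print(short, exact, longer)

# With mixed lengths, plain string sorting is wrong again:
# Lecture100.pdf lands before Lecture11.pdf.
mixed = sorted(["Lecture02.pdf", "Lecture11.pdf", "Lecture100.pdf"])
print(mixed)
```

Passing 3 instead of 2 in that situation would pad everything to the same length and restore the expected order.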
renseq.py will rewrite all the numbers in a filename, but not the extension. So mp3 won’t become mp03.
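The extension is spared because the script splits the name with a greedy regular expression before looking for digits. The sketch below reuses that pattern in a hypothetical helper, split_ext, with made-up filenames, to show where the split falls:

```python
import re


def split_ext(name):
    """Hypothetical helper mirroring the script's split into name and extension."""
    # The first group is greedy, so the split happens at the LAST dot.
    m = re.search(r"(.*)(\..*)", name)
    return m.groups() if m else (name, "")


print(split_ext("track1.mp3"))      # digits in "mp3" stay out of the stem
print(split_ext("backup2.tar.gz"))  # only the final ".gz" counts as extension
print(split_ext("README"))          # no dot: the whole name is the stem
```

Only the first element of each pair is scanned for digits, so anything after the last dot is left alone.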

Thursday, May 1st, 2008