It seems to me that we have pretty good tools for managing files that aren’t changing. File managers display all the pertinent details, detect the file type, and even show a preview if the content is an image or a video.
But what about files that are changing? Files get transferred all the time, but network connections are not always reliable. Have you ever been in the middle of a transfer wondering if it just stopped dead, or if it’s crawling along almost too slowly to notice? Or how about downloading something where the host doesn’t transmit the size of the file, so you’re just downloading without knowing how much there is left?
These things happen, not every day, but from time to time they do. And it’s annoying. A file manager doesn’t really do a great job of showing you what’s happening. Of course you can stare at the directory and reload the display to see if the file size is changing. (Or look at the time stamp? But it’s not very convenient to check that against the current time to work out how long it has been since the last change.) Or maybe the file manager displays those updates dynamically? But it’s still somewhat lacking.
Basically, what you want to know is:
- Is the file being written to right now?
- How long since the last modification?
And you want those on a second-by-second basis, ideally. Something like this perhaps?

Here you have the files in this directory sorted by modification time (mtime). One file is actively being written to: its mtime was 0 seconds ago at the time of the last sampling. Sampling happens every second, so in that one-second interval 133kb were written to the file and the mtime was updated. The other file has not been changed for the last 7 minutes.
The nice thing about this display is that whether you run the monitor while the file is being transferred or you start it after the transfer has already finished, you see what is happening, and if nothing is, you see when the last action took place.
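To make the two numbers concrete, here is a minimal sketch of how they could be computed for a single file from two consecutive samples. The function names `sample` and `describe` are made up for this illustration and are not part of the script; the only real calls used are `os.stat` and plain arithmetic.

```python
import os


def sample(path):
    """Return (mtime, size) for path as reported by os.stat."""
    st = os.stat(path)
    return st.st_mtime, st.st_size


def describe(prev, curr, now):
    """Compare two (mtime, size) samples taken one interval apart.

    Returns (age, delta): whole seconds since the last modification,
    and the number of bytes added since the previous sample.
    """
    age = int(now - curr[0])
    delta = curr[1] - prev[1]
    return age, delta
```

In use, you would call `sample(path)` once a second, keep the previous result around, and pass both to `describe` together with `time.time()`. A file that is still growing shows a small age and a positive delta; a finished or stalled one shows a delta of 0 and an age that keeps climbing.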
#!/usr/bin/env python
#
# Author: Martin Matusiak <numerodix@gmail.com>
# Licensed under the GNU Public License, version 3.
#
# <desc> Watch directory for changes to files being written </desc>

import os
import sys
import time


class Formatter(object):
    size_units = [' b', 'kb', 'mb', 'gb', 'tb', 'pb', 'eb', 'zb', 'yb']
    time_units = ['sec', 'min', 'hou', 'day', 'mon', 'yea']

    @classmethod
    def simplify_time(cls, tm):
        # Step up one unit at a time: sec -> min -> hours -> days -> ...
        unit = 0
        if tm > 59:
            unit += 1
            tm = float(tm) / 60
            if tm > 59:
                unit += 1
                tm = float(tm) / 60
                if tm > 23:
                    unit += 1
                    tm = float(tm) / 24
                    if tm > 29:
                        unit += 1
                        tm = float(tm) / 30
                        if tm > 11:
                            unit += 1
                            tm = float(tm) / 12
        return int(round(tm)), cls.time_units[unit]

    @classmethod
    def simplify_filesize(cls, size):
        unit = 0
        while size > 1023:
            unit += 1
            size = float(size) / 1024
        return int(round(size)), cls.size_units[unit]

    @classmethod
    def mtime(cls, reftime, mtime):
        delta = int(reftime - mtime)
        tm, unit = cls.simplify_time(delta)
        delta_s = "%s%s" % (tm, unit)
        return delta_s

    @classmethod
    def filesize(cls, size):
        size, unit = cls.simplify_filesize(size)
        size_s = "%s%s" % (size, unit)
        return size_s

    @classmethod
    def filesizedelta(cls, size):
        size, unit = cls.simplify_filesize(size)
        sign = size > 0 and "+" or ""
        size_s = "%s%s%s" % (sign, size, unit)
        return size_s

    @classmethod
    def bold(cls, s):
        """Display in bold"""
        term = os.environ.get("TERM")
        if term and term != "dumb":
            return "\033[1m%s\033[0m" % s
        return s


class File(object):
    sample_limit = 60  # don't hold more than x samples

    def __init__(self, file):
        self.file = file
        self.mtimes = []

    def get_name(self):
        return self.file

    def get_last_mtime(self):
        tm, sz = self.mtimes[-1]
        return tm

    def get_last_size(self):
        tm, sz = self.mtimes[-1]
        return sz

    def get_last_delta(self):
        size_last = self.get_last_size()
        try:
            mtime_beforelast, size_beforelast = self.mtimes[-2]
            return size_last - size_beforelast
        except IndexError:
            return 0

    def prune_samples(self):
        """Remove samples older than x samples back"""
        if len(self.mtimes) % self.sample_limit == 0:
            self.mtimes = self.mtimes[-self.sample_limit:]

    def sample(self, mtime, size):
        """Sample file status"""
        # Don't keep too many samples
        self.prune_samples()
        # Update time and size
        self.mtimes.append((mtime, size))


class Directory(object):
    def __init__(self, path):
        self.path = path
        self.files = {}

    def prune_files(self):
        """Remove indexed files no longer on disk (by deletion/rename)"""
        # Iterate over a copy, since entries are deleted during the loop
        for f in list(self.files.values()):
            name = f.get_name()
            file = os.path.join(self.path, name)
            if not os.path.exists(file):
                del(self.files[name])

    def scan_files(self):
        # remove duds first
        self.prune_files()
        # find items, grab only files
        items = os.listdir(self.path)
        items = filter(lambda f: os.path.isfile(os.path.join(self.path, f)),
                       items)
        # stat files, building/updating index
        for f in items:
            st = os.stat(os.path.join(self.path, f))
            if not self.files.get(f):
                self.files[f] = File(f)
            self.files[f].sample(st.st_mtime, st.st_size)

    def display_line(self, name, time_now, tm, size, sizedelta):
        time_fmt = Formatter.mtime(time_now, tm)
        size_fmt = Formatter.filesize(size)
        sizedelta_fmt = Formatter.filesizedelta(sizedelta)
        line = "%6.6s %5.5s %6.6s %s" % (time_fmt, size_fmt, sizedelta_fmt,
                                         name)
        # Highlight files modified within the last few seconds
        if time_now - tm < 6:
            line = Formatter.bold(line)
        return line

    def sort_by_name(self, files):
        return sorted(files, key=lambda x: x.get_name())

    def sort_by_mtime(self, files):
        return sorted(files,
                      key=lambda x: (x.get_last_mtime(), x.get_name()))

    def display(self):
        time_now = time.time()
        files = self.sort_by_mtime(self.files.values())
        print("\nmtime> size> delta> name>")
        for f in files:
            line = self.display_line(f.get_name(), time_now,
                                     f.get_last_mtime(), f.get_last_size(),
                                     f.get_last_delta())
            print(line)


def main(path):
    directory = Directory(path)
    while True:
        try:
            directory.scan_files()
            directory.display()
            time.sleep(1)
        except KeyboardInterrupt:
            print("\rUser terminated")
            return


if __name__ == '__main__':
    try:
        path = sys.argv[1]
    except IndexError:
        print("Usage: %s /path" % sys.argv[0])
        sys.exit(1)
    main(path)
Download this code: peek.py

December 1st, 2009
It works nicely when copying stuff to/from slow media.
Cool script, but on the other hand, is a `watch ls -alh` that much worse? It shows fewer details, but when things go wrong, you’ll also notice.
It doesn’t tell you how long it has been since the file was modified, though, which is a very useful fact. Plus the output is less pretty.