Deleting files, keeping a few

Monday 12 December 2011

This is one of those conceptually easy tasks that is frequently required, and yet needs a complex incantation to accomplish. I have a series of files that will grow over time, and I want to clean them up but keep the most recent N.

After poking around the Google, I found this for deleting PATTERN, but keeping the five most recent:

ls -t1 PATTERN | tail -n +6 | xargs -r rm -r

That's dash-t-one on the ls command. Or, in words:

  1. List files matching PATTERN, in descending order of modification time, in one column,
  2. Pass through all the trailing lines, starting with the sixth from the beginning,
  3. Bundle all those filenames into an "rm -r" command, but not if there are none.

That wasn't so hard, was it??
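To see the three steps in action without risking real data, here is a minimal sketch run in a scratch directory. The backup-* names and the dates are made up for illustration, and the -r flag on xargs assumes GNU xargs.

```shell
# Scratch directory so nothing real is touched.
demo=$(mktemp -d)
cd "$demo" || exit 1

# Eight files with increasing modification times (backup-1 oldest, backup-8 newest).
for i in 1 2 3 4 5 6 7 8; do
    touch -t "2011120${i}1200" "backup-$i"
done

# Step 1: newest first, one per line.
ls -t1 backup-*

# Steps 1 and 2: everything from the sixth line on, i.e. the three oldest.
ls -t1 backup-* | tail -n +6

# The full command; -r (GNU xargs) skips rm entirely when nothing is left to delete.
ls -t1 backup-* | tail -n +6 | xargs -r rm -r

# What survives: the five newest.
ls -t1 backup-*
```

Note the off-by-one: keeping N files means starting tail at line N+1.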


CH 10:51 AM on 12 Dec 2011

Isn't xargs unsafe? You could use a loop, and save it to a function:

function rm_old_files { ls -td1 "$@"| tail -n6 | while read f; do rm -i $f; done }

Using a 'read-while' loop like this lets you pipe output (as opposed to using 'find'), and without un-escaped-space problems found in other loops.

Also, in ZSH you could do:

rm -i ./(.Oa[1,6])

CH 10:53 AM on 12 Dec 2011

Sorry, in the above it should be:

rm -i ./PATTERN(.Oa[1,6])

I'm not sure how to re-edit my post, or add formatting...

Chris 12:06 PM on 12 Dec 2011

As ever, put -- (double dash) after your rm if it's taking arbitrary file names :)
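A minimal sketch of why the double dash matters, using made-up filenames that begin with a dash. It assumes GNU xargs for -r; the -snap names are purely illustrative.

```shell
# Scratch directory with awkward names.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Seven files whose names begin with a dash; -snap1 is oldest, -snap7 newest.
for i in 1 2 3 4 5 6 7; do
    touch -t "2011120${i}1200" -- "-snap$i"
done

# Without --, ls and rm would try to parse "-snap1" as a bundle of options.
# With --, everything after it is treated as a filename.
ls -td -- -snap* | tail -n +6 | xargs -r rm -r --

# The five newest remain.
ls -- -snap*
```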

Ed Davies 1:41 PM on 12 Dec 2011

You don't need the -1 option to ls; it'll do it anyway if you're piping its output (and you don't specify -C). I know this from prior knowledge on Linux, verified by experiment, and from the man page for OS X.

void 4:46 PM on 12 Dec 2011

Moving this way means inventing logrotate. Write your output into a single file and set up that thing that already rotates /var/log/messages on your system.

Ned Batchelder 7:30 PM on 12 Dec 2011

@CH: good point about xargs and tricky filenames. That's definitely something to keep in mind.

@Chris: excellent point, -- can only make things safer.

@Ed: another good point, I should have thought about the '1' myself.

@void: in my case, the things to delete were not simple text files, but entire trees, which means I need -d on the ls command also.

Malcolm Tredinnick 4:47 AM on 13 Dec 2011

In addition to CH's suggestions, the -d option to xargs is idiomatic in this kind of situation: where you know that only newlines separate entries, not any old whitespace.
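As a rough illustration of the -d option, here is a sketch with directory names that contain spaces. It assumes GNU xargs (the -d and -r options are GNU extensions) and names without embedded newlines; the "nightly build" names are invented for the example.

```shell
# Scratch directory with names containing spaces.
work=$(mktemp -d)
cd "$work" || exit 1

# Seven directories; "nightly build 1" is oldest, "nightly build 7" newest.
for i in 1 2 3 4 5 6 7; do
    mkdir "nightly build $i"
    touch -t "2011120${i}1200" "nightly build $i"
done

# -d '\n' makes xargs split on newlines only, so each line is one argument
# and the spaces inside the names are left alone.
ls -td -- nightly* | tail -n +6 | xargs -r -d '\n' rm -r --

# Five directories remain.
ls
```

Without -d, xargs would split "nightly build 1" into three separate arguments.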

I find the tail-piped-to-xargs style of solution more memorable and readable than having to work out while-read loops in shell, but that's possibly a matter of muscle memory. I use the former style much more frequently.

Graham Fawcett 7:38 AM on 13 Dec 2011

@CH: Your zsh expression is a bit off... I think that would delete the six oldest files, and leave the rest. I think you're looking for something like "rm PATTERN(oa[7,-1])" -- sort by last-accessed time, skip the first 6 values (it's 1-indexed, not zero-indexed), and pass the rest to "rm".

Personally I think I'd do this with pipes, though; it's easier to maintain that way.

Anonymous 8:05 AM on 13 Dec 2011

There's also the option of using find, with -newer or -ctime.
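One way the -newer route might look, as a sketch only: pick the fifth-newest file as a reference, then have find remove everything that is not newer than it. The dump-* names are hypothetical, -maxdepth is a GNU/BSD extension rather than POSIX, and the names are assumed to contain no newlines or glob characters.

```shell
# Scratch directory with eight files; dump-1 is oldest, dump-8 newest.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
for i in 1 2 3 4 5 6 7 8; do
    touch -t "2011120${i}1200" "dump-$i"
done

# The fifth-newest match is the oldest one we want to keep.
ref=$(ls -td -- dump-* | sed -n 5p)

# Delete matches that are not newer than the reference, except the reference itself.
# Caveat: another file with exactly the same timestamp as $ref would also be removed.
if [ -n "$ref" ]; then
    find . -maxdepth 1 -name 'dump-*' ! -newer "$ref" ! -name "$ref" -exec rm -r -- {} +
fi

ls
```

This still leans on ls to find the reference file, so it is not obviously simpler than the pipe version.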

CH 9:28 AM on 13 Dec 2011

You could also use GNU Parallel as suggested here

Ah, my examples just delete the last 6, and the ZSH snippet doesn't delete directories. Your ZSH snippet is right, and the shell function would be:

function rm_old_files {
    ls -td1 "$@" | tail -n +7 | while IFS= read -r f; do rm -- "$f"; done
}

For another point of safety, I've used 'IFS= read -r' as well.

Also, I've discovered a weakness: you cannot use '-i' with rm (or use any interactive command), as that will conflict with 'read'...
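The conflict arises because, inside the loop, standard input is the pipe carrying the filenames, so rm -i reads its answer from there too. One possible workaround, sketched here under the hypothetical name rm_old_files_i, is to park the original standard input on file descriptor 3 and hand that to rm:

```shell
# Hypothetical variant of rm_old_files that keeps rm -i usable.
# Keeps the six newest matches, like the function above.
rm_old_files_i() {
    {
        ls -td -- "$@" | tail -n +7 | while IFS= read -r f; do
            # read consumes filenames from the pipe (fd 0);
            # rm takes its yes/no answer from the saved stdin (fd 3).
            rm -i -- "$f" <&3
        done
    } 3<&0   # copy the caller's stdin to fd 3 before the pipe takes over fd 0
}
```

Called as rm_old_files_i PATTERN from a terminal, each prompt should be answered from the keyboard as usual.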

I prefer not to use find, but then there's also find2perl :-D

Leon Matthews 7:23 PM on 13 Dec 2011

Thank you for the GNU parallel suggestion CH.

I hadn't run across the program before and it seems like a very nice tool -- running tools in parallel, local and remote execution, etc, etc...

Its Wikipedia article links to a couple of very nice introductory videos.
