Deleting files, keeping a few

Monday 12 December 2011

This is almost 13 years old. Be careful.

This is one of those conceptually easy tasks that seems frequently required, and yet needs a complex incantation to accomplish. I have a series of files that will grow over time, and I want to clean them up but keep the most recent N.

After poking around the Google, I found this for deleting PATTERN, but keeping the five most recent:

ls -t1 PATTERN | tail -n +6 | xargs -r rm -r

That’s dash-t-one on the ls command. Or, in words:

  1. List files matching PATTERN, in descending order of modification time, in one column,
  2. Pass through all the trailing lines, starting with the sixth from the beginning,
  3. Bundle all those filenames into an “rm -r” command, but not if there are none.
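To watch those three steps work without risking real data, the pipeline can be run in a scratch directory. This is only a sketch: it assumes GNU coreutils and findutils (`touch -d` with this date format and `xargs -r` are GNU behaviors), and the `demo_dir` variable and `backup-N.tar` names are invented for the demo.

```shell
# Scratch directory with eight dummy files; backup-8.tar gets the newest mtime.
demo_dir=$(mktemp -d)
for i in 1 2 3 4 5 6 7 8; do
    touch -d "2011-12-0$i 12:00" "$demo_dir/backup-$i.tar"
done

# Newest first, skip the first five lines, remove whatever is left.
ls -t1 "$demo_dir"/backup-* | tail -n +6 | xargs -r rm -r

# Only the five newest remain: backup-4.tar through backup-8.tar.
ls "$demo_dir"
```

Running the pipeline a second time deletes nothing, since `tail` then passes no lines through and `xargs -r` skips the `rm` entirely.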

That wasn’t so hard, was it??

Comments

[gravatar]
Isn't xargs unsafe? you could use a loop, and save it to a function:

function rm_old_files { ls -td1 "$@"| tail -n6 | while read f; do rm -i $f; done }

Using a 'read-while' loop like this lets you pipe output (as opposed to using 'find'), and without un-escaped-space problems found in other loops.

Also, in zsh you could do:

rm -i ./(.Oa[1,6])
[gravatar]
Sorry, in the above it should be:

rm -i ./PATTERN(.Oa[1,6])

I'm not sure how to re-edit my post, or add formatting...
[gravatar]
As ever, put -- (double dash) after your rm if it's taking arbitrary file names :)
[gravatar]
You don't need the -1 option to ls; it'll do it anyway if you're piping its output (and you don't specify -C). From prior knowledge on Linux, verified by experiment. From the man page for OS X.
[gravatar]
Moving this way means inventing logrotate. Write your output into a single file and set up that thing that already rotates /var/log/messages on your system.
[gravatar]
@CH: good point about xargs and tricky filenames. That's definitely something to keep in mind.

@Chris: excellent point, -- can only make things safer.

@Ed: another good point, I should have thought about the '1' myself.

@void: in my case, the things to delete were not simple text files, but entire trees, which means I need -d on the ls command also.
[gravatar]
Malcolm Tredinnick 4:47 AM on 13 Dec 2011
In addition to CH's suggestions, the -d option to xargs is idiomatic in this kind of situation: where you know that only newlines separate entries, not any old whitespace.

I find the tail-piped-to-xargs style of solution more memorable and readable than having to work out while-read loops in shell, but that's possibly a matter of muscle memory. I use the former style much more frequently.
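A rough sketch of the `-d` variant, assuming GNU xargs (the `-d` option is not in POSIX, and the BSD/macOS xargs does not have it). The `work` variable and the file names are made up, chosen to contain spaces so the difference shows:

```shell
# Seven dummy files whose names contain a space; "night 7.log" is newest.
work=$(mktemp -d)
for i in 1 2 3 4 5 6 7; do
    touch -d "2011-12-0$i 12:00" "$work/night $i.log"
done

# -d '\n' makes xargs split on newlines only, so each name stays one argument.
# -- stops rm from treating a name that begins with a dash as an option.
ls -t "$work"/* | tail -n +6 | xargs -r -d '\n' rm -r --

# "night 3.log" through "night 7.log" remain.
ls "$work"
```

Without `-d '\n'`, xargs would split each name at the space and `rm` would be handed paths that do not exist. Names containing newlines would still break this, since `ls` output is newline-separated.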
[gravatar]
@CH: Your zsh expression is a bit off... I think that would delete the six oldest files, and leave the rest. I think you're looking for something like "rm PATTERN(oa[7,-1])" -- sort by last-accessed time, skip the first 6 values (it's 1-indexed, not zero-indexed), and pass the rest to "rm".

Personally I think I'd do this with pipes, though, it's easier to maintain that way.
[gravatar]
There's also the option of using find, with -newer or -ctime.
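For comparison, a minimal sketch of the find approach. Note that it selects by age rather than by count, so it answers a slightly different question than "keep the newest N". It uses `-mtime` instead of the `-newer` or `-ctime` tests mentioned above, because modification time is what `ls -t` sorts by. It assumes GNU find and GNU touch; the `olddir` variable, the file names, and the 7-day cutoff are illustrative only.

```shell
# One file last modified 30 days ago, one modified just now.
olddir=$(mktemp -d)
touch -d "30 days ago" "$olddir/stale.log"
touch "$olddir/fresh.log"

# -mtime +7 matches files whose age, counted in whole days, is more than 7.
# -exec ... {} + passes the names straight to rm, so spaces are not a problem.
find "$olddir" -maxdepth 1 -type f -mtime +7 -exec rm -- {} +

# Only fresh.log remains.
ls "$olddir"
```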
[gravatar]
@Malcolm
You could also use GNU Parallel, as suggested here

@Graham
Ah, my example just deletes the last 6, and the zsh snippet doesn't delete directories. Your zsh snippet is right, and the shell function would be:
function rm_old_files {
    ls -td1 "$@" | tail -n +7 | while IFS= read -r f; do rm -- "$f"; done
}
For another point of safety, I've used 'IFS= read -r' as well, and quoted "$f" so names with spaces stay in one piece.

Also, I've discovered a weakness: you cannot use '-i' with rm (or any other interactive command) inside the loop, as it will compete with 'read' for standard input...

@Anon
I prefer not to use find, but then there's also find2perl :-D
[gravatar]
Thank you for the GNU parallel suggestion CH.

I hadn't run across the program before and it seems like a very nice tool -- running tools in parallel, local and remote execution, etc, etc...

Its Wikipedia article links to a couple of very nice introductory videos: http://commons.wikimedia.org/wiki/GNU_parallel
