Be careful deleting files around git

Saturday 2 May 2015

Working in a Python project, it's common to have a clean-up step that deletes all the .pyc files, like this:

find . -name '*.pyc' -delete

This works great, but there's a slight chance of a problem: Git records information about branches in files within the .git directory. These files have the same name as the branch.

Try this:

git checkout -b cleanup-all-.pyc

This makes a branch called "cleanup-all-.pyc". After making a commit, I will have files named .git/refs/heads/cleanup-all-.pyc and .git/logs/refs/heads/cleanup-all-.pyc. Now if I run my find command, it will delete those files inside the .git directory, and my branch will be lost.
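
The effect can be reproduced without risking a real repository. The sketch below builds a scratch directory that only imitates the layout described above (the pkg directory and the mod.py and mod.pyc files are made up for illustration, and no git command is involved), then runs the unguarded clean-up inside it.

```shell
# Scratch directory that imitates a repository layout; this is not a real git repo.
demo=$(mktemp -d)
mkdir -p "$demo/.git/refs/heads" "$demo/.git/logs/refs/heads" "$demo/pkg"
touch "$demo/.git/refs/heads/cleanup-all-.pyc" \
      "$demo/.git/logs/refs/heads/cleanup-all-.pyc" \
      "$demo/pkg/mod.pyc" "$demo/pkg/mod.py"

# The unguarded clean-up, run inside the scratch directory.
(cd "$demo" && find . -name '*.pyc' -delete)

# List the files that are left.
remaining=$(cd "$demo" && find . -type f | sort)
echo "$remaining"
```

Only ./pkg/mod.py is listed afterwards: the two imitation branch files under .git were deleted along with the real bytecode file.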

One way to fix it is to tell find not to delete the file if it's found in the .git directory:

find . -name '*.pyc' -not -path './.git/*' -delete

A better way is:

find . -name '.git' -prune -o -name '*.pyc' -exec rm {} \;

The first command still examines every file in .git, but won't delete the .pyc files it finds there. The second command skips the entire .git directory and wastes no time examining it.
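
As a rough check that both forms protect .git, here is a sketch that runs each of them against its own scratch directory. The make_demo helper and the file names are hypothetical, and the directories only imitate a repository layout.

```shell
# Build an imitation repository layout (hypothetical names, no real git involved).
make_demo() {
    d=$(mktemp -d)
    mkdir -p "$d/.git/refs/heads" "$d/pkg"
    touch "$d/.git/refs/heads/cleanup-all-.pyc" "$d/pkg/mod.pyc" "$d/pkg/mod.py"
    echo "$d"
}

# First form: find still walks .git, but -not -path filters those matches out.
first=$(make_demo)
(cd "$first" && find . -name '*.pyc' -not -path './.git/*' -delete)

# Second form: -prune stops find from descending into .git at all.
second=$(make_demo)
(cd "$second" && find . -name '.git' -prune -o -name '*.pyc' -exec rm {} \;)

# Both trees should hold the same files: branch file kept, pkg/mod.pyc removed.
left_first=$(cd "$first" && find . -type f | sort)
left_second=$(cd "$second" && find . -type f | sort)
echo "$left_first"
echo "$left_second"
```

Ending the second form with -exec rm {} + instead of \; would pass many file names to each rm invocation rather than starting one rm per file.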

UPDATE: I originally had -delete in that latter command, but find doesn't like -prune and -delete together: -delete automatically turns on -depth, and -prune does nothing when -depth is in effect. It seems simplistic and unfortunate, but there it is.


Comments

EricH 11:19 PM on 2 May 2015

I always use

git clean -dxf
-- dangerous but effective.
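
A hedged aside on why this is dangerous: -x makes git clean remove ignored files as well as untracked ones, so it can take uncommitted source files with it. The sketch below uses git clean's -n (dry run) switch in a throwaway repository to preview the deletions first. The file names are invented for the demo.

```shell
# Throwaway repository in a temp directory; mod.py and mod.pyc are made-up files.
repo=$(mktemp -d)
cd "$repo"
git init -q .
touch mod.py mod.pyc

# -n is the dry-run switch: it reports what -dxf would delete and deletes nothing.
preview=$(git clean -dxn)
echo "$preview"

# Only after reading the preview, run the real thing.
git clean -dxf -q
```

In this demo the preview names mod.py as well as mod.pyc, because neither file has been added to the repository.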

jhermann 9:34 AM on 3 May 2015

I always use "invoke clean --all". ☺

Roger Lipscombe 11:38 AM on 3 May 2015

Don't use dots when naming branches? Always make sure your branches are pushed to origin (so you can always just pull the branch again)? Don't have long-lived branches (so that if you do accidentally lose one, it's no big deal)?

Tres Seaver 5:42 PM on 3 May 2015

Always dry run the "find" command to see the files it locates, and adjust to exclude unwanteds.
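
One way to sketch that dry run: keep the same tests but end the expression with -print instead of a destructive action, read the listing, and tighten the expression until nothing unwanted shows up. The scratch layout and file names below are hypothetical.

```shell
# Imitation layout (hypothetical names; no real repository is touched).
demo=$(mktemp -d)
mkdir -p "$demo/.git/refs/heads" "$demo/pkg"
touch "$demo/.git/refs/heads/cleanup-all-.pyc" "$demo/pkg/mod.pyc"
cd "$demo"

# Dry run: same name test as the clean-up, but -print instead of deleting.
candidates=$(find . -name '*.pyc' -print | sort)
echo "$candidates"

# The listing includes a path under .git, so prune that directory and look again.
adjusted=$(find . -name '.git' -prune -o -name '*.pyc' -print)
echo "$adjusted"
```

Once the listing contains only files that should go, the -print can be swapped for the real action.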

Ryne Everett 12:05 AM on 4 May 2015

Don't you get an error on the last command because you're using -prune and -delete together?

$ find . -name '.git' -prune -o -name '*.pyc' -delete
find: The -delete action atomatically turns on -depth, but -prune does nothing when -depth is in effect.  If you want to carry on anyway, just explicitly use the -depth option.
$ echo $?
1
$ find --version
find (GNU findutils) 4.4.2

Ned Batchelder 10:20 AM on 4 May 2015

@ryne thanks, I guess I mis-tested that command! I've replaced the -delete with -exec rm {} \;

I've learned a lot about find with this post, some of it I don't agree with, but I've learned it... :)

Lars Solberg 9:57 PM on 5 May 2015

Even if you delete those files, you won't delete any real data. Those files are only references to the hash that the branch was pointing at.
You can probably use `git reflog` to find out where it was pointing, then do something like `git branch branchname abc123hashhere`. Not tested, but I think this works.
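
A sketch of that recovery in a throwaway repository, under some assumptions: refs are stored as loose files (not yet packed), the lost branch is not the one currently checked out, and the identity and commit messages are placeholders. The branch's own reflog is deleted by the clean-up too, but .git/logs/HEAD does not match *.pyc, so HEAD's reflog still has the commit.

```shell
# Throwaway repository; identity and commit messages are placeholders for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q .
commit() { git -c user.name=demo -c user.email=demo@example.invalid commit -q --allow-empty -m "$1"; }
commit "base"
base=$(git symbolic-ref --short HEAD)

git checkout -q -b cleanup-all-.pyc
commit "work on the branch"
tip=$(git rev-parse HEAD)
git checkout -q "$base"

# The unguarded clean-up removes the branch's ref file and its per-branch reflog.
find . -name '*.pyc' -delete
git rev-parse -q --verify refs/heads/cleanup-all-.pyc >/dev/null || echo "branch is gone"

# HEAD's reflog survives. In this demo the lost commit is one step back;
# in practice, read the output of `git reflog` and pick the right entry.
lost=$(git rev-parse 'HEAD@{1}')
git branch cleanup-all-.pyc "$lost"
recovered=$(git rev-parse refs/heads/cleanup-all-.pyc)
```

After the last command the re-created branch points at the same commit it did before the clean-up.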

Mick T. 8:22 PM on 29 May 2015

$ echo "# ignore python compiled files:
> *.py[cod]
> " >> .gitignore; git add .gitignore; git commit -m "ignore python compiled files"

;)

On a more serious note, why would you commit Python binaries to your repo?

Ned Batchelder 10:01 PM on 29 May 2015

@Mick T: I don't commit .pyc files to my repo, and this is not about .pyc files in the working tree. The problem has to do with *.pyc files in the .git directory. They are not compiled Python files at all, they are branches.
