Making good tar files on Windows

Sunday 5 September 2010

I work on Windows, and produce Python kits for various projects. Of course one of the kits is a source kit, as a .tar.gz. Python’s distutils does a fine job of this, but it uses the native system command to tar up the files, and this means my tar files are from Windows.

The problem is that Windows doesn’t know how to make tar files that look native on Unix: the permission bits aren’t right, and everything comes out as 0700:

drwx------ batcheln/100      0 2010-08-21 14:13 coverage-3.4b1/
-rwx------ batcheln/100    516 2010-08-08 09:36 coverage-3.4b1/AUTHORS.txt

The tar command lets me set the mode bits with a command-line switch, which means I could set them in an environment variable. But that would give every file the same permission bits, and directories and files should be different.

To fix this problem, I wrote a distutils extension command. It turned out not to be difficult: distutils has good extensibility.

To make a new command, create a new package directory somewhere; I called mine “distcmd”. To make a command called “fixtar”, you derive a class from distutils.core.Command, name it fixtar, and put it in a module of that package named after the command (distutils looks up distcmd.fixtar). You have to provide three methods:

from distutils.core import Command
import shutil, tarfile

class fixtar(Command):
    """A new command to fix tar file permissions."""

    description = "Re-pack the tar file to have correct permissions."

    user_options = []

    def initialize_options(self):
        """Required by Command, even though I have nothing to add."""

    def finalize_options(self):
        """Required by Command, even though I have nothing to add."""

    def run(self):
        """The body of the command."""
        # Put something useful here!

The initialize_options and finalize_options methods are required. It’s too bad distutils doesn’t allow them to be omitted in cases like mine where there are no options to specify. The run() method is the interesting one: that’s where all the work happens. In mine, I read the tar file distutils made, and I create a new tar file just like it, but with the permissions and owner info that I want:

    def run(self):
        """The body of the command."""
        for _, _, filename in self.distribution.dist_files:
            if filename.endswith(".tar.gz"):
                self.repack_tar(filename, "temp.tar.gz")
                shutil.move("temp.tar.gz", filename)

    def repack_tar(self, infilename, outfilename):
        """Re-pack `infilename` as `outfilename`.

        Permissions and owners are set the way we like them.

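        """
        itar = tarfile.open(infilename, "r:gz")
        otar = tarfile.open(outfilename, "w:gz")
        for itarinfo in itar:
            # Build the new entry from the file on disk (kept by --keep-temp).
            otarinfo = otar.gettarinfo(itarinfo.name)
            if itarinfo.isfile():
                otarinfo.mode = 0o644
            else:
                otarinfo.mode = 0o755
            otarinfo.uid = 100
            otarinfo.gid = 100
            otarinfo.uname = "ned"
            otarinfo.gname = "coverage"
            otar.addfile(otarinfo, itar.extractfile(itarinfo))
        itar.close()
        otar.close()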

To invoke my new command, I have to tell setup.py to pull it in with the --command-packages=distcmd switch. Then I can simply use “fixtar” as a command on the command line just like any other:

python setup.py --command-packages=distcmd sdist --keep-temp --formats=gztar fixtar

Now I get a tar file that looks right:

drwxr-xr-x ned/coverage      0 2010-09-04 19:44 coverage-3.4b2/
-rw-r--r-- ned/coverage    516 2010-08-08 09:36 coverage-3.4b2/AUTHORS.txt
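
If a Unix tar isn't handy for checking the result, the same information can be read back with Python's tarfile module. This is only a rough sketch: list_modes is a name I made up for illustration, and it reports just the mode string, owner/group, and member name rather than the full listing shown above.

```python
import stat
import tarfile


def list_modes(path=None, fileobj=None):
    """Return (mode_string, "owner/group", name) for each archive member.

    Pass either a path to a tar file or an open binary file object.
    """
    rows = []
    with tarfile.open(name=path, fileobj=fileobj, mode="r:*") as tar:
        for info in tar:
            # TarInfo.mode holds the permission bits; add a type bit so
            # stat.filemode() can render the leading "d" or "-".
            type_bits = stat.S_IFDIR if info.isdir() else stat.S_IFREG
            rows.append((
                stat.filemode(type_bits | (info.mode & 0o7777)),
                "%s/%s" % (info.uname, info.gname),
                info.name,
            ))
    return rows
```

Calling list_modes() on the kit that sdist wrote and printing the rows gives a quick check that directories are drwxr-xr-x and files are -rw-r--r--.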

There’s a bit more complication to it, because of the --keep-temp switch that I needed to keep the original tarred files around so they could be re-tarred. The real command also cleans up that directory.
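
A possible way to avoid needing the original files on disk is to reuse the TarInfo objects read from the input archive instead of rebuilding them from the filesystem. This is a sketch of that alternative, not what my command does: repack_in_memory is a hypothetical helper, it works on bytes rather than file names, and it assumes the archive contains only regular files and directories.

```python
import io
import tarfile


def repack_in_memory(data, uname="ned", gname="coverage", uid=100, gid=100):
    """Return new .tar.gz bytes with Unix-style modes and owners.

    `data` is the bytes of an existing .tar.gz archive.
    """
    out = io.BytesIO()
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as itar, \
            tarfile.open(fileobj=out, mode="w:gz") as otar:
        for info in itar:
            # Regular files get 0644, everything else (directories) 0755.
            info.mode = 0o644 if info.isfile() else 0o755
            info.uid, info.gid = uid, gid
            info.uname, info.gname = uname, gname
            # extractfile() returns None for directories, and addfile()
            # then writes only the header.
            otar.addfile(info, itar.extractfile(info))
    return out.getvalue()
```

Because the file contents come straight from the input archive, nothing here depends on the temp directory that --keep-temp preserves.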

This was more than I wanted to do to get the right permissions in a tar file, but it was a chance to see how distutils extensions work, and it lets me make all my kits in one place.


Thank you so much for this post. It solved a problem I was struggling with all day! :)
