OS truce?

Wednesday 6 May 2009

Tell you what: I'll be careful not to create Python kits with \r\n in the source files, if you Mac guys will stop including files like ._setup.py in your tarballs. OK, thanks. Resume coding...


Comments

Nate 7:14 AM on 6 May 2009

Gah... line endings... whose moronic idea was it to use two different line endings for different OSes? Someone needs to just lay down the law and pick one. I don't care which one, but come on, people, seriously, we're almost into the second decade of the new millennium. Computers as we know them have been around for over 30 years. Can we not just pick one line ending and get on with our lives, without having to bend over backward to make a TEXT FILE compatible across platforms?

Ned Batchelder 7:33 AM on 6 May 2009

Well, ironically, or perhaps tellingly, both these artifacts are implementation details that we're stuck with. The \r\n sequence was meant to control teletypes which had two distinct operations to perform: returning the carriage to the first column, and feeding the paper up one line (carriage return, linefeed). The ._foo files on Mac solve the problem of where to store information about a file other than the sequence of bytes it contains, but on a file system that only has the concept of the sequence of bytes.
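
As an illustration of checking a kit for both artifacts before shipping it, here is a minimal Python sketch. The function names audit_tarball and make_demo_tarball, the "kit/" paths, and the demo file contents are all hypothetical; the sketch only looks for member names starting with "._" and for \r\n inside .py members.

```python
import io
import tarfile


def audit_tarball(fileobj):
    """Return (appledouble_names, crlf_names) for a tar archive.

    appledouble_names: members whose base name starts with "._"
    crlf_names: .py members whose bytes contain a \\r\\n sequence
    """
    appledouble, crlf = [], []
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for member in tar.getmembers():
            base = member.name.rsplit("/", 1)[-1]
            if base.startswith("._"):
                appledouble.append(member.name)
            elif member.isfile() and base.endswith(".py"):
                data = tar.extractfile(member).read()
                if b"\r\n" in data:
                    crlf.append(member.name)
    return appledouble, crlf


def make_demo_tarball():
    """Build a small in-memory tarball exhibiting both problems (made-up data)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in [
            ("kit/setup.py", b"print('ok')\n"),
            ("kit/._setup.py", b"placeholder resource data"),
            ("kit/windows.py", b"print('hi')\r\n"),
        ]:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf
```

Calling audit_tarball(make_demo_tarball()) reports kit/._setup.py in the first list and kit/windows.py in the second.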

Wayne Witzel III 7:38 AM on 6 May 2009

Any modern SCM will handle this pretty gracefully. I can't remember the last time I ran into this issue within a translation unit. I've run into it now and then when dealing with some document I need to parse, but most of the standard tools and libraries I use handle \n and \r\n implicitly.

I've run into case-sensitive file naming issues (Foo.py and foo.py) more than newline issues, but I'm a recent Mac convert, so I don't sympathize anymore, hehe.

Robert K 8:05 AM on 6 May 2009

@Wayne: Do much shell scripting? Sure, text editors and IDEs do a great job of hiding this difference from you, but there are cases where it actually matters.

And, sure, an SCM may be able to hide this difference as well. But not all files are shared via SCMs. (Downloaded a tarball lately? Or just pulled a source file directly off someone's website?)

Sorry, I'm with Ned on this one. I run into this all the time. I even have a shell command I cooked up to automatically remove the '\r's from files:

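# Remove \r's from the given files (or, with no arguments, from every
# file under the current directory that contains one)
function cleanse() {
    local files=() file
    if (( $# < 1 )); then
      # Use $'\xd' instead of ^M to make sure it doesn't get replaced in this
      # file by accident
      while IFS= read -r file; do files+=("$file"); done < <(grep -rl $'\xd' .)
    else
      files=("$@")
    fi
    # Quote everything so names containing spaces survive intact
    for file in "${files[@]}"; do
      echo "$file"
      tr -d '\r' < "$file" > "$file.cleanse" && mv "$file.cleanse" "$file"
    done
}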

scott lewis 8:21 AM on 6 May 2009

You forgot about Thumbs.db files. I'm not sure this arms race will ever end. :)

Charles Darke 9:02 AM on 6 May 2009

Not sure what the issue is; most things can handle both line endings. If you're writing stuff yourself, then just deal with both variations.

Ned Batchelder 9:07 AM on 6 May 2009

@Charles: if I distribute a .py file to be run as a script, and the hashbang line ends with \r\n, Unix users can't run the script. It's the old case of being as forgiving as possible when consuming data, and as strict as possible when producing it.
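
To make the hashbang failure concrete, here is a small Python sketch. The helper names shebang_line and has_crlf_shebang are made up for illustration. The assumption, which matches typical Unix behavior, is that the interpreter line is only terminated by \n, so a \r left in front of it stays glued to the last word, and the system ends up looking for a program literally named "python\r".

```python
def shebang_line(script: bytes):
    """Return the bytes after '#!' on the first line, or None if there is no hashbang.

    Only \\n ends the line here, so a stray \\r remains part of the result.
    """
    if not script.startswith(b"#!"):
        return None
    return script[2:].split(b"\n", 1)[0].lstrip(b" \t")


def has_crlf_shebang(script: bytes) -> bool:
    """True if the hashbang line would carry a trailing carriage return."""
    line = shebang_line(script)
    return line is not None and line.endswith(b"\r")


unix_script = b"#!/usr/bin/env python\nprint('hello')\n"
dos_script = b"#!/usr/bin/env python\r\nprint('hello')\r\n"
```

Here shebang_line(dos_script) yields b"/usr/bin/env python\r", while the same call on unix_script yields the clean b"/usr/bin/env python". A check like has_crlf_shebang could run before a kit is built.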

Russell Borogove 12:05 PM on 6 May 2009

You'd think that by now, someone would have modified every *ix shell on the planet to accept a hashbang ending in \r\n. That can't be a difficult modification, and I keep hearing how awesome this "open source" thing is in that regard.

Wayne Witzel III 1:51 PM on 6 May 2009

@Robert: Though I am sure all the reasons you've listed strike a chord with other people, I guess I am not the guy. I can't remember the last time I ran into the issue. I am sure that after leaving these comments I will encounter this problem within the hour.

Michael Watkins 3:13 PM on 6 May 2009

Mac file system garbage is one irritant but I have a more common nit to bring up -- I cannot believe that in this day and age people would still post Python code with tabs. Python 3 spewing all over such whitespace abuse is, well, a lovely restriction in my mind.

Nathan Fritz 12:48 AM on 7 May 2009

@Michael Watkins: Tabs are far superior to spaces because your text editor can display them as whatever amount of whitespace you want. It's fewer characters and it's more flexible. Do you want your tab indents to look like 1 or 2 spaces? Fine, do it. Do you want them to look like 6? More power to you. I'll follow the style guidelines for a given project, but as for me and my house, we'll use tabs. The only problem is when mixing the two. I have a script in Vim which detects whether spaces or tabs are used in the Python file I'm editing, and adjusts what the tab key does accordingly.
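
The Vim script itself isn't shown in this thread. As a rough idea of how that kind of detection can work, here is a Python sketch (not the commenter's script; the function name detect_indent is hypothetical) that classifies a file by the characters found in its leading whitespace.

```python
def detect_indent(source: str) -> str:
    """Classify leading whitespace as 'tabs', 'spaces', 'mixed', or 'none'."""
    saw_tabs = saw_spaces = False
    for line in source.splitlines():
        stripped = line.lstrip(" \t")
        if not stripped:
            continue  # blank or whitespace-only lines carry no information
        indent = line[: len(line) - len(stripped)]
        if "\t" in indent:
            saw_tabs = True
        if " " in indent:
            saw_spaces = True
    if saw_tabs and saw_spaces:
        return "mixed"
    if saw_tabs:
        return "tabs"
    if saw_spaces:
        return "spaces"
    return "none"
```

An editor hook could call this on the buffer contents and set the tab key to insert a tab or spaces to match, treating "mixed" as a warning.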

Robert K 9:05 AM on 7 May 2009

@nathan: Everywhere I've worked for the past 10 years (Sun, Google, AOL, among other places) has standardized on using spaces instead of tabs, as have many (most?) open source projects. The [very good!] reason for this is that code will be rendered consistently for all developers working on the project regardless of their tab settings.

This is important when you consider editor features like auto-wrapping; a file auto-wrapped using a tab-width of 2 will look like hell when later viewed with a tab-width of 8, and vice-versa. The problem is compounded in team environments where the code ends up inconsistently wrapped as different people edit various parts of files.
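
A quick way to see the wrapping problem is to measure how wide the same tab-indented line renders at different tab widths. This Python sketch uses str.expandtabs; the sample line, the helper names, and the 60-column limit are made up for illustration.

```python
def rendered_width(line: str, tab_width: int) -> int:
    """Width of the line once each tab is expanded to the next tab stop."""
    return len(line.expandtabs(tab_width))


def fits(line: str, tab_width: int, limit: int = 60) -> bool:
    """True if the line stays within the column limit at this tab width."""
    return rendered_width(line, tab_width) <= limit


# Two levels of tab indentation in front of an ordinary statement.
line = "\t\tresult = compute(first_argument, second_argument)"

for width in (2, 8):
    print(width, rendered_width(line, width), fits(line, width))
```

The line fits in 60 columns when tabs are 2 wide and overflows when they are 8 wide, so two people wrapping "at 60 columns" would break it in different places.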

Using tabs "in your house" is fine, but be prepared to convert all those tabs to spaces (or have them converted for you!) when you go to share that code with someone outside your house.

...and, yeah, anyone that mixes tabs and spaces needs to be taken out back and given a good smack down! :-)

Bob Congdon 9:49 PM on 10 May 2009

Wow, it took 10 comments for a line ending debate to devolve into tabs vs. spaces. Good to see that we're making progress. ;-)

By the way, up until OS 9, Mac OS used CR as its line ending character.

Trent Mick 12:11 AM on 19 May 2009

@ned: AFAIK, the only way to disable creation of those "._FOO" files when using "python setup.py sdist" on Mac OS X is to set this environment variable:

COPY_EXTENDED_ATTRIBUTES_DISABLE=1
via http://forums.macosxhints.com/archive/index.php/t-43243.html

At least that is the case for tarballs (the default setup.py sdist format on Mac).
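
For anyone scripting their release, one way to apply that setting is to pass it only to the sdist subprocess instead of exporting it globally. This is a minimal sketch: the helper names sdist_env and run_sdist are hypothetical, and it assumes the variable behaves as described above.

```python
import os
import subprocess
import sys


def sdist_env(base=None):
    """Return a copy of the environment with the extended-attribute copy disabled."""
    env = dict(os.environ if base is None else base)
    env["COPY_EXTENDED_ATTRIBUTES_DISABLE"] = "1"
    return env


def run_sdist(project_dir="."):
    """Run 'python setup.py sdist' in project_dir with the variable set."""
    return subprocess.run(
        [sys.executable, "setup.py", "sdist"],
        cwd=project_dir,
        env=sdist_env(),
        check=True,
    )
```

Calling run_sdist() from a release script leaves the rest of the shell session untouched.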
