Ned Batchelder: blameall.py
Created 27 November 2007

One thing I've missed from Perforce since using Subversion is the "p4 annotate -a" command. This annotates a file with the revisions that introduced each line, much like the "svn blame" command. But the -a switch tells it to include every revision of every line. This is a way of getting the complete history of the file in one textual output. It's great for finding a snippet that you suspect existed somewhere in the file's past.

Blameall.py provides the same feature, but for Subversion.

For example, let's say you have a file with a number of revisions.

Revision 26: Shopping List
Revision 27: Shopping List
Revision 28: Shopping List

Running blameall shows the history of the file in one series of lines:

$ python blameall.py shoplist.txt

This shows us that "Milk" appeared in revision 26 and was present through revision 27. "Shopping List" appeared in 26, and is still in the file in the head revision.

It can be slow to get all the revisions, but it's faster than manually searching through old revisions for that piece you know was back there somewhere.

You can provide a -r argument to blameall to limit its attention to a particular range of revisions.

Getting it

Blameall is a single-file python script, no need to install anything. Just download and run:

Download: blameall.py
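To make "every revision of every line" concrete, here is a minimal sketch of the idea using Python's standard difflib. It is not the code in blameall.py: the function name `blame_all`, the in-memory revision list, and every shopping list item other than "Shopping List" and "Milk" are assumptions made up for illustration. A real tool would fetch each revision's text from Subversion first.

```python
import difflib


def blame_all(revisions):
    """Merge successive revisions into one list of every line ever seen.

    revisions: list of (rev_number, list_of_lines), oldest first.
    Returns entries [first_rev, last_rev, text] in display order, where
    last_rev is None for lines still present in the newest revision.
    """
    merged = []
    prev_rev = None
    for rev, lines in revisions:
        # Indexes of lines that were still alive in the previous revision.
        live_idx = [i for i, entry in enumerate(merged) if entry[1] is None]
        old = [merged[i][2] for i in live_idx]
        matcher = difflib.SequenceMatcher(None, old, lines, autojunk=False)
        inserts = {}  # position in merged -> new entries to insert there
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag in ("delete", "replace"):
                # These lines were last seen in the previous revision.
                for k in range(i1, i2):
                    merged[live_idx[k]][1] = prev_rev
            if tag in ("insert", "replace"):
                # New lines go just before the next surviving line.
                pos = live_idx[i2] if i2 < len(live_idx) else len(merged)
                inserts.setdefault(pos, []).extend(
                    [rev, None, text] for text in lines[j1:j2]
                )
        # Apply insertions back to front so earlier positions stay valid.
        for pos in sorted(inserts, reverse=True):
            merged[pos:pos] = inserts[pos]
        prev_rev = rev
    return merged


# Hypothetical revisions, consistent with the description in the post.
history = [
    (26, ["Shopping List", "Milk", "Eggs"]),
    (27, ["Shopping List", "Milk", "Eggs", "Bread"]),
    (28, ["Shopping List", "Eggs", "Bread"]),
]

if __name__ == "__main__":
    for first, last, text in blame_all(history):
        span = f"{first}-{last}" if last is not None else f"{first}-"
        print(f"{span:<8}{text}")
```

Each entry records the revision where a line first appeared and the last revision that still contained it, which is the information the post describes for "Milk" and "Shopping List". The printed layout here is arbitrary and does not imitate blameall's actual output.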
Comments
Excellent. Thanks Ned!
Cool tool! I don't see anything that says whether your blog
comments have markup characters, so this comment might look
weird.
The log lines in my personal repository look like this:
r218 | (no author) | 2007-11-05 13:07:43 -0500 (Mon, 05 Nov 2007) | 2 lines
(I'm the only person who uses it, so I've never bothered
getting a username to show up.) To get blameall.py to work
for me, I had to change the regex on line 122:
--- blameall.py.original 2007-11-28 12:00:48.531250000 -0500
+++ blameall.py.corrected 2007-11-28 12:03:09.078125000 -0500
@@ -119,7 +119,7 @@
if not revline and not log:
# End of the log.
break
- m = re.match(r"r(?P[0-9]+) \| (?P[^ ]+) \| [^|]+ \| (?P[0-9]+) line.*", revline)
+ m = re.match(r"r(?P[0-9]+) \| (?P(?:\(no author\)|[^ ]+)) \| [^|]+ \| (?P[0-9]+) line.*", revline)
if not m:
raise Exception("Couldn't scrape log line: %r\nRemaining: %r" % (revline, log))
revs.append((int(m.group('rev')), m.group('user')))
The diff in my last comment is hard to read and the angle
brackets got swallowed; the short version is that I changed
> (?P<user>[^ ]+)
to
> (?P<user>(?:\(no author\)|[^ ]+))
No need to introduce more alternatives ;)
(?P[^|]+)
catches everything between the two pipe symbols except the leading and trailing single space (so even trailing spaces, except the last one, would be caught).
If you look into the next expression, you can see that [^|]+ is already working :)
And for that matter, the trailing .* is unnecessary, too (or even the whole line.*).
So a more compact and yet more general version would be:
r(?P[0-9]+) \| (?P[^|]+) \| [^|]+ \| (?P[0-9]+)
The symbolic group names rev, user and lines were lost during the submit process; they would have to be added after each ?P
And you obviously have a Unicode problem in this blog (see my name in the last comment ;)
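For readers following along, here is a small sketch of the compact pattern with the group names rev, user and lines put back, run against the "(no author)" log line quoted earlier in the thread. The variable names are made up for the example, and this is not necessarily the exact expression now used in blameall.py.

```python
import re

# The compact pattern, with the named groups restored.
COMPACT = re.compile(
    r"r(?P<rev>[0-9]+) \| (?P<user>[^|]+) \| [^|]+ \| (?P<lines>[0-9]+)"
)

# The original user group, which stops at the first space.
ORIGINAL = re.compile(
    r"r(?P<rev>[0-9]+) \| (?P<user>[^ ]+) \| [^|]+ \| (?P<lines>[0-9]+) line.*"
)

revline = "r218 | (no author) | 2007-11-05 13:07:43 -0500 (Mon, 05 Nov 2007) | 2 lines"

m = COMPACT.match(revline)
print(m.group("rev"), repr(m.group("user")), m.group("lines"))

# The original pattern cannot match, because "(no author)" contains a space.
print(ORIGINAL.match(revline))
```

The user group backtracks over the single space before the pipe, so it captures "(no author)" without trailing whitespace.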
It would be nice if you would catch the revision timestamp and add an option to display it next to the revision number, too.
Then we could more easily blame old colleagues, who are long gone, to the project leader ;)
Subversion's command-line client has an "--xml" option for most commands (most notably "svn log"), which comes out in a readily parseable format -- no need for "tweaking the regex".
Even better, the Python bindings for Subversion offer a direct interface -- no parsing required!
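As a rough sketch of the `--xml` suggestion, the log can be read with the standard library's ElementTree instead of a regex. The helper names `parse_svn_log_xml` and `svn_log` are hypothetical, the sample XML is hand-written to resemble `svn log --xml` output rather than copied from a real repository, the author "ned" is invented, and the sample assumes that a revision with no author simply has no author element.

```python
import subprocess
import xml.etree.ElementTree as ET


def parse_svn_log_xml(xml_data):
    """Return (revision, author, date) tuples from `svn log --xml` output.

    author is None when the log entry has no <author> element.
    """
    root = ET.fromstring(xml_data)
    revs = []
    for entry in root.iter("logentry"):
        rev = int(entry.get("revision"))
        author = entry.findtext("author")  # None if the element is absent
        date = entry.findtext("date")
        revs.append((rev, author, date))
    return revs


def svn_log(path):
    """Run the svn command-line client and parse its XML log for path."""
    result = subprocess.run(
        ["svn", "log", "--xml", path],
        capture_output=True,
        check=True,
    )
    return parse_svn_log_xml(result.stdout)


# Hand-written sample, assumed to match the general shape of the real output.
SAMPLE = """<log>
<logentry revision="218">
<date>2007-11-05T18:07:43.000000Z</date>
<msg>second change</msg>
</logentry>
<logentry revision="217">
<author>ned</author>
<date>2007-11-04T12:00:00.000000Z</date>
<msg>first change</msg>
</logentry>
</log>"""

if __name__ == "__main__":
    for rev, author, date in parse_svn_log_xml(SAMPLE):
        print(rev, author or "(no author)", date)
```

This still depends only on the command-line client, and it also yields the revision timestamp without extra scraping.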
@Aaron: didn't realize that syntax existed, thanks for the pointer. I've updated the code to account for it.
@Rene: thanks for the fine-tuning of the regex, and I'm sorry about the Unicode. My PHP skills are not that polished, but I can't switch my blog infrastructure just yet.
@Rob: I didn't know about the --xml switch, but the regex is working pretty well for me at the moment, so I guess it stays. And the Python bindings feel like another dependency, so I'm glad to rely only on the command line client.
Ned, thanks - I've already found this a really useful script (discovered via your recent svn user list email)