I wrote an Open edX blog post about the need to move from Python 2 to Python 3. For emphasis, I wanted to say how much code there was. Open edX is a large project spread across a number of repos. Why spend 30 minutes writing a blog post when you can first spend two hours fiddling around with line-counting tools to get a vague factoid for the blog post?
The old standard tool for line-counting is cloc. It has way too many options, many of which don’t work quite the way I would have expected, but it gets the job done, with some bash support. My resulting monster is below.
BTW, on the subject of line counting: once, helping someone with a program, I saw they were using semicolons to end their Python statements. I said they didn’t need them, and they replied, “Yes I do, because my manager’s line-counting software requires them.” !!!
Be careful out there...
# Count lines of code in a tree of git repos.
# Needs cloc (https://github.com/AlDanial/cloc)
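# REPORTDIR must already be set; ":?" aborts instead of letting rm -rf expand to "/*".
mkdir -p "${REPORTDIR:?REPORTDIR must be set}"
rm -rf "${REPORTDIR:?}"/*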
cat <<EOF > $REPORTDIR/exclude-files.txt
cat <<EOF > $REPORTDIR/more-langs.txt
filter remove_matches xyzzy
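find . -name .git -type d -prune | while read -r d; do
    # $d is the .git directory; the repo itself is its parent.
    dd=$(dirname "$d")
    if [[ $dd == ./src/third-party/* ]]; then
        # Ignore repos in the "third-party" tree.
        continue
    fi
    echo "==== $dd =============================================="
    git -C "$dd" remote -v
done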
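
If you want to try the repo-walking idea on its own, here is a minimal sketch split into two small functions. The names `list_repos` and `count_repo` are made up for this sketch and are not part of the script above, and the way the report file is named is also an assumption. The only cloc options used are `--quiet` and `--report-file`. It is written in portable shell, so `case` stands in for the `[[ ... ]]` test.

```shell
# Hypothetical helper: print each git repo under a root directory,
# skipping anything inside the src/third-party tree.
list_repos() {
    root=${1:-.}
    find "$root" -name .git -type d -prune | while IFS= read -r gitdir; do
        # The repo is the parent of its .git directory.
        repo=$(dirname "$gitdir")
        case "$repo" in
            "$root"/src/third-party/*) continue ;;
        esac
        printf '%s\n' "$repo"
    done
}

# Hypothetical helper: run cloc on one repo and write a per-repo report.
# The report name (the repo path with non-alphanumerics turned into "_")
# is an assumption made for this sketch.
count_repo() {
    repo=$1
    reportdir=$2
    name=$(printf '%s' "$repo" | tr -c 'A-Za-z0-9' '_')
    cloc --quiet --report-file="$reportdir/$name.txt" "$repo"
}

# Possible usage, assuming cloc is installed and REPORTDIR is set:
#   list_repos . | while IFS= read -r repo; do
#       count_repo "$repo" "$REPORTDIR"
#   done
```

Keeping the listing separate from the counting means you can check which repos will be included before spending any time running cloc.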