Saturday 19 January 2019 — This is almost six years old. Be careful.
I wrote an Open edX blog post about the need to move from Python 2 to Python 3. For emphasis, I wanted to say how much code there was. Open edX is a large project spread across a number of repos. Why spend 30 minutes writing a blog post when you can first spend two hours fiddling around with line-counting tools to get a vague factoid for the blog post?
The old standard tool for line-counting is cloc. It has way too many options, many of which don’t work quite the way I would have expected, but it gets the job done, with some bash support. My resulting monster is below.
It over-counts JavaScript code because there are lots of places that JavaScript gets checked into git that isn’t code we wrote. I don’t know what to do about that. Oh well.
BTW, on the subject of line counting: once, helping someone with a program, I saw they were using semicolons to end their Python statements. I said they didn’t need them, and they replied, “Yes I do, because my manager’s line-counting software requires them.” !!!
Be careful out there...
#!/bin/bash
#
# Count lines of code in a tree of git repos.
# Needs cloc (https://github.com/AlDanial/cloc)
REPORTDIR=/tmp/cloc-reports
mkdir -p $REPORTDIR
rm -rf $REPORTDIR/*
cat <<EOF > $REPORTDIR/exclude-files.txt
package-lock.json
EOF
cat <<EOF > $REPORTDIR/more-langs.txt
reStructured Text
filter remove_matches xyzzy
extension rst
3rd_gen_scale 1.0
SVG Graphics
filter remove_html_comments
extension svg
3rd_gen_scale 1.0
EOF
find . -name .git -type d -prune | while read d; do
dd=$(dirname "$d")
if [[ $dd == ./src/third-party/* ]]; then
# Ignore repos in the "third-party" tree.
continue;
fi
echo "==== $dd =============================================="
cd $dd
git remote -v
REPORTHEAD=$REPORTDIR/${dd##*/}
cloc \
--report-file=$REPORTHEAD.txt \
--read-lang-def=$REPORTDIR/more-langs.txt \
--ignored=$REPORTHEAD.ignored \
--vcs=git \
--not-match-d='.*\.egg-info' \
--exclude-dir=node_modules,vendor,locale \
--exclude-ext=png,jpg,gif,ttf,eot,woff,mo,xcf \
--exclude-list-file=$REPORTDIR/exclude-files.txt \
.
cd -
done
cloc \
--sum-reports \
--read-lang-def=$REPORTDIR/more-langs.txt \
$REPORTDIR/*.txt
Comments
https://github.com/Aaronepower/tokei
Add a comment: