Tuesday 16 July 2013 — This is more than 11 years old. Be careful.
Hanging out in IRC channels, you see a lot of the same discussions pop up over and over again. One involves people who want to be “close to the metal.” Either they want “no dependencies” (other than Python itself, which is a large dependency), or they feel like they need to know how things “really work” so they want to use sockets instead of Flask, or something.
Today that topic came up again, and the low-level proponent said it was important to know what’s happening in the CPU when you do “print x”. My feeling is, modern CPUs are hella-complicated beasts, I have no idea how they work, and it hasn’t hindered me.
He thought you should at least have a rough idea of the instruction count for something like that. I asked him to tell me what he thought it was. He guessed 500 instructions for “print x” if x was an integer. I guessed that a) he was off by a factor of at least 10, and b) that we were both making incredibly wild guesses.
Conceptually, printing an integer isn’t much work, but keep in mind that print has to find sys.stdout, and manipulate reference counts, and convert the int to a string, and deal with output buffering, etc, not to mention the general mechanisms of Python bytecode interpretation, memory management, and so on.
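To make that list of steps a bit more concrete, here is a minimal sketch, not a description of what CPython actually does internally. It is written for Python 3, where print is a function rather than the Python 2 statement used in this post. The names manual_print and opnames are hypothetical helpers made up for illustration. The first spells out the visible steps (find the stream, convert to a string, write it); the second lists the bytecode operations the compiler emits for a single print call, which vary between Python versions.

```python
import dis
import io
import sys


def manual_print(value, stream=None):
    """Hypothetical helper: the visible steps behind printing one value."""
    if stream is None:
        stream = sys.stdout       # look up the current stdout at call time
    text = str(value)             # convert the int to a string
    stream.write(text + "\n")     # hand the text to the (usually buffered) stream


def opnames(source):
    """Return the bytecode operation names compiled from `source`."""
    code = compile(source, "<example>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)]


buffer = io.StringIO()
manual_print(1, buffer)           # writes "1\n" into the in-memory buffer

names = opnames("print(x)")
print(names)                      # exact names depend on the Python version
```

Each of those few bytecode operations fans out into many machine instructions inside the interpreter, which is part of why a guess based on the source line alone tends to come out low.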
OK, so we had our two guesses. How to actually measure? Linux has “perf stat”, which can measure all sorts of performance statistics, including the number of instructions executed.
I wrote a simple Python program:
import sys
x = 1
for i in range(int(sys.argv[1])):
    print x
Running this, I can change the number of print statements from the command line, and see how many instructions result by running it under perf stat:
ned@ubuntu:~$ perf stat python foo.py 10
1
1
1
1
1
1
1
1
1
1
Performance counter stats for 'python foo.py 10':
11.913667 task-clock # 0.883 CPUs utilized
21 context-switches # 0.002 M/sec
0 cpu-migrations # 0.000 K/sec
1,221 page-faults # 0.102 M/sec
33,379,047 cycles # 2.802 GHz
19,506,536 stalled-cycles-frontend # 58.44% frontend cycles idle
<not supported> stalled-cycles-backend
28,821,962 instructions # 0.86 insns per cycle
# 0.68 stalled cycles per insn
6,345,082 branches # 532.588 M/sec
292,467 branch-misses # 4.61% of all branches
0.013497566 seconds time elapsed
So, about 28.8 million instructions for that program. Running it again, I saw that the total instruction count fluctuates quite a bit. So I ran it 10 times to get an average: 28,696,694 instructions for 10 print statements.
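Repeating a run ten times and averaging by hand gets tedious, so here is one way it could be scripted. This is a sketch under assumptions, not the procedure actually used for the numbers in this post: it relies on perf stat's -x option (CSV-style output with a field separator) and -e option (select the event), assumes the counter value is the first field and the event name the third, and the function names and the foo.py default are placeholders. It is Python 3 code that drives the run from outside.

```python
import subprocess


def parse_instruction_count(perf_stderr):
    """Pull the 'instructions' counter out of `perf stat -x,` output."""
    for line in perf_stderr.splitlines():
        fields = line.split(",")
        # Assumed field order: counter value, unit, event name, ...
        if len(fields) >= 3 and fields[2].startswith("instructions"):
            return int(fields[0])
    raise ValueError("no instructions counter found in perf output")


def average_instructions(n_prints, runs=10, script="foo.py"):
    """Run the script under perf stat `runs` times and average the counts."""
    counts = []
    for _ in range(runs):
        result = subprocess.run(
            ["perf", "stat", "-x,", "-e", "instructions",
             "python", script, str(n_prints)],
            capture_output=True, text=True, check=True,
        )
        # perf writes its statistics to stderr; the program's own
        # output (the printed 1s) goes to stdout and is ignored here.
        counts.append(parse_instruction_count(result.stderr))
    return sum(counts) / len(counts)


# Example use on a Linux machine with perf installed:
# print(average_instructions(10))
```

perf stat also has a --repeat (-r) option that reruns the command itself and reports averaged counts, which may be simpler if you don't need the individual values.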
Then I ran it 10 times with 11 print statements, for an average of 28,705,257, or a difference of 8,563 instructions for the one extra print statement.
Then I ran it 10 times with 30 print statements, averaged, subtracted, and divided by 20, which should give me another per-print statement instruction count. This time it came out to 10,518 instructions per additional print statement.
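The subtraction-and-division step is simple enough to write down. This sketch uses only the two averages quoted above (10 and 11 print statements); the average for the 30-print runs isn't given, so it isn't reproduced here. The function name per_print_cost is made up for illustration.

```python
def per_print_cost(avg_small, n_small, avg_large, n_large):
    """Instructions per extra print, from two averaged totals."""
    return (avg_large - avg_small) / (n_large - n_small)


# Averages from the text: 10 prints and 11 prints.
one_extra = per_print_cost(28696694, 10, 28705257, 11)
print(one_extra)  # 8563.0

# Rough share of the 10-print run spent on the prints themselves.
# The rest is interpreter startup and shutdown.
share = (10 * one_extra) / 28696694
print("{:.2%}".format(share))
```

Differencing two runs like this cancels out the fixed startup cost, which is why it gives a per-print figure even though startup dominates the total.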
What did we learn?
- Linux has some cool tools.
- Measuring instruction counts is an inexact science.
- There’s a lot more going on in a print statement than some people think.
- Printing an integer in Python takes roughly 10,000 instructions.
Finally, does this matter? I claim that if you want to think about numbers of instructions, then Python (or any other language of its kind) is not for you. Sure, it’s useful to understand the big picture of what goes into Python execution, but tomorrow when I go to work, how does this help me? It’s important to know things like the performance characteristics of data structures, and have an idea of the forces at work on your system.
But number of instructions? Meh.
Comments
These are not Linux specific: hwpmc works on other OSes too.
Knowing in general how things work IS a good idea (because when you don't, you tend to do pathological things), but trying to know what python does to print an integer? Not useful beyond knowing that I/O isn't cheap.
Yes but this does not negate the fact that Linux has cool tools
@Michael Kohne
It astonishes you that people writing programs are curious how things work under the hood? That astonishes me :). I don't think this curiosity misses the point of working with python. Not all knowledge is immediately useful on its own but taken as a whole this knowledge adds to the ability to make informed decisions.
Knowing how many instructions a print in Python takes is interesting. Finding it out is kinda interesting too. Trying to use that knowledge in your day-to-day programming is pointless, and it's likely to cause you to draw bad conclusions.
Anyway the question itself is really interesting.
However, out of pure intellectual curiosity, I couldn't resist comparing to a more-or-less-equivalent program in C. With the same methodology of running 10 times, once with 10 loops and then with 30 loops, I get 11,502 instructions per print statement. More or less a tie with your Python result. That suggests to me that this number of instructions is dominated by the system I/O (which should fundamentally be the same in both cases).
EDIT: Note however, when trying to reproduce your original Python example on my machine, I got 55,000 instructions per Python print, so there's obviously some environmental/platform difference such that my numbers shouldn't be directly compared to yours.
Also, I already mentioned what I thought Python was doing under the covers: "print has to find sys.stdout, and manipulate reference counts, and convert the int to a string, and deal with output buffering, etc, not to mention the general mechanisms of Python bytecode interpretation, memory management, and so on."
Python does much more to accomplish the same thing than assembly code does. The tradeoff is that it can do much of the work for you. For most code, it's a good tradeoff.