Thursday 29 March 2018
A common question: “Is Python interpreted or compiled?” Usually, the asker has a simple model of the world in mind, and as is typical, the world is more complicated.
In the simple model of the world, “compile” means to convert a program in a high-level language into a binary executable full of machine code (CPU instructions). When you compile a C program, this is what happens. The result is a file that your operating system can run for you.
In the simple definition of “interpreted”, executing a program means reading the source code a line at a time, and doing what it says. This is the way some shells operate.
But the real world is not so limited. Making real programming languages useful and powerful involves a wider range of possibilities about how they work. Compiling is a more general idea: take a program in one language (or form), and convert it into another language or form. Usually the source form is a higher-level language than the destination form, such as when converting from C to machine code. But converting from a newer version of JavaScript (such as ES2017, sometimes called ES8) to ES5 is also a kind of compiling.
In Python, the source code is compiled into a much simpler form called bytecode. These are instructions similar in spirit to CPU instructions, but instead of being executed by the CPU, they are executed by software called a virtual machine. (These are not VMs that emulate entire operating systems, just a simplified CPU execution environment.)
Here’s an example of a short Python function, and its bytecode:
>>> import dis
>>> def example(x):
...     for i in range(x):
...         print(2 * i)
...
>>> dis.dis(example)
  2           0 SETUP_LOOP              28 (to 30)
              2 LOAD_GLOBAL              0 (range)
              4 LOAD_FAST                0 (x)
              6 CALL_FUNCTION            1
              8 GET_ITER
        >>   10 FOR_ITER                16 (to 28)
             12 STORE_FAST               1 (i)

  3          14 LOAD_GLOBAL              1 (print)
             16 LOAD_CONST               1 (2)
             18 LOAD_FAST                1 (i)
             20 BINARY_MULTIPLY
             22 CALL_FUNCTION            1
             24 POP_TOP
             26 JUMP_ABSOLUTE           10
        >>   28 POP_BLOCK
        >>   30 LOAD_CONST               0 (None)
             32 RETURN_VALUE
>>>
The dis module in the Python standard library is the disassembler that can show you Python bytecode. It’s also the best (but not great) documentation for the bytecode itself. If you want to know more about how Python’s bytecode works, there are lots of conference talks about bytecode. The software that executes bytecode can be written in any language: byterun is an implementation in Python (!), which is useful only as an educational exercise.
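To make the idea of a software virtual machine concrete, here is a minimal sketch of a stack-based execution loop. The instruction names (`PUSH`, `MUL`, `ADD`) and the `run` function are invented for this illustration. They are not CPython's real opcodes or API, and the real interpreter loop handles far more than this.

```python
# A toy stack machine, loosely in the spirit of a bytecode VM.
# The opcodes and run() are hypothetical, not CPython's real instruction set.

def run(program):
    """Execute a list of (opcode, argument) pairs and return the top of the stack."""
    stack = []
    for opcode, arg in program:
        if opcode == "PUSH":
            # Put a constant on the stack.
            stack.append(arg)
        elif opcode == "MUL":
            # Pop two operands, push their product.
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif opcode == "ADD":
            # Pop two operands, push their sum.
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()


# Roughly what "2 * 3 + 1" might look like as instructions for this toy machine.
program = [("PUSH", 2), ("PUSH", 3), ("MUL", None), ("PUSH", 1), ("ADD", None)]
result = run(program)
print(result)
```

The loop reads one instruction at a time and manipulates a stack, which is the same general shape as the `LOAD_CONST` / `BINARY_MULTIPLY` sequence in the disassembly above, only much smaller.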
An important aspect of Python’s compilation to bytecode is that it’s entirely implicit. You never invoke a compiler, you simply run a .py file. The Python implementation compiles the files as needed. This is different than Java, for example, where you have to run the Java compiler to turn Java source code into compiled class files. For this reason, Java is often called a compiled language, while Python is called an interpreted language. But both compile to bytecode, and then both execute the bytecode with a software implementation of a virtual machine.
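The compile step that normally happens behind the scenes can also be invoked by hand with the built-in `compile()` function, which returns a code object holding the bytecode. The sketch below assumes CPython; the source string, the `<demo>` filename label, and the `example.py` name are made up for illustration.

```python
# Doing by hand what Python normally does implicitly when it runs a .py file.
import importlib.util

source = "answer = 6 * 7"

# compile() turns source text into a code object; nothing has run yet.
code_obj = compile(source, "<demo>", "exec")
print(type(code_obj).__name__)      # the compiled result is a "code" object
print(len(code_obj.co_code) > 0)    # co_code holds the raw bytecode bytes

# exec() hands the code object to the interpreter for execution.
namespace = {}
exec(code_obj, namespace)
print(namespace["answer"])

# Where a compiled copy of a module named example.py would be cached on disk.
# The exact path depends on the Python version and implementation.
cache_path = importlib.util.cache_from_source("example.py")
print(cache_path)
```

Compiling and executing are separate steps here, just as they are when you run a file, except that Python usually performs both without being asked.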
Another important Python feature is its interactive prompt. You can type Python statements and have them immediately executed. This interactivity is usually missing in “compiled” languages, but even at the Python interactive prompt, your Python is compiled to bytecode, and then the bytecode is executed. This immediate execution, and Python’s lack of an explicit compile step, are why people call the Python executable “the Python interpreter.”
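The compile-then-execute path at a prompt can be imitated with the standard library's `codeop` module, which the `code` module's interactive console uses to compile one chunk of input at a time. This is a sketch of the general flow, not a description of exactly how the built-in prompt is wired internally; the input strings are arbitrary examples.

```python
import codeop

# A complete statement compiles to a code object, ready to execute.
complete = codeop.compile_command("total = 2 * 21")
namespace = {}
exec(complete, namespace)
print(namespace["total"])

# An unfinished statement returns None: a prompt would show "..." and wait for more.
incomplete = codeop.compile_command("for i in range(3):")
print(incomplete is None)

# Invalid input fails at compile time, before anything runs.
try:
    codeop.compile_command("x = = 1")
    failed = False
except SyntaxError:
    failed = True
print(failed)
```

Each line typed at a prompt goes through a compile step like this one before any of it is executed.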
By the way, even this is a simplified description of how these languages can work. “Compiled” languages like Java and C can have interactive prompts, but they are not at the center of those worlds in the same way that Python’s is. Java originally always compiled to bytecode, but then it pioneered just-in-time (JIT) techniques for compiling to machine code at runtime, and now Java is sometimes compiled entirely to machine code, in the C style.
This shows just how flimsy the words “interpreted” and “compiled” can be. Like most adjectives applied to programming languages, they are thrown around as if they were black-and-white distinctions, but the reality is much subtler and more complex.
Finally, how your program gets executed isn’t a characteristic of the language at all: it’s about the language implementation. I’ve been talking here about Python, but this has really been a description of CPython, the usual implementation of Python, so named because it is written in C. PyPy is another implementation, using a JIT compiler to run code much faster than CPython can.
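Since execution strategy belongs to the implementation rather than the language, it can be handy to ask a running program which implementation it is on. A small sketch using the standard library; the printed values depend on the interpreter you run it with.

```python
import platform
import sys

# Which implementation of the Python language is running this code?
impl_name = platform.python_implementation()   # for example "CPython" or "PyPy"
print(impl_name)

# sys.implementation carries similar information, with a lowercase name.
print(sys.implementation.name)

# The cache tag identifies the bytecode files this implementation writes to __pycache__.
print(sys.implementation.cache_tag)
```

The same source file can be run unchanged under different implementations, and these values are one way to tell which one is doing the work.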
So: is Python compiled? Yes. Is Python interpreted? Yes. Sorry, the world is complicated...
Comments
I’m now curious if I can use it in Gunicorn for faster response of Django and Wagtail - in hopes that my site’s slow processing isn’t caused by network latency from having a more affordable managed host.
This whole conversation reminds me of how disgruntled I was when I was still a giant fan of C++ and abhorred Java and came across a study showing Java machine code beating C++ in a benchmark, ostensibly because the garbage collector does a good job of allocating available memory to be contiguous.
If we use the other name for it (portable code or p–code), it makes it clearer that this intermediate format can then be used on any CPU architecture providing there is a ready-made VM. Bytecode is platform agnostic and that's potentially very useful.
So, we could (in theory) distribute bytecode/p–code, though we still prefer to distribute source code and therefore I guess I've just talked myself out of the benefit I started to describe ;o)
Assuming this person knows other languages (let's say C and Java), this simple question can be rephrased as: is Python natively executed by the processor, or is there an interpreter executing the control flow of the program? And that has an easy answer: it's interpreted.
Any other "complicated" answer can be applied to other languages, too, (e.g.: C with Docker, Java with JIT), but is not answering the underlying question.
Otherwise a nice read and a good overview, thank you!
Good point, Ned.
I think it applies to the machines' Aristotelian logic too, it must be changed.
Python converts code to bytecode; so does Java.
Python has a VM; so does Java.
Can I assume that, at a high level, we call Python an interpreter just because it executes in an interactive shell, generating bytecode for each command?
Also, if I write high-level code in Python and in Java, wouldn't both execute the same way, since both generate bytecode for their respective VMs, which then generate the final code for the underlying architecture?
Please take a moment to answer my doubts. I really am looking for an answer, as I am starting to learn Python and this is really bugging me!
Python is an interpreted, object-oriented, high-level programming language with dynamic semantics.
...
...
The Python interpreter and the extensive standard library are available in source or binary form without charge for all major platforms, and can be freely distributed.
JavaScript and these translators convert the programs to machine language for
actual execution by the CPU.
Book -> Python for Everybody By Dr. Charles R. Severance
Guido is using "interpreted" to mean, "does not produce a machine-language executable file", which is what many people mean when they say "interpreted". (Though no one calls Java interpreted..?)
Dr. Severance is speaking very broadly about the process of executing Python. The Python bytecode interpreter does result in a stream of machine instructions, the instructions it executes as it is interpreting the bytecode. They are not stored anywhere before execution though.
Thanks for that! Brilliant as ever! It's important to know that Python source code is compiled to bytecode in a first step, if only to understand the difference between syntax errors (which appear in the compilation step) and exceptions, which appear at runtime. I try to teach this to my students pretty early in the courses.
Sorry, but I am old school. If the code cannot be natively run on a CPU, then it is not compiled. Yeah, the world is complicated, but when you boil it down to basics, a compiled program is able to run on the target system without going through another software layer.
Hi Mark, my Java colleagues next door will freak out when I tell them that Java is an interpreted language after all. But well, I think the problem in this debate is that it’s not just black and white. It’s not just COMPILED vs INTERPRETED, yes or no; the question itself is just plain wrong. It is not that simple.
import dis
def fn(): x = z
dis.dis(fn)
Thanks for the article.
After spending some time thinking about this, I arrived at the conclusion that this distinction is not very useful, because it does not necessarily relate to other properties of an implementation.
Therefore, let me propose, hypothetically, that we use these terms less and prefer more neutral ones. Instead of saying “a compiler” or “an interpreter”, one can say “an implementation”; instead of saying “compiling foo to bar”, one can say “translating foo to bar”.
While this approach is quite radical, I think that it may illustrate how far one can go without using terms “compilation” and “interpretation” and related ones. It may make one think about the real importance of these words.
I feel the same way about different paradigms: “functional”, “object-oriented”, and so on. In my opinion, these words, too, have little meaning when taken out of some specific context-sensitive definition. These words can of course cause great debates, but one can ask whether these debates are worth getting into.
I prefer to avoid talking about different paradigms without clarifying what I mean first.