Is Python interpreted or compiled? Yes.

Thursday 29 March 2018

A common question: “Is Python interpreted or compiled?” Usually, the asker has a simple model of the world in mind, and as is typical, the world is more complicated.

In the simple model of the world, “compile” means to convert a program in a high-level language into a binary executable full of machine code (CPU instructions). When you compile a C program, this is what happens. The result is a file that your operating system can run for you.

In the simple definition of “interpreted”, executing a program means reading the source code a line at a time, and doing what it says. This is the way some shells operate.

But the real world is not so limited. Real programming languages work in a wider range of ways than those two models allow. Compiling is a more general idea: take a program in one language (or form), and convert it into another language or form. Usually the source form is a higher-level language than the destination form, such as when converting from C to machine code. But converting from a newer version of JavaScript to an older one (say, ES2017 to ES5) is also a kind of compiling.

In Python, the source code is compiled into a much simpler form called bytecode. Bytecode instructions are similar in spirit to CPU instructions, but instead of being executed by the CPU, they are executed by software called a virtual machine. (These are not VMs that emulate entire operating systems, just a simplified CPU execution environment.)
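To make the compile step visible, here is a minimal sketch using the built-in compile() function, which turns source text into a code object without running it. The snippet and the "<example>" filename are made up for illustration; the raw bytes and constants are printed without expected values because they vary between Python versions.

```python
# Compile a snippet of source text into a code object, without running it.
source = "total = 2 * 21"
code = compile(source, "<example>", "exec")

print(type(code).__name__)          # the compiled form is a "code" object
print(type(code.co_code).__name__)  # the bytecode itself is a bytes string
print(code.co_code)                 # raw bytecode; contents vary by version
print(code.co_consts)               # constants referenced by the bytecode

# Executing the code object is a separate step from compiling it.
namespace = {}
exec(code, namespace)
print(namespace["total"])           # 42
```

Compiling and executing are two distinct steps here, even though running a .py file normally hides the first one.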

Here’s an example of a short Python function, and its bytecode:

>>> import dis
>>> def example(x):
...     for i in range(x):
...         print(2 * i)
...
>>> dis.dis(example)
  2           0 SETUP_LOOP              28 (to 30)
              2 LOAD_GLOBAL              0 (range)
              4 LOAD_FAST                0 (x)
              6 CALL_FUNCTION            1
              8 GET_ITER
        >>   10 FOR_ITER                16 (to 28)
             12 STORE_FAST               1 (i)

  3          14 LOAD_GLOBAL              1 (print)
             16 LOAD_CONST               1 (2)
             18 LOAD_FAST                1 (i)
             20 BINARY_MULTIPLY
             22 CALL_FUNCTION            1
             24 POP_TOP
             26 JUMP_ABSOLUTE           10
        >>   28 POP_BLOCK
        >>   30 LOAD_CONST               0 (None)
             32 RETURN_VALUE
>>>
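The opcode names in a listing like this depend on the Python version that produced it: SETUP_LOOP, for example, was removed in Python 3.8, and BINARY_MULTIPLY was replaced by a general BINARY_OP in 3.11. As a small sketch, here is one way to list the opcodes your own interpreter produces for the same function, using dis.get_instructions. No output is shown because it will differ between versions.

```python
import dis


def example(x):
    for i in range(x):
        print(2 * i)


# Collect the opcode names in order, as a plain list of strings.
opnames = [instr.opname for instr in dis.get_instructions(example)]
print(opnames)

# The names the function refers to are stored alongside the bytecode.
print(example.__code__.co_names)
```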

The dis module in the Python standard library is the disassembler that can show you Python bytecode. It’s also the best (but not great) documentation for the bytecode itself. If you want to know more about how Python’s bytecode works, there are lots of conference talks about bytecode. The software that executes bytecode can be written in any language: byterun is an implementation in Python (!), which is useful only as an educational exercise.
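To give a feel for what "software that executes bytecode" means, here is a toy stack machine. It is only a sketch: the four opcodes and the run() function are invented for this illustration and are far simpler than CPython's real instruction set or byterun's implementation.

```python
def run(program, variables):
    """Execute a list of (opcode, argument) pairs on a tiny stack machine."""
    stack = []
    pc = 0  # index of the next instruction to execute
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "LOAD_CONST":
            stack.append(arg)
        elif op == "LOAD_NAME":
            stack.append(variables[arg])
        elif op == "MULTIPLY":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif op == "RETURN":
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode: {op}")
    return None


# Hypothetical "bytecode" for the expression 2 * i.
program = [
    ("LOAD_CONST", 2),
    ("LOAD_NAME", "i"),
    ("MULTIPLY", None),
    ("RETURN", None),
]
result = run(program, {"i": 5})
print(result)  # 10
```

The loop that fetches an instruction, dispatches on its opcode, and updates a stack is the basic shape of a bytecode interpreter.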

An important aspect of Python’s compilation to bytecode is that it’s entirely implicit. You never invoke a compiler; you simply run a .py file. The Python implementation compiles the files as needed. This is different from Java, for example, where you have to run the Java compiler to turn Java source code into compiled class files. For this reason, Java is often called a compiled language, while Python is called an interpreted language. But both compile to bytecode, and then both execute the bytecode with a software implementation of a virtual machine.
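The implicit step leaves traces you can look at: CPython caches the compiled bytecode of imported modules in .pyc files, normally under a __pycache__ directory. As a sketch, the standard library's py_compile module can do by hand what import does for you. The file name hello.py is made up for the example, and the exact .pyc name depends on your Python version.

```python
import importlib.util
import pathlib
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    source_path = pathlib.Path(tmp) / "hello.py"
    source_path.write_text("print('hello')\n")

    # Where the cached bytecode for this source file is expected to live.
    expected = importlib.util.cache_from_source(str(source_path))

    # Explicitly compile the file, as import would do implicitly.
    pyc_path = py_compile.compile(str(source_path), doraise=True)
    pyc_exists = pathlib.Path(pyc_path).exists()

    print(pyc_path)
    print(pyc_exists)
```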

Another important Python feature is its interactive prompt. You can type Python statements and have them immediately executed. This interactivity is usually missing in “compiled” languages, but even at the Python interactive prompt, your Python is compiled to bytecode, and then the bytecode is executed. This immediate execution, and Python’s lack of an explicit compile step, are why people call the Python executable “the Python interpreter.”
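A rough sketch of what happens to each line typed at the prompt: compile() has a "single" mode intended for one interactive statement, and the resulting code object is then executed. The two input lines are made up, and the real prompt does more than this (multi-line input and error handling, for instance).

```python
# One shared namespace, like the session state of an interactive prompt.
namespace = {}

for line in ["x = 6", "y = x * 7"]:
    # Each line is compiled to a code object first...
    code = compile(line, "<stdin>", "single")
    # ...and only then is the bytecode executed.
    exec(code, namespace)

print(namespace["y"])  # 42
```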

By the way, even this is a simplified description of how these languages can work. “Compiled” languages like Java and C can have interactive prompts, but they are not at the center of those worlds in the same way that Python’s is. Java originally always compiled to bytecode, but then it pioneered just-in-time (JIT) techniques for compiling to machine code at runtime, and now Java is sometimes compiled entirely to machine code, in the C style.

This shows just how flimsy the words “interpreted” and “compiled” can be. Like most adjectives applied to programming languages, they are thrown around as if they were black-and-white distinctions, but the reality is subtler and more complex.

Finally, how your program gets executed isn’t a characteristic of the language at all: it’s about the language implementation. I’ve been talking here about Python, but this has really been a description of CPython, the usual implementation of Python, so named because it is written in C. PyPy is another implementation, which uses a JIT compiler and often runs code much faster than CPython can.
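If you are not sure which implementation is running your code, the standard library can tell you. A small sketch (the output depends on your installation, so none is shown):

```python
import platform
import sys

# Name of the implementation executing this code, e.g. "CPython" or "PyPy".
implementation = platform.python_implementation()

print(implementation, platform.python_version())
print(sys.implementation.name)
```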

So: is Python compiled? Yes. Is Python interpreted? Yes. Sorry, the world is complicated...

Comments

[gravatar]
NITIN GEORGE CHERIAN 3:58 AM on 30 Mar 2018

Very informative and clear, Ned.

[gravatar]

I only learned recently that PyPy is faster, and it’s given me new respect for it after having assumed forever that it was just a project resulting from people having fetishized the language.

I’m now curious if I can use it in Gunicorn for faster response of Django and Wagtail - in hopes that my site’s slow processing isn’t caused by network latency from having a more affordable managed host.

This whole conversation reminds me of how disgruntled I was when I was still a giant fan of C++ and abhorred Java and came across a study showing Java machine code beating C++ in a benchmark, ostensibly because the garbage collector does a good job of allocating available memory to be contiguous.

[gravatar]
Peter Morris 10:23 AM on 3 Apr 2018

Ned, you've given an accurate and detailed response to this question. However, you haven't necessarily described one of the potential benefits of bytecode.

If we use the other name for it (portable code or p–code), it makes it clearer that this intermediate format can then be used on any CPU architecture providing there is a ready-made VM. Bytecode is platform agnostic and that's potentially very useful.

So, we could (in theory) distribute bytecode/p–code, though we still prefer to distribute source code and therefore I guess I've just talked myself out of the benefit I started to describe ;o)

[gravatar]

“Is Python interpreted or compiled?” The question is usually stated by people who don't know the language but have a concept in mind of compiled languages and interpreted languages and the difference between them, which they are asking for.
Assuming this person knows other languages (let's say C and Java), this simple question can be rephrased as: is Python natively executed by the processor, or is there an interpreter executing the control flow of the program? And that has an easy answer: it's interpreted.
Any other "complicated" answer can be applied to other languages, too, (e.g.: C with Docker, Java with JIT), but is not answering the underlying question.

Otherwise a nice read and a good overview, thank you!

[gravatar]

"So: is Python compiled? Yes. Is Python interpreted? Yes. Sorry, the world is complicated..."

Good point, Ned.

I think it applies to the machines' Aristotelian logic too, it must be changed.

[gravatar]

such a simple and concise explanation, thank you so much!

[gravatar]

Ned makes a great argument, but Bartek conveys in the Pythonic aphorism, and gets right to it.

[gravatar]

this is awesome, thanks!

[gravatar]

Thank you!

[gravatar]
ananthi Ramaswamy 7:29 AM on 30 May 2019

Thank you so much for such a valuable explanation. I have been racking my brain over this, and finally ended up at your write-up. Thanks a lot

[gravatar]

So,
Python converts code to byte code; so does java
Python has a VM; so does java

Can I assume that, at a high level, we call Python an interpreter just because it executes in an interactive shell by generating bytecode for each command?

Also, if I write high-level code in Python and in Java, wouldn't both execute the same way, since both generate bytecode for their respective VMs, which then generate the final code for the downstream architecture?

Please take a moment to answer my doubts, I really am looking for an answer as I am starting to learn python and this is really bugging me!

[gravatar]

I mentioned this a bit in the piece, but I think Python is called "interpreted" because it doesn't have an explicit compilation step, and it has an interactive prompt. Java requires you to run a compiler before you can run your program, and traditionally did not have an interactive prompt (JShell only arrived in Java 9). So Python is mis-labelled "interpreted" and Java is mis-labelled "compiled."

[gravatar]

Thanks a lot Ned. I really loved your article and explanations.

[gravatar]
Vaishnavi Singh 8:11 AM on 29 May 2020

Thanks a lot.... it cleared my doubt very well which other websites couldn't!

[gravatar]

Awesome article, thanks Ned.

[gravatar]

Does the Python virtual machine send machine code to the CPU for processing, or does the PVM execute the bytecode and send only the result to the CPU?

[gravatar]

Source: https://www.python.org/doc/essays/blurb/

Python is an interpreted, object-oriented, high-level programming language with dynamic semantics.
...
...

The Python interpreter and the extensive standard library are available in source or binary form without charge for all major platforms, and can be freely distributed.

[gravatar]

Various translators allow programmers to write in high-level languages like Python or JavaScript and these translators convert the programs to machine language for actual execution by the CPU.

Book -> Python for Everybody By Dr. Charles R. Severance

[gravatar]

@Adil: these two quotes show how difficult it is to include all the details, especially when writing introductory material.

Guido is using "interpreted" to mean, "does not produce a machine-language executable file", which is what many people mean when they say "interpreted". (Though no one calls Java interpreted..?)

Dr. Severance is speaking very broadly about the process of executing Python. The Python bytecode interpreter does result in a stream of machine instructions: the instructions it executes as it is interpreting the bytecode. They are not stored anywhere before execution though.

[gravatar]

Bartek phrases the question right. An online real-time compiler is essentially an interpreter. Of course, everything is ultimately converted into executable code for the instruction set that the platform supports. Ned is absolutely right that compilation vs. interpretation is an implementation issue and not a language issue. But some of us still care to know whether a language is natively compiled (one time) or compiled in real time every time the code runs.

[gravatar]
Taneli Härkönen 9:53 AM on 3 Nov 2020

This is an awesome article! So much confusion out there.. Still some nagging stupid questions arise: Is Python-bytecode same as Java-bytecode? If not, can Python-source code be compiled/interpreted to Java-bytecode to run on JavaVirtualMachine or the other way around? Not sure why one would do that but hey just asking?

[gravatar]

@Taneli: Python bytecode and Java bytecode are different. Python bytecode can even change between versions (3.6 -> 3.7).
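One way to see the version dependence, as a sketch: every .pyc file starts with a four-byte "magic number" identifying the bytecode format, and importlib.util exposes the value for the running interpreter. The value printed differs between Python versions, so no output is shown.

```python
import importlib.util
import sys

# Four bytes that identify this interpreter's bytecode format.
magic = importlib.util.MAGIC_NUMBER

print(sys.version_info[:2], magic)
```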

[gravatar]

AUTHOR is trying too hard to be cool with a complicated answer ("ain't i smart", today's hipster trend). It's as simple as this: does the final output EXECUTE directly on the CPU in its native machine code (executed by its hw micro machine)? Yes, then it's compiled; no, then it's interpreted. The other one is: can it change operation with self-modification of source code? Yes, then interpreted.

[gravatar]

@Gee Dee: your definition is definitely simple. So Java is an interpreted language? And the component of Python that converts source code into bytecode, what should we call that?
