Thursday 18 August 2016 — This is eight years old. Be careful.
A common Python question: what’s the difference between a list and a tuple?
The answer is that there are two different differences, with complex interplay between the two. There is the Technical Difference, and the Cultural Difference.
First, the things that are the same: both lists and tuples are containers, a sequence of objects:
>>> my_list = [1, 2, 3]
>>> type(my_list)
<class 'list'>
>>> my_tuple = (1, 2, 3)
>>> type(my_tuple)
<class 'tuple'>
Either can have elements of any type, even within a single sequence. Both maintain the order of the elements (unlike sets, and unlike dicts before Python 3.7).
Now for the differences. The Technical Difference between lists and tuples is that lists are mutable (can be changed) and tuples are immutable (cannot be changed). This is the only distinction that the Python language makes between them:
>>> my_list[1] = "two"
>>> my_list
[1, 'two', 3]
>>> my_tuple[1] = "two"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
That’s the only technical difference between lists and tuples, though it manifests in a few ways. For example, lists have a .append() method to add more elements to the list, while tuples do not:
>>> my_list.append("four")
>>> my_list
[1, 'two', 3, 'four']
>>> my_tuple.append("four")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'tuple' object has no attribute 'append'
Tuples have no need for an .append() method, because you can’t modify tuples.
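Immutability doesn't stop you from building a new tuple out of an old one. Here is a minimal sketch of the contrast (the names list_alias and longer_tuple are made up for illustration): append changes the existing list in place, while tuple concatenation creates a separate object and leaves the original alone.

```python
my_list = [1, 2, 3]
my_tuple = (1, 2, 3)

# append() mutates the existing list object, so every name
# bound to that object sees the change.
list_alias = my_list
my_list.append("four")
print(list_alias is my_list, list_alias)

# "Adding" to a tuple really builds a brand-new tuple.
# Note the trailing comma: ("four",) is a one-element tuple.
longer_tuple = my_tuple + ("four",)
print(my_tuple)        # the original tuple is unchanged
print(longer_tuple)
```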
The Cultural Difference is about how lists and tuples are actually used: lists are used where you have a homogeneous sequence of unknown length; tuples are used where you know the number of elements in advance because the position of the element is semantically significant.
Another way to say it: a list is an arbitrary number of similar things; a tuple is one thing with a known number of (possibly dissimilar) parts.
For example, suppose you have a function that looks in a directory for files matching *.py. It should return a list, because you don’t know how many you will find, and all of them are the same semantically: just another file that you found.
>>> find_files("*.py")
['control.py', 'config.py', 'cmdline.py', 'backward.py']
On the other hand, let’s say you need to store five values to represent the location of weather observation stations: id, city, state, latitude, and longitude. A tuple is right for this, rather than a list:
>>> denver = (44, "Denver", "CO", 40, 105)
>>> denver[1]
'Denver'
(For the moment, let’s not talk about using a class for this.) Here the first element is the id, the second element is the city, and so on. The position determines the meaning.
To put the Cultural Difference in terms of the C language, lists are like arrays, tuples are like structs.
Python has a namedtuple facility that can make the meaning more explicit:
>>> from collections import namedtuple
>>> Station = namedtuple("Station", "id, city, state, lat, long")
>>> denver = Station(44, "Denver", "CO", 40, 105)
>>> denver
Station(id=44, city='Denver', state='CO', lat=40, long=105)
>>> denver.city
'Denver'
>>> denver[1]
'Denver'
One clever summary of the Cultural Difference between tuples and lists is: tuples are namedtuples without the names.
The Technical Difference and the Cultural Difference are an uneasy alliance, because they are sometimes at odds. Why should homogeneous sequences be mutable, but heterogeneous sequences not be? For example, I can’t modify my weather station because a namedtuple is a tuple, which is immutable:
>>> denver.lat = 39.7392
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: can't set attribute
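If you do need an updated record, one option is the namedtuple _replace method, which builds a new tuple rather than changing the old one. A small sketch reusing the Station definition from above (the name precise_denver is made up for illustration):

```python
from collections import namedtuple

Station = namedtuple("Station", "id, city, state, lat, long")
denver = Station(44, "Denver", "CO", 40, 105)

# _replace() returns a new Station with the given fields swapped in;
# the original tuple is left exactly as it was.
precise_denver = denver._replace(lat=39.7392)

print(denver.lat)
print(precise_denver.lat)
```

This keeps the immutability guarantee: anything still holding the old denver value sees no change.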
And sometimes the Technical considerations override the Cultural considerations. You cannot use a list as a dictionary key, because dictionary keys must be hashable, and lists, being mutable, are not. To use the contents of a list as a key, you can turn it into a tuple:
>>> d = {}
>>> nums = [1, 2, 3]
>>> d[nums] = "hello"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
>>> d[tuple(nums)] = "hello"
>>> d
{(1, 2, 3): 'hello'}
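One caveat worth sketching: a tuple is only hashable if everything inside it is hashable too. The example below is illustrative (the names nested and error_message are not from the original session), and the exact wording of the error can vary between Python versions.

```python
nums = [1, 2, 3]
d = {tuple(nums): "hello"}

# Any equal tuple finds the same entry.
print(d[(1, 2, 3)])

# A tuple that contains a list still cannot be used as a key,
# because hashing the tuple requires hashing each element.
nested = (1, [2, 3])
error_message = None
try:
    d[nested] = "oops"
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```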
Another conflict between the Technical and the Cultural: there are places in Python itself where a tuple is used when a list makes more sense. When you define a function with *args, args is passed to you as a tuple, even though the position of the values isn’t significant, at least as far as Python knows. You might say it’s a tuple because you cannot change what you were passed, but that’s just valuing the Technical Difference over the Cultural.
I know, I know: in *args, the position could be significant because they are positional parameters. But in a function that’s accepting *args and passing it along to another function, it’s just a sequence of arguments, none different from another, and the number of them can vary between invocations.
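To make that concrete, here is a rough sketch of a pass-through function. The functions describe and passthrough are hypothetical, written only to show what type args has and how it is forwarded.

```python
def describe(*args):
    # Python collects the extra positional arguments into a tuple.
    return type(args), len(args)


def passthrough(*args):
    # A forwarding wrapper treats args as "just a sequence of
    # arguments" and hands them along unchanged.
    return describe(*args)


print(passthrough())
print(passthrough("a", "b", "c"))
```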
Python uses tuples here because they are a little more space-efficient than lists. Lists are over-allocated to make appending faster. This shows Python’s pragmatic side: rather than quibble over the list/tuple semantics of *args, just use the data structure that works best in this case.
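You can get a rough feel for the space difference with sys.getsizeof. This is only a sketch and it assumes CPython: the exact byte counts depend on the Python version and platform, so none are shown here.

```python
import sys

items_list = [1, 2, 3]
items_tuple = (1, 2, 3)

# Shallow size of each container (not counting the elements).
list_size = sys.getsizeof(items_list)
tuple_size = sys.getsizeof(items_tuple)
print(list_size, tuple_size)

# Over-allocation: as a list grows, its reported size jumps in
# steps rather than on every append, because spare room is reserved.
growing = []
sizes = []
for i in range(20):
    growing.append(i)
    sizes.append(sys.getsizeof(growing))
print(sizes)
```

On CPython the tuple should come out smaller than the list, and the sizes list should contain runs of repeated values where an append fit into already-reserved space.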
For the most part, you should choose whether to use a list or a tuple based on the Cultural Difference. Think about what your data means. If it can have different lengths based on what your program encounters in the real world, then it is probably a list. If you know when you write the code what the third element means, then it is probably a tuple.
On the other hand, functional programming emphasizes immutable data structures as a way to avoid side-effects that can make it difficult to reason about code. If you are a functional programming fan, you will probably prefer tuples for their immutability.
So: should you use a tuple or a list? The answer is: it’s not always a simple answer.
Comments
Also, lists and tuples are not the only containers. In your glob example, a set would probably be a better option. Or, in Python you don't even have to choose a container: write a generator and let your user decide how to store the yielded items (and whether to store them at all).
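A minimal sketch of the generator idea, assuming a hypothetical find_files that filters an in-memory list of names rather than reading a real directory (the two-argument signature is an assumption made to keep the example self-contained):

```python
import fnmatch


def find_files(names, pattern):
    """Yield each name that matches the shell-style pattern."""
    for name in names:
        if fnmatch.fnmatch(name, pattern):
            yield name


names = ["control.py", "README.txt", "config.py"]

# The caller decides how (or whether) to store the results.
as_list = list(find_files(names, "*.py"))
as_set = set(find_files(names, "*.py"))
print(as_list)
print(as_set)
```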
An additional point of difference: lists have a nice shortcut for comprehension, tuples don't. In quick&dirty scripts, it's often a deciding factor. :-)
You mentioned starargs as an example of a tuple that should be a list. An opposite example is probably the result of str.split: in most cases, you split a string into fields, and you know which is which by index. Often I've wished .split returned a tuple.
And a technical nitpick: lists are naturally processed with a for statement (the same name referring to their elements at different times), while tuples are naturally processed with unpacking (different names referring to their elements at the same time). Both of them are "iteration" from the perspective of Python (they use the iteration protocol).
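A short sketch of the two styles side by side. The data reuses values from the article; names such as upper_names and first_of_each are made up for illustration.

```python
files = ["control.py", "config.py", "cmdline.py"]
station = (44, "Denver", "CO", 40, 105)

# List: one name, rebound to each element in turn.
upper_names = []
for name in files:
    upper_names.append(name.upper())

# Tuple: several names, all bound at once by unpacking.
station_id, city, state, lat, long = station

# Both rely on the same iteration protocol under the hood.
first_of_each = (next(iter(files)), next(iter(station)))

print(upper_names)
print(city, state)
print(first_of_each)
```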
Tuples don't have a nice shortcut for comprehension, but writing something like tuple(i for i in range(3)) is good enough (it's actually a generator expression, but since it is the sole argument of a function call, the extra parentheses can be omitted). As you say, that kind of usage is atypical for tuples anyway, so they don't need their own syntax.
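For readers who haven't seen the idiom, a quick sketch (the variable names are illustrative): passing a generator expression to tuple() gives a tuple, while bare parentheses around the same expression give a generator object, not a tuple.

```python
# A generator expression as the sole argument needs no extra parentheses.
squares = tuple(i * i for i in range(3))

# Parentheses on their own do NOT make a "tuple comprehension".
gen = (i * i for i in range(3))

print(type(squares).__name__, squares)
print(type(gen).__name__)
```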
>>> xs = [None, 1, set()]
is perfectly valid, of course! This makes them different from C arrays.
Generator expressions can be used with function calls nicely:
eg
def process_list(data=[]):
Every time you run that function, the list will be the same list from the start of the program. Also, passing lists into functions doesn't copy them, which can cause more issues.
More detail here: http://docs.python-guide.org/en/latest/writing/gotchas/
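A hedged sketch of the gotcha described above. The original snippet is cut off, so the item parameter and the function bodies here are assumptions added to make the behavior visible; process_list_safe is a hypothetical name for the usual workaround.

```python
def process_list(item, data=[]):
    # The default list is created once, when the def statement runs,
    # and the same object is reused on every call that omits data.
    data.append(item)
    return data


first = process_list("a")
second = process_list("b")
print(first is second, second)


def process_list_safe(item, data=None):
    # Common workaround: use None as the default and build a
    # fresh list inside the function on each call.
    if data is None:
        data = []
    data.append(item)
    return data


safe_first = process_list_safe("a")
safe_second = process_list_safe("b")
print(safe_first, safe_second)
```

A tuple default such as data=() sidesteps the shared-state problem as well, since nothing can append to it.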
One quibble: you use the term “the Cultural Difference” where you're talking not so much about culture, but *meaning*.
The term “tuple” was chosen, not because of Python culture, but because it *already* has meaning – semantic connotations – in the computing field, that pre-dates Python's use of the term.
I'd say you are referring rather to “the Semantic Difference” between lists and tuples. You even refer to “the list/tuple semantics”.
Care to update the article to strike “Cultural Difference”, and instead talk about meaning and “the Semantic Difference”?
In fact, with the rise in interest in functional languages, there are Python programmers who focus more on the mutability aspect, and favor tuples even for homogeneous sequences.
I think you perhaps view this as a semantic distinction because you have fully and completely adopted the Python culture that you should choose between lists and tuples based on the meaning of the data.
typedef struct { int bar; } foo;
foo a = { 42 };
a.bar = 37;
I think I understand what you're trying to say: once declared, you can't add or remove elements from a tuple, just like you can't add or remove fields from a struct. I think this has the potential to confuse a learner coming to Python from C, though.