Lists vs. Tuples

Thursday 18 August 2016

A common beginner Python question: what's the difference between a list and a tuple?

The answer is that there are two different differences, with complex interplay between the two. There is the Technical Difference, and the Cultural Difference.

First, the things that are the same: both lists and tuples are containers that hold a sequence of objects:

>>> my_list = [1, 2, 3]
>>> type(my_list)
<class 'list'>
>>> my_tuple = (1, 2, 3)
>>> type(my_tuple)
<class 'tuple'>

Either can have elements of any type, even within a single sequence. Both maintain the order of the elements (unlike sets, and unlike dicts before Python 3.7).

Now for the differences. The Technical Difference between lists and tuples is that lists are mutable (can be changed) and tuples are immutable (cannot be changed). This is the only distinction that the Python language makes between them:

>>> my_list[1] = "two"
>>> my_list
[1, 'two', 3]
>>> my_tuple[1] = "two"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment

That's the only technical difference between lists and tuples, though it manifests in a few ways. For example, lists have a .append() method to add more elements to the list, while tuples do not:

>>> my_list.append("four")
>>> my_list
[1, 'two', 3, 'four']
>>> my_tuple.append("four")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'tuple' object has no attribute 'append'

Tuples have no need for an .append() method, because you can't modify tuples.
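What you can do is build a new tuple and rebind the name to it. Here is a minimal sketch of that; the name original is introduced only to show that the old tuple is untouched:

```python
my_tuple = (1, 2, 3)
original = my_tuple              # a second reference to the same tuple

# Concatenation builds a brand-new tuple; the old one is not changed.
my_tuple = my_tuple + ("four",)

print(my_tuple)                  # (1, 2, 3, 'four')
print(original)                  # (1, 2, 3)
print(my_tuple is original)      # False
```

This looks a little like appending, but every such step copies the whole tuple, so a list is the better fit if you expect to grow the sequence repeatedly.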

The Cultural Difference is about how lists and tuples are actually used: lists are used where you have a homogeneous sequence of unknown length; tuples are used where you know the number of elements in advance because the position of the element is semantically significant.

For example, suppose you have a function that looks in a directory for files matching *.py. It should return a list, because you don't know how many you will find, and all of them are the same semantically: just another file that you found.

>>> find_files("*.py")
['control.py', 'config.py', 'cmdline.py', 'backward.py']
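The post doesn't show find_files itself, so here is one possible sketch of it using the standard library. The directory parameter and the choice of fnmatch are assumptions made for illustration, not the author's implementation:

```python
import fnmatch
import os

def find_files(pattern, directory="."):
    """Return a list of the file names in `directory` matching `pattern`.

    Hypothetical helper: the `directory` argument is an assumption.
    """
    matches = []
    for name in os.listdir(directory):
        is_file = os.path.isfile(os.path.join(directory, name))
        if is_file and fnmatch.fnmatch(name, pattern):
            # Every match is "just another file", so append to a list.
            matches.append(name)
    return matches
```

The result can have zero, four, or four hundred entries, and the order comes from os.listdir, so no position in it carries any special meaning.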

On the other hand, let's say you need to store five values to represent the location of weather observation stations: id, city, state, latitude, and longitude. A tuple is right for this, rather than a list:

>>> denver = (44, "Denver", "CO", 40, 105)
>>> denver[1]
'Denver'

(For the moment, let's not talk about using a class for this.) Here the first element is the id, the second element is the city, and so on. The position determines the meaning.

To put the Cultural Difference in terms of the C language, lists are like arrays, tuples are like structs.
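The array/struct analogy tends to show up in how the two are consumed. As a rough sketch, reusing the example values from above (the names lengths, station_id, and label are made up here):

```python
files = ["control.py", "config.py", "cmdline.py", "backward.py"]
denver = (44, "Denver", "CO", 40, 105)

# Array-like: every element gets the same treatment, so loop over them.
lengths = [len(name) for name in files]
print(lengths)                   # [10, 9, 10, 11]

# Struct-like: each position has its own meaning, so unpack into names.
station_id, city, state, lat, long = denver
label = f"{city}, {state}"
print(label)                     # Denver, CO
```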

Python has a namedtuple facility that can make the meaning more explicit:

>>> from collections import namedtuple
>>> Station = namedtuple("Station", "id, city, state, lat, long")
>>> denver = Station(44, "Denver", "CO", 40, 105)
>>> denver
Station(id=44, city='Denver', state='CO', lat=40, long=105)
>>> denver.city
'Denver'
>>> denver[1]
'Denver'

One clever summary of the Cultural Difference between tuples and lists is: tuples are namedtuples without the names.

The Technical Difference and the Cultural Difference are an uneasy alliance, because they are sometimes at odds. Why should homogeneous sequences be mutable, but heterogeneous sequences not be? For example, I can't modify my weather station because a namedtuple is a tuple, which is immutable:

>>> denver.lat = 39.7392
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: can't set attribute
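If you do need an updated station, namedtuples provide a _replace() method that returns a new tuple with some fields changed. A small sketch, with precise as an illustrative name:

```python
from collections import namedtuple

Station = namedtuple("Station", "id, city, state, lat, long")
denver = Station(44, "Denver", "CO", 40, 105)

# _replace() builds a new Station; the original is left untouched.
precise = denver._replace(lat=39.7392)

print(precise)       # Station(id=44, city='Denver', state='CO', lat=39.7392, long=105)
print(denver.lat)    # 40
```

This works around the immutability rather than removing it: anything still holding the old denver keeps seeing the old latitude.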

And sometimes the Technical considerations override the Cultural considerations. You cannot use a list as a dictionary key, because dictionary keys must be hashable, and lists, being mutable, are not. To use a list's contents as a key, you can turn it into a tuple:

>>> d = {}
>>> nums = [1, 2, 3]
>>> d[nums] = "hello"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
>>> d[tuple(nums)] = "hello"
>>> d
{(1, 2, 3): 'hello'}

Another conflict between the Technical and the Cultural: there are places in Python itself where a tuple is used when a list makes more sense. When you define a function with *args, args is passed to you as a tuple, even though the position of the values isn't significant, at least as far as Python knows. You might say it's a tuple because you cannot change what you were passed, but that's just valuing the Technical Difference over the Cultural.

I know, I know: in *args, the position could be significant because they are positional parameters. But in a function that's accepting *args and passing it along to another function, it's just a sequence of arguments, none different from another, and the number of them can vary between invocations.

Python uses tuples here because they are a little more space-efficient than lists. Lists are over-allocated to make appending faster. This shows Python's pragmatic side: rather than quibble over the list/tuple semantics of *args, just use the data structure that works best in this case.
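Both claims are easy to poke at. The sketch below assumes CPython, where sys.getsizeof reports the size of the container itself (not its elements); the exact numbers vary by version and platform, so none are shown. The function name show_args is made up:

```python
import sys

def show_args(*args):
    # Extra positional arguments always arrive packed in a tuple.
    return args

packed = show_args("a", "b", "c")
print(type(packed))              # <class 'tuple'>

as_tuple = tuple(range(10))
as_list = list(range(10))

# Container sizes in bytes; on CPython the tuple is the smaller one.
print(sys.getsizeof(as_tuple), sys.getsizeof(as_list))
```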

For the most part, you should choose whether to use a list or a tuple based on the Cultural Difference. Think about what your data means. If it can have different lengths based on what your program encounters in the real world, then it is probably a list. If you know when you write the code what the third element means, then it is probably a tuple.

On the other hand, functional programming emphasizes immutable data structures as a way to avoid side-effects that can make it difficult to reason about code. If you are a functional programming fan, you will probably prefer tuples for their immutability.

So: should you use a tuple or a list? The answer is: it's not always a simple answer.


Comments

SylvainDe 12:01 PM on 18 Aug 2016

Once again, excellent article. It perfectly puts words on things I knew but couldn't explain. I also quite like Raymond Hettinger's wording "Loopy lists and structy tuples" ( https://twitter.com/raymondh/status/324920924103122944 ).

Tim Arnold 1:15 PM on 18 Aug 2016

Hi, I enjoy your articles and learn. I hardly ever use tuples, just out of habit. I wonder what price I'm paying for always (nearly) using lists. The only time I can remember using tuples is when I need it as a dictionary key. So I understand what you're saying about the differences, but what is the real-life penalty for not using tuples? Is it performance? thanks again.

intellimath 2:07 PM on 18 Aug 2016

There is one difficulty in the current Python implementation of lists: if one creates a list from another object with a known size, the created list instance allocates more memory than it needs. This sometimes prevents using lists as mutable sequences with a fixed size.

Ned Batchelder 2:14 PM on 18 Aug 2016

@SylvainDe: thanks for that, it's a good point that lists are for iterating over, and tuples generally are not, although you can.

Ned Batchelder 2:14 PM on 18 Aug 2016

@intellimath: although lists are over-allocated, I'm not sure what prevents you from using them?

Christoph 2:33 PM on 18 Aug 2016

In my understanding, regarding the "cultural difference", whether the position of the element is semantically significant or not makes the real difference, while the homogeneity is only a weak indicator. For instance, you usually want to implement 2D or 3D points in space as tuples, not as lists, even though their components all have the same type.

Christoph 2:44 PM on 18 Aug 2016

Tim: Yes, tuples are a bit smaller and faster. Also, various optimizers might use the information that something is immutable to make it even faster. Plus the immutability protects you from accidentally changing something you don't want to change.

intellimath 5:29 PM on 18 Aug 2016

@Ned Batchelder: You're right that nothing prevents using lists. But it would be nice if a list instance, when first created (before any mutation), used memory without over-allocation.

Veky 10:02 AM on 19 Aug 2016

It's a common misconception that mutable objects cannot be keys to a dictionary. The truth is, they are hashed, just not by value, but by identity. I don't usually use them as keys directly, but often I have sets of instances of my custom classes. It works perfectly fine. Only if you define __eq__, Python turns off that behavior. (Also, not all tuples are hashable. But you never claimed that.:)

Also, lists and tuples are not the only containers. In your glob example, a set would probably be a better option. Or, in Python you don't even have to choose a container: write a generator and let your user decide how to store the yielded items (and whether to store them at all).

An additional point of difference: lists have a nice shortcut for comprehension, tuples don't. In quick&dirty scripts, it's often a deciding factor. :-)

You mentioned starargs as an example of a tuple that should be a list. An opposite example is probably the result of str.split: in most cases, you split a string into fields, and you know which is which by index. Often I've wished .split returned a tuple.

And a technical nitpick: lists are naturally processed with for statement, (same name referring to their elements at different times), while tuples are naturally processed with unpacking (different names referring to their elements at the same time). Both of them are "iteration" from the perspective of Python (they use the iteration protocol).
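Veky's opening point about hashing can be sketched as follows. The classes Plain and WithEq and the helper is_hashable are made up for this illustration:

```python
class Plain:
    """No __eq__ defined, so instances are hashed by identity."""
    def __init__(self, items):
        self.items = items           # mutable state

class WithEq:
    """Defining __eq__ without __hash__ makes instances unhashable."""
    def __init__(self, items):
        self.items = items
    def __eq__(self, other):
        return isinstance(other, WithEq) and self.items == other.items

def is_hashable(obj):
    try:
        hash(obj)
    except TypeError:
        return False
    return True

p = Plain([1, 2])
seen = {p}
p.items.append(3)                    # mutating p does not change its hash
print(p in seen)                     # True

print(is_hashable(WithEq([1, 2])))   # False
print(is_hashable((1, [2, 3])))      # False: a tuple holding a list
print(is_hashable((1, (2, 3))))      # True
```

So hashability depends on how a type defines equality and hashing, not on mutability as such; lists happen to compare by value, which is why they opt out.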

Christoph 4:29 PM on 19 Aug 2016

Good point, Veky. Processing of lists with "for" and tuples by unpacking is indeed characteristic for these types.

Tuples don't have a nice shortcut for comprehension, but writing something like tuple(i for i in range(3)) is good enough (it's actually a generator expression, but since it is the sole argument of a function call, the extra parentheses can be omitted). As you say, that kind of usage is atypical for tuples anyway, so they don't need their own syntax.

Barnaby Robson 4:14 AM on 20 Aug 2016

Wait, why are you saying lists are homogeneous?

>>> xs = [None, 1, set()]

is perfectly valid, of course! This makes them different from C arrays.

Ned Batchelder 10:46 AM on 20 Aug 2016

@Veky: these are all excellent points. I don't quite agree that str.split() "in most cases" is unpacked, I'd put it in a middle ground where there are common cases for both looping and unpacking, a good demonstration of why Python's pragmatism wins out over any strict separation of loopy from structy.

Ned Batchelder 10:48 AM on 20 Aug 2016

@Barnaby: keep in mind I put the homogeneity consideration under the Cultural Difference. Of course lists can have different types, but it's unusual, and indicates that you might want a tuple instead. I say "might" because your different-typed values might still all be treated homogeneously. There are other considerations than just type.

Chris Mullins 6:26 PM on 20 Aug 2016

Great article. I love the way your writing is accessible to beginners, and builds from fundamentals of not just Python but software engineering in general. I'm trying to build my own talks this way. Thanks!

stuart 1:07 AM on 21 Aug 2016

Thank you for the beginner accessible article, I appreciated the part that describes when and what you should use lists and tuples for, I found that most valuable. The fact that tuples are more space efficient was interesting as well.

rbistolfi 11:44 AM on 22 Aug 2016

"An additional point of difference: lists have a nice shortcut for comprehension, tuples don't. In quick&dirty scripts, it's often a deciding factor. :-)"

Generator expressions can be used with function calls nicely:

>>> tuple(2**i for i in range(10))

Veky 8:46 PM on 23 Aug 2016

Trust me, I know all about generator expressions. I said "in quick&dirty scripts", those 5 characters are a great weight. :-)

Neil Khristian Manuel 11:24 AM on 24 Aug 2016

Nice article, perfect for Python programming aspirants.

Ajurna 3:55 PM on 25 Aug 2016

One major caveat here is that if you use a list as a default function parameter, you are going to have lots of problems with unpredictability, e.g.
def process_list(data=[]):
Every time you run that function, the list will be the same list from the start of the program. Also, passing lists into functions doesn't copy them, which can cause more issues.

more detail here http://docs.python-guide.org/en/latest/writing/gotchas/
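Ajurna's caveat can be sketched like this. The comment only gives the def line, so the item parameter, the function bodies, and the _safe and _tuple variants are assumptions added for illustration:

```python
def process_list(item, data=[]):
    # The default list is created once, when the function is defined,
    # and is shared by every call that omits `data`.
    data.append(item)
    return data

first = process_list("a")
second = process_list("b")
print(second)                    # ['a', 'b']
print(first is second)           # True

def process_list_safe(item, data=None):
    # Common workaround: use None as a sentinel and build a fresh list.
    if data is None:
        data = []
    data.append(item)
    return data

print(process_list_safe("a"))    # ['a']
print(process_list_safe("b"))    # ['b']

def process_tuple(item, data=()):
    # A tuple default is safe: it cannot be changed in place.
    return data + (item,)

print(process_tuple("a"))        # ('a',)
print(process_tuple("b"))        # ('b',)
```

This is one place where the Technical Difference matters on its own: an immutable default cannot carry state from one call to the next.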

Mihail Temelkov 4:02 PM on 25 Aug 2016

I prefer using tuples over lists, unless I know I need a mutable collection. They are faster, in fact they are the fastest collection type in terms of creation time, and "safer" (smaller chance of having a side effect).

Curtis Miller 5:17 PM on 25 Aug 2016

Great post! I really like the distinction between "Technical" and "Cultural" difference. I also never knew what a named tuple was; thanks for the exposure.

Ben Finney 12:27 AM on 29 Sep 2016

An important topic, well explained. Thank you!

One quibble: you use the term “the Cultural Difference” where you're talking not so much about culture, but *meaning*.

The term “tuple” was chosen, not because of Python culture, but because it *already* has meaning – semantic connotations – in the computing field, that pre-dates Python's use of the term.

I'd say you are referring rather to “the Semantic Difference” between lists and tuples. You even refer to “the list/tuple semantics”.

Care to update the article to strike “Cultural Difference”, and instead talk about meaning and “the Semantic Difference”?

Ned Batchelder 10:52 AM on 30 Sep 2016

@Ben: I see your point. But there's something subtle here about the semantics imposed by the language (immutability, order), and the semantics understood implicitly by the programmer as they write the code (this data is like a struct). I chose Cultural because partly it's about what we as Python programmers have together decided to use lists and tuples for. It's part of the Python culture to split up lists and tuples as we have, just as it's part of the Lisp culture to use lists for nearly everything.

In fact, with the rise in interest in functional languages, there are Python programmers who focus more on the mutability aspect, and favor tuples even for homogeneous sequences.

I think you perhaps view this as a semantic distinction because you have fully and completely adopted the Python culture that you should choose between lists and tuples based on the meaning of the data.
