Python interview question: tuple vs list
Let me start the series of “Python interview question” posts with this pretty common question. Tuples and lists are 2 seemingly similar sequence types in Python. And I love this question, because the depth of the answer is a good indicator of the candidate’s experience.
Literal syntax. The most evident way to get required data structure is to call the appropriate type — tuple
or list
. And you could also use syntactic sugar to construct either a list (using square brackets and commas to declare elements) or a tule (using commas to declare elements and optionally parenthesis to reduce ambiguity).
Mutability. Tuples are immutable, while lists are mutable. This point is the base for following ones.
Memory usage. Due to mutability, you need more memory for lists and less memory for tuples.
Extending. You can add a new element to both tuples and lists with the only difference that the id of the tuple will be changed (i.e., we’ll have new object).
Hashing. Tuples are hashable and lists are not. It means that you can use a tuple as a key in a dictionary.
Semantics. This point is more about best practice. You should use tuples as heterogeneous data structures, while lists are homogeneus sequences.
Links:
I’d argue that number 5, semantics, really ought to be number 1. Once you understand what a tuple means, you’ll understand the reason for all the rest, and be able to choose the right one.
People are often distracted by the low level performance features of these data structures. So you see things like “use tuples because they are faster” (for iterating over), or because they use less memory etc. This can turn into a proliferation of tuples (for consistency and/or compatibility with other code that is expecting tuples). And then you start to have real performance problems:
You cannot actually extend a tuple at all - they are immutable. As you say “extending” a tuple is actually creating an entirely new tuple, with all the elements of the original one copied over (by reference). However, if you ever want to “extend” a tuple, you’ve got the wrong data-structure - semantically, a tuple is a structure with a fixed number of elements, each of which has a different meaning e.g.
(name, email address)
pair. It never makes any sense to extend a tuple, and that is why it is immutable.So what does “extending” a tuple look like? Here is an example:
I have seen code like this in the wild - do NOT write code like this! The third line is not adding a new element to the tuple - you are in fact creating an entirely new tuple, which has to copy the elements of the old one, and because of this you end up with quadratic time complexity in the length of
things
, which is very bad.If you had used a list, as Guido intended, then you would be able to use
list.append
, which has excellent performance characteristics for this use case.The lesson is to just use the data structure that semantically matches what you have. Everything else will follow, because the performance characteristics of these data structures, and all the other language features and stdlib behaviour, have been optimized for their intended usage (e.g. there is such a thing as a list comprehension, but no tuple comprehension).
Donald Knuth is extremely relevant:
So we should worry about O(n²) performance (which is fixed by using the right data structure), and should not worry about tiny differences in memory/speed (which might be very different on other Python implementations, like PyPy for example).
I would argue that mutability has nothing to do with memory usage. Can you explain how one affects the other?
This is a really nice guide honestly. I feel like it covers all the aspects of these data types really well mate.
Pleased to hear that , David. Hope that will help someone during interview preparation process.