"Least Astonishment" and the Mutable Default Argument
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------
Music by Eric Matyas
https://www.soundimage.org
Track title: Ancient Construction
--
Chapters
00:00 &Quot;Least Astonishment&Quot; And The Mutable Default Argument
01:54 Answer 1 Score 337
03:58 Answer 2 Score 100
05:06 Answer 3 Score 149
06:27 Accepted Answer Score 1938
07:08 Thank you
--
Full question
https://stackoverflow.com/questions/1132...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #languagedesign #defaultparameters #leastastonishment
#avk47
ACCEPTED ANSWER
Score 1938
Actually, this is not a design flaw, and it is not because of internals or performance. It comes simply from the fact that functions in Python are first-class objects, and not only a piece of code.
As soon as you think of it this way, then it completely makes sense: a function is an object being evaluated on its definition; default parameters are kind of "member data" and therefore their state may change from one call to the other - exactly as in any other object.
In any case, the effbot (Fredrik Lundh) has a very nice explanation of the reasons for this behavior in Default Parameter Values in Python. I found it very clear, and I really suggest reading it for a better knowledge of how function objects work.
ANSWER 2
Score 337
Suppose you have the following code
fruits = ("apples", "bananas", "loganberries")
def eat(food=fruits):
...
When I see the declaration of eat, the least astonishing thing is to think that if the first parameter is not given, that it will be equal to the tuple ("apples", "bananas", "loganberries")
However, suppose later on in the code, I do something like
def some_random_function():
global fruits
fruits = ("blueberries", "mangos")
then if default parameters were bound at function execution rather than function declaration, I would be astonished (in a very bad way) to discover that fruits had been changed. This would be more astonishing IMO than discovering that your foo function above was mutating the list.
The real problem lies with mutable variables, and all languages have this problem to some extent. Here's a question: suppose in Java I have the following code:
StringBuffer s = new StringBuffer("Hello World!");
Map<StringBuffer,Integer> counts = new HashMap<StringBuffer,Integer>();
counts.put(s, 5);
s.append("!!!!");
System.out.println( counts.get(s) ); // does this work?
Now, does my map use the value of the StringBuffer key when it was placed into the map, or does it store the key by reference? Either way, someone is astonished; either the person who tried to get the object out of the Map using a value identical to the one they put it in with, or the person who can't seem to retrieve their object even though the key they're using is literally the same object that was used to put it into the map (this is actually why Python doesn't allow its mutable built-in data types to be used as dictionary keys).
Your example is a good one of a case where Python newcomers will be surprised and bitten. But I'd argue that if we "fixed" this, then that would only create a different situation where they'd be bitten instead, and that one would be even less intuitive. Moreover, this is always the case when dealing with mutable variables; you always run into cases where someone could intuitively expect one or the opposite behavior depending on what code they're writing.
I personally like Python's current approach: default function arguments are evaluated when the function is defined and that object is always the default. I suppose they could special-case using an empty list, but that kind of special casing would cause even more astonishment, not to mention be backwards incompatible.
ANSWER 3
Score 149
I know nothing about the Python interpreter inner workings (and I'm not an expert in compilers and interpreters either) so don't blame me if I propose anything unsensible or impossible.
Provided that python objects are mutable I think that this should be taken into account when designing the default arguments stuff. When you instantiate a list:
a = []
you expect to get a new list referenced by a.
Why should the a=[] in
def x(a=[]):
instantiate a new list on function definition and not on invocation? It's just like you're asking "if the user doesn't provide the argument then instantiate a new list and use it as if it was produced by the caller". I think this is ambiguous instead:
def x(a=datetime.datetime.now()):
user, do you want a to default to the datetime corresponding to when you're defining or executing x?
In this case, as in the previous one, I'll keep the same behaviour as if the default argument "assignment" was the first instruction of the function (datetime.now() called on function invocation).
On the other hand, if the user wanted the definition-time mapping he could write:
b = datetime.datetime.now()
def x(a=b):
I know, I know: that's a closure. Alternatively Python might provide a keyword to force definition-time binding:
def x(static a=b):
ANSWER 4
Score 100
Well, the reason is quite simply that bindings are done when code is executed, and the function definition is executed, well... when the functions is defined.
Compare this:
class BananaBunch:
bananas = []
def addBanana(self, banana):
self.bananas.append(banana)
This code suffers from the exact same unexpected happenstance. bananas is a class attribute, and hence, when you add things to it, it's added to all instances of that class. The reason is exactly the same.
It's just "How It Works", and making it work differently in the function case would probably be complicated, and in the class case likely impossible, or at least slow down object instantiation a lot, as you would have to keep the class code around and execute it when objects are created.
Yes, it is unexpected. But once the penny drops, it fits in perfectly with how Python works in general. In fact, it's a good teaching aid, and once you understand why this happens, you'll grok python much better.
That said it should feature prominently in any good Python tutorial. Because as you mention, everyone runs into this problem sooner or later.