Why does passing and changing a class self-variable with an external function work for manipulating iterables but not variables?
I ran into a very hard to track down bug in my program where a class self-iterable was manipulated by an external function and discovered that some self-variables can be changed and some can't. Is it possible to manipulate a single self-variable like an int with an external function without passing the entire class?
Here's some example code:
class TestClass(object):
    def __init__(self):
        self.my_var = 0
        self.my_str = "Foo"
        self.my_tuple = (1, 2, 3)
        self.my_list = [1, 2, 3]
        self.my_dict = {"one": 1, "two": 2, "three": 3}
        self.manipulate_1()
        self.manipulate_2()

    def manipulate_1(self):
        external_1(self.my_var, self.my_list, self.my_str, self.my_tuple, self.my_dict)
        print(self.my_var)
        print(self.my_list)
        print(self.my_str)
        print(self.my_tuple[0])
        print(self.my_dict["one"])
        # prints 0, [15], Foo, 1, 15

    def manipulate_2(self):
        external_2(self)
        print("\n" + str(self.my_var))
        # prints 1


def external_1(instance_var, instance_list, instance_str, instance_tuple, instance_dict):
    instance_var += 1
    del instance_list[0]
    del instance_list[0]
    instance_list[0] = 15
    instance_str = "Bar"
    list(instance_tuple)[0] = 15
    instance_dict.update({"one": 15})


def external_2(instance):
    instance.my_var += 1


a = TestClass()
The list can be manipulated by deleting entries just by passing it as an argument, while the variable can only be manipulated by passing self.
Is there a way to manipulate a single self-variable? If not, does passing self come with any performance or other issues?
I.e., if I want to manipulate a self-variable, is using a method mandatory?
Solution 1:[1]
Python's arguments passing works the same for all objects - the original object is passed (not "a copy of", not "a reference to", not "a pointer to" - it IS the object itself that is passed), regardless of the object's type, whether it's mutable or not etc. These objects are then bound to their matching parameter's names as local variables.
The difference you observe is actually the result of the difference between two totally distinct operations: rebinding a (local) name and mutating an object.
Since parameters are local variables (local names, actually), rebinding a parameter in your function's body only makes this name point to another object, and does not impact the original argument (except for decrementing its reference count). So obviously this has absolutely no effect outside the function itself.
Now when you mutate one of your arguments, since you are working on the very object you passed to the function, those changes are, very obviously, visible outside the function.
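To make the distinction concrete, here is a minimal sketch. The helper names `rebind`, `mutate` and `same_object` are made up for illustration and are not part of the question's code.

```python
def rebind(value):
    # Assignment makes the local name `value` refer to a new object;
    # the object the caller passed is left untouched.
    value = value + 1


def mutate(items):
    # Item assignment changes the list object itself, and the caller
    # still holds that same object.
    items[0] = 15


def same_object(arg, original_id):
    # The parameter is bound to the very object the caller passed,
    # so the identities match.
    return id(arg) == original_id


number = 0
numbers = [1, 2, 3]

rebind(number)
mutate(numbers)

print(number)   # 0
print(numbers)  # [15, 2, 3]
print(same_object(numbers, id(numbers)))  # True
```

The `int` and the `list` are passed in exactly the same way; only the operation performed inside the function differs.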
Here:
def external_1(instance_var, instance_list, instance_str, instance_tuple, instance_dict):
    # this one rebinds the local name `instance_var`
    # to a new `int` object. Doesn't affect the object
    # previously bound to `instance_var`
    instance_var += 1
    # these three statements mutate `instance_list`,
    # so the effect is visible outside the function
    del instance_list[0]
    del instance_list[0]
    instance_list[0] = 15
    # this one rebinds the local name `instance_str`
    # to the literal string "Bar". Same as for `instance_var`
    instance_str = "Bar"
    # this one creates a list from `instance_tuple`,
    # mutates this list, and discards it. IOW it eats a
    # couple of processor cycles for nothing.
    list(instance_tuple)[0] = 15
    # and this one mutates `instance_dict` so the
    # effect is visible outside the function
    instance_dict.update({"one": 15})
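One subtlety with `+=` is worth a small sketch: whether it rebinds or mutates depends on the type of the object. The function names `add_one` and `extend` below are hypothetical, chosen only for this illustration.

```python
def add_one(counter):
    # `int` is immutable: `+=` builds a new int and rebinds the
    # local name `counter`. The caller's object is unaffected.
    counter += 1


def extend(items):
    # `list` implements in-place addition: `+=` extends the very
    # list object that was passed in.
    items += [4]


count = 0
values = [1, 2, 3]

add_one(count)
extend(values)

print(count)   # 0
print(values)  # [1, 2, 3, 4]
```

So `instance_var += 1` has no visible effect because `int` objects cannot be changed in place, not because ints are passed differently from lists.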
And here:
def external_2(instance):
    # this one mutates `instance` - it's actually
    # syntactic sugar for
    # `instance.__setattr__("my_var", instance.__getattribute__("my_var") + 1)`
    instance.my_var += 1
As I already mentioned a couple of times in the comments, all this (and much more) is explained in full detail in Ned Batchelder's reference article.
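As for updating a single attribute such as an `int` from an external function, two common patterns can be sketched as follows. This is only an illustration: the class `Counter` and the functions `incremented` and `increment_attr` are hypothetical names, not part of the question's code.

```python
class Counter(object):
    def __init__(self):
        self.my_var = 0

    def bump_by_return(self):
        # Pattern 1: the function returns a new value and the
        # caller rebinds the attribute itself.
        self.my_var = incremented(self.my_var)

    def bump_by_name(self):
        # Pattern 2: pass the instance plus the name of the
        # attribute that should be updated.
        increment_attr(self, "my_var")


def incremented(value):
    # Computes a new value and changes nothing.
    return value + 1


def increment_attr(instance, name):
    # Mutates `instance` by rebinding one of its attributes.
    setattr(instance, name, getattr(instance, name) + 1)


c = Counter()
c.bump_by_return()
c.bump_by_name()
print(c.my_var)  # 2
```

Since arguments are never copied, passing the instance binds one local name just as passing the `int` does, so no extra copying cost is involved in passing `self`.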
Solution 2:[2]
Please see the addendum.
This is normal and expected behavior. It's because of the difference between sending references and sending values.
With external_1(self.my_var, self.my_list):
You send self.my_var, which is a value. This means that external_1 receives only a value. That value is then local to the function, so the class has no way of knowing if it changed. Try this to see it working:
def external_1(instance_var, instance_list):
    instance_var += 1
    print('This will print one:', instance_var)
    del instance_list[0]
    del instance_list[0]
The variable self.my_list is a *reference*. This means you're sending the address of where to find the list. So the function external_1 will go to that address and change the list values there.
With external_2(self) you send external_2 a reference to the entire class instance. So it does exactly the same as with self.my_list.
If you still don't fully understand, don't worry, it took me quite some time to understand these kinds of references (or pointers). There are millions of tutorials and videos about how they work.
Addendum:
@bruno is correct in saying that my explanation is not technically accurate and does not describe how Python actually handles variables. I'm simply trying to give an overview of what happens, and coming from the C world, this is how I understand it.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | bruno desthuilliers |
| Solution 2 | |