Are Python variables pointers? Or else, what are they?
Variables in Python are just pointers, as far as I know.
Based on this rule, I can assume that the result for this code snippet:
i = 5
j = i
j = 3
print(i)
would be 3.
But the result I got was unexpected to me: it was 5.
Moreover, my Python book does cover this example:
i = [1,2,3]
j = i
i[0] = 5
print(j)
The result would be [5,2,3].
What am I understanding wrong?
Solution 1:[1]
We call them references. They work like this:
i = 5 # create int(5) instance, bind it to i
j = i # bind j to the same int as i
j = 3 # create int(3) instance, bind it to j
print(i) # i is still bound to the int(5); j is bound to the int(3)
Small ints are interned (cached) in CPython, but that isn't important to this explanation.
i = [1,2,3] # create the list instance, and bind it to i
j = i # bind j to the same list as i
i[0] = 5 # change the first item of i
print(j) # j is still bound to the same list as i
Solution 2:[2]
Variables are not pointers. When you assign to a variable you are binding the name to an object. From that point onwards you can refer to the object by using the name, until that name is rebound.
In your first example the name i is bound to the value 5. Binding different values to the name j does not have any effect on i, so when you later print the value of i the value is still 5.
In your second example you bind both i and j to the same list object. When you modify the contents of the list, you can see the change regardless of which name you use to refer to the list.
Note that it would be incorrect to say "both lists have changed". There is only one list, but it has two names (i and j) that refer to it.
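The "one list, two names" point can be checked with the identity operator. This is only a minimal sketch; the helper names (same_object, seen_through_j and so on) are made up for the illustration and are not part of the original answer.

```python
# Two names bound to one list object.
i = [1, 2, 3]
j = i
same_object = i is j          # True: both names refer to the same list

i[0] = 5                      # mutate the list through the name i
seen_through_j = j            # the same list, now [5, 2, 3]

# Rebinding j makes it refer to a new object; i is unaffected.
j = [5, 2, 3]
equal_values = i == j         # True: same contents
still_same_object = i is j    # False: two distinct list objects

print(same_object, seen_through_j, equal_values, still_same_object)
```

Here == compares the contents of the two lists, while is asks whether both names refer to the very same object.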
Related documentation
Solution 3:[3]
Python variables are names bound to objects
From the docs:
Names refer to objects. Names are introduced by name binding operations. Each occurrence of a name in the program text refers to the binding of that name established in the innermost function block containing the use.
When you do
i = 5
j = i
that's the same as doing:
i = 5
j = 5
j doesn't point to i, and after the assignment, j doesn't know that i exists. j is simply bound to whatever i was pointing to at the time of assignment.
If you did the assignments on the same line, it would look like this:
i = j = 5
And the result would be exactly the same.
Thus, later doing
i = 3
doesn't change what j is pointing to, and the reverse holds too: j = 3 would not change what i is pointing to.
Your example doesn't remove the reference to the list
So when you do this:
i = [1,2,3]
j = i
It's the same as doing this:
i = j = [1,2,3]
so i and j both point to the same list. Then your example mutates the list:
i[0] = 5
Python lists are mutable objects, so when you change the list from one reference, and you look at it from another reference, you'll see the same result because it's the same list.
What you probably want is a copy of the list, like this perhaps:
i = [1,2,3]
j = i.copy()
Note that this is a shallow copy: both lists contain the same element objects. If those elements are mutable, mutating one of them is visible through both lists, because they are the same objects.
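To make the caveat about shared elements concrete, here is a small sketch that uses a nested list, which is an assumption of this example rather than something from the question. It contrasts list.copy() with copy.deepcopy() from the standard library.

```python
import copy

i = [[1, 2], [3, 4]]
shallow = i.copy()          # new outer list, same inner lists
deep = copy.deepcopy(i)     # new outer list and new inner lists

i[0][0] = 99                # mutate an inner list that `shallow` shares
i.append([5, 6])            # change the outer list of `i` only

print(shallow)  # [[99, 2], [3, 4]]
print(deep)     # [[1, 2], [3, 4]]
```

The append is not visible in either copy because each has its own outer list; the change to the inner list shows up in the shallow copy only.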
Solution 4:[4]
TLDR: Python names work like pointers with automatic de/referencing but do not allow explicit pointer operations. Other targets represent indirections, which behave similarly to pointers.
The Python language spec does not define what names and such actually are, only how they behave. However, the behaviour can be explained with pointers.
The CPython implementation uses pointers of type PyObject* under the hood. As such, it is possible to translate name semantics to pointer operations. The key is to separate names from actual objects.
The example Python code includes both names (i) and objects (5).
i = 5 # name `i` refers to object `5`
j = i # ???
j = 3 # name `j` refers to object `3`
This can be roughly translated to C code with separate names and objects.
int three=3, five=5; // objects
int *i, *j; // names
i = &five; // name `i` refers to position of object `5`
j = i; // name `j` refers to referent of `i`
j = &three; // name `j` refers to position of object `3`
The important part is that "names-as-pointers" do not store objects! We did not define *i = five, but i = &five. The names and objects exist independently of each other.
Names only point to existing objects in memory.
When assigning from name to name, no objects are exchanged! When we define j = i, this is equivalent to j = &five. Neither i nor j is connected to the other.
+- name i -+ -\
               \
                --> + <five> -+
               /    |       5 |
+- name j -+ -/     +---------+
As a result, changing the target of one name does not affect the other. It only updates what that specific name points to.
Python also has other kinds of name-like elements: attribute references (i.j), subscriptions (i[j]) and slicing (i[:j]). Unlike names, which refer directly to objects, all three indirectly refer to elements of objects.
The example code includes both names (i) and a subscription (i[0]).
i = [1,2,3] # name `i` refers to object `[1, 2, 3]`
j = i # name `j` refers to referent of `i`
i[0] = 5 # ???
A CPython list uses a C array of PyObject* pointers under the hood. This can again be roughly translated to C code with separate names and objects.
typedef struct{
    int *elements[3];
} list; // length 3 `list` type
int one = 1, two = 2, three = 3, five = 5;
list values = {&one, &two, &three}; // objects
list *i, *j; // names
i = &values; // name `i` refers to object `[1, 2, 3]`
j = i; // name `j` refers to referent of `i`
i->elements[0] = &five; // leading element of `i` refers to object `5`
The important part is that we did not change any names! We did change i->elements[0], the element of an object both our names point to.
Values of existing compound objects may be changed.
When changing the value of an object through a name, names are not changed. Both i and j still refer to the same object, whose value we can change.
+- name i -+ -\
               \
                --> + <values> -+
               /    |  elements | --> [1, 2, 3]
+- name j -+ -/     +-----------+
The intermediate object behaves similarly to a pointer in that we can directly change what it points to and reference it from multiple names.
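The same distinction between names and the other kinds of targets can be seen in plain Python. The sketch below is only an illustration: the Box class and the seen_* variables are hypothetical names invented for this example.

```python
class Box:
    """Hypothetical container used only for this illustration."""

    def __init__(self, items):
        self.items = items


i = Box([1, 2, 3])
j = i                      # j refers to the same Box as i

i.items[0] = 5             # subscription: changes an element of the list
i.items[1:] = [7, 8, 9]    # slicing: replaces part of the list in place
seen_after_mutation = list(j.items)   # snapshot of what j sees: [5, 7, 8, 9]

i.items = []               # attribute reference: rebinds an attribute of the shared Box
seen_after_attribute = j.items        # j sees the new, empty list

i = None                   # plain name: rebinding i leaves j and the Box alone
print(seen_after_mutation, seen_after_attribute, j is not None)
```

Only the last assignment targets a name. The three before it change the object that both names refer to, which is why they are visible through j.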
Solution 5:[5]
They are not quite pointers; they are references to objects. Objects can be either mutable or immutable. An immutable object cannot be changed in place; any operation that seems to modify it produces a new object instead. A mutable object is altered in place. An integer is an immutable object, which you reference through your i and j variables. A list is a mutable object.
In your first example
i = 5
# The label i now references 5
j = i
# The label j now references what i references
j = 3
# The label j now references 3
print(i)
# i still references 5
In your second example:
i = [1, 2, 3]
# 'i' references a list object (a mutable object)
j = i
# 'j' now references the same object as 'i' (they reference the same mutable object)
i[0] = 5
# Sets the first element of the referenced object to 5
print(j)
# Prints the list object that 'j' references. It's the same one as 'i'.
Solution 6:[6]
Assignment doesn't modify objects; all it does is change where the variable points. Changing where one variable points won't change where another one points.
You are probably thinking of the fact that lists and dictionaries are mutable types. There are operators to modify the actual objects in-place, and if you use one of those, you will see the change in all variables pointing to the same object:
x = []
y = x
x.append(1)
# x and y both are now [1]
But assignment still just moves the pointer around:
x = [2]
# x now points to new list [2]; y still points to old list [1]
Numbers, unlike dictionaries and lists, are immutable. If you do x = 3; x += 2, you aren't transforming the number 3 into the number 5; you're just making the variable x point to 5 instead. The 3 is still out there unchanged, and any variables pointing to it will still see 3 as their value.
(In CPython, numbers are ordinary objects reached through references like everything else, and small integers are cached. Another implementation could store the value directly in the variable instead. Either way, that implementation detail doesn't change the semantics where immutable types are concerned.)
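The x += 2 case can be compared with the same operator applied to a list. This is a rough sketch with made-up variable names; it relies on the fact that augmented assignment rebinds the name for immutable types but, for lists, extends the existing object in place.

```python
x = 3
y = x
x += 2          # int is immutable: x is rebound to a different object, 5
                # y still refers to 3

a = [1]
b = a
a += [2]        # list is mutable: += extends the same list in place
                # b sees the change because a and b are still one object

c = [1]
d = c
c = c + [2]     # plain + builds a new list and rebinds c
                # d still refers to the old list [1]

print(x, y, a, b, a is b, c, d, c is d)
```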
Solution 7:[7]
When you set j=3, the label j no longer applies (points) to the object that i refers to; it starts to point to the integer 3. The name i still refers to the value you set originally, 5.
Solution 8:[8]
Whatever variable is on the left side of the '=' sign is assigned the value on the right side of the '=':
i = 5
j = i
# j has 5
j = 3
# j has 3 (overwrites the value of 5), but nothing has changed regarding i
print(i)
# so this prints 5
Solution 9:[9]
In Python, everything is an object, including the pieces of memory that are returned to you. That means that when a new chunk of memory is created (irrespective of what you've created: int, str, custom object, etc.), you have a new memory object. In your case it is the assignment of 3 that creates a new (memory) object, which therefore has a new address.
If you run the following, you see what I mean easily.
i = 5
j = i
print("id of j: {}".format(id(j)))
j = 3
print("id of j: {}".format(id(j)))
IMO, memory wise, this is the key understanding/difference between C and Python. In C/C++, you're returned a memory pointer (if you use pointer syntax of course) instead of a memory object which gives you more flexibility in terms of changing the referred address.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | John La Rooy |
Solution 2 | |
Solution 3 | |
Solution 4 | |
Solution 5 | Peter Mortensen |
Solution 6 | |
Solution 7 | mbatchkarov |
Solution 8 | Peter Mortensen |
Solution 9 | stdout |