4. Objects and References

An important topic is understanding how Python stores data, and how many headaches you can get if you blindly trust the old and familiar = operator. Let’s look at some examples:

First, let’s create 2 variables var1 and var2 of some simple data type, like an integer, then let’s make them equal and finally let’s change var2.

var1 = 5
var2 = 7
print(f'var1 = {var1} and var2 = {var2} (initially)')

var1 = var2
print(f'var1 = {var1} and var2 = {var2} (after "=" accident)')

var2 -= 777
print(f'var1 = {var1} and var2 = {var2} (after altering var2)')
var1 = 5 and var2 = 7 (initially)
var1 = 7 and var2 = 7 (after "=" accident)
var1 = 7 and var2 = -770 (after altering var2)

As you can see, nothing extraordinary happened here. Let’s repeat this but now with a more complex data type — lists.

var1 = [1, 2, 3]
var2 = [555, 777, 888]
print(f'var1 = {var1} and var2 = {var2} (initially)')

var1 = var2
print(f'var1 = {var1} and var2 = {var2} (after "=" accident)')

var2[2] -= 777
print(f'var1 = {var1} and var2 = {var2} (after altering var2)')
var1 = [1, 2, 3] and var2 = [555, 777, 888] (initially)
var1 = [555, 777, 888] and var2 = [555, 777, 888] (after "=" accident)
var1 = [555, 777, 111] and var2 = [555, 777, 111] (after altering var2)

Hmmmm, that’s strange… we altered only one list but both were changed in the end! Why would that happen? Well, there are two reasons behind it: (1) what a variable actually is; and (2) what the = operator actually does. In short, a variable is just a reference to the place in memory where an object is stored. By assigning a new value to a variable, you only change which object it refers to.

So, first, when you write var1 = [1, 2, 3] and var2 = [555, 777, 888], you create two different list objects: var1 is a variable referring to the list [1, 2, 3], and var2 is a variable referring to the list [555, 777, 888]. Then, with the line var1 = var2, you don’t actually change the contents of the list that var1 referred to; you just make var1 refer to the list [555, 777, 888]! Thus, by changing one element through var2 you will see the change through var1 as well, since both refer to the same object in memory! Here’s a sketch of the described situation:

image.png
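
The distinction between changing the shared object and pointing a name at a different object can be sketched in a few lines. This is only a minimal illustration; the names a and b are placeholders and do not come from the examples above.

```python
# Two names bound to the same list object (names are illustrative only).
a = [1, 2, 3]
b = a                # b now refers to the same list as a; nothing is copied

b[0] = 100           # mutating the shared object is visible through both names
print(a, b)          # [100, 2, 3] [100, 2, 3]

b = [7, 8, 9]        # rebinding b only changes what b refers to
print(a, b)          # [100, 2, 3] [7, 8, 9]
```

The first change goes through the object itself, so both names see it. The second one only moves the name b to a new list, so a is left alone.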

You can also see this by using the id() function. It returns an integer that identifies the object and stays the same for as long as the object exists, so the same object returns the same id every time. A copy of that object, with the same value but stored in a different place, returns a different id. In addition, the is operator compares the identity of two objects and returns True if both variables reference the same object!
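
As a quick check of the difference between "equal value" and "same object", here is a small sketch. The names first, second and alias are made up for this illustration.

```python
first = [1, 2, 3]
second = [1, 2, 3]   # same value, but a separate list object
alias = first        # another name for the very same object

print(first == second)          # True: the values are equal
print(first is second)          # False: two different objects
print(first is alias)           # True: one object, two names
print(id(first) == id(alias))   # True: same identity, so same id
```

In other words, == compares contents, while is (and id()) compare identity.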

var1 = [1, 2, 3]
var2 = [555, 777, 888]

print(f'var1 = {var1} and var2 = {var2} (initially)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')


var1 = var2

print(f'var1 = {var1} and var2 = {var2} (after "=" accident)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')


var2[2] -= 777
print(f'var1 = {var1} and var2 = {var2} (after altering var2)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')
var1 = [1, 2, 3] and var2 = [555, 777, 888] (initially)
var1 id = 1871667411520 and var2 id = 1871667411328
var1 is var2 -> False

var1 = [555, 777, 888] and var2 = [555, 777, 888] (after "=" accident)
var1 id = 1871667411328 and var2 id = 1871667411328
var1 is var2 -> True

var1 = [555, 777, 111] and var2 = [555, 777, 111] (after altering var2)
var1 id = 1871667411328 and var2 id = 1871667411328
var1 is var2 -> True

You can see that initially var1 and var2 referred to two completely different objects; however, after using the = operator, they started to refer to the same object. Okay, now you understand how it works… but then, why does this happen with lists and not with integers? Well… numbers are immutable, so this aliasing is not really a problem here: any change creates a new number object (instead of modifying the old one).

var1 = 5
var2 = 7

print(f'var1 = {var1} and var2 = {var2} (initially)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')

var1 = var2

print(f'var1 = {var1} and var2 = {var2} (after "=" accident)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')

var2 -= 777
print(f'var1 = {var1} and var2 = {var2} (after altering var2)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')
var1 = 5 and var2 = 7 (initially)
var1 id = 140713374457728 and var2 id = 140713374457792
var1 is var2 -> False

var1 = 7 and var2 = 7 (after "=" accident)
var1 id = 140713374457792 and var2 id = 140713374457792
var1 is var2 -> True

var1 = 7 and var2 = -770 (after altering var2)
var1 id = 140713374457792 and var2 id = 1871667384656
var1 is var2 -> False

The same will happen with any immutable object type: strings, tuples, etc.

var1 = "ananas"
var2 = "pineapple"

print(f'var1 = {var1} and var2 = {var2} (initially)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')

var1 = var2

print(f'var1 = {var1} and var2 = {var2} (after "=" accident)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')

var2 += " is tasty"
print(f'var1 = {var1} and var2 = {var2} (after altering var2)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')
var1 = ananas and var2 = pineapple (initially)
var1 id = 1871666228400 and var2 id = 1871666490992
var1 is var2 -> False

var1 = pineapple and var2 = pineapple (after "=" accident)
var1 id = 1871666490992 and var2 id = 1871666490992
var1 is var2 -> True

var1 = pineapple and var2 = pineapple is tasty (after altering var2)
var1 id = 1871666490992 and var2 id = 1871667510128
var1 is var2 -> False

And here’s a small illustration of what happened:

image.png
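
Tuples were mentioned above but not shown, so here is a minimal sketch of the same behaviour with a tuple. The names point and same_point are illustrative.

```python
point = (1, 2, 3)
same_point = point           # both names refer to the same tuple

same_point += (4,)           # builds a new tuple and rebinds same_point
print(point)                 # (1, 2, 3)
print(same_point)            # (1, 2, 3, 4)
print(point is same_point)   # False

try:
    point[0] = 99            # tuples do not support item assignment
except TypeError as error:
    print(f'TypeError: {error}')
```

Since a tuple cannot be modified in place, there is no way for a change made through one name to show up through the other.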

So, as you can see, this confusion is not a big deal for immutable objects. However, we still need to explain what to do in this situation with mutable objects. How can you assign the values of one list to another list without sharing the list itself? First, you can just write a for loop and copy the lower-level data, which is immutable, element by element with the = operator.

  • Option 1 - a simple, yet reliable, for loop.

var1 = [1, 2, 3]
var2 = [555, 777, 888]

print(f'var1 = {var1} and var2 = {var2} (initially)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')

for i in range(len(var1)):
    var1[i] = var2[i] 

print(f'var1 = {var1} and var2 = {var2} (after making them equal)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')

var2[2] -= 777
print(f'var1 = {var1} and var2 = {var2} (after altering var2)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')
var1 = [1, 2, 3] and var2 = [555, 777, 888] (initially)
var1 id = 1871667412288 and var2 id = 1871667429760
var1 is var2 -> False

var1 = [555, 777, 888] and var2 = [555, 777, 888] (after making them equal)
var1 id = 1871667412288 and var2 id = 1871667429760
var1 is var2 -> False

var1 = [555, 777, 888] and var2 = [555, 777, 111] (after altering var2)
var1 id = 1871667412288 and var2 id = 1871667429760
var1 is var2 -> False
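
The loop above copies element by element, so it only gives a full copy when both lists have the same length. As a possible alternative, and only as a sketch, slice assignment replaces the contents of an existing list in place and also handles lists of different lengths. The variable id_before is a helper introduced here for the check.

```python
var1 = [1, 2, 3]
var2 = [555, 777, 888, 999]   # deliberately longer than var1

id_before = id(var1)
var1[:] = var2                # replace the contents of var1 in place

print(var1)                   # [555, 777, 888, 999]
print(id(var1) == id_before)  # True: still the same list object
print(var1 is var2)           # False: the two lists stay separate

var2[2] -= 777
print(var1, var2)             # [555, 777, 888, 999] [555, 777, 111, 999]
```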

The second option is to use the built-in copy() method that mutable objects such as lists provide. This method returns a shallow copy of the object it is called on.

  • Option 2 - implemented methods, like copy()

var1 = [1, 2, 3]
var2 = [555, 777, 888]

print(f'var1 = {var1} and var2 = {var2} (initially)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')

var1 = var2.copy()

print(f'var1 = {var1} and var2 = {var2} (after making them equal)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')

var2[2] -= 777
print(f'var1 = {var1} and var2 = {var2} (after altering var2)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')
var1 = [1, 2, 3] and var2 = [555, 777, 888] (initially)
var1 id = 1871666246080 and var2 id = 1871667432064
var1 is var2 -> False

var1 = [555, 777, 888] and var2 = [555, 777, 888] (after making them equal)
var1 id = 1871667429760 and var2 id = 1871667432064
var1 is var2 -> False

var1 = [555, 777, 888] and var2 = [555, 777, 111] (after altering var2)
var1 id = 1871667429760 and var2 id = 1871667432064
var1 is var2 -> False
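
Lists are not the only built-in mutable type with a copy() method; dictionaries and sets have one too. The sketch below also shows two other common ways to get a shallow copy of a list, list(...) and a full slice. All names in it are illustrative.

```python
numbers = [555, 777, 888]
by_method = numbers.copy()       # list.copy()
by_constructor = list(numbers)   # building a new list from the old one
by_slice = numbers[:]            # a full slice is also a new list

numbers[2] -= 777
print(numbers)                               # [555, 777, 111]
print(by_method, by_constructor, by_slice)   # all three still end in 888

prices = {'apple': 1.5, 'ananas': 3.0}
prices_copy = prices.copy()      # dict.copy()
prices['apple'] = 2.0
print(prices_copy)               # {'apple': 1.5, 'ananas': 3.0}

tags = {'a', 'b'}
tags_copy = tags.copy()          # set.copy()
tags.add('c')
print(sorted(tags_copy))         # ['a', 'b']
```

All of these are shallow copies, just like the copy() call in the example above.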

The third option is to use the copy() function from the copy module. This function also performs a shallow copy. For your reference, the deepcopy() function from the same module recursively copies everything inside its argument. More information on how to use the copy module can be found here.

  • Option 3 - module copy

import copy

var1 = [1, 2, 3]
var2 = [555, 777, 888]

print(f'var1 = {var1} and var2 = {var2} (initially)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')

var1 = copy.copy(var2)

print(f'var1 = {var1} and var2 = {var2} (after making them equal)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')

var2[2] -= 777
print(f'var1 = {var1} and var2 = {var2} (after altering var2)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')
var1 = [1, 2, 3] and var2 = [555, 777, 888] (initially)
var1 id = 1871667432256 and var2 id = 1871666251328
var1 is var2 -> False

var1 = [555, 777, 888] and var2 = [555, 777, 888] (after making them equal)
var1 id = 1871667432064 and var2 id = 1871666251328
var1 is var2 -> False

var1 = [555, 777, 888] and var2 = [555, 777, 111] (after altering var2)
var1 id = 1871667432064 and var2 id = 1871666251328
var1 is var2 -> False

You may be wondering what the difference between a shallow copy and a deep copy is. A shallow copy only copies the top layer of a container and keeps references to the objects inside it, while a deep copy goes through all layers and copies every nested object as well. Here’s a better example to illustrate the difference:

import copy

lst = ["cool", 232, -876.5]
var1 = [1, lst, 3]
var2 = [555, 777, lst]

print(f'var1 = {var1} and var2 = {var2} (initially)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')

var1 = copy.copy(var2)
print(f'var1 = {var1} and var2 = {var2} (after shallow copy)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')

var2[1] -= 777
print(f'var1 = {var1} and var2 = {var2} (after altering var2)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')
var1 = [1, ['cool', 232, -876.5], 3] and var2 = [555, 777, ['cool', 232, -876.5]] (initially)
var1 id = 1871667410176 and var2 id = 1871667322304
var1 is var2 -> False

var1 = [555, 777, ['cool', 232, -876.5]] and var2 = [555, 777, ['cool', 232, -876.5]] (after shallow copy)
var1 id = 1871666251328 and var2 id = 1871667322304
var1 is var2 -> False

var1 = [555, 777, ['cool', 232, -876.5]] and var2 = [555, 0, ['cool', 232, -876.5]] (after altering var2)
var1 id = 1871666251328 and var2 id = 1871667322304
var1 is var2 -> False

Okay, looks good, but what if we now change the lst list itself? As the output below shows, the change appears in both var1 and var2, and the same happens if we change lst through one of the variables.

lst[0] = "I have changed my lst list, heh"
print(f'var1 = {var1} and var2 = {var2} (after altering lst)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')

var1[2][1] = 567898765
print(f'var1 = {var1} and var2 = {var2} (after altering lst inside var1)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')
var1 = [555, 777, ['I have changed my lst list, heh', 232, -876.5]] and var2 = [555, 0, ['I have changed my lst list, heh', 232, -876.5]] (after altering lst)
var1 id = 1871666251328 and var2 id = 1871667322304
var1 is var2 -> False

var1 = [555, 777, ['I have changed my lst list, heh', 567898765, -876.5]] and var2 = [555, 0, ['I have changed my lst list, heh', 567898765, -876.5]] (after altering lst inside var1)
var1 id = 1871666251328 and var2 id = 1871667322304
var1 is var2 -> False

You can see that copy() made a copy of var2, so replacing top-level elements in one of the lists won’t affect the other. However, since both of them hold a reference to the same lst object, altering lst directly (or through var1 or var2) is visible everywhere! This is because copy() only makes a shallow copy, meaning that it copied the reference to lst and not its contents (it didn’t go inside, hence the name shallow). On the other hand, deepcopy() makes sure to copy all the values so that no references are shared, as shown below:

import copy

lst = ["cool", 232, -876.5]
var1 = [1, lst, 3]
var2 = [555, 777, lst]

print(f'var1 = {var1} and var2 = {var2} (initially)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')

var1 = copy.deepcopy(var2)

print(f'var1 = {var1} and var2 = {var2} (after deepcopy)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')

var2[1] -= 777
print(f'var1 = {var1} and var2 = {var2} (after altering var2)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')

lst[0] = "I have changed my lst list, heh"
print(f'var1 = {var1} and var2 = {var2} (after altering lst)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')

var1[2][1] = 567898765
print(f'var1 = {var1} and var2 = {var2} (after altering lst inside var1)')
print(f'var1 id = {id(var1)} and var2 id = {id(var2)}')
print(f'var1 is var2 -> {var1 is var2}\n')
var1 = [1, ['cool', 232, -876.5], 3] and var2 = [555, 777, ['cool', 232, -876.5]] (initially)
var1 id = 1871667529856 and var2 id = 1871667529920
var1 is var2 -> False

var1 = [555, 777, ['cool', 232, -876.5]] and var2 = [555, 777, ['cool', 232, -876.5]] (after deepcopy)
var1 id = 1871667322304 and var2 id = 1871667529920
var1 is var2 -> False

var1 = [555, 777, ['cool', 232, -876.5]] and var2 = [555, 0, ['cool', 232, -876.5]] (after altering var2)
var1 id = 1871667322304 and var2 id = 1871667529920
var1 is var2 -> False

var1 = [555, 777, ['cool', 232, -876.5]] and var2 = [555, 0, ['I have changed my lst list, heh', 232, -876.5]] (after altering lst)
var1 id = 1871667322304 and var2 id = 1871667529920
var1 is var2 -> False

var1 = [555, 777, ['cool', 567898765, -876.5]] and var2 = [555, 0, ['I have changed my lst list, heh', 232, -876.5]] (after altering lst inside var1)
var1 id = 1871667322304 and var2 id = 1871667529920
var1 is var2 -> False
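
Instead of comparing printed values, you can also ask Python directly whether the inner list is shared, using the is operator. The following is a minimal sketch; the names inner, original, shallow and deep are chosen for this illustration only.

```python
import copy

inner = ['cool', 232, -876.5]
original = [555, 777, inner]

shallow = copy.copy(original)
deep = copy.deepcopy(original)

print(shallow is original)        # False: the outer list is new
print(shallow[2] is original[2])  # True: the inner list is shared
print(deep[2] is original[2])     # False: the inner list was copied too
print(deep[2] == original[2])     # True: but its contents are equal

inner.append('new')
print(shallow[2])                 # ['cool', 232, -876.5, 'new']
print(deep[2])                    # ['cool', 232, -876.5]
```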

Tl;dr
Suppose you have a mutable object lst stored inside another object var2, and you want to make a copy of var2. The two options are:

1) Do you want var2_copy to be altered once you alter lst? Then perform a shallow copy.
*Keep in mind that altering lst through var2 will also alter the original lst.

2) Do you want var2_copy to NOT be altered once you alter lst? Then perform a deep copy.

Below is one last example of this.

Shallow copy

print('------SHALLOW COPY------')
lst = ['lst0','lst1','lst2'] 
print('original lst', lst)
var2 = [lst,'var[1]'] 
print('original var2', var2)
var2_copy = copy.copy(var2) 
print('original var2_copy', var2_copy)
------SHALLOW COPY------
original lst ['lst0', 'lst1', 'lst2']
original var2 [['lst0', 'lst1', 'lst2'], 'var[1]']
original var2_copy [['lst0', 'lst1', 'lst2'], 'var[1]']

In the following code we alter the first element inside the first element of var2. var2[0] means the first element of var2, which happens to be lst. Similarly, var2[0][0] means the first element of var2[0], i.e., the first element of lst. Because the shallow copy shares lst, changing it also affects var2 and var2_copy.

var2[0][0] = ["ALTERED"] 

print('altered lst', lst)
print('altered var2', var2)
print('altered var2_copy', var2_copy)
altered lst [['ALTERED'], 'lst1', 'lst2']
altered var2 [[['ALTERED'], 'lst1', 'lst2'], 'var[1]']
altered var2_copy [[['ALTERED'], 'lst1', 'lst2'], 'var[1]']

Deep copy

print('------DEEP COPY------')
lst = ['lst0','lst1','lst2'] 
print('original lst', lst) 
var2 = [lst,'var[1]'] 
print('original var2', var2)
var2_copy = copy.deepcopy(var2) 
print('original var2_copy', var2_copy)
------DEEP COPY------
original lst ['lst0', 'lst1', 'lst2']
original var2 [['lst0', 'lst1', 'lst2'], 'var[1]']
original var2_copy [['lst0', 'lst1', 'lst2'], 'var[1]']

In the following code we again alter the first element inside the first element of var2. var2[0] means the first element of var2, which happens to be lst, and var2[0][0] means the first element of lst. In this case lst and var2 are affected, but var2_copy is not.

var2[0][0] = ['ALTERED']

print('altered lst', lst)
print('altered var2', var2)
print('NOT altered var2_copy', var2_copy)
altered lst [['ALTERED'], 'lst1', 'lst2']
altered var2 [[['ALTERED'], 'lst1', 'lst2'], 'var[1]']
NOT altered var2_copy [['lst0', 'lst1', 'lst2'], 'var[1]']

If you would like the variable lst to stay unaltered when you alter the element var2[0][0], you have to store a copy of lst inside var2, and not lst itself, as shown below:

print('------SHALLOW COPY------')
lst = ['lst[0]','lst[1]','lst[2]'] 
print('original lst', lst)
var2 = [copy.copy(lst),'var[1]']  # store a copy of lst, not lst itself
print('original var2', var2)
var2_copy = copy.copy(var2) 
print('original var2_copy', var2_copy)

var2[0][0] = ["ALTERED"] 

print('NOT altered lst', lst)
print('altered var2', var2)
print('altered var2_copy', var2_copy)


print('\n------DEEP COPY------')
lst = ['lst[0]','lst[1]','lst[2]'] 
print('original lst', lst) 
var2 = [copy.copy(lst),'var[1]'] 
print('original var2', var2)
var2_copy = copy.deepcopy(var2) 
print('original var2_copy', var2_copy)

var2[0][0] = ['ALTERED']

print('NOT altered lst', lst)
print('altered var2', var2)
print('NOT altered var2_copy', var2_copy)
------SHALLOW COPY------
original lst ['lst[0]', 'lst[1]', 'lst[2]']
original var2 [['lst[0]', 'lst[1]', 'lst[2]'], 'var[1]']
original var2_copy [['lst[0]', 'lst[1]', 'lst[2]'], 'var[1]']
NOT altered lst ['lst[0]', 'lst[1]', 'lst[2]']
altered var2 [[['ALTERED'], 'lst[1]', 'lst[2]'], 'var[1]']
altered var2_copy [[['ALTERED'], 'lst[1]', 'lst[2]'], 'var[1]']

------DEEP COPY------
original lst ['lst[0]', 'lst[1]', 'lst[2]']
original var2 [['lst[0]', 'lst[1]', 'lst[2]'], 'var[1]']
original var2_copy [['lst[0]', 'lst[1]', 'lst[2]'], 'var[1]']
NOT altered lst ['lst[0]', 'lst[1]', 'lst[2]']
altered var2 [[['ALTERED'], 'lst[1]', 'lst[2]'], 'var[1]']
NOT altered var2_copy [['lst[0]', 'lst[1]', 'lst[2]'], 'var[1]']
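
The same rules apply to other containers, not only to lists inside lists. Below is a hedged sketch with a dictionary that holds a list; the settings dictionary and its keys are invented for this example.

```python
import copy

# A hypothetical nested structure: a dict that contains a mutable list.
settings = {'name': 'run-1', 'layers': [16, 32]}

shallow = copy.copy(settings)
deep = copy.deepcopy(settings)

settings['layers'].append(64)   # mutate the nested list in place
settings['name'] = 'run-2'      # rebind a top-level key in the original only

print(shallow)   # {'name': 'run-1', 'layers': [16, 32, 64]}
print(deep)      # {'name': 'run-1', 'layers': [16, 32]}
```

The shallow copy keeps its own top-level keys but still shares the nested list, while the deep copy is fully independent of the original.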

We understand that this topic can be a bit confusing, so don’t hesitate to ask questions! Our main goal is to make you familiar with different behaviors of Python, and we hope that it will help you to debug/understand your programs better!

Additional study material:

After this Notebook you should be able to:

  • understand the difference between objects and references

  • understand the difference between copy and deepcopy