Classes and objects#
Education objectives
class, type, objects, attribute, methods
special methods (“dunder”)
OOP and encapsulation
Object-oriented programming: encapsulation#
Python is also an object-oriented language. For some problems, Object-Oriented Programming (OOP) is a very efficient paradigm. Many libraries use it, so it is worth understanding what object-oriented programming is, when it is useful and how it can be used in Python.
In this notebook, we are just going to consider the OOP notion of encapsulation and won’t study the more complicated concept of inheritance.
Concepts#
Object
An object is an entity that has a state and a behaviour. Objects are the basic elements of an object-oriented system.
Class
Classes are “families” of objects. A class is a pattern that describes how objects will be built.
Introduction based on the complex type#
These concepts are so important for Python that we already used many objects and classes.
In particular, str, list and dict are “types”, or “classes”. In Python, these two
names basically mean the same thing. We tend to use “types” for built-in types and “classes” for
types defined in libraries or in user code.
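A quick check, as a minimal sketch using only builtins, illustrates that “type” and “class” name the same thing: type(obj) returns the class of obj, and the classes themselves are objects whose type is type.

```python
# type() returns the class of an object
assert type("text") is str
assert type([1, 2]) is list
assert type({"a": 1}) is dict

# isinstance() checks whether an object is an instance of a class
assert isinstance("text", str)
assert not isinstance("text", list)

# classes are objects too: their type is `type`
assert type(str) is type
print(str, list, dict)
```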
We have also used complex to do things like:
complex_number = complex("1j")
Here, we have just instantiated (i.e. created an instance of) the builtin type
complex.
We can use the dir function to get its attribute names. We filter out the names
starting with __ since they are special methods.
[name for name in dir(complex_number) if not name.startswith("__")]
['conjugate', 'imag', 'real']
real and imag are simple attributes and conjugate is a method (which can be
called):
complex_number.real
0.0
result = complex_number.conjugate()
result
-1j
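The distinction between simple attributes and methods can also be checked programmatically. The sketch below uses the builtins getattr and callable; the variable name kinds is only illustrative.

```python
complex_number = complex("1j")

names = [name for name in dir(complex_number) if not name.startswith("__")]
# callable() is True for methods and False for plain data attributes
kinds = {name: callable(getattr(complex_number, name)) for name in names}
print(kinds)

assert kinds["conjugate"] is True
assert kinds["real"] is False
assert kinds["imag"] is False
```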
We are now going to see how to define our own Complex class.
Attributes and __init__ special method#
You remember that it is better to first define a test function which defines what we want. Let us start with very simple requirements.
def test_complex_attributes(cls):
    number = cls("1j")
    assert number.imag == 1.0
    assert number.real == 0.0
    number = cls("1")
    assert number.imag == 0.0
    assert number.real == 1.0
    number = cls(1)
    assert number.imag == 0.0
    assert number.real == 1.0
We can check if it works with the builtin complex type:
test_complex_attributes(complex)
No AssertionError is raised, which indicates that our test is reasonable.
We can start with an implementation that is too simple:
class Complex:
    """Our own complex class"""
test_complex_attributes(Complex)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[9], line 1
----> 1 test_complex_attributes(Complex)
Cell In[6], line 2, in test_complex_attributes(cls)
1 def test_complex_attributes(cls):
----> 2 number = cls("1j")
3 assert number.imag == 1.0
4 assert number.real == 0.0
TypeError: Complex() takes no arguments
We need to improve our implementation, which can lead to something like:
class Complex:
    def __init__(self, obj):
        if isinstance(obj, str):
            obj = obj.strip()
            if obj.endswith("j"):
                self.real = 0.0
                self.imag = float(obj[:-1])
                # warning: early return
                return
        self.real = float(obj)
        self.imag = 0.0
We defined a class with one __init__ method. Note that methods take as their first
argument a variable named self. The name self is just a convention, but in practice it
is nearly always used. This first argument is the object on which the method is called.
Note
We are going to understand that better in a few minutes, but the __init__ method is really
not well suited to explaining this mechanism. So we will first see how this works for a simpler
method and then come back to the __init__ case.
Let us check if this implementation meets our requirements:
test_complex_attributes(Complex)
No AssertionError means that this implementation is sufficient.
Add the conjugate method#
We are now going to focus on the conjugate method with this test:
def test_complex_conjugate(cls):
    number = cls("1j").conjugate()
    assert number.imag == -1.0
    assert number.real == 0.0
test_complex_conjugate(complex)
test_complex_conjugate(Complex)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[14], line 1
----> 1 test_complex_conjugate(Complex)
Cell In[12], line 2, in test_complex_conjugate(cls)
1 def test_complex_conjugate(cls):
----> 2 number = cls("1j").conjugate()
3 assert number.imag == -1.0
4 assert number.real == 0.0
AttributeError: 'Complex' object has no attribute 'conjugate'
As expected, we have an exception. Let us modify our Complex class to fix that.
class Complex:
    def __init__(self, real=0.0, imag=0.0):
        if isinstance(real, str):
            real = real.strip()
            if real.endswith("j"):
                if imag != 0.0:
                    raise TypeError(
                        "Complex() can't take second arg if first is a string"
                    )
                self.real = 0.0
                self.imag = float(real[:-1])
                return
        self.real = float(real)
        self.imag = imag

    def conjugate(self):
        """Return the complex conjugate of its argument."""
        return Complex(real=self.real, imag=-self.imag)
Let’s check if it is sufficient:
test_complex_conjugate(Complex)
Note
Numbers in Python are immutable. Complex.conjugate returns a
new object and does not modify the object used for the call.
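For the builtin type, immutability is enforced: assigning to real or imag raises an AttributeError. Our Complex class, as written above, does not enforce this, since its attributes are ordinary instance attributes. A short sketch for the builtin type (the names z and w are only illustrative):

```python
z = complex(1, 2)
w = z.conjugate()

# conjugate() returned a new object; z is unchanged
assert (z.real, z.imag) == (1.0, 2.0)
assert (w.real, w.imag) == (1.0, -2.0)

# the attributes of the builtin type are read-only
try:
    z.imag = 5.0
except AttributeError as error:
    print(f"AttributeError: {error}")
```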
We can now come back to this weird self argument and note that:
number = Complex(imag=4)
assert Complex.conjugate(number).imag == -number.imag
assert number.conjugate().imag == -number.imag
Important
We now understand the purpose of the first argument of a method (self). It is the
object on which the method is called.
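One way to see this mechanism at work is to look at a bound method. In the sketch below, Greeter is a hypothetical class used only for illustration; __self__ and __func__ are standard attributes of bound method objects.

```python
class Greeter:
    """Hypothetical class, used only to illustrate `self`."""

    def __init__(self, name):
        self.name = name

    def greet(self):
        return f"Hello from {self.name}"


alice = Greeter("Alice")

# accessing the method through the object "binds" the object to it
bound = alice.greet
assert bound.__self__ is alice
assert bound.__func__ is Greeter.greet

# so these three calls are equivalent
assert alice.greet() == bound() == Greeter.greet(alice)
```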
Special (“dunder”) methods#
Special methods (also known as “dunder methods”, for “double underscore”) are methods whose names start and end with __.
They are used to define how objects behave in specific situations. Python objects have a
lot of dunder methods:
complex_number = complex(imag=2)
[name for name in dir(complex_number) if name.startswith("__")]
['__abs__',
'__add__',
'__bool__',
'__class__',
'__complex__',
'__delattr__',
'__dir__',
'__doc__',
'__eq__',
'__format__',
'__ge__',
'__getattribute__',
'__getnewargs__',
'__getstate__',
'__gt__',
'__hash__',
'__init__',
'__init_subclass__',
'__le__',
'__lt__',
'__mul__',
'__ne__',
'__neg__',
'__new__',
'__pos__',
'__pow__',
'__radd__',
'__reduce__',
'__reduce_ex__',
'__repr__',
'__rmul__',
'__rpow__',
'__rsub__',
'__rtruediv__',
'__setattr__',
'__sizeof__',
'__str__',
'__sub__',
'__subclasshook__',
'__truediv__']
Let’s now print a complex object. In IPython, we can just use its name for the last
instruction of a cell:
complex_number
2j
Note that we can get the same string by calling the builtin function str:
str(complex_number)
'2j'
Or by directly calling the special method __str__ of the object:
complex_number.__str__()
'2j'
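This is approximately what happens when we just write complex_number. IPython
produces a string much like str(complex_number) does, and the str function calls
complex_number.__str__(). (Strictly speaking, IPython relies on repr rather than str; for
complex numbers both give the same string.)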
Let us see what happens for our object:
number = Complex(imag=2)
number
<__main__.Complex at 0x7f45e052e570>
Hmm, not great. Let us write a test for this behaviour.
def test_complex_str(cls):
    number = cls(imag=2)
    assert str(number) == "2j"
Does it pass with the builtin complex type?
test_complex_str(complex)
Does it fail with our own type?
test_complex_str(Complex)
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
Cell In[25], line 1
----> 1 test_complex_str(Complex)
Cell In[23], line 3, in test_complex_str(cls)
1 def test_complex_str(cls):
2 number = cls(imag=2)
----> 3 assert str(number) == "2j"
AssertionError:
Good, let’s work on this:
class Complex:
    def __init__(self, real=0.0, imag=0.0):
        if isinstance(real, str):
            real = real.strip()
            if real.endswith("j"):
                if imag != 0.0:
                    raise TypeError(
                        "Complex() can't take second arg if first is a string"
                    )
                self.real = 0.0
                self.imag = float(real[:-1])
                return
        self.real = float(real)
        self.imag = imag

    def conjugate(self):
        """Return the complex conjugate of its argument."""
        return Complex(real=self.real, imag=-self.imag)

    def __str__(self):
        if self.real == 0.0:
            return f"{self.imag}j"
        return f"{self.real} + {self.imag}j"
Does it work better now?
test_complex_str(Complex)
Note on test coverage problem
Note that the last line of the class definition
(return f"{self.real} + {self.imag}j") is not tested. This is bad: it could be broken by a
later modification and the tests would still pass. For real-life code, one should measure and
try to maximize the test coverage, which is approximately defined as the percentage of lines
executed by the tests.
Difference between __str__ and __repr__
We don’t care too much at this point, but these two different special methods exist. In a few words:
The goal of __str__ is to be readable.
__repr__ has to be unambiguous.
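The standard library's datetime.date gives a compact illustration of the difference. The sketch also shows that containers use repr for their elements.

```python
import datetime

day = datetime.date(2024, 1, 31)

# __str__: readable
assert str(day) == "2024-01-31"
# __repr__: unambiguous, here it is even valid Python to rebuild the object
assert repr(day) == "datetime.date(2024, 1, 31)"

# containers display their elements with repr()
assert str([day]) == "[datetime.date(2024, 1, 31)]"
```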
Back to the __init__ special method#
number = Complex("1j")
is actually equivalent to:
# create a non-initialized object
# (no need to study and understand this line)
number = Complex.__new__(Complex)
# initialization of the object
number.__init__("1j")
You should now understand that the last line is equivalent to:
Complex.__init__(number, "1j")
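This two-step construction can be checked on a small example. Point is a hypothetical class introduced only for this sketch.

```python
class Point:
    """Hypothetical class, used only to illustrate object construction."""

    def __init__(self, x):
        self.x = x


# the usual way
a = Point(3)

# the two explicit steps: creation, then initialization
b = Point.__new__(Point)
assert not hasattr(b, "x")  # created but not yet initialized
Point.__init__(b, 3)

assert isinstance(b, Point)
assert a.x == b.x == 3
```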
Example: the weather stations#
Solution 0: a list of lists#
Let us suppose we have a set of weather stations that measure wind speed and temperature, and that we want to compute some statistics on these data. A basic representation of a station is a list of two lists: wind values and temperature values.
paris = [[10, 0, 20, 30, 20, 0], [1, 5, 1, -1, -1, 3]]
# get wind when temperature is maximal
idx_max_temp = paris[1].index(max(paris[1]))
print(f"max temp is {paris[1][idx_max_temp]}°C at index {idx_max_temp} ")
print(f"wind speed at max temp = {paris[0][idx_max_temp]} km/h")
max temp is 5°C at index 1
wind speed at max temp = 0 km/h
Solution 1: a dict of lists#
We can use a dictionary:
paris = {"wind": [10, 0, 20, 30, 20, 0], "temperature": [1, 5, 1, -1, -1, 3]}
# get wind when temperature is maximal
paris_temp = paris["temperature"]
idx_max_temp = paris_temp.index(max(paris_temp))
print(f"max temp is {paris_temp[idx_max_temp]}°C at index {idx_max_temp}")
print(f"wind speed at max temp = {paris['wind'][idx_max_temp]} km/h")
max temp is 5°C at index 1
wind speed at max temp = 0 km/h
Comments#
Pro
More readable code (reading paris["temperature"] is clearer than paris[1]).
Less error-prone code (using words as keys avoids index numbers, which are easily mixed up and lead to code that is hard to read and debug).
Con
The code to compute the final result is not very readable.
Solution 2: add functions#
paris = {"wind": [10, 0, 20, 30, 20, 0], "temperature": [1, 5, 1, -1, -1, 3]}
def max_temp(station):
    """returns the maximum temperature available in the station"""
    return max(station["temperature"])

def arg_max_temp(station):
    """returns the index of maximum temperature available in the station"""
    max_temperature = max_temp(station)
    return station["temperature"].index(max_temperature)

idx_max_temp = arg_max_temp(paris)
print(f"max temp is {max_temp(paris)}°C at index {arg_max_temp(paris)}")
print(f"wind speed at max temp = {paris['wind'][idx_max_temp]} km/h")
max temp is 5°C at index 1
wind speed at max temp = 0 km/h
Comments#
Pro:
Adding functions leads to code that is easier to read, hence easier to debug.
Functions can be tested separately from the rest of the code.
The computation done in the second part depends only on the functions' interfaces (i.e. on what they take and return, not on how they are implemented).
Adding functions allows code reuse: computing the max temperature is something one could want to do in other places.
Con
We rely on the fact that the dictionaries have been built correctly (for example, that the wind and temperature lists have the same length).
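To illustrate this weakness, here is a sketch with a deliberately malformed station (the name broken is hypothetical). Nothing complains when the dictionary is built; the error only shows up later, when the index is used.

```python
def max_temp(station):
    return max(station["temperature"])


def arg_max_temp(station):
    return station["temperature"].index(max_temp(station))


# the wind list is shorter than the temperature list
broken = {"wind": [10, 0, 20], "temperature": [1, 2, 1, -1, -1, 3]}

idx_max_temp = arg_max_temp(broken)  # a valid index for temperatures only
try:
    broken["wind"][idx_max_temp]
except IndexError as error:
    print(f"IndexError: {error}")
```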
Solution 3: init function#
Define a function that builds the station (delegate the generation of the station dictionary to a function).
def build_station(wind, temp):
    """Build a station given wind and temp
    :param wind: (list) floats of winds
    :param temp: (list) float of temperatures
    """
    if len(wind) != len(temp):
        raise ValueError("wind and temperature should have the same size")
    return {"wind": list(wind), "temperature": list(temp)}

def max_temp(station):
    """returns the maximum temperature available in the station"""
    return max(station["temperature"])

def arg_max_temp(station):
    """returns the index of maximum temperature available in the station"""
    max_temperature = max_temp(station)
    return station["temperature"].index(max_temperature)

paris = build_station([10, 0, 20, 30, 20, 0], [1, 5, 1, -1, -1, 3])
idx_max_temp = arg_max_temp(paris)
print(f"max temp is {max_temp(paris)}°C at index {arg_max_temp(paris)}")
print(f"wind speed at max temp = {paris['wind'][idx_max_temp]} km/h")
max temp is 5°C at index 1
wind speed at max temp = 0 km/h
Comments#
If the dedicated function build_station is used, the returned dictionary is well structured.
If one changes build_station, only max_temp and arg_max_temp have to be changed accordingly.
We copy the inputs with list(...) so that the wind and temp parameters can be provided by any ordered iterable (e.g. see test_build_station_with_iterable with range).
BUT if we have a new kind of station, e.g. one that holds only wind and humidity, we would like to make it impossible to use max_temp with it.
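The test mentioned above is not shown in this notebook. Here is a sketch of what it could look like, assuming the build_station function defined above (repeated here so that the example is self-contained). The body of the first test is a guess, and the second test name is hypothetical.

```python
def build_station(wind, temp):
    """Build a station given wind and temp (same function as above)."""
    if len(wind) != len(temp):
        raise ValueError("wind and temperature should have the same size")
    return {"wind": list(wind), "temperature": list(temp)}


def test_build_station_with_iterable():
    # range objects are ordered and have a length, so they are accepted
    station = build_station(range(3), range(10, 13))
    assert station == {"wind": [0, 1, 2], "temperature": [10, 11, 12]}


def test_build_station_size_mismatch():
    try:
        build_station([1, 2], [1])
    except ValueError:
        pass
    else:
        raise AssertionError("a ValueError was expected")


test_build_station_with_iterable()
test_build_station_size_mismatch()
```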
Solution 4: using a class#
We would like to “embed” max_temp and arg_max_temp in the “dictionary
station” in order to address the last point.
And here comes object-oriented programming!
A class defines a template used for building objects. In our example, the class (named
WeatherStation) defines the specification of what a weather station is (i.e. a
weather station should contain a list of wind speeds, named “wind”, and a list of
temperatures, named “temp”). paris should now be an object that meets this
specification. It is called an instance of the class WeatherStation.
When defining the class, we need to define how to initialize the object (special
“function” __init__).
class WeatherStation(object):
    """A weather station that holds wind and temperature
    :param wind: any ordered iterable
    :param temperature: any ordered iterable
    wind and temperature must have the same length.
    """

    def __init__(self, wind, temperature):
        """initialize the weather station.
        Precondition: wind and temperature must have the same length.
        ValueError is raised if this is not the case
        :param wind: any ordered iterable
        :param temperature: any ordered iterable"""
        self.wind = list(wind)
        self.temp = list(temperature)
        if len(self.wind) != len(self.temp):
            raise ValueError(
                "wind and temperature should have the same size"
                f" got len(wind)={len(self.wind)} vs "
                f" len(temp)={len(self.temp)}"
            )

    def max_temp(self):
        """returns the maximum temperature recorded in the station"""
        return max(self.temp)

    def arg_max_temp(self):
        """returns the index of (one of the) maximum temperature recorded in the station"""
        return self.temp.index(self.max_temp())

paris = WeatherStation([10, 0, 20, 30, 20, 0], [1, 5, 1, -1, -1, 3])
idx_max_temp = paris.arg_max_temp()
print(f"max temp is {paris.max_temp()}°C at index {paris.arg_max_temp()}")
print(f"wind speed at max temp = {paris.wind[idx_max_temp]} km/h")
max temp is 5°C at index 1
wind speed at max temp = 0 km/h
Comments#
The max_temp and arg_max_temp functions are now part of the class WeatherStation. Functions attached to classes are named methods. Similarly, the wind and temp lists are also now part of this class. Variables attached to classes are named members or attributes.
If the max_temp method is called in many places, we can improve it by caching the result. This will not affect code that uses the class.
The arg_max_temp method should be rewritten, as we implicitly check equality of floats.
An object (here paris) thus contains both attributes (holding data for example) and
methods to access and/or process the data.
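As a sketch of the rewrite suggested above, arg_max_temp can select the index directly instead of searching for a value equal to the maximum. The class below is a condensed copy of WeatherStation (the length check is omitted for brevity); only arg_max_temp differs. Because the change stays inside the class, code that calls paris.arg_max_temp() is unaffected.

```python
class WeatherStation:
    """Condensed copy of the class above (length check omitted)."""

    def __init__(self, wind, temperature):
        self.wind = list(wind)
        self.temp = list(temperature)

    def max_temp(self):
        return max(self.temp)

    def arg_max_temp(self):
        # pick the index whose temperature is largest: no float equality test
        return max(range(len(self.temp)), key=self.temp.__getitem__)


paris = WeatherStation([10, 0, 20, 30, 20, 0], [1, 5, 1, -1, -1, 3])
assert paris.arg_max_temp() == 1
assert paris.temp[paris.arg_max_temp()] == paris.max_temp()
```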
Exercise 21 (Try to code with class)
Add a method (perceived_temp) that takes as input a temperature and a wind speed and returns the perceived temperature, i.e. taking into account the wind chill effect.
Modify max_temp and arg_max_temp so that they take an additional optional boolean parameter (e.g. perceived, defaulting to False). If perceived is False, the methods have the same behaviour as before. If perceived is True, the temperatures to process are the perceived temperatures.
Solution to Exercise 21 (Try to code with class)
class WeatherStation(object):
    """A weather station that holds wind and temperature"""

    def __init__(self, wind, temperature):
        """initialize the weather station.
        Precondition: wind and temperature must have the same length
        ValueError is raised if this is not the case
        :param wind: any ordered iterable
        :param temperature: any ordered iterable"""
        self.wind = [x for x in wind]
        self.temp = [x for x in temperature]
        if len(self.wind) != len(self.temp):
            raise ValueError(
                "wind and temperature should have the same size"
                f" got len(wind)={len(self.wind)} vs "
                f" len(temp)={len(self.temp)}"
            )

    def perceived_temp(self, index):
        """computes the perceived temp according to
        https://en.wikipedia.org/wiki/Wind_chill
        i.e. The standard Wind Chill formula for Environment Canada is:
        apparent = 13.12 + 0.6215*air_temp - 11.37*wind_speed^0.16 + 0.3965*air_temp*wind_speed^0.16
        :param index: the index for which the computation must be made
        :return: the perceived temperature"""
        air_temp = self.temp[index]
        wind_speed = self.wind[index]
        # Perceived temperature is not meaningful without wind...
        if wind_speed == 0:
            apparent_temp = air_temp
        else:
            apparent_temp = (
                13.12
                + 0.6215 * air_temp
                - 11.37 * wind_speed**0.16
                + 0.3965 * air_temp * wind_speed**0.16
            )
        # Let's round to avoid trailing decimals...
        return round(apparent_temp, 2)

    def perceived_temperatures(self):
        """Returns a list of perceived temperatures computed from the temperature and wind speed data"""
        apparent_temps = []
        for index in range(len(self.wind)):
            # Reusing the method perceived_temp defined above
            apparent_temperature = self.perceived_temp(index)
            apparent_temps.append(apparent_temperature)
        return apparent_temps

    def max_temp(self, perceived=False):
        """returns the maximum temperature recorded in the station"""
        if perceived:
            apparent_temp = self.perceived_temperatures()
            return max(apparent_temp)
        else:
            return max(self.temp)

    def arg_max_temp(self, perceived=False):
        """returns the index of (one of the) maximum temperature recorded in the station"""
        if perceived:
            temp_array_to_search = self.perceived_temperatures()
        else:
            temp_array_to_search = self.temp
        return temp_array_to_search.index(self.max_temp(perceived))

Comments#
The wind array was changed to have different maximum temperatures for the air and perceived temperatures: for air temperatures, the max is 5°C (with a wind speed of 50 km/h). For perceived temperatures, the max is 3°C (as the wind speed is 0).
It was a choice to set the apparent/perceived temperature to the air temperature if the wind speed is 0, so the tests were written with this in mind. Testing such choices makes the expected inputs and outputs explicit.
isinstance allows us to test the type of an object (in this case, we test if apparent_temps is a list).
When testing a boolean in if structures, use if perceived: rather than if perceived == True:. For booleans it is equivalent, but clearer and shorter!
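The tests themselves are not reproduced here. The sketch below restates the wind chill formula as a standalone function (the names wind_chill, wind and temp are hypothetical) and checks the behaviour described in the first two comments. The wind values are an assumption chosen to match that description: 50 km/h at the 5°C reading and no wind at the 3°C reading.

```python
def wind_chill(air_temp, wind_speed):
    """Perceived temperature (°C) from air temperature (°C) and wind (km/h)."""
    if wind_speed == 0:
        # design choice: without wind, perceived == air temperature
        return air_temp
    factor = wind_speed**0.16
    apparent = 13.12 + 0.6215 * air_temp - 11.37 * factor + 0.3965 * air_temp * factor
    return round(apparent, 2)


wind = [10, 50, 20, 30, 20, 0]  # assumed values, not taken from the notebook
temp = [1, 5, 1, -1, -1, 3]
perceived = [wind_chill(t, w) for t, w in zip(temp, wind)]

# the air temperature peaks at index 1, the perceived temperature at index 5
assert temp.index(max(temp)) == 1
assert perceived.index(max(perceived)) == 5
assert max(perceived) == 3
```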
Coming next: inheritance#
What if we now have a weather station that also measures humidity?
Do we need to rewrite everything?
What if we rewrite everything and we find a bug?
Here comes inheritance.
Comments on this solution#
Many problems with Solution 0 (the list of lists):
If the number of measurements increases (e.g. adding rainfall, humidity, …), the previous indexing will no longer be valid (what will paris[5] represent? wind, temperature, …?).
Code analysis is not (that) straightforward.