5 mistakes to never make again in Python

This article is meant to help you write cleaner Python code and stop making mistakes that could be slowing your code down, or even keeping your program from doing what you want.
Let's get right into it!
Mistake 1: Not using built-in functions
The reference implementation of Python (CPython) is written in C, and Python is a higher-level language than C. Python code is executed by an interpreter, which adds overhead to every operation compared with compiled C. Now, the thing is that many built-in functions and methods in Python are actually implemented in C and are therefore usually way faster than equivalent functions written by the user in Python.
As a demonstration, let's try to count values in a list:
import time
test_list = [x%10 for x in range(1000000)]
def python_count(array, value):
    n = 0
    for element in array:
        if element == value:
            n += 1
    return n
#Measuring the user-implemented function runtime
t_start = time.perf_counter()
found = python_count(test_list, 5)
t_end = time.perf_counter()
print(f"found {found} matching values in {t_end-t_start} seconds (user implemented)")
#Measuring the built-in function runtime
t_start = time.perf_counter()
found = test_list.count(5)
t_end = time.perf_counter()
print(f"found {found} matching values in {t_end-t_start} seconds (built-in)")
Output:
found 100000 matching values in 0.02308309997897595 seconds (user implemented)
found 100000 matching values in 0.007067200029268861 seconds (built-in)
So, if you have the possibility of using a built-in function, do it!
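A single measurement like the one above can be noisy, so here is a minimal sketch of the same comparison using the standard timeit module, which repeats each call several times. The list size and the number/repeat values are arbitrary choices for illustration, and the timings will differ from machine to machine.

```python
import timeit

# Smaller list than above so the repeated runs stay quick (arbitrary size)
test_list = [x % 10 for x in range(100_000)]

def python_count(array, value):
    n = 0
    for element in array:
        if element == value:
            n += 1
    return n

# Run each version 10 times per measurement, take the best of 3 measurements
loop_time = min(timeit.repeat(lambda: python_count(test_list, 5), number=10, repeat=3))
builtin_time = min(timeit.repeat(lambda: test_list.count(5), number=10, repeat=3))

print(f"user implemented: {loop_time:.4f} s for 10 runs")
print(f"built-in:         {builtin_time:.4f} s for 10 runs")
```

Taking the minimum of several repeats is a common way to reduce the influence of other processes running on the machine.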
One type of function people often write themselves (because it's one of the first concepts they learn about) is the sorting function. Although sorting algorithms are very interesting to understand and everyone should write their own sorting function at least once, in a real project, just use the built-in Python sorting function.
To give you a quick idea, in a runtime comparison of three sorting functions that all have the same time complexity, the quickest one is timsort_c, which is the built-in sorting function sorted() (Timsort is the algorithm used for the built-in sorting function in Python).
To learn more about sorting speed and time complexity, read this.
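As a quick illustration of how far the built-in sorting tools go, here is a small sketch using sorted() with its key and reverse parameters, plus the in-place list.sort(). The records list is made-up sample data.

```python
# Made-up sample data: (name, count) pairs
records = [("pear", 3), ("apple", 7), ("fig", 1)]

# sorted() returns a new list; key picks the value to sort by
by_count = sorted(records, key=lambda record: record[1])
print(by_count)  # [('fig', 1), ('pear', 3), ('apple', 7)]

# reverse=True sorts in descending order
by_name_desc = sorted(records, key=lambda record: record[0], reverse=True)
print(by_name_desc)  # [('pear', 3), ('fig', 1), ('apple', 7)]

# list.sort() sorts the list in place and returns None
numbers = [5, 2, 9, 1]
numbers.sort()
print(numbers)  # [1, 2, 5, 9]
```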
Mistake 2: Copying lists improperly
This one was very frustrating for me personally when I started with Python, and that's why I chose to put it on this list. Assigning a list to a new variable does not copy it: both names refer to the same object, so any modification made through one name shows up through the other. On top of that, there are two kinds of real copies, "shallow" and "deep". A shallow copy creates a new outer object whose elements are still references to the same objects as in the original, while a deep copy also copies the objects nested inside.
To demonstrate this, look at this code:
array1 = [1,2,3,4,5]
array2 = array1
array2[0] = 99
print(array1)
Output:
[99, 2, 3, 4, 5]
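Strange, isn't it? You might expect the first list to stay unchanged, but it didn't, because array2 = array1 copies nothing: both names point to the same list.
To work around that, we need to make an actual copy of the first object. For a flat list like this one, a shallow copy (array1.copy() or list(array1)) would be enough, but copy.deepcopy() from Python's copy module also handles nested objects, so we'll use it here: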
import copy
array1 = [1,2,3,4,5]
array2 = copy.deepcopy(array1)
array2[0] = 99
print(array1)
Output:
[1, 2, 3, 4, 5]
Deep copying recursively clones the object and everything it contains, creating a new, fully independent object.
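The difference between a plain assignment, a shallow copy and a deep copy only really shows up with nested objects. Here is a small sketch with a list of lists (the variable names are just for illustration):

```python
import copy

nested = [[1, 2], [3, 4]]

alias = nested                 # no copy at all: a second name for the same list
shallow = copy.copy(nested)    # new outer list, but the inner lists are shared
deep = copy.deepcopy(nested)   # new outer list and new inner lists

nested[0][0] = 99              # modify an inner list
nested.append([5, 6])          # modify the outer list

print(alias)    # [[99, 2], [3, 4], [5, 6]]
print(shallow)  # [[99, 2], [3, 4]]
print(deep)     # [[1, 2], [3, 4]]
```

The shallow copy is unaffected by the append, but it still sees the change made inside the shared inner list. Only the deep copy stays completely untouched.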
Mistake 3: Measuring time performance with time.time()
The time.time() function isn't actually the best function for measuring the performance of a process. This is because it relies on the system clock, which can be adjusted while your code is running (for example by a time synchronization or a manual change), making the measurement unreliable. Instead, use time.monotonic() or time.perf_counter().
time.monotonic() is instead based on a monotonic clock, which is defined as a clock that "cannot go backward", which makes it more appropriate for measuring the difference between two times. The important thing to remember, however, is that, unlike the time.time() function, the time.monotonic() function has no defined reference point, which means that the only meaningful measure using this function is the difference between two measured times.
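time.perf_counter() also uses a different clock: a performance counter with the highest available resolution, which makes it well suited to measuring short durations. As with time.monotonic(), its reference point is undefined, so only the difference between two readings is meaningful.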
Here is a program showcasing those functions:
import time
#comparing time.time() to monotonic() and perf_counter()
def measure_loop(f):
    n = 0
    t_start = f()
    for i in range(1000):
        n += 1
    t_end = f()
    print(f"{f.__name__}, {t_end-t_start}")
measure_loop(time.time)
measure_loop(time.monotonic)
measure_loop(time.perf_counter)
Output:
time, 0.0
monotonic, 0.0
perf_counter, 2.9999995604157448e-05
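If you want to check how these clocks behave on your own machine, the standard library offers time.get_clock_info(). The sketch below simply prints a few of the reported properties; the describe_clocks helper is a name made up for this example, and the values depend on your platform.

```python
import time

def describe_clocks(names=("time", "monotonic", "perf_counter")):
    """Collect (adjustable, monotonic, resolution) for each named clock."""
    table = {}
    for name in names:
        info = time.get_clock_info(name)
        table[name] = (info.adjustable, info.monotonic, info.resolution)
    return table

clock_table = describe_clocks()
for name, (adjustable, monotonic, resolution) in clock_table.items():
    print(f"{name:<13} adjustable={adjustable} monotonic={monotonic} resolution={resolution}")
```

The adjustable field tells you whether the clock can be changed by the system or an administrator, which is exactly the weakness of time.time() for benchmarks.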
If you're interested in learning more, you can read this article!
Mistake 4: Not commenting, or giving variables random names
I'm not even going to explain this one, but just show an example of two versions of the same code that do the same thing. Which one is more readable?
def f(t, c1, c2):
    a = 0
    n = [abs(c1[i]-c2[i]) for i in range(len(t))]
    for i in range(0, len(t)-1):
        u = (n[i+1])*(t[i+1]-t[i])
        a += u
    return a
Or
def integral_compare(time, curve_1, curve_2):
    """
    Takes two curves, each given as a list of points associated with the
    same list of time values, and returns the area between them.
    """
    area = 0
    new_function = [abs(curve_1[i]-curve_2[i]) for i in range(len(time))]
    for i in range(0, len(time)-1):
        u_area = (new_function[i+1])*(time[i+1]-time[i])
        area += u_area
    return area
You don't actually have to understand what the function does here, but understand that in a project, it's important to make your code readable.
Mistake 5: Not using enumerate()
This one can also be classified as a tip that makes the code more understandable, but not faster. Python's description of the enumerate() function is: "The enumerate object yields pairs containing a count (from start, which defaults to zero) and a value yielded by the iterable argument."
What this means is that the enumerate function keeps track, while iterating, not only of the index of an item but of its value as well. It takes an iterable as argument and returns an enumerate object, which is itself iterable and can be converted to a list, made up of tuples containing the indexes and values of the argument iterable. For example, if we write:
list(enumerate(["word1", "word2", "word3"]))
We'd get:
[(0, 'word1'), (1, 'word2'), (2, 'word3')]
To give an example of an implementation, instead of writing:
x = ["this", "list", "contains", "random", "values", 1, 2, 3]
for i in range(len(x)):
    if i==4:
        print(x[i])
We could write:
x = ["this", "list", "contains", "random", "values", 1, 2, 3]
for i, val in enumerate(x):
    if i==4:
        print(val)
In big chunks of code, this subtle change can make it more understandable.
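The start argument mentioned in the description above is handy when you want the count to begin somewhere other than zero, for instance to number lines for a human reader. A small sketch (the lines list is just sample data):

```python
lines = ["first", "second", "third"]

# start=1 makes the count begin at 1 instead of the default 0
numbered = [f"{number}: {text}" for number, text in enumerate(lines, start=1)]
print(numbered)  # ['1: first', '2: second', '3: third']
```

Note that start only changes the count that is yielded. It does not skip any items of the iterable.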