Using a genetic algorithm to express a number as a multiple and sum of others

# Preface

Genetic algorithms are algorithms used for optimizing parameters. I won't dwell more on explaining what they are (perhaps in another article), I'll focus on implementing one (while still explaining what's happening). However, if you have no idea what genetic algorithms are, I invite you to check out this video and then keep reading!

## Intro

So, what do I mean by expressing a number as a multiple and a sum of others? Well, for instance, 14 can be written as :

```14 = 1*7+7*1
14 = 1*5+6+3
14 = 2*5+2*2```

Or even :

`14 = 5+1*5*1+3+6-5*1`

You get the idea. Two things are noteworthy: First of all, the order of operation follows the standard order (2*5+2*2 is the same as 10+4). Secondly, even though I only talked about multiple and sums, a subtraction made its way to this article. However, this is not a problem because subtractions are just additions with one (or more) negative numbers.
Now the real question: What is the practical use of writing a number this way? I have no idea, but it's fun, so let's get into it!

## The idea behind the algorithm

Each individual will be stored as a `str` type representing one or multiple operations. For example, 1*5+6+3 could be an individual and stored as
`individual = "1*5+6+3"`. Then the fitness evaluation function will simply evaluate each individual with python's `eval` function and calculate the distance to the target number (14 in this case). For example, an individual represented by 2+5*8-9 (=33)would get a fitness score of 19 (because the distance from 14 to 33 is 19). We will try to minimize the fitness score to optimize our individuals.

# Code

If you're simply interested in the code then you can just download it down below or scroll to the bottom of the page and copy-paste it. However, if you follow through, I'll dissect the code and split it into multiple parts to make it easier to understand. So let's get right into it:

## Setup and creating our population

``````import random

TARGET = 1234 #CHANGE THIS TO WHATEVER

class FindNumber:

def __init__(self, target):
self.target = target
self.mutation_probability = 0.05
self.individual_length = 7 #This represents the numbers in an individual, not total characters
self.population_size = 1000
self.keep_population = 0.2

def __initialize_population(self):
self.population = list()
components = ['123456789', '+-*']

for _ in range(self.population_size):
individual = str()
for _ in range(self.individual_length-1):
individual = individual + random.choice(components[0])
individual = individual + random.choice(components[1])
individual = individual + random.choice(components[0])

self.population.append(individual)``````

This may seem a bit daunting but hang in there!
We'll use the random library to generate random individuals, and randomly select those individuals to merge and mutate. Next up, the variable `TARGET` will be the number we'll try to express, this can be pretty much any number (as long as it's small enough to be expressed with a certain number of characters, for instance with only two digits we can't go higher than 81 = 9*9, with three digits we can't go higher than 729 = 9*9*9, etc...)
Our whole algorithm will revolve around a class named `FindNumber`, which will use different methods on a population to try and optimize it. This class only takes the target number as an argument, there are other class variables but those have to be changed manually. I'll quickly go over each variable :

• `mutation_probability` represents the probability of an individual mutating each iteration. 0.05 is the same as 5%
• `individual_length` represents the number of digits in an individual, for example, an individual with two digits could be something like 8-3 and an individual with four digits could be 9*5+9*2
• `population_size` means how many individuals will there be in the population
• `keep_population` represents what proportion of the population will be kept for the next generation. 0.2 is the same as 20%

The private method `__initialize_population()` creates a class variable called `population` containing each individual in a list. The length of the population and each individual depends on the values chosen for the variable above.

## Evaluating the fitness and sorting the population

``````    def __fitness(self):
self.population_fitness = list()

for individual in self.population:
value = eval(individual)
distance = abs(self.target-value)

self.population_fitness.append(distance)

def __sort_by_fitness(self):
self.__fitness()
ordered_population = zip(self.population_fitness, self.population)
ordered_population = sorted(ordered_population)

self.population = list()
for individual in ordered_population:
self.population.append(individual[1])``````

Still inside the `FindNumber` class, we create the `__fitness()` method that calculates the fitness of each individual and stores it in a list in the same order as the population. I'll give an example to explain it better:
say your target number is 23 and you have a population of three individuals stored like this: `population = ["7+6*9", "1*1-5", "2+1-2"]`, then if we calculate the fitness of each individual we'd have respectively 38, 27, and 22. The `__fitness()` method would therefore create this list: `population_fitness = [38, 27, 22]`.

The `__sort_by_fitness()` method's objective is to sort the population from lowest fitness (which is better) to highest fitness (which is worse). It starts by calling the previously defined `__fitness()` and then creates a list regrouping each individual with its fitness score. If we use the same example as before it would look like this: `ordered_population = [(38, "7+6*9"), (27, "1*1-5), (22, "2+1-2")]`. The fitness score needs to be before the individual because the `sorted()` function intrinsic to python will only look at the first element of each element in a list (read that twice!), which means that we can sort our population by fitness score. Once this list is ordered we extract each individual and re-create a population list, but this time in order. This means that still using our example list, the population would now be: `population = ["2+1-2", "1*1-5", "7+6*9"]`.

## Reproduction and mutation

``````    def __mutate(self, individual):
components = ['123456789', '+-*']
index_mutate = random.randint(0, len(individual)-1)

if individual[index_mutate] in components[0]:
new_character = random.choice(components[0])
else:
new_character = random.choice(components[1])

new_individual = individual[:index_mutate] + new_character + individual[index_mutate+1:]
return new_individual

def __merge(self, ind1, ind2):
half = len(ind1)//2
return ind1[:half] + ind2[half:]

def __new_population(self):
keep_population = round(self.population_size*self.keep_population)
self.population = self.population[:keep_population]

while len(self.population) < self.population_size:
ind1 = random.choice(self.population[:keep_population])
ind2 = random.choice(self.population[:keep_population])
individual = self.__merge(ind1, ind2)

self.population.append(individual)``````

The `__mutate()` method takes an individual as argument and returns that individual with one changed digit or operation, for example, if we passed `5+2+4*8` through this method, we might get `2+2+4*8` or `5+2*4*8` or many other possibilities.
The `__merge()` method takes two individuals as arguments and returns an individual made up of halves of each parent. For example, taking `"5+2+4*8"` and `"3*8-1+5"` as arguments, we will get `5+2-1+5` or `3*8+4*8` depending on the order of the arguments.
Finally, we create a method called `__new_population()` that uses all methods created previously to make a new population out of the best individuals, the code is pretty straightforward here.

## Finalizing our class

``````    def start(self):
self.__initialize_population()
self.__sort_by_fitness()

average_fitness = sum(self.population_fitness)/self.population_size
best_fitness = abs(self.target - eval(self.population[0]))

return average_fitness, best_fitness

def epoch(self):
self.__new_population()
if self.mutation_probability:
for i, individual in enumerate(self.population):
if random.random()<self.mutation_probability:
self.population[i] = self.__mutate(individual)
self.__sort_by_fitness()

average_fitness = sum(self.population_fitness)/self.population_size
best_fitness = abs(self.target - eval(self.population[0]))

return average_fitness, best_fitness``````

To make all of this work, we create 2 final methods (public this time) : `start()` and `epoch`.
`start()` creates the population and sorts it, it then calculates the average population fitness and the best fitness out of all the individuals and returns those values.
`epoch()` replaces the current population with the next generation and just like `start()`, it sorts it and returns the average and best fitness.

And before we run the code, this is what the final file looks like (notice I added a main function at the end):

``````import random

TARGET = 1234 #CHANGE THIS TO WHATEVER

class FindNumber:

def __init__(self, target):
self.target = target
self.mutation_probability = 0.05
self.individual_length = 7 #This represents the numbers in an individual, not total characters
self.population_size = 1000
self.keep_population = 0.2

def __initialize_population(self):
self.population = list()
components = ['123456789', '+-*']

for _ in range(self.population_size):
individual = str()
for _ in range(self.individual_length-1):
individual = individual + random.choice(components[0])
individual = individual + random.choice(components[1])
individual = individual + random.choice(components[0])

self.population.append(individual)

def __fitness(self):
self.population_fitness = list()

for individual in self.population:
value = eval(individual)
distance = abs(self.target-value)

self.population_fitness.append(distance)

def __sort_by_fitness(self):
self.__fitness()
ordered_population = zip(self.population_fitness, self.population)
ordered_population = sorted(ordered_population)

self.population = list()
for individual in ordered_population:
self.population.append(individual[1])

def __mutate(self, individual):
components = ['123456789', '+-*']
index_mutate = random.randint(0, len(individual)-1)

if individual[index_mutate] in components[0]:
new_character = random.choice(components[0])
else:
new_character = random.choice(components[1])

new_individual = individual[:index_mutate] + new_character + individual[index_mutate+1:]
return new_individual

def __merge(self, ind1, ind2):
half = len(ind1)//2
return ind1[:half] + ind2[half:]

def __new_population(self):
keep_population = round(self.population_size*self.keep_population)
self.population = self.population[:keep_population]

while len(self.population) < self.population_size:
ind1 = random.choice(self.population[:keep_population])
ind2 = random.choice(self.population[:keep_population])
individual = self.__merge(ind1, ind2)

self.population.append(individual)

def start(self):
self.__initialize_population()
self.__sort_by_fitness()

average_fitness = sum(self.population_fitness)/self.population_size
best_fitness = abs(self.target - eval(self.population[0]))

return average_fitness, best_fitness

def epoch(self):
self.__new_population()
if self.mutation_probability:
for i, individual in enumerate(self.population):
if random.random()<self.mutation_probability:
self.population[i] = self.__mutate(individual)
self.__sort_by_fitness()

average_fitness = sum(self.population_fitness)/self.population_size
best_fitness = abs(self.target - eval(self.population[0]))

return average_fitness, best_fitness

def main(target):
method = FindNumber(target)
avg, best = method.start()
print(f"#gen: 0 #average: {avg} #best: {best}")

for i in range(100):
avg, best = method.epoch()
print(f"#gen: {i+1} #average: {avg} #best: {best}")
if best==0:
break

print(f"{method.population[0]} = {eval(method.population[0])}")

if __name__ == '__main__':
main(TARGET)
input()``````

# Running the code

I'm going to show what the results are for different individual lengths, going from longest to shortest :
Scroll down in the output to see how it expressed 1234. Also note that the result is random and if you run the program another time, it will probably find another way to express it.

• individual length = 7
Output : 6*6*4*9-8*7-6 = 1234
``````#gen: 0 #average: 1283.497 #best: 45
#gen: 1 #average: 1624.319 #best: 33
#gen: 2 #average: 5443.837 #best: 9
#gen: 3 #average: 11078.058 #best: 4
#gen: 4 #average: 7854.75 #best: 4
#gen: 5 #average: 9967.631 #best: 2
#gen: 6 #average: 6061.968 #best: 2
#gen: 7 #average: 10449.052 #best: 2
#gen: 8 #average: 5800.774 #best: 2
#gen: 9 #average: 6804.307 #best: 2
#gen: 10 #average: 6410.493 #best: 2
#gen: 11 #average: 7261.284 #best: 2
#gen: 12 #average: 8808.024 #best: 2
#gen: 13 #average: 3182.644 #best: 2
#gen: 14 #average: 5649.5 #best: 2
#gen: 15 #average: 5097.521 #best: 2
#gen: 16 #average: 5472.465 #best: 2
#gen: 17 #average: 370.246 #best: 2
#gen: 18 #average: 336.828 #best: 0
6*6*4*9-8*7-6 = 1234``````
• individual length = 6
Output : 9+5*7*1*5*7 = 1234
``````#gen: 0 #average: 1303.262 #best: 11
#gen: 1 #average: 1555.56 #best: 11
#gen: 2 #average: 3538.95 #best: 3
#gen: 3 #average: 2367.064 #best: 3
#gen: 4 #average: 1944.92 #best: 3
#gen: 5 #average: 1359.517 #best: 3
#gen: 6 #average: 1172.492 #best: 3
#gen: 7 #average: 965.504 #best: 3
#gen: 8 #average: 932.787 #best: 3
#gen: 9 #average: 651.214 #best: 3
#gen: 10 #average: 807.595 #best: 3
#gen: 11 #average: 1030.949 #best: 3
#gen: 12 #average: 1221.126 #best: 1
#gen: 13 #average: 103.367 #best: 0
9+5*7*1*5*7 = 1234``````
• individual length = 5
Output : 7*5*5*7+9 = 1234
``````#gen: 0 #average: 1203.101 #best: 145
#gen: 1 #average: 1204.109 #best: 5
#gen: 2 #average: 1392.504 #best: 5
#gen: 3 #average: 1017.989 #best: 5
#gen: 4 #average: 860.794 #best: 5
#gen: 5 #average: 725.785 #best: 5
#gen: 6 #average: 670.127 #best: 5
#gen: 7 #average: 628.187 #best: 5
#gen: 8 #average: 405.353 #best: 5
#gen: 9 #average: 318.836 #best: 5
#gen: 10 #average: 188.848 #best: 1
#gen: 11 #average: 30.342 #best: 0
7*5*5*7+9 = 1234``````
• individual length = 4
Output : 5*7*5*7 = 1225
``````#gen: 0 #average: 1204.587 #best: 26
#gen: 1 #average: 1062.097 #best: 9
#gen: 2 #average: 737.667 #best: 9
#gen: 3 #average: 394.063 #best: 9
#gen: 4 #average: 286.486 #best: 9
#gen: 5 #average: 207.212 #best: 9
#gen: 6 #average: 98.09 #best: 9
#gen: 7 #average: 81.662 #best: 9
#gen: 8 #average: 76.739 #best: 9
#gen: 9 #average: 64.103 #best: 9
#gen: 10 #average: 56.901 #best: 9
#gen: 11 #average: 72.082 #best: 9
#gen: 12 #average: 63.886 #best: 9
#gen: 13 #average: 71.025 #best: 9
#gen: 14 #average: 56.678 #best: 9
#gen: 15 #average: 71.073 #best: 9
#gen: 16 #average: 64.129 #best: 9
#gen: 17 #average: 77.108 #best: 9
#gen: 18 #average: 79.855 #best: 9
#gen: 19 #average: 103.666 #best: 9
#gen: 20 #average: 117.171 #best: 9
#gen: 21 #average: 134.503 #best: 9
#gen: 22 #average: 156.947 #best: 9
#gen: 23 #average: 36.257 #best: 9
#gen: 24 #average: 38.516 #best: 9
#gen: 25 #average: 49.014 #best: 9
#gen: 26 #average: 36.746 #best: 9
#gen: 27 #average: 39.686 #best: 9
#gen: 28 #average: 37.657 #best: 9
#gen: 29 #average: 41.802 #best: 9
#gen: 30 #average: 32.015 #best: 9
#gen: 31 #average: 38.399 #best: 9
#gen: 32 #average: 35.048 #best: 9
#gen: 33 #average: 40.86 #best: 9
#gen: 34 #average: 32.855 #best: 9
#gen: 35 #average: 32.892 #best: 9
#gen: 36 #average: 35.576 #best: 9
#gen: 37 #average: 41.061 #best: 9
#gen: 38 #average: 39.714 #best: 9
#gen: 39 #average: 48.719 #best: 9
#gen: 40 #average: 38.182 #best: 9
#gen: 41 #average: 42.599 #best: 9
#gen: 42 #average: 40.68 #best: 9
#gen: 43 #average: 38.966 #best: 9
#gen: 44 #average: 39.667 #best: 9
#gen: 45 #average: 47.506 #best: 9
#gen: 46 #average: 40.718 #best: 9
#gen: 47 #average: 38.979 #best: 9
#gen: 48 #average: 46.27 #best: 9
#gen: 49 #average: 32.961 #best: 9
#gen: 50 #average: 41.688 #best: 9
#gen: 51 #average: 36.572 #best: 9
#gen: 52 #average: 46.256 #best: 9
#gen: 53 #average: 44.951 #best: 9
#gen: 54 #average: 42.005 #best: 9
#gen: 55 #average: 43.246 #best: 9
#gen: 56 #average: 26.846 #best: 9
#gen: 57 #average: 34.736 #best: 9
#gen: 58 #average: 43.15 #best: 9
#gen: 59 #average: 36.646 #best: 9
#gen: 60 #average: 43.788 #best: 9
#gen: 61 #average: 33.342 #best: 9
#gen: 62 #average: 46.356 #best: 9
#gen: 63 #average: 42.165 #best: 9
#gen: 64 #average: 45.391 #best: 9
#gen: 65 #average: 41.978 #best: 9
#gen: 66 #average: 40.736 #best: 9
#gen: 67 #average: 52.285 #best: 9
#gen: 68 #average: 37.754 #best: 9
#gen: 69 #average: 35.377 #best: 9
#gen: 70 #average: 36.597 #best: 9
#gen: 71 #average: 41.606 #best: 9
#gen: 72 #average: 35.752 #best: 9
#gen: 73 #average: 31.369 #best: 9
#gen: 74 #average: 41.644 #best: 9
#gen: 75 #average: 42.229 #best: 9
#gen: 76 #average: 28.423 #best: 9
#gen: 77 #average: 40.902 #best: 9
#gen: 78 #average: 37.607 #best: 9
#gen: 79 #average: 43.581 #best: 9
#gen: 80 #average: 26.735 #best: 9
#gen: 81 #average: 34.304 #best: 9
#gen: 82 #average: 43.786 #best: 9
#gen: 83 #average: 41.965 #best: 9
#gen: 84 #average: 31.633 #best: 9
#gen: 85 #average: 42.292 #best: 9
#gen: 86 #average: 30.567 #best: 9
#gen: 87 #average: 29.47 #best: 9
#gen: 88 #average: 41.595 #best: 9
#gen: 89 #average: 44.918 #best: 9
#gen: 90 #average: 40.125 #best: 9
#gen: 91 #average: 36.224 #best: 9
#gen: 92 #average: 33.657 #best: 9
#gen: 93 #average: 36.966 #best: 9
#gen: 94 #average: 46.021 #best: 9
#gen: 95 #average: 33.586 #best: 9
#gen: 96 #average: 43.349 #best: 9
#gen: 97 #average: 38.567 #best: 9
#gen: 98 #average: 39.415 #best: 9
#gen: 99 #average: 50.545 #best: 9
#gen: 100 #average: 36.911 #best: 9
5*7*5*7 = 1225
``````
• individual length = 3
Output : 9*9*9 = 729
``````#gen: 0 #average: 1211.794 #best: 667
#gen: 1 #average: 1111.828 #best: 586
#gen: 2 #average: 894.121 #best: 505
#gen: 3 #average: 726.419 #best: 505
#gen: 4 #average: 610.22 #best: 505
#gen: 5 #average: 548.688 #best: 505
#gen: 6 #average: 522.487 #best: 505
#gen: 7 #average: 526.195 #best: 505
#gen: 8 #average: 522.154 #best: 505
#gen: 9 #average: 520.039 #best: 505
#gen: 10 #average: 529.939 #best: 505
#gen: 11 #average: 518.14 #best: 505
#gen: 12 #average: 524.305 #best: 505
#gen: 13 #average: 524.953 #best: 505
#gen: 14 #average: 522.775 #best: 505
#gen: 15 #average: 522.109 #best: 505
#gen: 16 #average: 516.412 #best: 505
#gen: 17 #average: 520.336 #best: 505
#gen: 18 #average: 518.419 #best: 505
#gen: 19 #average: 527.095 #best: 505
#gen: 20 #average: 522.487 #best: 505
#gen: 21 #average: 517.933 #best: 505
#gen: 22 #average: 517.249 #best: 505
#gen: 23 #average: 525.862 #best: 505
#gen: 24 #average: 520.021 #best: 505
#gen: 25 #average: 523.405 #best: 505
#gen: 26 #average: 524.161 #best: 505
#gen: 27 #average: 523.279 #best: 505
#gen: 28 #average: 526.438 #best: 505
#gen: 29 #average: 522.064 #best: 505
#gen: 30 #average: 525.502 #best: 505
#gen: 31 #average: 522.289 #best: 505
#gen: 32 #average: 520.057 #best: 505
#gen: 33 #average: 527.095 #best: 505
#gen: 34 #average: 524.35 #best: 505
#gen: 35 #average: 520.444 #best: 505
#gen: 36 #average: 529.021 #best: 505
#gen: 37 #average: 530.56 #best: 505
#gen: 38 #average: 525.79 #best: 505
#gen: 39 #average: 527.671 #best: 505
#gen: 40 #average: 530.812 #best: 505
#gen: 41 #average: 530.272 #best: 505
#gen: 42 #average: 525.322 #best: 505
#gen: 43 #average: 516.799 #best: 505
#gen: 44 #average: 524.422 #best: 505
#gen: 45 #average: 520.003 #best: 505
#gen: 46 #average: 527.797 #best: 505
#gen: 47 #average: 520.354 #best: 505
#gen: 48 #average: 520.633 #best: 505
#gen: 49 #average: 525.682 #best: 505
#gen: 50 #average: 524.701 #best: 505
#gen: 51 #average: 525.88 #best: 505
#gen: 52 #average: 527.383 #best: 505
#gen: 53 #average: 525.34 #best: 505
#gen: 54 #average: 525.664 #best: 505
#gen: 55 #average: 525.457 #best: 505
#gen: 56 #average: 525.43 #best: 505
#gen: 57 #average: 520.21 #best: 505
#gen: 58 #average: 522.163 #best: 505
#gen: 59 #average: 529.309 #best: 505
#gen: 60 #average: 526.708 #best: 505
#gen: 61 #average: 524.035 #best: 505
#gen: 62 #average: 525.943 #best: 505
#gen: 63 #average: 527.797 #best: 505
#gen: 64 #average: 526.663 #best: 505
#gen: 65 #average: 525.115 #best: 505
#gen: 66 #average: 518.401 #best: 505
#gen: 67 #average: 521.083 #best: 505
#gen: 68 #average: 530.839 #best: 505
#gen: 69 #average: 521.452 #best: 505
#gen: 70 #average: 529.858 #best: 505
#gen: 71 #average: 524.494 #best: 505
#gen: 72 #average: 526.546 #best: 505
#gen: 73 #average: 524.521 #best: 505
#gen: 74 #average: 528.625 #best: 505
#gen: 75 #average: 525.34 #best: 505
#gen: 76 #average: 526.843 #best: 505
#gen: 77 #average: 521.173 #best: 505
#gen: 78 #average: 519.643 #best: 505
#gen: 79 #average: 530.776 #best: 505
#gen: 80 #average: 522.649 #best: 505
#gen: 81 #average: 525.52 #best: 505
#gen: 82 #average: 525.853 #best: 505
#gen: 83 #average: 526.339 #best: 505
#gen: 84 #average: 517.672 #best: 505
#gen: 85 #average: 519.517 #best: 505
#gen: 86 #average: 522.901 #best: 505
#gen: 87 #average: 521.92 #best: 505
#gen: 88 #average: 522.289 #best: 505
#gen: 89 #average: 520.606 #best: 505
#gen: 90 #average: 525.196 #best: 505
#gen: 91 #average: 527.023 #best: 505
#gen: 92 #average: 521.011 #best: 505
#gen: 93 #average: 523.153 #best: 505
#gen: 94 #average: 521.479 #best: 505
#gen: 95 #average: 528.625 #best: 505
#gen: 96 #average: 524.512 #best: 505
#gen: 97 #average: 520.984 #best: 505
#gen: 98 #average: 518.257 #best: 505
#gen: 99 #average: 521.641 #best: 505
#gen: 100 #average: 526.663 #best: 505
9*9*9 = 729``````

We can observe, that for the last two test runs (lengths of 3 and 4) the algorithm is unable to find a solution and just tries to minimize the fitness score. We could give it a lower number to see it succeed at a lower individual length.

If you reached the end, thanks a lot for reading and you can subscribe to my newsletter, it's free and keeps me motivated!