# TIL: constant folding in Python.

September 10, 2022 · 6 mins · 1085 words

I’m working on a blog post about which option is faster, `x*x` or `x**2`, and while writing it I discovered some interesting things about how Python is implemented at a low level. So in today’s post I won’t be able to answer which option is faster, but I’ll explain how Python optimizes your code.

# First experiment: literals vs symbols

To start, let’s go back to the original question: which is faster, `x*x` or `x**2`? Before getting too rigorous I wanted to run some experiments to build some intuition about the problem, and surprisingly this is where I found the first weird results. My first experiment was to check whether data types had a big impact on the timings, so I wrote this first snippet:

```python
import time

EXPERIMENTS = 1_000_000

ti = time.time()
for i in range(EXPERIMENTS):
    2**2
tf = time.time()
print(f"int**int took {tf-ti:.5f} seconds")

ti = time.time()
for i in range(EXPERIMENTS):
    2.**2.
tf = time.time()
print(f"float**float took {tf-ti:.5f} seconds")
```


And after running it I got

> int**int took 0.03004 seconds
> float**float took 0.03070 seconds
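As an aside, the standard library also ships a timeit module that is built for exactly this kind of micro-benchmark. A minimal equivalent sketch (the absolute numbers will of course depend on your machine):

```python
import timeit

# timeit compiles the statement once and runs it in a tight loop,
# returning the total elapsed time in seconds
print("int**int:    ", timeit.timeit("2**2", number=1_000_000))
print("float**float:", timeit.timeit("2.**2.", number=1_000_000))
```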


So apparently data types do not have a huge impact. However, the code didn’t look nice to me, so I decided to refactor it a little bit

```python
import time

EXPERIMENTS = 1_000_000

def power_time(base, exponent):
    ti = time.time()
    for i in range(EXPERIMENTS):
        base**exponent
    tf = time.time()
    return tf - ti

print(f"int**int took {power_time(2, 2):.5f} seconds")
print(f"float**float took {power_time(2., 2.):.5f} seconds")
```


Same logic as before, but fewer lines of code. However, after running it I got

> int**int took 0.20140 seconds
> float**float took 0.05051 seconds


It makes sense that `float**float` takes a little longer now, since Python needs to pass the arguments to the function, and that takes time. But what the hell is happening in the case of `int**int`?

After asking about it in several places, I discovered that the difference comes from using literals in the first case and symbols in the second. By refactoring the first snippet to

```diff
 import time
 EXPERIMENTS = 1_000_000
+BASE = 2
+EXPONENT = 2
 ti = time.time()
 for i in range(EXPERIMENTS):
-    2**2
+    BASE**EXPONENT
 tf = time.time()
 print(f"int**int took {tf-ti:.5f} seconds")

+BASE = 2.
+EXPONENT = 2.
 ti = time.time()
 for i in range(EXPERIMENTS):
-    2.**2.
+    BASE**EXPONENT
 tf = time.time()
 print(f"float**float took {tf-ti:.5f} seconds")
```


I got the same results in both cases. So the problem didn’t come from data types or function calls, but from literals versus symbols. The version with literals was much faster because the interpreter applies a technique named constant folding.

# Constant Folding

This post is not meant to be a deep dive into constant folding, so if you’re here to learn the internals of constant folding I recommend this post. Here I’ll just share the intuitions I developed while reading about it.

Basically, constant folding consists of the interpreter replacing constant expressions with their value at compile time. For example, the expression `A = 10*10` is replaced by `A = 100`.

You can check that by using the `dis` module. This module lets you inspect the Python bytecode of your code: basically, the “compiled” version of it. If we run this example

```python
import dis
dis.dis("A=123*10")
```


we’ll get

```
  1           0 LOAD_CONST               0 (1230)
              2 STORE_NAME               0 (A)
              4 LOAD_CONST               1 (None)
              6 RETURN_VALUE
```


where the first line indicates that a constant value 1230 has been loaded without having to evaluate 123*10. This means that the interpreter replaced 123*10 with 1230 when compiling the code. If instead we run the code

```python
import dis
dis.dis("A=123;B=10;A*B")
```


we’ll get

```
  1           0 LOAD_CONST               0 (123)
              2 STORE_NAME               0 (A)
              4 LOAD_CONST               1 (10)
              6 STORE_NAME               1 (B)
              8 LOAD_NAME                0 (A)
             10 LOAD_NAME                1 (B)
             12 BINARY_MULTIPLY
             14 POP_TOP
             16 LOAD_CONST               2 (None)
             18 RETURN_VALUE
```


where we can see that the multiplication is not pre-computed: it will be executed at run time by the instruction 12 BINARY_MULTIPLY.
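By the way, a handy shortcut for these checks: instead of reading the whole disassembly, you can inspect the compiled code object’s co_consts tuple, which is where the compiler stores constants. A small sketch (the folded_consts helper is just a name I made up), which also suggests that folding is not limited to integer arithmetic:

```python
def folded_consts(expr):
    # compile in eval mode and inspect the constants the compiler stored
    return compile(expr, "<string>", "eval").co_consts

print(folded_consts("123 * 10"))       # holds 1230, not 123 and 10
print(folded_consts("'ab' * 3"))       # string repetition is folded too
print(folded_consts("(1, 2) + (3,)"))  # and so is tuple concatenation
```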

Another interesting point is that constants are not always folded. For example

```python
import dis
dis.dis("2**64")
dis.dis("2**65")
```


returns

  1           0 LOAD_CONST               0 (18446744073709551616)
              2 RETURN_VALUE