Profiling memory
The simplest way to monitor memory allocation is with the memory_profiler library. This doesn't come with either Python or Anaconda, but it is quick and easy to install for both, provided you have the appropriate install permissions. If you have a standard Python install, installation details can be found on PyPI. For Anaconda, it is best to integrate it with the rest of Anaconda using the conda package manager, thus (you may need to restart Spyder afterwards):
conda install memory_profiler
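For a standard (non-Anaconda) Python, the equivalent route is pip:

pip install memory_profiler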
Once installed, the easiest way to use it is to add:
from memory_profiler import profile
to the top of your code. If you then put:

@profile

immediately above any function you want to profile, its memory use will be reported line by line when it runs. Although it will only profile functions, it is relatively simple to shift all of a script into a function and call it. This @profile is what is known as a "decorator"; there is a wide range of decorators for doing different jobs in Python. They are essentially a signal to run some other code alongside the function, in this case code in the memory_profiler library.
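For illustration, a marked-up script might look something like this sketch (the function body matches the profiler output below; the size variable and the driver loop are assumptions):

import random
from memory_profiler import profile

size = 100  # assumed: presumably set near the top of the real test2.py

@profile
def fill_row():
    data_row = []
    for i in range(size):
        data_row.append(random.random())
    return data_row

for row in range(size):  # assumed driver loop; each pass prints one profile report
    fill_row()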
Here, then, is our code marked up for this: test2.py, and here's an example of the output (just one of the loop runs) for size == 100:
Line #    Mem usage    Increment   Line Contents
================================================
     9     24.4 MiB      0.0 MiB   @profile
    10                             def fill_row():
    11     24.4 MiB      0.0 MiB       data_row = []
    12     24.4 MiB      0.0 MiB       for i in range(size):
    13     24.4 MiB      0.0 MiB           data_row.append(random.random())
    14     24.4 MiB      0.0 MiB       return data_row
The first column is the line number. The second is the memory in use after the line has run. The third is the increment: the change in memory between that state and the previous one. The last is the line itself. As you can see, the memory allocated in this specific run was too small to register as a change. The units are, unusually but correctly, mebibytes (MiB), where 1 MiB = 2^20 bytes.
You can increase the precision of the output by adjusting to, for example:
@profile(precision=6)
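Note that this parenthesised form replaces the bare @profile line above the function, e.g. (a sketch, assuming the imports and size from the earlier example):

@profile(precision=6)
def fill_row():
    data_row = []
    for i in range(size):
        data_row.append(random.random())
    return data_row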
I had to crank the size up to 1000 to see any impact:
Line #        Mem usage           Increment      Line Contents
================================================
     9   24.3671875000 MiB   0.0000000000 MiB    @profile(precision=10)
    10                                           def fill_row():
    11   24.3671875000 MiB   0.0000000000 MiB        data_row = []
    12   24.3789062500 MiB   0.0117187500 MiB        for i in range(size):
    13   24.3789062500 MiB   0.0000000000 MiB            data_row.append(random.random())
    14   24.3789062500 MiB   0.0000000000 MiB        return data_row
The allocation of memory on each line might seem surprising: why is the allocation happening on the for-loop rather than on the appending of each new number? The reason is that range generates a complete new list of numbers up front (this is Python 2 behaviour; in Python 3, range produces its numbers lazily and allocates almost nothing). random.random() must presumably just be linking into random-number machinery that already exists within Python. In general, random number generation is optimised to heck, because it is so important in so many things.

There are a number of additional nice features in memory_profiler, and it is worth reading the short documentation.
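If you want to see the range behaviour directly, the standard library's sys.getsizeof reports an object's own memory footprint; a quick check (run here under Python 3, where range itself is a small lazy object):

import sys

print(sys.getsizeof(range(1000)))        # Python 3: a small, lazy range object
print(sys.getsizeof(list(range(1000))))  # a realised list: several KiB for the pointer array alone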
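Among those features is the memory_usage function, which samples memory over time rather than line by line. A minimal sketch (fill_row here is just a stand-in function for illustration):

from memory_profiler import memory_usage

def fill_row():
    return [0.0] * 100000

# Sample this process every 0.1 s for 1 s of wall-clock time
print(memory_usage(-1, interval=0.1, timeout=1))

# Or sample while a particular function runs: (function, args, kwargs)
print(memory_usage((fill_row, (), {})))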
So, using all this, can you estimate how long this code will take to run with 10000 as the size, and how much memory it will use? (The answer is on the next page.)