Profiling python code with celery

In this article I explain how you can profile python code with celery, and why I find this solution disappointing. I propose a better solution in the conclusion. Have fun!
Head of celery, sold as a vegetable. Usually only the stalks are eaten. (Photo credit: Wikipedia)
If you are in a hurry, you can jump to the conclusion about python profiling tools at the end of this post. Otherwise, you will find below the summary of the tests I performed with the most common python profiling software.

First I install celerymon:

pip install celerymon

Then to run my celery powered module, I add the -E option:

celery -A mypackage.mymodule worker --loglevel=info -E

At this point, the events monitored by celerymon are available at:

firefox http://localhost:8989/

Celerymon displays only a few events, so it is not suited to code profiling.

For code profiling, I try using cProfile. So, to launch celery I now use a different command:

sudo apt-get install python-profiler
python -m cProfile -o test-`date +%Y-%m-%d-%T`.prof /home/toto/virtualenv_1/bin/celery -A mypackage.mymodule worker --loglevel=info -E

Alternatively, it's possible to modify the python code to include cProfile directives (but I have not yet managed to collect the output in a file):

import cProfile
cProfile.run('foo()','filename.prof')
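
One possible way to collect the output in a file from inside the code is to drive `cProfile.Profile` directly and call `dump_stats` once the call has finished. The sketch below is only an illustration under assumptions: `foo` is a hypothetical stand-in for the code to profile, `profile_to_file` is a helper name made up for this example, and the output path is arbitrary.

```python
import cProfile
import os
import tempfile


def foo():
    # Hypothetical workload standing in for the real function to profile.
    return sum(i * i for i in range(100000))


def profile_to_file(func, path):
    """Run func under cProfile and write the raw stats to path."""
    profiler = cProfile.Profile()
    try:
        # runcall enables the profiler, calls func, then disables it.
        result = profiler.runcall(func)
    finally:
        # Write the stats even if func raised, so partial data is kept.
        profiler.dump_stats(path)
    return result


if __name__ == "__main__":
    out = os.path.join(tempfile.gettempdir(), "foo.prof")
    profile_to_file(foo, out)
    print("profile written to", out)
```

The resulting file has the same format as the one produced by `python -m cProfile -o`, so it can be fed to the same tools.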

The profiling data written with -o is a binary dump rather than readable text, and the reports printed by pstats are hard to manipulate. So, I use visualization tools.
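
For a quick look before reaching for a visualization tool, `pstats` can load a `.prof` file and print a sorted report. The following is a minimal sketch written for Python 3, not a recommended workflow: `slow_sum` is a hypothetical workload, `top_functions` is a helper name invented here, and the profile is written to a temporary file only to keep the example self-contained.

```python
import cProfile
import io
import os
import pstats
import tempfile


def slow_sum(n):
    # Hypothetical workload; replace with the code under test.
    total = 0
    for i in range(n):
        total += i * i
    return total


def top_functions(prof_path, limit=10):
    """Return a text report of the entries with the highest cumulative time."""
    buffer = io.StringIO()
    stats = pstats.Stats(prof_path, stream=buffer)
    # strip_dirs shortens file paths; "cumulative" sorts by time including callees.
    stats.strip_dirs().sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


if __name__ == "__main__":
    path = os.path.join(tempfile.gettempdir(), "slow_sum.prof")
    profiler = cProfile.Profile()
    profiler.runcall(slow_sum, 200000)
    profiler.dump_stats(path)
    print(top_functions(path))
```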

### KCACHEGRIND ###
kcachegrind is probably the best tool today to analyse profiling data.

sudo apt-get install kcachegrind
easy_install pyprof2calltree
pyprof2calltree -i myfile.prof -o myfile.prof.grind
kcachegrind myfile.prof.grind

### RUNSNAKERUN ###
runsnakerun is a more recent tool, with more limited functionality.

pip install SquareMap RunSnakeRun
runsnake OpenGLContext.profile

With this tool, you have a nice display of all the calls, the cumulative time spent per function, etc.
If you get the following error, reinstall wxpython:

from squaremap import squaremap
File "/home/toto/virtualenv_1/lib/python2.6/site-packages/squaremap/squaremap.py", line 3, in
import wx.lib.newevent
ImportError: No module named lib.newevent

(virtualenv_1)toto:~/virtualenv_1/djangoProj_1$ pip install wxpython

### CONCLUSION ###

None of the above tools was handy for my app. They did not let me see clearly which lines of my code were taking the most time. Almost all the time seemed to be spent in the Kombu module, which celery uses for AMQP messaging. See my next post about manual profiling to see how I progressed nevertheless and managed to divide by 3000 the time spent in my most time-consuming function!
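
When library entries such as Kombu's dominate a report, one thing that may help is restricting the `pstats` output with a regular expression, which `print_stats` accepts as an argument. This is a hedged sketch for Python 3, not the approach of the next post: `handle_message` is a hypothetical application function and `report_matching` is a helper name invented for the example. With real data the pattern would more likely be the package name (for instance `mypackage`).

```python
import cProfile
import io
import pstats


def handle_message():
    # Hypothetical application function standing in for task code.
    return sorted(range(50000), key=lambda x: -x)[:3]


def report_matching(profiler, pattern):
    """Return the cumulative-time report limited to entries matching pattern."""
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # The pattern is a regular expression searched in "file:line(function)".
    stats.sort_stats("cumulative").print_stats(pattern)
    return buffer.getvalue()


if __name__ == "__main__":
    profiler = cProfile.Profile()
    profiler.runcall(handle_message)
    # Only entries whose "file:line(function)" string matches are listed.
    print(report_matching(profiler, "handle_message"))
```

The restriction only hides lines in the report. It does not change what was measured, so the total time shown in the header still covers everything that ran.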




