My apologies in advance for a question that might seem trivial - I am a mostly solo developer in an academic environment, and a lot of industry best practices don't necessarily make it here.
Several of my projects run high-performance numerical computation loops. On its own, a single iteration of the loop is rather fast (~1 s), but there are a lot of them (10k-100k per run). Because of that, the loop dominates the performance of the whole application, and seemingly minor modifications to it (such as unnecessary array type/shape conversions) can slow it down dramatically; my latest optimization pass sped it up by a factor of roughly 20.
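To give a toy illustration of the kind of issue I mean (simplified, not my actual code): an accidental list-to-array conversion inside the hot path is exactly the sort of change I want to catch before it lands.

```python
import numpy as np

def step_with_conversion(values, weights):
    # accidental overhead: the Python list is re-converted to an ndarray
    # on every iteration of the outer loop
    arr = np.asarray(values, dtype=np.float64)
    return float(arr @ weights)

def step_without_conversion(arr, weights):
    # same arithmetic, but the caller keeps the data as an ndarray between iterations
    return float(arr @ weights)
```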
As such, it is critical for me to monitor whether changes I make to the code have a performance impact on the core loop, either immediately or through a gradual accumulation of regressions over time.
I am using a CI suite and a number of tools to run unit tests, measure test coverage, and automatically build the API docs. However, so far I have not found anything that runs performance tests as easily or combines their output into a graph over time. Looking around, I realized that comparing performance across builds is non-trivial, since results can be affected by the hardware running the performance tests as well as by the software providing code isolation and results collection.
Is there a recommended way to perform performance tests that minimizes the effects of hardware, or at least ensures the values are comparable? Is there a standard way of outputting/ingesting performance test results in Python, for instance something along the lines of pytest --durations=N?
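To make the question concrete, here is roughly the shape of test I would like CI to run and track over time (a simplified sketch; run_core_loop is a hypothetical stand-in for my real inner loop):

```python
import json
import time

import numpy as np

def run_core_loop(data):
    # stand-in for one iteration of the real numerical loop (~1 s in practice)
    return (data @ data.T).sum()

def test_core_loop_timing():
    data = np.random.default_rng(0).random((1000, 1000))
    start = time.perf_counter()
    result = run_core_loop(data)
    elapsed = time.perf_counter() - start
    # right now I just print and eyeball numbers like this;
    # ideally the CI would store them per build and plot the trend
    print(json.dumps({"core_loop_seconds": elapsed}))
    assert np.isfinite(result)
```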