0

I am working on a +10k LOC program, and I need to ensure its output is always the same for given input. The program consists of dozens of modules and classes, inherited by a MainClass. (The examples below are very simplified compared to the actual code.)


Initially I created tests ensuring multiple instances of MainClass have the same output for given input.

def check_consistent_instances(input_dict):
    """Checks if all results for given input are the same."""
    first_instance_result = MainClass(input_dict).result()

    for i in range(10000):
        result = MainClass(input_dict).result()    
        if first_instance_result != result:
            raise Exception('Different result detected!!')

This however can't detect different results in multiple runs caused by sets/dicts having a different order due to a new hash seed in each run (this could in some rare cases affect the output).

For example:

class MyClass(object):
    AVAILABLE_MONEY = 6

    ITEMS_CATALOG = {'a': 5, 'b': 5, 'c': 5, 'd': 5, 'e': 3}

    def buy_most_expensive(self):
        """Deducts the most expensive items from the available money."""


        for i in self.ITEMS_CATALOG:
        # Assume I made a mistake here! I should have used sorted(self.ITEMS_CATALOG, key=lambda x: self.ITEMS_CATALOG[x])
        # but accidentally i used only self.ITEMS_CATALOG.
            cost = self.ITEMS_CATALOG[i]

            if cost <= self.AVAILABLE_MONEY:
                self.AVAILABLE_MONEY -= cost

Code like the above would pass my check_consistent_instances() test, since on each run the PYTHONHASHSEED would have a specific value and all results would be the same during a specific run. The results would differ only between different runs.

Question:
Which are the best practices when it comes to checking whether a program's results are consistent?

8
  • 3
    If you need ordered data structures, use ordered data structures (e.g. OrderedDict, which retains insertion order. But you may find you need totally deterministic output less often than you think, and you can still test the results appropriately when using vanilla dict or set objects.
    – jonrsharpe
    Commented Aug 23, 2015 at 14:03
  • In this particular program results must be deterministic otherwise it implies there is something wrong. I know about OrderedDict, but I haven't used it at all in given program since it wasn't really needed. The examples I gave were simply there to illustrate how some bugs might be created in my program. I will edit to clarify.
    – user
    Commented Aug 23, 2015 at 15:33
  • Then I don't understand what you're asking. If order is important, don't use structures that don't guarantee order.
    – jonrsharpe
    Commented Aug 23, 2015 at 15:34
  • Order is important only in some cases in which (hopefully) I haven't made mistakes. The problem is that i need to ensure no such mistakes exist, hence the module reloading.
    – user
    Commented Aug 23, 2015 at 15:37
  • "I need to ensure its output is always the same for given input" - just for different runs of the same unmodified version of the program, or also for modified versions? That are very different cases.
    – Doc Brown
    Commented Aug 23, 2015 at 16:01

2 Answers 2

2

Since you wrote in your comment you want to make the output of the unchanged program stable when it is not changed between two runs, you already excluded "accidental changes to module variables" - without any changes, there can be no "accidental changes".

The SO link you posted in your question mentioned how to initialize Python's hash seed to a fixed value, so that should be obviously the next thing to do. Same measure will apply when your program uses some kind of pseudo random numbers directly.

So what remains are the places of your program where it uses some external input like data from a clock, or data from processes outside your own program (like calling an OS function telling you how many other processes are currently running on the machine), these are AFAIK the only things where non-determinism can be introduced in a typical Python program. We actually do not know the internals of your program, but you should know. So the best recommendation I can give you: make a thorough inspection if your program contains calls to external functions of the type I mentioned, and avoid to use them inside the part of the program where the output is calculated.

Additionally, to detect if you overlooked something in your inspection, you should consider to create automated regression tests (not necessarily unit tests, but lots of people mix these terms up). Pick a set of well-chosen input data, run your program to produce the desired output, and write a simple comparer tool which allows you to detect if a new run of the program still produces the previous output again. That is also the standard technique to detect if the output of "version 1.1" of your program is still the same as the output of "version 1.0", when the changes from 1.0 to 1.1 should have not affected the results.

0
1

I would add a bunch of unit tests for the individual modules and use something like Jenkins to compile and run the tests every time you make code changes. If you are finding variations when running, this should help you narrow it down so that you can change module logic to ensure repeatability.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.