Problem

I am currently working with a class that necessarily has a very complicated initialization function (>350 lines of code), because many computations need to be performed and many attributes need to be defined.

I am considering using helper functions within the __init__ function of this class in order to make the code cleaner and break it into logical components; however, I am not sure of the best way to go about this.

For context, the class is a Neural Network (PyTorch) and the init function contains several blocks of code that are essentially large if-else trees that define the values of attributes based off of the init inputs. It is these blocks that I feel are logically distinct and could therefore be placed into their own functions.

Potential Solutions:

In particular, I see 4 potential methods, and I'm hoping to get clarification on the pros, cons, and dangers of each of them.

I've listed each of the methods out below with some example code - any advice is greatly appreciated!

Method 1

Do not change anything, and keep a very large __init__ function:

class ClassName(SuperClass):

  def __init__(self, arg1, arg2, ..., argN):
    # <350 lines of code setting attribute values>

Method 2

Define default attribute values at the top of __init__, and then override them with instance functions based on __init__'s arguments. The reason for listing the attributes out at the top of __init__ is so that all attributes are clearly listed out in one location.

The _fns will require access to self and potentially some __init__ arguments.

class ClassName(SuperClass):

  def __init__(self, arg1, arg2, ..., argN):
    super().__init__()

    # Attribute defaults
    self.attr1 = None
    self.attr2 = None
    # ...
    self.attrM = None

    # Set attributes
    self._fn1()
    self._fn2()
    # ...
    self._fnM()

  def _fn1(self, **relevant_args):
    # <relevant computations>
    self.attr1 = #<result-of-relevant-computations>

  def _fn2(self, **relevant_args):
    # <relevant computations>
    self.attr2 = #<result-of-relevant-computations>

  # ...

  def _fnM(self, **relevant_args):
    # <relevant computations>
    self.attrM = #<result-of-relevant-computations>

This has the benefit of compartmentalizing the code.

Method 3 (variation of 2)

The same as Method 2 except return a value in the helper functions and then assign those to the instance attributes. Not sure if this makes a difference / is better because it is more explicit about which attributes are being set where. Also has the potential to avoid listing default attribute values which would make the code shorter.

The _fns will usually require access to self and potentially some __init__ arguments.

class ClassName(SuperClass):

  def __init__(self, arg1, arg2, ..., argN):
    super().__init__()

    # Attribute defaults (could potentially remove with this approach)
    self.attr1 = None
    self.attr2 = None
    # ...
    self.attrM = None

    # Set attributes
    self.attr1 = self._fn1()
    self.attr2 = self._fn2()
    # ...
    self.attrM = self._fnM()

  def _fn1(self, **relevant_args):
    # <relevant computations>
    return #<result-of-relevant-computations>

  def _fn2(self, **relevant_args):
    # <relevant computations>
    return #<result-of-relevant-computations>

  # ...

  def _fnM(self, **relevant_args):
    # <relevant computations>
    return #<result-of-relevant-computations>

Method 4

Define inner functions within __init__ that set all relevant attributes, either as in method 2 or method 3. This has the benefit of not cluttering the instance function list, but makes the __init__ function messy.

class ClassName(SuperClass):

  def __init__(self, arg1, arg2, ..., argN):
    # Helper functions (closures: they can read self and the __init__ arguments directly)
    def fn1(**relevant_args):
      # <relevant computations>
      return #<result-of-relevant-computations>

    # ...

    def fnM(**relevant_args):
      # <relevant computations>
      return #<result-of-relevant-computations>

    super().__init__()

    # Attribute defaults (could potentially remove with this approach)
    self.attr1 = None
    # ...
    self.attrM = None

    # Set attributes
    self.attr1 = fn1()
    # ...
    self.attrM = fnM()

Finally, I suppose I could define any pure functions at the module level and pass in attributes as arguments, but I do not really think it makes sense to do this since these functions will not be used anywhere else and are pretty specific to the class itself.

Thanks in advance for any advice!

  • I'd personally move all of that out of the class - I'd create a data structure that I'd initialize by some free function before creating the class, then I'd pass that data structure into the class via a constructor parameter. The class would not be responsible for initialization, except that it would perform some sanity checks (like that the argument is not None), and it would be entirely focused on its specific task. Commented Jul 7, 2022 at 15:19
  • @FilipMilovanović thanks for your input! To clarify, you mean have some function attr_generator defined at the module level which would take in the arguments that currently go into ClassName's __init__ function, perform all necessary computations, and then return e.g. a dictionary of attribute-value pairs? And then pass this into the __init__ function to set the instance attributes?
    – WhoDatBoy
    Commented Jul 7, 2022 at 15:26
  • please don't cross-post: stackoverflow.com/questions/72889479/… "Cross-posting is frowned upon as it leads to fragmented answers splattered all over the network..."
    – gnat
    Commented Jul 7, 2022 at 15:48
  • @gnat - the OP was instructed to post here, and has said in their comment there "I will remove this question in a bit (in case you need to respond)." Commented Jul 7, 2022 at 15:53
  • @FilipMilovanović thank you very much for this answer - I definitely see why this makes sense. I'm not sure if I need to set instance attributes to ensure that backprop works as intended (not sure how PyTorch would handle this), but I will start experimenting and implement some version of this. Thanks again for your help!
    – WhoDatBoy
    Commented Jul 7, 2022 at 16:08

1 Answer

You didn't provide a lot of detail, but this usually happens because you're trying to do too much in one class.

Now, if you don't want to clutter your class with helper functions, you could, as others have suggested on the StackOverflow main site, define your helpers outside of the class.

But, in this case, I'd personally move all of that initialization code out of the class - I'd create a plain data structure that I'd initialize using some free function before creating the class, then I'd pass that data structure into the class via a constructor parameter.

The class would then not be responsible for parameter initialization and the associated decision logic, except that it would perform some sanity checks (like that the argument is not None). It would be entirely focused on the specific task it was designed to do, once the values of the parameters have been decided on by something else.
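As a rough illustration of that split, here is a minimal sketch. Everything in it is made up for the example: the NetworkConfig fields, the build_config rules, and the mode argument are hypothetical stand-ins for your real if-else trees, and the class is a plain Python class so the sketch runs without PyTorch (in your case it would subclass torch.nn.Module and call super().__init__() first).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkConfig:
    """Plain data structure holding values that have already been decided (hypothetical fields)."""
    hidden_sizes: tuple
    activation: str
    dropout: float


def build_config(depth, width, mode="default"):
    """Free function that owns the decision logic (the rules here are invented)."""
    if mode == "small":
        activation, dropout = "relu", 0.0
    elif mode == "regularized":
        activation, dropout = "gelu", 0.5
    else:
        activation, dropout = "relu", 0.1
    return NetworkConfig(
        hidden_sizes=(width,) * depth,
        activation=activation,
        dropout=dropout,
    )


class Network:  # stand-in for a torch.nn.Module subclass
    def __init__(self, config):
        # Only a sanity check; no decision logic lives here
        if config is None:
            raise ValueError("config must not be None")
        self.config = config
        # The layers would be built from self.config at this point


config = build_config(depth=3, width=64, mode="regularized")
net = Network(config)
```

The if-else tree now sits in one named function that can be tested without constructing the network, and __init__ shrinks to validation plus whatever must be set on the instance.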

In your original post on StackOverflow, you expressed your concern that "none of the code would be useful beyond initializing the class", but that's not really an issue. You don't necessarily separate things out because they'd be useful elsewhere, but because doing this lets you make your code cleaner and more readable. It lets you introduce meaningful names and divide work between parts of your codebase.

If what you really have are groups of attributes or parameters - e.g. (and I'm making these up) input params, output params, network params - you could create a number of such plain data classes, each letting you name and represent each group. It doesn't have to be a single data class.

Doing this could potentially simplify your initialization logic, as you'd only need to focus on the subset of values covered by each class. If it becomes simple enough, you could end up distributing it between the __init__ methods of each of these classes.
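A possible shape for this, again only as a sketch with invented names (InputParams, OutputParams and their fields are hypothetical): each group gets its own small data class that validates and derives just its own values. With dataclasses the generated __init__ is kept and the per-group logic goes into __post_init__, which plays the role of the per-class __init__ mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class InputParams:
    n_features: int
    normalize: bool = True

    def __post_init__(self):
        # Validation limited to this group of values
        if self.n_features <= 0:
            raise ValueError("n_features must be positive")


@dataclass
class OutputParams:
    n_classes: int
    # Derived from n_classes in __post_init__, not passed by the caller
    activation: str = field(init=False)

    def __post_init__(self):
        if self.n_classes < 2:
            raise ValueError("n_classes must be at least 2")
        self.activation = "sigmoid" if self.n_classes == 2 else "softmax"


class Network:  # stand-in for a torch.nn.Module subclass
    def __init__(self, input_params, output_params):
        self.input_params = input_params
        self.output_params = output_params


net = Network(InputParams(n_features=10), OutputParams(n_classes=3))
```

Each class only has to reason about its own handful of values, so what used to be one large if-else block becomes a few short, independent ones.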

  • Thanks very much for compiling this into one succinct answer Filip!
    – WhoDatBoy
    Commented Jul 7, 2022 at 16:14
  • I believe it's also "bad practice" to do too much work inside __init__ (but still looking for references on that). Probably at the very least, and a model not listed in your options, you can create separate methods for setting the values, and just set all the attributes to None or whatever inside __init__. Once the object is created you can then call the appropriate initialisers, and even provide another method that given a minimal set of inputs can decide which initialisers to call.
    – NeilG
    Commented Mar 8, 2023 at 0:50
  • @NeilG - yeah, it's general advice across languages not to do "too much work" inside a constructor (where what exactly constitutes "too much" is a bit blurry). Generally speaking, I don't like the "assign values after construction using setters" idea - you should strive to make it difficult to construct objects in an invalid state, unless it represents something like, say, a plain data structure somewhere at the edge of the application that collects user input and is expected to exist in an invalid state part of the time. Commented Mar 8, 2023 at 4:36
  • Thanks @FilipMilovanović, I guess I was trying to find some concrete reasons but some things aren't exact I guess. I think you've nailed it there. The constructor is particularly to ensure you always create a valid object initially. That's probably where the role of the constructor should stop (most of the time).
    – NeilG
    Commented Mar 9, 2023 at 0:23
