Documenting/defining data structures in Python

Question

What is the preferred way to document the contents of and logic behind dynamically generated data structures in Python? E.g. a dict which contains the mapping of a string to a list of lists of strings?

I find it hard to convey the meaning and algorithmic workings of the data elements through Python code alone. It is unsatisfying and sometimes counterproductive to try to shape the code which builds such a dict into a form which easily tells another programmer about the structure of a dict entry. Similarly, placing a comment with an example entry is anything but optimal in my eyes.

user69037 · Accepted Answer · 2014-01-14 01:05:36Z

If you are writing code that may be used by other programmers in isolation from you, and you are using data structures that go beyond simple built-ins, consider whether you can encapsulate them as classes.

Let's look at some example methods and what they return.

def getMatrix(**args):

Calling this, I can presume that the output will be a list of lists with some simple value inside. Not much help needed.

def getCharacterCounts(**args):

Here I could presume that, based on the arguments, I'll get a dictionary where each key is a character and each value is a count of that character's occurrences in some string.

def getPerson(**args):

Ah, now we are getting into trouble. A person is a complex thing, and I could store it as a dictionary, like so:

person = {
    "name": "Bob",
    "age": 37,
    "date_of_birth": "1932-01-15",
    "hobbies": ["programming", "arguing on the internet", "cross stitch"],
}

But as this data structure gets larger and gains more fields, the example gets bigger, and the documentation needs to start explaining how to resolve issues when values conflict. For example, unless we are transported back to the year 1969, the date_of_birth and age are inconsistent.

In summation, if you are using a data structure that is varied in structure and includes some "algorithmic workings", then perhaps it's time to call upon the power of Object-Oriented Programming^TM.
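As a minimal sketch of that idea (the `Person` class and its `age_on` method are illustrative names, not something from the original example), the person record could be written as a small class. Here the age is derived from the date of birth instead of being stored, so the two can no longer disagree, and the field list doubles as documentation of the structure.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Person:
    """A person record. Age is computed, so it cannot contradict date_of_birth."""

    name: str
    date_of_birth: date
    hobbies: List[str] = field(default_factory=list)

    def age_on(self, today: date) -> int:
        """Return the age in whole years on the given day."""
        # Has this year's birthday already happened on `today`?
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)


bob = Person("Bob", date(1932, 1, 15), ["programming", "cross stitch"])
```

Calling `bob.age_on(date(1969, 6, 1))` gives 37, which matches the example above only because we asked about the year 1969.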

The downside, though, is that the in-situ functional programming mechanisms of Python are somehow unhinged by this... if I use a class, I should adhere to clean OO techniques instead of e.g. dirty-constructing objects with a list comprehension. — Vroomfondel, Commented Jan 15, 2014 at 8:25
That's not necessarily true: if you have an object that can conform to a list (e.g. a sportsball team), you can subclass the list type and operate on the new object. — user69037, Commented Jan 15, 2014 at 22:40
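A rough sketch of what that comment describes, with a hypothetical `Team` class (the class name, the `name` attribute, and the `longest_name` helper are all assumptions made for illustration). The object is still built with a list comprehension, yet it carries its own documentation and behaviour.

```python
class Team(list):
    """A list of player names that also knows which team it belongs to."""

    def __init__(self, name, players=()):
        super().__init__(players)
        self.name = name

    def longest_name(self):
        """Return the first player whose name has the greatest length."""
        return max(self, key=len)


raw_names = [" alice ", "bob", " carol"]
# The comprehension feeds straight into the constructor.
team = Team("Sportsball United", [p.strip().title() for p in raw_names])
```

One caveat with this approach: built-in list operations such as slicing return a plain `list`, not a `Team`, so the extra attributes are lost unless those methods are overridden.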

Filip Malczak · Accepted Answer · 2014-01-14 01:41:21Z

Names. They should mean something. Argument names shouldn't be d or a, but rather something like user_id_to_instance_dict.

Docstrings. Functions and classes can have documentation strings; you can explain types and structure in them:

class UserRegistry:
    """
    Here we describe what this class does.

    Fields:
    - some_public_field - this field is used to store such and such data

    Private fields:
    - _user_id_to_instance - this is a dictionary mapping user ID to User instance.
    """
    ...  # class code goes here

Annotations. Python 3 allows us to annotate function arguments and return values with arbitrary Python expressions.

def count_women(self, id_to_instance_dict: "dict(user id -> User instance)") -> int:
    ...  # method body omitted

def average_age(self, id_to_instance_dict: dict) -> float:
    ...  # method body omitted
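For the structure from the question (a dict mapping a string to a list of lists of strings), later Python versions also ship a standard `typing` module that lets the annotation spell out the nesting. The following is only a sketch: the alias name `TableBySection` and the function `count_cells` are made up for illustration.

```python
from typing import Dict, List

# Hypothetical alias: section name -> rows -> cell strings.
TableBySection = Dict[str, List[List[str]]]


def count_cells(tables: TableBySection) -> int:
    """Count every string stored in every row of every section."""
    return sum(len(row) for rows in tables.values() for row in rows)


example: TableBySection = {
    "intro": [["a", "b"], ["c"]],
    "end": [[]],
}
```

A named alias gives the structure one place to be documented, and every function signature that uses it inherits that explanation.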

Comments. You can also describe fields with comments:

class UserRegistry:
    some_field = {}  # dict(str -> int)

    def __init__(self):
        self.another_field = []  # list(tuple(str, str))

PS. Can someone help me with formatting?
