Backtracking graph algorithms for a word puzzle in Python

Question

Motivation

Recently, Buzzfeed released a browser-based puzzle game called "Pyramid Scheme." It is available at the following link. The puzzle in an initial state looks like this:

To solve the puzzle, you have to find sequences of adjacent nodes in the pyramid that form valid words (3 letters long or longer) in English. Once all the nodes have been used in words, the puzzle is solved. For example, here is the completed puzzle (spoilers added as this is today's puzzle):

I thought it would be fun to write a solver for this puzzle using graph algorithms and the python package networkx.

Solution method

I represent the pyramid as a undirected graph, where letters are represented as nodes, and neighboring letters are represented as edge connections in the graph.

My approach follows a 2-level backtracking/depth-first-search approach. The first level finds valid sets of words given the original graph. The second level finds a single valid word, given a subset of the original graph. A sketch of the approach follows:

(+) Graph solver

If the graph is invalid, backtrack.
If the graph is empty (i.e., finished), then return the set of node lists.
Otherwise:

A. Find a list of nodes in the graph that corresponds to a valid word (*)

B. Remove these nodes from the current graph.

C. Add the found node list to the previous set of node lists.

D. Recurse (+).

(*) Node list solver

If the list of nodes is invalid, backtrack.
If the list of nodes corresponds to a valid English word (checks a dictionary), yield it to the first backtracking solver (+).
Otherwise:

A. Extend the node path to the next adjacent node.

B. Recurse (*).

The two backtracking levels are similar, but with a critical difference. The node path solver (*) can extend previous solutions. For example in the first image, the valid word pro could be extended to another valid word probe. On the other hand, the graph solver (+) cannot extend previous solutions, as there is no remaining graph to search over.

Solution implementation

The following block is my implementation of my approach. Here, wordlist.txt contains many common words in English (e.g., found on github).

# Solver for Buzfeed's "Pyramid Scheme" game
import networkx as nx

class PyramidSolver:
    """ Solves pyramid puzzles of base length 5 """

    def __init__(self,inputlist,filename_validwords):
        self.inputlist = inputlist
        self.letterlist = sum(inputlist, [])
        self.pyramid_size = 5
        self.validwords = self.get_validwords(filename_validwords)
        self.found_solutions = set()
        self.adjency_graph = nx.Graph({
            0 : [1,2],
            1 : [0,2,3,4],
            2 : [0,1,4,5],
            3 : [1,4,6,7],
            4 : [1,2,3,5,7,8],
            5 : [2,4,8,9],
            6 : [3,7,10,11],
            7 : [3,4,6,8,11,12],
            8 : [4,5,7,9,12,13],
            9 : [5,8,13,14],
            10 : [6,11],
            11 : [6,7,10,12],
            12 : [7,8,11,13],
            13 : [8,9,12,14],
            14 : [9,13]
        })

    def __repr__(self):
        """ Pretty printing for pyramid puzzle """
        out_string = "\n"
        rows = self.pyramid_size
        for i in range(rows):
            for _ in range(1, rows-i):
                out_string += " "
            for j in range(i+1):
                letter = self.inputlist[i][j]
                out_string += letter
                out_string += " "
            out_string += "\n"
        return out_string

    def get_validwords(self,filename):
        """ Loads the set of valid words from a file """
        with open(filename) as f:
            lines = f.readlines()
        validwords = set(map(lambda x: x.replace("\n",""), lines))
        return validwords
    
    def nodelist_to_string(self,nodelist):
        """ Converts nodes in the pyramid to a corresponding string """
        return ''.join(map(lambda i: self.letterlist[i], nodelist))
    
    def solve(self):
        """ Finds solutions to the pyramid puzzle """
        for nodes_solution in self.backtrack_graph(self.adjency_graph):
            wordset_solution = tuple(map(self.nodelist_to_string, nodes_solution))
            if wordset_solution not in self.found_solutions:
                self.found_solutions.add(wordset_solution)
                yield wordset_solution
    
    # Graph methods

    def backtrack_graph(self,graph,nodelist_set=set()):
        """ Search the letter graph for a valid word set, using backtracking """
        if self.reject_graph(graph):
            return None 
        if self.accept_graph(graph): 
            yield nodelist_set 
            return None # Backtrack previous solution
        for (new_graph, new_nodelist_set) in self.next_graph(graph,nodelist_set): 
            yield from self.backtrack_graph(new_graph,new_nodelist_set)

    def reject_graph(self,graph):
        """ Reject graphs with any components containing n nodes, 0 < n < 3 """
        graph_components = list(nx.connected_components(graph))
        if len(graph_components) == 0: # Graph is empty, will be accepted
            return False
        smallest_component = min(graph_components, key=len)
        return (len(smallest_component) < 3) 

    def accept_graph(self,graph):
        """ Valid solution if the remaining graph is empty """
        return (len(graph) == 0)

    def next_graph(self,graph,nodelist_set):
        """ Generate next graph by removing the nodes of valid words """
        for nodelist in self.backtrack_nodelist(graph): 
            new_graph = self.remove_pathnodes(graph,nodelist)
            new_nodelist_set = nodelist_set.copy()
            new_nodelist_set.add(nodelist)
            yield new_graph, new_nodelist_set

    def remove_pathnodes(self,graph,nodelist):
        """ Remove valid nodes from graph """
        new_graph = graph.copy()
        new_graph.remove_nodes_from(nodelist)
        return new_graph
    
    # Nodelist methods

    def backtrack_nodelist(self,graph,nodelist=[]):
        """ Search the letter graph for a valid word, using backtracking """
        if self.reject_nodelist(nodelist):
            return None 
        if self.accept_nodelist(nodelist): 
            yield tuple(nodelist) 
        for new_nodelist in self.next_nodelist(graph,nodelist):
            yield from self.backtrack_nodelist(graph,new_nodelist)

    def reject_nodelist(self,nodelist):
        """ Reject lists with repeated elements """
        return (len(nodelist) != len(set(nodelist)))

    def accept_nodelist(self,nodelist):
        """ Accept if the word is 3 letters or longer & in the word set """
        if len(nodelist) >= 3:
            return self.nodelist_to_string(nodelist) in self.validwords
        return False

    def next_nodelist(self,graph,nodelist):
        """ Generate next node list with +1 node """
        for node in self.new_nodes(graph,nodelist):
            yield nodelist + [node]

    def new_nodes(self,graph,nodelist):
        """ Enumerate first nodes, or next adjacent nodes """
        if (len(nodelist) == 0):
            return graph.nodes
        else:
            return graph.neighbors(nodelist[-1])
            

if __name__ == "__main__":
    
    # Initialize pyramid and list of valid words
    my_pyramid = PyramidSolver(
        [
                    ['p'],
                  ['i','r'],
                ['a','a','o'],
              ['n','t','i','b'],
            ['o','g','u','e','o'],
        ],
        "wordlist.txt"
    )
    print(my_pyramid)

    # Solve for valid configurations
    for solution in my_pyramid.solve():
        print(solution)

This code produces the expected output (spoilers added as this is today's puzzle):

    p 
   i r 
  a a o 
 n t i b 
o g u e o

('oboe', 'piano', 'guitar')

Discussion

What I am looking for feedback on:

The overall approach, both for the two-step backtracking solver structure, as well as the graph algorithms I used.
I have repeated the backtracking pattern twice. Is there a way to only write the pattern once?
The node path solver wastes a lot of time on strings that are obviously not valid English words. For example, no English words start with rp, so that solution branch should be discarded early. How might one implement this?
I would appreciate feedback on the overall readability and presentation of the code.
Feedback on other topics is also welcome.

When getting valid words, lambda x: x.replace("\n",""), is a long way of spelling str.rstrip. There is an opportunity for early pruning of valid words based on the input puzzle's character set, e.g. the example puzzle lacks "s" so we don't need to treat "s" words as valid. Consider maintaining a sorted list of valid words. — J_H, Commented Jun 11, 2023 at 3:35
For words that have a proper prefix being an eligible word I see a problem in the algorithm sketched: cannot can't be found due to can. — greybeard, Commented Jun 12, 2023 at 11:04
The body of get_validwords should be closer to with open(filename) as f: return set(line.rstrip().lower() for line in f if not line.startswith('#!comment:')). While strictly speaking for your particular usage there is no need to convert to lower case (both 'a' and 'A' are present in the word list) or to filter out comments, this would be the way to get an actual list of words while removing duplicates that differ only in case. For your purposes you could do: return set(line.rstrip() for line in f) — Booboo, Commented Jul 3, 2023 at 11:25

Stack Exchange Network

Backtracking graph algorithms for a word puzzle in Python

Motivation

Solution method

Solution implementation

Discussion

0

Hot Network Questions

Backtracking graph algorithms for a word puzzle in Python

Motivation

Solution method

Solution implementation

Discussion

0

Know someone who can answer? Share a link to this question via email, Twitter, or Facebook.

Related

Hot Network Questions