Motivation
Recently, Buzzfeed released a browser-based puzzle game called "Pyramid Scheme." It is available at the following link. The puzzle in an initial state looks like this:
To solve the puzzle, you have to find sequences of adjacent nodes in the pyramid that form valid words (3 letters long or longer) in English. Once all the nodes have been used in words, the puzzle is solved. For example, here is the completed puzzle (spoilers added as this is today's puzzle):
I thought it would be fun to write a solver for this puzzle using graph algorithms and the python package networkx
.
Solution method
I represent the pyramid as a undirected graph, where letters are represented as nodes, and neighboring letters are represented as edge connections in the graph.
My approach follows a 2-level backtracking/depth-first-search approach. The first level finds valid sets of words given the original graph. The second level finds a single valid word, given a subset of the original graph. A sketch of the approach follows:
(+) Graph solver
If the graph is invalid, backtrack.
If the graph is empty (i.e., finished), then return the set of node lists.
Otherwise:
A. Find a list of nodes in the graph that corresponds to a valid word (*)
B. Remove these nodes from the current graph.
C. Add the found node list to the previous set of node lists.
D. Recurse (+).
(*) Node list solver
If the list of nodes is invalid, backtrack.
If the list of nodes corresponds to a valid English word (checks a dictionary), yield it to the first backtracking solver (+).
Otherwise:
A. Extend the node path to the next adjacent node.
B. Recurse (*).
The two backtracking levels are similar, but with a critical difference. The node path solver (*) can extend previous solutions. For example in the first image, the valid word pro
could be extended to another valid word probe
. On the other hand, the graph solver (+) cannot extend previous solutions, as there is no remaining graph to search over.
Solution implementation
The following block is my implementation of my approach. Here, wordlist.txt
contains many common words in English (e.g., found on github).
# Solver for Buzfeed's "Pyramid Scheme" game
import networkx as nx
class PyramidSolver:
""" Solves pyramid puzzles of base length 5 """
def __init__(self,inputlist,filename_validwords):
self.inputlist = inputlist
self.letterlist = sum(inputlist, [])
self.pyramid_size = 5
self.validwords = self.get_validwords(filename_validwords)
self.found_solutions = set()
self.adjency_graph = nx.Graph({
0 : [1,2],
1 : [0,2,3,4],
2 : [0,1,4,5],
3 : [1,4,6,7],
4 : [1,2,3,5,7,8],
5 : [2,4,8,9],
6 : [3,7,10,11],
7 : [3,4,6,8,11,12],
8 : [4,5,7,9,12,13],
9 : [5,8,13,14],
10 : [6,11],
11 : [6,7,10,12],
12 : [7,8,11,13],
13 : [8,9,12,14],
14 : [9,13]
})
def __repr__(self):
""" Pretty printing for pyramid puzzle """
out_string = "\n"
rows = self.pyramid_size
for i in range(rows):
for _ in range(1, rows-i):
out_string += " "
for j in range(i+1):
letter = self.inputlist[i][j]
out_string += letter
out_string += " "
out_string += "\n"
return out_string
def get_validwords(self,filename):
""" Loads the set of valid words from a file """
with open(filename) as f:
lines = f.readlines()
validwords = set(map(lambda x: x.replace("\n",""), lines))
return validwords
def nodelist_to_string(self,nodelist):
""" Converts nodes in the pyramid to a corresponding string """
return ''.join(map(lambda i: self.letterlist[i], nodelist))
def solve(self):
""" Finds solutions to the pyramid puzzle """
for nodes_solution in self.backtrack_graph(self.adjency_graph):
wordset_solution = tuple(map(self.nodelist_to_string, nodes_solution))
if wordset_solution not in self.found_solutions:
self.found_solutions.add(wordset_solution)
yield wordset_solution
# Graph methods
def backtrack_graph(self,graph,nodelist_set=set()):
""" Search the letter graph for a valid word set, using backtracking """
if self.reject_graph(graph):
return None
if self.accept_graph(graph):
yield nodelist_set
return None # Backtrack previous solution
for (new_graph, new_nodelist_set) in self.next_graph(graph,nodelist_set):
yield from self.backtrack_graph(new_graph,new_nodelist_set)
def reject_graph(self,graph):
""" Reject graphs with any components containing n nodes, 0 < n < 3 """
graph_components = list(nx.connected_components(graph))
if len(graph_components) == 0: # Graph is empty, will be accepted
return False
smallest_component = min(graph_components, key=len)
return (len(smallest_component) < 3)
def accept_graph(self,graph):
""" Valid solution if the remaining graph is empty """
return (len(graph) == 0)
def next_graph(self,graph,nodelist_set):
""" Generate next graph by removing the nodes of valid words """
for nodelist in self.backtrack_nodelist(graph):
new_graph = self.remove_pathnodes(graph,nodelist)
new_nodelist_set = nodelist_set.copy()
new_nodelist_set.add(nodelist)
yield new_graph, new_nodelist_set
def remove_pathnodes(self,graph,nodelist):
""" Remove valid nodes from graph """
new_graph = graph.copy()
new_graph.remove_nodes_from(nodelist)
return new_graph
# Nodelist methods
def backtrack_nodelist(self,graph,nodelist=[]):
""" Search the letter graph for a valid word, using backtracking """
if self.reject_nodelist(nodelist):
return None
if self.accept_nodelist(nodelist):
yield tuple(nodelist)
for new_nodelist in self.next_nodelist(graph,nodelist):
yield from self.backtrack_nodelist(graph,new_nodelist)
def reject_nodelist(self,nodelist):
""" Reject lists with repeated elements """
return (len(nodelist) != len(set(nodelist)))
def accept_nodelist(self,nodelist):
""" Accept if the word is 3 letters or longer & in the word set """
if len(nodelist) >= 3:
return self.nodelist_to_string(nodelist) in self.validwords
return False
def next_nodelist(self,graph,nodelist):
""" Generate next node list with +1 node """
for node in self.new_nodes(graph,nodelist):
yield nodelist + [node]
def new_nodes(self,graph,nodelist):
""" Enumerate first nodes, or next adjacent nodes """
if (len(nodelist) == 0):
return graph.nodes
else:
return graph.neighbors(nodelist[-1])
if __name__ == "__main__":
# Initialize pyramid and list of valid words
my_pyramid = PyramidSolver(
[
['p'],
['i','r'],
['a','a','o'],
['n','t','i','b'],
['o','g','u','e','o'],
],
"wordlist.txt"
)
print(my_pyramid)
# Solve for valid configurations
for solution in my_pyramid.solve():
print(solution)
This code produces the expected output (spoilers added as this is today's puzzle):
p
i r
a a o
n t i b
o g u e o
('oboe', 'piano', 'guitar')
Discussion
What I am looking for feedback on:
- The overall approach, both for the two-step backtracking solver structure, as well as the graph algorithms I used.
- I have repeated the backtracking pattern twice. Is there a way to only write the pattern once?
- The node path solver wastes a lot of time on strings that are obviously not valid English words. For example, no English words start with
rp
, so that solution branch should be discarded early. How might one implement this? - I would appreciate feedback on the overall readability and presentation of the code.
- Feedback on other topics is also welcome.
lambda x: x.replace("\n",""),
is a long way of spellingstr.rstrip
. There is an opportunity for early pruning of valid words based on the input puzzle's character set, e.g. the example puzzle lacks "s" so we don't need to treat "s" words as valid. Consider maintaining a sorted list of valid words. \$\endgroup\$adjency_graph
. \$\endgroup\$get_validwords
should be closer towith open(filename) as f: return set(line.rstrip().lower() for line in f if not line.startswith('#!comment:'))
. While strictly speaking for your particular usage there is no need to convert to lower case (both 'a' and 'A' are present in the word list) or to filter out comments, this would be the way to get an actual list of words while removing duplicates that differ only in case. For your purposes you could do:return set(line.rstrip() for line in f)
\$\endgroup\$