8
\$\begingroup\$

Here is my code in Python 2.7 for a Sudoku resolver. Any advice on performance improvement, code bugs or general code style advice is appreciated.

My major idea is:

  1. Using method generate some random data, then using check_invalid to see if it is a valid Sudoku
  2. If from 1, it is a valid Sudoku, then I will call resolve_sudoku to fill valid values

I find my code sometimes run a long time without completion. Are there any code bugs you can find?

import random
from collections import defaultdict
found = False
def check_invalid(matrix):
    # check for each row
    for row in range(len(matrix)):
        cur_row = set()
        for col in range(len(matrix[0])):
            if matrix[row][col] == 0:
                continue
            elif 1 <= matrix[row][col] <= 9:
                if matrix[row][col] in cur_row:
                    return False
                else:
                    cur_row.add(matrix[row][col])
            else:
                return False # invalid number
    # check each col
    for col in range(len(matrix[0])):
        cur_col = set()
        for row in range(len(matrix)):
            if matrix[row][col] == 0:
                continue
            elif 1 <= matrix[row][col] <= 9:
                if matrix[row][col] in cur_col:
                    return False
                else:
                    cur_col.add(matrix[row][col])
            else:
                return False # invalid number
    # check each 3*3 square
    for start_row in [0,3,6]:
        for start_col in [0,3,6]:
            cur_square = set()
            for row in range(start_row, start_row+3):
                for col in range(start_col, start_col + 3):
                    if matrix[row][col] == 0:
                        continue
                    elif 1 <= matrix[row][col] <= 9:
                        if matrix[row][col] not in cur_square:
                            cur_square.add(matrix[row][col])
                        else:
                            return False
                    else:
                        return False # invalid value
    return True

def resolve_sudoku(matrix, row_map, col_map, square_map, cur_row, cur_col):
    global found
    if found:
        return
    if cur_row == len(matrix):
        found = True
        for r in matrix:
            print r
    elif cur_col == len(matrix[0]):
        resolve_sudoku(matrix, row_map, col_map, square_map, cur_row+1, 0)
    elif matrix[cur_row][cur_col] != 0:
        resolve_sudoku(matrix, row_map, col_map, square_map, cur_row, cur_col+1)
    else:
        for val in range(1,10):
            square_x = cur_row / 3
            square_y = cur_col / 3
            if val in row_map[cur_row] or val in col_map[cur_col] or val in square_map[(square_x, square_y)]:
                continue
            else:
                row_map[cur_row].add(val)
                col_map[cur_col].add(val)
                square_map[(square_x, square_y)].add(val)
                matrix[cur_row][cur_col] = val
                resolve_sudoku(matrix, row_map, col_map, square_map, cur_row, cur_col+1)
                row_map[cur_row].remove(val)
                col_map[cur_col].remove(val)
                square_map[(square_x, square_y)].remove(val)
                matrix[cur_row][cur_col] = 0
                if found:
                    return
if __name__ == "__main__":
    matrix = []
    for row in range(9):
        cur_row = []
        for col in range(9):
            if random.random() < 0.1:
                cur_row.append(random.randint(1,9))
            else:
                cur_row.append(0)
        matrix.append(cur_row)
    for r in matrix:
        print r
    re = check_invalid(matrix)
    print re
    if re:
        # init for row map and col map
        row_map = defaultdict(set)
        col_map = defaultdict(set)
        square_map = defaultdict(set)
        for row in range(len(matrix)):
            for col in range(len(matrix[0])):
                square_x = row / 3
                square_y = row / 3
                if matrix[row][col] != 0:
                    row_map[row].add(matrix[row][col])
                    col_map[col].add(matrix[row][col])
                    square_map[(row, col)].add(matrix[row][col])
        resolve_sudoku(matrix, row_map, col_map, square_map, 0, 0)
\$\endgroup\$
3
  • 1
    \$\begingroup\$ Right now, without a set debugger, I have noticed that after running your code literally 100 times, the only time I do not get any output is if it runs for 5 seconds. This must be a time out feature. As for bugs, relook at your loops to see if you can improve its speed. This 5-second timeout occurs in both my Online Console Compiler and in Sphere Engine. If I see anything that can be improved, I'll post it as an answer. \$\endgroup\$ Commented Feb 26, 2017 at 5:01
  • \$\begingroup\$ Thanks George, vote up for your advice. If you found any bugs, please let me know. BTW, I run in PyCharm with Python 2.7, it does not run to complete in some cases. \$\endgroup\$
    – Lin Ma
    Commented Mar 5, 2017 at 1:34
  • \$\begingroup\$ Noted. I have that too. Will use to work your sources. Thanks. \$\endgroup\$ Commented Mar 6, 2017 at 5:29

2 Answers 2

4
+50
\$\begingroup\$

Let's break it down, piece by piece.

Generating a matrix

First, move the "generating a potential sudoku matrix" into a separate function. And, we can improve it by making use of list comprehensions:

def generate_sudoku_matrix():
    """Generates a 9x9 matrix with random numbers from 1 to 9, leaving 0s as unsolved."""
    return [[0 if random.random() < 0.1 else random.randint(1, 9)
             for _ in range(9)]
            for _ in range(9)]

Note the use of _ for throw-away variables.

Validating a matrix

First of all, I would call the function check_valid() instead of check_invalid() since you are returning True in case a matrix is valid. Though, a better name would probably be something like is_valid_sudoku().

The function itself is over-complicated and also violates the DRY principle because you basically repeat the same exact check for rows, columns and 3x3 squares.

What if instead we would define a reusable function that would validate a "block" (row, column or square):

def is_valid_block(block):
    """Returns true if a block (row, column or square) is a valid Sudoku block."""
    return len(block) == 9 and sum(block) == sum(set(block))

And apply it to all the 3 cases:

from itertools import chain


def is_valid_sudoku(matrix):
    """Returns True if a Sudoku matrix is valid, False otherwise."""

    # check each row
    for row in matrix:
        if not is_valid_block(row):
            return False

    # check each column
    for col in zip(*matrix):
        if not is_valid_block(col):
            return False

    # check each 3x3 square
    for i in range(0, 9, 3):
        for j in range(0, 9, 3):
            square = list(chain(row[j:j + 3] for row in matrix[i:i + 3]))
            if not is_valid_block(square):
                return False

    return True

We can also take it a step further and make use of generators and all():

def generate_blocks(matrix):
    """Generates rows, columns and 3x3 squares for a Sudoku matrix."""
    for row in matrix:
        yield row

    for col in zip(*matrix):
        yield col

    for i in range(0, 9, 3):
        for j in range(0, 9, 3):
            yield list(chain(*(row[j:j + 3] for row in matrix[i:i + 3])))


def is_valid_sudoku(matrix):
    """Returns True if a Sudoku matrix is valid, False otherwise."""
    return all(is_valid_block(block) for block in generate_blocks(matrix))

Solving Sudoku

First of all, move the initial map setup to a separate function to keep the "main" function clean and readable.

I don't particularly like the way you handle exiting the function, using the global found variable. It makes difficult to follow the flow of the program and hurts modularity and separation of concerns.

The other place for improvements is the temporary map and matrix modifications and rollbacks. Though, I am not sure if copying the data structures to pass it to the next recursive call would be cleaner and faster.

I have never written a sudoku solver before, but here is how I would think about implementing the brute-force solution (I guess that is just a different approach and it's not related to the code review):

  • define the is_valid_solved_sudoku() function that would validate if a matrix is a valid solved sudoku matrix. The function would be quite similar to the is_valid_sudoku() except the way you would check each of the "blocks" - you would not allow for 0 anymore
  • then, the is_valid_solved_sudoku() positive result will be our recursive base condition
  • then, on each recursive call, we find the position of the next unsolved cell (with 0 value)
  • for each row_index, col_index of the 0 cell, get the set of impossible values/excluded numbers from the corresponding row, column and square
  • make a recursive call for every number excluding the numbers from a set of excluded numbers we've constructed on the previous step

Code Style and other issues and suggestions:

  • use pretty-printing via pprint instead of a regular print
  • unused square_x and square_y variables in the "main" part of the script (that might be actually an error/bug, since, I was expecting you to use them to make a key for the square_map dictionary)
  • organize imports per PEP8 (reference)
  • two blank lines between the functions (reference)
\$\endgroup\$
1
  • \$\begingroup\$ Thanks for the answer alecxe, mark your reply as answer and grant bounty point. \$\endgroup\$
    – Lin Ma
    Commented Mar 12, 2017 at 23:49
1
\$\begingroup\$

What comes to performance, it is not due to your algorithm (which is very efficient), but rather a couple of bugs in it:

def resolve_sudoku(matrix, row_map, col_map, square_map, cur_row, cur_col):
    global found
    if found:
        return
    if cur_row == len(matrix):
        found = True
        for r in matrix:
            print r
    elif cur_col == len(matrix[0]):
        resolve_sudoku(matrix, row_map, col_map, square_map, cur_row+1, 0)
    elif matrix[cur_row][cur_col] != 0:
        resolve_sudoku(matrix, row_map, col_map, square_map, cur_row, cur_col+1)
    else:
        for val in range(1,10):
            # square_x = cur_row / 3 # <- bug
            square_x = cur_col / 3
            # square_y = cur_col / 3 # <- bug
            square_y = cur_row / 3
            if val in row_map[cur_row] or val in col_map[cur_col] or val in square_map[(square_x, square_y)]:
                continue
            else:
                row_map[cur_row].add(val)
                col_map[cur_col].add(val)
                square_map[(square_x, square_y)].add(val)
                matrix[cur_row][cur_col] = val
                resolve_sudoku(matrix, row_map, col_map, square_map, cur_row, cur_col+1)
                row_map[cur_row].remove(val)
                col_map[cur_col].remove(val)
                square_map[(square_x, square_y)].remove(val)
                matrix[cur_row][cur_col] = 0
                if found:
                    return
\$\endgroup\$
1
  • \$\begingroup\$ Nice catch coderodde, vote up! \$\endgroup\$
    – Lin Ma
    Commented Mar 12, 2017 at 23:49

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.