
This problem is from Automate The Boring Stuff using Python - Chapter 7.

Write a function that takes a string and does the same thing as the strip() string method. If no other arguments are passed other than the string to strip, then whitespace characters will be removed from the beginning and end of the string. Otherwise, the characters specified in the second argument to the function will be removed from the string.

It takes a string and a character as input and returns a new, stripped string.

#regexStrip.py - Regex Version of strip()

import re

def regex_strip(s, char=None):
    """
    Write a function that takes a string and does the same thing as the strip()
    string method. If no other arguments are passed other than the string to
    strip, then whitespace characters will be removed from the beginning and
    end of the string. Otherwise, the characters specified in the second
    argument to the function will be removed from the string.
    """

    if not char:
        strip_left = re.compile(r'^\s*') #string starting with whitespace
        strip_right = re.compile(r'\s*$') #string ending with whitespace

        s = re.sub(strip_left, "", s) #replacing strip_left with "" in string s
        s = re.sub(strip_right, "", s) #replacing strip_right with "" in string s

    else:
        strip_char = re.compile(char)
        s = re.sub(strip_char, "", s)
    return s

if __name__ == '__main__':
    string_to_be_stripped = input("Enter string to be stripped: ")
    char_to_be_removed = input("Enter character to be removed, if none press enter: ")
    print(regex_strip(string_to_be_stripped, char_to_be_removed))        

Output:

Enter string to be stripped: foo, bar, cat
Enter character to be removed, if none press enter: ,
foo bar cat
  • You should add an automatic test for regex_strip('[in brackets]', '[]'). The result should be 'in brackets'. Commented Jul 8, 2019 at 6:57

2 Answers


Pre-compilation

In theory, you're going to want to call this function more than once, so you want to pay the cost of regex compilation only once. Move your re.compile calls out of the function and assign the compiled patterns to variables in the module's global scope.
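
A minimal sketch of that idea, covering only the whitespace case. The names LEADING_WS, TRAILING_WS and strip_whitespace are made up for illustration, and the patterns use + instead of * so that empty matches are never substituted. Python's re module also caches recently compiled patterns, so the saving may be modest in practice.

```python
import re

# Compiled once at import time and reused by every call.
LEADING_WS = re.compile(r'^\s+')
TRAILING_WS = re.compile(r'\s+$')


def strip_whitespace(s: str) -> str:
    """Remove leading and trailing whitespace with the pre-compiled patterns."""
    s = LEADING_WS.sub('', s)
    return TRAILING_WS.sub('', s)
```

Calling strip_whitespace repeatedly then reuses the same two pattern objects instead of going through re.compile on every call.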

Type-hinting

Annotate the parameters: s is s: str, and char is (I think) also char: str, or Optional[str] given its None default.

'Removed from the string'?

I think this is the fault of unclear requirements, but - it would make more sense for the char argument to be the character[s] to strip from the string edges, not characters to remove from anywhere in the string. As such, you would need to re-evaluate how you create your regex.
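
One way this edge-only reading could look, assuming the second argument is a set of characters in the same sense as str.strip. The function name strip_edges and the parameter name chars are hypothetical, not part of the original code.

```python
import re
from typing import Optional


def strip_edges(s: str, chars: Optional[str] = None) -> str:
    """Strip chars (whitespace when chars is None or empty) from both ends only."""
    if not chars:
        char_class = r'\s'
    else:
        # re.escape keeps characters such as ']', '^' and '-' literal
        # inside the character class.
        char_class = '[' + re.escape(chars) + ']'
    # \A and \Z anchor to the true start and end of the string, so
    # characters in the middle are never touched.
    pattern = re.compile(r'\A{0}+|{0}+\Z'.format(char_class))
    return pattern.sub('', s)
```

With this sketch, strip_edges('[in brackets]', '[]') gives 'in brackets', while strip_edges('foo, bar, cat', ',') leaves the string unchanged because no comma sits at an edge.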

Combine left and right

There's no need for two regexes. You can use one with a capturing group:

^\s*(.*?)\s*$
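
A possible way to use that pattern for the whitespace case. The names STRIP_WS and strip_with_group are illustrative, and the re.DOTALL flag is an addition of this sketch: without it, . does not match newlines, so a string with a line break in the middle would not match at all.

```python
import re

# One pattern for both edges; group 1 captures the text between them.
# DOTALL lets '.' cross newlines inside the text.
STRIP_WS = re.compile(r'^\s*(.*?)\s*$', re.DOTALL)


def strip_with_group(s: str) -> str:
    """Return s without leading and trailing whitespace."""
    match = STRIP_WS.match(s)
    # Every part of the pattern can match the empty string, so a match
    # always exists and group(1) is always available.
    return match.group(1)
```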

With unsanitized user input, re.compile(char) is dangerous. You should use re.compile(re.escape(char)), which will allow you to strip the asterisks from "***Winner***", instead of crashing with an invalid regular expression.
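
A short sketch of the difference. The helper name remove_literal is hypothetical, and it keeps the question's remove-everywhere behaviour rather than stripping only the edges.

```python
import re


def remove_literal(s: str, chars: str) -> str:
    """Remove every literal occurrence of chars from s."""
    # re.escape turns regex metacharacters such as '*' into literals.
    return re.sub(re.escape(chars), '', s)


# A bare '*' is not a valid pattern, so compiling it raises re.error.
try:
    re.compile('*')
except re.error as exc:
    print('unescaped pattern rejected:', exc)

print(remove_literal('***Winner***', '*'))
```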

See also this question and related answers for a different interpretation of the question’s intent for stripping other characters.
