5
\$\begingroup\$

Problem

Validate if a given string can be interpreted as a decimal or scientific number.

Some examples:

"0" => true
" 0.1 " => true
"abc" => false
"1 a" => false
"2e10" => true
" -90e3   " => true
" 1e" => false
"e3" => false
" 6e-1" => true
" 99e2.5 " => false
"53.5e93" => true
" --6 " => false
"-+3" => false
"95a54e53" => false

Code

I've solved LeetCode's Valid Number problem using Python's re module. I'd appreciate a review of the code and any recommendations for changes or improvements.

import re
from typing import Optional

def is_numeric(input_string: Optional[str]) -> bool:
    """
    Returns True for valid numbers and input string can be string or None
    """
    if input_string is None:
        return False

    expression_d_construct = r"^[+-]?(?:\d*\.\d+|\d+\.\d*|\d+)[Ee][+-]?\d+$|^[+-]?(?:\d*\.\d+|\d+\.\d*|\d+)$|^[+-]?\d+$"
    expression_char_class = r"^[+-]?(?:[0-9]*\.[0-9]+|[0-9]+\.[0-9]*|[0-9]+)[Ee][+-]?[0-9]+$|^[+-]?(?:[0-9]*\.[0-9]+|[0-9]+\.[0-9]*|[0-9]+)$|^[+-]?[0-9]+$"

    if re.match(expression_d_construct, input_string.strip()) is not None and re.match(expression_char_class, input_string.strip()) is not None:
        return True
    return False


if __name__ == "__main__":
    # ---------------------------- TEST ---------------------------
    DIVIDER_DASH = '-' * 50
    GREEN_APPLE = '\U0001F34F'
    RED_APPLE = '\U0001F34E'

    test_input_strings = [None, "0  ", "0.1", "abc", "1 a", "2e10", "-90e3",
                          "1e", "e3", "6e-1", "99e2.5", "53.5e93", "--6", "-+3", "95a54e53"]

    count = 0
    for string in test_input_strings:
        print(DIVIDER_DASH)
        if is_numeric(string):
            print(f'{GREEN_APPLE} Test {int(count + 1)}: {string} is a valid number.')
        else:
            print(f'{RED_APPLE} Test {int(count + 1)}: {string} is an invalid number.')
        count += 1

Output

--------------------------------------------------
🍎 Test 1: None is an invalid number.
--------------------------------------------------
🍏 Test 2: 0   is a valid number.
--------------------------------------------------
🍏 Test 3: 0.1 is a valid number.
--------------------------------------------------
🍎 Test 4: abc is an invalid number.
--------------------------------------------------
🍎 Test 5: 1 a is an invalid number.
--------------------------------------------------
🍏 Test 6: 2e10 is a valid number.
--------------------------------------------------
🍏 Test 7: -90e3 is a valid number.
--------------------------------------------------
🍎 Test 8: 1e is an invalid number.
--------------------------------------------------
🍎 Test 9: e3 is an invalid number.
--------------------------------------------------
🍏 Test 10: 6e-1 is a valid number.
--------------------------------------------------
🍎 Test 11: 99e2.5 is an invalid number.
--------------------------------------------------
🍏 Test 12: 53.5e93 is a valid number.
--------------------------------------------------
🍎 Test 13: --6 is an invalid number.
--------------------------------------------------
🍎 Test 14: -+3 is an invalid number.
--------------------------------------------------
🍎 Test 15: 95a54e53 is an invalid number.

RegEx Circuit

jex.im visualizes regular expressions:

[Image: visualization of the regular expression]


RegEx Demo 1

RegEx Demo 2

If you wish to explore the expression, it is explained in the top-right panel of regex101.com. You can also follow this link to watch how it matches against some sample inputs.


Source

LeetCode Valid Number

\$\endgroup\$
1
  • 4
    \$\begingroup\$ nice apple icons \$\endgroup\$
    – RomanPerekhrest
    Commented Oct 16, 2019 at 15:25

3 Answers

5
\$\begingroup\$

Instead of diving into cumbersome and lengthy regular expressions, consider the following improvement:

The underlying principle is:

Numeric literals containing a decimal point or an exponent sign yield floating point numbers.

https://docs.python.org/3.4/library/stdtypes.html#numeric-types-int-float-complex

Therefore, Python treats values like 53.5e93 and -90e3 as floats.
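
As a quick illustration of the quoted rule, here is a minimal sketch. The sample values come from the question's test list; the variable names are made up for this example. It checks that such literals are floats and that float() parses the same text from a string to the same value.

```python
# Literals with a decimal point or an exponent are floats.
samples = [53.5e93, -90e3, 6e-1, 0.1]
all_floats = all(isinstance(value, float) for value in samples)

# float() parses the same notation when it arrives as a string.
parsed = [float(text) for text in ("53.5e93", "-90e3", "6e-1", "0.1")]
same_values = parsed == samples
```

Both all_floats and same_values should come out True, which is what makes float() a candidate validator here.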

I would therefore proceed with the following approach (retaining those cute icons), which includes a few small optimizations:

from typing import Optional


def is_numeric(input_string: Optional[str]) -> bool:
    """
    Returns True for valid numbers. Acceptable types of items: str or None
    """
    if input_string is None:
        return False

    try:
        input_string = input_string.strip()
        float(input_string)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    # ---------------------------- TEST ---------------------------
    DIVIDER_DASH = '-' * 50
    GREEN_APPLE = '\U0001F34F'
    RED_APPLE = '\U0001F34E'

    test_input_strings = [None, "0  ", "0.1", "abc", "1 a", "2e10", "-90e3",
                          "1e", "e3", "6e-1", "99e2.5", "53.5e93", "--6", "-+3", "95a54e53"]

    count = 0
    for string in test_input_strings:
        print(DIVIDER_DASH)
        count += 1

        if is_numeric(string):
            print(f'{GREEN_APPLE} Test {count}: `{string}` is a valid number.')
        else:
            print(f'{RED_APPLE} Test {count}: `{string}` is not a valid number.')

The output:

--------------------------------------------------
🍎 Test 1: `None` is not a valid number.
--------------------------------------------------
🍏 Test 2: `0  ` is a valid number.
--------------------------------------------------
🍏 Test 3: `0.1` is a valid number.
--------------------------------------------------
🍎 Test 4: `abc` is not a valid number.
--------------------------------------------------
🍎 Test 5: `1 a` is not a valid number.
--------------------------------------------------
🍏 Test 6: `2e10` is a valid number.
--------------------------------------------------
🍏 Test 7: `-90e3` is a valid number.
--------------------------------------------------
🍎 Test 8: `1e` is not a valid number.
--------------------------------------------------
🍎 Test 9: `e3` is not a valid number.
--------------------------------------------------
🍏 Test 10: `6e-1` is a valid number.
--------------------------------------------------
🍎 Test 11: `99e2.5` is not a valid number.
--------------------------------------------------
🍏 Test 12: `53.5e93` is a valid number.
--------------------------------------------------
🍎 Test 13: `--6` is not a valid number.
--------------------------------------------------
🍎 Test 14: `-+3` is not a valid number.
--------------------------------------------------
🍎 Test 15: `95a54e53` is not a valid number.
\$\endgroup\$
1
  • \$\begingroup\$ The .strip() call is not necessary, because Python's float() allows optional leading and trailing whitespace. You can also omit the None check and catch both ValueError and TypeError. \$\endgroup\$
    – Wombatz
    Commented Oct 17, 2019 at 11:03
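
The suggestion in this comment could look roughly like the sketch below. It is one possible reading of the comment, not code from the answer. It relies on float(None) raising TypeError and on float() ignoring surrounding whitespace.

```python
def is_numeric(value) -> bool:
    """Return True if float() can parse value, False otherwise."""
    try:
        # float() tolerates leading and trailing whitespace itself,
        # so no explicit strip() is needed.
        float(value)
    except (ValueError, TypeError):
        # ValueError: a string that is not a number, e.g. "1 a".
        # TypeError: an unsupported type such as None.
        return False
    return True


results = [is_numeric(item) for item in (None, "0  ", " 0.1 ", "1 a", "1e")]
```

One trade-off: without the None check, values that are already numbers (for example 3 or 2.5) are also accepted, which may or may not be what the caller wants.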
4
\$\begingroup\$

I'd just go with @Roman's suggestion. You should just leave it up to the language to decide what is and isn't valid.

I'd make two further suggestions though:

I don't think the parameter to is_numeric should be Optional; either conceptually, or to comply with the challenge. None will never be a valid number, so why even check it? I don't think dealing with invalid data should be that function's responsibility. Make it take just a str, then deal with Nones externally. I also don't really think it's is_numeric's responsibility to be dealing with trimming either; and that isn't even required:

print(float(" 0.1 "))  # prints 0.1

I'd also return True from within the try. The behavior will be the same, but I find it makes the intent of the try clearer.

After the minor changes, I'd go with:

def is_numeric(input_string: str) -> bool:
    """
    Returns True for valid numbers. Expects a str; handle None at the call site.
    """
    try:
        float(input_string)
        return True

    except ValueError:
        return False


# In the test loop, None is filtered out before calling is_numeric:
if string is not None and is_numeric(string):
    print(f'{GREEN_APPLE} Test {count}: `{string}` is a valid number.')
else:
    print(f'{RED_APPLE} Test {count}: `{string}` is not a valid number.')
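
One caveat when leaving the decision to the language: float() accepts a few spellings that a strict "decimal or scientific number" grammar may not intend. The sketch below lists some of them; the variable names are illustrative, and whether the challenge's checker exercises such inputs is not stated here.

```python
import math

# Strings float() parses successfully on Python 3.6+,
# none of which match the regex from the question.
extra_accepted = ["nan", "inf", "-Infinity", "1_000"]
parsed = [float(text) for text in extra_accepted]

is_nan = math.isnan(parsed[0])
is_inf = math.isinf(parsed[1]) and math.isinf(parsed[2])
underscore_value = parsed[3]  # digit-group underscores are allowed
```

If those inputs must be rejected, a small pre-check (or a regex) in front of float() would be needed.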
\$\endgroup\$
1
\$\begingroup\$

That regex visualisation you provided is really neat. It shows that there is a lot of potential overlap in the conditions.

You should be able to reduce it down to something similar to this:

^[+-]?\d+(\.\d+)?([Ee][+-]?\d+)?$
\$\endgroup\$
1
  • 1
    \$\begingroup\$ Your regex doesn't match 1. or .5, both of which are matched by the regex in the OP. Your regex assumes that there will always be a leading digit before a period, and that a period will always be followed by decimals. Neither is true. \$\endgroup\$
    – JAD
    Commented Oct 17, 2019 at 8:39
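
A shortened pattern that still accepts "1." and ".5" might look like the sketch below. The names NUMBER_PATTERN and matches_number are hypothetical. It uses re.fullmatch in place of the ^ and $ anchors, and re.ASCII so that \d only matches 0-9.

```python
import re

# Optional sign, then either digits with an optional fraction ("1", "1.", "1.5")
# or a bare fraction (".5"), then an optional exponent.
NUMBER_PATTERN = re.compile(
    r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[Ee][+-]?\d+)?",
    re.ASCII,
)


def matches_number(text: str) -> bool:
    """Return True if the stripped text is a decimal or scientific number."""
    return NUMBER_PATTERN.fullmatch(text.strip()) is not None


checks = {text: matches_number(text) for text in ("1.", ".5", ".", "1e", "99e2.5")}
```

A lone "." is still rejected, because each alternative requires at least one digit.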
