Python Module “re”

  • Posted by Mike Naberezny in Python

    Without question, one of my favorite bundled Python modules is re, a Perl-style regular expression engine.

    Python’s string object has many useful methods that I use frequently, such as split(), strip(), replace(), and especially find(). When more sophisticated operations are needed, it is time to import re.
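
    As a quick illustration of how far the plain string methods go, here is a minimal sketch. The reply text is a made-up instrument message, not taken from any real device:

```python
# Hypothetical instrument reply, used only for illustration.
reply = '  Light Switch = ON \r\n'

cleaned = reply.strip()                     # drop surrounding whitespace and line endings
name, value = cleaned.split(' = ')          # split into the label and the reported state
has_switch = cleaned.find('Switch') != -1   # find() returns -1 when the substring is absent
renamed = cleaned.replace('Light', 'Fan')   # simple literal substitution
```

    This works while the message layout is fixed. Once the format has to be validated as well as parsed, a regular expression tends to be the shorter route.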

    A good deal of the work I do involves interpreting short messages sent from instrumentation. For most of these communications, I need to quickly evaluate a small quantity of data in three ways:

    1. Validate that the response is formatted as expected.
    2. Extract specific data from the response.
    3. Validate and then act on the data extracted.

    re’s object-oriented interface is very nice; however, I find that it can overcomplicate very small operations where a simple function like PHP’s preg_match or preg_match_all would be entirely sufficient. Luckily, equivalent functionality is easily recreated with re by chaining the calls on one line and using the findall() method of the pattern object to return a list:

    import re

    testString = 'Light Switch = ON'
    # Raw string so that \w reaches the regex engine unaltered.
    matches = re.compile(r'Light Switch = (\w+)').findall(testString)
    if not matches:
        # If an empty list is returned, the string under test
        # was nothing like we expected.
        pass
    elif matches[0] not in ('ON', 'OFF'):
        # The string under test was formatted as expected,
        # however the extracted data is unusual.
        pass
    else:
        # Success.  The string under test was formatted as
        # expected, and the expected data was extracted from it.
        pass
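
    If the same check is needed in more than one place, the three branches can be folded into a small helper. The following is only a sketch: the names SWITCH_PATTERN and check_switch and the status strings are my own inventions for illustration, not part of re.

```python
import re

# Compiled once and reused; same pattern as in the example above.
SWITCH_PATTERN = re.compile(r'Light Switch = (\w+)')


def check_switch(response):
    """Return a (status, state) pair for a hypothetical switch response."""
    matches = SWITCH_PATTERN.findall(response)
    if not matches:
        # The response was nothing like we expected.
        return 'malformed', None
    state = matches[0]
    if state not in ('ON', 'OFF'):
        # Formatted as expected, but the extracted data is unusual.
        return 'unexpected', state
    # Formatted as expected, and the expected data was extracted.
    return 'ok', state
```

    Calling check_switch('Light Switch = ON') yields ('ok', 'ON'), while an unrelated string yields ('malformed', None), so the caller can branch on the first element and act on the second.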