Compare strings, allowing one character difference

Become part of the top 3% of the developers by applying to Toptal https://topt.al/25cXVn

--

Music by Eric Matyas
https://www.soundimage.org
Track title: Horror Game Menu Looping

--

Chapters
00:00 Question
01:09 Accepted answer (Score 5)
02:56 Answer 2 (Score 12)
03:47 Answer 3 (Score 4)
05:11 Answer 4 (Score 3)
05:22 Thank you

--

Full question
https://stackoverflow.com/questions/2521...

Answer 2 links:
[future_builtins.zip()]: https://docs.python.org/2/library/future...
[zip()]: https://docs.python.org/3/library/functi...
[any()]: https://docs.python.org/3/library/functi...
[all()]: https://docs.python.org/3/library/functi...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #string

#avk47

ACCEPTED ANSWER

Score 5

This is a more well-performing solution, coded in Python 3:

def match(s1, s2):
    ok = False

    for c1, c2 in zip(s1, s2):
        if c1 != c2:
            if ok:
                return False
            else:
                ok = True

    return ok

I do not checked for length difference because you said the two strings are equal, but for a more general approach I would add it.

If you need the position of the different character:

def match(s1, s2):
    pos = -1

    for i, (c1, c2) in enumerate(zip(s1, s2)):
        if c1 != c2:
            if pos != -1:
                return -1
            else:
                pos = i

    return pos

These are benchmarks performed with timeit, tested with match("0-001", "0-101"). I translated all solutions to py3 and removed length test.

your solution: 5.12
Martijn Pieters' solution: 4.92
enrico.bacis' and lakesh's solution: 5.51
my solution: 2.42

Tests with a longer string:

Martijn Pieters' solution:

timeit.timeit('match("0-0016ub5j2oi06u30tj30g6790v3nug[hoyj39867i6gy9thvb05y4b896y3n098vty98thn98qg5y4n8ygnqp", "0-0016ub5j2oi06u30tj30g6790v3gug[hoyj39867i6gy9thvb05y4b896y3n098vty98thn98qg5y4n8ygnqp")', setup="""
def match(s1, s2):
    combo = zip(s1, s2)
    return any(c1 != c2 for c1, c2 in combo) and all(c1 == c2 for c1, c2 in combo)
""")

result: 32.82

My solution:

timeit.timeit('match("0-0016ub5j2oi06u30tj30g6790v3nug[hoyj39867i6gy9thvb05y4b896y3n098vty98thn98qg5y4n8ygnqp", "0-0016ub5j2oi06u30tj30g6790v3gug[hoyj39867i6gy9thvb05y4b896y3n098vty98thn98qg5y4n8ygnqp")', setup="""
def match(s1, s2):
    ok = False

    for c1, c2 in zip(s1, s2):
        if c1 != c2:
            if ok:
                return False
            else:
                ok = True

    return ok
""")

Result: 20.21

ANSWER 2

Score 4

In Python 2, use future_builtins.zip() (Python 2 and 3 compatible zip()-as-iterator) (in Python 3 the built-in will do fine) to combine the two strings character by character, then use any() and all() to loop over the resulting iterator:

try:
    from future_builtins import zip
except ImportError:
    pass

def match(s1, s2):
    if len(s1) != len(s2):
        return False
    combo = zip(s1, s2)
    return any(c1 != c2 for c1, c2 in combo) and all(c1 == c2 for c1, c2 in combo)

This works because combo is an iterator; it yields pairs of characters one by one on demand, and any() and all() only take as many pairs as needed to determine their outcome.

any() stops iterating as soon as it finds two characters that are not equal. all() will only return True if all the remaining characters are equal.

Together the two conditions then are only True if there is exactly one pair that differs.

Because an iterator approach is used, the above does the absolute minimum amount of work to determine if your strings are a match; the moment a second pair is found that doesn't match iteration stops; there is no point in looking at the rest of the character combinations.

Demo (Python 2, so zip() is imported):

>>> from future_builtins import zip
>>> def match(s1, s2):
...     if len(s1) != len(s2):
...         return False
...     combo = zip(s1, s2)
...     return any(c1 != c2 for c1, c2 in combo) and all(c1 == c2 for c1, c2 in combo)
... 
>>> match('0011', '0111')
True
>>> match('0-001', '0-101')
True
>>> match('0-011', '0-101')
False

ANSWER 3

Score 3

def match(a,b):
    s = sum([a[i] != b[i] for i in range(len(a))])
    if s == 1:
       return True
    else:
       return False

ANSWER 4

Score 1

Same Lengths

If the two string have the same length:

def match(s1, s2):
    return sum(s1[i] != s2[i] for i in xrange(len(s1))) <= 1

You can also create a generator of matchers like this:

def create_matcher(maxdiff):
    return lambda s1, s2: sum(s1[i] != s2[i] for i in xrange(len(s1))) <= maxdiff

match = create_matcher(1)

Example:

print match('0011', '0111')      # True
print match('0-001', '0-101')    # True
print match('0-011', '0-101')    # False

Different Lengths

If they have different length we can assume that the exceeding characters are different, so:

def match(s1, s2):
    l = min(len(s1), len(s2))
    return sum(s1[i] != s2[i] for i in xrange(l)) + abs(len(s1) - len(s2)) <= 1