labm8.text
Text utilities.
-
exception labm8.text.Error
Module-level error.
-
exception labm8.text.TruncateError
Raised in case of a truncation error.
-
labm8.text.diff(s1, s2)
Return a normalised Levenshtein distance between two strings.
The distance is normalised by dividing the Levenshtein distance of the two strings by max(len(s1), len(s2)).
Examples
>>> text.diff("foo", "foo")
0.0
>>> text.diff("foo", "fooo")
0.25
>>> text.diff("foo", "")
1.0
>>> text.diff("1234", "1 34")
0.25
Parameters:
- s1 (str) – Argument A.
- s2 (str) – Argument B.
Returns: Normalised distance between the two strings.
Return type: float
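The normalisation described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `edit_distance` and `normalised_diff` are hypothetical names, the memoised recursive edit distance is a stand-in suited only to short strings, and returning 0.0 for two empty strings is an assumption made to avoid dividing by zero.

```python
from functools import lru_cache


def edit_distance(s1, s2):
    """Stand-in Levenshtein distance (memoised recursion, short inputs only)."""

    @lru_cache(maxsize=None)
    def dist(i, j):
        # Distance between the first i characters of s1 and first j of s2.
        if i == 0:
            return j
        if j == 0:
            return i
        return min(
            dist(i - 1, j) + 1,  # delete a character from s1
            dist(i, j - 1) + 1,  # insert a character into s1
            dist(i - 1, j - 1) + (s1[i - 1] != s2[j - 1]),  # substitute
        )

    return dist(len(s1), len(s2))


def normalised_diff(s1, s2):
    """Edit distance divided by the length of the longer string."""
    longest = max(len(s1), len(s2))
    if longest == 0:
        # Assumption: two empty strings are treated as identical.
        return 0.0
    return edit_distance(s1, s2) / longest
```

Under this sketch the result always lies between 0.0 (identical strings) and 1.0 (every position differs), because the edit distance can never exceed the length of the longer string.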
-
labm8.text.get_substring_idxs(substr, string)
Return a list of the indices at which substr occurs in string. If substr is not found, the list is empty.
Parameters:
- substr (str) – Substring to match.
- string (str) – String to match in.
Returns: Start indices of substr.
Return type: list of int
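One way to collect start indices of a substring is a repeated `str.find` scan, sketched below. The name `substring_indices` is hypothetical, and two behaviours are assumptions rather than documented facts about the library: overlapping occurrences are reported, and an empty substr yields an empty list.

```python
def substring_indices(substr, string):
    """Return the start index of every occurrence of substr in string."""
    if not substr:
        # Assumption: an empty pattern matches nothing.
        return []
    indices = []
    start = string.find(substr)
    while start != -1:
        indices.append(start)
        # Resume one character later so overlapping matches are found too.
        start = string.find(substr, start + 1)
    return indices
```

For example, `substring_indices("a", "banana")` gives `[1, 3, 5]`, and a substring that never occurs gives `[]`.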
-
labm8.text.levenshtein(s1, s2)
Return the Levenshtein distance between two strings.
Implementation of Levenshtein distance, one of a family of edit distance metrics.
Based on: https://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Levenshtein_distance#Python
Examples
>>> text.levenshtein("foo", "foo")
0
>>> text.levenshtein("foo", "fooo")
1
>>> text.levenshtein("foo", "")
3
>>> text.levenshtein("1234", "1 34")
1
Parameters:
- s1 (str) – Argument A.
- s2 (str) – Argument B.
Returns: Levenshtein distance between the two strings.
Return type: int
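For readers unfamiliar with the metric, here is a sketch of the standard two-row dynamic-programming formulation of Levenshtein distance. It is illustrative only and not necessarily the code labm8 uses; `levenshtein_sketch` is a hypothetical name.

```python
def levenshtein_sketch(s1, s2):
    """Minimum number of single-character edits turning s1 into s2."""
    # Keep the shorter string as s2 so each row is as small as possible.
    if len(s1) < len(s2):
        s1, s2 = s2, s1

    # previous[j] is the distance between the empty prefix of s1 and s2[:j].
    previous = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        current = [i]  # distance between s1[:i] and the empty string
        for j, c2 in enumerate(s2, start=1):
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            substitution = previous[j - 1] + (c1 != c2)
            current.append(min(insertion, deletion, substitution))
        previous = current
    return previous[-1]
```

Only two rows of the table are kept at a time, so memory use grows with the length of the shorter string rather than with the product of both lengths.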
-
labm8.text.truncate(string, maxchar)
Truncate a string to a maximum number of characters.
If the string is longer than maxchar, the excess characters are removed and an ellipsis is appended.
Parameters:
- string (str) – String to truncate.
- maxchar (int) – Maximum length of string in characters. Must be >= 4.
Returns: Truncated string, of length <= maxchar.
Return type: str
Raises: TruncateError – In case of an error.
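The truncation behaviour described above might look like the sketch below. Several details are assumptions rather than documented facts: the ellipsis is taken to be three ASCII dots that count toward maxchar, the error is raised when maxchar is below 4, and `SketchTruncateError` and `truncate_sketch` are hypothetical stand-ins for the library's own names.

```python
class SketchTruncateError(Exception):
    """Hypothetical stand-in for labm8.text.TruncateError."""


def truncate_sketch(string, maxchar):
    """Shorten string to at most maxchar characters, ending in '...'."""
    if maxchar < 4:
        # Assumption: room is needed for one character plus the ellipsis.
        raise SketchTruncateError("maxchar must be >= 4")
    if len(string) <= maxchar:
        return string
    # Reserve three characters for the ellipsis so the total stays in bounds.
    return string[: maxchar - 3] + "..."
```

With these assumptions, `truncate_sketch("hello world", 8)` returns `"hello..."`, which is exactly 8 characters long, and strings already within the limit are returned unchanged.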