What exactly do “u” and “r” string prefixes do, and what are raw string literals?
While asking this question, I realized I didn’t know much about raw strings. For somebody claiming to be a Django trainer, this sucks.
Technically, a raw string cannot end in any odd number of backslashes, as described in the documentation.
From the Python documentation on regex, regarding the '\' character:
When an 'r' or 'R' prefix is present, a character following a backslash is included in the string without change, and all backslashes are left in the string. For example, the string literal r"\n" consists of two characters: a backslash and a lowercase 'n'. String quotes can be escaped with a backslash, but the backslash remains in the string; for example, r"\"" is a valid string literal consisting of two characters: a backslash and a double quote; r"\" is not a valid string literal (even a raw string cannot end in an odd number of backslashes). Specifically, a raw string cannot end in a single backslash (since the backslash would escape the following quote character). Note also that a single backslash followed by a newline is interpreted as those two characters as part of the string, not as a line continuation.
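The rules in the quoted passage can be checked directly in an interpreter. The following is a minimal sketch, not part of the documentation itself; the helper name is_valid_literal is made up for illustration, and it relies only on the built-in compile() raising SyntaxError for a malformed literal.

```python
# r"\n" is two characters: a backslash and a lowercase 'n'.
two_chars = r"\n"
assert len(two_chars) == 2
assert two_chars == "\\n"

# r"\"" is also two characters: the backslash stays in the string.
escaped_quote = r"\""
assert len(escaped_quote) == 2
assert escaped_quote == '\\"'

# A backslash followed by a newline is kept as those two characters.
kept = r"""a\
b"""
assert kept == "a\\\nb"


def is_valid_literal(source):
    """Hypothetical helper: True if `source` compiles as an expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# Build raw literals ending in 1, 2, 3 and 4 backslashes.
backslash = chr(92)
results = {
    count: is_valid_literal('r"' + backslash * count + '"')
    for count in range(1, 5)
}
print(results)
```

Only the literals ending in an even number of backslashes compile, which matches the "odd number of backslashes" wording above.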
I don’t understand how the escape character in Python regex works together with the r'...' prefix of raw strings.
Some help is appreciated.
First, the tokenizer looks for the closing quote. It recognizes backslashes while doing this but doesn’t interpret them; it just looks for a sequence of string elements followed by the closing quote mark. A “string element” is either a single character that is not a backslash, a closing quote, or a newline (newlines are allowed inside triple-quoted strings), or a backslash followed by any single character.
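To make that scanning rule concrete, here is a rough sketch of it for a single-quoted (non-triple) literal. This is not CPython's actual tokenizer; the function name scan_string_body and its return convention are assumptions made purely for illustration.

```python
def scan_string_body(text, quote='"'):
    """Hypothetical sketch of the closing-quote search.

    `text` is the source immediately after the opening quote.
    Returns the index of the closing quote, or -1 if the literal
    is unterminated.
    """
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            # A backslash plus any single character is one string
            # element, so the character after it is skipped unread.
            i += 2
            continue
        if ch == quote:
            return i
        if ch == "\n":
            # A bare newline ends a non-triple-quoted literal early.
            return -1
        i += 1
    return -1


# Body of r"\"" : backslash, quote, then the real closing quote.
print(scan_string_body('\\""'))
# Body of r"\" : the only quote is consumed by the backslash.
print(scan_string_body('\\"'))
# Body of r"\\" : two backslashes pair up, so the quote closes.
print(scan_string_body('\\\\"'))
```

Because the backslash always consumes the next character during this search, a literal ending in an odd number of backslashes never finds its closing quote, whether or not the r prefix is present.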