python - What exactly do "u" and "r" string flags do, and what are raw string literals?
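While asking this question, I realized I didn't know much about raw strings. For somebody claiming to be a Django trainer, this sucks.
I know what an encoding is, and I know what "u" alone does, since I understand what Unicode is.
But what does "r" do exactly? What kind of string does it result in?
And above all, what the heck does "ur" do?
Finally, is there any reliable way to go back from a Unicode string to a simple raw string?
Ah, and by the way, if your system and your text editor charset are set to UTF-8, does "u" actually do anything?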
There's not really any "raw string"; there are raw string literals, which are exactly the string literals marked by an 'r' before the opening quote.
A "raw string literal" is a slightly different syntax for a string literal, in which a backslash, \, is taken as meaning "just a backslash" (except when it comes right before a quote that would otherwise terminate the literal). There are no "escape sequences" to represent newlines, tabs, backspaces, form-feeds, and so on. In normal string literals, each backslash must be doubled to avoid being taken as the start of an escape sequence.
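To make the backslash rule concrete, here is a minimal sketch. It assumes Python 3 (the literal syntax for backslashes is the same as in Python 2), and the variable names are made up for illustration.

```python
# A newline escape written as a normal literal and as a raw literal.
normal = '\n'   # escape sequence: a single newline character
raw = r'\n'     # raw literal: a backslash followed by the letter n

print(len(normal), len(raw))   # the raw version is one character longer
print(raw == '\\n')            # doubling the backslash in a normal literal gives the same string

# The one exception: a backslash right before a quote stops the quote from
# ending the literal, but the backslash itself still stays in the string.
quoted = r'\''
print(len(quoted), quoted == "\\'")
```

Note that the raw literal only changes how the source text is read; the resulting objects are ordinary strings.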
This syntax variant exists mostly because the syntax of regular expression patterns is heavy with backslashes (but never at the end, so the "except" clause above doesn't matter), and it looks a bit better when you avoid doubling each of them. That's all. It also gained some popularity for expressing native Windows file paths (with backslashes instead of the regular slashes used on other platforms), but that's very rarely needed (since normal slashes mostly work fine on Windows too) and imperfect (due to the "except" clause above).
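A rough illustration of both points, assuming Python 3 and the standard `re` module. The pattern, the sample text, and the path `C:\some\dir\` are invented for the example; `compile()` is used only so that the syntax error can be caught instead of stopping the script.

```python
import re

# The same regular expression written as a raw literal and with doubled backslashes.
raw_pattern = r'\d+\.\d+'
doubled_pattern = '\\d+\\.\\d+'
same_pattern = raw_pattern == doubled_pattern

match = re.search(raw_pattern, 'version 2.6 of Python')
found = match.group(0) if match else None

# The "except" clause in practice: a raw literal cannot end in a single
# backslash, so a Windows directory path with a trailing backslash is
# rejected by the parser.
try:
    compile(r"r'C:\some\dir\'", '<example>', 'eval')
    trailing_backslash_ok = True
except SyntaxError:
    trailing_backslash_ok = False

print(same_pattern, found, trailing_backslash_ok)
```

Writing the path with forward slashes, or leaving off the trailing backslash, sidesteps the problem.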
r'...' is a byte string (in Python 2.*), ur'...' is a Unicode string (again, in Python 2.*), and any of the other three kinds of quoting also produces exactly the same types of strings (so for example r'...', r'''...''', r"...", r"""...""" are all byte strings, and so on).
Not sure what you mean by "going back": there are no intrinsic back and forward directions, because there's no raw string type. It's just an alternative syntax to express perfectly normal string objects, byte or Unicode as they may be.
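A small check of the "no raw string type" point, as a sketch assuming Python 3. There the unprefixed and r-prefixed literals are both `str` (text), whereas in Python 2.* they would both be byte strings; the sample contents are arbitrary.

```python
# The same raw literal in all four quoting styles.
literals = [r'a\tb', r'''a\tb''', r"a\tb", r"""a\tb"""]

all_equal = all(s == literals[0] for s in literals)
types = {type(s) for s in literals}

# A normal literal with a doubled backslash builds an equal object of the same type.
same_as_normal = literals[0] == 'a\\tb' and type(literals[0]) is type('a\\tb')

print(all_equal, types, same_as_normal)
```

Since the objects are indistinguishable once created, there is nothing to convert "back" to.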
And yes, in Python 2.*, u'...' is of course always distinct from just '...': the former is a Unicode string, the latter is a byte string. What encoding the literal might be expressed in is a completely orthogonal issue.
E.g., consider (Python 2.6):
    >>> import sys
    >>> sys.getsizeof('ciao')
    28
    >>> sys.getsizeof(u'ciao')
    34
The Unicode object of course takes more memory space (a very small difference for a very short string, obviously ;-).
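The session above only runs as shown on Python 2. As a loose Python 3 counterpart (an assumption on my part, not something the 2.6 example covers): the u prefix is accepted from Python 3.3 on but changes nothing, and the byte/text split is written with b'...' instead. Sizes are printed without expected values because they depend on the interpreter build.

```python
import sys

text = 'ciao'
also_text = u'ciao'   # u prefix: accepted in Python 3.3+, same type as 'ciao'
data = b'ciao'        # byte string literal

print(type(text) is type(also_text), text == also_text)
print(type(data).__name__, type(text).__name__)

# Encoding is a separate step that turns text into bytes.
print(text.encode('utf-8') == data)

# Memory footprint of each object on this interpreter.
print(sys.getsizeof(data), sys.getsizeof(text))
```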