'latin-1' and 'utf-8'.
These are optimized in the Python Unicode implementation
to result in more direct processing, bypassing the codec
registry.
Also see issue11303.
raw string literals. I added a whole bunch of tests but am still not sure
I am testing all paths through the code. I really think the code could be
simplified quite a bit.