renamed tokenize and now works with bytes rather than strings. A new
detect_encoding function has been added for determining source file encoding
according to PEP-0263. Token sequences returned by tokenize always start
with an ENCODING token which specifies the encoding used to decode the file.
This token is used to encode the output of untokenize back to bytes.
Credit goes to Michael "I'm-going-to-name-my-first-child-unittest" Foord from Resolver Systems for this work.
(with one small bugfix in bgen/bgen/scantools.py)
This replaces string module functions with string methods
for the stuff in the Tools directory. Several uses of
string.letters etc. are still remaining.