as docstrings and translatable strings, and rejects
bytes literals and f-string expressions.
(cherry picked from commit 69524821a8)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
renamed tokenize and now works with bytes rather than strings. A new
detect_encoding function has been added for determining source file encoding
according to PEP 263. Token sequences returned by tokenize always start
with an ENCODING token which specifies the encoding used to decode the file.
This token is used to encode the output of untokenize back to bytes.
Credit goes to Michael "I'm-going-to-name-my-first-child-unittest" Foord from Resolver Systems for this work.
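
A minimal sketch of the bytes-based interface described above, using only the standard library tokenize module. The sample source bytes are made up for illustration.

```python
import io
import tokenize

# Hypothetical input: a tiny source file with a PEP 263 coding cookie.
source = b"# -*- coding: utf-8 -*-\nx = 1\n"

# detect_encoding() takes a readline callable that returns bytes.
encoding, first_lines = tokenize.detect_encoding(io.BytesIO(source).readline)

# tokenize() also reads bytes; the first token it yields is ENCODING.
tokens = list(tokenize.tokenize(io.BytesIO(source).readline))
first = tokens[0]

# untokenize() uses the ENCODING token to encode its result back to bytes.
roundtrip = tokenize.untokenize(tokens)
```

Here first.type is tokenize.ENCODING and first.string carries the detected encoding name, which is why untokenize can return bytes rather than str.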
list of files to not extract docstrings from when the -D option is
given. This isn't optimal, but I didn't want to change the semantics
of -D, and it's bad form to allow optional switch arguments.
Bumping __version__ to 1.4.
TokenEater.__init__(): Initialize __curfile to None.
__waiting(): To extract docstrings from the module, the -D flag must
be set and __curfile must not be named in the -X file (i.e. it
isn't in opts.nodocstrings).
set_filename(): Fixed a bug where once the first module docstring is
extracted, no subsequent module docstrings will be extracted. The
bug was that the first extraction set __freshmodule to 0, but that
flag was never reset back to 1. set_filename() is always called
when the next file is being processed, so use it to reset the
__freshmodule flag.
main(): Add support for -X/--no-docstring.
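
The interplay of -D, -X and the __freshmodule reset can be sketched roughly as follows. This is a stripped-down, hypothetical stand-in, not the real TokenEater: the class name, the wants_module_docstring() helper and its argument are assumptions made for illustration.

```python
class TokenEaterSketch:
    """Hypothetical, simplified stand-in for pygettext's TokenEater."""

    def __init__(self, nodocstrings=()):
        self.nodocstrings = set(nodocstrings)  # files listed via -X
        self.curfile = None                    # initialized to None
        self.freshmodule = True

    def set_filename(self, filename):
        # Called once per input file, so it is a natural place to
        # re-arm the module-docstring flag.
        self.curfile = filename
        self.freshmodule = True

    def wants_module_docstring(self, docstrings_enabled):
        # Extract only when -D is on, the file is not excluded via -X,
        # and no module docstring was taken from this file yet.
        ok = (docstrings_enabled
              and self.freshmodule
              and self.curfile not in self.nodocstrings)
        self.freshmodule = False
        return ok
```

Without the reset in set_filename(), the flag would stay cleared after the first file and every later module docstring would be skipped, which is the bug described above.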
indicating whether the entry was extracted from a docstring or not.
write(): If any of the locations of a string appearance came from a
docstring, add a comment such as
#. docstring
before the references (after a suggestion by Martin von Loewis).
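
A rough sketch of how such an entry might be written. The format_entry() helper and the shape of the locations list are hypothetical, and the "#: file:line" reference form assumes GNU-style location comments.

```python
def format_entry(msgid, locations):
    """Format one .pot entry.

    locations is a list of (filename, lineno, is_docstring) tuples;
    this layout is an assumption made for the sketch.
    """
    lines = []
    # If any appearance of the string came from a docstring, flag it
    # before the references.
    if any(is_docstring for _fn, _ln, is_docstring in locations):
        lines.append("#. docstring")
    for filename, lineno, _is_docstring in locations:
        lines.append("#: %s:%d" % (filename, lineno))
    lines.append('msgid "%s"' % msgid)
    lines.append('msgstr ""')
    return "\n".join(lines)
```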
(inspired by Detlef Lannert). Specifically,
-k/--keyword no longer takes an optional argument to clear the
default keywords. Instead, use -K/--no-default-keywords to clear
them.
-n/--add-location also no longer takes an optional argument to set
the comment style. Instead, use -S/--style to set the comment
style to GNU or Solaris.
-o/--output can take `-' as the filename, meaning write to
standard output.
The inputfile name can also be `-' meaning read from standard input.
A few other changes include
Kludge to mark the file docstring as translatable. Since the
marking is to place _() around the docstring, and because we
actually have to define the _() function before we use it, this
means that we have to manually assign to __doc__ the output of
_(). This doesn't seem too bad because you'll only use this idiom
when translating a script's docstring (you really don't need to
translate most module docstrings).
Convert everything to string methods and do not import the string
module.
Bump the version number to 1.1
This will fold all ISO 8859 chars from the upper half of the
charset into the lower half, which is ...ummm.... unintended.
The second is a typo in the reference to options.escape in main().
make pygettext more compatible with GNU xgettext, specifically:
Added -E/--escape for allowing pass-thru of iso8859-1 characters above
7 bits.
Added -o/--output option for renaming the output file from
messages.pot (there's overlap with -d/--default-domain, but GNU
xgettext has them both).
Added -p/--output-dir for specifying the output directory for
messages.pot.
Added -V/--version for printing the version number.
Added -w/--width for specifying the output page width (this is because
now pygettext, like GNU xgettext, will put several locations on the
same line to cut down on vertical space).
Added -x/--exclude-file for specifying a list of strings that are not
to be extracted from the input files.
Bumped version number to 1.0
Try to import fintl and use fintl.gettext as _ if available. The
fallback is an identity definition of _().
Moved the escape creation to a function make_escapes() so that its
behavior can be controlled by the -E option.
__openseen(): Support the -x option.
write(): Support -w option and vertical space preserving feature.
main(): Support new options.
Herzog <herzog@online.de>. Specifically,
--verbose/-v flag added
pot_header added to make msgmerge and Emacs po-mode work better
normalize(), escape(), safe_eval(): Improved normalization of strings
for more .po file compatibility (e.g. C style). Handles embedded
newlines better.
Also added an identity function called _() and use it in the file
where messages are printed. This allows us to selftest pygettext.py
with itself as input.
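
As a rough illustration of the normalization described above, the sketch below turns a string with embedded newlines into the multi-line quoted form used in .po files. It is a simplified sketch: the escape() here handles only a handful of characters, and the function bodies are assumptions rather than the script's actual code.

```python
def escape(s):
    # Minimal C-style escaping; a full version would cover more characters.
    return (s.replace("\\", "\\\\")
             .replace('"', '\\"')
             .replace("\t", "\\t")
             .replace("\n", "\\n"))


def normalize(s):
    """Quote s for a .po file, one quoted line per embedded newline."""
    lines = s.split("\n")
    if len(lines) == 1:
        return '"' + escape(s) + '"'
    if not lines[-1]:
        # Trailing newline: fold it back into the last real line.
        del lines[-1]
        lines[-1] += "\n"
    lines = [escape(line) for line in lines]
    # Start with an empty string, then one quoted chunk per source line.
    return '""\n"' + '\\n"\n"'.join(lines) + '"'
```

A single-line string stays on one line, while a string containing newlines starts with an empty "" and continues with one quoted chunk per line.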