Modify rfc822.formatdate() to always generate English names,
regardless of locale. This is required by RFC 1123.
In open_local_file() of urllib and urllib2, use new formatdate() from
rfc822.
now allowed in unquoted RealName areas (technically, they are defined
as "obsolete syntax" we MUST accept in phrases, as part of the
obs-phrase production). Thus, parsing
To: User J. Person <person@dom.ain>
correctly returns "User J. Person" as the RealName.
AddrlistClass.__init__(): Add definition of self.phraseends which is
just self.atomends with `.' removed.
getatom(): Add an optional argument `atomends' which, if None (the
default) means use self.atomends.
getphraselist(): Pass self.phraseends to getatom() and break out of
the loop only when the current character is in phraseends instead of
atomends. This allows dots to continue to serve as atom delimiters in
all contexts except phrases.
Also, loads of docstring updates to document RFC 2822 conformance
(sorry, this should have been two separate patches).
setdefault() the empty string. In setdefault(), use + to join the value
to create the entry for the headers attribute so that TypeError is raised
if the value is of the wrong type.
also modified check_all function to suppress all warnings since they aren't
relevant to what this test is doing (allows quiet checking of regsub, for
instance)
rfc822 (Addresslist) modules. Also a preliminary testcase for augmented
assignment, which should actually be merged with the test_class testcase I
added last week.
comments, docstrings or error messages. I fixed two minor things in
test_winreg.py ("didn't" -> "Didn't" and "Didnt" -> "Didn't").
There is a minor style issue involved: Guido seems to have preferred English
grammar (behaviour, honour) in a couple places. This patch changes that to
American, which is the more prominent style in the source. I prefer English
myself, so if English is preferred, I'd be happy to supply a patch myself ;)
It breaks Mailman, it was actually documented in the docstring, so it
was an intentional deviation from the usual del semantics. Let's
document the original behavior in Doc/lib/librfc822.tex.
for gotonext() pushing self.pos past the end of the string. This can
happen if the message has a To field like "To: :" and you call
msg.getaddrlist('to').
Problem: rfc822.py in 1.5.2 final loses the quotes around
quoted local-part names.
The fix is to preserve the quotes around a local-part
name in an address.
Test:
import rfc822
a = rfc822.AddrlistClass('(Comment stuff) "Quoted
name"@somewhere.com')
a.getaddrlist()
The correct result is:
[('Comment stuff', '"Quoted name"@somewhere.com')]
named header, so that if a message has, e.g. multiple CC: lines, all
will get returned by the call to getaddrlist(). It also correctly
handles addresses which show up in continuation lines.
AdderlistClass.__init__(): Added \n to self.CR which fixes a bug that
sometimes, an address would contain a bogus trailing newline.
Message.getaddress(): In final else clause, added a test for the
character we're at being in self.specials. Without this, such
characters never get consumed and we infloop. Case in point (as
posted to c.l.py):
To: <[smtp:dd47@mail.xxx.edu]_at_hmhq@hdq-mdm1-imgout.companay.com>
----------------------------^
otherwise we'd infloop here
Extended the rfc822 parsedate routines to handle the cases they failed
on in an archive of ~37,000 messages. I believe the changes are
compatible, in that all previously correct parsing are still correct.
[I still see problems with some messages, but no showstoppers.]
- explain seekable
- when seekable==1, test fp.tell() and set it to 0 if that fails
- support overridable method iscomment(line) to weed out comments
- check for unread() method on file object before trying to seek
And one of my own:
- Add a get() method which behaves like a dictionary's get(); this is
actually implemented by giving getheader() an optional second argument
to specify the default, and aliasing get to getheader.
the comma after the day name is optional if it is a recognized day
name; and the date and month may be swapped. Thus, the following two
test dates will now be parsed correctly:
Thu, Feb 13 12:16:57 1992
Thu Feb 13 12:16:57 1992