mirror of https://github.com/python/cpython
Updated the astimezone() proof to recover from all the last week's
changes (and there were a lot of relevant changes!).
This commit is contained in:
parent
2fbe5378f9
commit
8bb5ad2e56
|
@ -4806,103 +4806,96 @@ Now some derived rules, where k is a duration (timedelta).
|
|||
This is again a requirement for a sane tzinfo class.
|
||||
|
||||
4. (x+k).s = x.s
|
||||
This follows from #2, and that datimetime+timedelta preserves tzinfo.
|
||||
This follows from #2, and that datimetimetz+timedelta preserves tzinfo.
|
||||
|
||||
5. (x+k).n = x.n + k
|
||||
Again follows from how arithmetic is defined.
|
||||
|
||||
Now we can explain x.astimezone(tz). Let's assume it's an interesting case
|
||||
Now we can explain tz.fromutc(x). Let's assume it's an interesting case
|
||||
(meaning that the various tzinfo methods exist, and don't blow up or return
|
||||
None when called).
|
||||
|
||||
The function wants to return a datetime y with timezone tz, equivalent to x.
|
||||
x is already in UTC.
|
||||
|
||||
By #3, we want
|
||||
|
||||
y.n - y.o = x.n - x.o [1]
|
||||
y.n - y.o = x.n [1]
|
||||
|
||||
The algorithm starts by attaching tz to x.n, and calling that y. So
|
||||
x.n = y.n at the start. Then it wants to add a duration k to y, so that [1]
|
||||
becomes true; in effect, we want to solve [2] for k:
|
||||
|
||||
(y+k).n - (y+k).o = x.n - x.o [2]
|
||||
(y+k).n - (y+k).o = x.n [2]
|
||||
|
||||
By #1, this is the same as
|
||||
|
||||
(y+k).n - ((y+k).s + (y+k).d) = x.n - x.o [3]
|
||||
(y+k).n - ((y+k).s + (y+k).d) = x.n [3]
|
||||
|
||||
By #5, (y+k).n = y.n + k, which equals x.n + k because x.n=y.n at the start.
|
||||
Substituting that into [3],
|
||||
|
||||
x.n + k - (y+k).s - (y+k).d = x.n - x.o; the x.n terms cancel, leaving
|
||||
k - (y+k).s - (y+k).d = - x.o; rearranging,
|
||||
k = (y+k).s - x.o - (y+k).d; by #4, (y+k).s == y.s, so
|
||||
k = y.s - x.o - (y+k).d; then by #1, y.s = y.o - y.d, so
|
||||
k = y.o - y.d - x.o - (y+k).d
|
||||
x.n + k - (y+k).s - (y+k).d = x.n; the x.n terms cancel, leaving
|
||||
k - (y+k).s - (y+k).d = 0; rearranging,
|
||||
k = (y+k).s - (y+k).d; by #4, (y+k).s == y.s, so
|
||||
k = y.s - (y+k).d
|
||||
|
||||
On the RHS, (y+k).d can't be computed directly, but all the rest can be, and
|
||||
we approximate k by ignoring the (y+k).d term at first. Note that k can't
|
||||
be very large, since all offset-returning methods return a duration of
|
||||
magnitude less than 24 hours. For that reason, if y is firmly in std time,
|
||||
(y+k).d must be 0, so ignoring it has no consequence then.
|
||||
On the RHS, (y+k).d can't be computed directly, but y.s can be, and we
|
||||
approximate k by ignoring the (y+k).d term at first. Note that k can't be
|
||||
very large, since all offset-returning methods return a duration of magnitude
|
||||
less than 24 hours. For that reason, if y is firmly in std time, (y+k).d must
|
||||
be 0, so ignoring it has no consequence then.
|
||||
|
||||
In any case, the new value is
|
||||
|
||||
z = y + y.o - y.d - x.o [4]
|
||||
z = y + y.s [4]
|
||||
|
||||
It's helpful to step back at look at [4] from a higher level: rewrite it as
|
||||
|
||||
z = (y - x.o) + (y.o - y.d)
|
||||
|
||||
(y - x.o).n = [by #5] y.n - x.o = [since y.n=x.n] x.n - x.o = [by #3] x's
|
||||
UTC equivalent time. So the y-x.o part essentially converts x to UTC. Then
|
||||
the y.o-y.d part essentially converts x's UTC equivalent into tz's standard
|
||||
time (y.o-y.d=y.s by #1).
|
||||
It's helpful to step back at look at [4] from a higher level: it's simply
|
||||
mapping from UTC to tz's standard time.
|
||||
|
||||
At this point, if
|
||||
|
||||
z.n - z.o = x.n - x.o [5]
|
||||
z.n - z.o = x.n [5]
|
||||
|
||||
we have an equivalent time, and are almost done. The insecurity here is
|
||||
at the start of daylight time. Picture US Eastern for concreteness. The wall
|
||||
time jumps from 1:59 to 3:00, and wall hours of the form 2:MM don't make good
|
||||
sense then. A sensible Eastern tzinfo class will consider such a time to be
|
||||
EDT (because it's "after 2"), which is a redundant spelling of 1:MM EST on the
|
||||
day DST starts. We want to return the 1:MM EST spelling because that's
|
||||
sense then. The docs ask that an Eastern tzinfo class consider such a time to
|
||||
be EDT (because it's "after 2"), which is a redundant spelling of 1:MM EST
|
||||
on the day DST starts. We want to return the 1:MM EST spelling because that's
|
||||
the only spelling that makes sense on the local wall clock.
|
||||
|
||||
In fact, if [5] holds at this point, we do have the standard-time spelling,
|
||||
but that takes a bit of proof. We first prove a stronger result. What's the
|
||||
difference between the LHS and RHS of [5]? Let
|
||||
|
||||
diff = (x.n - x.o) - (z.n - z.o) [6]
|
||||
diff = x.n - (z.n - z.o) [6]
|
||||
|
||||
Now
|
||||
z.n = by [4]
|
||||
(y + y.o - y.d - x.o).n = by #5
|
||||
y.n + y.o - y.d - x.o = since y.n = x.n
|
||||
x.n + y.o - y.d - x.o = since y.o = y.s + y.d by #1
|
||||
x.n + (y.s + y.d) - y.d - x.o = cancelling the y.d terms
|
||||
x.n + y.s - x.o = since z and y are have the same tzinfo member,
|
||||
y.s = z.s by #2
|
||||
x.n + z.s - x.o
|
||||
(y + y.s).n = by #5
|
||||
y.n + y.s = since y.n = x.n
|
||||
x.n + y.s = since z and y are have the same tzinfo member,
|
||||
y.s = z.s by #2
|
||||
x.n + z.s
|
||||
|
||||
Plugging that back into [6] gives
|
||||
|
||||
diff =
|
||||
(x.n - x.o) - ((x.n + z.s - x.o) - z.o) = expanding
|
||||
x.n - x.o - x.n - z.s + x.o + z.o = cancelling
|
||||
- z.s + z.o = by #2
|
||||
x.n - ((x.n + z.s) - z.o) = expanding
|
||||
x.n - x.n - z.s + z.o = cancelling
|
||||
- z.s + z.o = by #2
|
||||
z.d
|
||||
|
||||
So diff = z.d.
|
||||
|
||||
If [5] is true now, diff = 0, so z.d = 0 too, and we have the standard-time
|
||||
spelling we wanted in the endcase described above. We're done.
|
||||
spelling we wanted in the endcase described above. We're done. Contrarily,
|
||||
if z.d = 0, then we have a UTC equivalent, and are also done.
|
||||
|
||||
If [5] is not true now, diff = z.d != 0, and z.d is the offset we need to
|
||||
add to z (in effect, z is in tz's standard time, and we need to shift the
|
||||
offset into tz's daylight time).
|
||||
local clock into tz's daylight time).
|
||||
|
||||
Let
|
||||
|
||||
|
@ -4910,45 +4903,46 @@ Let
|
|||
|
||||
and we can again ask whether
|
||||
|
||||
z'.n - z'.o = x.n - x.o [8]
|
||||
z'.n - z'.o = x.n [8]
|
||||
|
||||
If so, we're done. If not, the tzinfo class is insane, or we're trying to
|
||||
convert to the hour that can't be spelled in tz. This also requires a
|
||||
bit of proof. As before, let's compute the difference between the LHS and
|
||||
RHS of [8] (and skipping some of the justifications for the kinds of
|
||||
substitutions we've done several times already):
|
||||
If so, we're done. If not, the tzinfo class is insane, according to the
|
||||
assumptions we've made. This also requires a bit of proof. As before, let's
|
||||
compute the difference between the LHS and RHS of [8] (and skipping some of
|
||||
the justifications for the kinds of substitutions we've done several times
|
||||
already):
|
||||
|
||||
diff' = (x.n - x.o) - (z'.n - z'.o) = replacing z'.n via [7]
|
||||
(x.n - x.o) - (z.n + diff - z'.o) = replacing diff via [6]
|
||||
(x.n - x.o) - (z.n + (x.n - x.o) - (z.n - z.o) - z'.o) =
|
||||
x.n - x.o - z.n - x.n + x.o + z.n - z.o + z'.o = cancel x.n
|
||||
- x.o - z.n + x.o + z.n - z.o + z'.o = cancel x.o
|
||||
- z.n + z.n - z.o + z'.o = cancel z.n
|
||||
diff' = x.n - (z'.n - z'.o) = replacing z'.n via [7]
|
||||
x.n - (z.n + diff - z'.o) = replacing diff via [6]
|
||||
x.n - (z.n + x.n - (z.n - z.o) - z'.o) =
|
||||
x.n - z.n - x.n + z.n - z.o + z'.o = cancel x.n
|
||||
- z.n + z.n - z.o + z'.o = cancel z.n
|
||||
- z.o + z'.o = #1 twice
|
||||
-z.s - z.d + z'.s + z'.d = z and z' have same tzinfo
|
||||
z'.d - z.d
|
||||
|
||||
So z' is UTC-equivalent to x iff z'.d = z.d at this point. If they are equal,
|
||||
we've found the UTC-equivalent so are done.
|
||||
we've found the UTC-equivalent so are done. In fact, we stop with [7] and
|
||||
return z', not bothering to compute z'.d.
|
||||
|
||||
How could they differ? z' = z + z.d [7], so merely moving z' by a dst()
|
||||
offset, and starting *from* a time already in DST (we know z.d != 0), would
|
||||
have to change the result dst() returns: we start in DST, and moving a
|
||||
little further into it takes us out of DST.
|
||||
How could z.d and z'd differ? z' = z + z.d [7], so merely moving z' by
|
||||
a dst() offset, and starting *from* a time already in DST (we know z.d != 0),
|
||||
would have to change the result dst() returns: we start in DST, and moving
|
||||
a little further into it takes us out of DST.
|
||||
|
||||
There's (only) one sane case where this can happen: at the end of DST,
|
||||
there's an hour in UTC with no spelling in a hybrid tzinfo class. In US
|
||||
Eastern, that's 6:MM UTC = 1:MM EST = 2:MM EDT. During that hour, on an
|
||||
Eastern clock 1:MM is taken as being in daylight time (5:MM UTC), but 2:MM is
|
||||
taken as being in standard time (7:MM UTC). There is no local time mapping to
|
||||
6:MM UTC. The local clock jumps from 1:59 back to 1:00 again, and repeats the
|
||||
1:MM hour in standard time. Since that's what the local clock *does*, we want
|
||||
to map both UTC hours 5:MM and 6:MM to 1:MM Eastern. The result is ambiguous
|
||||
There isn't a sane case where this can happen. The closest it gets is at
|
||||
the end of DST, where there's an hour in UTC with no spelling in a hybrid
|
||||
tzinfo class. In US Eastern, that's 5:MM UTC = 0:MM EST = 1:MM EDT. During
|
||||
that hour, on an Eastern clock 1:MM is taken as being in standard time (6:MM
|
||||
UTC) because the docs insist on that, but 0:MM is taken as being in daylight
|
||||
time (4:MM UTC). There is no local time mapping to 5:MM UTC. The local
|
||||
clock jumps from 1:59 back to 1:00 again, and repeats the 1:MM hour in
|
||||
standard time. Since that's what the local clock *does*, we want to map both
|
||||
UTC hours 5:MM and 6:MM to 1:MM Eastern. The result is ambiguous
|
||||
in local time, but so it goes -- it's the way the local clock works.
|
||||
|
||||
When x = 6:MM UTC is the input to this algorithm, x.o=0, y.o=-5 and y.d=0,
|
||||
so z=1:MM. z.d=60 (minutes) then, so [5] doesn't hold and we keep going.
|
||||
z' = z + z.d = 2:MM then, and z'.d=0, and z'.d - z.d = -60 != 0 so [8]
|
||||
When x = 5:MM UTC is the input to this algorithm, x.o=0, y.o=-5 and y.d=0,
|
||||
so z=0:MM. z.d=60 (minutes) then, so [5] doesn't hold and we keep going.
|
||||
z' = z + z.d = 1:MM then, and z'.d=0, and z'.d - z.d = -60 != 0 so [8]
|
||||
(correctly) concludes that z' is not UTC-equivalent to x.
|
||||
|
||||
Because we know z.d said z was in daylight time (else [5] would have held and
|
||||
|
@ -4957,10 +4951,33 @@ and we we have stopped then), and there are only 2 possible values dst() can
|
|||
return in Eastern, it follows that z'.d must be 0 (which it is in the example,
|
||||
but the reasoning doesn't depend on the example -- it depends on there being
|
||||
two possible dst() outcomes, one zero and the other non-zero). Therefore
|
||||
z' must be in standard time, and is not the spelling we want in this case.
|
||||
z is in daylight time, and is the spelling we want. Note again that z is
|
||||
not UTC-equivalent as far as the hybrid tzinfo class is concerned (because
|
||||
it takes z as being in standard time rather than the daylight time we intend
|
||||
here), but returning it gives the real-life "local clock repeats an hour"
|
||||
behavior when mapping the "unspellable" UTC hour into tz.
|
||||
z' must be in standard time, and is the spelling we want in this case.
|
||||
|
||||
Note again that z' is not UTC-equivalent as far as the hybrid tzinfo class is
|
||||
concerned (because it takes z' as being in standard time rather than the
|
||||
daylight time we intend here), but returning it gives the real-life "local
|
||||
clock repeats an hour" behavior when mapping the "unspellable" UTC hour into
|
||||
tz.
|
||||
|
||||
When the input is 6:MM, z=1:MM and z.d=0, and we stop at once, again with
|
||||
the 1:MM standard time spelling we want.
|
||||
|
||||
So how can this break? One of the assumptions must be violated. Two
|
||||
possibilities:
|
||||
|
||||
1) [2] effectively says that y.s is invariant across all y belong to a given
|
||||
time zone. This isn't true if, for political reasons or continental drift,
|
||||
a region decides to change its base offset from UTC.
|
||||
|
||||
2) There may be versions of "double daylight" time where the tail end of
|
||||
the analysis gives up a step too early. I haven't thought about that
|
||||
enough to say.
|
||||
|
||||
In any case, it's clear that the default fromutc() is strong enough to handle
|
||||
"almost all" time zones: so long as the standard offset is invariant, it
|
||||
doesn't matter if daylight time transition points change from year to year, or
|
||||
if daylight time is skipped in some years; it doesn't matter how large or
|
||||
small dst() may get within its bounds; and it doesn't even matter if some
|
||||
perverse time zone returns a negative dst()). So a breaking case must be
|
||||
pretty bizarre, and a tzinfo subclass can override fromutc() if it is.
|
||||
--------------------------------------------------------------------------- */
|
||||
|
|
Loading…
Reference in New Issue