Updated the astimezone() proof to recover from all the last week's

changes (and there were a lot of relevant changes!).
This commit is contained in:
Tim Peters 2003-01-24 02:44:45 +00:00
parent 2fbe5378f9
commit 8bb5ad2e56
1 changed files with 92 additions and 75 deletions

View File

@ -4806,103 +4806,96 @@ Now some derived rules, where k is a duration (timedelta).
This is again a requirement for a sane tzinfo class. This is again a requirement for a sane tzinfo class.
4. (x+k).s = x.s 4. (x+k).s = x.s
This follows from #2, and that datimetime+timedelta preserves tzinfo. This follows from #2, and that datimetimetz+timedelta preserves tzinfo.
5. (x+k).n = x.n + k 5. (x+k).n = x.n + k
Again follows from how arithmetic is defined. Again follows from how arithmetic is defined.
Now we can explain x.astimezone(tz). Let's assume it's an interesting case Now we can explain tz.fromutc(x). Let's assume it's an interesting case
(meaning that the various tzinfo methods exist, and don't blow up or return (meaning that the various tzinfo methods exist, and don't blow up or return
None when called). None when called).
The function wants to return a datetime y with timezone tz, equivalent to x. The function wants to return a datetime y with timezone tz, equivalent to x.
x is already in UTC.
By #3, we want By #3, we want
y.n - y.o = x.n - x.o [1] y.n - y.o = x.n [1]
The algorithm starts by attaching tz to x.n, and calling that y. So The algorithm starts by attaching tz to x.n, and calling that y. So
x.n = y.n at the start. Then it wants to add a duration k to y, so that [1] x.n = y.n at the start. Then it wants to add a duration k to y, so that [1]
becomes true; in effect, we want to solve [2] for k: becomes true; in effect, we want to solve [2] for k:
(y+k).n - (y+k).o = x.n - x.o [2] (y+k).n - (y+k).o = x.n [2]
By #1, this is the same as By #1, this is the same as
(y+k).n - ((y+k).s + (y+k).d) = x.n - x.o [3] (y+k).n - ((y+k).s + (y+k).d) = x.n [3]
By #5, (y+k).n = y.n + k, which equals x.n + k because x.n=y.n at the start. By #5, (y+k).n = y.n + k, which equals x.n + k because x.n=y.n at the start.
Substituting that into [3], Substituting that into [3],
x.n + k - (y+k).s - (y+k).d = x.n - x.o; the x.n terms cancel, leaving x.n + k - (y+k).s - (y+k).d = x.n; the x.n terms cancel, leaving
k - (y+k).s - (y+k).d = - x.o; rearranging, k - (y+k).s - (y+k).d = 0; rearranging,
k = (y+k).s - x.o - (y+k).d; by #4, (y+k).s == y.s, so k = (y+k).s - (y+k).d; by #4, (y+k).s == y.s, so
k = y.s - x.o - (y+k).d; then by #1, y.s = y.o - y.d, so k = y.s - (y+k).d
k = y.o - y.d - x.o - (y+k).d
On the RHS, (y+k).d can't be computed directly, but all the rest can be, and On the RHS, (y+k).d can't be computed directly, but y.s can be, and we
we approximate k by ignoring the (y+k).d term at first. Note that k can't approximate k by ignoring the (y+k).d term at first. Note that k can't be
be very large, since all offset-returning methods return a duration of very large, since all offset-returning methods return a duration of magnitude
magnitude less than 24 hours. For that reason, if y is firmly in std time, less than 24 hours. For that reason, if y is firmly in std time, (y+k).d must
(y+k).d must be 0, so ignoring it has no consequence then. be 0, so ignoring it has no consequence then.
In any case, the new value is In any case, the new value is
z = y + y.o - y.d - x.o [4] z = y + y.s [4]
It's helpful to step back at look at [4] from a higher level: rewrite it as It's helpful to step back at look at [4] from a higher level: it's simply
mapping from UTC to tz's standard time.
z = (y - x.o) + (y.o - y.d)
(y - x.o).n = [by #5] y.n - x.o = [since y.n=x.n] x.n - x.o = [by #3] x's
UTC equivalent time. So the y-x.o part essentially converts x to UTC. Then
the y.o-y.d part essentially converts x's UTC equivalent into tz's standard
time (y.o-y.d=y.s by #1).
At this point, if At this point, if
z.n - z.o = x.n - x.o [5] z.n - z.o = x.n [5]
we have an equivalent time, and are almost done. The insecurity here is we have an equivalent time, and are almost done. The insecurity here is
at the start of daylight time. Picture US Eastern for concreteness. The wall at the start of daylight time. Picture US Eastern for concreteness. The wall
time jumps from 1:59 to 3:00, and wall hours of the form 2:MM don't make good time jumps from 1:59 to 3:00, and wall hours of the form 2:MM don't make good
sense then. A sensible Eastern tzinfo class will consider such a time to be sense then. The docs ask that an Eastern tzinfo class consider such a time to
EDT (because it's "after 2"), which is a redundant spelling of 1:MM EST on the be EDT (because it's "after 2"), which is a redundant spelling of 1:MM EST
day DST starts. We want to return the 1:MM EST spelling because that's on the day DST starts. We want to return the 1:MM EST spelling because that's
the only spelling that makes sense on the local wall clock. the only spelling that makes sense on the local wall clock.
In fact, if [5] holds at this point, we do have the standard-time spelling, In fact, if [5] holds at this point, we do have the standard-time spelling,
but that takes a bit of proof. We first prove a stronger result. What's the but that takes a bit of proof. We first prove a stronger result. What's the
difference between the LHS and RHS of [5]? Let difference between the LHS and RHS of [5]? Let
diff = (x.n - x.o) - (z.n - z.o) [6] diff = x.n - (z.n - z.o) [6]
Now Now
z.n = by [4] z.n = by [4]
(y + y.o - y.d - x.o).n = by #5 (y + y.s).n = by #5
y.n + y.o - y.d - x.o = since y.n = x.n y.n + y.s = since y.n = x.n
x.n + y.o - y.d - x.o = since y.o = y.s + y.d by #1 x.n + y.s = since z and y are have the same tzinfo member,
x.n + (y.s + y.d) - y.d - x.o = cancelling the y.d terms y.s = z.s by #2
x.n + y.s - x.o = since z and y are have the same tzinfo member, x.n + z.s
y.s = z.s by #2
x.n + z.s - x.o
Plugging that back into [6] gives Plugging that back into [6] gives
diff = diff =
(x.n - x.o) - ((x.n + z.s - x.o) - z.o) = expanding x.n - ((x.n + z.s) - z.o) = expanding
x.n - x.o - x.n - z.s + x.o + z.o = cancelling x.n - x.n - z.s + z.o = cancelling
- z.s + z.o = by #2 - z.s + z.o = by #2
z.d z.d
So diff = z.d. So diff = z.d.
If [5] is true now, diff = 0, so z.d = 0 too, and we have the standard-time If [5] is true now, diff = 0, so z.d = 0 too, and we have the standard-time
spelling we wanted in the endcase described above. We're done. spelling we wanted in the endcase described above. We're done. Contrarily,
if z.d = 0, then we have a UTC equivalent, and are also done.
If [5] is not true now, diff = z.d != 0, and z.d is the offset we need to If [5] is not true now, diff = z.d != 0, and z.d is the offset we need to
add to z (in effect, z is in tz's standard time, and we need to shift the add to z (in effect, z is in tz's standard time, and we need to shift the
offset into tz's daylight time). local clock into tz's daylight time).
Let Let
@ -4910,45 +4903,46 @@ Let
and we can again ask whether and we can again ask whether
z'.n - z'.o = x.n - x.o [8] z'.n - z'.o = x.n [8]
If so, we're done. If not, the tzinfo class is insane, or we're trying to If so, we're done. If not, the tzinfo class is insane, according to the
convert to the hour that can't be spelled in tz. This also requires a assumptions we've made. This also requires a bit of proof. As before, let's
bit of proof. As before, let's compute the difference between the LHS and compute the difference between the LHS and RHS of [8] (and skipping some of
RHS of [8] (and skipping some of the justifications for the kinds of the justifications for the kinds of substitutions we've done several times
substitutions we've done several times already): already):
diff' = (x.n - x.o) - (z'.n - z'.o) = replacing z'.n via [7] diff' = x.n - (z'.n - z'.o) = replacing z'.n via [7]
(x.n - x.o) - (z.n + diff - z'.o) = replacing diff via [6] x.n - (z.n + diff - z'.o) = replacing diff via [6]
(x.n - x.o) - (z.n + (x.n - x.o) - (z.n - z.o) - z'.o) = x.n - (z.n + x.n - (z.n - z.o) - z'.o) =
x.n - x.o - z.n - x.n + x.o + z.n - z.o + z'.o = cancel x.n x.n - z.n - x.n + z.n - z.o + z'.o = cancel x.n
- x.o - z.n + x.o + z.n - z.o + z'.o = cancel x.o - z.n + z.n - z.o + z'.o = cancel z.n
- z.n + z.n - z.o + z'.o = cancel z.n
- z.o + z'.o = #1 twice - z.o + z'.o = #1 twice
-z.s - z.d + z'.s + z'.d = z and z' have same tzinfo -z.s - z.d + z'.s + z'.d = z and z' have same tzinfo
z'.d - z.d z'.d - z.d
So z' is UTC-equivalent to x iff z'.d = z.d at this point. If they are equal, So z' is UTC-equivalent to x iff z'.d = z.d at this point. If they are equal,
we've found the UTC-equivalent so are done. we've found the UTC-equivalent so are done. In fact, we stop with [7] and
return z', not bothering to compute z'.d.
How could they differ? z' = z + z.d [7], so merely moving z' by a dst() How could z.d and z'd differ? z' = z + z.d [7], so merely moving z' by
offset, and starting *from* a time already in DST (we know z.d != 0), would a dst() offset, and starting *from* a time already in DST (we know z.d != 0),
have to change the result dst() returns: we start in DST, and moving a would have to change the result dst() returns: we start in DST, and moving
little further into it takes us out of DST. a little further into it takes us out of DST.
There's (only) one sane case where this can happen: at the end of DST, There isn't a sane case where this can happen. The closest it gets is at
there's an hour in UTC with no spelling in a hybrid tzinfo class. In US the end of DST, where there's an hour in UTC with no spelling in a hybrid
Eastern, that's 6:MM UTC = 1:MM EST = 2:MM EDT. During that hour, on an tzinfo class. In US Eastern, that's 5:MM UTC = 0:MM EST = 1:MM EDT. During
Eastern clock 1:MM is taken as being in daylight time (5:MM UTC), but 2:MM is that hour, on an Eastern clock 1:MM is taken as being in standard time (6:MM
taken as being in standard time (7:MM UTC). There is no local time mapping to UTC) because the docs insist on that, but 0:MM is taken as being in daylight
6:MM UTC. The local clock jumps from 1:59 back to 1:00 again, and repeats the time (4:MM UTC). There is no local time mapping to 5:MM UTC. The local
1:MM hour in standard time. Since that's what the local clock *does*, we want clock jumps from 1:59 back to 1:00 again, and repeats the 1:MM hour in
to map both UTC hours 5:MM and 6:MM to 1:MM Eastern. The result is ambiguous standard time. Since that's what the local clock *does*, we want to map both
UTC hours 5:MM and 6:MM to 1:MM Eastern. The result is ambiguous
in local time, but so it goes -- it's the way the local clock works. in local time, but so it goes -- it's the way the local clock works.
When x = 6:MM UTC is the input to this algorithm, x.o=0, y.o=-5 and y.d=0, When x = 5:MM UTC is the input to this algorithm, x.o=0, y.o=-5 and y.d=0,
so z=1:MM. z.d=60 (minutes) then, so [5] doesn't hold and we keep going. so z=0:MM. z.d=60 (minutes) then, so [5] doesn't hold and we keep going.
z' = z + z.d = 2:MM then, and z'.d=0, and z'.d - z.d = -60 != 0 so [8] z' = z + z.d = 1:MM then, and z'.d=0, and z'.d - z.d = -60 != 0 so [8]
(correctly) concludes that z' is not UTC-equivalent to x. (correctly) concludes that z' is not UTC-equivalent to x.
Because we know z.d said z was in daylight time (else [5] would have held and Because we know z.d said z was in daylight time (else [5] would have held and
@ -4957,10 +4951,33 @@ and we we have stopped then), and there are only 2 possible values dst() can
return in Eastern, it follows that z'.d must be 0 (which it is in the example, return in Eastern, it follows that z'.d must be 0 (which it is in the example,
but the reasoning doesn't depend on the example -- it depends on there being but the reasoning doesn't depend on the example -- it depends on there being
two possible dst() outcomes, one zero and the other non-zero). Therefore two possible dst() outcomes, one zero and the other non-zero). Therefore
z' must be in standard time, and is not the spelling we want in this case. z' must be in standard time, and is the spelling we want in this case.
z is in daylight time, and is the spelling we want. Note again that z is
not UTC-equivalent as far as the hybrid tzinfo class is concerned (because Note again that z' is not UTC-equivalent as far as the hybrid tzinfo class is
it takes z as being in standard time rather than the daylight time we intend concerned (because it takes z' as being in standard time rather than the
here), but returning it gives the real-life "local clock repeats an hour" daylight time we intend here), but returning it gives the real-life "local
behavior when mapping the "unspellable" UTC hour into tz. clock repeats an hour" behavior when mapping the "unspellable" UTC hour into
tz.
When the input is 6:MM, z=1:MM and z.d=0, and we stop at once, again with
the 1:MM standard time spelling we want.
So how can this break? One of the assumptions must be violated. Two
possibilities:
1) [2] effectively says that y.s is invariant across all y belong to a given
time zone. This isn't true if, for political reasons or continental drift,
a region decides to change its base offset from UTC.
2) There may be versions of "double daylight" time where the tail end of
the analysis gives up a step too early. I haven't thought about that
enough to say.
In any case, it's clear that the default fromutc() is strong enough to handle
"almost all" time zones: so long as the standard offset is invariant, it
doesn't matter if daylight time transition points change from year to year, or
if daylight time is skipped in some years; it doesn't matter how large or
small dst() may get within its bounds; and it doesn't even matter if some
perverse time zone returns a negative dst()). So a breaking case must be
pretty bizarre, and a tzinfo subclass can override fromutc() if it is.
--------------------------------------------------------------------------- */ --------------------------------------------------------------------------- */