From 8bb5ad2e566102e7319f6d5b8eac3347c64261cf Mon Sep 17 00:00:00 2001 From: Tim Peters Date: Fri, 24 Jan 2003 02:44:45 +0000 Subject: [PATCH] Updated the astimezone() proof to recover from all the last week's changes (and there were a lot of relevant changes!). --- Modules/datetimemodule.c | 167 +++++++++++++++++++++------------------ 1 file changed, 92 insertions(+), 75 deletions(-) diff --git a/Modules/datetimemodule.c b/Modules/datetimemodule.c index 82f5dd9f39e..6f0cf1e6b59 100644 --- a/Modules/datetimemodule.c +++ b/Modules/datetimemodule.c @@ -4806,103 +4806,96 @@ Now some derived rules, where k is a duration (timedelta). This is again a requirement for a sane tzinfo class. 4. (x+k).s = x.s - This follows from #2, and that datimetime+timedelta preserves tzinfo. + This follows from #2, and that datimetimetz+timedelta preserves tzinfo. 5. (x+k).n = x.n + k Again follows from how arithmetic is defined. -Now we can explain x.astimezone(tz). Let's assume it's an interesting case +Now we can explain tz.fromutc(x). Let's assume it's an interesting case (meaning that the various tzinfo methods exist, and don't blow up or return None when called). The function wants to return a datetime y with timezone tz, equivalent to x. +x is already in UTC. By #3, we want - y.n - y.o = x.n - x.o [1] + y.n - y.o = x.n [1] The algorithm starts by attaching tz to x.n, and calling that y. So x.n = y.n at the start. Then it wants to add a duration k to y, so that [1] becomes true; in effect, we want to solve [2] for k: - (y+k).n - (y+k).o = x.n - x.o [2] + (y+k).n - (y+k).o = x.n [2] By #1, this is the same as - (y+k).n - ((y+k).s + (y+k).d) = x.n - x.o [3] + (y+k).n - ((y+k).s + (y+k).d) = x.n [3] By #5, (y+k).n = y.n + k, which equals x.n + k because x.n=y.n at the start. Substituting that into [3], - x.n + k - (y+k).s - (y+k).d = x.n - x.o; the x.n terms cancel, leaving - k - (y+k).s - (y+k).d = - x.o; rearranging, - k = (y+k).s - x.o - (y+k).d; by #4, (y+k).s == y.s, so - k = y.s - x.o - (y+k).d; then by #1, y.s = y.o - y.d, so - k = y.o - y.d - x.o - (y+k).d + x.n + k - (y+k).s - (y+k).d = x.n; the x.n terms cancel, leaving + k - (y+k).s - (y+k).d = 0; rearranging, + k = (y+k).s - (y+k).d; by #4, (y+k).s == y.s, so + k = y.s - (y+k).d -On the RHS, (y+k).d can't be computed directly, but all the rest can be, and -we approximate k by ignoring the (y+k).d term at first. Note that k can't -be very large, since all offset-returning methods return a duration of -magnitude less than 24 hours. For that reason, if y is firmly in std time, -(y+k).d must be 0, so ignoring it has no consequence then. +On the RHS, (y+k).d can't be computed directly, but y.s can be, and we +approximate k by ignoring the (y+k).d term at first. Note that k can't be +very large, since all offset-returning methods return a duration of magnitude +less than 24 hours. For that reason, if y is firmly in std time, (y+k).d must +be 0, so ignoring it has no consequence then. In any case, the new value is - z = y + y.o - y.d - x.o [4] + z = y + y.s [4] -It's helpful to step back at look at [4] from a higher level: rewrite it as - - z = (y - x.o) + (y.o - y.d) - -(y - x.o).n = [by #5] y.n - x.o = [since y.n=x.n] x.n - x.o = [by #3] x's -UTC equivalent time. So the y-x.o part essentially converts x to UTC. Then -the y.o-y.d part essentially converts x's UTC equivalent into tz's standard -time (y.o-y.d=y.s by #1). +It's helpful to step back at look at [4] from a higher level: it's simply +mapping from UTC to tz's standard time. At this point, if - z.n - z.o = x.n - x.o [5] + z.n - z.o = x.n [5] we have an equivalent time, and are almost done. The insecurity here is at the start of daylight time. Picture US Eastern for concreteness. The wall time jumps from 1:59 to 3:00, and wall hours of the form 2:MM don't make good -sense then. A sensible Eastern tzinfo class will consider such a time to be -EDT (because it's "after 2"), which is a redundant spelling of 1:MM EST on the -day DST starts. We want to return the 1:MM EST spelling because that's +sense then. The docs ask that an Eastern tzinfo class consider such a time to +be EDT (because it's "after 2"), which is a redundant spelling of 1:MM EST +on the day DST starts. We want to return the 1:MM EST spelling because that's the only spelling that makes sense on the local wall clock. In fact, if [5] holds at this point, we do have the standard-time spelling, but that takes a bit of proof. We first prove a stronger result. What's the difference between the LHS and RHS of [5]? Let - diff = (x.n - x.o) - (z.n - z.o) [6] + diff = x.n - (z.n - z.o) [6] Now z.n = by [4] - (y + y.o - y.d - x.o).n = by #5 - y.n + y.o - y.d - x.o = since y.n = x.n - x.n + y.o - y.d - x.o = since y.o = y.s + y.d by #1 - x.n + (y.s + y.d) - y.d - x.o = cancelling the y.d terms - x.n + y.s - x.o = since z and y are have the same tzinfo member, - y.s = z.s by #2 - x.n + z.s - x.o + (y + y.s).n = by #5 + y.n + y.s = since y.n = x.n + x.n + y.s = since z and y are have the same tzinfo member, + y.s = z.s by #2 + x.n + z.s Plugging that back into [6] gives diff = - (x.n - x.o) - ((x.n + z.s - x.o) - z.o) = expanding - x.n - x.o - x.n - z.s + x.o + z.o = cancelling - - z.s + z.o = by #2 + x.n - ((x.n + z.s) - z.o) = expanding + x.n - x.n - z.s + z.o = cancelling + - z.s + z.o = by #2 z.d So diff = z.d. If [5] is true now, diff = 0, so z.d = 0 too, and we have the standard-time -spelling we wanted in the endcase described above. We're done. +spelling we wanted in the endcase described above. We're done. Contrarily, +if z.d = 0, then we have a UTC equivalent, and are also done. If [5] is not true now, diff = z.d != 0, and z.d is the offset we need to add to z (in effect, z is in tz's standard time, and we need to shift the -offset into tz's daylight time). +local clock into tz's daylight time). Let @@ -4910,45 +4903,46 @@ Let and we can again ask whether - z'.n - z'.o = x.n - x.o [8] + z'.n - z'.o = x.n [8] -If so, we're done. If not, the tzinfo class is insane, or we're trying to -convert to the hour that can't be spelled in tz. This also requires a -bit of proof. As before, let's compute the difference between the LHS and -RHS of [8] (and skipping some of the justifications for the kinds of -substitutions we've done several times already): +If so, we're done. If not, the tzinfo class is insane, according to the +assumptions we've made. This also requires a bit of proof. As before, let's +compute the difference between the LHS and RHS of [8] (and skipping some of +the justifications for the kinds of substitutions we've done several times +already): - diff' = (x.n - x.o) - (z'.n - z'.o) = replacing z'.n via [7] - (x.n - x.o) - (z.n + diff - z'.o) = replacing diff via [6] - (x.n - x.o) - (z.n + (x.n - x.o) - (z.n - z.o) - z'.o) = - x.n - x.o - z.n - x.n + x.o + z.n - z.o + z'.o = cancel x.n - - x.o - z.n + x.o + z.n - z.o + z'.o = cancel x.o - - z.n + z.n - z.o + z'.o = cancel z.n + diff' = x.n - (z'.n - z'.o) = replacing z'.n via [7] + x.n - (z.n + diff - z'.o) = replacing diff via [6] + x.n - (z.n + x.n - (z.n - z.o) - z'.o) = + x.n - z.n - x.n + z.n - z.o + z'.o = cancel x.n + - z.n + z.n - z.o + z'.o = cancel z.n - z.o + z'.o = #1 twice -z.s - z.d + z'.s + z'.d = z and z' have same tzinfo z'.d - z.d So z' is UTC-equivalent to x iff z'.d = z.d at this point. If they are equal, -we've found the UTC-equivalent so are done. +we've found the UTC-equivalent so are done. In fact, we stop with [7] and +return z', not bothering to compute z'.d. -How could they differ? z' = z + z.d [7], so merely moving z' by a dst() -offset, and starting *from* a time already in DST (we know z.d != 0), would -have to change the result dst() returns: we start in DST, and moving a -little further into it takes us out of DST. +How could z.d and z'd differ? z' = z + z.d [7], so merely moving z' by +a dst() offset, and starting *from* a time already in DST (we know z.d != 0), +would have to change the result dst() returns: we start in DST, and moving +a little further into it takes us out of DST. -There's (only) one sane case where this can happen: at the end of DST, -there's an hour in UTC with no spelling in a hybrid tzinfo class. In US -Eastern, that's 6:MM UTC = 1:MM EST = 2:MM EDT. During that hour, on an -Eastern clock 1:MM is taken as being in daylight time (5:MM UTC), but 2:MM is -taken as being in standard time (7:MM UTC). There is no local time mapping to -6:MM UTC. The local clock jumps from 1:59 back to 1:00 again, and repeats the -1:MM hour in standard time. Since that's what the local clock *does*, we want -to map both UTC hours 5:MM and 6:MM to 1:MM Eastern. The result is ambiguous +There isn't a sane case where this can happen. The closest it gets is at +the end of DST, where there's an hour in UTC with no spelling in a hybrid +tzinfo class. In US Eastern, that's 5:MM UTC = 0:MM EST = 1:MM EDT. During +that hour, on an Eastern clock 1:MM is taken as being in standard time (6:MM +UTC) because the docs insist on that, but 0:MM is taken as being in daylight +time (4:MM UTC). There is no local time mapping to 5:MM UTC. The local +clock jumps from 1:59 back to 1:00 again, and repeats the 1:MM hour in +standard time. Since that's what the local clock *does*, we want to map both +UTC hours 5:MM and 6:MM to 1:MM Eastern. The result is ambiguous in local time, but so it goes -- it's the way the local clock works. -When x = 6:MM UTC is the input to this algorithm, x.o=0, y.o=-5 and y.d=0, -so z=1:MM. z.d=60 (minutes) then, so [5] doesn't hold and we keep going. -z' = z + z.d = 2:MM then, and z'.d=0, and z'.d - z.d = -60 != 0 so [8] +When x = 5:MM UTC is the input to this algorithm, x.o=0, y.o=-5 and y.d=0, +so z=0:MM. z.d=60 (minutes) then, so [5] doesn't hold and we keep going. +z' = z + z.d = 1:MM then, and z'.d=0, and z'.d - z.d = -60 != 0 so [8] (correctly) concludes that z' is not UTC-equivalent to x. Because we know z.d said z was in daylight time (else [5] would have held and @@ -4957,10 +4951,33 @@ and we we have stopped then), and there are only 2 possible values dst() can return in Eastern, it follows that z'.d must be 0 (which it is in the example, but the reasoning doesn't depend on the example -- it depends on there being two possible dst() outcomes, one zero and the other non-zero). Therefore -z' must be in standard time, and is not the spelling we want in this case. -z is in daylight time, and is the spelling we want. Note again that z is -not UTC-equivalent as far as the hybrid tzinfo class is concerned (because -it takes z as being in standard time rather than the daylight time we intend -here), but returning it gives the real-life "local clock repeats an hour" -behavior when mapping the "unspellable" UTC hour into tz. +z' must be in standard time, and is the spelling we want in this case. + +Note again that z' is not UTC-equivalent as far as the hybrid tzinfo class is +concerned (because it takes z' as being in standard time rather than the +daylight time we intend here), but returning it gives the real-life "local +clock repeats an hour" behavior when mapping the "unspellable" UTC hour into +tz. + +When the input is 6:MM, z=1:MM and z.d=0, and we stop at once, again with +the 1:MM standard time spelling we want. + +So how can this break? One of the assumptions must be violated. Two +possibilities: + +1) [2] effectively says that y.s is invariant across all y belong to a given + time zone. This isn't true if, for political reasons or continental drift, + a region decides to change its base offset from UTC. + +2) There may be versions of "double daylight" time where the tail end of + the analysis gives up a step too early. I haven't thought about that + enough to say. + +In any case, it's clear that the default fromutc() is strong enough to handle +"almost all" time zones: so long as the standard offset is invariant, it +doesn't matter if daylight time transition points change from year to year, or +if daylight time is skipped in some years; it doesn't matter how large or +small dst() may get within its bounds; and it doesn't even matter if some +perverse time zone returns a negative dst()). So a breaking case must be +pretty bizarre, and a tzinfo subclass can override fromutc() if it is. --------------------------------------------------------------------------- */