ENH: fallback to zoneinfo Python API (for correct tz localization) for distant dates #65733

Draft
jorisvandenbossche wants to merge 1 commit into pandas-dev:main from jorisvandenbossche:tz-conversion-py-fallback

@jorisvandenbossche jorisvandenbossche commented May 25, 2026

Attempt at the third option mentioned in #65712: use our fast path for a "normal" range (e.g. up to 2100) to cover most use cases, and for dates after that, use the zoneinfo Python API directly as a slower fallback.

This adds back a modified version of #65481, but inside the general for-loop so that the fallback happens per value, instead of always being used for zoneinfo tz objects.
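
The per-value dispatch described above can be sketched in plain Python. This is a minimal illustration, not the PR's Cython code: it converts UTC values to wall time (the simpler direction, rather than the localization the PR targets), and the names `utc_to_wall`, `trans_utc`, and `offsets` are hypothetical stand-ins for the cached transition data.

```python
from bisect import bisect_right
from datetime import datetime, timedelta, timezone, tzinfo


def utc_to_wall(
    utc_values: list[datetime],
    tz: tzinfo,
    trans_utc: list[datetime],
    offsets: list[timedelta],
) -> list[datetime]:
    """Convert naive UTC datetimes to naive wall times in ``tz``.

    ``trans_utc`` holds the cached transition instants (sorted, naive UTC).
    ``offsets`` has one more entry than ``trans_utc``: ``offsets[i]`` applies
    before ``trans_utc[i]``, and ``offsets[-1]`` applies from the last one on.
    Both are hypothetical stand-ins for a precomputed cache.
    """
    last_cached = trans_utc[-1]  # looked up once, outside the loop
    out = []
    for value in utc_values:
        if value > last_cached:
            # Slow fallback: beyond the cached range, ask the tzinfo object.
            aware = value.replace(tzinfo=timezone.utc).astimezone(tz)
            out.append(aware.replace(tzinfo=None))
        else:
            # Fast path: binary search in the cached transitions.
            idx = bisect_right(trans_utc, value)
            out.append(value + offsets[idx])
    return out
```

With a `zoneinfo.ZoneInfo` object passed as `tz`, the slow branch defers to the standard library for dates past the cache, at the cost of a Python-level call per value.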

I tried parametrizing some of the tz_localize tests for a current year and a distant year, to have some coverage for this fallback.

If we go with this PR, then #65705 (expanding the cached transition data's range to 2262 instead of 2100) might not be needed.

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

@jorisvandenbossche jorisvandenbossche added the Timezones Timezone data dtype label May 25, 2026
and info.deltas[delta_idx - 1] >= 0
):
delta_idx = delta_idx - 1
if info.use_zoneinfo and new_local > info.tdata[info.ntrans - 1]:
Member Author


TODO: I have to check (and test) a bit for new_local close to that last transition value, because AFAIK new_local is supposed to be local time but the tdata values are UTC.

So in theory the local time could still be smaller than the last transition value, while it is actually (in UTC) beyond it. But it might be that this is fine, since offsets beyond that last transition value should still be correct until the next transition?
(in theory one could probably create a tz rule with two transitions only a few hours apart, but that should not occur in practice?)

Member


In this case we'd still get the correct result, just not take the fastpath?
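
The concern raised in this thread can be made concrete with a little arithmetic. The sketch below is illustrative only: it uses hours as plain integers and a hypothetical helper `check_fires`, and it mirrors just the comparison `new_local > last_transition`, not the surrounding pandas code.

```python
def check_fires(utc_value: int, utc_offset: int, last_transition_utc: int) -> bool:
    """Mimic comparing a local wall-clock value against a UTC transition.

    All arguments are hours on a shared integer axis; ``utc_offset`` is the
    zone's offset from UTC (negative west of Greenwich).
    """
    new_local = utc_value + utc_offset
    return new_local > last_transition_utc


last_transition = 1000  # hypothetical last cached transition, in UTC hours

# One hour past the last transition in UTC, zone at UTC-5: the local value
# is 996, so the comparison does not fire even though the UTC instant is
# already beyond the cached range.
west = check_fires(1001, -5, last_transition)

# One hour before the last transition in UTC, zone at UTC+2: the local value
# is 1001, so the comparison fires slightly early.
east = check_fires(999, +2, last_transition)

print(west, east)  # False True
```

Either way the mismatch is bounded by the zone's UTC offset, so it only matters for values within a few hours of the last cached transition.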

"start_ts, tz, end_ts, shift",
[
- ["2015-03-29 02:20:00", "Europe/Warsaw", "2015-03-29 03:00:00", "forward"],
+ ["2018-03-25 02:20:00", "Europe/Warsaw", "2018-03-25 03:00:00", "forward"],
Member Author


The year is only changed here so that all values in this parametrization have the same year (there are existing cases with 2018 below), so that replacing it with the distant year works for all cases.
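
Swapping the year in these timestamps only keeps them on a DST transition if the distant year has the same calendar layout as 2018. The PR does not say how its distant year was chosen; the helper below (a hypothetical `matching_years`) is one way to list candidates, assuming the zone's rule stays tied to a weekday, such as the last Sunday of March.

```python
import calendar
from datetime import date


def matching_years(reference: int, start: int, stop: int) -> list[int]:
    """Years in [start, stop) whose calendar is identical to ``reference``.

    Two years share a calendar when 1 January falls on the same weekday and
    both are (or both are not) leap years.
    """
    ref_key = (date(reference, 1, 1).weekday(), calendar.isleap(reference))
    return [
        year
        for year in range(start, stop)
        if (date(year, 1, 1).weekday(), calendar.isleap(year)) == ref_key
    ]


# Candidates beyond the cached range (2100) but before 2262, the year in
# which nanosecond-resolution timestamps run out.
candidates = matching_years(2018, 2101, 2262)

# 25 March is a Sunday in every candidate year, as it is in 2018.
assert all(date(y, 3, 25).weekday() == 6 for y in candidates)
```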

and info.deltas[delta_idx - 1] >= 0
):
delta_idx = delta_idx - 1
if info.use_zoneinfo and new_local > info.tdata[info.ntrans - 1]:
Member


Should we calculate info.tdata[info.ntrans - 1] just once?

@jbrockmendel
Member

I think this is a reasonable approach; I haven't reviewed it closely yet.

