API: microsecond resolution for Timedelta strings #63196
base: main
Conversation
rhshadrach left a comment:
lgtm
pandas/_libs/tslibs/timedeltas.pyx
Outdated
```python
# TODO: more performant way of doing this check?
if ival % 1000 != 0:
    return True
return re.search(r"\.\d{9}", item) or "ns" in item or "nano" in item
```
Is the 9 correct here? Shouldn't 7 be enough to have something below microseconds?
(although it is not entirely clear what this is exactly used for, will first check the rest of the PR, but some additional docstring might be useful)
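For context on the digit count: the seventh fractional digit of a seconds string already represents steps of 100 ns, which is below microsecond resolution. A quick sanity check:

```python
# The n-th fractional digit of seconds steps in units of 10**(9 - n) nanoseconds.
# At 7 digits the step is 100 ns, which cannot be represented in microseconds.
frac_digits = 7
step_ns = 10 ** (9 - frac_digits)  # nanoseconds per step of the 7th digit
print(step_ns)         # 100
print(step_ns % 1000)  # 100 -> not a whole number of microseconds
```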
Ah, is this meant for something that ends on 0's to still infer as nanosecond?
i.e.
```python
>>> pd.Timedelta("1.123456000 seconds").unit
'ns'
```
i.e. that could be microseconds (no loss of data), but because the string included the nanos, we infer as nanosecond unit?
I suppose 7 digits should then also still count as nanoseconds?
Right now we get:
```python
>>> pd.Timedelta("1.1234561 seconds").unit
'ns'
>>> pd.Timedelta("1.1234560 seconds").unit
'us'
```
good point. 7 digits should be enough. will update
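With the 7-digit update, the string-side heuristic could look like the following sketch (a hypothetical standalone version of the pyx helper above, not the exact implementation):

```python
import re


def needs_nano_unit(item: str) -> bool:
    # Sketch of the heuristic discussed above: a fractional part with 7 or
    # more digits reaches below microsecond resolution, and an explicit
    # "ns"/"nano" unit in the string always implies nanoseconds.
    return bool(re.search(r"\.\d{7}", item)) or "ns" in item or "nano" in item


print(needs_nano_unit("1.1234561 seconds"))  # True
print(needs_nano_unit("1.123456 seconds"))   # False
print(needs_nano_unit("1.1234560 seconds"))  # True: the string spells out the 100ns digit
```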
jorisvandenbossche left a comment:
Looks good!
pandas/tests/groupby/test_groupby.py
Outdated
```diff
 def test_groupby_timedelta_median():
     # issue 57926
-    expected = Series(data=Timedelta("1D"), index=["foo"])
+    expected = Series(data=Timedelta("1D"), index=["foo"], dtype="m8[ns]")
```
I was wondering why ns, but it is the other PR that will preserve the unit of the Timedelta object when converting to an array?
Adding the dtype here preserves the current dtype for expected. The other PR will change the dtype for df["timedelta"] below, so an update will be needed after one of them gets merged
```diff
     "ufunc 'divide' cannot use operands",
     "Invalid dtype object for __floordiv__",
     r"unsupported operand type\(s\) for /: 'int' and 'str'",
+    r"unsupported operand type\(s\) for /: 'datetime.timedelta' and 'str'",
```
Just curious, how is this caused by the changes here?
It was only on the dev builds and when `box=True`, so we are dividing by `np.array(["1"], dtype=object)`; I suspect that when `td.to_timedelta64()` has a "us" unit it tries casting to a pytimedelta to operate.
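That would explain the new error pattern above: plain Python raises exactly that message when a `datetime.timedelta` is divided by a string, as a quick check shows.

```python
import datetime as dt

# Dividing a stdlib timedelta by a str element (as in the object-dtype
# array case) produces the TypeError message added to the test above.
td = dt.timedelta(hours=2)
try:
    td / "1"
except TypeError as e:
    print(e)  # unsupported operand type(s) for /: 'datetime.timedelta' and 'str'
```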
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Updating this to also handle the array case (now that #63018 is merged) breaks a bunch of tests. I spent some time today updating those.
pandas/_libs/tslibs/timedeltas.pyx
Outdated
```python
# TODO: more performant way of doing this check?
if ival % 1000 != 0:
    return True
return re.search(r"\.\d{7}", item) or "ns" in item or "nano" in item
```
Suggested change:
```diff
-    return re.search(r"\.\d{7}", item) or "ns" in item or "nano" in item
+    return re.search(r"\.\d{7}", item) or "ns" in item or "nano" in item.lower()
```
(or add `or "Nano" in item`)?
I was checking the Timedelta tests for the other PR, and seeing that we allow title case
Sure. I'm still chasing down the failures caused when we add this to the array inference; there are some actual bugs that this uncovered.
```python
    isinstance(two_hours, Timedelta)
    and two_hours.unit == "ns"
    and box_with_array is not pd.array
):
    expected = expected.as_unit("ns")
```
This is because with a TimedeltaArray the in-place `+=` actually operates in place, preserving the dtype, while for the other objects it just does `rng = rng + two_hours` under the hood?
Is that an intentional difference?
yes and yes
Could you add a comment about that to the test? (It is not obvious to a reader that this is the reason for the check.)
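The in-place vs. out-of-place distinction mirrors how NumPy itself handles timedelta arithmetic (a sketch of the general dtype behavior, not the pandas test itself):

```python
import numpy as np

a = np.array([3600], dtype="timedelta64[s]")

# Out-of-place addition promotes to the finer unit of the two operands.
b = a + np.timedelta64(1, "ns")

# In-place addition must write back into `a`, so its dtype is preserved
# (the sub-second part is truncated away when casting back to seconds).
a += np.timedelta64(1, "ns")

print(b.dtype)  # timedelta64[ns]
print(a.dtype)  # timedelta64[s]
```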
The docstrings seem to have uncovered a bug: the parsed integer for the first element is not cast from micro to nano when the second element infers it needs nanos (or should it just reparse the full array?).
Looks like it is reparsing the full array, but not respecting the `creso` arg when we do. Should now be fixed.
Hmm, some conda issue. Restarted, hopefully it will pass now.

That last commit before your latest update also has one remaining mypy failure, not sure if that is fixed now:
```diff
 >>> pd.to_timedelta(np.arange(5), unit="D")
 TimedeltaIndex(['0 days', '1 days', '2 days', '3 days', '4 days'],
-               dtype='timedelta64[ns]', freq=None)
+               dtype='timedelta64[us]', freq=None)
```
Suggested change:
```diff
-               dtype='timedelta64[us]', freq=None)
+               dtype='timedelta64[ns]', freq=None)
```
Ideally we'd do the check for `needs_nano_unit` at a lower level and avoid effectively re-parsing the string, but the existing `parse_timedelta_string` function is a tough nut to crack.