Skip to content

Conversation

@abellina
Copy link
Collaborator

@abellina abellina commented Oct 20, 2025

Contributes to #7529

Description

This PR removes LazySpillableColumnarBatch from the join in order to make the apis cleaner to use. I have not run performance tests on this yet.

Checklists

  • This PR has added documentation for new or modified features or behaviors.
  • This PR has added new tests or modified existing tests to cover new code paths.
    (Please explain in the PR description how the new code paths are tested, such as names of the new/existing tests that cover them.)
  • Performance testing has been performed and its results are added in the PR description. Or, an issue has been filed with a link in the PR description.

Signed-off-by: Alessandro Bellina <abellina@nvidia.com>
Copilot AI review requested due to automatic review settings October 20, 2025 14:35
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR removes LazySpillableColumnarBatch from the join code to simplify the APIs. The changes replace the lazy spillable wrapper pattern with direct usage of SpillableColumnarBatch, which now manages reference counting through incRefCount() instead of the previous "spill only" approach.

Key changes:

  • Removed LazySpillableColumnarBatch class and replaced all usages with SpillableColumnarBatch
  • Updated join gatherers to use SpillableGatherMap instead of LazySpillableGatherMap
  • Modified resource management to use incRefCount() for shared ownership instead of spill-only wrappers

Reviewed Changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
GpuHashJoin.scala Updated join iterators to use SpillableColumnarBatch and simplified resource management with incRefCount
GpuBroadcastNestedLoopJoinExecBase.scala Replaced LazySpillableColumnarBatch with SpillableColumnarBatch in cross join and conditional nested loop join iterators
GpuBroadcastExchangeExec.scala Added debug println statement in closeInternal
ExistenceJoin.scala Updated ExistenceJoinIterator to use SpillableColumnarBatch instead of LazySpillableColumnarBatch
GpuCartesianProductExec.scala Changed serializable batch handling and replaced spill-only pattern with incRefCount
SpillFramework.scala Added warning log when handles remain in store during close
SpillableColumnarBatch.scala Added numCols method and new apply overloads with description parameter
JoinGatherer.scala Removed LazySpillable trait and LazySpillableColumnarBatch, updated gatherers to use Spillable types
GpuShuffledSizedHashJoinExec.scala Updated join iterators to use SpillableColumnarBatch and simplified batch ownership
AbstractGpuJoinIterator.scala Updated join iterator base class to use SpillableColumnarBatch
InternalRowToColumnarBatchIterator.java Added blank line (formatting only)
GpuDeleteLoader.scala Updated delete loader to return SpillableColumnarBatch
GpuDeleteFilter.scala Updated delete filter context to use SpillableColumnarBatch
GpuDeleteFilterSuite.scala Updated test to use SpillableColumnarBatch and removed unnecessary withResource
Comments suppressed due to low confidence (1)

sql-plugin/src/main/scala/com/nvidia/spark/rapids/JoinGatherer.scala:1

  • Missing method call parentheses: 'data.numRows' should be 'data.numRows()'. This will cause a compilation error as numRows is a method, not a property.
/*

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@abellina
Copy link
Collaborator Author

build

Copy link
Collaborator

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first pass looks good, but I still see some formatting changes that I don't think are correct.

@abellina
Copy link
Collaborator Author

build

Signed-off-by: Alessandro Bellina <abellina@nvidia.com>
@abellina abellina force-pushed the remove_lazy_hash_join branch from 3d3db07 to 81d9560 Compare October 20, 2025 16:31
@abellina
Copy link
Collaborator Author

seeing some issues while running benchmarks, I'll have to fix.

@abellina abellina marked this pull request as draft October 20, 2025 17:25
@abellina
Copy link
Collaborator Author

build

@abellina
Copy link
Collaborator Author

No improvement/regression based on this patch at sf3k:

query1: Previous (1624.3333333333333 ms) vs Current (1675.6666666666667 ms) Diff -51 E2E 0.97x
query2: Previous (1313.0 ms) vs Current (1330.0 ms) Diff -17 E2E 0.99x
query3: Previous (384.3333333333333 ms) vs Current (413.0 ms) Diff -28 E2E 0.93x
query4: Previous (4615.666666666667 ms) vs Current (4719.666666666667 ms) Diff -104 E2E 0.98x
query5: Previous (2072.0 ms) vs Current (2048.3333333333335 ms) Diff 23 E2E 1.01x
query6: Previous (766.0 ms) vs Current (729.6666666666666 ms) Diff 36 E2E 1.05x
query7: Previous (2968.6666666666665 ms) vs Current (3022.6666666666665 ms) Diff -54 E2E 0.98x
query8: Previous (767.0 ms) vs Current (750.3333333333334 ms) Diff 16 E2E 1.02x
query9: Previous (1791.6666666666667 ms) vs Current (1779.0 ms) Diff 12 E2E 1.01x
query10: Previous (1544.0 ms) vs Current (1564.3333333333333 ms) Diff -20 E2E 0.99x
query11: Previous (2926.0 ms) vs Current (2925.3333333333335 ms) Diff 0 E2E 1.00x
query12: Previous (599.6666666666666 ms) vs Current (497.0 ms) Diff 102 E2E 1.21x
query13: Previous (1152.6666666666667 ms) vs Current (1457.3333333333333 ms) Diff -304 E2E 0.79x
query14_part1: Previous (5357.666666666667 ms) vs Current (5048.0 ms) Diff 309 E2E 1.06x
query14_part2: Previous (4382.333333333333 ms) vs Current (4436.333333333333 ms) Diff -54 E2E 0.99x
query15: Previous (982.3333333333334 ms) vs Current (1032.0 ms) Diff -49 E2E 0.95x
query16: Previous (3295.3333333333335 ms) vs Current (3293.0 ms) Diff 2 E2E 1.00x
query17: Previous (1555.6666666666667 ms) vs Current (1592.0 ms) Diff -36 E2E 0.98x
query18: Previous (1731.6666666666667 ms) vs Current (1765.6666666666667 ms) Diff -34 E2E 0.98x
query19: Previous (1105.0 ms) vs Current (1427.3333333333333 ms) Diff -322 E2E 0.77x
query20: Previous (541.0 ms) vs Current (518.6666666666666 ms) Diff 22 E2E 1.04x
query21: Previous (543.0 ms) vs Current (546.0 ms) Diff -3 E2E 0.99x
query22: Previous (1026.3333333333333 ms) vs Current (1010.6666666666666 ms) Diff 15 E2E 1.02x
query23_part1: Previous (5598.666666666667 ms) vs Current (5393.666666666667 ms) Diff 205 E2E 1.04x
query23_part2: Previous (6019.0 ms) vs Current (6095.0 ms) Diff -76 E2E 0.99x
query24_part1: Previous (6573.666666666667 ms) vs Current (6665.0 ms) Diff -91 E2E 0.99x
query24_part2: Previous (6628.333333333333 ms) vs Current (6517.333333333333 ms) Diff 111 E2E 1.02x
query25: Previous (1938.0 ms) vs Current (1858.6666666666667 ms) Diff 79 E2E 1.04x
query26: Previous (718.6666666666666 ms) vs Current (726.3333333333334 ms) Diff -7 E2E 0.99x
query27: Previous (1215.6666666666667 ms) vs Current (1036.6666666666667 ms) Diff 179 E2E 1.17x
query28: Previous (4363.333333333333 ms) vs Current (4255.333333333333 ms) Diff 108 E2E 1.03x
query29: Previous (2491.3333333333335 ms) vs Current (2511.0 ms) Diff -19 E2E 0.99x
query30: Previous (1944.6666666666667 ms) vs Current (1874.0 ms) Diff 70 E2E 1.04x
query31: Previous (1642.6666666666667 ms) vs Current (1678.3333333333333 ms) Diff -35 E2E 0.98x
query32: Previous (935.6666666666666 ms) vs Current (906.3333333333334 ms) Diff 29 E2E 1.03x
query33: Previous (1036.0 ms) vs Current (1132.3333333333333 ms) Diff -96 E2E 0.91x
query34: Previous (1743.6666666666667 ms) vs Current (1798.3333333333333 ms) Diff -54 E2E 0.97x
query35: Previous (1805.6666666666667 ms) vs Current (1697.6666666666667 ms) Diff 108 E2E 1.06x
query36: Previous (1273.0 ms) vs Current (1104.6666666666667 ms) Diff 168 E2E 1.15x
query37: Previous (520.6666666666666 ms) vs Current (521.0 ms) Diff 0 E2E 1.00x
query38: Previous (1976.6666666666667 ms) vs Current (1963.6666666666667 ms) Diff 13 E2E 1.01x
query39_part1: Previous (1781.6666666666667 ms) vs Current (1963.0 ms) Diff -181 E2E 0.91x
query39_part2: Previous (1227.6666666666667 ms) vs Current (1172.3333333333333 ms) Diff 55 E2E 1.05x
query40: Previous (1113.0 ms) vs Current (1138.3333333333333 ms) Diff -25 E2E 0.98x
query41: Previous (311.3333333333333 ms) vs Current (264.6666666666667 ms) Diff 46 E2E 1.18x
query42: Previous (349.6666666666667 ms) vs Current (282.0 ms) Diff 67 E2E 1.24x
query43: Previous (826.6666666666666 ms) vs Current (804.6666666666666 ms) Diff 22 E2E 1.03x
query44: Previous (676.0 ms) vs Current (706.0 ms) Diff -30 E2E 0.96x
query45: Previous (1180.3333333333333 ms) vs Current (1233.0 ms) Diff -52 E2E 0.96x
query46: Previous (1268.0 ms) vs Current (1270.0 ms) Diff -2 E2E 1.00x
query47: Previous (1495.3333333333333 ms) vs Current (1532.3333333333333 ms) Diff -37 E2E 0.98x
query48: Previous (839.3333333333334 ms) vs Current (850.3333333333334 ms) Diff -11 E2E 0.99x
query49: Previous (1700.6666666666667 ms) vs Current (1715.3333333333333 ms) Diff -14 E2E 0.99x
query50: Previous (7327.0 ms) vs Current (7332.333333333333 ms) Diff -5 E2E 1.00x
query51: Previous (1668.6666666666667 ms) vs Current (1952.0 ms) Diff -283 E2E 0.85x
query52: Previous (538.6666666666666 ms) vs Current (503.3333333333333 ms) Diff 35 E2E 1.07x
query53: Previous (613.3333333333334 ms) vs Current (594.3333333333334 ms) Diff 19 E2E 1.03x
query54: Previous (1195.0 ms) vs Current (1264.3333333333333 ms) Diff -69 E2E 0.95x
query55: Previous (364.6666666666667 ms) vs Current (447.0 ms) Diff -82 E2E 0.82x
query56: Previous (755.3333333333334 ms) vs Current (752.0 ms) Diff 3 E2E 1.00x
query57: Previous (1288.6666666666667 ms) vs Current (1326.6666666666667 ms) Diff -38 E2E 0.97x
query58: Previous (854.3333333333334 ms) vs Current (803.3333333333334 ms) Diff 51 E2E 1.06x
query59: Previous (1893.3333333333333 ms) vs Current (1930.0 ms) Diff -36 E2E 0.98x
query60: Previous (1243.0 ms) vs Current (1231.3333333333333 ms) Diff 11 E2E 1.01x
query61: Previous (1222.3333333333333 ms) vs Current (1202.0 ms) Diff 20 E2E 1.02x
query62: Previous (1060.0 ms) vs Current (1100.3333333333333 ms) Diff -40 E2E 0.96x
query63: Previous (871.3333333333334 ms) vs Current (776.0 ms) Diff 95 E2E 1.12x
query64: Previous (13908.0 ms) vs Current (14133.333333333334 ms) Diff -225 E2E 0.98x
query65: Previous (3118.6666666666665 ms) vs Current (3091.0 ms) Diff 27 E2E 1.01x
query66: Previous (2570.3333333333335 ms) vs Current (2487.6666666666665 ms) Diff 82 E2E 1.03x
query67: Previous (16397.0 ms) vs Current (16740.0 ms) Diff -343 E2E 0.98x
query68: Previous (1142.0 ms) vs Current (1175.6666666666667 ms) Diff -33 E2E 0.97x
query69: Previous (1325.0 ms) vs Current (1436.0 ms) Diff -111 E2E 0.92x
query70: Previous (1366.0 ms) vs Current (1262.6666666666667 ms) Diff 103 E2E 1.08x
query71: Previous (2994.3333333333335 ms) vs Current (3046.0 ms) Diff -51 E2E 0.98x
query72: Previous (2612.6666666666665 ms) vs Current (2554.6666666666665 ms) Diff 58 E2E 1.02x
query73: Previous (957.0 ms) vs Current (997.0 ms) Diff -40 E2E 0.96x
query74: Previous (3257.0 ms) vs Current (3178.0 ms) Diff 79 E2E 1.02x
query75: Previous (6921.666666666667 ms) vs Current (6812.0 ms) Diff 109 E2E 1.02x
query76: Previous (1519.3333333333333 ms) vs Current (1632.3333333333333 ms) Diff -113 E2E 0.93x
query77: Previous (878.0 ms) vs Current (926.3333333333334 ms) Diff -48 E2E 0.95x
query78: Previous (7308.666666666667 ms) vs Current (7379.0 ms) Diff -70 E2E 0.99x
query79: Previous (797.6666666666666 ms) vs Current (818.0 ms) Diff -20 E2E 0.98x
query80: Previous (3927.6666666666665 ms) vs Current (3946.0 ms) Diff -18 E2E 1.00x
query81: Previous (2179.6666666666665 ms) vs Current (2011.3333333333333 ms) Diff 168 E2E 1.08x
query82: Previous (651.0 ms) vs Current (676.3333333333334 ms) Diff -25 E2E 0.96x
query83: Previous (636.0 ms) vs Current (619.6666666666666 ms) Diff 16 E2E 1.03x
query84: Previous (1368.3333333333333 ms) vs Current (1174.6666666666667 ms) Diff 193 E2E 1.16x
query85: Previous (1781.0 ms) vs Current (1595.0 ms) Diff 186 E2E 1.12x
query86: Previous (933.0 ms) vs Current (895.3333333333334 ms) Diff 37 E2E 1.04x
query87: Previous (1862.3333333333333 ms) vs Current (1858.0 ms) Diff 4 E2E 1.00x
query88: Previous (3113.3333333333335 ms) vs Current (3212.0 ms) Diff -98 E2E 0.97x
query89: Previous (1057.3333333333333 ms) vs Current (946.3333333333334 ms) Diff 110 E2E 1.12x
query90: Previous (638.3333333333334 ms) vs Current (791.0 ms) Diff -152 E2E 0.81x
query91: Previous (867.6666666666666 ms) vs Current (990.3333333333334 ms) Diff -122 E2E 0.88x
query92: Previous (487.0 ms) vs Current (563.6666666666666 ms) Diff -76 E2E 0.86x
query93: Previous (9311.333333333334 ms) vs Current (9267.333333333334 ms) Diff 44 E2E 1.00x
query94: Previous (3889.6666666666665 ms) vs Current (3995.0 ms) Diff -105 E2E 0.97x
query95: Previous (5455.333333333333 ms) vs Current (5326.0 ms) Diff 129 E2E 1.02x
query96: Previous (4371.0 ms) vs Current (5035.0 ms) Diff -664 E2E 0.87x
query97: Previous (1841.6666666666667 ms) vs Current (1882.0 ms) Diff -40 E2E 0.98x
query98: Previous (1171.6666666666667 ms) vs Current (1292.0 ms) Diff -120 E2E 0.91x
query99: Previous (1332.3333333333333 ms) vs Current (1652.0 ms) Diff -319 E2E 0.81x
benchmark: Previous (239666.66666666666 ms) vs Current (241333.33333333334 ms) Diff -1666 E2E 0.99x

@abellina
Copy link
Collaborator Author

build

@abellina abellina marked this pull request as ready for review October 20, 2025 19:00
@sameerz sameerz added the performance A performance related task/issue label Oct 25, 2025
@nvauto
Copy link
Collaborator

nvauto commented Nov 17, 2025

NOTE: release/25.12 has been created from main. Please retarget your PR to release/25.12 if it should be included in the release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance A performance related task/issue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants