[iceberg] Storage Partition Join (SPJ) returns mismatched results #2119

@hsiang-c

Describe the bug

TestStoragePartitionedJoins > testJoinsWithDaysOnTimestampColumn() > catalogName = testhadoop, implementation = org.apache.iceberg.spark.SparkCatalog, config = {type=hadoop, cache-enabled=false}, planningMode = LOCAL FAILED
    java.lang.AssertionError: [SPJ should not change query output: number of results should match] 
    Actual and expected should have same size but actual size is:
      2
    while expected size is:
      178
    Actual was:
      [[140534217292656L, -1406103584, 1974-06-15T13:16:57.599 (java.sql.Timestamp)],
        [9223372036854775807L,
        -2147483648,
        1932-03-23T21:28:39.797 (java.sql.Timestamp)]]
    Expected was:
      [[-9223372036854775808L,
        1650132067,
        1920-06-09T14:08:12.160 (java.sql.Timestamp)],
        [-9223372036854775808L,
        -2147483648,
        1920-12-20T23:58:32.428 (java.sql.Timestamp)],
        [-9223372036854775808L,
        -1769087495,
        1956-04-13T03:00:08.371 (java.sql.Timestamp)],
        [-9223372036854775808L,
        -2065479922,
        1965-07-15T02:08:21.220 (java.sql.Timestamp)],
        [-9223372036854775808L,
        1127801700,
        1984-01-05T20:27:11.007 (java.sql.Timestamp)],
        [-9223372036854775808L,
        -713629100,
        2003-07-16T17:24:57.172 (java.sql.Timestamp)],
        [-9223372036854775808L,
        -593534002,
        2008-11-03T04:00:07.942 (java.sql.Timestamp)],
        [-9223372036854775808L,
        2147483647,
        2011-02-08T11:20:18.093 (java.sql.Timestamp)],
        [-9223372036854775808L,
        1368584943,
        2016-09-05T15:27:54.576 (java.sql.Timestamp)],
        [-9120159428539361639L,
        -687129009,
        1946-08-31T06:24:57.604 (java.sql.Timestamp)],
        [-8941870129025646747L, 686371606, 1990-11-10T01:11:27.403 (java.sql.Timestamp)],
        [-8769180425344970336L, 0, 1922-01-26T09:45:38.503 (java.sql.Timestamp)],
        [-8599333797261938299L,
        -367120745,
        1930-06-01T20:43:08.098 (java.sql.Timestamp)],
        [-8563313190802122138L,
        2116580964,
        1933-06-03T06:45:26.917 (java.sql.Timestamp)],
        [-8459922072023254507L,
        -1201611686,
        1981-04-22T12:25:11.286 (java.sql.Timestamp)],
        [-8329438904283095774L, 0, 1952-12-02T12:17:52.526 (java.sql.Timestamp)],
        [-8305302106015115688L, 238730129, 2000-10-20T23:53:19.086 (java.sql.Timestamp)],
        [-7806941514492271995L,
        -2147483648,
        1959-02-08T11:13:26.943 (java.sql.Timestamp)],
        [-7705184547197615951L,
        1763472818,
        1940-08-08T11:50:39.686 (java.sql.Timestamp)],
        [-7555272422567687896L, 622796079, 1961-05-13T00:10:15.856 (java.sql.Timestamp)],
        [-7503583964229111866L, 0, 1954-06-24T14:23:10.389 (java.sql.Timestamp)],
        [-7500014004281248580L, 639623106, 1963-09-09T06:44:57.557 (java.sql.Timestamp)],
        [-7244480259266885303L,
        -1156184787,
        1951-10-21T09:32:11.357 (java.sql.Timestamp)],
        [-7182830067069269905L,
        -2147483648,
        1968-11-04T13:19:35.642 (java.sql.Timestamp)],
        [-7008688991718472221L,
        1024342239,
        1940-03-09T16:41:50.118 (java.sql.Timestamp)],
        [-6925694130040970654L, -13453556, 1995-07-26T20:10:03.778 (java.sql.Timestamp)],
        [-6625431310305617772L,
        -1430871151,
        1986-04-09T01:52:15.763 (java.sql.Timestamp)],
        [-6396101906293679482L,
        -1406291688,
        2005-09-22T12:25:37.020 (java.sql.Timestamp)],
        [-6281693934592022761L,
        -1164826604,
        1975-09-11T10:06:14.634 (java.sql.Timestamp)],
        [-6089332048478515019L,
        -513462194,
        1966-12-23T11:11:01.732 (java.sql.Timestamp)],
        [-5834146448441283484L,
        2072755304,
        2019-10-12T20:25:55.747 (java.sql.Timestamp)],
        [-5755484195525570438L, null, 1923-08-27T12:21:39.750 (java.sql.Timestamp)],
        [-5754825772160496080L,
        -115501757,
        2004-02-19T21:02:49.077 (java.sql.Timestamp)],
        [-5743919551267123659L, null, 1974-07-25T19:05:37.370 (java.sql.Timestamp)],
        [-5719690575850774808L,
        -1413982619,
        2017-03-21T03:43:04.922 (java.sql.Timestamp)],
        [-5707672370091761013L,
        2147483647,
        1935-12-06T03:39:48.102 (java.sql.Timestamp)],
        [-5660963231016779954L, 462923418, 1943-03-30T16:57:31.208 (java.sql.Timestamp)],
        [-5596654713944733569L, null, 1980-09-16T10:21:54.616 (java.sql.Timestamp)],
        [-5465499096649258077L, 741520358, 1959-06-07T16:57:11.882 (java.sql.Timestamp)],
        [-5364377487206044023L, null, 1964-08-19T19:55:24.268 (java.sql.Timestamp)],
        [-5323840495266574478L,
        1322158356,
        2010-11-19T17:35:46.999 (java.sql.Timestamp)],
        [-5322681299929659730L,
        -905912955,
        1991-05-18T20:01:22.193 (java.sql.Timestamp)],
        [-5130150117578180586L, 869823999, 1928-01-09T08:40:37.493 (java.sql.Timestamp)],
        [-4969456586130451623L, 885467306, 1954-03-19T00:03:40.205 (java.sql.Timestamp)],
        [-4928655234943867516L, null, 2004-02-04T14:03:53.928 (java.sql.Timestamp)],
        [-4787022384348844445L,
        2147483647,
        1929-12-09T22:05:50.104 (java.sql.Timestamp)],
        [-4750573931075731782L,
        -2129762461,
        1936-09-10T13:37:39.598 (java.sql.Timestamp)],
        [-4736249434991636459L,
        -212480636,
        1992-03-05T16:36:09.559 (java.sql.Timestamp)],
        [-4716545504160437453L, 698451784, 2007-06-30T17:57:01.754 (java.sql.Timestamp)],
        [-4249802838372158016L,
        2037520871,
        2011-06-23T10:07:52.033 (java.sql.Timestamp)],
        [-4237173817802008222L, 448848760, 1975-01-08T22:41:42.220 (java.sql.Timestamp)],
        [-4169582156941034132L, 188133978, 1977-10-05T23:20:44.741 (java.sql.Timestamp)],
        [-4079197736038352319L,
        -2147483648,
        2006-04-20T12:18:05.412 (java.sql.Timestamp)],
        [-4010614443645193307L,
        -1031656866,
        2001-02-05T06:08:00.952 (java.sql.Timestamp)],
        [-3564579169151130364L,
        -188173652,
        2001-12-11T13:19:27.953 (java.sql.Timestamp)],
        [-3403651146605291779L,
        -959306978,
        1926-04-15T11:39:53.319 (java.sql.Timestamp)],
        [-3135856295433232036L, 656053644, 1937-09-01T13:26:38.614 (java.sql.Timestamp)],
        [-2726377905668691505L, 0, 1995-06-18T12:57:00.646 (java.sql.Timestamp)],
        [-2327702443608084539L,
        -1608446524,
        1991-07-09T13:58:29.245 (java.sql.Timestamp)],
        [-2282623849526784568L, 433591263, 1990-05-07T09:23:18.570 (java.sql.Timestamp)],
        [-2186730121337187006L,
        -159733880,
        1960-10-25T05:35:51.111 (java.sql.Timestamp)],
        [-2155415181304284072L, 531546953, 1977-05-14T07:44:13.128 (java.sql.Timestamp)],
        [-2143281998218610920L,
        1653570732,
        1955-11-11T14:58:10.509 (java.sql.Timestamp)],
        [-1756637242817117570L,
        1800042595,
        2012-10-02T07:44:46.762 (java.sql.Timestamp)],
        [-1685308985507641749L,
        1517309270,
        1994-11-18T21:14:27.683 (java.sql.Timestamp)],
        [-1674563217135621776L,
        -1192151497,
        1980-03-07T06:17:01.305 (java.sql.Timestamp)],
        [-1365639961131482115L, null, 1996-07-02T13:53:15.832 (java.sql.Timestamp)],
        [-1118402545190101707L, null, 1996-10-06T10:03:37.233 (java.sql.Timestamp)],
        [-1019362334163299157L,
        -387312699,
        1926-12-25T09:07:19.792 (java.sql.Timestamp)],
        [-934136432924416325L,
        -1196054574,
        2015-03-22T06:19:01.557 (java.sql.Timestamp)],
        [-879060640046520320L, 962320737, 1995-07-31T00:55:11.408 (java.sql.Timestamp)],
        [-840678944920088765L, 1535391450, 1946-07-13T02:24:32.629 (java.sql.Timestamp)],
        [-752954469917193465L, -994797495, 1962-02-23T08:52:45.110 (java.sql.Timestamp)],
        [-703715820629548805L, 724530072, 1938-12-12T06:35:46.961 (java.sql.Timestamp)],
        [-638374349769057639L, 0, 1944-06-03T08:43:53.973 (java.sql.Timestamp)],
        [-526488519178829888L,
        -1725691958,
        1953-02-18T08:32:04.547 (java.sql.Timestamp)],
        [-508943482563998989L, -684405286, 2010-10-05T07:57:25.270 (java.sql.Timestamp)],
        [-351499755283555692L, 55004124, 1955-12-02T06:21:10.593 (java.sql.Timestamp)],
        [-314851395928121051L, 1792660917, 1976-06-26T17:13:53.577 (java.sql.Timestamp)],
        [-32220643507018774L, 0, 1933-06-21T14:23:22.024 (java.sql.Timestamp)],
        [-26658067013538148L, 891909914, 1926-01-12T16:16:59.005 (java.sql.Timestamp)],
        [0L, 258229417, 1930-02-05T09:00:12.816 (java.sql.Timestamp)],
        [0L, -473379316, 1971-09-04T17:53:36.910 (java.sql.Timestamp)],
        [0L, -2106061907, 1972-08-17T09:07:31.500 (java.sql.Timestamp)],
        [0L, 2147483647, 1995-12-19T13:01:08.159 (java.sql.Timestamp)],
        [0L, -821065781, 1996-10-28T17:51:23.021 (java.sql.Timestamp)],
        [0L, 2143338798, 1997-11-20T23:40:41.517 (java.sql.Timestamp)],
        [0L, 2147483647, 1998-05-03T20:33:44.280 (java.sql.Timestamp)],
        [0L, 39620447, 1999-03-14T21:16:16.735 (java.sql.Timestamp)],
        [0L, -1940901203, 1999-11-03T19:58:34.980 (java.sql.Timestamp)],
        [0L, 712172551, 2018-05-29T09:42:14.934 (java.sql.Timestamp)],
        [200753079219431109L, -2141482063, 2007-02-13T02:12:18.158 (java.sql.Timestamp)],
        [221358426708416632L, -1297536581, 1951-05-13T17:15:50.217 (java.sql.Timestamp)],
        [699704949712600115L, 1385753324, 1926-09-23T00:52:38.825 (java.sql.Timestamp)],
        [937863182454646424L, -1678814554, 1992-08-03T23:32:03.452 (java.sql.Timestamp)],
        [1005332161641625257L, 1890417189, 2005-09-30T20:12:05.336 (java.sql.Timestamp)],
        [1008423794855048618L,
        -1693091376,
        2002-01-31T05:33:01.517 (java.sql.Timestamp)],
        [1035633149643239325L, 503585435, 1988-05-14T16:14:28.210 (java.sql.Timestamp)],
        [1064860644173497024L, 2147483647, 1979-07-20T18:23:34.368 (java.sql.Timestamp)],
        [1681421259993292196L, 2147483647, 1978-08-10T12:47:27.569 (java.sql.Timestamp)],
        [1773398468402013874L,
        -2147483648,
        1958-07-30T20:47:54.905 (java.sql.Timestamp)],
        [1951473645747468132L, 2109221692, 1997-05-29T16:17:38.367 (java.sql.Timestamp)],
        [2037648442184655719L,
        -2147483648,
        1943-04-13T19:51:10.736 (java.sql.Timestamp)],
        [2144351936338032011L, 61145538, 1976-06-08T14:05:32.518 (java.sql.Timestamp)],
        [2314884016879877725L,
        -1320874397,
        1950-07-16T22:51:52.432 (java.sql.Timestamp)],
        [2463336835426524753L, 282189899, 1956-10-06T12:03:13.644 (java.sql.Timestamp)],
        [2470753881526567411L, 92022521, 1950-03-27T11:47:40.607 (java.sql.Timestamp)],
        [2474910122962342645L, 1239604626, 2004-06-13T16:08:09.984 (java.sql.Timestamp)],
        [2547801522413471235L, 154574356, 1945-04-02T21:05:51.239 (java.sql.Timestamp)],
        [2587850844475733788L, null, 1963-07-03T20:18:29.458 (java.sql.Timestamp)],
        [2891554156192639419L, -655401709, 1972-10-02T01:07:31.568 (java.sql.Timestamp)],
        [2938884597150545720L, 1948321399, 1986-10-05T00:58:27.075 (java.sql.Timestamp)],
        [3260813280637601224L,
        -1505097534,
        1975-07-12T22:29:00.994 (java.sql.Timestamp)],
        [3281918510377147959L, -671888222, 1951-03-13T12:55:57.223 (java.sql.Timestamp)],
        [3361697295616129626L, -252927047, 1925-11-10T20:34:03.599 (java.sql.Timestamp)],
        [3403275329300082380L, 2147483647, 1999-01-23T21:34:21.678 (java.sql.Timestamp)],
        [3432855030304886624L, 0, 2016-01-18T13:10:21.632 (java.sql.Timestamp)],
        [3698875473825747302L, 1922104200, 1991-05-07T23:31:12.946 (java.sql.Timestamp)],
        [3716407816784250679L, 748608020, 1993-04-18T11:04:17.395 (java.sql.Timestamp)],
        [3736036545575614572L, -219565529, 1958-06-26T14:49:30.506 (java.sql.Timestamp)],
        [3860385356258538578L,
        -1763924914,
        1925-05-15T14:15:57.296 (java.sql.Timestamp)],
        [3875173390025246931L, -930952854, 1949-04-13T05:36:11.319 (java.sql.Timestamp)],
        [4048137209577229221L,
        -1670344739,
        1947-10-17T20:54:57.096 (java.sql.Timestamp)],
        [4683627759359498803L, 2115671722, 1971-06-22T21:39:50.034 (java.sql.Timestamp)],
        [4796796124253410186L,
        -1745527253,
        1949-02-10T11:19:14.470 (java.sql.Timestamp)],
        [4880037797499618312L, 1278875422, 2019-08-11T14:39:06.199 (java.sql.Timestamp)],
        [4904196865423085294L, 1595921099, 1981-08-10T03:09:41.636 (java.sql.Timestamp)],
        [4955054754969663203L,
        -1596856364,
        1961-09-03T01:44:11.850 (java.sql.Timestamp)],
        [5221413118508444096L, 2147483647, 2012-02-28T18:45:44.984 (java.sql.Timestamp)],
        [5581727959619701416L, 955292705, 1980-03-03T21:24:38.621 (java.sql.Timestamp)],
        [5620119595698489414L, 0, 2014-01-09T09:51:19.800 (java.sql.Timestamp)],
        [5958797457229175960L, 0, 2007-12-27T22:34:39.944 (java.sql.Timestamp)],
        [6113742976523950690L,
        -1949121810,
        1947-10-07T05:45:26.216 (java.sql.Timestamp)],
        [6126279625664051246L, 780693573, 2000-08-05T23:39:38.430 (java.sql.Timestamp)],
        [6136410516779160355L,
        -1713716523,
        1998-02-01T09:51:25.542 (java.sql.Timestamp)],
        [6146794652083548235L, 2147483647, 1947-02-07T18:07:57.849 (java.sql.Timestamp)],
        [6404772245724905040L, -866933502, 1990-02-21T16:20:00.408 (java.sql.Timestamp)],
        [6409500192903294876L, 1791960565, 1963-10-08T06:58:01.590 (java.sql.Timestamp)],
        [6443170592819695545L, 187653156, 1996-06-15T10:52:53.485 (java.sql.Timestamp)],
        [6699063467282483612L, 478680517, 2010-01-09T15:15:41.936 (java.sql.Timestamp)],
        [6786406047068405579L, -845834629, 2010-06-07T08:59:09.715 (java.sql.Timestamp)],
        [6817473413534271398L, -270809726, 1987-01-17T14:57:05.741 (java.sql.Timestamp)],
        [6956805596389810642L, 1000706191, 1932-06-06T22:17:31.853 (java.sql.Timestamp)],
        [6968242321063736055L,
        -1792248271,
        2004-09-23T20:45:54.080 (java.sql.Timestamp)],
        [7031769777223827979L, null, 1986-08-26T16:22:06.994 (java.sql.Timestamp)],
        [7074732231626865653L, -418146186, 1961-07-20T00:44:20.078 (java.sql.Timestamp)],
        [7311352842601082005L,
        -1305086064,
        2002-03-24T09:54:20.412 (java.sql.Timestamp)],
        [7474511925530221096L, 646446593, 1983-09-10T07:40:41.494 (java.sql.Timestamp)],
        [7533238485814795685L, 197007162, 1938-07-31T06:41:46.515 (java.sql.Timestamp)],
        [7601087418687882863L, 798447888, 1932-09-19T11:49:41.388 (java.sql.Timestamp)],
        [7661520242631645434L, -772614666, 1944-12-04T10:00:35.611 (java.sql.Timestamp)],
        [7696908321112421928L, 2147483647, 1944-08-12T09:50:06.975 (java.sql.Timestamp)],
        [7786015629227782772L, 1225185384, 1948-12-05T16:46:35.954 (java.sql.Timestamp)],
        [8207312568729393198L,
        -1516928223,
        1984-09-11T06:48:38.170 (java.sql.Timestamp)],
        [8213198292805119537L, null, 1956-07-23T16:54:47.176 (java.sql.Timestamp)],
        [8231392449131810496L, null, 1932-01-27T23:41:19.901 (java.sql.Timestamp)],
        [8244083390276966433L, 536543192, 2008-08-19T13:20:18.146 (java.sql.Timestamp)],
        [8329206903797064010L,
        -2147483648,
        1923-05-06T13:39:08.135 (java.sql.Timestamp)],
        [8404755141598120236L, 883117578, 1974-03-25T03:57:12.988 (java.sql.Timestamp)],
        [8648279696612050209L,
        -1748943605,
        1974-05-21T02:27:44.955 (java.sql.Timestamp)],
        [8841646798864268211L,
        -1989055510,
        2014-07-13T13:00:27.257 (java.sql.Timestamp)],
        [8970115247833113645L, 386297006, 1966-11-17T00:44:45.570 (java.sql.Timestamp)],
        [9146779409294820058L, 1056258715, 1959-04-28T14:34:21.355 (java.sql.Timestamp)],
        [9223372036854775807L, 1724556115, 1922-05-13T17:31:05.791 (java.sql.Timestamp)],
        [9223372036854775807L, 960731419, 1929-12-20T05:20:15.993 (java.sql.Timestamp)],
        [9223372036854775807L,
        -2147483648,
        1932-03-23T21:28:39.797 (java.sql.Timestamp)],
        [9223372036854775807L, -122204761, 1932-04-30T17:54:42.938 (java.sql.Timestamp)],
        [9223372036854775807L, 1034398573, 1942-03-12T17:14:01.614 (java.sql.Timestamp)],
        [9223372036854775807L, 75205040, 1961-03-18T01:36:38.603 (java.sql.Timestamp)],
        [9223372036854775807L,
        -2020641285,
        1969-10-31T14:09:30.021 (java.sql.Timestamp)],
        [9223372036854775807L, 0, 1974-06-21T03:24:50.395 (java.sql.Timestamp)],
        [9223372036854775807L,
        -1093701172,
        1977-10-19T21:37:14.629 (java.sql.Timestamp)],
        [9223372036854775807L, 1692926518, 1979-05-16T15:10:55.212 (java.sql.Timestamp)],
        [9223372036854775807L, 514995219, 1979-09-18T21:36:41.369 (java.sql.Timestamp)],
        [9223372036854775807L, -614471243, 1985-08-01T03:50:50.954 (java.sql.Timestamp)],
        [9223372036854775807L,
        -1949026359,
        1995-02-17T08:26:21.139 (java.sql.Timestamp)],
        [9223372036854775807L,
        -1954215371,
        2007-01-19T18:43:44.817 (java.sql.Timestamp)],
        [9223372036854775807L, 1842130704, 2013-01-17T12:42:40.575 (java.sql.Timestamp)]]
        at org.apache.iceberg.spark.SparkTestHelperBase.assertEquals(SparkTestHelperBase.java:61)
        at org.apache.iceberg.spark.sql.TestStoragePartitionedJoins.assertPartitioningAwarePlan(TestStoragePartitionedJoins.java:661)
        at org.apache.iceberg.spark.sql.TestStoragePartitionedJoins.checkJoin(TestStoragePartitionedJoins.java:612)
        at org.apache.iceberg.spark.sql.TestStoragePartitionedJoins.testJoinsWithDaysOnTimestampColumn(TestStoragePartitionedJoins.java:208)
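
When triaging a size mismatch like the one above, it can help to diff the two result sets rather than compare them by eye. The following is a minimal sketch of such a diff, assuming rows are compared in their printed string form; `RowDiff` and `minus` are hypothetical names for illustration and are not part of Iceberg or Comet. It uses the two "Actual" rows and a small excerpt of the "Expected" rows from the log.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical debugging helper; not part of Iceberg or Comet. */
final class RowDiff {

  /** Returns the rows of left that have no counterpart in right (multiset difference). */
  static List<String> minus(List<String> left, List<String> right) {
    // Count how often each row occurs on the right so duplicates are handled correctly.
    Map<String, Integer> remaining = new HashMap<>();
    for (String row : right) {
      remaining.merge(row, 1, Integer::sum);
    }
    List<String> result = new ArrayList<>();
    for (String row : left) {
      Integer count = remaining.get(row);
      if (count == null || count == 0) {
        result.add(row); // no unmatched copy left on the right
      } else {
        remaining.put(row, count - 1); // consume one matching copy
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // The two rows reported under "Actual was" above.
    List<String> actual = Arrays.asList(
        "140534217292656, -1406103584, 1974-06-15T13:16:57.599",
        "9223372036854775807, -2147483648, 1932-03-23T21:28:39.797");
    // A small excerpt of the 178 rows reported under "Expected was".
    List<String> expectedExcerpt = Arrays.asList(
        "-9223372036854775808, 1650132067, 1920-06-09T14:08:12.160",
        "0, 258229417, 1930-02-05T09:00:12.816",
        "9223372036854775807, -2147483648, 1932-03-23T21:28:39.797");

    System.out.println("Only in actual:   " + minus(actual, expectedExcerpt));
    System.out.println("Only in expected: " + minus(expectedExcerpt, actual));
  }
}
```

Reading the log this way suggests that the failure is not only a matter of missing rows: the second "Actual" row also appears in the expected output, while the first one (key `140534217292656`) does not seem to match any expected row.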

Steps to reproduce

SparkSession configs used:

            .config("spark.plugins", "org.apache.spark.CometPlugin")
            .config("spark.shuffle.manager", "org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager")
            .config("spark.comet.explainFallback.enabled", "true")
            .config("spark.sql.iceberg.parquet.reader-type", "COMET")
            .config("spark.memory.offHeap.enabled", "true")
            .config("spark.memory.offHeap.size", "10g")
            .config("spark.comet.use.lazyMaterialization", "false")
            .config("spark.comet.schemaEvolution.enabled", "true")

Expected behavior

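Per the assertion message above, SPJ should not change query output: the join should return the expected 178 rows, not 2.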

Labels: bug (Something isn't working)