Errors spotted during dataset conversion #7

@stachu3478

Hi,

I have written a script that converts the MUSCIMA++ v2.1 dataset into another, similar internal dataset. The problems I ran into seem to result from inaccuracies in the annotations. To isolate the problematic objects, I have prepared the log below. There are three things I discovered while writing my script:

  1. Duplicated bounding boxes: I found pairs of objects that have the same features except for id, relations and sometimes mask (?!). These are logged like: Found X duplicates in Y tensor([<coordinates of the duplicated bounding boxes>]). I spotted them while exploring the next two points.
  2. Some of the objects failed to convert, probably because they are misclassified (usually labeled as an articStaccato while really being an augmentationDot). For articulationStaccato my converter needs to know whether the dot is below or above the related notes, and in those cases the predictions are ambiguous, so the lines start like: Found multiple matches for articStaccatoBelow [...]
  3. Missing relations, logged like: articulationStaccato at [X1, Y1, X2, Y2] is not related to any object [...]. Conversion fails in the same way as in point 2.
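
For anyone who wants to reproduce the check from point 1, here is a minimal sketch of one way to detect duplicated bounding boxes. It is not my actual converter: the function name `find_duplicate_boxes`, the `(object_id, class_name, bbox)` tuple layout and the ids are assumptions made for illustration, and it compares only class name and box coordinates, not masks or relations.

```python
from collections import defaultdict


def find_duplicate_boxes(objects):
    """Return groups of object ids that share the same class and bounding box.

    `objects` is an iterable of (object_id, class_name, bbox) tuples with
    bbox = (x1, y1, x2, y2). This layout is an assumption for illustration.
    """
    groups = defaultdict(list)
    for object_id, class_name, bbox in objects:
        # Objects are considered duplicates when class and box are identical.
        groups[(class_name, tuple(bbox))].append(object_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# The barline box is taken from the log; the ids and the stem are hypothetical.
objects = [
    (0, "barline", (713, 269, 723, 385)),
    (1, "barline", (713, 269, 723, 385)),
    (2, "stem", (100, 200, 110, 260)),
]

duplicates = find_duplicate_boxes(objects)
for (class_name, bbox), ids in duplicates.items():
    print(f"Found {len(ids) - 1} duplicates in {class_name} {list(bbox)}")
```

Grouping per class mirrors the log format, where the same box can be reported once for barline and once for measureSeparator.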

And here is the log:

datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-02_N-13_D-ideal.xml:
Found 1 duplicates in barline tensor([[713, 269, 723, 385]])
Found 1 duplicates in measureSeparator tensor([[713, 269, 723, 385]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-02_N-17_D-ideal.xml:
Found 1 duplicates in barline tensor([[ 705,  985,  714, 1101]])
Found multiple matches for articStaccatoBelow at datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-02_N-17_D-ideal.xml [331.0, 919.0, 342.0, 926.0], treating previous as augmentationDot
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-03_N-01_D-ideal.xml:
Found 1 duplicates in articulationStaccato tensor([[1615,  270, 1620,  276]])
Found 1 duplicates in tuple tensor([[2271,  521, 2406,  554]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-06_N-02_D-ideal.xml:
Found 2 duplicates in barline tensor([[ 585,  261,  594,  363],
        [1082,  495, 1092,  612]])
Found 2 duplicates in measureSeparator tensor([[ 585,  261,  594,  363],
        [1082,  495, 1092,  612]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-06_N-16_D-ideal.xml:
Found 8 duplicates in articulationStaccato tensor([[1129, 1591, 1137, 1598],
        [1166, 1583, 1174, 1590],
        [1276, 1629, 1285, 1636],
        [1336, 1627, 1343, 1634],
        [1373, 1628, 1381, 1634],
        [1416, 1622, 1424, 1630],
        [1370, 1654, 1379, 1662],
        [1411, 1654, 1419, 1662]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-07_N-05_D-ideal.xml:
Found 1 duplicates in augmentationDot tensor([[2897,  557, 2904,  565]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-07_N-08_D-ideal.xml:
articulationStaccato at [1198.0, 346.0, 1206.0, 354.0] is not related to any object. Used tensor([[1188.,  307., 1219.,  329.]]) as fallback
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-08_N-15_D-ideal.xml:
Found 1 duplicates in augmentationDot tensor([[3087,  547, 3095,  556]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-10_N-01_D-ideal.xml:
Found 1 duplicates in articulationStaccato tensor([[848, 606, 855, 613]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-12_N-04_D-ideal.xml:
Found 1 duplicates in articulationStaccato tensor([[ 691, 1150,  698, 1157]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-13_N-02_D-ideal.xml:
Found 1 duplicates in articulationStaccato tensor([[664, 515, 669, 519]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-13_N-16_D-ideal.xml:
Found 1 duplicates in articulationStaccato tensor([[ 983, 1068,  991, 1075]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-16_N-06_D-ideal.xml:
Found 1 duplicates in accidentalFlat tensor([[2092, 1128, 2110, 1163]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-16_N-17_D-ideal.xml:
articulationStaccato at [647.0, 1498.0, 653.0, 1506.0] is not related to any object. Used tensor([[ 634., 1457.,  661., 1486.]]) as fallback
articulationStaccato at [2603.0, 2352.0, 2611.0, 2357.0] is not related to any object. Used tensor([[2601., 2298., 2636., 2334.]]) as fallback
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-18_N-12_D-ideal.xml:
Found 10 duplicates in repeatDot tensor([[ 457, 1258,  464, 1265],
        [ 457, 1278,  466, 1285],
        [ 457, 1495,  465, 1501],
        [ 457, 1516,  466, 1523],
        [ 456, 1732,  463, 1737],
        [ 456, 1748,  463, 1754],
        [ 461, 2008,  469, 2014],
        [2412,  789, 2418,  799],
        [2419, 1026, 2423, 1031],
        [2420, 1045, 2424, 1051]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-19_N-04_D-ideal.xml:
articulationStaccato at [2576.0, 223.0, 2584.0, 230.0] is not related to any object. Used tensor([[2566.,  239., 2594.,  268.]]) as fallback
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-21_N-08_D-ideal.xml:
Found 1 duplicates in stem tensor([[1557, 1341, 1567, 1399]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-22_N-10_D-ideal.xml:
Found 1 duplicates in characterSmallP tensor([[1290, 1387, 1320, 1439]])
Found 1 duplicates in dynamicLetterP tensor([[1290, 1387, 1320, 1439]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-23_N-17_D-ideal.xml:
Found 1 duplicates in beam tensor([[732, 498, 792, 508]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-24_N-01_D-ideal.xml:
articulationStaccato at [2401.0, 241.0, 2409.0, 249.0] is not related to any object. Used tensor([[2394.,  271., 2419.,  298.]]) as fallback
articulationStaccato at [2699.0, 261.0, 2706.0, 269.0] is not related to any object. Used tensor([[2691.,  290., 2718.,  312.]]) as fallback
Found multiple matches for articStaccatoBelow at datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-24_N-01_D-ideal.xml [2196.0, 519.0, 2203.0, 527.0], treating previous as augmentationDot
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-24_N-18_D-ideal.xml:
Found 2 duplicates in restHalf tensor([[2435, 2023, 2471, 2045],
        [2178, 2020, 2214, 2041]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-27_N-02_D-ideal.xml:
articulationStaccato at [2621.0, 240.0, 2626.0, 248.0] is not related to any object. Used tensor([[2591.,  261., 2618.,  280.]]) as fallback
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-28_N-09_D-ideal.xml:
Found 1 duplicates in repeat tensor([[1151,  488, 1210,  611]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-30_N-17_D-ideal.xml:
Found 5 duplicates in articulationStaccato tensor([[1969,  777, 1976,  783],
        [1391,  627, 1401,  634],
        [1503,  634, 1510,  641],
        [2248, 1106, 2255, 1114],
        [1331, 2207, 1341, 2214]])
Found multiple matches for articStaccatoBelow at datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-30_N-17_D-ideal.xml [445.0, 1812.0, 452.0, 1819.0], treating previous as augmentationDot
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-33_N-04_D-ideal.xml:
articulationStaccato at [1131.0, 379.0, 1138.0, 387.0] is not related to any object. Used tensor([[1129.,  328., 1143.,  347.]]) as fallback
articulationStaccato at [1921.0, 213.0, 1927.0, 220.0] is not related to any object. Used tensor([[1914.,  235., 1927.,  266.]]) as fallback
Found multiple matches for articStaccatoBelow at datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-33_N-04_D-ideal.xml [1519.0, 461.0, 1526.0, 468.0], treating previous as augmentationDot
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-34_N-03_D-ideal.xml:
Found 1 duplicates in restWhole tensor([[780, 753, 829, 770]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-35_N-05_D-ideal.xml:
Found 1 duplicates in barline tensor([[1290,  959, 1300, 1082]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-36_N-14_D-ideal.xml:
Found 3 duplicates in characterDot tensor([[ 494, 1160,  503, 1166],
        [1332, 1139, 1341, 1145],
        [ 589, 1624,  597, 1631]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-37_N-13_D-ideal.xml:
Found 1 duplicates in barline tensor([[2579,  979, 2587, 1087]])
Found 1 duplicates in measureSeparator tensor([[2579,  979, 2587, 1087]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-37_N-17_D-ideal.xml:
Found 1 duplicates in augmentationDot tensor([[2400, 1006, 2408, 1014]])
Found 1 duplicates in articulationStaccato tensor([[ 380, 1622,  388, 1630]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-38_N-18_D-ideal.xml:
Found 2 duplicates in augmentationDot tensor([[ 511, 1466,  518, 1472],
        [1323, 1992, 1332, 2000]])
Found 1 duplicates in restWhole tensor([[1100, 1706, 1145, 1727]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-41_N-02_D-ideal.xml:
Found multiple matches for articStaccatoBelow at datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-41_N-02_D-ideal.xml [810.0, 241.0, 817.0, 248.0], treating previous as augmentationDot
Found multiple matches for articStaccatoBelow at datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-41_N-02_D-ideal.xml [1269.0, 220.0, 1275.0, 229.0], treating previous as augmentationDot
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-41_N-16_D-ideal.xml:
articulationStaccato at [1410.0, 1653.0, 1418.0, 1659.0] is not related to any object. Used tensor([[1398., 1663., 1429., 1685.]]) as fallback
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-43_N-14_D-ideal.xml:
Found 3 duplicates in articulationStaccato tensor([[2287, 1526, 2294, 1531],
        [2407, 1437, 2414, 1443],
        [2557, 1403, 2564, 1408]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-44_N-06_D-ideal.xml:
Found 2 duplicates in barline tensor([[2041, 1202, 2050, 1315],
        [2790, 1434, 2799, 1553]])
Found 2 duplicates in measureSeparator tensor([[2041, 1202, 2050, 1315],
        [2790, 1434, 2799, 1553]])
Found multiple matches for articStaccatoBelow at datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-44_N-06_D-ideal.xml [1513.0, 484.0, 1524.0, 494.0], treating previous as augmentationDot
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-44_N-13_D-ideal.xml:
Found 1 duplicates in characterSmallF tensor([[2943,  871, 2972,  915]])
Found 1 duplicates in dynamicLetterF tensor([[2943,  871, 2972,  915]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-44_N-17_D-ideal.xml:
articulationStaccato at [1556.0, 893.0, 1562.0, 899.0] is not related to any object. Used tensor([[1551.,  861., 1573.,  883.]]) as fallback
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-47_N-04_D-ideal.xml:
Found 1 duplicates in barline tensor([[1427,  265, 1435,  377]])
Found 1 duplicates in measureSeparator tensor([[1427,  265, 1435,  377]])
Found 1 duplicates in tuple tensor([[1877,  157, 1894,  182]])
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-48_N-02_D-ideal.xml:
Found 1 duplicates in augmentationDot tensor([[1935, 1403, 1942, 1409]])
articulationStaccato at [2755.0, 244.0, 2762.0, 250.0] is not related to any object. Used tensor([[2741.,  266., 2762.,  292.]]) as fallback
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-48_N-16_D-ideal.xml:
Found multiple matches for articStaccatoBelow at datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-48_N-16_D-ideal.xml [1669.0, 1637.0, 1681.0, 1651.0], treating previous as augmentationDot
datasets/muscima/v2.1/data/cropobjects_withstaff/CVC-MUSCIMA_W-49_N-11_D-ideal.xml:
Found 1 duplicates in repeatDot tensor([[ 792, 1473,  799, 1483]])
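
To illustrate the "Used ... as fallback" lines, here is a rough sketch of how a dot without relations could be assigned to the nearest candidate box and classified as above or below it. This is a simplified guess at the idea rather than my converter's code: `staccato_side` is a hypothetical name, the candidates are assumed to be notehead boxes, and boxes are assumed to be [x1, y1, x2, y2] in image coordinates with y growing downward.

```python
def box_center(box):
    """Return the (x, y) center of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def staccato_side(dot_box, candidate_boxes):
    """Pick the candidate nearest to the dot and say on which side the dot lies.

    Returns ("above" | "below", nearest_box). Assumes image coordinates,
    so a smaller y value means higher on the page.
    """
    if not candidate_boxes:
        raise ValueError("no candidate boxes to fall back to")
    dot_x, dot_y = box_center(dot_box)

    def squared_distance(box):
        x, y = box_center(box)
        return (x - dot_x) ** 2 + (y - dot_y) ** 2

    nearest = min(candidate_boxes, key=squared_distance)
    side = "above" if dot_y < box_center(nearest)[1] else "below"
    return side, nearest


# The dot and the first candidate come from the W-07_N-08 log entry;
# the second candidate is a hypothetical far-away box.
dot = (1198, 346, 1206, 354)
candidates = [(1188, 307, 1219, 329), (2000, 300, 2030, 322)]

side, used = staccato_side(dot, candidates)
print(side, used)
```

A plain nearest-center rule like this can pick the wrong note in dense passages, which is one reason explicit relations in the annotations would be preferable to a geometric fallback.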
