Skip to content

Changing WEIGHT_MAX from __builtin_inf() to DBL_MAX removes speedup #2

@chipbuster

Description

@chipbuster

I noticed that the compile command in the README did not contain the -ffast-math flag. Without that flag, there is almost no advantage for the optimized code (speedups of 0.9x - 1.1x). Adding -ffast-math gave me up to a 1,000,000x speedup on the largest problem size in the benchmark.

After doing some poking around, I found that the following patch, which changes the definition of WEIGHT_MAX from __builtin_inf() to DBL_MAX more-or-less removes the advantage that DMMSY Opt has, even when using -ffast-math:

diff --git a/include/common.h b/include/common.h
index 1234107..a31c568 100644
--- a/include/common.h
+++ b/include/common.h
@@ -3,12 +3,13 @@

 #include <stdint.h>
 #include <stdbool.h>
+#include <float.h>

 typedef uint32_t node_t;
 typedef double weight_t;

 // Infinity equivalent for double
-#define WEIGHT_MAX __builtin_inf()
+#define WEIGHT_MAX DBL_MAX
 #define NODE_MAX ((node_t)-1)

 // Edge structure

I compiled the code with the following command: clang -O3 -ffast-math -march=native -flto -DNDEBUG src/*.c -I include -o dmmsy_bench -lm, using Clang version 21.1.8

Timing Results

Timing output before patch (WEIGHT_MAX = __builtin_inf())
Details
===========================================================================
n          m          Dijkstra     DMMSY Res    DMMSY Opt    Spd(Opt)
---------------------------------------------------------------------------
1000       5000       0.0170       0.0037       0.0000       342.80  x
5000       25000      0.0008       0.0002       0.0000       18.41   x
10000      50000      0.9935       0.8738       0.0001       12480.83x
25000      125000     2.9053       2.6133       0.0001       51573.26x
50000      250000     6.4332       5.7430       0.0001       74372.67x
75000      375000     10.0353      9.1648       0.0001       129767.44x
100000     500000     13.8912      12.5027      0.0001       93438.25x
150000     750000     22.6547      20.2336      0.0002       114417.77x
200000     1000000    29.8072      27.1743      0.0003       96541.64x
250000     1250000    42.9747      35.8088      0.0002       179810.35x
350000     1750000    64.0961      60.6058      0.0005       133071.51x
500000     2500000    114.3707     102.8302     0.0002       646162.01x
750000     3750000    195.8446     189.5765     0.0002       979222.88x
1000000    5000000    305.1935     321.8353     0.0003       1220774.15x
===========================================================================
Timing output after patch (WEIGHT_MAX = DBL_MAX)
Details
===========================================================================
n          m          Dijkstra     DMMSY Res    DMMSY Opt    Spd(Opt)
---------------------------------------------------------------------------
1000       5000       0.0180       0.0244       0.0200       0.90    x
5000       25000      0.0008       0.0002       0.0000       17.26   x
10000      50000      1.0094       0.9951       1.0156       0.99    x
25000      125000     2.9112       2.9222       2.9948       0.97    x
50000      250000     6.4643       6.4533       6.6439       0.97    x
75000      375000     10.0461      10.4196      10.2352      0.98    x
100000     500000     13.5921      13.7043      14.0339      0.97    x
150000     750000     21.5044      21.6953      22.5653      0.95    x
200000     1000000    30.4776      36.0661      32.5664      0.94    x
250000     1250000    42.2718      40.8033      44.2124      0.96    x
350000     1750000    68.5082      69.1117      71.3077      0.96    x
500000     2500000    113.9036     114.7898     121.7260     0.94    x
750000     3750000    189.4533     214.2840     214.3745     0.88    x
1000000    5000000    312.2623     317.0314     340.4416     0.92    x
===========================================================================

Both implementations appear to pass the correctness checks.

Unfortunately, I don't have the resources to dig in further to see if this can be rectified and the original advantage of DMMSY Opt restored.

This problem is mentioned in #1, but it is claimed there that the benchmarks should still show the speedup with -ffast-math removed. This does not appear to be the case on my computer, and further, it seems that -ffast-math shows no speedup when WEIGHT_MAX is modified as above. I do not know if this is specific to my system or a more general problem.

Pinned by danalec

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions