Performance tuning for 1D calculations (Trac #783) #126

@pkienzle

Many of the 1D calculations involve nested integrals over a fixed set of theta and phi values, with trig functions to turn these values into something useful for the kernel.

For example, in the cylinder kernel, the two trig operations, a multiply and an add performed at each integration point can be replaced by two table lookups by converting:

    const double zm = M_PI_4;
    const double zb = M_PI_4; 

    double total = 0.0;
    for (int i=0; i<76 ;i++) {
        const double alpha = Gauss76Z[i]*zm + zb;
        double sn, cn; // slots to hold sincos function output
        // alpha(theta,phi) the projection of the cylinder on the detector plane
        SINCOS(alpha, sn, cn);
        total += Gauss76Wt[i] * square(fq(q, sn, cn, radius, length)) * sn;
    }
    return total*zm;

into:

    double total = 0.0;
    for (int i=0; i<76 ;i++) {
        total += Gauss76Wt_times_sin_alpha[i] * square(fq(q, sin_alpha[i], cos_alpha[i], radius, length));
    }
    return total*M_PI_4;

To see if this might be worthwhile, I used the existing Gauss76Z table as a stand-in for the sine values and Gauss76Wt as a stand-in for the cosine values, and observed a 64% speedup. The result is wrong, of course, but the timing shows that this approach could be worthwhile.

Migrated from http://trac.sasview.org/ticket/783

{
    "status": "new",
    "changetime": "2016-10-14T15:34:24",
    "_ts": "2016-10-14 15:34:24.625363+00:00",
    "reporter": "pkienzle",
    "cc": "",
    "resolution": "",
    "workpackage": "SasView Bug Fixing",
    "time": "2016-10-14T15:34:24",
    "component": "sasmodels",
    "summary": "Performance tuning for 1D calculations",
    "priority": "minor",
    "keywords": "",
    "milestone": "sasmodels WishList",
    "owner": "",
    "type": "enhancement"
}
