Open
Description
Many of the 1D calculations involve nested integrals over a fixed set of theta and phi values, with trig functions to turn these values into something useful for the kernel. Because the quadrature points are fixed, the trig values are identical on every call and can be precomputed.
For example, in the cylinder kernel, two trig operations, a multiply, and an add can be replaced by two table lookups by converting:
    const double zm = M_PI_4;
    const double zb = M_PI_4;

    double total = 0.0;
    for (int i = 0; i < 76; i++) {
        const double alpha = Gauss76Z[i]*zm + zb;
        double sn, cn; // slots to hold sincos function output
        // alpha(theta,phi) the projection of the cylinder on the detector plane
        SINCOS(alpha, sn, cn);
        total += Gauss76Wt[i] * square(fq(q, sn, cn, radius, length)) * sn;
    }
    return total*zm;
into:
    double total = 0.0;
    for (int i = 0; i < 76; i++) {
        total += Gauss76Wt_times_sin_alpha[i] * square(fq(q, sin_alpha[i], cos_alpha[i], radius, length));
    }
    return total*M_PI_4;
To see whether this might be worthwhile, I substituted Gauss76Z for the sin table and Gauss76Wt for the cos table, and observed a 64% speedup. The result was wrong, of course, but the timing shows that this approach could be worthwhile.
Migrated from http://trac.sasview.org/ticket/783
{
"status": "new",
"changetime": "2016-10-14T15:34:24",
"_ts": "2016-10-14 15:34:24.625363+00:00",
"reporter": "pkienzle",
"cc": "",
"resolution": "",
"workpackage": "SasView Bug Fixing",
"time": "2016-10-14T15:34:24",
"component": "sasmodels",
"summary": "Performance tuning for 1D calculations",
"priority": "minor",
"keywords": "",
"milestone": "sasmodels WishList",
"owner": "",
"type": "enhancement"
}