Rewrite `core.checkedint` functions. #1394
base: master
Conversation
Looks good, maybe add a few microbenchmarks to https://github.com/D-Programming-Language/druntime/tree/master/benchmark.
Why all the style changes? And because you specifically mention the module as a potential style showcase: I don't think we should encourage the use of e.g. bitwise not for boolean values, nor relying on operator precedence wherever possible. Also, you should add your name after Walter's. He is still the primary author of the API.
a0b8b50
to
83ec881
Compare
OK, I will probably be able to do it tomorrow.
Do you mean
No boolean values are used. All types are
Hm, I don't see any complicated cases in my code, the only thing used are
Sorry, I supposed it was alphabetically sorted for some reason. Fixed.
@@ -19,10 +19,9 @@
* relative to the cost of the operation itself, compiler implementations are free
* to recognize them and generate equivalent and faster code.
*
* References: $(LINK2 http://blog.regehr.org/archives/1139, Fast Integer Overflow Checks)
Why take out the reference? This is critical information.
The function implementations are completely rewritten, so there is no connection with this blog any more. Also, after looking into the blog, it looks very silly: after the author performs bit shifts in checked_add_3, instead of using bitwise operators (as done in this pull) he uses this long and slow way:
(sa && sb && !sr) ||
(!sa && !sb && sr)
So is there really a reason to include a reference to this blog?
As noted in the reference that was elided by this PR (and should be reinstated), doing these checks is tricky because of the ways some compilers generate code. The way I wrote the code was to avoid various code generators invalidating the results. I'm not saying your method is wrong, but I do suggest a careful review of the unittests to ensure that it really does work correctly and is tested thoroughly.
Do you mean there are code generation bugs and these bugs should be considered in every module we write? Because I don't see where this pull relies on some concrete implementation of a UB case.
Do you suppose |
const r = x * y;
if ( ~x._differentSign(y) // x and y has the same sign
& r._signBit // and result is negative (covers min * -1)
|| x && r / x != y) // or perform simple check for x != 0
Something amazing going on with whitespace here. Quite dizzying to review at a glance.
83ec881
to
cdbc2c0
Compare
Such a shame that GDC ignores these function bodies then. ;-)
if (r < int.min || r > int.max)
const r = x - y;
if ( x._differentSign(y) // x and y has different sign
&& x._differentSign(r)) // result and x has different sign
I would really rather put the `&&` at the end of the first line (which is recommended in style guides anyway, e.g. Code Complete) so that you don't need the funky whitespace in the first line. Just makes it harder for the brain to parse.
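For illustration, the layout suggested above could look like the following sketch. The names `differentSign` and `subsSketch` are hypothetical stand-ins (the helper body is an assumption modelled on the `_differentSign` call in the diff, not the PR's actual code), and the sticky overflow flag mirrors the convention of `core.checkedint`.

```d
// Hypothetical helper; its body is an assumption, not the PR's implementation.
bool differentSign(int a, int b) pure nothrow @nogc @safe
{
    return (a < 0) != (b < 0);
}

// Illustrative checked signed subtraction with `&&` at the end of the line.
int subsSketch(int x, int y, ref bool overflow) pure nothrow @nogc @safe
{
    const int r = x - y;          // D integer arithmetic wraps on overflow
    if (differentSign(x, y) &&    // x and y have different signs
        differentSign(x, r))      // and the result's sign differs from x's
        overflow = true;          // sticky: only ever set, never cleared
    return r;
}
```

With the operator trailing, both conditions start in the same column without any padding inside the parentheses.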
LDC won't use those function bodies either, except on platforms where there is no native instruction (e.g. long on x86).
This version of Performance-wise, I find it a bit slower with DMD than the old version was. |
With DMD, could you check the first commit (the one without the use of logical operations) please?
(I should note that my tests assume no one has gotten around to implementing intrinsics for The bitwise version was slightly slower than the logical version, even after I optimized it by combining some unnecessary shift operations. |
What functions exactly are you measuring? Is |
I am running the benchmark found in It was designed to test the performance of my checked integer type, but in the process it exercises all the contents of I am not really a big fan of micro-benchmarks, because it is so hard to get them right. If you really want, I can take a stab at breaking things down in more detail later. However, I think it's probably a waste of time; the old Compared to bitwise equivalents, the kind of test logic used in the old implementation looks slow, but in reality it is highly amenable to optimization and superscalar execution. If this pull request is accepted, it should be for some reason other than runtime performance. |
cdbc2c0
to
2986f54
Compare
Agree. I tried to improve the whole module implementation style.
Readded.
Changed as I'm probably the only one here using such style and ignoring Code Complete.
Let's please add them anyhow. Number of instructions and latency is a good proxy, though you can hardly measure the impact on the branch cache.
Not sure about benchmarks. For now I can only describe the resulting machine code. This is what Clang does (one can see full
adds(int, int, bool*):
lea eax, [rsi + rdi]
xor esi, edi
js .LBB0_3
xor edi, eax
jns .LBB0_3
mov byte ptr [rdx], 1
.LBB0_3:
ret
and this is what GCC does (just direct shifting and comparison, looks rather slow):
adds(int, int, bool*):
leal (%rdi,%rsi), %eax
shrl $31, %esi
shrl $31, %edi
cmpb %dil, %sil
jne .L4
movl %eax, %ecx
shrl $31, %ecx
cmpb %cl, %sil
je .L4
movb $1, (%rdx)
.L4:
rep ret
So yes, one has to use
Is there an equivalent standard benchmark for Phobos somewhere?
adds(int, int, bool*):
add esi, edi # looks unoptimal, could use EAX via lea
xor edi, esi
jns .LBB0_2
mov byte ptr [rdx], 1
.LBB0_2:
mov eax, esi # as EAX wasn't used
ret
adds(int, int, bool*):
lea eax, [rsi+rdi]
xor edi, eax
jns .L4
mov BYTE PTR [rdx], 1
.L4:
rep ret
Sorry for the noise, just my example code was incorrect.
2986f54
to
f57723c
Compare
Use fast and simple sign comparison instead of long, slow and complicated value comparisons. Also remove reference as it doesn't apply anymore.
At least Clang knows the `(a < 0) != (b < 0)` pattern. The only possible shortcoming of this change is that the use of logical operators currently causes compilers to emit two conditional jumps, instead of the single one emitted for bitwise operations.
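As a minimal sketch of the sign-comparison idea in this commit message, a checked signed addition might be written as below. `differentSign` and `addsSketch` are hypothetical names chosen for illustration and are not taken from the PR; the sketch assumes D's defined wraparound for integer overflow.

```d
// Hypothetical helper built on the `(a < 0) != (b < 0)` pattern.
bool differentSign(int a, int b) pure nothrow @nogc @safe
{
    return (a < 0) != (b < 0);
}

// Illustrative checked signed addition (not the PR's actual code).
int addsSketch(int x, int y, ref bool overflow) pure nothrow @nogc @safe
{
    const int r = x + y; // wraps on overflow
    // Overflow happens only when both operands share a sign
    // and the wrapped result has the opposite sign.
    if (!differentSign(x, y) &&
        differentSign(x, r))
        overflow = true; // sticky: only ever set, never cleared
    return r;
}
```

Whether a given compiler turns the two logical tests into one or two conditional jumps depends on the backend, as the assembly listings in this thread show.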
Changed the first commit to use sign-extending shift to make |
Will try to add one in a couple of days.
Not that I'm aware of, but this PR is druntime code anyhow.
Sure. I was just wondering because it would be relevant to something else I'm working on.
Ping @denis-sh |
I think benchmarking is more important and was your last intention. |
Use fast and simple sign comparison instead of long, slow and complicated value comparisons.
Also remove reference as it doesn't apply anymore.
P.S.
I have planned to rewrite this module's implementation since its creation because it was clearly written in a hurry (e.g. see the `r < x || r < y` check in `addu`), sorry that it took this long to get to it. IMHO this is one of the modules that have to show source code readers how simple, robust and efficient code should be written in D.
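To illustrate why the `r < x || r < y` check looks redundant: for unsigned addition, a wrapped sum is smaller than both operands, so a single comparison is enough. The sketch below uses a hypothetical name, `adduSketch`, and is not the module's actual code.

```d
// Illustrative checked unsigned addition (hypothetical name).
uint adduSketch(uint x, uint y, ref bool overflow) pure nothrow @nogc @safe
{
    const uint r = x + y; // wraps modulo 2^32
    // If the sum wrapped, r == x + y - 2^32, which is less than x
    // (and less than y), so testing against one operand suffices.
    if (r < x)
        overflow = true;  // sticky: only ever set, never cleared
    return r;
}
```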