-
Notifications
You must be signed in to change notification settings - Fork 279
Improving avx/avx2 swizzles #1141
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improving avx/avx2 swizzles #1141
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you could reduce code duplication by syndicating patterns as suggested, that would be great!
If this passes the tests it can be merged. I hope I did not break any build this time. Should I add myself to the copyright? |
You can add yourself, yes. And rebase on master branch ;-) |
8239b5e
to
cfb5838
Compare
I rebased. I did not squash as I have some commits that I would like to keep for reference in the branch. Feel free to squash and merge in master. |
Hey @DiamonDinoia : I'll have a look at this next week, don't worry if you don't have any feedback since then 🙇 |
ec97ae8
to
e99877d
Compare
e99877d
to
c6b58b6
Compare
I am not sure what the problem is with PowerPC. It should not touch any code I wrote and from the print the results seems correct. |
How strange: the failure shows batches that... look the same /o\ |
I could not find an explanation for it. |
If you don't mind, I'll split your commit history in pieces and validate them step-by-step in order to understand what's happening - keeping your authorship information, of course. |
Sure, no problem, go ahead! |
c6b58b6
to
b285dbd
Compare
perfect, can you just squash your commits into a nice history? Thanks o/ |
Fixes: some swizzled did not allow duplicates in the output
0a906e5
to
8b1c9b9
Compare
Thanks for being patient and for the great patch set \o/ |
See ##1155 for the sse2 part |
My pleasure! |
Hi,
This PR optimzes swizzles where possible
Fixed a bug in the float avx2 swizzle which did not allow duplicates
Adds more tests