Add AdvSimd in ComponentProcessor #2429
Are you hitting an API wall here?
I had no clue how to implement Permute4x64 on Arm.
Yeah, that's got me stumped also. @tannergooding is there anything we can do here?
Another API which I have no clue how to port is Sse.Shuffle. If I knew how to write that in AdvSimd, I could go forward with porting the other methods.
Depending on exactly how you're shuffling, you want one of:
There is also the more powerful:
Depends on exactly how you're permuting the 4x doubles, but you can at worst use 2x
Thanks Tanner!! That's gonna take me a few reads to get my head around 🤣
No worries, happy to provide additional suggestions and/or review if needed. Feel free to tag me :)
Am I missing something? VectorTableLookupExtensions only support byte and sbyte. Unfortunately this code (SumHorizontal) is working on floats.
Yes, because it does things bytewise, much as That is, if you wanted to pick
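To illustrate the bytewise point for floats: a minimal sketch, assuming the lookup under discussion is `AdvSimd.Arm64.VectorTableLookup` on `Vector128<byte>`. Selecting float lane `n` means requesting bytes `4n` through `4n + 3`. The class and both helper names below are hypothetical and not part of ImageSharp; the scalar branch only exists so the sketch also runs on non-Arm hardware.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.Arm;

public static class TableLookupSketch
{
    // Hypothetical helper: expands four float lane indices (each 0..3) into
    // the sixteen byte indices that a byte-wise table lookup expects.
    public static Vector128<byte> FloatLanesToByteIndices(byte l0, byte l1, byte l2, byte l3)
    {
        ReadOnlySpan<byte> lanes = stackalloc byte[] { l0, l1, l2, l3 };
        Span<byte> indices = stackalloc byte[16];

        for (int lane = 0; lane < 4; lane++)
        {
            for (int b = 0; b < 4; b++)
            {
                // Float lane n occupies bytes 4n .. 4n + 3.
                indices[(lane * 4) + b] = (byte)((lanes[lane] * 4) + b);
            }
        }

        return MemoryMarshal.Read<Vector128<byte>>(indices);
    }

    // Hypothetical helper: permutes the floats of `value` by reinterpreting
    // them as bytes and performing a byte-wise lookup.
    public static Vector128<float> PermuteFloats(Vector128<float> value, Vector128<byte> byteIndices)
    {
        if (AdvSimd.Arm64.IsSupported)
        {
            return AdvSimd.Arm64.VectorTableLookup(value.AsByte(), byteIndices).AsSingle();
        }

        // Scalar fallback that mirrors the lookup: out-of-range indices yield zero.
        Vector128<byte> bytes = value.AsByte();
        Vector128<byte> result = Vector128<byte>.Zero;
        for (int i = 0; i < 16; i++)
        {
            byte index = byteIndices.GetElement(i);
            if (index < 16)
            {
                result = result.WithElement(i, bytes.GetElement(index));
            }
        }

        return result.AsSingle();
    }
}
```

For example, `PermuteFloats(Vector128.Create(10f, 20f, 30f, 40f), FloatLanesToByteIndices(3, 2, 1, 0))` reverses the lanes and yields `<40, 30, 20, 10>`. When the lane selection is fixed, the index vector can be built once and reused.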
@JimBobSquarePants I think I will try to tackle the missing method in a different PR. I probably need much more time to first understand what Permute4x64 does and then work out an equivalent on Arm. Do you know of any benchmarks which cover the ComponentProcessor?
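For reference, the semantics of `Avx2.Permute4x64(value, control)` can be written as a scalar model: each 2-bit field of `control` picks which of the four 64-bit source lanes lands in the matching output lane. This is only a sketch of the x86 behaviour, not an Arm port, and the class and method names are hypothetical.

```csharp
using System;

public static class PermuteReference
{
    // Scalar model of Avx2.Permute4x64 for four doubles:
    // output lane i takes source lane ((control >> (2 * i)) & 3).
    public static double[] Permute4x64(ReadOnlySpan<double> source, byte control)
    {
        if (source.Length != 4)
        {
            throw new ArgumentException("Expected exactly four lanes.", nameof(source));
        }

        double[] result = new double[4];
        for (int i = 0; i < 4; i++)
        {
            int selector = (control >> (2 * i)) & 0b11;
            result[i] = source[selector];
        }

        return result;
    }
}
```

With `control = 0b00_01_10_11` the lanes are reversed, so `{ 1, 2, 3, 4 }` becomes `{ 4, 3, 2, 1 }`. Arm has no 256-bit vector registers, so the four doubles would live in two `Vector128<double>` halves and each output half may need lanes from either input half, which is presumably why the general case can cost two operations.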
@stefannikolei Yeah I'm happy for that to be separate. There's a lot of figuring out to do to implement.
Not to my knowledge, no.
Prerequisites
Description
Added Arm (AdvSimd) intrinsics to the ComponentProcessor.
SumHorizontal is the only method not yet ported to Arm.