Neighborhood Maps for NativeType ArrayImg #191
f2d9e8a to
8184674
Compare
|
@dietzc This is pretty much done. What is still left is to consider whether we want to provide this as an alternate implementation to the other implementations which use |
|
We should definitely add this for rectangular shapes. The code should be more or less the same as what you did for benchmarking OOB. |
|
Okay, I have an idea. |
|
Alright, I came up with something which solves all my aforementioned problems. I need to do some tests and then abstract everything to not have to duplicate code for the This is however a performance decrease compared to using the optimized op directly. I tried to compensate for that by using multithreading for different intervals, but I have no idea whether that was effective. I could provide benchmarks if necessary. |
|
While testing, I found some problems which will require me to open a pull request for imglib2-algorithm first. This will take a while. |
|
Can you briefly elaborate on what you are planning? |
|
I need access to the span and skipCenter of the RectangleShape for I already opened an issue for that a while ago, but since I need it now, I will write those getters, open a pull request, and then use them here in ops. |
|
Ah, alright. So your plan is to split the image into individual intervals and perform a separate map on each of them: safe intervals without OOB handling, unsafe intervals with it. Right? If you use existing |
I know, but I am using it outside the intervals. |
|
I don't understand. Let's say we can split a 2D image in four intervals: one in the middle and four at the border. Then you can use your optimized variant in the center interval and a default variant on the other intervals. This of course only works for |
Then those intervals will be iterated over in 4 threads. Some threads use the unoptimized, standard implementation, one uses the optimized implementation for the center interval. |
Why 4 threads? Why don't you use the default |
Ah, now I understand. Well, that happens anyway. But everything would need to wait for the "optimized" implementation, which is not threaded. I could of course use only two threads, one for the center interval and one for all the others. |
Can't you easily parallelize your optimized version, too (e.g. by splitting up the center interval in blocks)? I mean, how large is the performance gain of your non-parallelized version compared to the non-parallelized default version? |
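To make the decomposition discussed above concrete, here is a minimal sketch in plain Java of how a 2D image could be cut into one center box, where every neighborhood fits inside the image, and four border boxes that need out-of-bounds handling. The class `IntervalSplit`, its `Box` type, and both method names are hypothetical and only serve the illustration; they are not the code from this PR and do not use the ImgLib2 interval types.

```java
import java.util.ArrayList;
import java.util.List;

final class IntervalSplit {

    /** Axis-aligned 2D box with inclusive min and max coordinates (hypothetical helper type). */
    static final class Box {
        final long minX, minY, maxX, maxY;

        Box(long minX, long minY, long maxX, long maxY) {
            this.minX = minX; this.minY = minY; this.maxX = maxX; this.maxY = maxY;
        }

        /** Number of pixels covered by this box. */
        long size() {
            return (maxX - minX + 1) * (maxY - minY + 1);
        }
    }

    /** Center box whose pixels have a full (2*span+1)^2 neighborhood inside the image, or null if none exists. */
    static Box center(long width, long height, long span) {
        if (width <= 2 * span || height <= 2 * span) return null;
        return new Box(span, span, width - 1 - span, height - 1 - span);
    }

    /** Border boxes (top, bottom, left, right) whose neighborhoods may leave the image. */
    static List<Box> borders(long width, long height, long span) {
        List<Box> out = new ArrayList<>();
        if (center(width, height, span) == null) {
            // Image too small for a safe center: the whole image needs OOB handling.
            out.add(new Box(0, 0, width - 1, height - 1));
            return out;
        }
        if (span == 0) return out; // no border needed for a 1x1 neighborhood
        out.add(new Box(0, 0, width - 1, span - 1));                         // top, full width
        out.add(new Box(0, height - span, width - 1, height - 1));           // bottom, full width
        out.add(new Box(0, span, span - 1, height - 1 - span));              // left, between top and bottom
        out.add(new Box(width - span, span, width - 1, height - 1 - span));  // right, between top and bottom
        return out;
    }
}
```

The five boxes are disjoint and together cover the whole image, so each can be handed to a different implementation (optimized for the center, default with OOB for the borders) without any pixel being processed twice.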
|
The idea was the following: I cannot use I am cutting the original image in My optimized version could be parallelized, of course, but the idea behind it was to help the JIT optimize the loops, which kicks in only after a few iterations. If we split that up, then the chunks might not be big enough for the JIT to realize there is a hotspot to potentially optimize/on-stack-replace or whatever it does. Disclaimer: That is not based on any statistical evidence. I hope that clears up any issues/concerns. |
Can you support this statement with a benchmark? I have the feeling that your optimized version + parallelization will be the fastest (as the JIT might recognize it across threads). |
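The suggestion to parallelize the optimized version over blocks of the center interval could look roughly like the sketch below. It is plain Java on a flat `int[]` with a fixed 3x3 sum and is not the implementation from this PR; `ChunkedCenterMap`, `sumRows` and `sumParallel` are hypothetical names. Since all chunks call the same `sumRows` method and HotSpot compiles code per method rather than per thread, the hot loop is shared by every worker; whether that actually beats the single-threaded variant is exactly what a benchmark would have to show.

```java
import java.util.stream.IntStream;

final class ChunkedCenterMap {

    /** 3x3 box sum for interior rows [rowStart, rowEnd); needs no bounds checks because x and y stay off the border. */
    static void sumRows(int[] in, int[] out, int width, int rowStart, int rowEnd) {
        for (int y = rowStart; y < rowEnd; y++) {
            for (int x = 1; x < width - 1; x++) {
                int i = y * width + x;
                int s = 0;
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        s += in[i + dy * width + dx];
                    }
                }
                out[i] = s;
            }
        }
    }

    /** Splits the interior rows 1..height-2 into `chunks` row blocks and processes the blocks in parallel. */
    static void sumParallel(int[] in, int[] out, int width, int height, int chunks) {
        int rows = height - 2; // number of interior rows
        IntStream.range(0, chunks).parallel().forEach(c -> {
            int start = 1 + (int) ((long) rows * c / chunks);
            int end = 1 + (int) ((long) rows * (c + 1) / chunks);
            // Each block writes a disjoint set of output rows, so no synchronization is needed.
            sumRows(in, out, width, start, end);
        });
    }
}
```

Choosing fewer, larger chunks keeps each call to `sumRows` long-running; choosing more chunks improves load balancing. The border pixels are left untouched here and would be handled by the default implementation with out-of-bounds support.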
Signed-off-by: squareys <[email protected]>
Signed-off-by: squareys <[email protected]>
Signed-off-by: squareys <[email protected]>
Signed-off-by: squareys <[email protected]>
To avoid code duplication, I decided to expand the existing implementation rather than implementing a new Op for 3D/1D. While this may not be "optimal performance", the maintainability outweighs the minimal and probably insignificant "performance loss". Signed-off-by: squareys <[email protected]>
Signed-off-by: squareys <[email protected]>
Signed-off-by: squareys <[email protected]>
... implementation `MapNeighborhoodNativeType`. This was requested to be able to let OpService choose the optimized implementation implicitly. There is however some performance loss which is hopefully traded off by using multiple threads. Signed-off-by: squareys <[email protected]>
d87609b to
17ccca3
Compare
|
Current state: I realized that my optimized implementation works only for complete ArrayImgs and not subintervals. I almost fixed that now, but the implementation will require access to the height of the original image (for skipping to the next line) which is why I will need to change the parameters of that a bit. Should be done tomorrow. |
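The reason a subinterval needs information about the original image can be sketched as follows: when walking a sub-rectangle of a flat backing array, the jump from the end of one sub-row to the start of the next depends on the extent of the full image, not of the subinterval. The example assumes a row-major `int[]` in which x varies fastest, so the relevant extent here is the full row length; `SubIntervalSum` and its parameters are hypothetical and not taken from this PR.

```java
final class SubIntervalSum {

    /**
     * Sums the w x h sub-rectangle starting at (x0, y0) of a row-major image
     * whose rows are fullWidth elements long (assumed layout: x varies fastest).
     */
    static long sum(int[] data, int fullWidth, int x0, int y0, int w, int h) {
        long total = 0;
        int index = y0 * fullWidth + x0; // flat index of the first pixel of the sub-rectangle
        int skip = fullWidth - w;        // elements between the end of one sub-row and the start of the next
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                total += data[index++];
            }
            index += skip;
        }
        return total;
    }
}
```

For a complete image, `w` equals `fullWidth` and `skip` is zero, which is why an implementation written only for whole images can ignore this parameter.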
This Op provides the same interface as all the other MapNeighborhood Ops and therefore makes the optimized implementation of MapNeighborhood for NativeType (namely `MapNeighborhoodNativeType`) available for implicit optimization of depending algorithms. Signed-off-by: squareys <[email protected]>
Signed-off-by: squareys <[email protected]>
17ccca3 to
e102430
Compare
Signed-off-by: squareys <[email protected]>
Signed-off-by: squareys <[email protected]>
Signed-off-by: squareys <[email protected]>
Signed-off-by: squareys <[email protected]>
|
@dietzc Done. We need to wait for imglib2-algorithm 0.3.2 before merging. |
|
I kindly asked @tpietzsch to release. He will. |
|
Benchmarks have run! Summary: The imagej-ops way which is compatible with the common |
|
Great work @Squareys. Can you briefly elaborate on/confirm my assumptions about the implementations? ImageJ-Ops:
ImageJ-Ops-Extends:
ImgLib2:
|
|
In Order:
|
|
@Squareys Can you briefly explain how these results compare to something which runs parallelized over the pixels, e.g. the default |
|
@Squareys can we discuss this PR next Thursday in KN1? |
|
@dietzc Sure :) |
|
@Squareys Can you find any time to rebase this over the latest master? Let us know either way, because it would be great to get this merged. Thanks! |
|
I took a crack at rebasing this branch. However, the code has changed a lot (especially maps and unary/binary functions), so there were many conflicts. I pushed what I have so far to the array-img-opt-neighborhoods-CTR branch. @Squareys Do you have any time to clean this up? If not, I will ask @LeonYang5114 to do it after he completes his current project. |
|
Closed in favor of #395. |
Hello everybody!
Here's a work-in-progress pull request for an optimized implementation of 2D/3D
ArrayImgs consisting of NativeType. TODOs:
Greetings,
Squareys