Skip to content

Conversation

@gselzer
Copy link
Contributor

@gselzer gselzer commented Apr 19, 2018

This PR adds an op to add uniformly distributed noise throughout an image, as an alternative to the non-uniform AddGaussianNoise op, renamed in this PR from AddNoise.

do {
final double newVal = (rangeMax - rangeMin) * rng.nextDouble(true, true) +
rangeMin + input.getRealDouble();
if (newVal <= input.getMaxValue() && newVal >= input.getMinValue()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder what would be the least surprising behavior here. When I have an UnsignedByteType image with all pixels at 255, this would create negative noise only, thereby shifting the image mean to be <255. When adding any noise, I'd expect the image mean to be essentially unchanged. Although that's a bit special case, how about making it optional if the result should be 1) clipped, 2) overflowing, or 3) some smarter way of adding noise, such as the one here?

Copy link
Contributor Author

@gselzer gselzer Apr 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reply!

I think that clipping can be pretty easily done by manipulating the rangeMin and rangeMax parameters. rangeMax just has to be set to 0 in this scenario.

I do not understand what you mean by overflow. Are you suggesting that you would like to have values in the ouptut that are greater than 255? If so you can just convert your image to ShortType or something else that allows larger values. Or are you meaning that it would be desired to wrap values with positive noise values back around to the lower values of the type (i.e. if given rangeMin = -4 and rangeMax = 4 we generate a noise value of 4 for your input of 255, thus the output is set to 3)? If so this is not correct, seeing as output < rangeMin. Can you elaborate on what you mean, if you mean something different?

I guess by this I mean that I do not see any reason to add these options, since it all can be controlled by the type of the input and the rangeMin and rangeMax parameters. And you are right in that the mean of your example will be less than 255. However in this scenario I think that is expected, because there is no way to go above the starting value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sorry, when I said "overflow", I meant wrapping values from max to min and vice versa, e.g. 255 + 4 -> 3. When working with ImgLib2 types (such as UnsignedByteType), this is the observed behavior, and it actually comes in handy when working with cyclic colormaps (see also http://peterkovesi.com/projects/colourmaps/) to represent e.g. directionality. For that special case, it would be nice to be able to choose between wrapping, clipping, or possibly type-widening behavior for ops in general (from a user perspective).

@ctrueden
Copy link
Member

ctrueden commented May 1, 2018

When working with ImgLib2 types (such as UnsignedByteType), this is the observed behavior, and it actually comes in handy when working with cyclic colormaps

I suggest caution: there is no guarantee that all ImgLib2 types will behave in this way. (@tpietzsch @axtimwalde please correct me if I am wrong. ) Behavior in this case is left up to the implementation. E.g., IIRC, there are some types that when you setReal(value) them with an out-of-bounds value, they get corrupted somehow. @awalter17 do you remember?

@awalter17
Copy link
Contributor

there are some types that when you setReal(value) them with an out-of-bounds value, they get corrupted somehow. @awalter17 do you remember?

I don't remember a problem with setReal (sorry!), but there was inconsistency with the out-of-bounds behavior (some wrapped, others clamped). I think most of them have masking/wrapping behavior now (imglib/imglib2#96 - this was for constructors but I think it affected the setters as well). That being said, I don't think RealType guarantees this wrapping behavior for setReal or any of the setters.

@imagejan
Copy link
Member

imagejan commented May 2, 2018

I suggest caution: there is no guarantee that all ImgLib2 types will behave in this way.

Yes, I figured it would be undefined behavior (as always in ImgLib2, it's the callers responsibility to conform to types and intervals...), that's why i said "observed" behavior 😄 Ok, I will not rely on that, but the "cyclic scale" use case came to my mind as something where this would be really useful.

when you setReal(value) them with an out-of-bounds value

BTW, are there terms to distinguish between out-of-bounds for intervals (i.e. x,y,z,...) and out-of-bounds for types (i.e. values)?

Anyhow, this is now far off-topic for this PR. I suggest to simply ignore my comments (and maybe just keep in mind that it would be useful to have consistent clipping/wrapping behavior across ops in general).

@ctrueden ctrueden added the to do label Oct 3, 2018
gselzer and others added 6 commits May 18, 2020 11:36
This op was renamed to make it clear that the noise generated by this op
is NON-uniform. This distinction is being made to separate this op from
a uniform noise generation op (coming in a later commit)
In previous commits the op did not allow input image data to influence
the output values. Now the algorithm allows, given a nonzero input image
(in which all of the pixels do not have to be the same value), the input
image to influence the output noise, i.e. if it is desired to add
uniform noise with range [-4, 4] upon an imported image, the algorithm
will now add a value x such that -4 <= x <= 4 to each pixel in the
image, whatever value that may be, as long as it does not extend beyond
the range of the type.
Previously the Op would recompute the calculation whenver the Mersenne
Twister provided a random value that would result in the input going
outside of the type bounds, in the hopes that another computation would
provide a result that was inside the type bounds. Not only can this be
slow, but also does not line up with what the user might expect.

Suppose that the input to the Op is an UnsignedByteType of value 254,
	with rangeMin=0 and rangeMax=4. Since we have uniform noise, we
	would expect that the probability that the output has an equal
	probability of being 254, 255, 256, 257, 258 (with a probability
	of 0.2 for each). But we know that this UnsignedByteType cannot
	hold a value greater than 255. If the Op redoes the calculation
	every time the output is greater than 255, then we know that the
	output will be 254 half of the time and 255 half of the time.
	Therefore no probability matches what might be expected.

Consider the same example, but instead of a recomputation we clamp the
output when it is outside the type bounds. We now have
P(output=254)=0.2, P(output=255)=0.8. In this case, all outputs within
the type range will have the expected probability of occurring, which
makes more sense than the scenario explained above. This is why the
change was made.
@gselzer gselzer force-pushed the addUniformNoise branch 2 times, most recently from 67f7e97 to 04af3fe Compare May 19, 2020 16:46
@gselzer
Copy link
Contributor Author

gselzer commented May 19, 2020

Alright, I gave this PR some TLC. @imagejan, after some thinking I realized that I liked your solution a bit better, as I described in 5440061. I found it difficult to treat floating point and integer types the same, however (see 04af3fe), so I created two different Ops.

I did, however, leave the computations for the integer type Op in doubles, however, to allow for some support for some of our larger IntegerTypes. The only other option that I saw was to convert everything to BigIntegers, but I did not want to do that for performance reasons. (Note that we cannot simply do this all using Imglib math because that would leave us open to all of the issues of the edge behavior of each type) Let me know what you think.

gselzer added 2 commits May 19, 2020 12:24
This allows us to handle integer types and realTypes separately. The
reason for this is as follows:

It seems clear that we would want to allow floating point randomness
when our type allows floating points

If we always base our calculations on floating points, this results in
edge effects when we add noise to integer types. This is due to
rounding: suppose we have a pseudorandomly generated double between 0
and 2. Most IntegerTypes simply round that double to a long, integer,
    etc in setReal. Thus the probability we add 0 to the input is 0.25,
    1 to the input is 0.5, and 2 to the input is 0.25. To get around
    this case we want to generate the random number differently when we
    have an integer type.

For these reasons we have two different Ops to perform the different
noise generation and have shared logic for setting the output.

N.B. We do not allow wrapping for floating point types. This is because
we operate within the following constraint: the number after the max of
the type should map to the minimum of the type. In the case of floating
points, what then should happen to a number equal to 0.5 + the maximum
of the type? In this paradigm, we would expect it to be the minimum of
the type - 0.5; this is impossible, however, since this would be outside
of the range of the type. For this reason, we always clamp.
@gselzer gselzer force-pushed the addUniformNoise branch from 90e1436 to 0bc3d3d Compare May 19, 2020 17:36
The default seed resulted in non-random noise. Removing it ensures that
the result is different every time (unless the user passes a seed)
@gselzer
Copy link
Contributor Author

gselzer commented May 19, 2020

I think the way setting the seed parameter works should be flipped, i.e. don't use a constant seed unless otherwise specified.

Thanks for catching that @rimadoma! Should be updated in the uniform noise Ops. It looks like AddGaussianNoise and AddPoissonNoise do this as well, should we alter them with the risk of breaking backwards compatibility?

Also Random is not thread safe.

Sure, yes. I used MersenneTwisterFast at @ctrueden's recommendation (which is thread safe according to the Javadoc). We can switch this in the other noise Ops as well if we decide it is okay to do so.

@ctrueden
Copy link
Member

ctrueden commented May 20, 2020

The previous default seed parameter was correct. One rule of ops is that they are deterministic. If you want TRULY RANDOM noise in your script, then you must generate a random number (externally to ops) and pass that as the seed. I know it's annoying, but it's one of the guarantees of ops.

Therefore, we also need to be careful about multithreaded pseudorandomness. Just because MersenneTwisterFast is thread-safe doesn't mean the random order will be deterministic per thread. You need to use a separate instance of the RNG per thread. Happy to discuss more tomorrow during our meeting, @gselzer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants