How to bin integral vs floating-point data correctly (precision issue and general question) #336

Dear @HDembinski and other authors, thanks for this awesome library. We have just started using it, and the speed and versatility are impressive! We have a problem that manifests as a precision issue but also raises a more general question.

The problem: We use the floating-point histogram to bin any kind of data. When we bin integer-valued data (gray-value camera images), we set the lower and upper bounds from the data range to `[min, max+1]` and the number of bins to `upper bound - lower bound`. In some cases, however, numeric precision leaves the last bin empty. This is due to rounding error: values equal to `max` fall into bin `max-1` instead of bin `max`. We checked, and the bin index is computed as `(max-1).999...`, which is then truncated to `max-1`. Is there a good solution to this? The empty last bin causes confusion, and it's also a bit problematic when writing tests.
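To make the effect concrete, here is a minimal Python sketch of what we believe is happening. It assumes the index on a regular floating-point axis is computed as `floor((x - lower) / (upper - lower) * n)`; that formula and the helper names `regular_index` and `find_misbinned` are our own illustration, not code taken from the library.

```python
import math

def regular_index(x, lower, upper, n):
    """Assumed index formula for a regular floating-point axis (our guess, not library code)."""
    z = (x - lower) / (upper - lower)  # normalized position in [0, 1)
    return math.floor(z * n)           # rounding in z * n can land just below an integer

def find_misbinned(lower, upper):
    """Return (value, index) pairs where an integer value misses its expected bin value - lower."""
    n = upper - lower  # one bin per integer value, as described above
    result = []
    for x in range(lower, upper):
        idx = regular_index(x, lower, upper, n)
        if idx != x - lower:
            result.append((x, idx))
    return result

# Scan integer ranges [0, upper) and print those with at least one misplaced value.
for upper in range(2, 300):
    bad = find_misbinned(0, upper)
    if bad:
        print(upper, bad[:3])
```

Under this assumption, ranges whose width is a power of two are unaffected because the division and multiplication are exact, while other widths can push individual integer values one bin too low.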

The underlying question: We are not sure what the "recommended" way is to set histogram bounds for integer-valued data. Are there common guidelines? For floating-point data it is easier to see which data range each bin covers. For integer data, however, it seems the value `0` might be better represented by the bin `[-0.5, 0.5)` than by the more obvious `[0, 1)`. Any help, links, or pointers to documentation would be highly appreciated!
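The half-integer idea can be sketched as follows. This again assumes the index formula `floor((x - lower) / (upper - lower) * n)`, and `centered_index` is a hypothetical helper name of ours. With edges at `lo - 0.5` and `hi + 0.5`, every integer sits in the middle of its bin, so a rounding error of a few units in the last place cannot move it across an edge.

```python
import math

def centered_index(x, lo, hi):
    """Index of integer x on an axis with one bin per integer and edges at lo - 0.5 ... hi + 0.5.

    Uses the assumed regular-axis formula; this is a sketch, not library code.
    """
    n = hi - lo + 1                   # one bin per integer value in [lo, hi]
    lower, upper = lo - 0.5, hi + 0.5 # half-integer edges, so upper - lower == n
    z = (x - lower) / (upper - lower)
    return math.floor(z * n)          # z * n is close to (x - lo) + 0.5, far from any edge

# Every gray value of an 8-bit image maps to the bin with its own number.
assert all(centered_index(v, 0, 255) == v for v in range(256))
```

The trade-off is that the bin edges no longer coincide with the integer values themselves, which matters only if the same axis is also filled with non-integer data.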
