statistical significance method and jargon #76

@matanox

Thanks for Criterion!

I spent some time understanding sample-size and measurement-time, which I believe I now have right. My questions are about statistical significance, and they may reveal that I lack prior knowledge of benchmarking best practices. Here they go.


https://bheisler.github.io/criterion.rs/book/analysis.html indicates that, for any individual benchmark, the samples use linearly increasing iteration counts: [d, 2d, 3d, ... Nd].
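
To make the scheme concrete, here is a minimal Rust sketch of how I read it. The function names are mine, not Criterion's, and the least-squares fit through the origin is only my assumption about how linearly increasing counts might be used to estimate a per-iteration time.

```rust
/// Build the per-sample iteration counts [d, 2d, ..., n*d].
/// `d` and `n` are illustrative parameters, not Criterion settings.
fn linear_iteration_counts(d: u64, n: u64) -> Vec<u64> {
    (1..=n).map(|i| i * d).collect()
}

/// Least-squares slope of a line through the origin, i.e. an estimate of
/// time per iteration from (iteration count, total sample time) pairs.
/// ASSUMPTION: this is one plausible use of the linear counts.
fn slope_through_origin(iters: &[u64], times: &[f64]) -> f64 {
    let xy: f64 = iters
        .iter()
        .zip(times)
        .map(|(&x, &t)| x as f64 * t)
        .sum();
    let xx: f64 = iters.iter().map(|&x| (x as f64) * (x as f64)).sum();
    xy / xx
}
```

With d = 10 and N = 5 the counts are 10, 20, 30, 40, 50. If each sample's total time were exactly proportional to its count, the fitted slope would recover the per-iteration time.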

How does this interact with the statistical significance testing that Criterion performs, which can ultimately produce warnings like:

Found 10 outliers among 50 measurements (20.00%)
6 (12.00%) low severe
2 (4.00%) high mild
2 (4.00%) high severe

Or, if [d, 2d, 3d, ... Nd] is not there to serve a statistical property per se, what is the motivation for it?

Is there already somewhere that defines the low/high and mild/severe labels in this output?
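
For context, the counts in that warning look to me like the output of a Tukey-style fence classification. The sketch below assumes fences at 1.5 and 3 times the interquartile range, which is the conventional choice. The type and function names and the percentile interpolation are illustrative, and whether Criterion uses exactly these thresholds is part of what I am asking.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Label {
    LowSevere,
    LowMild,
    Normal,
    HighMild,
    HighSevere,
}

/// Percentile by linear interpolation on a sorted, non-empty slice (p in [0, 1]).
fn percentile(sorted: &[f64], p: f64) -> f64 {
    let rank = p * (sorted.len() - 1) as f64;
    let lo = rank.floor() as usize;
    let hi = rank.ceil() as usize;
    sorted[lo] + (sorted[hi] - sorted[lo]) * (rank - lo as f64)
}

/// Label each sample against Tukey fences.
/// ASSUMPTION: mild = beyond 1.5 * IQR, severe = beyond 3 * IQR.
/// Expects a non-empty slice with no NaN values.
fn classify(samples: &[f64]) -> Vec<Label> {
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let q1 = percentile(&sorted, 0.25);
    let q3 = percentile(&sorted, 0.75);
    let iqr = q3 - q1;
    samples
        .iter()
        .map(|&x| {
            if x < q1 - 3.0 * iqr {
                Label::LowSevere
            } else if x < q1 - 1.5 * iqr {
                Label::LowMild
            } else if x > q3 + 3.0 * iqr {
                Label::HighSevere
            } else if x > q3 + 1.5 * iqr {
                Label::HighMild
            } else {
                Label::Normal
            }
        })
        .collect()
}
```

Under this reading, "low" and "high" say which side of the middle 50% a measurement falls on, and "mild" and "severe" say how far past the fence it lies.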
