Resource Efficiency & Performance #1035

@KevinW1998

Hi! I am interested in this platform, but I have some questions about its overall performance.

How well does MWDB perform when storing a large number of samples (up to millions)? Has it been used at that scale before?

Is there a reason why hashes are stored as hex text rather than as binary? https://github.com/CERT-Polska/mwdb-core/blob/master/mwdb/model/file.py#L32-L37
Wouldn't storing them in binary be more resource-efficient?
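
To put rough numbers on this, here is a minimal sketch comparing the size of a SHA-256 digest as hex text and as raw bytes. It measures only the values themselves; per-row and index overhead in the database is not modeled and would depend on the backend. The `sample` input is a hypothetical stand-in for a file's contents.

```python
import hashlib

# Hypothetical stand-in for the bytes of a stored sample.
sample = b"example sample content"

digest = hashlib.sha256(sample)
hex_form = digest.hexdigest()  # str of hexadecimal characters
raw_form = digest.digest()     # raw digest bytes

hex_size = len(hex_form.encode("ascii"))  # bytes needed for the hex text
raw_size = len(raw_form)                  # bytes needed for the binary form

print(hex_size, raw_size)  # 64 32
```

The hex form is exactly twice the size of the binary form, and `bytes.fromhex(hex_form)` converts back to the raw digest, so the two representations carry the same information.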

The same question applies to blobs: https://github.com/CERT-Polska/mwdb-core/blob/master/mwdb/model/blob.py#L49.

_content=content.encode("unicode_escape").decode("utf-8")

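With this encoding, a character escaped as \xNN takes 4 bytes, and a \uNNNN escape takes 6, instead of the 1 to 3 bytes of the unescaped UTF-8 form.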
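
The overhead can be checked by measuring the escaped form directly. The sketch below applies the same expression as the line quoted above to a few hypothetical inputs; the helper name `escaped_size` is made up for illustration and is not part of mwdb-core.

```python
def escaped_size(content: str) -> int:
    """Return the UTF-8 byte length of content after unicode_escape."""
    escaped = content.encode("unicode_escape").decode("utf-8")
    return len(escaped.encode("utf-8"))


# Hypothetical one-character inputs covering the different escape forms.
cases = [
    ("printable ASCII", "A"),
    ("newline", "\n"),
    ("NUL character", "\x00"),
    ("Latin-1 character", "\xe9"),
    ("BMP character", "\u20ac"),
    ("non-BMP character", "\U0001f600"),
]

for label, text in cases:
    plain = len(text.encode("utf-8"))
    print(f"{label}: {plain} byte(s) plain, {escaped_size(text)} byte(s) escaped")
```

Printable ASCII is stored unchanged, so the cost depends on how much of a blob consists of control characters or non-ASCII text.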

Labels: type:question (Further information is requested)
