Also segment the compound ID #112

@HiteSit

Dear Development team,

Looking through your code, I noticed that there is no option to segment not only the structures but also the associated compound ID, which is present in many patents (in more or less the same style; an example is attached). I imagine a protocol that also segments the ID, after which a simple OCR tool (pytesseract) or a more complex one (perhaps something based on deep learning) could recognise the ID number and associate it with the structure.
I am aware that the ID is not in a constant position in all patents (for example, it is sometimes 12ptx and other times 6ptx away from the recognised structure, and it is sometimes horizontally centred and other times not). Still, I can imagine some sort of sample script in which the user adjusts a few parameters until they are satisfied with the segmentation.
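
To make the idea concrete, here is a minimal sketch of the association step only. It assumes the structure bounding boxes come from the existing segmentation and the label bounding boxes come from an OCR pass (for instance word boxes from pytesseract); neither step is shown. The names `Box`, `match_labels`, `max_gap` and `max_center_offset` are hypothetical and not part of this project, and coordinates are assumed to be pixels with y increasing downwards and the ID placed below the structure.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box: left, top, right, bottom (y grows downwards)."""
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def center_x(self) -> float:
        return (self.x0 + self.x1) / 2


def match_labels(
    structures: List[Box],
    labels: List[Box],
    max_gap: float = 40.0,
    max_center_offset: Optional[float] = None,
) -> Dict[int, int]:
    """Map structure index -> label index for labels found just below a structure.

    max_gap: largest allowed vertical distance between the bottom of the
        structure and the top of the label (the user-tunable spacing).
    max_center_offset: largest allowed horizontal distance between the two
        centres; defaults to half the structure width when None.
    """
    candidates = []
    for si, s in enumerate(structures):
        limit = (s.x1 - s.x0) / 2 if max_center_offset is None else max_center_offset
        for li, lab in enumerate(labels):
            gap = lab.y0 - s.y1
            if gap < 0 or gap > max_gap:
                continue  # label is above the structure or too far below it
            offset = abs(lab.center_x - s.center_x)
            if offset > limit:
                continue  # label is too far off to the side
            candidates.append((gap + offset, si, li))

    # Greedy one-to-one assignment: cheapest pairs are fixed first.
    matches: Dict[int, int] = {}
    used_labels = set()
    for _, si, li in sorted(candidates):
        if si in matches or li in used_labels:
            continue
        matches[si] = li
        used_labels.add(li)
    return matches
```

The user would rerun `match_labels` with different `max_gap` and `max_center_offset` values until the pairing looks right for a given patent layout; structures with no label inside the window are simply left out of the result.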

Before I start looking into whether I can do it myself: is there a specific reason why this feature was not implemented, and what would the challenges be?

Many thanks, and terrific work.

[attached image]

Labels

enhancement: New feature or request