Converters - EDS-NLP #446
Add columns in input and output

When taking a dataframe as input (for example one read with edsnlp.data.read_parquet), columns can be loaded as document attributes by listing them in doc_attributes:

    import edsnlp

    docs = edsnlp.data.read_parquet(
        DATA_PATH,
        converter="omop",
        doc_attributes=["note_id", "person_id", "note_type", "note_datetime"],
    )

These columns can then be used in the pipelines applied to the data. To retrieve columns in the output, pass doc_attributes to the writer as well, for example:

    edsnlp.data.write_parquet(
        docs,
        SAVE_PATH,
        doc_attributes=["note_id", "person_id"],
    )
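
Alternatively, a custom converter function can build the output rows, here one row per entity:

    def get_entities(doc):
        # Collect one dict per entity, carrying over the document attributes
        entities = []
        for ent in doc.ents:
            d = dict(
                note_id=ent.doc._.note_id,
                person_id=ent.doc._.person_id,
                label=ent.label_,
                sentence=ent.sent.text,
            )
            entities.append(d)
        return entities

Then:

    edsnlp.data.write_parquet(
        docs,
        SAVE_PATH,
        converter=get_entities,
    )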
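
To check what rows such a converter produces without loading any data, here is a minimal sketch that runs get_entities on stand-in objects. It assumes the function is meant to append each dict and return the list. The SimpleNamespace objects are hypothetical placeholders that only mimic the attributes used (doc._, ent.label_, ent.sent.text); they are not real spaCy or EDS-NLP objects, and the sample values are made up.

```python
from types import SimpleNamespace


def get_entities(doc):
    """Return one row (dict) per entity found in `doc`."""
    entities = []
    for ent in doc.ents:
        entities.append(
            dict(
                note_id=ent.doc._.note_id,
                person_id=ent.doc._.person_id,
                label=ent.label_,
                sentence=ent.sent.text,
            )
        )
    return entities


# Hypothetical stand-ins for a document and one entity (illustration only).
doc = SimpleNamespace(_=SimpleNamespace(note_id=1, person_id=42), ents=[])
sent = SimpleNamespace(text="Patient admitted for pneumonia.")
doc.ents = [SimpleNamespace(doc=doc, label_="disease", sent=sent)]

rows = get_entities(doc)
print(rows)

# A document without entities yields no rows at all.
empty_doc = SimpleNamespace(_=SimpleNamespace(note_id=2, person_id=7), ents=[])
empty_rows = get_entities(empty_doc)
```

Each document can yield zero, one, or several rows, so the written table has one line per entity rather than one line per note.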
-
Converters - EDS-NLP
https://aphp.github.io/edsnlp/latest/data/converters/