Quotes in TSV files #939

@TomazErjavec

Currently, all the ParlaMint generated TSV files have a simple structure: each field contains character data that is assumed to be space-normalised, and fields are separated by a TAB character. However, many applications have problems parsing this format when a field contains a quote character (even though the quote does not make anything ambiguous).
For an example see Samples/ParlaMint-UA/2023/ParlaMint-UA_2023-11-09-m0-meta.tsv.
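To illustrate the kind of problem meant here, below is a minimal sketch using Python's `csv` module as a stand-in for such an application. The rows are made up for the illustration and are not taken from the sample file. With default settings the reader treats `"` as a quote character, so quotes are silently dropped or, if unbalanced, a TAB is swallowed into the field; with `csv.QUOTE_NONE` the same lines parse as intended.

```python
import csv

# Hypothetical rows for illustration only (not from the ParlaMint samples).
balanced = ['P1\t"Green" group\t2023']    # field starts with a balanced quote pair
unbalanced = ['P2\t"Green group\t2023']   # field starts with a lone quote


def parse(lines, **kwargs):
    """Parse TAB-separated lines with csv.reader and return all rows."""
    return list(csv.reader(lines, delimiter="\t", **kwargs))


# Default dialect: '"' is interpreted as a quote character.
default_balanced = parse(balanced)        # the quotes around Green are lost
default_unbalanced = parse(unbalanced)    # the TAB after the field is swallowed

# QUOTE_NONE: '"' is treated as an ordinary character.
literal_balanced = parse(balanced, quoting=csv.QUOTE_NONE)
literal_unbalanced = parse(unbalanced, quoting=csv.QUOTE_NONE)

for name, rows in [
    ("default, balanced", default_balanced),
    ("default, unbalanced", default_unbalanced),
    ("QUOTE_NONE, balanced", literal_balanced),
    ("QUOTE_NONE, unbalanced", literal_unbalanced),
]:
    print(name, rows)
```

So the current files can be read correctly, but only if every consumer knows to switch quote handling off, which many tools do not do by default.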

This should be fixed to enable processing of our TSVs. My suggestion is to:

  • enclose all fields that have "free" text as their content in double quotes
  • if such a field does indeed contain a quote, double it

This would most likely make our TSV files a specialisation of CSV, with TAB as the delimiter, so that standard CSV parsers could read them.
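A minimal sketch of the suggested convention, in Python. The helper names (`quote_free_text`, `write_row`) and the sample row are hypothetical, and the choice of which columns count as "free" text is left to the caller; this is not the actual ParlaMint conversion code.

```python
import csv


def quote_free_text(value: str) -> str:
    """Enclose a field in double quotes, doubling any embedded quote."""
    return '"' + value.replace('"', '""') + '"'


def write_row(cells, free_text_columns):
    """Join cells with TABs, quoting only the columns listed as free text."""
    return "\t".join(
        quote_free_text(cell) if index in free_text_columns else cell
        for index, cell in enumerate(cells)
    )


# Hypothetical row: column 1 holds free text that contains quotes.
row = ["P1", '"Green" group', "2023"]
line = write_row(row, free_text_columns={1})
print(line)

# A standard CSV reader with TAB as delimiter recovers the original cells.
parsed = next(csv.reader([line], delimiter="\t"))
print(parsed == row)
```

Because the output uses the usual CSV quoting rules, a default CSV reader configured only with the TAB delimiter gets back the original values, quotes included.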

Labels: bug, enhancement
