Fail to generate the CTD dataset #11

@freesunshine0316

[lsong10@bhg0031 bran]$ ./extract.sh
Downloading Pubtator dump
--2019-03-31 21:09:22-- ftp://ftp.ncbi.nlm.nih.gov/pub/lu/PubTator/bioconcepts2pubtator_offsets.gz
=> ‘/home/lsong10/ws/exp.dep_forest/bran/data/ctd/bioconcepts2pubtator_offsets.gz’
Resolving ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)... 130.14.250.13, 2607:f220:41e:250::7
Connecting to ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)|130.14.250.13|:21... failed: Connection refused.
Connecting to ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)|2607:f220:41e:250::7|:21... failed: Network is unreachable.
Converting data from pubtator to tsv format
usage: process_CDR_data.py [-h] -i INPUT_FILE -d OUTPUT_DIR -f
OUTPUT_FILE_SUFFIX [-s MAX_SEQ] [-a FULL_ABSTRACT]
[-p PUBMED_FILTER] [-r RELATIONS]
[-w WORD_PIECE_CODES] [-t SHARDS]
[-x EXPORT_ALL_EPS] [-n EXPORT_NEGATIVES]
[-e ENCODING] [-m MAX_DISTANCE]
process_CDR_data.py: error: argument -a/--full_abstract: expected one argument
split: extra operand ‘up’
Try 'split --help' for more information.
map relations to smaller set
awk: cmd. line:1: fatal: cannot open file positive_0_genia' for reading (No such file or directory)
seperate data into train dev test
positive train 50 500
positive dev 50 500
positive test 50 500
negative train 50 500
awk: cmd. line:1: fatal: cannot open file negative_0_genia' for reading (No such file or directory)
negative dev 50 500
awk: cmd. line:1: fatal: cannot open file negative_0_genia' for reading (No such file or directory)
negative test 50 500
awk: cmd. line:1: fatal: cannot open file negative_0_genia' for reading (No such file or directory)
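
In the log above, the FTP download is refused, yet every later step still runs and fails on missing files. Below is a minimal sketch of a guard that stops a pipeline step early when an expected input is missing or empty. The helper name `require_nonempty_file` and the example path are hypothetical and not part of the bran scripts.

```python
import os
import sys


def require_nonempty_file(path):
    """Return path if it is an existing, non-empty regular file.

    Hypothetical helper: raises FileNotFoundError otherwise, so a
    pipeline step can abort instead of continuing with missing input.
    """
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        raise FileNotFoundError(f"expected input is missing or empty: {path}")
    return path


if __name__ == "__main__":
    # Assumed location of the PubTator dump, relative to the repo root.
    dump = "data/ctd/bioconcepts2pubtator_offsets.gz"
    try:
        require_nonempty_file(dump)
    except FileNotFoundError as err:
        # Report the problem and stop before any conversion step runs.
        print(err, file=sys.stderr)
        sys.exit(1)
    print("input found, continuing")
```

With a check like this, the run would end at the failed download instead of producing the later `awk` errors.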
