Commit 4f10f91

Update metadata from Papers with Code
1 parent eb6361b commit 4f10f91

20 files changed, +31 -17 lines changed

data/xml/2020.aacl.xml

Lines changed: 1 addition & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -555,7 +555,6 @@
555 555
<abstract>Pairwise data automatically constructed from weakly supervised signals has been widely used for training deep learning models. Pairwise datasets such as parallel texts can have uneven quality levels overall, but usually contain data subsets that are more useful as learning examples. We present two methods to refine data that are aimed to obtain that kind of subsets in a self-supervised way. Our methods are based on iteratively training dual-encoder models to compute similarity scores. We evaluate our methods on de-noising parallel texts and training neural machine translation models. We find that: (i) The self-supervised refinement achieves most machine translation gains in the first iteration, but following iterations further improve its intrinsic evaluation. (ii) Machine translations can improve the de-noising performance when combined with selection steps. (iii) Our methods are able to reach the performance of a supervised method. Being entirely self-supervised, our methods are well-suited to handle pairwise data without the need of prior knowledge or human annotations.</abstract>
556 556
<url hash="d0e82ad5">2020.aacl-main.45</url>
557 557
<bibkey>hernandez-abrego-etal-2020-self</bibkey>
558-
<pwcdataset url="https://paperswithcode.com/dataset/paracrawl">ParaCrawl</pwcdataset>
559 558
</paper>
560 559
<paper id="46">
561 560
<title>A Survey of the State of Explainable <fixed-case>AI</fixed-case> for Natural Language Processing</title>
@@ -1532,7 +1531,7 @@
1532 1531
<abstract>We introduce fairseq S2T, a fairseq extension for speech-to-text (S2T) modeling tasks such as end-to-end speech recognition and speech-to-text translation. It follows fairseq’s careful design for scalability and extensibility. We provide end-to-end workflows from data pre-processing, model training to offline (online) inference. We implement state-of-the-art RNN-based as well as Transformer-based models and open-source detailed training recipes. Fairseq’s machine translation models and language models can be seamlessly integrated into S2T workflows for multi-task learning or transfer learning. Fairseq S2T is available at https://github.com/pytorch/fairseq/tree/master/examples/speech_to_text.</abstract>
1533 1532
<url hash="ba6e2aa3">2020.aacl-demo.6</url>
1534 1533
<bibkey>wang-etal-2020-fairseq</bibkey>
1535-
<pwccode url="https://github.com/pytorch/fairseq/tree/master/examples/speech_to_text" additional="true">pytorch/fairseq</pwccode>
1534+
<pwccode url="https://github.com/pytorch/fairseq" additional="true">pytorch/fairseq</pwccode>
1536 1535
<pwcdataset url="https://paperswithcode.com/dataset/librispeech">LibriSpeech</pwcdataset>
1537 1536
<pwcdataset url="https://paperswithcode.com/dataset/must-c">MuST-C</pwcdataset>
1538 1537
</paper>

data/xml/2020.acl.xml

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -6125,6 +6125,7 @@
6125 6125
<doi>10.18653/v1/2020.acl-main.417</doi>
6126 6126
<video href="http://slideslive.com/38928733"/>
6127 6127
<bibkey>banon-etal-2020-paracrawl</bibkey>
6128+
<pwccode url="" additional="true"/>
6128 6129
<pwcdataset url="https://paperswithcode.com/dataset/paracrawl">ParaCrawl</pwcdataset>
6129 6130
</paper>
6130 6131
<paper id="418">
@@ -7788,7 +7789,7 @@
7788 7789
<doi>10.18653/v1/2020.acl-main.528</doi>
7789 7790
<video href="http://slideslive.com/38928863"/>
7790 7791
<bibkey>ma-etal-2020-simplify</bibkey>
7791-
<pwccode url="https://github.com/v-mipeng/LexiconAugmentedNER" additional="false">v-mipeng/LexiconAugmentedNER</pwccode>
7792+
<pwccode url="https://github.com/v-mipeng/LexiconAugmentedNER" additional="true">v-mipeng/LexiconAugmentedNER</pwccode>
7792 7793
<pwcdataset url="https://paperswithcode.com/dataset/ontonotes-4-0">OntoNotes 4.0</pwcdataset>
7793 7794
<pwcdataset url="https://paperswithcode.com/dataset/resume-ner">Resume NER</pwcdataset>
7794 7795
<pwcdataset url="https://paperswithcode.com/dataset/weibo-ner">Weibo NER</pwcdataset>

data/xml/2020.clinicalnlp.xml

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -242,6 +242,9 @@
242 242
<bibkey>wen-etal-2020-medal</bibkey>
243 243
<pwccode url="https://github.com/BruceWen120/medal" additional="false">BruceWen120/medal</pwccode>
244 244
<pwcdataset url="https://paperswithcode.com/dataset/medal">MeDAL</pwcdataset>
245+
<pwcdataset url="https://paperswithcode.com/dataset/adam">ADAM</pwcdataset>
246+
<pwcdataset url="https://paperswithcode.com/dataset/mimic-iii">MIMIC-III</pwcdataset>
247+
<pwcdataset url="https://paperswithcode.com/dataset/pubmed">Pubmed</pwcdataset>
245 248
</paper>
246 249
<paper id="16">
247 250
<title>Knowledge Grounded Conversational Symptom Detection with Graph Memory Networks</title>

data/xml/2020.eamt.xml

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -509,7 +509,6 @@
509 509
<abstract>With official status in both Ireland and the EU, there is a need for high-quality English-Irish (EN-GA) machine translation (MT) systems which are suitable for use in a professional translation environment. While we have seen recent research on improving both statistical MT and neural MT for the EN-GA pair, the results of such systems have always been reported using automatic evaluation metrics. This paper provides the first human evaluation study of EN-GA MT using professional translators and in-domain (public administration) data for a more accurate depiction of the translation quality available via MT.</abstract>
510 510
<url hash="d59b6dd8">2020.eamt-1.46</url>
511 511
<bibkey>dowling-etal-2020-human</bibkey>
512-
<pwcdataset url="https://paperswithcode.com/dataset/paracrawl">ParaCrawl</pwcdataset>
513 512
</paper>
514 513
<paper id="47">
515 514
<title>Machine Translation Quality: A comparative evaluation of <fixed-case>SMT</fixed-case>, <fixed-case>NMT</fixed-case> and tailored-<fixed-case>NMT</fixed-case> outputs</title>

data/xml/2020.emnlp.xml

Lines changed: 3 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -4217,6 +4217,7 @@
4217 4217
<doi>10.18653/v1/2020.emnlp-main.281</doi>
4218 4218
<video href="https://slideslive.com/38938810"/>
4219 4219
<bibkey>he-etal-2020-amalgamating</bibkey>
4220+
<pwccode url="https://github.com/siat-nlp/TTOS" additional="false">siat-nlp/TTOS</pwccode>
4220 4221
</paper>
4221 4222
<paper id="282">
4222 4223
<title>Task-oriented Domain-specific Meta-Embedding for Text Classification</title>
@@ -6550,7 +6551,7 @@
6550 6551
<doi>10.18653/v1/2020.emnlp-main.438</doi>
6551 6552
<video href="https://slideslive.com/38938985"/>
6552 6553
<bibkey>hardalov-etal-2020-exams</bibkey>
6553-
<pwccode url="https://github.com/mhardalov/exams-qa" additional="false">mhardalov/exams-qa</pwccode>
6554+
<pwccode url="https://github.com/mhardalov/exams-qa" additional="true">mhardalov/exams-qa</pwccode>
6554 6555
<pwcdataset url="https://paperswithcode.com/dataset/exams">EXAMS</pwcdataset>
6555 6556
<pwcdataset url="https://paperswithcode.com/dataset/arc">ARC</pwcdataset>
6556 6557
<pwcdataset url="https://paperswithcode.com/dataset/race">RACE</pwcdataset>
@@ -7181,7 +7182,6 @@
7181 7182
<video href="https://slideslive.com/38939183"/>
7182 7183
<bibkey>el-kishky-etal-2020-ccaligned</bibkey>
7183 7184
<pwcdataset url="https://paperswithcode.com/dataset/ccaligned">CCAligned</pwcdataset>
7184-
<pwcdataset url="https://paperswithcode.com/dataset/paracrawl">ParaCrawl</pwcdataset>
7185 7185
<pwcdataset url="https://paperswithcode.com/dataset/wikimatrix">WikiMatrix</pwcdataset>
7186 7186
</paper>
7187 7187
<paper id="481">
@@ -7473,7 +7473,7 @@
7473 7473
<doi>10.18653/v1/2020.emnlp-main.498</doi>
7474 7474
<video href="https://slideslive.com/38938695"/>
7475 7475
<bibkey>garg-ramakrishnan-2020-bae</bibkey>
7476-
<pwccode url="https://github.com/QData/TextAttack/blob/master/textattack/attack_recipes/bae_garg_2019.py" additional="true">QData/TextAttack</pwccode>
7476+
<pwccode url="https://github.com/QData/TextAttack" additional="true">QData/TextAttack</pwccode>
7477 7477
<pwcdataset url="https://paperswithcode.com/dataset/mpqa-opinion-corpus">MPQA Opinion Corpus</pwcdataset>
7478 7478
</paper>
7479 7479
<paper id="499">
@@ -9276,7 +9276,6 @@
9276 9276
<pwcdataset url="https://paperswithcode.com/dataset/esxnli">esXNLI</pwcdataset>
9277 9277
<pwcdataset url="https://paperswithcode.com/dataset/mlqa">MLQA</pwcdataset>
9278 9278
<pwcdataset url="https://paperswithcode.com/dataset/multinli">MultiNLI</pwcdataset>
9279-
<pwcdataset url="https://paperswithcode.com/dataset/paracrawl">ParaCrawl</pwcdataset>
9280 9279
<pwcdataset url="https://paperswithcode.com/dataset/squad">SQuAD</pwcdataset>
9281 9280
<pwcdataset url="https://paperswithcode.com/dataset/xnli">XNLI</pwcdataset>
9282 9281
<pwcdataset url="https://paperswithcode.com/dataset/xquad">XQuAD</pwcdataset>

data/xml/2020.findings.xml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2036,6 +2036,7 @@
2036 2036
<bibkey>feng-etal-2020-codebert</bibkey>
2037 2037
<pwccode url="https://github.com/microsoft/CodeBERT" additional="true">microsoft/CodeBERT</pwccode>
2038 2038
<pwcdataset url="https://paperswithcode.com/dataset/codesearchnet">CodeSearchNet</pwcdataset>
2039+
<pwcdataset url="https://paperswithcode.com/dataset/manytypes4typescript">ManyTypes4TypeScript</pwcdataset>
2039 2040
</paper>
2040 2041
<paper id="140">
2041 2042
<title><fixed-case>S</fixed-case>tyle<fixed-case>DGPT</fixed-case>: Stylized Response Generation with Pre-trained Language Models</title>

data/xml/2020.ngt.xml

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -276,7 +276,6 @@
276 276
<doi>10.18653/v1/2020.ngt-1.20</doi>
277 277
<video href="http://slideslive.com/38929834"/>
278 278
<bibkey>nagoudi-etal-2020-growing</bibkey>
279-
<pwcdataset url="https://paperswithcode.com/dataset/paracrawl">ParaCrawl</pwcdataset>
280 279
</paper>
281 280
<paper id="21">
282 281
<title>Generating Diverse Translations via Weighted Fine-tuning and Hypotheses Filtering for the <fixed-case>D</fixed-case>uolingo <fixed-case>STAPLE</fixed-case> Task</title>

data/xml/2020.tlt.xml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -167,6 +167,7 @@
167 167
<doi>10.18653/v1/2020.tlt-1.13</doi>
168 168
<video href="https://uni-duesseldorf.sciebo.de/s/wg97bnr5QS7B7CP"/>
169 169
<bibkey>kleiweg-van-noord-2020-alpinograph</bibkey>
170+
<pwccode url="https://github.com/rug-compling/alpinograph" additional="false">rug-compling/alpinograph</pwccode>
170 171
</paper>
171 172
<paper id="14">
172 173
<title>Implementing an End-to-End Treebank-Informed Pipeline for <fixed-case>B</fixed-case>ulgarian</title>

data/xml/2020.wmt.xml

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -248,6 +248,7 @@
248 248
<url hash="f3fb101e">2020.wmt-1.15</url>
249 249
<video href="https://slideslive.com/38939633"/>
250 250
<bibkey>krislauks-pinnis-2020-tilde</bibkey>
251+
<pwcdataset url="https://paperswithcode.com/dataset/paracrawl">ParaCrawl</pwcdataset>
251 252
</paper>
252 253
<paper id="16">
253 254
<title><fixed-case>S</fixed-case>amsung <fixed-case>R</fixed-case>&amp;<fixed-case>D</fixed-case> Institute <fixed-case>P</fixed-case>oland submission to <fixed-case>WMT</fixed-case>20 News Translation Task</title>
@@ -1076,6 +1077,7 @@
1076 1077
<video href="https://slideslive.com/38939678"/>
1077 1078
<bibkey>koehn-etal-2020-findings</bibkey>
1078 1079
<pwcdataset url="https://paperswithcode.com/dataset/ccaligned">CCAligned</pwcdataset>
1080+
<pwcdataset url="https://paperswithcode.com/dataset/paracrawl">ParaCrawl</pwcdataset>
1079 1081
</paper>
1080 1082
<paper id="79">
1081 1083
<title>Findings of the <fixed-case>WMT</fixed-case> 2020 Shared Task on Quality Estimation</title>
@@ -1196,7 +1198,6 @@
1196 1198
<url hash="1120ae61">2020.wmt-1.87</url>
1197 1199
<video href="https://slideslive.com/38939591"/>
1198 1200
<bibkey>corral-saralegi-2020-elhuyar</bibkey>
1199-
<pwcdataset url="https://paperswithcode.com/dataset/paracrawl">ParaCrawl</pwcdataset>
1200 1201
</paper>
1201 1202
<paper id="88">
1202 1203
<title><fixed-case>Y</fixed-case>ereva<fixed-case>NN</fixed-case>’s Systems for <fixed-case>WMT</fixed-case>20 Biomedical Translation Task: The Effect of Fixing Misaligned Sentence Pairs</title>

data/xml/2021.acl.xml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3424,6 +3424,7 @@
3424 3424
<bibkey>mao-etal-2021-lightweight</bibkey>
3425 3425
<pwccode url="https://github.com/Mao-KU/lightweight-crosslingual-sent2vec" additional="false">Mao-KU/lightweight-crosslingual-sent2vec</pwccode>
3426 3426
<pwcdataset url="https://paperswithcode.com/dataset/mldoc">MLDoc</pwcdataset>
3427+
<pwcdataset url="https://paperswithcode.com/dataset/paracrawl">ParaCrawl</pwcdataset>
3427 3428
</paper>
3428 3429
<paper id="227">
3429 3430
<title><fixed-case>ERNIE</fixed-case>-<fixed-case>D</fixed-case>oc: A Retrospective Long-Document Modeling Transformer</title>
@@ -5466,7 +5467,6 @@
5466 5467
<bibkey>liu-etal-2021-element</bibkey>
5467 5468
<revision id="1" href="2021.acl-long.361v1" hash="502ea135"/>
5468 5469
<revision id="2" href="2021.acl-long.361v2" hash="0b78b856" date="2021-08-17">Typo fixes in Abstract, clearer content in Related Work and Conclusion</revision>
5469-
<pwccode url="https://github.com/Lfc1993/EI_ORE" additional="false">Lfc1993/EI_ORE</pwccode>
5470 5470
<pwcdataset url="https://paperswithcode.com/dataset/t-rex">T-REx</pwcdataset>
5471 5471
</paper>
5472 5472
<paper id="362">
