Update metadata from Papers with Code

acl-pwc-bot · web-flow · commit 4f10f91d5f02 · 2022-03-10T02:07:30.000+01:00
diff --git a/data/xml/2020.aacl.xml b/data/xml/2020.aacl.xml
@@ -555,7 +555,6 @@
       <abstract>Pairwise data automatically constructed from weakly supervised signals has been widely used for training deep learning models. Pairwise datasets such as parallel texts can have uneven quality levels overall, but usually contain data subsets that are more useful as learning examples. We present two methods to refine data that are aimed to obtain that kind of subsets in a self-supervised way. Our methods are based on iteratively training dual-encoder models to compute similarity scores. We evaluate our methods on de-noising parallel texts and training neural machine translation models. We find that: (i) The self-supervised refinement achieves most machine translation gains in the first iteration, but following iterations further improve its intrinsic evaluation. (ii) Machine translations can improve the de-noising performance when combined with selection steps. (iii) Our methods are able to reach the performance of a supervised method. Being entirely self-supervised, our methods are well-suited to handle pairwise data without the need of prior knowledge or human annotations.</abstract>
       <url hash="d0e82ad5">2020.aacl-main.45</url>
       <bibkey>hernandez-abrego-etal-2020-self</bibkey>
-      <pwcdataset url="https://paperswithcode.com/dataset/paracrawl">ParaCrawl</pwcdataset>
     </paper>
     <paper id="46">
       <title>A Survey of the State of Explainable <fixed-case>AI</fixed-case> for Natural Language Processing</title>
@@ -1532,7 +1531,7 @@
       <abstract>We introduce fairseq S2T, a fairseq extension for speech-to-text (S2T) modeling tasks such as end-to-end speech recognition and speech-to-text translation. It follows fairseq’s careful design for scalability and extensibility. We provide end-to-end workflows from data pre-processing, model training to offline (online) inference. We implement state-of-the-art RNN-based as well as Transformer-based models and open-source detailed training recipes. Fairseq’s machine translation models and language models can be seamlessly integrated into S2T workflows for multi-task learning or transfer learning. Fairseq S2T is available at https://github.com/pytorch/fairseq/tree/master/examples/speech_to_text.</abstract>
       <url hash="ba6e2aa3">2020.aacl-demo.6</url>
       <bibkey>wang-etal-2020-fairseq</bibkey>
-      <pwccode url="https://github.com/pytorch/fairseq/tree/master/examples/speech_to_text" additional="true">pytorch/fairseq</pwccode>
+      <pwccode url="https://github.com/pytorch/fairseq" additional="true">pytorch/fairseq</pwccode>
       <pwcdataset url="https://paperswithcode.com/dataset/librispeech">LibriSpeech</pwcdataset>
       <pwcdataset url="https://paperswithcode.com/dataset/must-c">MuST-C</pwcdataset>
     </paper>
diff --git a/data/xml/2020.acl.xml b/data/xml/2020.acl.xml
@@ -6125,6 +6125,7 @@
       <doi>10.18653/v1/2020.acl-main.417</doi>
       <video href="http://slideslive.com/38928733"/>
       <bibkey>banon-etal-2020-paracrawl</bibkey>
+      <pwccode url="" additional="true"/>
       <pwcdataset url="https://paperswithcode.com/dataset/paracrawl">ParaCrawl</pwcdataset>
     </paper>
     <paper id="418">
@@ -7788,7 +7789,7 @@
       <doi>10.18653/v1/2020.acl-main.528</doi>
       <video href="http://slideslive.com/38928863"/>
       <bibkey>ma-etal-2020-simplify</bibkey>
-      <pwccode url="https://github.com/v-mipeng/LexiconAugmentedNER" additional="false">v-mipeng/LexiconAugmentedNER</pwccode>
+      <pwccode url="https://github.com/v-mipeng/LexiconAugmentedNER" additional="true">v-mipeng/LexiconAugmentedNER</pwccode>
       <pwcdataset url="https://paperswithcode.com/dataset/ontonotes-4-0">OntoNotes 4.0</pwcdataset>
       <pwcdataset url="https://paperswithcode.com/dataset/resume-ner">Resume NER</pwcdataset>
       <pwcdataset url="https://paperswithcode.com/dataset/weibo-ner">Weibo NER</pwcdataset>
diff --git a/data/xml/2020.clinicalnlp.xml b/data/xml/2020.clinicalnlp.xml
@@ -242,6 +242,9 @@
       <bibkey>wen-etal-2020-medal</bibkey>
       <pwccode url="https://github.com/BruceWen120/medal" additional="false">BruceWen120/medal</pwccode>
       <pwcdataset url="https://paperswithcode.com/dataset/medal">MeDAL</pwcdataset>
+      <pwcdataset url="https://paperswithcode.com/dataset/adam">ADAM</pwcdataset>
+      <pwcdataset url="https://paperswithcode.com/dataset/mimic-iii">MIMIC-III</pwcdataset>
+      <pwcdataset url="https://paperswithcode.com/dataset/pubmed">Pubmed</pwcdataset>
     </paper>
     <paper id="16">
       <title>Knowledge Grounded Conversational Symptom Detection with Graph Memory Networks</title>
diff --git a/data/xml/2020.eamt.xml b/data/xml/2020.eamt.xml
@@ -509,7 +509,6 @@
       <abstract>With official status in both Ireland and the EU, there is a need for high-quality English-Irish (EN-GA) machine translation (MT) systems which are suitable for use in a professional translation environment. While we have seen recent research on improving both statistical MT and neural MT for the EN-GA pair, the results of such systems have always been reported using automatic evaluation metrics. This paper provides the first human evaluation study of EN-GA MT using professional translators and in-domain (public administration) data for a more accurate depiction of the translation quality available via MT.</abstract>
       <url hash="d59b6dd8">2020.eamt-1.46</url>
       <bibkey>dowling-etal-2020-human</bibkey>
-      <pwcdataset url="https://paperswithcode.com/dataset/paracrawl">ParaCrawl</pwcdataset>
     </paper>
     <paper id="47">
       <title>Machine Translation Quality: A comparative evaluation of <fixed-case>SMT</fixed-case>, <fixed-case>NMT</fixed-case> and tailored-<fixed-case>NMT</fixed-case> outputs</title>
diff --git a/data/xml/2020.emnlp.xml b/data/xml/2020.emnlp.xml
@@ -4217,6 +4217,7 @@
       <doi>10.18653/v1/2020.emnlp-main.281</doi>
       <video href="https://slideslive.com/38938810"/>
       <bibkey>he-etal-2020-amalgamating</bibkey>
+      <pwccode url="https://github.com/siat-nlp/TTOS" additional="false">siat-nlp/TTOS</pwccode>
     </paper>
     <paper id="282">
       <title>Task-oriented Domain-specific Meta-Embedding for Text Classification</title>
@@ -6550,7 +6551,7 @@
       <doi>10.18653/v1/2020.emnlp-main.438</doi>
       <video href="https://slideslive.com/38938985"/>
       <bibkey>hardalov-etal-2020-exams</bibkey>
-      <pwccode url="https://github.com/mhardalov/exams-qa" additional="false">mhardalov/exams-qa</pwccode>
+      <pwccode url="https://github.com/mhardalov/exams-qa" additional="true">mhardalov/exams-qa</pwccode>
       <pwcdataset url="https://paperswithcode.com/dataset/exams">EXAMS</pwcdataset>
       <pwcdataset url="https://paperswithcode.com/dataset/arc">ARC</pwcdataset>
       <pwcdataset url="https://paperswithcode.com/dataset/race">RACE</pwcdataset>
@@ -7181,7 +7182,6 @@
       <video href="https://slideslive.com/38939183"/>
       <bibkey>el-kishky-etal-2020-ccaligned</bibkey>
       <pwcdataset url="https://paperswithcode.com/dataset/ccaligned">CCAligned</pwcdataset>
-      <pwcdataset url="https://paperswithcode.com/dataset/paracrawl">ParaCrawl</pwcdataset>
       <pwcdataset url="https://paperswithcode.com/dataset/wikimatrix">WikiMatrix</pwcdataset>
     </paper>
     <paper id="481">
@@ -7473,7 +7473,7 @@
       <doi>10.18653/v1/2020.emnlp-main.498</doi>
       <video href="https://slideslive.com/38938695"/>
       <bibkey>garg-ramakrishnan-2020-bae</bibkey>
-      <pwccode url="https://github.com/QData/TextAttack/blob/master/textattack/attack_recipes/bae_garg_2019.py" additional="true">QData/TextAttack</pwccode>
+      <pwccode url="https://github.com/QData/TextAttack" additional="true">QData/TextAttack</pwccode>
       <pwcdataset url="https://paperswithcode.com/dataset/mpqa-opinion-corpus">MPQA Opinion Corpus</pwcdataset>
     </paper>
     <paper id="499">
@@ -9276,7 +9276,6 @@
       <pwcdataset url="https://paperswithcode.com/dataset/esxnli">esXNLI</pwcdataset>
       <pwcdataset url="https://paperswithcode.com/dataset/mlqa">MLQA</pwcdataset>
       <pwcdataset url="https://paperswithcode.com/dataset/multinli">MultiNLI</pwcdataset>
-      <pwcdataset url="https://paperswithcode.com/dataset/paracrawl">ParaCrawl</pwcdataset>
       <pwcdataset url="https://paperswithcode.com/dataset/squad">SQuAD</pwcdataset>
       <pwcdataset url="https://paperswithcode.com/dataset/xnli">XNLI</pwcdataset>
       <pwcdataset url="https://paperswithcode.com/dataset/xquad">XQuAD</pwcdataset>
diff --git a/data/xml/2020.findings.xml b/data/xml/2020.findings.xml
@@ -2036,6 +2036,7 @@
       <bibkey>feng-etal-2020-codebert</bibkey>
       <pwccode url="https://github.com/microsoft/CodeBERT" additional="true">microsoft/CodeBERT</pwccode>
       <pwcdataset url="https://paperswithcode.com/dataset/codesearchnet">CodeSearchNet</pwcdataset>
+      <pwcdataset url="https://paperswithcode.com/dataset/manytypes4typescript">ManyTypes4TypeScript</pwcdataset>
     </paper>
     <paper id="140">
       <title><fixed-case>S</fixed-case>tyle<fixed-case>DGPT</fixed-case>: Stylized Response Generation with Pre-trained Language Models</title>
diff --git a/data/xml/2020.ngt.xml b/data/xml/2020.ngt.xml
@@ -276,7 +276,6 @@
       <doi>10.18653/v1/2020.ngt-1.20</doi>
       <video href="http://slideslive.com/38929834"/>
       <bibkey>nagoudi-etal-2020-growing</bibkey>
-      <pwcdataset url="https://paperswithcode.com/dataset/paracrawl">ParaCrawl</pwcdataset>
     </paper>
     <paper id="21">
       <title>Generating Diverse Translations via Weighted Fine-tuning and Hypotheses Filtering for the <fixed-case>D</fixed-case>uolingo <fixed-case>STAPLE</fixed-case> Task</title>
diff --git a/data/xml/2020.tlt.xml b/data/xml/2020.tlt.xml
@@ -167,6 +167,7 @@
       <doi>10.18653/v1/2020.tlt-1.13</doi>
       <video href="https://uni-duesseldorf.sciebo.de/s/wg97bnr5QS7B7CP"/>
       <bibkey>kleiweg-van-noord-2020-alpinograph</bibkey>
+      <pwccode url="https://github.com/rug-compling/alpinograph" additional="false">rug-compling/alpinograph</pwccode>
     </paper>
     <paper id="14">
       <title>Implementing an End-to-End Treebank-Informed Pipeline for <fixed-case>B</fixed-case>ulgarian</title>
diff --git a/data/xml/2020.wmt.xml b/data/xml/2020.wmt.xml
@@ -248,6 +248,7 @@
       <url hash="f3fb101e">2020.wmt-1.15</url>
       <video href="https://slideslive.com/38939633"/>
       <bibkey>krislauks-pinnis-2020-tilde</bibkey>
+      <pwcdataset url="https://paperswithcode.com/dataset/paracrawl">ParaCrawl</pwcdataset>
     </paper>
     <paper id="16">
       <title><fixed-case>S</fixed-case>amsung <fixed-case>R</fixed-case>&amp;<fixed-case>D</fixed-case> Institute <fixed-case>P</fixed-case>oland submission to <fixed-case>WMT</fixed-case>20 News Translation Task</title>
@@ -1076,6 +1077,7 @@
       <video href="https://slideslive.com/38939678"/>
       <bibkey>koehn-etal-2020-findings</bibkey>
       <pwcdataset url="https://paperswithcode.com/dataset/ccaligned">CCAligned</pwcdataset>
+      <pwcdataset url="https://paperswithcode.com/dataset/paracrawl">ParaCrawl</pwcdataset>
     </paper>
     <paper id="79">
       <title>Findings of the <fixed-case>WMT</fixed-case> 2020 Shared Task on Quality Estimation</title>
@@ -1196,7 +1198,6 @@
       <url hash="1120ae61">2020.wmt-1.87</url>
       <video href="https://slideslive.com/38939591"/>
       <bibkey>corral-saralegi-2020-elhuyar</bibkey>
-      <pwcdataset url="https://paperswithcode.com/dataset/paracrawl">ParaCrawl</pwcdataset>
     </paper>
     <paper id="88">
       <title><fixed-case>Y</fixed-case>ereva<fixed-case>NN</fixed-case>’s Systems for <fixed-case>WMT</fixed-case>20 Biomedical Translation Task: The Effect of Fixing Misaligned Sentence Pairs</title>
diff --git a/data/xml/2021.acl.xml b/data/xml/2021.acl.xml
@@ -3424,6 +3424,7 @@
       <bibkey>mao-etal-2021-lightweight</bibkey>
       <pwccode url="https://github.com/Mao-KU/lightweight-crosslingual-sent2vec" additional="false">Mao-KU/lightweight-crosslingual-sent2vec</pwccode>
       <pwcdataset url="https://paperswithcode.com/dataset/mldoc">MLDoc</pwcdataset>
+      <pwcdataset url="https://paperswithcode.com/dataset/paracrawl">ParaCrawl</pwcdataset>
     </paper>
     <paper id="227">
       <title><fixed-case>ERNIE</fixed-case>-<fixed-case>D</fixed-case>oc: A Retrospective Long-Document Modeling Transformer</title>
@@ -5466,7 +5467,6 @@
       <bibkey>liu-etal-2021-element</bibkey>
       <revision id="1" href="2021.acl-long.361v1" hash="502ea135"/>
       <revision id="2" href="2021.acl-long.361v2" hash="0b78b856" date="2021-08-17">Typo fixes in Abstract, clearer content in Related Work and Conclusion</revision>
-      <pwccode url="https://github.com/Lfc1993/EI_ORE" additional="false">Lfc1993/EI_ORE</pwccode>
       <pwcdataset url="https://paperswithcode.com/dataset/t-rex">T-REx</pwcdataset>
     </paper>
     <paper id="362">
diff --git a/data/xml/2021.dialdoc.xml b/data/xml/2021.dialdoc.xml
@@ -61,6 +61,7 @@
       <url hash="6fbc33af">2021.dialdoc-1.3</url>
       <doi>10.18653/v1/2021.dialdoc-1.3</doi>
       <bibkey>wang-etal-2021-template</bibkey>
+      <pwccode url="https://github.com/wdimmy/THPN" additional="false">wdimmy/THPN</pwccode>
       <pwcdataset url="https://paperswithcode.com/dataset/dialogue-state-tracking-challenge">Dialogue State Tracking Challenge</pwcdataset>
       <pwcdataset url="https://paperswithcode.com/dataset/babi-1">bAbI</pwcdataset>
     </paper>
diff --git a/data/xml/2021.emnlp.xml b/data/xml/2021.emnlp.xml
@@ -1444,7 +1444,6 @@
       <doi>10.18653/v1/2021.emnlp-main.99</doi>
       <pwccode url="https://github.com/machelreid/afromt" additional="false">machelreid/afromt</pwccode>
       <pwcdataset url="https://paperswithcode.com/dataset/opensubtitles">OpenSubtitles</pwcdataset>
-      <pwcdataset url="https://paperswithcode.com/dataset/paracrawl">ParaCrawl</pwcdataset>
     </paper>
     <paper id="100">
       <title>Evaluating the Evaluation Metrics for Style Transfer: A Case Study in Multilingual Formality Transfer</title>
@@ -3899,6 +3898,7 @@
       <url hash="c461d256">2021.emnlp-main.268</url>
       <bibkey>vu-etal-2021-generalised</bibkey>
       <doi>10.18653/v1/2021.emnlp-main.268</doi>
+      <pwccode url="https://github.com/trangvu/guda" additional="false">trangvu/guda</pwccode>
     </paper>
     <paper id="269">
       <title><fixed-case>STANKER</fixed-case>: Stacking Network based on Level-grained Attention-masked <fixed-case>BERT</fixed-case> for Rumor Detection on Social Media</title>
@@ -5469,7 +5469,7 @@
       <url hash="5af48518">2021.emnlp-main.373</url>
       <bibkey>kobayashi-etal-2021-incorporating</bibkey>
       <doi>10.18653/v1/2021.emnlp-main.373</doi>
-      <pwccode url="https://github.com/gorokoba560/norm-analysis-of-transformer" additional="false">gorokoba560/norm-analysis-of-transformer</pwccode>
+      <pwccode url="https://github.com/gorokoba560/norm-analysis-of-transformer" additional="true">gorokoba560/norm-analysis-of-transformer</pwccode>
       <pwcdataset url="https://paperswithcode.com/dataset/conll-2003">CoNLL-2003</pwcdataset>
       <pwcdataset url="https://paperswithcode.com/dataset/sst">SST</pwcdataset>
     </paper>
@@ -9822,6 +9822,7 @@
       <url hash="9f3a7029">2021.emnlp-main.674</url>
       <bibkey>berard-etal-2021-efficient</bibkey>
       <doi>10.18653/v1/2021.emnlp-main.674</doi>
+      <pwcdataset url="https://paperswithcode.com/dataset/paracrawl">ParaCrawl</pwcdataset>
     </paper>
     <paper id="675">
       <title>Role of <fixed-case>L</fixed-case>anguage <fixed-case>R</fixed-case>elatedness in <fixed-case>M</fixed-case>ultilingual <fixed-case>F</fixed-case>ine-tuning of <fixed-case>L</fixed-case>anguage <fixed-case>M</fixed-case>odels: <fixed-case>A</fixed-case> <fixed-case>C</fixed-case>ase <fixed-case>S</fixed-case>tudy in <fixed-case>I</fixed-case>ndo-<fixed-case>A</fixed-case>ryan <fixed-case>L</fixed-case>anguages</title>
@@ -10375,6 +10376,7 @@
       <url hash="0a04bf62">2021.emnlp-main.710</url>
       <bibkey>hardalov-etal-2021-cross</bibkey>
       <doi>10.18653/v1/2021.emnlp-main.710</doi>
+      <pwccode url="https://github.com/checkstep/mole-stance" additional="false">checkstep/mole-stance</pwccode>
     </paper>
     <paper id="711">
       <title>Text <fixed-case>A</fixed-case>uto<fixed-case>A</fixed-case>ugment: Learning Compositional Augmentation Policy for Text Classification</title>
diff --git a/data/xml/2021.eval4nlp.xml b/data/xml/2021.eval4nlp.xml
@@ -211,6 +211,7 @@
       <url hash="d3526cbe">2021.eval4nlp-1.16</url>
       <bibkey>leiter-2021-reference</bibkey>
       <doi>10.18653/v1/2021.eval4nlp-1.16</doi>
+      <pwccode url="https://github.com/gringham/wordandsentscoresfromtokenmatching" additional="false">gringham/wordandsentscoresfromtokenmatching</pwccode>
       <pwcdataset url="https://paperswithcode.com/dataset/mlqe-pe">MLQE-PE</pwcdataset>
     </paper>
     <paper id="17">
diff --git a/data/xml/2021.findings.xml b/data/xml/2021.findings.xml
@@ -10018,6 +10018,7 @@
       <bibkey>ahia-etal-2021-low-resource</bibkey>
       <doi>10.18653/v1/2021.findings-emnlp.282</doi>
       <pwccode url="https://github.com/orevaahia/mc4lrnmt" additional="false">orevaahia/mc4lrnmt</pwccode>
+      <pwcdataset url="https://paperswithcode.com/dataset/paracrawl">ParaCrawl</pwcdataset>
     </paper>
     <paper id="283">
       <title>Transformer over Pre-trained Transformer for Neural Text Segmentation with Enhanced Topic Coherence</title>
diff --git a/data/xml/2021.insights.xml b/data/xml/2021.insights.xml
@@ -165,6 +165,7 @@
       <url hash="93b16635">2021.insights-1.12</url>
       <bibkey>bogoychev-chen-2021-highs</bibkey>
       <doi>10.18653/v1/2021.insights-1.12</doi>
+      <pwccode url="https://github.com/marian-nmt/marian" additional="false">marian-nmt/marian</pwccode>
     </paper>
     <paper id="13">
       <title>Backtranslation in Neural Morphological Inflection</title>
diff --git a/data/xml/2021.paclic.xml b/data/xml/2021.paclic.xml
@@ -539,6 +539,7 @@
       <pages>546–554</pages>
       <url hash="ecba50bd">2021.paclic-1.58</url>
       <bibkey>tran-etal-2021-vivqa-vietnamese</bibkey>
+      <pwccode url="https://github.com/khanhtran0412/vivqa" additional="false">khanhtran0412/vivqa</pwccode>
       <pwcdataset url="https://paperswithcode.com/dataset/coco">COCO</pwcdataset>
       <pwcdataset url="https://paperswithcode.com/dataset/coco-qa">COCO-QA</pwcdataset>
       <pwcdataset url="https://paperswithcode.com/dataset/fm-iqa">FM-IQA</pwcdataset>
diff --git a/data/xml/2021.wmt.xml b/data/xml/2021.wmt.xml
diff --git a/data/xml/N18.xml b/data/xml/N18.xml
diff --git a/data/xml/N19.xml b/data/xml/N19.xml
diff --git a/data/xml/W19.xml b/data/xml/W19.xml