155 | 155 | |[code](https://paperswithcode.com/search?q_meta=&q_type=&q=FlashCheck:+Exploration+of+Efficient+Evidence+Retrieval+for+Fast+Fact-Checking)|0| |
156 | 156 | |[Sim4Rec: Flexible and Extensible Simulator for Recommender Systems for Large-Scale Data](https://doi.org/10.1007/978-3-031-88717-8_33)|Anna Volodkevich, Veronika Ivanova, Alexey Vasilev, Dmitry Bugaychenko, Maxim Savchenko||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Sim4Rec:+Flexible+and+Extensible+Simulator+for+Recommender+Systems+for+Large-Scale+Data)|0| |
157 | 157 | |[The Impact of Mainstream-Driven Algorithms on Recommendations for Children](https://doi.org/10.1007/978-3-031-88714-7_5)|Robin Ungruh, Alejandro Bellogín, Maria Soledad Pera||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=The+Impact+of+Mainstream-Driven+Algorithms+on+Recommendations+for+Children)|0| |
158 | | -|[Rank-DistiLLM: Closing the Effectiveness Gap Between Cross-Encoders and LLMs for Passage Re-ranking](https://doi.org/10.1007/978-3-031-88714-7_31)|Ferdinand Schlatt, Maik Fröbe, Harrisen Scells, Shengyao Zhuang, Bevan Koopman, Guido Zuccon, Benno Stein, Martin Potthast, Matthias Hagen|CSIRO; University of Kassel ScadDS.AI; Bauhaus-Universität Weimar; Friedrich-Schiller-Universität Jena; Leipzig University; University of Queensland|Cross-encoders distilled from large language models (LLMs) are often more effective re-rankers than cross-encoders fine-tuned on manually labeled data. However, distilled models do not match the effectiveness of their teacher LLMs. We hypothesize that this effectiveness gap is due to the fact that previous work has not applied the best-suited methods for fine-tuning cross-encoders on manually labeled data (e.g., hard-negative sampling, deep sampling, and listwise loss functions). To close this gap, we create a new dataset, Rank-DistiLLM. Cross-encoders trained on Rank-DistiLLM achieve the effectiveness of LLMs while being up to 173 times faster and 24 times more memory efficient. Our code and data is available at https://github.com/webis-de/ECIR-25.|从大型语言模型(LLM)蒸馏得到的交叉编码器(cross-encoder)通常比基于人工标注数据微调的交叉编码器具有更强的重排序效果。然而,蒸馏模型始终无法达到其教师LLM的效能水平。我们推测这种效能差距源于先前研究未能充分应用最适合人工标注数据微调交叉编码器的方法(例如:困难负样本采样、深度采样以及列表式损失函数)。为消除这一差距,我们构建了全新数据集Rank-DistiLLM。基于该数据集训练的交叉编码器在效能上可媲美LLM,同时推理速度提升达173倍,内存效率提高24倍。代码与数据集已开源:https://github.com/webis-de/ECIR-25。 |
    | 158 | +|[Rank-DistiLLM: Closing the Effectiveness Gap Between Cross-Encoders and LLMs for Passage Re-ranking](https://doi.org/10.1007/978-3-031-88714-7_31)|Ferdinand Schlatt, Maik Fröbe, Harrisen Scells, Shengyao Zhuang, Bevan Koopman, Guido Zuccon, Benno Stein, Martin Potthast, Matthias Hagen|Leipzig University; CSIRO; University of Kassel ScadDS.AI; University of Queensland; Bauhaus-Universität Weimar; Friedrich-Schiller-Universität Jena|Cross-encoders distilled from large language models (LLMs) are often more effective re-rankers than cross-encoders fine-tuned on manually labeled data. However, distilled models do not match the effectiveness of their teacher LLMs. We hypothesize that this effectiveness gap arises because previous work has not applied the best-suited methods for fine-tuning cross-encoders on manually labeled data (e.g., hard-negative sampling, deep sampling, and listwise loss functions). To close this gap, we create a new dataset, Rank-DistiLLM. Cross-encoders trained on Rank-DistiLLM achieve the effectiveness of LLMs while being up to 173 times faster and 24 times more memory efficient. Our code and data are available at https://github.com/webis-de/ECIR-25.|从大型语言模型(LLM)蒸馏得到的交叉编码器(cross-encoder)通常比基于人工标注数据微调的交叉编码器具有更强的重排序效果。然而,蒸馏模型始终无法达到其教师LLM的效能水平。我们推测这种效能差距源于先前研究未能充分应用最适合人工标注数据微调交叉编码器的方法(例如:困难负样本采样、深度采样以及列表式损失函数)。为消除这一差距,我们构建了全新数据集Rank-DistiLLM。基于该数据集训练的交叉编码器在效能上可媲美LLM,同时推理速度提升达173倍,内存效率提高24倍。代码与数据集已开源:https://github.com/webis-de/ECIR-25。 |
159 | 159 |
|
|
315 | 315 | |[code](https://paperswithcode.com/search?q_meta=&q_type=&q=SAFERec:+Self-Attention+and+Frequency+Enriched+Model+for+Next+Basket+Recommendation)|0| |
316 | 316 | |[Can Generative AI Adequately Protect Queries? Analyzing the Trade-Off Between Privacy Awareness and Retrieval Effectiveness](https://doi.org/10.1007/978-3-031-88714-7_34)|Luca Herranz-Celotti, Blessing Guembe, Giovanni Livraga, Marco Viviani||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Can+Generative+AI+Adequately+Protect+Queries?+Analyzing+the+Trade-Off+Between+Privacy+Awareness+and+Retrieval+Effectiveness)|0| |
317 | | -|[A Test Collection for Dataset Retrieval](https://doi.org/10.1007/978-3-031-88714-7_36)|Nikolay Kolyada, Martin Potthast, Benno Stein|Department of Citizen Science, Institute of Data Science, German Aerospace Center (DLR); Friedrich Schiller University Jena, Department of Mathematics and Computer Science, Heinz Nixdorf Chair for Distributed Information Systems; Department Forest Nature Conservation, Georg-August-Universität Göttingen; Institute of Biology Geobotany and Botanical Garden, Martin Luther University Halle-Wittenberg|Searching for scientific datasets is a prominent task in scholars' daily research practice. A variety of data publishers, archives and data portals offer search applications that allow the discovery of datasets. The evaluation of such dataset retrieval systems requires proper test collections, including questions that reflect real world information needs of scholars, a set of datasets and human judgements assessing the relevance of the datasets to the questions in the benchmark corpus. Unfortunately, only very few test collections exist for a dataset search. In this paper, we introduce the BEF-China test collection, the very first test collection for dataset retrieval in biodiversity research, a research field with an increasing demand in data discovery services. The test collection consists of 14 questions, a corpus of 372 datasets from the BEF-China project and binary relevance judgements provided by a biodiversity expert.|在学者的日常研究实践中,科学数据集的检索是一项重要任务。各类数据出版商、档案库及数据门户网站提供的数据发现应用,使得数据集检索成为可能。这类数据集检索系统的评估需要构建规范的测试集,其中应包含反映学者真实信息需求的问题陈述、候选数据集集合,以及针对基准语料库中数据集与问题相关度的人工标注结果。然而目前可用于数据集检索评估的测试集屈指可数。本文介绍BEF-China测试集——这是生物多样性研究领域首个专门用于数据集检索评估的测试集,该领域对数据发现服务的需求正持续增长。该测试集包含14个检索问题、来自BEF-China项目的372个数据集构成的语料库,以及由生物多样性专家提供的二元相关度判定结果。 |
    | 317 | +|[A Test Collection for Dataset Retrieval](https://doi.org/10.1007/978-3-031-88714-7_36)|Nikolay Kolyada, Martin Potthast, Benno Stein|Department of Citizen Science, Institute of Data Science, German Aerospace Center (DLR); Institute of Biology Geobotany and Botanical Garden, Martin Luther University Halle-Wittenberg; Department Forest Nature Conservation, Georg-August-Universität Göttingen; Friedrich Schiller University Jena, Department of Mathematics and Computer Science, Heinz Nixdorf Chair for Distributed Information Systems|Searching for scientific datasets is a prominent task in scholars' daily research practice. A variety of data publishers, archives and data portals offer search applications that allow the discovery of datasets. The evaluation of such dataset retrieval systems requires proper test collections, including questions that reflect real-world information needs of scholars, a set of datasets and human judgements assessing the relevance of the datasets to the questions in the benchmark corpus. Unfortunately, only very few test collections exist for dataset search. In this paper, we introduce the BEF-China test collection, the very first test collection for dataset retrieval in biodiversity research, a research field with an increasing demand for data discovery services. The test collection consists of 14 questions, a corpus of 372 datasets from the BEF-China project and binary relevance judgements provided by a biodiversity expert.|在学者的日常研究实践中,科学数据集的检索是一项重要任务。各类数据出版商、档案库及数据门户网站提供的数据发现应用,使得数据集检索成为可能。这类数据集检索系统的评估需要构建规范的测试集,其中应包含反映学者真实信息需求的问题陈述、候选数据集集合,以及针对基准语料库中数据集与问题相关度的人工标注结果。然而目前可用于数据集检索评估的测试集屈指可数。本文介绍BEF-China测试集——这是生物多样性研究领域首个专门用于数据集检索评估的测试集,该领域对数据发现服务的需求正持续增长。该测试集包含14个检索问题、来自BEF-China项目的372个数据集构成的语料库,以及由生物多样性专家提供的二元相关度判定结果。 |
318 | 318 |
|
319 | 319 | |[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Test+Collection+for+Dataset+Retrieval)|0| |
320 | 320 | |[E2Rank: Efficient and Effective Layer-Wise Reranking](https://doi.org/10.1007/978-3-031-88714-7_41)|Cesare Campagnano, Antonio Mallia, Jack Pertschuk, Fabrizio Silvestri||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=E2Rank:+Efficient+and+Effective+Layer-Wise+Reranking)|0| |
|