chore: update confs

actions-user · actions-user · commit 4029eb7485ad · 2025-09-26T10:19:40.000Z
diff --git a/arxiv.json b/arxiv.json
@@ -55788,5 +55788,40 @@
         "pub_date": "2025-09-24",
         "summary": "Managing scan protocols in Computed Tomography (CT), which includes adjusting acquisition parameters or configuring reconstructions, as well as selecting postprocessing tools in a patient-specific manner, is time-consuming and requires clinical as well as technical expertise. At the same time, we observe an increasing shortage of skilled workforce in radiology. To address this issue, a Large Language Model (LLM)-based agent framework is proposed to assist with the interpretation and execution of protocol configuration requests given in natural language or a structured, device-independent format, aiming to improve the workflow efficiency and reduce technologists' workload. The agent combines in-context-learning, instruction-following, and structured toolcalling abilities to identify relevant protocol elements and apply accurate modifications. In a systematic evaluation, experimental results indicate that the agent can effectively retrieve protocol components, generate device compatible protocol definition files, and faithfully implement user requests. Despite demonstrating feasibility in principle, the approach faces limitations regarding syntactic and semantic validity due to lack of a unified device API, and challenges with ambiguous or complex requests. In summary, the findings show a clear path towards LLM-based agents for supporting scan protocol management in CT imaging.",
         "translated": "在计算机断层扫描（CT）中，扫描协议管理（包括调整采集参数或配置重建方案）以及根据患者具体情况选择后处理工具，不仅耗时且需要临床与技术双重专业知识。与此同时，我们观察到放射学领域熟练劳动力日益短缺。为解决这一问题，本研究提出基于大语言模型（LLM）的智能体框架，通过解析自然语言或结构化、设备无关的格式给出的协议配置请求，旨在提升工作流效率并减轻技师工作负担。该智能体融合情境学习、指令跟随与结构化工具调用能力，可精准识别相关协议要素并执行修改操作。系统性评估实验表明，该智能体能够有效检索协议组件、生成设备兼容的协议定义文件，并准确实现用户需求。尽管该方法在原理上验证了可行性，但由于缺乏统一的设备API接口，其在语法和语义有效性方面仍存在局限，同时面临模糊或复杂请求的处理挑战。总体而言，本研究为基于LLM的智能体支持CT扫描协议管理指明了清晰的发展路径。"
+    },
+    {
+        "title": "Interactive Recommendation Agent with Active User Commands",
+        "url": "http://arxiv.org/abs/2509.21317v1",
+        "pub_date": "2025-09-25",
+        "summary": "Traditional recommender systems rely on passive feedback mechanisms that limit users to simple choices such as like and dislike. However, these coarse-grained signals fail to capture users' nuanced behavior motivations and intentions. In turn, current systems also cannot distinguish which specific item attributes drive user satisfaction or dissatisfaction, resulting in inaccurate preference modeling. These fundamental limitations create a persistent gap between user intentions and system interpretations, ultimately undermining user satisfaction and harming system effectiveness. To address these limitations, we introduce the Interactive Recommendation Feed (IRF), a pioneering paradigm that enables natural language commands within mainstream recommendation feeds. Unlike traditional systems that confine users to passive implicit behavioral influence, IRF empowers active explicit control over recommendation policies through real-time linguistic commands. To support this paradigm, we develop RecBot, a dual-agent architecture where a Parser Agent transforms linguistic expressions into structured preferences and a Planner Agent dynamically orchestrates adaptive tool chains for on-the-fly policy adjustment. To enable practical deployment, we employ simulation-augmented knowledge distillation to achieve efficient performance while maintaining strong reasoning capabilities. Through extensive offline and long-term online experiments, RecBot shows significant improvements in both user satisfaction and business outcomes.",
+        "translated": "传统推荐系统依赖被动反馈机制，仅允许用户进行\"喜欢/不喜欢\"等简单选择。然而这类粗粒度信号无法捕捉用户细粒度的行为动机与意图，导致现有系统难以区分驱动用户满意度的具体物品属性，进而造成偏好建模失准。这些根本性局限在用户意图与系统解读之间形成持续性鸿沟，最终既损害用户体验又影响系统效能。为突破这些局限，我们推出交互式推荐信息流——一种可在主流推荐流中实现自然语言指令的新型范式。与传统系统将用户局限于被动隐式行为影响不同，IRF通过实时语言指令赋予用户对推荐策略的主动显式控制权。为支撑该范式，我们开发了RecBot双智能体架构：解析智能体将语言表达转化为结构化偏好，规划智能体则动态编排自适应工具链以实现实时策略调整。为实现实际部署，我们采用仿真增强的知识蒸馏技术，在保持强大推理能力的同时实现高效性能。经大规模离线实验与长期在线测试，RecBot在用户满意度和商业指标上均展现出显著提升。"
+    },
+    {
+        "title": "Query-Centric Graph Retrieval Augmented Generation",
+        "url": "http://arxiv.org/abs/2509.21237v1",
+        "pub_date": "2025-09-25",
+        "summary": "Graph-based retrieval-augmented generation (RAG) enriches large language models (LLMs) with external knowledge for long-context understanding and multi-hop reasoning, but existing methods face a granularity dilemma: fine-grained entity-level graphs incur high token costs and lose context, while coarse document-level graphs fail to capture nuanced relations. We introduce QCG-RAG, a query-centric graph RAG framework that enables query-granular indexing and multi-hop chunk retrieval. Our query-centric approach leverages Doc2Query and Doc2Query-- to construct query-centric graphs with controllable granularity, improving graph quality and interpretability. A tailored multi-hop retrieval mechanism then selects relevant chunks via the generated queries. Experiments on LiHuaWorld and MultiHop-RAG show that QCG-RAG consistently outperforms prior chunk-based and graph-based RAG methods in question answering accuracy, establishing a new paradigm for multi-hop reasoning.",
+        "translated": "基于图结构的检索增强生成（RAG）通过外部知识库增强大语言模型的长文本理解与多跳推理能力，但现有方法面临粒度困境：细粒度实体级图会带来高昂的标记成本且丢失上下文，而粗粒度文档级图难以捕捉细微关系。我们提出QCG-RAG——一种以查询为中心的图结构RAG框架，实现查询粒度的索引构建与多跳文本块检索。该框架利用Doc2Query和Doc2Query--技术构建粒度可控的查询中心图，提升图质量与可解释性，进而通过定制化的多跳检索机制基于生成查询筛选相关文本块。在LiHuaWorld和MultiHop-RAG数据集上的实验表明，QCG-RAG在问答准确率上持续优于现有基于文本块和图结构的RAG方法，为多跳推理建立了新范式。"
+    },
+    {
+        "title": "SGMem: Sentence Graph Memory for Long-Term Conversational Agents",
+        "url": "http://arxiv.org/abs/2509.21212v1",
+        "pub_date": "2025-09-25",
+        "summary": "Long-term conversational agents require effective memory management to handle dialogue histories that exceed the context window of large language models (LLMs). Existing methods based on fact extraction or summarization reduce redundancy but struggle to organize and retrieve relevant information across different granularities of dialogue and generated memory. We introduce SGMem (Sentence Graph Memory), which represents dialogue as sentence-level graphs within chunked units, capturing associations across turn-, round-, and session-level contexts. By combining retrieved raw dialogue with generated memory such as summaries, facts and insights, SGMem supplies LLMs with coherent and relevant context for response generation. Experiments on LongMemEval and LoCoMo show that SGMem consistently improves accuracy and outperforms strong baselines in long-term conversational question answering.",
+        "translated": "长期对话智能体需要有效的记忆管理机制，以处理超出大语言模型（LLM）上下文窗口的对话历史。现有基于事实提取或摘要生成的方法虽能减少冗余信息，但难以跨对话与生成记忆的不同粒度有效组织和检索相关信息。我们提出SGMem（语句图记忆）方法，通过在分块单元内构建语句级图结构来表征对话，捕捉轮次、回合及会话级上下文之间的关联。通过将检索到的原始对话与摘要、事实、洞见等生成记忆相结合，SGMem能为大语言模型提供连贯且相关的上下文以生成响应。在LongMemEval和LoCoMo数据集上的实验表明，SGMem能持续提升长期对话问答的准确率，并显著优于现有强基线模型。"
+    },
+    {
+        "title": "Adoption, usability and perceived clinical value of a UK AI clinical reference platform (iatroX): a mixed-methods formative evaluation of real-world usage and a 1,223-respondent user survey",
+        "url": "http://arxiv.org/abs/2509.21188v1",
+        "pub_date": "2025-09-25",
+        "summary": "Clinicians face growing information overload from biomedical literature and guidelines, hindering evidence-based care. Retrieval-augmented generation (RAG) with large language models may provide fast, provenance-linked answers, but requires real-world evaluation. We describe iatroX, a UK-centred RAG-based clinical reference platform, and report early adoption, usability, and perceived clinical value from a formative implementation evaluation. Methods comprised a retrospective analysis of usage across web, iOS, and Android over 16 weeks (8 April-31 July 2025) and an in-product intercept survey. Usage metrics were drawn from web and app analytics with bot filtering. A client-side script randomized single-item prompts to approx. 10% of web sessions from a predefined battery assessing usefulness, reliability, and adoption intent. Proportions were summarized with Wilson 95% confidence intervals; free-text comments underwent thematic content analysis. iatroX reached 19,269 unique web users, 202,660 engagement events, and approx. 40,000 clinical queries. Mobile uptake included 1,960 iOS downloads and Android growth (peak >750 daily active users). The survey yielded 1,223 item-level responses: perceived usefulness 86.2% (95% CI 74.8-93.9%; 50/58); would use again 93.3% (95% CI 68.1-99.8%; 14/15); recommend to a colleague 88.4% (95% CI 75.1-95.9%; 38/43); perceived accuracy 75.0% (95% CI 58.8-87.3%; 30/40); reliability 79.4% (95% CI 62.1-91.3%; 27/34). Themes highlighted speed, guideline-linked answers, and UK specificity. Early real-world use suggests iatroX can mitigate information overload and support timely answers for UK clinicians. Limitations include small per-item samples and early-adopter bias; future work will include accuracy audits and prospective studies on workflow and care quality.",
+        "translated": "临床医生正面临生物医学文献与指南带来的日益严重的信息过载问题，这阻碍了循证医疗的实施。基于大语言模型的检索增强生成技术虽能提供快速且可溯源的答案，但需经过真实场景验证。本文介绍以英国为中心的临床参考平台iatroX，并通过形成性实施评估报告其早期应用情况、可用性及临床价值感知。研究方法包括对16周内网页端、iOS和安卓端使用情况的回顾性分析及产品内拦截调查。使用指标来自经过机器人过滤的网页和应用程序分析数据。通过客户端脚本从预设问题库中随机向约10%的网页会话用户推送单项提问，评估有用性、可靠性和使用意愿。采用威尔逊95%置信区间汇总比例数据，对自由文本评论进行主题内容分析。iatroX覆盖19,269名独立网页用户，产生202,660次交互事件，处理约40,000次临床查询。移动端iOS下载量达1,960次，安卓端日活跃用户峰值超750人。调查获得1,223项有效回复：感知有用性86.2%（95%CI 74.8-93.9%；50/58）；再次使用意愿93.3%（95%CI 68.1-99.8%；14/15）；向同事推荐意愿88.4%（95%CI 75.1-95.9%；38/43）；感知准确性75.0%（95%CI 58.8-87.3%；30/40）；可靠性79.4%（95%CI 62.1-91.3%；27/34）。主题分析突出速度优势、指南关联答案及英国本土化特性。早期真实世界应用表明iatroX可缓解英国临床医生的信息过载问题并提供及时解答。局限性包括单项样本量较小和早期使用者偏差，未来工作将包括准确性审计及对工作流程与护理质量的前瞻性研究。"
+    },
+    {
+        "title": "IntSR: An Integrated Generative Framework for Search and Recommendation",
+        "url": "http://arxiv.org/abs/2509.21179v1",
+        "pub_date": "2025-09-25",
+        "summary": "Generative recommendation has emerged as a promising paradigm, demonstrating remarkable results in both academic benchmarks and industrial applications. However, existing systems predominantly focus on unifying retrieval and ranking while neglecting the integration of search and recommendation (S&R) tasks. What makes search and recommendation different is how queries are formed: search uses explicit user requests, while recommendation relies on implicit user interests. As for retrieval versus ranking, the distinction comes down to whether the queries are the target items themselves. Recognizing the query as the central element, we propose IntSR, an integrated generative framework for S&R. IntSR integrates these disparate tasks using distinct query modalities. It also addresses the increased computational complexity associated with integrated S&R behaviors and the erroneous pattern learning introduced by a dynamically changing corpus. IntSR has been successfully deployed across various scenarios in Amap, leading to substantial improvements in digital asset's GMV (+3.02%), POI recommendation's CTR (+2.76%), and travel mode suggestion's ACC (+5.13%).",
+        "translated": "生成式推荐作为一种新兴范式，在学术基准测试和工业应用中都展现出显著成果。然而现有系统主要聚焦于统一检索与排序阶段，却忽视了搜索与推荐任务的整合。搜索与推荐的核心差异在于查询生成方式：搜索依赖用户显式请求，而推荐基于隐式用户兴趣。至于检索与排序的区别，则在于查询对象是否为目标条目本身。基于查询的核心地位，我们提出IntSR——一个面向搜索推荐任务的统一生成式框架。该框架通过不同的查询模态整合这些异构任务，同时解决了集成搜索推荐行为带来的计算复杂度提升问题，以及动态变化语料库引发的错误模式学习现象。IntSR已成功落地高德地图多个场景，推动数字资产GMV提升3.02%、POI推荐点击率增长2.76%、出行方式建议准确率提高5.13%。"
     }
 ]