    Repositories list

    • Reasoning-Embedding

      Public
      The official repository of the paper "Do Reasoning Models Enhance Embedding Models?"
      Python
      21300 Updated Feb 20, 2026
    • DARK

      Public
      Code for DARK: Unifying Deductive and Abductive Reasoning in Knowledge Graphs with Masked Diffusion Model
      Python
      0300 Updated Feb 11, 2026
    • CtrlHGen

      Public
      Python
      0210 Updated Feb 11, 2026
    • NGDBench

      Public
      Python
      0100 Updated Feb 9, 2026
    • AtlasKV

      Public
      [ICLR'26] AtlasKV: A scalable, effective, and general way to augment LLMs with billion-scale knowledge graphs using very little GPU memory cost.
      Python
      31410 Updated Jan 27, 2026
    • NAACL

      Public
      The official codebase for our paper "NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems"
      Python
      12410 Updated Jan 21, 2026
    • RelationalIntentionGraph

      Public
      Python
      0100 Updated Jan 18, 2026
    • This repository contains the implementation of AutoSchemaKG, a novel framework for automatic knowledge graph construction that combines schema generation via co…
      Python
      8969860 Updated Jan 14, 2026
    • NewtonBench

      Public
      NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents
      Python
      2013710 Updated Dec 15, 2025
    • MarConf

      Public
      [ACL 2025] Revisiting Epistemic Markers in Confidence Estimation: Can Markers Accurately Reflect Large Language Models' Uncertainty?
      Python
      1801 Updated Nov 25, 2025
    • Python
      32600 Updated Nov 17, 2025
    • privacy

      Public
      HTML
      0200 Updated Nov 17, 2025
    • CritiCal

      Public
      Code for CritiCal: Can Critique Help LLM Uncertainty or Confidence Calibration?
      Python
      0510 Updated Nov 15, 2025
    • MARS

      Public
      Code and dataset for the paper: MARS: Benchmarking the Metaphysical Reasoning Abilities of Language Models with a Multi-task Evaluation Dataset (https://arxiv.o…
      Python
      0600 Updated Nov 10, 2025
    • [EMNLP2025] From Automation to Autonomy: A Survey on Large Language Models in Scientific Discovery
      3629800 Updated Nov 5, 2025
    • [ACL 2024] Implementation for Advancing Abductive Reasoning in Knowledge Graphs through Complex Logical Hypothesis Generation
      Python
      11500 Updated Oct 9, 2025
    • [EMNLP 2025 Wordplay] LLM-Hanabi: Evaluating Multi-Agent Gameplays with Theory-of-Mind and Rationale Inference in Imperfect Information Collaboration Game
      Python
      0200 Updated Oct 4, 2025
    • Official Repository for MASLegalBench.
      Python
      0000 Updated Sep 30, 2025
    • MCIP

      Public
      Python
      21210 Updated Sep 29, 2025
    • Python
      0000 Updated Sep 20, 2025
    • Official Repository for Context Reasoner.
      Python
      0900 Updated Sep 1, 2025
    • MarPT

      Public
      Code for Prospect Theory Fails for LLMs: Instability of Decision-Making under Epistemic Uncertainty
      Python
      0210 Updated Aug 11, 2025
    • CEQA

      Public
      Official Implementation of paper: Complex Query Answering on Eventuality Knowledge Graph with Implicit Logical Constraints
      Python
      11120 Updated Jul 15, 2025
    • FedNGDB

      Public
      Python
      1000 Updated Jul 6, 2025
    • TEGA

      Public
      [ACL 2025] Enhancing Transformers for Generalizable First-Order Logical Entailment
      Python
      0200 Updated May 29, 2025
    • ConKE

      Public
      0110 Updated May 28, 2025
    • Python
      0200 Updated May 28, 2025
    • Source code and data for paper "Patterns Over Principles: The Fragility of Inductive Reasoning in LLMs under Noisy Observations".
      Python
      0100 Updated May 27, 2025
    • [ACL 2025] KnowShiftQA: How Robust are RAG Systems when Textbook Knowledge Shifts in K-12 Education?
      Jupyter Notebook
      0100 Updated May 25, 2025
    • Python
      0000 Updated May 25, 2025