@ictnlp

ICTNLP

Natural Language Processing Group, Institute of Computing Technology, Chinese Academy of Sciences

Pinned

  1. LLaMA-Omni Public

    LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

    Python, 3k stars, 197 forks

  2. StreamSpeech Public

    StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

    Python, 1.1k stars, 84 forks

  3. BayLing Public

    BayLing (“百聆”) is a LLaMA-based English/Chinese large language model enhanced with language alignment. It has superior English/Chinese capabilities and reaches 90% of ChatGPT's performance across multiple evaluations, including multilingual and general tasks.

    Python, 317 stars, 19 forks

  4. LLaVA-Mini Public

    LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.

    Python, 509 stars, 24 forks

  5. Auto-RAG Public

    This is the official repository for Auto-RAG.

    Python, 214 stars, 20 forks

  6. FlexRAG Public

    FlexRAG: A RAG Framework for Information Retrieval and Generation.

    Python, 194 stars, 19 forks

Repositories

Showing 10 of 82 repositories
  • Auto-RAG Public

    This is the official repository for Auto-RAG.

    Python 214 Apache-2.0 20 4 0 Updated Jul 15, 2025
  • StreamUni Public
    Python 6 1 0 0 Updated Jul 14, 2025
  • LLaVA-Mini Public

    LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.

    Python 509 Apache-2.0 24 25 0 Updated Jun 29, 2025
  • StreamSpeech Public

    StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

    Python 1,117 MIT 84 13 1 Updated Jun 29, 2025
  • Stream-Omni Public

    Stream-Omni is a GPT-4o-like language-vision-speech chatbot that simultaneously supports interaction across various modality combinations.

    Python 323 GPL-3.0 30 3 0 Updated Jun 17, 2025
  • FlexRAG Public

    FlexRAG: A RAG Framework for Information Retrieval and Generation.

    Python 194 MIT 19 3 1 Updated Jun 17, 2025
  • SLED-TTS Public

    Streamable Text-to-Speech model using a language modeling approach, without vector quantization

    Python 93 5 4 0 Updated May 20, 2025
  • MonoAttn-Transducer Public

    Code for the ICML 2025 paper "Overcoming Non-monotonicity in Transducer-based Streaming Generation".

    Python 11 2 0 0 Updated May 19, 2025
  • LLaMA-Omni2 Public
    Python 203 22 7 1 Updated May 18, 2025
  • LLaMA-Omni Public

    LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

    Python 2,957 Apache-2.0 197 48 1 Updated May 19, 2025
