Background of LLMLingua

LLMLingua

Effectively Deliver Information to LLMs via Prompt Compression

Paper: https://arxiv.org/abs/2403.12968

Project Page: https://llmlingua.com/llmlingua2.html

Demo: https://huggingface.co/spaces/microsoft/llmlingua-2

Why LLMLingua-2?

[Figure: LLMLingua-2 one-page overview]

Challenges Encountered in Information Entropy-Based Methods:

  • 🤔  Perplexity or information entropy may be suboptimal for prompt trimming: it is not aligned with the prompt compression objective.
  • 🤖 How can we identify or build a suitable dataset to align the SLM towards effective prompt compression?
  • ➡️  The importance of a token is context-dependent. Causal LMs only leverage unidirectional context, which may fail to capture all essential information within the context.
  • 🔄 How can we design a compression algorithm that effectively leverages the full bidirectional context? (See the sketch after this list.)
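
To make the contrast concrete, below is a minimal sketch of the bidirectional alternative: instead of scoring tokens with a causal LM's left-to-right perplexity, a token-classification encoder sees the context on both sides of every token and predicts a keep probability for it. The model name is the released LLMLingua-2 checkpoint; the top-50% selection rule and the assumption that label index 1 means "keep" are illustrative, not the library's exact logic.

    # Sketch: bidirectional token scoring for compression (illustrative, not the library's code).
    # A causal LM sees only the left context of each token; a token-classification encoder
    # reads both directions before deciding whether a token is worth keeping.
    import torch
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    model_name = "microsoft/llmlingua-2-xlm-roberta-large-meetingbank"  # released LLMLingua-2 checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForTokenClassification.from_pretrained(model_name)

    prompt = "Question: where was ACL 2024 held? The conference took place in Bangkok, Thailand, in August 2024."
    enc = tokenizer(prompt, return_tensors="pt")

    with torch.no_grad():
        logits = model(**enc).logits  # shape [1, seq_len, num_labels]

    # Assumption: label index 1 corresponds to "keep"; check the checkpoint's config to confirm.
    keep_prob = logits.softmax(dim=-1)[0, :, 1]
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())

    # Retain the 50% highest-scoring tokens, preserving their original order.
    k = max(1, len(tokens) // 2)
    keep_idx = sorted(keep_prob.topk(k).indices.tolist())
    print(tokenizer.convert_tokens_to_string([tokens[i] for i in keep_idx]))

In practice, the released llmlingua package wraps this scoring in PromptCompressor (constructed with use_llmlingua2=True) and its compress_prompt method, which additionally handles target compression rates and tokens that must never be dropped.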

Why Data Distillation?

Shortcomings of Existing Text Compression Datasets:

  • 😢  Most text compression datasets are abstractive, which leads to a slow autoregressive generation process and may produce hallucinated content.
  • 🤷‍♂️  Extractive compression datasets such as SentComp and DebateSum are mainly created for the summarization task and often lack detailed information. For prompt compression, as much essential information as possible should be retained (see the annotation sketch after this list).
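
As a rough illustration of what the distillation-and-annotation step produces, the sketch below assigns binary keep/drop labels to the original words by checking whether each one survives into the compressed text. The function name and the exact matching rule are hypothetical simplifications; the paper's annotation algorithm uses a more careful alignment that tolerates reordering, repeats, and minor rewording.

    # Sketch: derive extractive keep/drop labels by aligning a distilled (compressed) text
    # back onto the original prompt's words. Simplified stand-in for the paper's annotation step.
    def annotate_keep_labels(original: str, compressed: str) -> list[tuple[str, int]]:
        comp_words = compressed.split()
        labels = []
        j = 0  # pointer into the compressed text, enforcing left-to-right alignment
        for word in original.split():
            if j < len(comp_words) and word.lower().strip(".,") == comp_words[j].lower().strip(".,"):
                labels.append((word, 1))  # word survives compression -> label "keep"
                j += 1
            else:
                labels.append((word, 0))  # word was dropped -> label "drop"
        return labels

    original = "The meeting, which was held on Tuesday afternoon, approved the annual budget."
    compressed = "meeting Tuesday approved annual budget"
    print(annotate_keep_labels(original, compressed))

Word-level labels of this kind are what a bidirectional token classifier can be trained on, which is why an extractive, faithfulness-preserving dataset matters more here than a summarization-style abstractive one.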

BibTeX

@inproceedings{pan-etal-2024-llmlingua,
    title = "{LLML}ingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression",
    author = "Zhuoshi Pan and Qianhui Wu and Huiqiang Jiang and Menglin Xia and Xufang Luo and Jue Zhang and Qingwei Lin and Victor Ruhle and Yuqing Yang and Chin-Yew Lin and H. Vicky Zhao and Lili Qiu and Dongmei Zhang",
    editor = "Ku, Lun-Wei  and
      Martins, Andre  and
      Srikumar, Vivek",
    booktitle = "Findings of the Association for Computational Linguistics ACL 2024",
    month = aug,
    year = "2024",
    address = "Bangkok, Thailand and virtual meeting",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.findings-acl.57",
    pages = "963--981",
}