NoLiMa Reveals LLM Performance Drops Beyond 1K-Token Contexts

Recent advancements in Large Language Models (LLMs) have enabled them to process expansive context windows, ranging from 128K to 1M tokens. However, a study titled “NoLiMa: Long-Context Evaluation Beyond Literal Matching” reveals that LLM performance significantly declines when contexts exceed 1,000 tokens. NoLiMa is a benchmark developed to assess LLMs’ ability to retrieve a “needle” from a long context when the question and the needle share minimal lexical overlap, forcing models to infer latent associations rather than rely on literal keyword matching.
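
To make the setup concrete, below is a minimal sketch of a NoLiMa-style probe. The needle/question pair mirrors the paper’s illustrative example, where answering depends on the latent fact that the Semper Opera House is in Dresden; the filler text, the depth grid, and the `ask_model` helper are hypothetical stand-ins for illustration, not part of the benchmark’s actual harness.

```python
# A NoLiMa-style probe: the needle and the question share no keywords,
# so plain lexical matching over the context cannot locate the answer.
# `ask_model` is a hypothetical callable wrapping any chat-completion API.

FILLER = "The afternoon light settled over the quiet town square. " * 400

NEEDLE = "Actually, Yuki lives next to the Semper Opera House."
QUESTION = "Which character has been to Dresden?"
ANSWER = "Yuki"


def build_probe(needle: str, filler: str, depth: float) -> str:
    """Insert the needle at a relative depth (0.0 = start, 1.0 = end)."""
    pos = int(len(filler) * depth)
    # Snap to a sentence boundary so the needle reads naturally in context.
    pos = filler.rfind(". ", 0, pos) + 2
    return filler[:pos] + needle + " " + filler[pos:]


def evaluate(ask_model, depths=(0.1, 0.3, 0.5, 0.7, 0.9)) -> float:
    """Sweep needle placements and return the fraction answered correctly."""
    hits = 0
    for depth in depths:
        context = build_probe(NEEDLE, FILLER, depth)
        prompt = f"{context}\n\nQuestion: {QUESTION}"
        if ANSWER.lower() in ask_model(prompt).lower():
            hits += 1
    return hits / len(depths)
```

Sweeping the needle’s position across the context, as in classic needle-in-a-haystack tests, separates failures caused by placement from failures caused by the missing lexical cue that NoLiMa targets.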