News & features
In the news | The Sequence
Making Small Models Great: Achieve GPT-o1 Levels in Math Reasoning with Microsoft rStar-Math
rStar-Math is a novel approach that significantly boosts the mathematical reasoning capabilities of small language models (SLMs). The system enables SLMs to match, and in some cases exceed, the performance of OpenAI’s o1 despite their far smaller size. This…
In the news | Venture Beat
Microsoft’s Differential Transformer cancels attention noise in LLMs
Improving LLMs’ ability to retrieve information from the prompt can benefit key applications such as retrieval-augmented generation (RAG) and in-context learning (ICL). Researchers from Microsoft Research and Tsinghua University have introduced the Differential Transformer (Diff Transformer), a new LLM architecture that amplifies attention to relevant context while…
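The core idea reported for Diff Transformer is to compute two separate softmax attention maps and subtract one from the other, so that noise common to both maps cancels out. A minimal single-head sketch of that subtraction, assuming the (softmax(Q1·K1ᵀ/√d) − λ·softmax(Q2·K2ᵀ/√d))·V formulation; all variable names and the fixed λ are illustrative, not the paper's exact parameterization:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def diff_attention(x, Wq1, Wk1, Wq2, Wk2, Wv, lam=0.5):
    """Differential attention (sketch): subtract a second attention
    map, scaled by lam, to cancel common-mode attention noise.
    x: (seq_len, d_model); each W*: (d_model, d_head)."""
    d = Wq1.shape[1]
    # Two independent query/key projections produce two attention maps.
    a1 = softmax((x @ Wq1) @ (x @ Wk1).T / np.sqrt(d))
    a2 = softmax((x @ Wq2) @ (x @ Wk2).T / np.sqrt(d))
    # Their difference weights the values; lam=0 recovers standard attention.
    return (a1 - lam * a2) @ (x @ Wv)
```

With λ = 0 this reduces to ordinary scaled dot-product attention, which makes the mechanism easy to drop into an existing attention implementation for experimentation.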