Downloads
Phi-1.5
December 2023
The language model phi-1.5 is a Transformer with 1.3 billion parameters. It was trained using the same data sources as phi-1, augmented with a new data source that consists of various NLP synthetic texts. When assessed against benchmarks testing common…
KITAB Dataset
February 2024
🕮 KITAB is a challenging dataset and a dynamic data collection approach for testing abilities of Large Language Models (LLMs) in answering information retrieval queries with constraint filters. A filtering query with constraints can be of the form “List all books…
FLAML: A Fast Library for AutoML and Tuning
December 2020
FLAML is a Python library designed to automatically produce accurate machine learning models with low computational cost. It frees users from selecting learners and hyperparameters for each learner. FLAML is powered by a new, cost-effective hyperparameter optimization and learner selection…
Phi-3
April 2024
The Phi-3-Mini-128K-Instruct is a 3.8B parameters, lightweight, state-of-the-art open model trained with the Phi-3 datasets that includes both synthetic data and the filtered publicly available websites data with a focus on high-quality and reasoning dense properties. The model belongs to…
VISOR
September 2023
Benchmarking Spatial Relationships in Text-to-Image Generation—Spatial understanding is a fundamental aspect of computer vision and integral for human-level reasoning about images, making it an important component for grounded language understanding. While recent large-scale text-to-image synthesis (T2I) models have shown unprecedented…
ToxiGen
May 2022
Toxic language detection systems often falsely flag text that contains minority group mentions as toxic, as those groups are often the targets of online hate. Such over-reliance on spurious correlations also causes systems to struggle with detecting implicitly toxic language.…