LLaVA: Large Language and Vision Assistant: Videos

Building Next-Gen Multimodal Foundation Models for General-Purpose Assistants

LLaVA is an open-source project that collaborates with the research community to advance the state of the art in AI. LLaVA represents the first end-to-end trained large multimodal model (LMM) that achieves impressive chat capabilities in the spirit of the multimodal GPT-4. The LLaVA family continues to grow, supporting more modalities, capabilities, and applications.

Videos

23:42

Research Forum Keynote: Research in the Era of AI

January 30, 2024

Speaker: Peter Lee

Affiliation: Microsoft Research and Incubations