{"id":1005408,"date":"2024-02-13T12:00:00","date_gmt":"2024-02-13T20:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=1005408"},"modified":"2024-04-02T14:41:02","modified_gmt":"2024-04-02T21:41:02","slug":"graphrag-unlocking-llm-discovery-on-narrative-private-data","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/graphrag-unlocking-llm-discovery-on-narrative-private-data\/","title":{"rendered":"GraphRAG: Unlocking LLM discovery on narrative private data"},"content":{"rendered":"\n

Editor\u2019s note, Apr. 2, 2024 \u2013<\/strong> Figure 1 was updated to clarify the origin of each source.<\/em><\/p>\n\n\n\n

Perhaps the greatest challenge \u2013 and opportunity \u2013 of LLMs is extending their powerful capabilities to problems beyond the data on which they have been trained, and achieving comparable results with data the LLM has never seen. This opens new possibilities in data investigation, such as identifying themes and semantic concepts, with context and grounding drawn from the dataset itself. In this post, we introduce GraphRAG, created by Microsoft Research, as a significant advance in enhancing the capability of LLMs.<\/p>\n\n\n\n
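As background for the retrieval-augmented techniques discussed in this post, the sketch below illustrates the search step that baseline RAG relies on: ranking text chunks by vector similarity to the query. This is a toy illustration only, not Microsoft's implementation; bag-of-words counts stand in for a real embedding model, and the sample chunks and function names are hypothetical.

```python
# Toy sketch of vector-similarity retrieval, the search step used by
# baseline RAG. Bag-of-words counts stand in for a real embedding model;
# in practice an LLM embedding API and a vector index would be used.
from collections import Counter
import math


def embed(text: str) -> Counter:
    # Stand-in for an embedding model: lowercase bag-of-words counts.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank all chunks by similarity to the query; return the top k.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Hypothetical document chunks from a private dataset.
chunks = [
    "GraphRAG builds an LLM-generated knowledge graph over private data.",
    "Baseline RAG retrieves text chunks by vector similarity to the query.",
    "The retrieved chunks are passed to the LLM as grounding context.",
]
top = retrieve("How does baseline RAG search for relevant text?", chunks, k=1)
```

The retrieved chunks would then be placed into the LLM's prompt as grounding context; GraphRAG replaces this flat similarity search with queries over an LLM-generated knowledge graph.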

<p><strong>Publication:<\/strong> <em>Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine<\/em><\/p>\n\n\n\n

Retrieval-Augmented Generation (RAG) is a technique that searches for information relevant to a user query and supplies the results to the LLM as grounding for the answer it generates. RAG is an important component of most LLM-based tools, and the majority of RAG approaches use vector similarity as the search technique. GraphRAG instead uses LLM-generated knowledge graphs, providing substantial improvements in question-and-answer performance when analyzing complex documents. This builds upon our recent research, which points to the power of prompt augmentation when performing discovery on <em>private datasets<\/em>. Here, we define a <em>private dataset<\/em> as data that the LLM was not trained on and has never seen before, such as an enterprise\u2019s proprietary research, business documents, or communications. <em>Baseline RAG<\/em>[1] was created to help solve this problem, but we observe situations where baseline RAG performs very poorly. For example:<\/p>\n\n\n\n