Welcome to GraphRAG using langchain

Transform your documents into searchable knowledge graphs

Overview

This library is an implementation of concepts from the paper:

From Local to Global: A Graph RAG Approach to Query-Focused Summarization

The excerpts below are taken from the paper's companion website: https://microsoft.github.io/graphrag/

GraphRAG is a structured, hierarchical approach to Retrieval Augmented Generation (RAG), as opposed to naive semantic-search approaches using plain text snippets. The GraphRAG process involves extracting a knowledge graph out of raw text, building a community hierarchy, generating summaries for these communities, and then leveraging these structures when performing RAG-based tasks.

There are two main phases in the GraphRAG process:

Indexing

Slice up an input corpus into a series of TextUnits, which act as analyzable units for the rest of the process, and provide fine-grained references in our outputs.
Extract all entities, relationships, and key claims from the TextUnits using an LLM.
Perform a hierarchical clustering of the graph using the Leiden technique.
Generate summaries of each community and its constituents from the bottom-up. This aids in holistic understanding of the dataset.
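
The four indexing steps above can be sketched with a toy, standard-library-only example. This is not the API of this library or of the official implementation: every function name here is hypothetical, the entity extractor is a regular expression standing in for an LLM, connected components stand in for Leiden clustering (with no hierarchy), and the "summary" is just a list of members.

```python
import re
from collections import defaultdict
from itertools import combinations


def split_into_text_units(text, max_words=12):
    """Step 1: slice the corpus into fixed-size word windows (toy TextUnits)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def extract_entities(text_unit):
    """Step 2 stand-in: treat capitalized words as entities (a real pipeline asks an LLM)."""
    return sorted(set(re.findall(r"\b[A-Z][a-z]+\b", text_unit)))


def build_graph(text_units):
    """Link entities that co-occur in the same TextUnit."""
    graph = defaultdict(set)
    for unit in text_units:
        entities = extract_entities(unit)
        for entity in entities:
            graph[entity]  # touch the key so isolated entities still become nodes
        for a, b in combinations(entities, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def find_communities(graph):
    """Step 3 stand-in: connected components, NOT Leiden, and with no hierarchy."""
    seen, communities = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(graph[node] - members)
        seen |= members
        communities.append(sorted(members))
    return communities


def summarize(community):
    """Step 4 stand-in: list the members (a real pipeline asks an LLM)."""
    return "Community of " + ", ".join(community)


corpus = "Alice founded Acme with Bob. Carol joined Initech after leaving Globex."
units = split_into_text_units(corpus, max_words=5)
communities = find_communities(build_graph(units))
summaries = [summarize(c) for c in communities]
```

On this toy corpus the sketch yields three communities: one linking Acme, Alice, and Bob, one linking Carol and Initech, and one containing only Globex. The point is the shape of the data flow (text, then TextUnits, then graph, then communities, then summaries), not the quality of any individual step.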

Query

At query time, these structures are used to provide materials for the LLM context window when answering a question. The primary query modes are:

Global Search for reasoning about holistic questions about the corpus by leveraging the community summaries.
Local Search for reasoning about specific entities by fanning-out to their neighbors and associated concepts.
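
The difference between the two query modes can be illustrated with a small sketch, again using only the standard library. The function names, the hand-written graph, and the summaries are all hypothetical; in particular, the word-overlap score is a stand-in for the LLM-based relevance judgment a real Global Search would use, and neither function reflects this library's actual API.

```python
def local_search_context(graph, entity, hops=1):
    """Local Search idea: fan out from one entity to neighbors up to `hops` edges away."""
    frontier, visited = {entity}, {entity}
    for _ in range(hops):
        frontier = {n for node in frontier for n in graph.get(node, ())} - visited
        visited |= frontier
    return sorted(visited)


def global_search_context(community_summaries, question, top_k=2):
    """Global Search idea: pick community summaries relevant to the question.

    Word overlap is an assumed stand-in for LLM scoring of each summary.
    """
    question_words = set(question.lower().split())
    scored = [
        (len(question_words & set(summary.lower().split())), summary)
        for summary in community_summaries
    ]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [summary for score, summary in ranked[:top_k] if score > 0]


# Hypothetical entity graph (adjacency sets) and community summaries.
graph = {
    "Alice": {"Acme", "Bob"},
    "Acme": {"Alice"},
    "Bob": {"Alice", "Dave"},
    "Dave": {"Bob"},
}
summaries = [
    "Acme is a startup founded by Alice and Bob",
    "Dave maintains the billing service",
]

local_context = local_search_context(graph, "Alice")
global_context = global_search_context(summaries, "which startup did alice found")
```

Local Search starts from a named entity and gathers what is near it in the graph, so it suits questions about specific things. Global Search never touches individual entities here; it works from the community summaries, which is why it suits questions about the corpus as a whole.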

Differences from the official implementation

There is an official implementation of the paper available at https://github.com/microsoft/graphrag

The main differences are:

Usage of langchain as the foundation
Support for LLMs and Embedding models other than the ones provided by Azure OpenAI
Focus on modularity, readability, and extensibility
No assumption of a particular workflow engine; that choice is left to the application

Installation

pip install langchain-graphrag
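
To confirm that the install succeeded, one option is to query the package metadata from Python. This sketch uses only the standard library and the distribution name from the pip command above; the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution_name):
    """Return the installed version string, or None if the distribution is missing."""
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("langchain-graphrag"))
```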

Documentation

1. Architecture Overview

Understand how GraphRAG works and when to use Local vs Global search

2. Indexing Pipeline

How to build knowledge graphs from your documents, with technical implementation details

3. Query System

Local Search vs Global Search with practical examples

4. Data Flow & Examples

Real data transformations through each pipeline step with actual JSON examples

5. Advanced Examples

Jupyter notebooks for component-level customization and development