LlamaIndex (Backup No. 14) - .NET 開発基盤部会 Wiki


The .NET 開発基盤部会 Wiki is operated by the Open棟梁 Project and the OSS Consortium .NET 開発基盤部会 (.NET Development Infrastructure Working Group).


Overview †

Stages †

Loading †

Load the text data.

Indexing †

Create an index from the text data.

Storing †

Persist the text data and the index.

Querying †

Search the text data using the index.

Evaluation †

Objectively evaluate the requests and responses of the search.
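
The five stages above can be sketched end to end in plain Python. This is only a toy illustration under simplifying assumptions: it does not use the LlamaIndex API, the "index" is a plain word-to-document table, and every function name (load, build_index, store, query, evaluate) is hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load(texts):
    """Loading: wrap raw strings as simple document records."""
    return [{"id": i, "text": t} for i, t in enumerate(texts)]


def build_index(docs):
    """Indexing: map each lowercase word to the ids of documents containing it."""
    index = {}
    for doc in docs:
        for word in set(doc["text"].lower().split()):
            index.setdefault(word, []).append(doc["id"])
    return index


def store(docs, index, path):
    """Storing: persist documents and index together as one JSON file."""
    Path(path).write_text(json.dumps({"docs": docs, "index": index}), encoding="utf-8")


def query(docs, index, question):
    """Querying: return texts of documents sharing at least one word with the question."""
    hits = set()
    for word in question.lower().split():
        hits.update(index.get(word, []))
    return [docs[i]["text"] for i in sorted(hits)]


def evaluate(results, expected):
    """Evaluation: fraction of expected texts that were actually retrieved (recall)."""
    if not expected:
        return 1.0
    return sum(1 for e in expected if e in results) / len(expected)


docs = load(["LlamaIndex builds indexes", "LangChain chains prompts"])
index = build_index(docs)
with tempfile.TemporaryDirectory() as tmp:
    store(docs, index, Path(tmp) / "store.json")
    saved = json.loads((Path(tmp) / "store.json").read_text(encoding="utf-8"))
results = query(saved["docs"], saved["index"], "what builds indexes")
score = evaluate(results, ["LlamaIndex builds indexes"])
```

In the real library each stage is handled by dedicated components (readers, index classes, storage contexts, query engines, evaluators); the sketch only shows how the stages hand data to one another.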

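Features †

Data acquisition †

In addition to raw text data, the following can be used as data sources:

Files (PDF, ePub, Word, PowerPoint, audio, etc.)
Web services (Notion, Slack, Wikipedia, etc.)

Index creation †

Split the text data into chunks and create an index for retrieving chunks from a query.
Several indexing methods are available (keyword, vector, graph).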
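
As a minimal sketch of the chunking step, the following splits text into fixed-size character chunks with a small overlap. The function name split_into_chunks and its parameters are assumptions made for illustration; real splitters typically work on sentences or tokens rather than raw characters.

```python
def split_into_chunks(text, chunk_size, overlap):
    """Split text into fixed-size character chunks; neighbours share `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


chunks = split_into_chunks("abcdefghij", chunk_size=4, overlap=1)
```

The overlap keeps a little shared context between neighbouring chunks, so a sentence cut at a chunk boundary is still partly visible in both.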

Data storage †

Save the data to stores such as the Vector Store, Document Store, and Index Store.

Vector Store
Document Store
Index Store

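Data retrieval †

Search the data using the index.

Keyword search: extracts keywords from the query and looks up the documents indexed under those keywords.
Vector search: converts the query into a vector as well and runs an approximate nearest neighbor (ANN) search against the document vectors.
Graph search: after a full-text search, follows nodes and edges to find related documents.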
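
The difference between keyword search and vector search can be sketched as follows. This is a toy illustration, not LlamaIndex code: the "embedding" is a bag-of-words count vector standing in for a real embedding model, and the vector search is exhaustive rather than approximate (ANN). All function names are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def keyword_search(chunks, query):
    """Return the chunks that contain at least one query word."""
    words = set(query.lower().split())
    return [c for c in chunks if words & set(c.lower().split())]


def vector_search(chunks, query, top_k=1):
    """Rank chunks by cosine similarity to the query (exhaustive, not approximate)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(embed(c), q), reverse=True)
    return ranked[:top_k]


chunks = ["the cat sat on the mat", "dogs chase the ball", "vector stores hold embeddings"]
keyword_hits = keyword_search(chunks, "cat ball")
vector_hits = vector_search(chunks, "where do embeddings live", top_k=1)
```

An ANN index trades a small amount of accuracy for speed by avoiding the full scan that vector_search performs here.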

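Providers †

1st Party †

Basic libraries that handle each stage.

3rd Party †

Libraries that handle each stage.

Data acquisition: LlamaHub provides a variety of data connectors.

Vectorization and storage
- NoSQL databases: NoSQL databases such as MongoDB and Elasticsearch can be used to store and search data.
- Cloud storage: cloud storage services such as AWS S3 and Cloudflare R2 can be used to store data.
- Vector stores: DeepLake, FAISS, and similar tools provide efficient vector indexing and vector search.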

Details †

For various reasons, reading the official documentation is recommended.

Main features †

Loading †

Reader
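- A general-purpose reader called SimpleDirectoryReader is available.
- Instead of using a Reader, you can also create documents directly.
- In addition, hundreds of data connectors can be downloaded from the LlamaHub registry and used.
- The LlamaCloud connector is presumably a connector to LlamaIndex's own IaaS-style storage.
- Some storage backends offload the indexing process; in that case the Indexing step is unnecessary.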
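
The idea behind a directory reader can be sketched in plain Python. This is a toy stand-in for illustration only: it is not SimpleDirectoryReader, the function read_directory and the dictionary layout of a "document" are assumptions, and only text files with the listed suffixes are handled.

```python
import tempfile
from pathlib import Path


def read_directory(directory, suffixes=(".txt", ".md")):
    """Read every text file under `directory` into a document record with metadata."""
    docs = []
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            docs.append({
                "text": path.read_text(encoding="utf-8"),
                "metadata": {"file_name": path.name},
            })
    return docs


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.txt").write_text("first document", encoding="utf-8")
    Path(tmp, "b.md").write_text("second document", encoding="utf-8")
    Path(tmp, "c.bin").write_bytes(b"\x00\x01")  # skipped: unsupported suffix
    documents = read_directory(tmp)

# Documents can also be created directly, without any reader.
documents.append({"text": "typed in by hand", "metadata": {}})
```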

node_parser
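- In terms of the API, it runs at the same time as Indexing,
- but conceptually it runs after Loading (the Reader).
- A Splitter divides the text into chunks (the API returns Nodes).
- A node_parser appears to be an instance of a Splitter.
- A node_parser can also be run on its own, and it has options such as show_progress.
- It can also be built into a pipeline (IngestionPipeline) to implement complex parsing.
- IngestionPipeline() apparently accepts Splitters, Extractors, Embeddings, and so on.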
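
The pipeline idea described above (splitter, then extractor, then embedding, each transforming a list of nodes) can be sketched as follows. This is a toy model and not the IngestionPipeline API: run_pipeline, the three transformation functions, and the dictionary-based node layout are all hypothetical, and the "embedding" is a placeholder rather than a real model.

```python
def sentence_splitter(nodes):
    """Splitter: turn each node into one node per sentence."""
    out = []
    for node in nodes:
        for sentence in node["text"].split("."):
            if sentence.strip():
                out.append({"text": sentence.strip(), "metadata": dict(node["metadata"])})
    return out


def length_extractor(nodes):
    """Extractor: attach a simple metadata field (word count) to every node."""
    for node in nodes:
        node["metadata"]["words"] = len(node["text"].split())
    return nodes


def toy_embedding(nodes):
    """Embedding: attach a placeholder vector [character count, vowel count]."""
    for node in nodes:
        text = node["text"].lower()
        node["embedding"] = [len(text), sum(text.count(v) for v in "aeiou")]
    return nodes


def run_pipeline(documents, transformations):
    """Apply each transformation in order, feeding the output of one into the next."""
    nodes = [{"text": d, "metadata": {}} for d in documents]
    for transform in transformations:
        nodes = transform(nodes)
    return nodes


nodes = run_pipeline(
    ["Load the data. Split it into nodes."],
    [sentence_splitter, length_extractor, toy_embedding],
)
```

Because every step takes and returns a list of nodes, steps can be reordered, removed, or run on their own, which mirrors the note above that a node_parser can be executed independently.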

Indexing †

Reference: https://docs.llamaindex.ai/en/stable/module_guides/indexing/index_guide/

Basically, keyword, vector, or graph search is used.

Keyword: SummaryIndex, KeywordTableIndex, SQLStructStoreIndex

Vector: VectorStoreIndex

Graph: Graph RAG

KnowledgeGraphIndex: RDF, triplets
PropertyGraphIndex: property graph
TreeIndex: taxonomy?
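
To make the triplet idea concrete, the sketch below stores (subject, predicate, object) triplets as an adjacency list and collects entities within a given number of hops. It is a plain-Python illustration, not the KnowledgeGraphIndex API; the sample triplets and the function related are made up for the example.

```python
from collections import defaultdict

# Hypothetical sample triplets: (subject, predicate, object).
triplets = [
    ("LlamaIndex", "supports", "VectorStoreIndex"),
    ("LlamaIndex", "supports", "PropertyGraphIndex"),
    ("VectorStoreIndex", "uses", "embeddings"),
]

# Adjacency list: subject -> list of (predicate, object) edges.
graph = defaultdict(list)
for subject, predicate, obj in triplets:
    graph[subject].append((predicate, obj))


def related(graph, start, max_hops):
    """Collect entities reachable from `start` within `max_hops` edges (breadth-first)."""
    seen = {start}
    frontier = [start]
    for _hop in range(max_hops):
        next_frontier = []
        for node in frontier:
            for _predicate, obj in graph.get(node, []):
                if obj not in seen:
                    seen.add(obj)
                    next_frontier.append(obj)
        frontier = next_frontier
    return seen - {start}


one_hop = related(graph, "LlamaIndex", max_hops=1)
two_hops = related(graph, "LlamaIndex", max_hops=2)
```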

Storing †

The Document Store, Vector Store, and Index Store are configured through a Storage Context.

Document Store: as mentioned above, Loading reads from the Document Store.

Indexing writes to (persists into) the Vector Store and Index Store.

Because a database usually implements both storage and search,
the Vector Store and Index Store are normally given a Storage Context backed by the same database.

Category	Index	Characteristics	Suitable NoSQL
Keyword	SummaryIndex, KeywordTableIndex, SQLStructStoreIndex	Document-oriented data, metadata management	MongoDB, Elasticsearch, DynamoDB, Cassandra, Firebase Firestore
Vector	VectorStoreIndex	Vector data management (ANN)	Pinecone, Weaviate, Milvus, Qdrant, Redis
Graph	KnowledgeGraphIndex, PropertyGraphIndex, TreeIndex	Graph data (nodes and edges) management	Neo4j, ArangoDB, Amazon Neptune, TigerGraph, JanusGraph
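
The separation into three stores can be sketched with plain JSON files. This is only an illustration of the idea under simple assumptions: the functions persist and load_store, the file names, and the sample data are hypothetical and do not reflect how LlamaIndex lays out its storage.

```python
import json
import tempfile
from pathlib import Path


def persist(persist_dir, docstore, vector_store, index_store):
    """Write each store to its own JSON file under persist_dir."""
    persist_dir = Path(persist_dir)
    persist_dir.mkdir(parents=True, exist_ok=True)
    stores = {"docstore": docstore, "vector_store": vector_store, "index_store": index_store}
    for name, data in stores.items():
        (persist_dir / f"{name}.json").write_text(json.dumps(data), encoding="utf-8")


def load_store(persist_dir, name):
    """Read one store back from its JSON file."""
    return json.loads((Path(persist_dir) / f"{name}.json").read_text(encoding="utf-8"))


docstore = {"n1": "LlamaIndex persists data"}   # node id -> text
vector_store = {"n1": [0.1, 0.9]}               # node id -> embedding
index_store = {"my_index": ["n1"]}              # index id -> node ids

with tempfile.TemporaryDirectory() as tmp:
    persist(tmp, docstore, vector_store, index_store)
    files = sorted(p.name for p in Path(tmp).iterdir())
    restored = {
        name: load_store(tmp, name)
        for name in ("docstore", "vector_store", "index_store")
    }
```

Swapping the JSON files for a database that offers both storage and search corresponds to pointing the stores at the same backend, as described above.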

Querying †

Evaluation †

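Other features †

Building agents †

Building workflows †

Structured data extraction †

Tracing and debugging †

References †

An overview of the RAG framework LlamaIndex
https://zenn.dev/nomhiro/articles/llama-index-abstract

How to run RAG in a local environment with LlamaIndex - Dentsu Soken Tech Blog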
https://tech.dentsusoken.com/entry/2024/01/22/LlamaIndex%E3%82%92%E4%BD%BF%E3%81%A3%E3%81%A6%E3%83%AD%E3%83%BC%E3%82%AB%E3%83%AB%E7%92%B0%E5%A2%83%E3%81%A7RAG%E3%82%92%E5%AE%9F%E8%A1%8C%E3%81%99%E3%82%8B%E6%96%B9%E6%B3%95

Official †

LlamaIndex - LlamaIndex
https://docs.llamaindex.ai/en/stable/

Home †

https://docs.llamaindex.ai/en/stable/

High-Level Concepts
Installation and Setup
How to read these docs
Starter Examples
- Starter Tutorial (OpenAI) - LlamaIndex
  https://docs.llamaindex.ai/en/stable/getting_started/starter_example/
- Starter Tutorial (Local Models) - LlamaIndex
  https://docs.llamaindex.ai/en/stable/getting_started/starter_example_local/
Discover LlamaIndex Video Series
Frequently Asked Questions (FAQ)
Starter Tools

Learn †

https://docs.llamaindex.ai/en/stable/understanding/

Using LLMs
https://docs.llamaindex.ai/en/stable/understanding/rag/

Building a RAG pipeline

Loading & Ingestion
- Loading Data (Ingestion)
  https://docs.llamaindex.ai/en/stable/understanding/loading/loading/
- LlamaHub
  https://docs.llamaindex.ai/en/stable/understanding/loading/llamahub/
- Loading from LlamaCloud
  https://docs.llamaindex.ai/en/stable/understanding/loading/llamacloud/

Indexing & Embedding
https://docs.llamaindex.ai/en/stable/understanding/indexing/indexing/
Storing
https://docs.llamaindex.ai/en/stable/understanding/storing/storing/
Querying
https://docs.llamaindex.ai/en/stable/understanding/querying/querying/

Building an agent
Building Workflows
Structured Data Extraction
Tracing and Debugging
Evaluating
Putting it all Together

Use Cases †

https://docs.llamaindex.ai/en/stable/use_cases/

Prompting
Question-Answering (RAG)
Chatbots
Structured Data Extraction

Examples †

https://docs.llamaindex.ai/en/stable/examples/

Agents
Chat Engines
Cookbooks
Customization

Component Guides †

https://docs.llamaindex.ai/en/stable/module_guides/

Models
Prompts
Loading
Indexing

Unofficial †

LlamaIndex Quick Start Guide | npaka †

Trying out X in LlamaIndex (for v0.10) | Aya* †

① Data Loading
https://note.com/rhe/n/ndf6d42efe273
② Indexing
https://note.com/rhe/n/n27b4ad617226
③ Storing
https://note.com/rhe/n/n852087f2d905
④ Querying
https://note.com/rhe/n/n665a9a24ae17

A tutorial for fully understanding LlamaIndex †

Part 1: Basics, understanding the concepts and flow of processing (for v0.6.8)
https://dev.classmethod.jp/articles/llamaindex-tutorial-001-overview/
Part 1: Basics, understanding the concepts and flow of processing (for v0.7.9)
https://dev.classmethod.jp/articles/llamaindex-tutorial-001-overview-v0-7-9/
Part 2: Customizing text splitting
https://dev.classmethod.jp/articles/llamaindex-tutorial-002-text-splitter/
Part 3: Using CallbackManager to inspect internal behavior and debug
https://dev.classmethod.jp/articles/llamaindex-tutorial-003-callback-manager/
Part 4: How to use embedding vectors with ListIndex
https://dev.classmethod.jp/articles/llamaindex-tutorial-004-listindex-use-embedding-vector/
Part 5: Using TreeIndex and checking how it behaves
https://dev.classmethod.jp/articles/llamaindex-tutorial-005-treeindex/

Other †

Indexing †

Trying out the LlamaIndex module guide: Indexing
https://zenn.dev/kun432/scraps/189e19674125ce

Graph †

Trying out the LlamaIndex module guide: Indexing > Knowledge Graph Index
https://zenn.dev/kun432/scraps/189e19674125ce#comment-4265fb2ee007b4
Trying out LlamaIndex's Property Graph Index
https://zenn.dev/kun432/scraps/b2a363020a79e2
Local-LLM+Knowledge Graph+RAG, RAG series 2/n｜Pes Cafe
https://note.com/pescafe/n/n1759b5f4d41b