RAGfly Python SDK

The Python SDK is the fastest way to connect a Python agent or script to RAGfly. It wraps the REST API and the SSE stream protocol in three methods: ask, search, and list_documents.


Installation

pip install ragfly

Requires Python 3.10+. Single dependency: httpx.


Quick start

from ragfly import RAGfly

client = RAGfly(api_key="slm_live_...")

# End-to-end RAG: retrieves documents and generates a response
resp = client.ask("What are the Q1 sales figures?")
print(resp.answer)

# Token-by-token streaming (iterate over chunks, similar to OpenAI streaming)
for chunk in client.ask("Summarize the active contracts", stream=True):
    print(chunk.delta, end="", flush=True)

# Semantic retrieval only, without going through the LLM
results = client.search("maintenance contracts", limit=5)
for doc in results.documents:
    print(doc.nombre, f"rrf_score={doc.rrf_score:.3f}")
    for chunk in doc.chunks[:2]:
        print(f"  \"{chunk.texto[:100]}…\"")

Method reference

client.ask(question, *, stream=False, conversation_id=None)

Asks a natural-language question over the corpus. Internally, the method creates a temporary conversation (or reuses the one given by conversation_id), sends the message, consumes the SSE stream, and returns the response.

Parameter        Type        Description
question         str         The natural-language question
stream           bool        True → returns Iterator[AskChunk]; False (default) → returns AskResponse
conversation_id  int | None  Reuse an existing conversation to maintain history

Without streaming:

resp = client.ask("What does the Acme contract say?")
print(resp.answer)          # str — full response
print(resp.conversation_id) # int — id of the created conversation

With streaming:

for chunk in client.ask("What does the Acme contract say?", stream=True):
    print(chunk.delta, end="")  # str — text fragment
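
When both live output and the complete text are needed, the streamed fragments can be accumulated as they arrive. The following is a minimal sketch that assumes only what is documented above, namely that each chunk exposes a `delta` string. `collect_stream` is a hypothetical helper, not part of the SDK, and the stand-in chunks take the place of a real `client.ask(..., stream=True)` call so the example runs offline:

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable, echo: bool = False) -> str:
    """Join the `delta` of every chunk into one string (hypothetical helper)."""
    parts = []
    for chunk in chunks:
        if echo:
            # Optionally print each fragment as soon as it arrives.
            print(chunk.delta, end="", flush=True)
        parts.append(chunk.delta)
    return "".join(parts)


# Stand-in chunks; in real use pass client.ask("...", stream=True) instead.
fake_stream = [SimpleNamespace(delta=d) for d in ("Hello", ", ", "world")]
full_text = collect_stream(fake_stream)
```

Passing `echo=True` reproduces the printing loop above while still returning the assembled text.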

Maintain history in a conversation:

resp1 = client.ask("Who signed the contract?")
resp2 = client.ask("And when does it expire?", conversation_id=resp1.conversation_id)
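
For more than two turns, the same pattern can be wrapped in a loop that threads the `conversation_id` through every call. This is a sketch under stated assumptions: `ask_in_thread` is a hypothetical helper, and `FakeClient` is a stand-in that only mimics the documented `ask` signature and response fields so the example runs without network access. The conversation id 7 is made up.

```python
from types import SimpleNamespace


def ask_in_thread(client, questions):
    """Ask questions in order, reusing the conversation created by the first one."""
    conversation_id = None
    answers = []
    for question in questions:
        resp = client.ask(question, conversation_id=conversation_id)
        # Carry the id forward so the next question sees the history.
        conversation_id = resp.conversation_id
        answers.append(resp.answer)
    return answers, conversation_id


class FakeClient:
    """Stand-in for RAGfly, used only to make this sketch runnable offline."""

    def __init__(self):
        self.calls = []

    def ask(self, question, *, stream=False, conversation_id=None):
        self.calls.append((question, conversation_id))
        # Pretend the service creates conversation 7 when none is given.
        cid = 7 if conversation_id is None else conversation_id
        return SimpleNamespace(answer=f"answer to: {question}", conversation_id=cid)


fake = FakeClient()
answers, cid = ask_in_thread(fake, ["Who signed the contract?", "And when does it expire?"])
```

With a real client, replace `FakeClient()` with a `RAGfly` instance; the helper itself relies only on `resp.answer` and `resp.conversation_id`.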

client.search(query, *, limit=10, min_similitud=0.0, codigo_entidad=None, id_espacio=None)

Hybrid semantic search (vector + lexical + Cohere rerank) without LLM generation. Returns the most relevant chunks from the corpus with their scores.

Parameter       Type        Description
query           str         Search text
limit           int         Maximum documents to return (default 10)
min_similitud   float       Minimum similarity threshold, 0–1 (default 0.0)
codigo_entidad  str | None  Filter by a specific entity
id_espacio      int | None  Search only within a Workspace

results = client.search("maintenance contracts", limit=5)

print(f"{results.total_documentos} documents, {results.total_chunks} chunks")
print(f"Time: {results.duracion_ms:.0f}ms")

for doc in results.documents:
    print(f"· {doc.nombre} (rrf={doc.rrf_score:.3f})")
    for chunk in doc.chunks:
        print(f"  similitud={chunk.similitud:.3f}: {chunk.texto[:80]}…")
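
Search results are nested (documents, then chunks), so feeding them into a prompt or a log often starts with flattening them. The sketch below does that on the client side using only the fields shown above (`documents`, `nombre`, `chunks`, `texto`, `similitud`). `top_chunks` is a hypothetical helper, the sample data is invented, and the sort assumes that a higher `similitud` means a closer match:

```python
from types import SimpleNamespace


def top_chunks(results, min_similitud=0.0, limit=3):
    """Return (document name, chunk text, similitud) tuples, best similarity first."""
    rows = [
        (doc.nombre, chunk.texto, chunk.similitud)
        for doc in results.documents
        for chunk in doc.chunks
        if chunk.similitud >= min_similitud
    ]
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:limit]


# Invented stand-in for the object returned by client.search(...).
fake_results = SimpleNamespace(documents=[
    SimpleNamespace(nombre="doc-a", chunks=[
        SimpleNamespace(texto="alpha", similitud=0.91),
        SimpleNamespace(texto="beta", similitud=0.40),
    ]),
    SimpleNamespace(nombre="doc-b", chunks=[
        SimpleNamespace(texto="gamma", similitud=0.75),
    ]),
])
best = top_chunks(fake_results, min_similitud=0.5)
```

The `min_similitud` argument of `client.search` filters on the server; the helper's own threshold is useful only for a stricter second pass over results that were already fetched.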

client.list_documents(*, page=1, page_size=20, estado=None)

Returns a paginated list of the documents in the active group's corpus.

page = client.list_documents(page=1, page_size=50)
# → dict with keys: items, total, pagina, limite
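
To walk the whole corpus, the call can be repeated page by page. The sketch below assumes that pages are numbered from 1 (as the default `page=1` suggests) and that `total` is the number of documents across all pages; neither is confirmed by the text above. `iter_documents` and `FakeClient` are hypothetical, and the fake client only mimics the documented return keys so the example runs offline:

```python
def iter_documents(client, page_size=50):
    """Yield every item, requesting pages until `total` items have been seen."""
    page = 1
    seen = 0
    while True:
        data = client.list_documents(page=page, page_size=page_size)
        items = data["items"]
        if not items:
            break  # Defensive stop in case `total` overstates the corpus.
        yield from items
        seen += len(items)
        if seen >= data["total"]:
            break
        page += 1


class FakeClient:
    """Stand-in for RAGfly that serves slices of an in-memory list."""

    def __init__(self, docs):
        self._docs = docs
        self.pages_requested = []

    def list_documents(self, *, page=1, page_size=20, estado=None):
        self.pages_requested.append(page)
        start = (page - 1) * page_size
        return {
            "items": self._docs[start:start + page_size],
            "total": len(self._docs),
            "pagina": page,
            "limite": page_size,
        }


fake = FakeClient(["d1", "d2", "d3", "d4", "d5"])
all_docs = list(iter_documents(fake, page_size=2))
```

Because the helper is a generator, pages are fetched lazily and iteration can stop early without requesting the rest.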

Data models

Class         Key fields
AskResponse   answer: str, conversation_id: int, message_id: int | None
AskChunk      delta: str
SearchResult  query, total_documentos, total_chunks, duracion_ms, documents: list[Document]
Document      codigo, nombre, resumen, url, rrf_score, similitud_max, chunks: list[Chunk]
Chunk         texto, similitud, score_rerank, pagina

Authentication

The SDK accepts API keys (format slm_live_...) generated at app.ragfly.ai/api-keys or through the POST /auth/api-key endpoint.

import os
client = RAGfly(api_key=os.environ["RAGFLY_API_KEY"])

The key inherits the group, entity, and role of the user who issued it. See INTEGRATION.md § Credentials for role and PROFILE details.
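
Reading the key with `os.environ[...]` raises a bare `KeyError` when the variable is missing. A slightly friendlier lookup is sketched below. `load_api_key` is a hypothetical helper, not part of the SDK, and the prefix check assumes that every valid key starts with `slm_live_`, which is inferred from the format quoted above and may not cover all key types:

```python
import os


def load_api_key(env=os.environ, var="RAGFLY_API_KEY"):
    """Return the API key from `env`, failing early with a readable message."""
    key = env.get(var, "").strip()
    if not key:
        raise RuntimeError(f"{var} is not set")
    # Assumption: keys always use the slm_live_ prefix shown in this page.
    if not key.startswith("slm_live_"):
        raise ValueError(f"{var} does not start with the expected slm_live_ prefix")
    return key


# A plain dict stands in for os.environ so the sketch runs anywhere.
key = load_api_key({"RAGFLY_API_KEY": "slm_live_example"})
```

The returned value would then be passed as `RAGfly(api_key=key)`.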


Context manager

with RAGfly(api_key="slm_live_...") as client:
    resp = client.ask("How many active contracts are there?")
    print(resp.answer)
# → client.close() is called automatically