Documentation Index

Fetch the complete documentation index at: https://docs.orq.ai/llms.txt

Use this file to discover all available pages before exploring further.

AI Router

Route your LLM calls through the AI Router with a single base URL change. Zero vendor lock-in: always run on the best model at the lowest cost for your use case.

Observability

Use the Orq.ai SDK to capture traces for every LLM call, graph node, tool use, and retrieval in your LangGraph applications.

AI Router

Overview

LangChain is a framework for building LLM-powered applications through composable chains, agents, and integrations with external data sources. By connecting LangChain to Orq.ai’s AI Router, you access 300+ models through a single base URL change.

Key Benefits

Orq.ai’s AI Router enhances your LangChain applications with:

Complete Observability

Track every chain step, tool use, and LLM call with detailed traces

Built-in Reliability

Automatic fallbacks, retries, and load balancing for production resilience

Cost Optimization

Real-time cost tracking and spend management across all your AI operations

Multi-Provider Access

Access 300+ LLMs and 20+ providers through a single, unified integration
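As a rough mental model of what "automatic fallbacks" and "retries" buy you, here is a plain-Python sketch of retry-with-fallback across a preference-ordered model list. This is illustrative only: the AI Router performs this logic server-side, and `try_model` is a hypothetical stand-in for an actual LLM call.

```python
# Illustrative sketch of fallback-with-retry. The AI Router does this
# server-side, so your application code stays a single LLM call.
def call_with_fallback(models, try_model, retries_per_model=2):
    """Try each model in preference order, retrying transient failures."""
    errors = []
    for model in models:
        for attempt in range(retries_per_model):
            try:
                return try_model(model)
            except Exception as exc:  # in practice: only transient errors
                errors.append((model, attempt, exc))
    raise RuntimeError(f"All models failed: {errors}")

# Hypothetical call that fails for the first model and succeeds for the second.
def flaky(model):
    if model == "gpt-4o":
        raise TimeoutError("upstream timeout")
    return f"response from {model}"

print(call_with_fallback(["gpt-4o", "claude-sonnet-4-5"], flaky))
# prints "response from claude-sonnet-4-5"
```

With the router, none of this lives in your code: you send one request, and fallback targets and retry budgets are configured on the Orq.ai side.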

Prerequisites

Before integrating LangChain with Orq.ai, ensure you have:
  • An Orq.ai account and API Key
  • Python 3.8 or higher
To set up your API key, see API keys & Endpoints.

Installation

pip install langchain langchain-openai

Configuration

Configure LangChain to use Orq.ai’s AI Router via ChatOpenAI with a custom base_url:
Python
from langchain_openai import ChatOpenAI
import os

llm = ChatOpenAI(
    model="gpt-4o",
    api_key=os.getenv("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v3/router",
)
The AI Router base URL is https://api.orq.ai/v3/router.

Basic Example

Python
from langchain_openai import ChatOpenAI
import os

llm = ChatOpenAI(
    model="gpt-4o",
    api_key=os.getenv("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v3/router",
)

result = llm.invoke("Explain quantum computing in simple terms.")
print(result.content)

Chains

Build composable chains using LangChain’s pipe operator:
Python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
import os

llm = ChatOpenAI(
    model="gpt-4o",
    api_key=os.getenv("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v3/router",
)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("user", "{input}"),
])

chain = prompt | llm
result = chain.invoke({"input": "Tell me a joke about programming."})
print(result.content)

Streaming

Python
from langchain_openai import ChatOpenAI
import os

llm = ChatOpenAI(
    model="gpt-4o",
    api_key=os.getenv("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v3/router",
)

for chunk in llm.stream("Write a short poem about the ocean."):
    print(chunk.content, end="", flush=True)
print()

Model Selection

With Orq.ai, you can use any supported model from 20+ providers:
Python
from langchain_openai import ChatOpenAI
import os

# Use Claude
claude = ChatOpenAI(
    model="anthropic/claude-sonnet-4-5",
    api_key=os.getenv("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v3/router",
)

# Use Gemini
gemini = ChatOpenAI(
    model="gemini-2.5-flash",
    api_key=os.getenv("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v3/router",
)

# Use Groq
groq = ChatOpenAI(
    model="groq/llama-3.3-70b-versatile",
    api_key=os.getenv("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v3/router",
)

Observability

orq_ai_sdk.langchain provides a global setup() function that automatically instruments all LangChain and LangGraph components. Call it once at the top of your application, and every LLM call, graph node, tool execution, and retrieval is traced automatically; no callback wiring is needed.

Zero configuration

One setup() call and tracing is live: no callbacks, no OpenTelemetry exporters, no extra wiring.

Full graph visibility

Traces preserve the parent-child structure of your LangGraph so you see exactly which node triggered each LLM call or tool use.

Token usage and costs

Input and output token counts are captured on every LLM call and synced to Orq.ai for cost tracking.

Retrieval tracking

Retrieval events include the query and all returned documents, making RAG pipelines fully inspectable.
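The captured token counts are what cost tracking is computed from. As a rough illustration of the arithmetic, here is a sketch with placeholder per-million-token prices (these are assumptions for the example, not Orq.ai's or any provider's actual rates):

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost of one LLM call given per-million-token prices in USD."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Placeholder prices: $2.50 per million input tokens, $10.00 per million output tokens.
print(estimate_cost_usd(1_200, 300, 2.50, 10.00))  # prints 0.006
```

Because setup() records input and output token counts on every call, this per-call arithmetic is aggregated for you across all traced runs.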

Installation

pip install orq-ai-sdk langchain-core langchain-openai langgraph
orq-ai-sdk is the Orq.ai Python SDK. @orq-ai/node is the Orq.ai Node.js SDK.

Environment Variables

export ORQ_API_KEY="your-orq-api-key"
export OPENAI_API_KEY="your-openai-api-key" # required because the examples call OpenAI models directly
Or set them in code:
import os
os.environ["ORQ_API_KEY"] = "your-orq-api-key"
os.environ["OPENAI_API_KEY"] = "your-openai-api-key"

Basic Example

Call setup() at the top of your entry point, before invoking any graphs or chains. It globally instruments LangChain so that all subsequent executions are traced automatically.
from orq_ai_sdk.langchain import setup

setup()

from typing import Annotated
from typing_extensions import TypedDict
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages

class State(TypedDict):
    messages: Annotated[list, add_messages]

graph_builder = StateGraph(State)
llm = ChatOpenAI(model="gpt-4o", temperature=0.2)

def chatbot(state: State):
    return {"messages": [llm.invoke(state["messages"])]}

graph_builder.add_node("chatbot", chatbot)
graph_builder.add_edge(START, "chatbot")
graph_builder.add_edge("chatbot", END)

graph = graph_builder.compile()

result = graph.invoke({"messages": [{"role": "user", "content": "Hello!"}]})
print(result["messages"][-1].content)

Async Example

from orq_ai_sdk.langchain import setup

setup()

import asyncio
from typing import Annotated
from typing_extensions import TypedDict
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages

class State(TypedDict):
    messages: Annotated[list, add_messages]

graph_builder = StateGraph(State)
llm = ChatOpenAI(model="gpt-4o", temperature=0.2)

async def chatbot(state: State):
    return {"messages": [await llm.ainvoke(state["messages"])]}

graph_builder.add_node("chatbot", chatbot)
graph_builder.add_edge(START, "chatbot")
graph_builder.add_edge("chatbot", END)

graph = graph_builder.compile()

async def main():
    result = await graph.ainvoke({"messages": [{"role": "user", "content": "Hello!"}]})
    print(result["messages"][-1].content)

asyncio.run(main())

Viewing Traces

Traces appear in the Orq.ai Studio under the Traces tab. Each run is captured as a tree reflecting your graph structure: top-level chain spans for each node, with LLM calls, tool executions, and retrievals nested underneath.
[Image: LangChain trace in the AI Studio]

What Gets Traced

| Event                | Details captured                              |
|----------------------|-----------------------------------------------|
| Graph nodes (chains) | Node name, inputs, outputs, duration          |
| LLM calls            | Messages, model, token usage, finish reason   |
| Tool executions      | Tool name, input, output, duration            |
| Retrievals           | Query, returned documents                     |
| Agent actions        | Action taken, finish output                   |

Evaluations & Experiments

Once your agents are running, use Evaluatorq to score outputs across a dataset and Experiments to compare configurations side-by-side.

Run Evaluations with Evaluatorq

Run parallel evaluations across your agents and compare results.

Run Experiments via the API

Compare agent configurations and view results in the AI Studio.