
Tracking LangChain LLM Costs with CostLynx

Add a CostLynxCallbackHandler to any LangChain LLM to automatically track token usage after every response — no changes to your chain logic.

Install

pip
pip install costlynx langchain langchain-openai

CostLynxCallbackHandler

Subclass BaseCallbackHandler and override on_llm_end. LangChain calls it after every LLM response, with the token usage available in llm_output.

Callback handler
from typing import Any, Optional
from uuid import UUID
from langchain_core.callbacks import BaseCallbackHandler
from langchain_core.outputs import LLMResult
from costlynx import CostLynx

class CostLynxCallbackHandler(BaseCallbackHandler):
    def __init__(self, clx: CostLynx, feature: Optional[str] = None) -> None:
        super().__init__()
        self._clx = clx
        self._feature = feature

    def on_llm_end(
        self,
        response: LLMResult,
        *,
        run_id: UUID,
        **kwargs: Any,
    ) -> None:
        try:
            # langchain-openai reports usage in llm_output["token_usage"]
            llm_output = response.llm_output or {}
            usage = llm_output.get("token_usage") or {}
            # Fall back to a default model name if the provider omits it
            model_name = llm_output.get("model_name", "gpt-4o")
            input_tokens = usage.get("prompt_tokens", 0)
            output_tokens = usage.get("completion_tokens", 0)
            if input_tokens or output_tokens:
                self._clx.track(
                    provider="openai",
                    model=model_name,
                    input_tokens=input_tokens,
                    output_tokens=output_tokens,
                    feature=self._feature,
                    request_id=str(run_id),
                )
        except Exception:
            pass  # never break LangChain execution

Attach to any LangChain LLM

Usage
import os
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage
from costlynx import CostLynx

clx = CostLynx(
    ingestion_key=os.environ["COSTLYNX_INGESTION_KEY"],
    default_project="langchain-app",
    default_environment="prod",
)

llm = ChatOpenAI(
    model="gpt-4o-mini",
    callbacks=[CostLynxCallbackHandler(clx, feature="qa-chain")],
)

response = llm.invoke([HumanMessage(content="What is LLM cost optimisation?")])
print(response.content)

Tip

Pass a different feature= per chain to get per-chain cost breakdowns in the CostLynx dashboard.
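For instance, two LLM instances can share a single CostLynx client while reporting under separate features. The sketch below reuses the clx client from the Usage section; the feature names are illustrative:

Per-chain features
from langchain_openai import ChatOpenAI

qa_llm = ChatOpenAI(
    model="gpt-4o-mini",
    callbacks=[CostLynxCallbackHandler(clx, feature="qa-chain")],
)

summary_llm = ChatOpenAI(
    model="gpt-4o-mini",
    callbacks=[CostLynxCallbackHandler(clx, feature="summariser")],
)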

LCEL / chain composition

The callback is inherited by any chain that uses the LLM. In LCEL you can pass callbacks at the LLM constructor level (as above) or at invoke time:

LCEL invoke-time callback
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableConfig

handler = CostLynxCallbackHandler(clx, feature="summariser")

prompt = ChatPromptTemplate.from_template("Summarise the following text:\n\n{text}")
output_parser = StrOutputParser()
chain = prompt | llm | output_parser

result = chain.invoke(
    {"text": "Summarise this..."},
    config=RunnableConfig(callbacks=[handler]),
)

Failure behaviour

  • The handler wraps all tracking in a try/except — it never raises or breaks your chain.
  • Network errors and API failures are silently swallowed in production.
  • Enable debug=True on the CostLynx client to print errors to stderr during development (see the sketch below).
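
For development, a client like the following sketch surfaces tracking errors instead of swallowing them. This assumes debug is accepted as a keyword argument on the CostLynx constructor, in line with the note above:

Debug client
import os
from costlynx import CostLynx

clx = CostLynx(
    ingestion_key=os.environ["COSTLYNX_INGESTION_KEY"],
    default_project="langchain-app",
    default_environment="dev",
    debug=True,  # assumed constructor flag: print tracking errors to stderr
)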