Batch API Calls for Speed — Price a Whole Option Chain in One Request

The problem: latency, not compute

Each QuantOracle calculation is fast — a Black-Scholes price computes server-side in roughly 15 ms. But a calculation isn't the slow part. The slow part is the round trip: DNS, TLS, the request crossing the internet to the API, and the response crossing back. That's typically 200–400 ms per call, and it dwarfs the ~15 ms of actual math.

So when you loop over an option chain calling /v1/options/price 20 times, you pay that round-trip tax 20 times in series. The compute is ~300 ms total; the waiting is everything else.

for K in strikes:          # 20 strikes
    price(S, K, T, r, sigma)  # 1 HTTP round-trip each
# 20 × ~360 ms = ~7.2 seconds, mostly spent waiting on the network
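
To make that arithmetic concrete, here is a back-of-envelope model in Python. It is a sketch, not a measurement: it assumes a fixed round-trip time and server compute that adds up linearly. The function names and the 345 ms network figure are illustrative choices (345 ms of network plus the ~15 ms of compute gives the ~360 ms per call used above).

```python
def sequential_ms(n_calls, round_trip_ms, compute_ms):
    """Wall-clock estimate when every calculation pays its own round trip."""
    return n_calls * (round_trip_ms + compute_ms)


def batched_ms(n_calls, round_trip_ms, compute_ms):
    """Wall-clock estimate when all calculations share a single round trip."""
    return round_trip_ms + n_calls * compute_ms


# Illustrative inputs only: 20 calls, 345 ms of network, 15 ms of compute each.
n, rtt, compute = 20, 345, 15
print(sequential_ms(n, rtt, compute))  # 7200
print(batched_ms(n, rtt, compute))     # 645
```

Real runs will not match this idealized model exactly, since round-trip time varies and a larger request body takes longer to send. The shape is what matters: the sequential estimate grows with N round trips, the batched estimate pays for one.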

The fix: one request, up to 100 computations

/v1/batch takes a list of sub-requests, runs them all server-side, and returns every result in one response. You pay the network round-trip once instead of N times. The body is a list of { endpoint, params } objects — and the endpoints can be mixed: a few option prices, a Kelly sizing, a VaR, all in the same batch.

POST https://api.quantoracle.dev/v1/batch
Content-Type: application/json

{
  "requests": [
    { "endpoint": "options/price", "params": { "S": 100, "K": 95,  "T": 0.25, "r": 0.05, "sigma": 0.2, "type": "call" } },
    { "endpoint": "options/price", "params": { "S": 100, "K": 100, "T": 0.25, "r": 0.05, "sigma": 0.2, "type": "call" } },
    { "endpoint": "options/price", "params": { "S": 100, "K": 105, "T": 0.25, "r": 0.05, "sigma": 0.2, "type": "call" } },
    { "endpoint": "risk/kelly",    "params": { "win_prob": 0.55, "win_loss_ratio": 1.8 } }
  ]
}
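
A request body like this is easy to assemble programmatically. The helper below is a minimal sketch: build_batch and normalize_endpoint are hypothetical names, not part of any QuantOracle client library. It applies two rules stated in this article, the 100-item cap and relative endpoint paths without a /v1/ prefix.

```python
MAX_BATCH = 100  # per-batch limit stated in this article


def normalize_endpoint(endpoint):
    """Reduce input such as '/v1/options/price/' to the relative 'options/price'."""
    path = endpoint.strip("/")
    if path.startswith("v1/"):
        path = path[len("v1/"):]
    return path


def build_batch(items):
    """Turn (endpoint, params) pairs into a /v1/batch request body."""
    items = list(items)
    if len(items) > MAX_BATCH:
        raise ValueError(f"{len(items)} sub-requests exceeds the limit of {MAX_BATCH}")
    return {
        "requests": [
            {"endpoint": normalize_endpoint(endpoint), "params": dict(params)}
            for endpoint, params in items
        ]
    }


# Mixed endpoints in one body, as in the JSON above.
body = build_batch([
    ("options/price", {"S": 100, "K": 100, "T": 0.25, "r": 0.05, "sigma": 0.2, "type": "call"}),
    ("/v1/risk/kelly/", {"win_prob": 0.55, "win_loss_ratio": 1.8}),
])
print([r["endpoint"] for r in body["requests"]])  # ['options/price', 'risk/kelly']
```

Failing early on an oversized list keeps the error local instead of spending a round trip to learn the same thing from the server.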

The response preserves order and reports per-item status, so a single bad input doesn't sink the whole batch:

{
  "batch_size": 4,
  "total_price_usdc": 0.025,
  "ms": 64.1,
  "results": [
    { "endpoint": "options/price", "status": 200, "data": { "price": 7.71, "greeks": { ... } } },
    { "endpoint": "options/price", "status": 200, "data": { "price": 4.61, "greeks": { ... } } },
    { "endpoint": "options/price", "status": 200, "data": { "price": 2.48, "greeks": { ... } } },
    { "endpoint": "risk/kelly",    "status": 200, "data": { "kelly_fraction": 0.30, ... } }
  ]
}

Each result carries its own status. A sub-request that fails validation comes back as { "status": 422, "data": { "error": ... } } in its slot, and every other result in the batch is unaffected. Match results to requests by index; the order out is the order in.
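
Matching by index and checking each status can be wrapped in one small function. This is a sketch under stated assumptions: pair_results is a hypothetical helper, and the response below is a hand-written stand-in whose error text is a placeholder, not real server output. It also assumes a negative sigma is the kind of input that fails validation.

```python
def pair_results(requests, response):
    """Match each sub-request to its result by index and split them by status."""
    results = response["results"]
    if len(results) != len(requests):
        raise ValueError("result count does not match request count")
    ok, failed = [], []
    for index, (req, res) in enumerate(zip(requests, results)):
        target = ok if res["status"] == 200 else failed
        target.append((index, req, res["data"]))
    return ok, failed


# Hypothetical two-item batch: the second sub-request carries a negative sigma.
requests = [
    {"endpoint": "options/price",
     "params": {"S": 100, "K": 100, "T": 0.25, "r": 0.05, "sigma": 0.2, "type": "call"}},
    {"endpoint": "options/price",
     "params": {"S": 100, "K": 100, "T": 0.25, "r": 0.05, "sigma": -1, "type": "call"}},
]
# Hand-written stand-in for the server's reply; the error message is a placeholder.
response = {
    "results": [
        {"endpoint": "options/price", "status": 200, "data": {"price": 4.61}},
        {"endpoint": "options/price", "status": 422, "data": {"error": "placeholder validation message"}},
    ]
}

ok, failed = pair_results(requests, response)
print(len(ok), len(failed))  # 1 1
```

Keeping the original index in each tuple means a failed item can be fixed and resent on its own without disturbing the rows that already succeeded.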

The benchmark (a real run)

Twenty Black-Scholes call prices across a strike ladder ($80–$175), measured two ways from the same machine against the live API:

N requests           : 20
Sequential wall-clock:  7,182 ms      (one HTTP round-trip per call)
Batch wall-clock     :  1,426 ms      (one round-trip total)
Speedup              :  5.0×
Batch server compute :    320 ms      (server-reported "ms" — ~16 ms/calc)
Sequential cost      :  free          (20 calculator calls, under the 1,000/day free tier)
Batch cost           :  0.1 USDC      (paid endpoint — 20 × $0.005, settled via x402)

The shape of those numbers is the whole point. The batch did the same 20 calculations in ~320 ms of actual compute; the remaining ~1.1 s of its wall-clock is the single network round-trip. The sequential version spent ~6.9 s of its 7.2 s waiting on the network — paying that round-trip 20 times. Collapse 20 round-trips into 1 and the latency tax collapses with it.

The bigger the batch, the bigger the win: at the 100-request maximum, you replace 100 round-trips with one. The speedup scales with how network-bound your loop was.

When batching is the right tool

Option chains & vol surfaces. Price every strike and expiry in one shot instead of nesting two loops over the network. It is the same math as the Black-Scholes calculator, run at scale.
Parameter sweeps. Sweeping volatility from 10% to 80% to plot how an option's Greeks move? That's 15 independent calls at 5-point steps and 71 at 1-point steps: a perfect batch.
Multi-asset risk. Compute VaR, Sharpe, and drawdown across a book of 30 positions in one request rather than 90 separate ones.
Backfills & reports. Any time you're generating a table where each row is an independent calculation, batch the rows.
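
As an illustration of the parameter-sweep case, the sketch below builds one batch body that reprices a single option across a volatility grid. vol_sweep_batch is a hypothetical helper; the parameter names inside "params" are the ones used in this article's request examples.

```python
def vol_sweep_batch(S, K, T, r, start_pct=10, stop_pct=80, step_pct=5, option_type="call"):
    """Build one batch body that reprices a single option across a volatility grid."""
    # Work in whole percentage points so the grid has no floating-point drift.
    sigmas = [pct / 100 for pct in range(start_pct, stop_pct + 1, step_pct)]
    return {
        "requests": [
            {
                "endpoint": "options/price",
                "params": {"S": S, "K": K, "T": T, "r": r, "sigma": sigma, "type": option_type},
            }
            for sigma in sigmas
        ]
    }


body = vol_sweep_batch(S=100, K=100, T=0.25, r=0.05)
print(len(body["requests"]))  # 15
```

With the default 5-point step the sweep is 15 sub-requests; a 1-point step gives 71, still under the 100-item cap.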

When is batching not the right tool? When calls are dependent — when call 2's inputs come from call 1's output. Batches run as a set of independent computations; there's no data flow between sub-requests. For dependent, multi-step reasoning (audit risk → then recommend a hedge based on the result), you want a chained agent loop instead — see Chaining x402 paid tool calls.

Pricing: the batch endpoint is paid — you pay for speed

Here is the tradeoff to understand: the batch endpoint is paid, and it sits outside the free tier. Individual calculator calls are free for the first 1,000 per IP per day — but bundling them into a batch is a paid convenience. Each IP gets one free trial batch; after that, every batch costs the sum of its sub-request prices (20 Black-Scholes calls × $0.005 = $0.10), settled via x402 micropayments in USDC on Base or Solana.

So you are not getting the speedup for free — you are paying for it. If those 20 calls would have fit inside your daily free quota, the sequential version costs nothing and the batch costs $0.10: that $0.10 buys the 5× latency win and one round-trip instead of twenty. Above the free tier, where the individual calls cost $0.005 each anyway, the batch is the same price and simply far faster. Batching never gives a volume discount — it trades dollars for latency. Full breakdown on the pricing page.
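
The tradeoff can be written down as two small functions. This is a simplified sketch: it assumes every sub-request costs the $0.005 quoted above, ignores the one free trial batch, and uses hypothetical function names. Decimal keeps the USDC amounts exact.

```python
from decimal import Decimal

PRICE_PER_CALL = Decimal("0.005")  # per-calculation price quoted in this article
FREE_CALLS_PER_DAY = 1000          # free-tier quota quoted in this article


def batch_cost(n_calls, price=PRICE_PER_CALL):
    """A batch is billed as the sum of its sub-request prices."""
    return n_calls * price


def sequential_cost(n_calls, free_calls_left=FREE_CALLS_PER_DAY, price=PRICE_PER_CALL):
    """Individual calls are free until the remaining daily quota runs out."""
    billable = max(0, n_calls - free_calls_left)
    return billable * price


# Inside the free tier: the batch costs money, the loop does not.
print(batch_cost(20), sequential_cost(20))
# Quota already used up: both routes cost the same.
print(batch_cost(20), sequential_cost(20, free_calls_left=0))
```

The second comparison is the case where batching is purely a latency win: the bill is identical and only the wall-clock time changes.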

The code, end to end

A runnable Python example — build the batch, send it, read the results back in order:

import json, urllib.request

API = "https://api.quantoracle.dev"

def call(path, body):
    req = urllib.request.Request(
        API + path,
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as r:
        return json.loads(r.read())

# Price a 20-strike call chain in ONE request
strikes = range(80, 180, 5)
batch = {
    "requests": [
        {
            "endpoint": "options/price",
            "params": {
                "S": 100, "K": k, "T": 0.25,
                "r": 0.05, "sigma": 0.2, "type": "call",
            },
        }
        for k in strikes
    ]
}

out = call("/v1/batch", batch)
print(f"{out['batch_size']} prices in {out['ms']} ms, {out['total_price_usdc']} USDC")

for req, res in zip(batch["requests"], out["results"]):
    if res["status"] == 200:
        print(f"  K={req['params']['K']:>3}  ->  {res['data']['price']:.4f}")
    else:
        print(f"  K={req['params']['K']:>3}  ->  ERROR {res['status']}")
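
One caveat with urllib: urlopen raises urllib.error.HTTPError on any 4xx or 5xx reply, so a whole-batch rejection surfaces as an exception rather than a return value. The variant below is a sketch that returns the HTTP status alongside the parsed body. call_batch and its opener parameter are hypothetical, and the 402 case is an assumption: x402 builds on HTTP 402 Payment Required, but this article does not show what that reply looks like, so the body in the demo is a placeholder.

```python
import io
import json
import urllib.error
import urllib.request
from email.message import Message

API = "https://api.quantoracle.dev"


def call_batch(body, opener=urllib.request.urlopen):
    """POST a batch and return (http_status, parsed_body) instead of raising."""
    req = urllib.request.Request(
        API + "/v1/batch",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )
    try:
        with opener(req, timeout=30) as r:
            return r.status, json.loads(r.read())
    except urllib.error.HTTPError as e:
        raw = e.read()
        try:
            detail = json.loads(raw)
        except ValueError:
            # Error body was not JSON; keep the text for debugging.
            detail = {"raw": raw.decode(errors="replace")}
        return e.code, detail


def fake_opener(req, timeout=None):
    """Stand-in for urlopen that simulates an HTTP 402 reply without touching the network."""
    raise urllib.error.HTTPError(
        req.full_url, 402, "Payment Required", Message(),
        io.BytesIO(b'{"note": "placeholder body"}'),
    )


# Offline demonstration; drop the opener argument to hit the real endpoint.
status, detail = call_batch({"requests": []}, opener=fake_opener)
print(status, detail)  # 402 {'note': 'placeholder body'}
```

Returning the status lets the caller tell a whole-batch failure (for example the 400 for an unknown endpoint) apart from a 200 reply whose individual results still need their own status checks.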

The TypeScript shape is identical — one fetch with a JSON body:

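// 20 strikes: 80, 85, ..., 175 (the same ladder as the Python example)
const strikes = Array.from({ length: 20 }, (_, i) => 80 + 5 * i);

const res = await fetch("https://api.quantoracle.dev/v1/batch", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    requests: strikes.map((K) => ({
      endpoint: "options/price",
      params: { S: 100, K, T: 0.25, r: 0.05, sigma: 0.2, type: "call" },
    })),
  }),
});
// fetch does not reject on HTTP error statuses, so check before parsing
if (!res.ok) throw new Error(`batch request failed: HTTP ${res.status}`);
const { results, ms, total_price_usdc } = await res.json();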

Giving an agent one efficient tool instead of a loop

If you expose quant tools to an LLM agent, batching matters even more. An agent asked to "price this whole chain" will, by default, emit one tool call per strike: 20 sequential round-trips, 20 reasoning steps, 20 chances to drift. Expose a single batch_price_options tool backed by /v1/batch and the agent makes one tool call that returns the entire chain. Fewer turns, lower latency, and the agent reasons over a complete table instead of rows that dribble in one at a time.

import { z } from "zod"; // schema library used for the tool's input validation

// A single agent tool that prices an arbitrary set of strikes in one call.
const batchPriceTool = {
  name: "batch_price_options",
  description:
    "Price many options at once. Pass an array of {K, T, sigma, type}; " +
    "returns every price + Greeks in a single response. Use this instead of " +
    "calling the single-option pricer in a loop.",
  schema: z.object({
    S: z.number(),
    r: z.number(),
    legs: z.array(z.object({
      K: z.number(), T: z.number(), sigma: z.number(),
      type: z.enum(["call", "put"]),
    })),
  }),
  func: async ({ S, r, legs }) => {
    const res = await fetch("https://api.quantoracle.dev/v1/batch", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        requests: legs.map((l) => ({
          endpoint: "options/price",
          params: { S, r, ...l },
        })),
      }),
    });
    return (await res.json()).results;
  },
};

This is the same lesson as the chained-tool-call tutorial from the other direction: chain when calls are dependent; batch when they're independent. Most "do this calculation across a list" tasks are the latter.

Gotchas worth knowing

Max 100 sub-requests per batch. Need more? Chunk into batches of 100 and send the chunks concurrently — now you're parallelizing a handful of round trips instead of thousands.
Endpoint paths are relative, no leading /v1/. Use "options/price", not "/v1/options/price" (a leading/trailing slash is tolerated, but keep it clean). An unknown endpoint rejects the whole batch with a 400.
Per-item failures are isolated. A bad params object comes back as a non-200 status in its slot; the rest still compute. Always check each result's status before reading data.
Order is preserved. results[i] corresponds to requests[i]. Zip them by index.
Batch is paid, outside the free tier. Unlike individual calculator calls (free up to 1,000/IP/day), a batch costs the sum of its sub-request prices via x402 — after a single free trial batch per IP. A 50-call batch of $0.005 calculators costs $0.25.
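
The chunk-and-send-concurrently advice from the first gotcha can be sketched as follows. chunked and run_chunked are hypothetical helpers, and send stands for whatever function posts one body to /v1/batch and returns the parsed response. The demo swaps in an offline stand-in so the snippet runs without network access or payment.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_BATCH = 100  # per-batch limit stated in this article


def chunked(requests, size=MAX_BATCH):
    """Split a list of sub-requests into consecutive slices of at most `size`."""
    return [requests[i:i + size] for i in range(0, len(requests), size)]


def run_chunked(requests, send, max_workers=4):
    """Send each chunk through `send` concurrently and stitch results back in order."""
    chunks = chunked(requests)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields responses in chunk order, regardless of which finishes first.
        responses = list(pool.map(lambda chunk: send({"requests": chunk}), chunks))
    results = []
    for response in responses:
        results.extend(response["results"])
    return results


def echo_send(body):
    """Offline stand-in for the HTTP call: echoes each sub-request's K back as a result."""
    return {"results": [{"status": 200, "data": {"K": r["params"]["K"]}} for r in body["requests"]]}


requests = [{"endpoint": "options/price", "params": {"K": k}} for k in range(250)]
results = run_chunked(requests, echo_send)
print(len(chunked(requests)), len(results))  # 3 250
```

Because each chunk preserves its own order and the chunks are stitched back in sequence, results[i] still lines up with requests[i] across the whole list. Each chunk is a separate paid batch, so the total cost is still the sum of all sub-request prices.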

The bottom line

If your code or your agent calls a quant endpoint in a loop, you're almost certainly paying network latency you don't need to. Collapse the loop into one /v1/batch request and the round-trip tax goes from N× to 1×. In the run above that was a 5× speedup on 20 calls for $0.10 via the paid batch endpoint, and the win grows with the batch size. You pay for the speed, but when you are network-bound, cutting the wait to a fraction is usually worth half a cent per calculation.

Chaining x402 paid tool calls — the opposite pattern: when calls are dependent, chain instead of batch
Add 73 quant tools to your agent with MCP — expose the batch endpoint (and all 73) as agent tools in one config line
Black-Scholes calculator — the per-option math a chain batch runs at scale
API docs — the full endpoint catalog you can mix inside a single batch
