The problem: latency, not compute
Each QuantOracle calculation is fast — a Black-Scholes price computes server-side in roughly 15 ms. But a calculation isn't the slow part. The slow part is the round trip: DNS, TLS, the request crossing the internet to the API, and the response crossing back. That's typically 200–400 ms per call, and it dwarfs the ~15 ms of actual math.
So when you loop over an option chain calling /v1/options/price 20 times, you pay that round-trip tax 20 times in series. The compute is ~300 ms total; the waiting is everything else.
for K in strikes:                 # 20 strikes
    price(S, K, T, r, sigma)      # 1 HTTP round-trip each
# 20 × ~360 ms = ~7.2 seconds, mostly spent waiting on the network

The fix: one request, up to 100 computations
/v1/batch takes a list of sub-requests, runs them all server-side, and returns every result in one response. You pay the network round-trip once instead of N times. The body is a list of { endpoint, params } objects — and the endpoints can be mixed: a few option prices, a Kelly sizing, a VaR, all in the same batch.
POST https://api.quantoracle.dev/v1/batch
Content-Type: application/json

{
  "requests": [
    { "endpoint": "options/price", "params": { "S": 100, "K": 95, "T": 0.25, "r": 0.05, "sigma": 0.2, "type": "call" } },
    { "endpoint": "options/price", "params": { "S": 100, "K": 100, "T": 0.25, "r": 0.05, "sigma": 0.2, "type": "call" } },
    { "endpoint": "options/price", "params": { "S": 100, "K": 105, "T": 0.25, "r": 0.05, "sigma": 0.2, "type": "call" } },
    { "endpoint": "risk/kelly", "params": { "win_prob": 0.55, "win_loss_ratio": 1.8 } }
  ]
}

The response preserves order and reports per-item status, so a single bad input doesn't sink the whole batch:
{
  "batch_size": 4,
  "total_price_usdc": 0.025,
  "ms": 64.1,
  "results": [
    { "endpoint": "options/price", "status": 200, "data": { "price": 6.58, "greeks": { ... } } },
    { "endpoint": "options/price", "status": 200, "data": { "price": 4.61, "greeks": { ... } } },
    { "endpoint": "options/price", "status": 200, "data": { "price": 3.07, "greeks": { ... } } },
    { "endpoint": "risk/kelly", "status": 200, "data": { "kelly_fraction": 0.30, ... } }
  ]
}

Each result carries its own status. A sub-request that fails validation comes back as { "status": 422, "data": { "error": ... } } in its slot; the other 99 results are unaffected. Match results to requests by index; the order out is the order in.
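To make the per-item status check concrete, here is a minimal Python sketch that pairs results with requests by index and separates successes from failures. The helper name split_results is hypothetical, and the sample data (including the error message) is made up to match the response shape shown above; it is not real API output.

```python
def split_results(requests, results):
    """Pair each result with its request by index and split on status."""
    ok, failed = [], []
    for i, (req, res) in enumerate(zip(requests, results)):
        if res.get("status") == 200:
            ok.append((i, req, res["data"]))
        else:
            # Keep the index so the caller knows which slot failed.
            error = res.get("data", {}).get("error")
            failed.append((i, req, res.get("status"), error))
    return ok, failed


# Illustrative data only: shaped like the response above, values invented.
requests = [
    {"endpoint": "options/price", "params": {"K": 95}},
    {"endpoint": "options/price", "params": {"K": -1}},  # assumed to fail validation
]
results = [
    {"endpoint": "options/price", "status": 200, "data": {"price": 6.58}},
    {"endpoint": "options/price", "status": 422, "data": {"error": "invalid K"}},
]

ok, failed = split_results(requests, results)
print(len(ok), "succeeded;", len(failed), "failed")
for index, req, status, error in failed:
    print(f"slot {index}: status {status}, error={error!r}")
```

Keeping the slot index alongside each failure makes it easy to retry only the bad sub-requests without re-sending the ones that already succeeded.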
The benchmark (a real run)
Twenty Black-Scholes call prices across a strike ladder ($80–$175), measured two ways from the same machine against the live API:
N requests            : 20
Sequential wall-clock : 7,182 ms  (one HTTP round-trip per call)
Batch wall-clock      : 1,426 ms  (one round-trip total)
Speedup               : 5.0×
Batch server compute  : 320 ms    (server-reported "ms", ~16 ms/calc)
Price (both ways)     : 0.1 USDC  (20 × $0.005; batching is not cheaper, just faster)

The shape of those numbers is the whole point. The batch did the same 20 calculations in ~320 ms of actual compute; the remaining ~1.1 s of its wall-clock is the single network round-trip. The sequential version spent ~6.9 s of its 7.2 s waiting on the network, paying that round-trip 20 times. Collapse 20 round-trips into 1 and the latency tax collapses with it.
The bigger the batch, the bigger the win: at the 100-request maximum, you replace 100 round-trips with one. The speedup scales with how network-bound your loop was.
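A rough way to see how the win scales is a two-term latency model. This is only a sketch under stated assumptions: each sequential call pays a fixed round trip plus its compute, and a batch pays one round trip plus the sum of the computes. The 343 ms and 1,106 ms figures are back-derived from the run above, not documented constants, and the model assumes the batch round trip does not grow with payload size, which may not hold at 100 items.

```python
def sequential_ms(n, round_trip_ms, compute_ms):
    """N calls in series: every call pays the round trip and its compute."""
    return n * (round_trip_ms + compute_ms)


def batched_ms(n, batch_round_trip_ms, compute_ms):
    """One batch: a single round trip, then N computations server-side."""
    return batch_round_trip_ms + n * compute_ms


# Back-derived from the benchmark above (assumptions, not API guarantees):
#   per-call round trip ~ (7182 - 320) / 20 ~ 343 ms
#   batch round trip    ~ 1426 - 320        = 1106 ms
#   compute             ~ 16 ms per calculation
for n in (20, 50, 100):
    seq = sequential_ms(n, 343, 16)
    bat = batched_ms(n, 1106, 16)
    print(f"N={n:>3}: sequential ~{seq} ms, batch ~{bat} ms, speedup ~{seq / bat:.1f}x")
```

With these inputs the model reproduces the measured 20-call numbers to within a few milliseconds; the larger-N rows are extrapolations and should be confirmed with a real run.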
When batching is the right tool
- Option chains & vol surfaces. Price every strike and expiry in one shot instead of nesting two loops over the network. It is the same math as the Black-Scholes calculator, run at scale.
- Parameter sweeps. Sweeping volatility from 10% to 80% to plot how an option's Greeks move? That's 15–70 independent calls — a perfect batch.
- Multi-asset risk. Compute VaR, Sharpe, and drawdown across a book of 30 positions in one request rather than 90 separate ones.
- Backfills & reports. Any time you're generating a table where each row is an independent calculation, batch the rows.
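As an illustration of the parameter-sweep case, the sketch below builds a batch body that sweeps volatility from 10% to 80% in 5-point steps. It reuses the options/price endpoint and parameter names from the example request above; the helper name sigma_sweep_batch and the step size are assumptions chosen for the example.

```python
import json


def sigma_sweep_batch(S, K, T, r, sigmas, option_type="call"):
    """Build a /v1/batch body with one options/price sub-request per sigma."""
    return {
        "requests": [
            {
                "endpoint": "options/price",
                "params": {
                    "S": S, "K": K, "T": T, "r": r,
                    "sigma": sigma, "type": option_type,
                },
            }
            for sigma in sigmas
        ]
    }


# 0.10, 0.15, ..., 0.80: fifteen independent sub-requests.
sigmas = [round(0.10 + 0.05 * i, 2) for i in range(15)]
batch = sigma_sweep_batch(S=100, K=100, T=0.25, r=0.05, sigmas=sigmas)

# Stay under the 100 sub-request limit before sending.
assert len(batch["requests"]) <= 100
print(len(batch["requests"]), "sub-requests,", len(json.dumps(batch)), "bytes of JSON")
```

Because every sub-request differs only in sigma, the results come back in sweep order and can be plotted directly against the sigmas list.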
When is batching not the right tool? When calls are dependent — when call 2's inputs come from call 1's output. Batches run as a set of independent computations; there's no data flow between sub-requests. For dependent, multi-step reasoning (audit risk → then recommend a hedge based on the result), you want a chained agent loop instead — see Chaining x402 paid tool calls.
Pricing: batching is about speed, not discount
A batch costs the sum of its sub-request prices — no more, no less. Twenty Black-Scholes calls at $0.005 each cost $0.10 whether you fire them sequentially or in one batch. Batching buys you latency, not a volume discount.
And most of the time it costs nothing: the free tier covers 1,000 calls per IP per day with no API key, and every one of the 63 calculator endpoints is free within that quota. A 20-call batch counts as 20 calls against your daily free quota — well within reach for development and most production loads. You only pay (via x402 USDC micropayments on Base or Solana) once you exceed the free tier or call the paid composite endpoints. Full breakdown on the pricing page.
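The billing arithmetic is simple enough to check in a few lines. This is a sketch of the rules as described here (list price is the sum of sub-request prices; each sub-request counts once against the 1,000-call daily free quota), not the API's actual metering code. The function names are hypothetical, and Decimal is used only to avoid floating-point noise in the sum.

```python
from decimal import Decimal

FREE_CALLS_PER_DAY = 1000  # free-tier quota quoted in the text


def batch_list_price_usdc(prices):
    """List price of a batch past the free tier: the sum of its sub-request prices."""
    return sum((Decimal(p) for p in prices), Decimal("0"))


def free_calls_left(used_today, batch_size):
    """Free calls remaining after a batch; every sub-request counts as one call."""
    return max(FREE_CALLS_PER_DAY - used_today - batch_size, 0)


# Twenty Black-Scholes calls at $0.005 each, as in the example above.
prices = ["0.005"] * 20
print("list price:", batch_list_price_usdc(prices), "USDC")
print("free calls left today:", free_calls_left(used_today=0, batch_size=20))
```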
The code, end to end
A runnable Python example — build the batch, send it, read the results back in order:
import json, urllib.request

API = "https://api.quantoracle.dev"

def call(path, body):
    req = urllib.request.Request(
        API + path,
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as r:
        return json.loads(r.read())

# Price a 20-strike call chain in ONE request
strikes = range(80, 180, 5)
batch = {
    "requests": [
        {
            "endpoint": "options/price",
            "params": {
                "S": 100, "K": k, "T": 0.25,
                "r": 0.05, "sigma": 0.2, "type": "call",
            },
        }
        for k in strikes
    ]
}
out = call("/v1/batch", batch)
print(f"{out['batch_size']} prices in {out['ms']} ms, {out['total_price_usdc']} USDC")
for req, res in zip(batch["requests"], out["results"]):
    if res["status"] == 200:
        print(f" K={req['params']['K']:>3} -> {res['data']['price']:.4f}")
    else:
        print(f" K={req['params']['K']:>3} -> ERROR {res['status']}")

The TypeScript shape is identical (one fetch with a JSON body):
// The same 20-strike ladder as the Python example: 80, 85, ..., 175
const strikes = Array.from({ length: 20 }, (_, i) => 80 + 5 * i);

const res = await fetch("https://api.quantoracle.dev/v1/batch", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    requests: strikes.map((K) => ({
      endpoint: "options/price",
      params: { S: 100, K, T: 0.25, r: 0.05, sigma: 0.2, type: "call" },
    })),
  }),
});
const { results, ms, total_price_usdc } = await res.json();

Giving an agent one efficient tool instead of a loop
If you expose quant tools to an LLM agent, batching matters even more. An agent asked to "price this whole chain" will, by default, emit one tool call per strike — 20 sequential round-trips, 20 reasoning steps, 20 chances to drift. Expose a single batch_price tool backed by /v1/batch and the agent makes one tool call that returns the entire chain. Fewer turns, lower latency, and the agent reasons over a complete table instead of dribbling in one row at a time.
// The schema below uses zod's `z` builder.
import { z } from "zod";

// A single agent tool that prices an arbitrary set of strikes in one call.
const batchPriceTool = {
  name: "batch_price_options",
  description:
    "Price many options at once. Pass an array of {K, T, sigma, type}; " +
    "returns every price + Greeks in a single response. Use this instead of " +
    "calling the single-option pricer in a loop.",
  schema: z.object({
    S: z.number(),
    r: z.number(),
    legs: z.array(z.object({
      K: z.number(), T: z.number(), sigma: z.number(),
      type: z.enum(["call", "put"]),
    })),
  }),
  func: async ({ S, r, legs }) => {
    const res = await fetch("https://api.quantoracle.dev/v1/batch", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        // S and r are shared; each leg supplies its own K, T, sigma, and type.
        requests: legs.map((l) => ({
          endpoint: "options/price",
          params: { S, r, ...l },
        })),
      }),
    });
    return (await res.json()).results;
  },
};

This is the same lesson as the chained-tool-call tutorial from the other direction: chain when calls are dependent; batch when they're independent. Most "do this calculation across a list" tasks are the latter.
Gotchas worth knowing
- Max 100 sub-requests per batch. Need more? Chunk into batches of 100 and send the chunks concurrently — now you're parallelizing a handful of round trips instead of thousands.
- Endpoint paths are relative, no leading /v1/. Use "options/price", not "/v1/options/price" (a leading/trailing slash is tolerated, but keep it clean). An unknown endpoint rejects the whole batch with a 400.
- Per-item failures are isolated. A bad params object comes back as a non-200 status in its slot; the rest still compute. Always check each result's status before reading data.
- Order is preserved. results[i] corresponds to requests[i]. Zip them by index.
- It's still N calls for billing/quota. A 50-call batch consumes 50 of your 1,000 daily free calls and, past the free tier, costs the sum of the 50 prices.
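For the over-100 case, one possible shape for "chunk and send concurrently" is sketched below. The helper names (chunked, run_large_batch) are hypothetical, and send_batch is an injected callable so the example runs without touching the network; in practice it could wrap the call("/v1/batch", body) helper from the end-to-end example. Whether the API tolerates several concurrent batches from one client is an assumption to verify, so the worker count is kept small.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_BATCH = 100  # per-batch limit stated in the list above


def chunked(items, size=MAX_BATCH):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def run_large_batch(sub_requests, send_batch, max_workers=4):
    """Send chunks concurrently and return results in the original order.

    send_batch: any callable taking a {"requests": [...]} body and returning
    the parsed batch response (a dict with a "results" list).
    """
    chunks = chunked(sub_requests)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields responses in chunk order, so flattening them
        # keeps results[i] aligned with sub_requests[i].
        responses = list(pool.map(lambda c: send_batch({"requests": c}), chunks))
    return [res for resp in responses for res in resp["results"]]


def fake_send(body):
    """Stand-in for the real HTTP call: echoes each strike back."""
    return {
        "results": [
            {"status": 200, "data": {"echo": r["params"]["K"]}}
            for r in body["requests"]
        ]
    }


subs = [{"endpoint": "options/price", "params": {"K": k}} for k in range(250)]
results = run_large_batch(subs, fake_send)
print(len(chunked(subs)), "batches,", len(results), "results")
```

Injecting the sender also makes the chunking logic easy to unit-test before pointing it at the live endpoint.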
The bottom line
If your code or your agent calls a quant endpoint in a loop, you're almost certainly paying network latency you don't need to. Collapse the loop into one /v1/batch request and the round-trip tax goes from N× to 1×. In the run above that was a 5× speedup on 20 calls, for the same 0.1 USDC — and the win grows with the batch size. Same math, same price, a fraction of the wait.
Related
- Chaining x402 paid tool calls — the opposite pattern: when calls are dependent, chain instead of batch
- Add 73 quant tools to your agent with MCP — expose the batch endpoint (and all 73) as agent tools in one config line
- Black-Scholes calculator — the per-option math a chain batch runs at scale
- API docs — the full endpoint catalog you can mix inside a single batch