What each metric actually measures
All three answer the same question — "how much return did this strategy generate per unit of risk?" — but they disagree about what counts as risk. That disagreement is the entire reason multiple metrics exist.
Sharpe: volatility is risk
Sharpe = (mean return − risk-free rate) / standard deviation of returns
The Sharpe ratio (Sharpe, 1966) treats any deviation from the mean as risk, upside surprises included. A strategy that returned +30% one month and +5% the next gets the same volatility penalty as one that returned +5% and -20%, because both pairs sit 25 points apart. That matches the definition of risk used in mean-variance portfolio theory and CAPM, but it doesn't match how investors actually experience risk.
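A minimal sketch of the formula, assuming per-period (here monthly) returns expressed as decimals and square-root-of-time annualization. The helper name `sharpe_ratio` and its parameters are illustrative, not taken from any particular library. The last lines check the two-month example: both pairs have the same standard deviation, so Sharpe's denominator cannot tell them apart.

```python
import math
import statistics

def sharpe_ratio(returns, risk_free=0.0, periods_per_year=12):
    """Annualized Sharpe ratio from per-period returns (hypothetical helper)."""
    excess = [r - risk_free for r in returns]
    mean = statistics.mean(excess)
    stdev = statistics.stdev(excess)  # sample standard deviation
    return mean / stdev * math.sqrt(periods_per_year)

# The two-month examples from the text: each pair is 25 points apart,
# so the standard deviation (the Sharpe denominator) is the same.
upside_surprise = [0.30, 0.05]
downside_surprise = [0.05, -0.20]
print(statistics.stdev(upside_surprise))
print(statistics.stdev(downside_surprise))
```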
Sortino: only downside is risk
Sortino = (mean return − target) / downside deviation (returns below target only)
The Sortino ratio (Sortino & Price, 1994) fixes the symmetry problem. It only counts returns that fall below a target threshold (usually zero or the risk-free rate) toward the denominator. Upside volatility is no longer penalized. This matches the intuition that big winning months are not risk — they're the point.
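The following sketch shows one common way to compute it. It assumes a convention the text does not spell out: only returns below the target contribute to the downside deviation, but the squared shortfalls are averaged over all periods. The function names are hypothetical.

```python
import math

def downside_deviation(returns, target=0.0):
    """Root-mean-square shortfall below target, averaged over all periods."""
    shortfalls = [min(r - target, 0.0) for r in returns]
    return math.sqrt(sum(s * s for s in shortfalls) / len(returns))

def sortino_ratio(returns, target=0.0, periods_per_year=12):
    """Annualized Sortino ratio from per-period returns (hypothetical helper)."""
    mean_excess = sum(returns) / len(returns) - target
    dd = downside_deviation(returns, target)
    if dd == 0.0:
        # No observation fell below the target in this sample.
        return math.inf if mean_excess > 0 else math.nan
    return mean_excess / dd * math.sqrt(periods_per_year)

# Same standard deviation, very different downside deviation.
print(sortino_ratio([0.30, 0.05]))   # no month below zero
print(sortino_ratio([0.05, -0.20]))  # one large losing month
```

The first call has nothing in its denominator, which is also a reminder that a sample with no losing periods produces a Sortino that is undefined rather than informative.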
Calmar: peak-to-trough loss is risk
Calmar = annualized return / |maximum drawdown|
The Calmar ratio (Young, 1991) goes further: it doesn't care about the shape of the return distribution at all. It cares about one number — the worst peak-to-trough loss ever observed. This matches the practical question allocators ask: "What is the worst experience this strategy has put investors through?"
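As a rough sketch, maximum drawdown can be computed by tracking the running peak of the compounded equity curve. The helpers below are hypothetical and assume per-period returns as decimals, a geometric (compound) annualized return in the numerator, and the full sample as the lookback window.

```python
def max_drawdown(returns):
    """Largest peak-to-trough loss of the compounded equity curve (positive fraction)."""
    equity = 1.0
    peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst

def calmar_ratio(returns, periods_per_year=12):
    """Compound annualized return divided by maximum drawdown (hypothetical helper)."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    years = len(returns) / periods_per_year
    annualized = growth ** (1.0 / years) - 1.0
    mdd = max_drawdown(returns)
    return annualized / mdd if mdd > 0 else float("inf")

# Equity path 1.10 -> 0.88 -> 0.924 -> 0.8316: the worst loss is measured
# from the 1.10 peak to the final 0.8316 trough.
print(max_drawdown([0.10, -0.20, 0.05, -0.10]))
```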
A concrete example: three strategies, three rankings
Imagine three hypothetical strategies, each running for 5 years with the same average annualized return of 12%. The risk-free rate is taken as zero, so Sharpe is simply return divided by volatility:
| Strategy | Profile | Volatility | Max drawdown | Sharpe | Sortino | Calmar |
|---|---|---|---|---|---|---|
| A. Carry | Smooth → one blowup | 8% | 35% | 1.50 | 2.10 | 0.34 |
| B. Long-only equity | Normal-ish | 15% | 25% | 0.80 | 1.10 | 0.48 |
| C. Trend-following | Big wins, small losses | 18% | 12% | 0.67 | 1.45 | 1.00 |
By Sharpe, Strategy A (carry) looks best at 1.50. By Sortino, A is still best at 2.10, but C jumps past B into second place (1.45 vs. 1.10). By Calmar, the ranking inverts: A is worst (0.34) and C is best (1.00).
Same returns, same period, three different rankings. The metric you pick determines the answer. That's why allocators report all three.
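The Sharpe and Calmar columns can be reproduced from the table's own inputs, which makes the change in ordering easy to check. This sketch assumes a zero risk-free rate (implied by the Sharpe column) and copies the Sortino values from the table, since they cannot be derived from volatility alone.

```python
annual_return = 0.12
risk_free = 0.0  # assumption: implied by the table, where Sharpe = return / volatility

# Volatility, max drawdown, and Sortino are copied from the table.
strategies = {
    "A": {"vol": 0.08, "max_dd": 0.35, "sortino": 2.10},
    "B": {"vol": 0.15, "max_dd": 0.25, "sortino": 1.10},
    "C": {"vol": 0.18, "max_dd": 0.12, "sortino": 1.45},
}

sharpe = {k: (annual_return - risk_free) / v["vol"] for k, v in strategies.items()}
sortino = {k: v["sortino"] for k, v in strategies.items()}
calmar = {k: annual_return / v["max_dd"] for k, v in strategies.items()}

def ranking(scores):
    """Strategy names ordered from best to worst score."""
    return sorted(scores, key=scores.get, reverse=True)

for label, scores in [("Sharpe", sharpe), ("Sortino", sortino), ("Calmar", calmar)]:
    print(label, ranking(scores), {k: round(s, 2) for k, s in scores.items()})
```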
When each one lies
Sharpe lies on non-normal returns
Sharpe assumes returns are roughly normal. They're not, for most real strategies. Three pathological cases:
- Negative skew (option selling, carry trades, short-volatility): the strategy has many small wins and rare large losses. Until a tail event hits, Sharpe looks great. Then it doesn't.
- Fat tails (high kurtosis): the strategy occasionally has moves much larger than the normal distribution would predict. Sharpe underestimates the risk because a standard deviation, read with a normal distribution in mind, understates how often extreme moves occur.
- Short sample sizes: with fewer than ~30 monthly observations, the sample Sharpe has a wide confidence interval. A 6-month strategy with a measured Sharpe of 3.0 has a 95% interval of roughly -0.3 to +6.3 under the usual large-sample approximation, so the true value could easily be unremarkable. The probabilistic Sharpe ratio corrects for this explicitly.
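To see how wide the interval gets on a short sample, here is a sketch using a common large-sample approximation for the standard error of the Sharpe ratio. It assumes i.i.d., roughly normal per-period returns, which is exactly the assumption the cases above violate, so treat the result as a lower bound on the real uncertainty. The function name and parameters are hypothetical.

```python
import math

def sharpe_confidence_interval(annual_sharpe, n_periods, periods_per_year=12, z=1.96):
    """Approximate interval for an annualized Sharpe ratio (i.i.d. normal assumption)."""
    per_period = annual_sharpe / math.sqrt(periods_per_year)
    # Approximate standard error of the per-period Sharpe ratio.
    se = math.sqrt((1.0 + 0.5 * per_period ** 2) / n_periods)
    half_width = z * se * math.sqrt(periods_per_year)
    return annual_sharpe - half_width, annual_sharpe + half_width

# Six monthly observations versus three years of them, same measured Sharpe.
print(sharpe_confidence_interval(3.0, n_periods=6))
print(sharpe_confidence_interval(3.0, n_periods=36))
```

With six observations the interval spans several Sharpe points and includes zero. With 36 it is much tighter, though still wide.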
Sortino lies by always looking better than Sharpe
For any strategy with a positive average excess return, Sortino mechanically produces a higher number than Sharpe (using the same threshold for both) because the denominator is smaller. The ratio of Sortino to Sharpe is typically 1.3-1.7x. That's not a feature. It just means you can't directly compare a Sortino from one fund to a Sharpe from another; an apples-to-apples comparison requires using the same metric.
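The size of that mechanical gap can be worked out in closed form for an idealized case. The sketch below assumes normally distributed returns, the same threshold in both ratios, and downside deviation averaged over all periods. Here `d` is the per-period Sharpe ratio, (mean − target) / standard deviation. Real return series are not normal, so this illustrates the mechanism rather than predicting any fund's ratio.

```python
import math
from statistics import NormalDist

def sortino_over_sharpe(d):
    """Sortino / Sharpe for normal returns, with d = (mean - target) / stdev per period."""
    n = NormalDist()
    # Downside variance as a fraction of total variance:
    # E[(Z + d)^2 ; Z < -d] for a standard normal Z.
    downside_fraction = (1.0 + d * d) * n.cdf(-d) - d * n.pdf(d)
    return 1.0 / math.sqrt(downside_fraction)

print(sortino_over_sharpe(0.0))  # symmetric around the target: sqrt(2)
print(sortino_over_sharpe(0.1))
print(sortino_over_sharpe(0.2))
```

When the mean sits at the target, exactly half the variance is downside and the multiplier is the square root of 2. The multiplier grows as the mean moves above the target, because fewer observations fall below it.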
Sortino also still uses a denominator computed from past observations. A strategy that had no big down months in-sample will have an artificially small downside deviation and therefore a sky-high Sortino. The mirror image also holds: crisis-alpha strategies (long-vol, tail hedges) can look terrible by Sortino during calm regimes, because they bleed steadily and the sample contains none of the large up months they are built to deliver.
Calmar is dominated by one data point
Calmar's denominator is the single worst drawdown. That means it can change overnight if a new worst-ever drawdown happens. A strategy with a 5-year track record and Calmar 1.5 can drop to Calmar 0.6 after one bad quarter. It also means a strategy that happens not to have experienced a drawdown yet (because it's only 18 months old, or because the regime has been favorable) will show an inflated Calmar.
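A worked example with made-up numbers shows how a single quarter can do this. It assumes a hypothetical track record of 5 years at 15% a year with a worst drawdown of 10%, followed by a 17% quarterly loss that starts from an equity peak and so becomes the new maximum drawdown.

```python
def annualized_return(growth_factor, years):
    """Compound annual growth rate implied by a total growth factor."""
    return growth_factor ** (1.0 / years) - 1.0

# Hypothetical track record: 5 years at 15% a year, worst drawdown so far 10%.
growth_5y = 1.15 ** 5
calmar_before = annualized_return(growth_5y, 5.0) / 0.10

# One bad quarter: a 17% loss from an equity peak, which becomes the new max drawdown.
quarter_loss = 0.17
growth_after = growth_5y * (1.0 - quarter_loss)
calmar_after = annualized_return(growth_after, 5.25) / quarter_loss

print(round(calmar_before, 2), round(calmar_after, 2))
```

The numerator falls by about a third, from 15% to roughly 10%, while the denominator nearly doubles, so the ratio drops from about 1.5 to about 0.6.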
Calmar also doesn't care how frequently drawdowns happen. A strategy with one big drawdown and three years of recovery scores the same as a strategy that experiences the same drawdown depth annually. Some practitioners use the average drawdown (Pain Index) alongside Calmar to fix this.
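A sketch of the average-drawdown idea, assuming the Pain Index is defined as the mean of the per-period drawdown series (depth below the running peak). The two return paths are invented for illustration: both end flat and both have a 20% maximum drawdown, but one suffers it once and the other every second month.

```python
def drawdown_series(returns):
    """Per-period drawdown from the running peak of the compounded equity curve."""
    equity, peak, series = 1.0, 1.0, []
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        series.append(1.0 - equity / peak)
    return series

def pain_index(returns):
    """Average drawdown depth over the whole sample (hypothetical helper)."""
    series = drawdown_series(returns)
    return sum(series) / len(series)

one_off = [-0.20, 0.25] + [0.0] * 10  # one 20% drawdown, recovered immediately
repeated = [-0.20, 0.25] * 6          # the same 20% drawdown six times in a year

for name, path in [("one_off", one_off), ("repeated", repeated)]:
    print(name, max(drawdown_series(path)), pain_index(path))
```

Maximum drawdown, and therefore Calmar's denominator, is identical for the two paths, while the average drawdown is several times larger for the repeated case.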
What good values look like
Rough ranges for what counts as acceptable in different contexts, assuming annualized figures, after fees, computed on at least 3 years of monthly returns:
| Strategy class | Good Sharpe | Good Sortino | Good Calmar |
|---|---|---|---|
| Long-only equity | 0.5 – 0.8 | 0.7 – 1.2 | 0.3 – 0.6 |
| 60/40 balanced | 0.6 – 1.0 | 0.9 – 1.4 | 0.5 – 0.9 |
| Hedge fund (typical) | 0.8 – 1.5 | 1.2 – 2.0 | 0.7 – 1.5 |
| Quant CTA / trend | 0.6 – 1.2 | 1.0 – 2.0 | 0.5 – 1.5 |
| Market-neutral | 1.0 – 2.0 | 1.5 – 2.5 | 1.0 – 3.0 |
| Crypto strategy | 0.5 – 1.5 | 0.8 – 1.8 | 0.3 – 1.0 |
A Sharpe above 3.0 on a multi-year sample is rare and usually indicates a data error, look-ahead bias, or unaccounted-for transaction costs. Scrutinize it before allocating.
The decision rule (use this, save the rest for context)
- Always compute Sharpe. It's the industry default and the only metric an LP will compare to other managers without conversion. Use the Sharpe ratio calculator (with its 95% confidence interval) — and run the probabilistic Sharpe ratio to check whether the number is statistically meaningful.
- If the strategy has skewed returns, also compute Sortino. Skewness below -0.5 or above +0.5 is enough to make Sharpe misleading. The PSR output gives you skewness as a free side effect; check it.
- Always compute Calmar before allocating capital. The drawdown calculator gives you max drawdown, which is the Calmar denominator. If the strategy hasn't had a meaningful drawdown yet (common with track records under 2 years), assume Calmar will be lower than the backtest suggests.
- Report all three in your tearsheet. Putting just one number in front of an allocator who knows finance is a red flag. Putting all three with one-sentence interpretations is what a competent shop does.
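The skewness check in the second rule is easy to automate. This sketch uses the plain moment-based (population) skewness estimator and the ±0.5 threshold from the rule above; the function names are hypothetical, and other tools may apply a small-sample correction that gives slightly different values.

```python
import statistics

def sample_skewness(returns):
    """Moment-based skewness: third central moment over variance to the power 1.5."""
    mean = statistics.fmean(returns)
    n = len(returns)
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m3 = sum((r - mean) ** 3 for r in returns) / n
    return m3 / m2 ** 1.5

def sharpe_needs_sortino(returns, threshold=0.5):
    """True when skewness is large enough that Sharpe alone may mislead."""
    return abs(sample_skewness(returns)) > threshold

carry_like = [0.01] * 11 + [-0.10]  # many small wins, one large loss
symmetric = [0.01, -0.01] * 6       # evenly balanced wins and losses

print(sample_skewness(carry_like), sharpe_needs_sortino(carry_like))
print(sample_skewness(symmetric), sharpe_needs_sortino(symmetric))
```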
Related calculators
- Sharpe Ratio Calculator — Sharpe with a 95% confidence interval (most calculators omit the CI; it matters a lot for short samples)
- Probabilistic Sharpe Ratio Calculator — Lopez de Prado 2012 PSR adjusting for sample size, skew, and kurtosis
- Drawdown Calculator — max drawdown (the Calmar denominator), average drawdown, recovery time
- Value at Risk Calculator — parametric and historical VaR + CVaR for the same return series
- Monte Carlo Simulation Calculator — forward-projects Calmar by simulating thousands of return paths
References
- Sharpe, W. F. (1966). "Mutual fund performance." Journal of Business 39, 119-138.
- Sortino, F. & Price, L. (1994). "Performance measurement in a downside risk framework." Journal of Investing 3(3), 59-64.
- Young, T. (1991). "Calmar ratio: A smoother tool." Futures Magazine.
- Bailey, D. H. & Lopez de Prado, M. (2012). "The Sharpe ratio efficient frontier." Journal of Risk 15(1).
- Magdon-Ismail, M. & Atiya, A. (2004). "Maximum drawdown." Risk Magazine, October.