Factor Mining for China A-Shares: Seven Factor Types & Anti-Overfitting Discipline

Factor mining is where quant research starts, but telling a real factor apart from one that merely looks good in a backtest requires strict discipline. This post outlines a factor taxonomy for China A-shares and the validation pipeline we use (methodology only; we do not publish factor names, formulas, or parameters).

The seven factor categories

A-share quant factors fall into seven broad families, each drawing on a different signal source, from price and volume behavior to microstructure, capital flows, and fundamentals:

Type                 Signal source
-------------------  ----------------------------------------------------
Momentum             Persistence of price trends
Reversal             Short-term mean reversion
Volatility / Volume  Volatility regime + volume structure
Microstructure       Order book, tick-level, large-order classification
Money-flow           Main capital, Northbound flow, Dragon List
Fundamentals         F10, financial statements, valuation, dividends
ML-synthesized       Non-linear composition of multiple primitive factors

Every type depends on a clean, consistent data substrate, which is what ReachRich provides: a unified data contract, adjusted-price continuity, and multi-source cross-validation.

Why "backtest-pretty" is not enough

The single biggest factor-mining trap is overfitting: test enough variants on the same history, and you will almost certainly find one with a great-looking Sharpe ratio, even though it is fitting noise, not signal. To separate real alpha from noise, every candidate factor must pass three gates:

  1. Out-of-Sample (OOS): re-validate on data segments the mining process never saw.
  2. DSR (Deflated Sharpe Ratio) multiple-testing correction: explicitly adjust for how many variants were tried. See our deep-dive on CPCV (Combinatorial Purged Cross-Validation) + DSR backtesting.
  3. Transaction-cost gate: the strategy must remain net positive after impact cost, fees, and slippage.
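
To make the second gate concrete, here is a minimal sketch of a Deflated Sharpe Ratio check, following the formulation published by Bailey and López de Prado. It is an illustration under stated assumptions, not our production pipeline: the function names, the 200-variant / 500-observation experiment, and the use of pure Gaussian noise as stand-in "factors" are all hypothetical choices made for this example. Sharpe ratios are per-period (not annualized), and the risk-free rate is assumed to be zero.

```python
import math
import random
from statistics import NormalDist, mean, stdev, variance

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
STD_NORMAL = NormalDist()         # standard normal: mean 0, sigma 1


def sharpe(returns):
    """Per-period (not annualized) Sharpe ratio, risk-free rate assumed zero."""
    return mean(returns) / stdev(returns)


def expected_max_sharpe(n_trials, sharpe_variance):
    """Approximate expected best Sharpe among n_trials zero-skill variants."""
    if n_trials < 2:
        raise ValueError("need at least two trials")
    z_a = STD_NORMAL.inv_cdf(1.0 - 1.0 / n_trials)
    z_b = STD_NORMAL.inv_cdf(1.0 - 1.0 / (n_trials * math.e))
    return math.sqrt(sharpe_variance) * ((1.0 - EULER_GAMMA) * z_a + EULER_GAMMA * z_b)


def deflated_sharpe_ratio(sr_hat, n_obs, n_trials, sharpe_variance, skew=0.0, kurt=3.0):
    """Probabilistic Sharpe ratio measured against the best-of-noise benchmark.

    sr_hat and sharpe_variance are per-period. skew and kurt describe the
    candidate's returns (kurt is non-excess, so 3.0 for a normal distribution).
    """
    sr0 = expected_max_sharpe(n_trials, sharpe_variance)
    denom = math.sqrt(1.0 - skew * sr_hat + (kurt - 1.0) / 4.0 * sr_hat ** 2)
    z = (sr_hat - sr0) * math.sqrt(n_obs - 1) / denom
    return STD_NORMAL.cdf(z)


def simulate_noise_sharpes(n_trials, n_obs, seed=0):
    """Sharpe ratios of n_trials pure-noise 'factors' (zero true edge)."""
    rng = random.Random(seed)
    return [
        sharpe([rng.gauss(0.0, 0.01) for _ in range(n_obs)])
        for _ in range(n_trials)
    ]


# Hypothetical experiment: 200 variants, 500 daily observations each.
noise_sharpes = simulate_noise_sharpes(n_trials=200, n_obs=500)
best_sr = max(noise_sharpes)
dsr = deflated_sharpe_ratio(
    best_sr,
    n_obs=500,
    n_trials=200,
    sharpe_variance=variance(noise_sharpes),
)

print(f"best per-period Sharpe from pure noise: {best_sr:.3f}")
print(f"same figure annualized (x sqrt(252)):   {best_sr * math.sqrt(252):.2f}")
print(f"deflated Sharpe ratio of that 'winner': {dsr:.3f}")
```

The point of the experiment is the contrast between the last two printed lines: the best of many noise variants can look attractive once annualized, while its DSR should sit well short of a conventional confidence threshold such as 0.95, because the benchmark it has to beat rises with the number of variants tried. In real use the skew and kurtosis arguments would be estimated from the candidate's own return series rather than left at their normal-distribution defaults.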

We publish methodology, not factor IP

Serious quant teams disclose validation methodology and out-of-sample performance, not factor names, formulas, parameters, or model weights. What is reproducible is the discipline, not the alpha itself.