Skip to content

Format Comparison: NL vs JSON vs DCP

Core Finding

DCP positional arrays are as readable as JSON and NL for LLMs. When a model fails a task, it fails across all formats equally — format is not the bottleneck, model capability is.

This means DCP's token savings come at no accuracy cost.

Evidence

NL vs DCP accuracy testing (4 models × 2 tasks × 3 runs):

ModelTestNLDCP
phi3:minihighest_score0/33/3
phi3:minifilter_by_value0/30/3
gemma2:2bhighest_score0/33/3
gemma2:2bfilter_by_value3/33/3
llama3.2:1bhighest_score3/33/3
llama3.2:1bfilter_by_value3/30/3

DCP outperforms NL in 2 cases, ties in 3, loses in 1. No consistent disadvantage.

Shadow level testing confirmed the same pattern — L0 (DCP with field names only) matched or beat L4 (NL key-value) for phi3 and llama. See Lightweight LLM & Density for full data.

Implications

  • DCP is safe to deploy — no accuracy penalty vs JSON or NL
  • Token savings are free — same comprehension, fewer tokens
  • Model capability is the bottleneck, not data format