Lightweight LLM Compatibility & Density

DCP is viable as a system→AI data format even for sub-4B models. Read comprehension works. Generation does not. No single density level works for all models — adaptive selection is justified.

Test Environment

  • Models: phi3:mini (3.8B), gemma2:2b, qwen2.5:1.5b, llama3.2:1b, qwen2.5:0.5b
  • Environment: ollama 0.18.2, localhost, temperature=0, 3 runs per test
  • Date: 2026-03-25/26

DCP Read Comprehension

Model              basic_field_lookup   field_by_position   count_and_filter
phi3:mini (3.8B)   3/3                  3/3                 0/3
gemma2:2b          0/3                  0/3                 3/3
qwen2.5:1.5b       0/3                  3/3                 3/3
llama3.2:1b        0/3                  3/3                 3/3
qwen2.5:0.5b       3/3                  3/3                 3/3

All models can read DCP data. Failure patterns are task-type-specific, not DCP-specific.

DCP Generation

Model          valid_json   correct_order   has_all_fields
phi3:mini      3/3          0/3             0/3
gemma2:2b      3/3          0/3             3/3
qwen2.5:1.5b   3/3          0/3             0/3
llama3.2:1b    0/3          0/3             0/3
qwen2.5:0.5b   3/3          0/3             0/3

correct_order is 0/3 across all models: LLMs at this size produce valid JSON but cannot maintain positional field ordering. A system-side encoder is therefore essential in the system → AI direction. In the AI → system direction, the Output Controller places the LLM's key-value output into schema-ordered positions.
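The two mechanisms named above can be sketched as follows. This is a minimal illustration, not the actual DCP implementation: the schema dict, field order, and function names are assumptions, with the schema contents mirroring the `$S:knowledge:v1` example used later in this report.

```python
import json

# Hypothetical schema mirroring the knowledge:v1 example in this report.
KNOWLEDGE_V1 = {
    "id": "knowledge:v1",
    "fields": ["action", "domain", "detail", "confidence"],
}

def encode(record: dict, schema: dict) -> str:
    """System -> AI: emit a schema-ordered positional array.

    The system, not the LLM, is responsible for positional ordering."""
    row = [record[f] for f in schema["fields"]]
    return f'$S:{schema["id"]} {json.dumps(row)}'

def output_controller(llm_output: str, schema: dict) -> list:
    """AI -> system: accept key-value JSON from the LLM (which cannot
    maintain positional order) and place values into schema order."""
    kv = json.loads(llm_output)
    return [kv.get(f) for f in schema["fields"]]

msg = encode({"action": "learn", "domain": "physics",
              "detail": "entropy", "confidence": 0.9}, KNOWLEDGE_V1)
# The LLM may emit keys in any order; the controller restores schema order.
row = output_controller('{"confidence": 0.9, "action": "learn", '
                        '"domain": "physics", "detail": "entropy"}',
                        KNOWLEDGE_V1)
```

The design point is that key-value JSON is the one thing the generation table shows these models can reliably produce (valid_json 3/3 for four of five), so ordering is moved entirely to the system side.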

NL vs DCP Accuracy

Model          Test             NL    DCP   DCP ≥ NL
phi3:mini      highest_score    0/3   3/3   yes
phi3:mini      filter_by_value  0/3   0/3   =
gemma2:2b      highest_score    0/3   3/3   yes
gemma2:2b      filter_by_value  3/3   3/3   =
llama3.2:1b    highest_score    3/3   3/3   =
llama3.2:1b    filter_by_value  3/3   0/3   no
qwen2.5:0.5b   highest_score    0/3   0/3   =
qwen2.5:0.5b   filter_by_value  0/3   0/3   =

DCP loses to NL in only one case (llama3.2:1b filter_by_value) and outperforms it in two; structured data is easier to extract from a structured format.

Schema Density

How much schema information should accompany data? Three density levels tested:

Level         Description                       Example
Abbreviated   Schema ID only                    $S:knowledge:v1
Expanded      ID + field hints with types       $S:knowledge:v1 [action(enum) domain detail confidence:0-1]
Full          Complete JSON schema definition   {"id":"knowledge:v1","fields":[...],"types":{...}}
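The three density levels can be sketched as a single header renderer. A minimal sketch under assumptions: the schema dict, the `_hint` helper, and `render_header` are illustrative names, with the field hints taken from the expanded example above.

```python
import json

# Hypothetical schema matching the $S:knowledge:v1 expanded example.
SCHEMA = {
    "id": "knowledge:v1",
    "fields": [("action", "enum"), ("domain", None),
               ("detail", None), ("confidence", "0-1")],
}

def _hint(name: str, typ) -> str:
    """Render one field hint: name, name(enum), or name:range."""
    if typ is None:
        return name
    if typ == "enum":
        return f"{name}({typ})"
    return f"{name}:{typ}"

def render_header(schema: dict, density: str) -> str:
    sid = f'$S:{schema["id"]}'
    if density == "abbreviated":   # schema ID only
        return sid
    if density == "expanded":      # ID + field hints with types
        hints = " ".join(_hint(n, t) for n, t in schema["fields"])
        return f"{sid} [{hints}]"
    # full: complete JSON schema definition
    return json.dumps({"id": schema["id"],
                       "fields": [n for n, _ in schema["fields"]],
                       "types": {n: t for n, t in schema["fields"] if t}})
```

With this shape, switching a model between density levels is a one-argument change rather than a different encoder.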

Density Results (5 models)

Model              Abbreviated   Expanded   Full
phi3:mini (3.8B)   0/3           3/3        3/3
gemma2:2b          0/3           0/3        3/3
qwen2.5:1.5b       0/3           3/3        3/3
llama3.2:1b        3/3           0/3        3/3
qwen2.5:0.5b       3/3           0/3        0/3

No single density works for all models. Notation retest (JSON array vs custom syntax) produced identical results — notation is not the variable, model capability is.

Shadow Level Comprehension

The density results led to a further question: what if we strip all protocol information and show only field names? Tested L0 (fields only), L2 (full $S protocol), L4 (NL key-value):

Model         L0 (fields only)   L2 (full $S)   L4 (NL)
phi3:mini     9/9                6/9            6/9
gemma2:2b     3/9                6/9            3/9
llama3.2:1b   6/9                3/9            6/9

L0 (fields only) is optimal for most lightweight models; gemma2:2b is the exception, scoring highest at L2. For the others, protocol information ($S, schema ID, field count) adds no value at ≤4B and actively hurts comprehension.
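The difference between the three shadow levels is easiest to see rendered side by side. A sketch under assumptions: the exact L2 wire format (field count, bracketed field list) and the `render` function are illustrative, not the tested prompt text.

```python
def render(record: dict, level: str) -> str:
    """Render one record at a given shadow level (illustrative formats)."""
    if level == "L0":
        # Fields only: no $S, no schema ID, no field count.
        return " ".join(f"{k}={v}" for k, v in record.items())
    if level == "L2":
        # Full $S protocol: schema ID, field count, field list, then data.
        names = " ".join(record)
        values = " ".join(str(v) for v in record.values())
        return f"$S:knowledge:v1 #{len(record)} [{names}] {values}"
    # L4: natural-language key-value.
    return ", ".join(f"the {k} is {v}" for k, v in record.items())

rec = {"action": "learn", "domain": "physics", "detail": "entropy"}
```

The L0 rendering is also the cheapest in tokens, so for most sub-4B models it wins on both cost and comprehension.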

Conclusions

  1. DCP works for consumption at ≤3.8B. Token savings come at no accuracy cost.

  2. DCP generation is not achievable at this size (correct_order 0/3 everywhere). A system-side encoder is essential.

  3. Density adaptation is justified. No single level works for all models. The agent-profile system is the correct design.

  4. L0 (fields only) is the right default for lightweight agents. Strip protocol overhead, keep field names.

  5. phi3:mini (~3.8B) is the practical floor for DCP consumption.

  6. 7B+ testing needed for practical thresholds. Deferred pending hardware.
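The agent-profile design from conclusions 3 and 4 reduces to a per-model lookup with an L0 fallback. A minimal sketch: the profile entries below restate the density-table results above, but the mapping, names, and fallback policy are illustrative assumptions, not the shipped profile system.

```python
# Per-model density profiles, derived from the density results table:
# best-scoring cheapest level per model (illustrative, not exhaustive).
PROFILES = {
    "phi3:mini":    "expanded",     # 3/3 at expanded and full
    "gemma2:2b":    "full",         # only full scored 3/3
    "qwen2.5:1.5b": "expanded",     # 3/3 at expanded and full
    "llama3.2:1b":  "abbreviated",  # 3/3 at abbreviated and full
    "qwen2.5:0.5b": "abbreviated",  # only abbreviated scored 3/3
}

def density_for(model: str, default: str = "L0") -> str:
    """Pick the schema density for a model.

    Unknown lightweight models fall back to L0 (fields only),
    per conclusion 4."""
    return PROFILES.get(model, default)
```

Keeping the fallback at L0 rather than a protocol-bearing level matches the shadow-level finding that protocol overhead hurts more often than it helps below 4B.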