diff --git a/docs/reference/prompt-caching.md b/docs/reference/prompt-caching.md
index 8233fd9de3a..67561e4a21b 100644
--- a/docs/reference/prompt-caching.md
+++ b/docs/reference/prompt-caching.md
@@ -9,6 +9,36 @@ read_when:
 
 # Prompt caching
 
+Prompt caching means the model provider can reuse unchanged prompt prefixes (usually system/developer instructions and other stable context) across turns instead of re-processing them every time. The first matching request writes cache tokens (`cacheWrite`), and later matching requests can read them back (`cacheRead`).
+
+Why this matters: lower token cost, faster responses, and more predictable performance for long-running sessions. Without caching, repeated prompts pay the full prompt cost on every turn even when most input did not change.
+
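+A minimal sketch of that arithmetic, assuming a hypothetical `Usage` shape and illustrative dollar rates (the `cacheRead`/`cacheWrite` names mirror the counters above; the 1.25x write / 0.1x read multipliers follow Anthropic's published cache pricing, linked below):
+
+```ts
+// Sketch: estimate per-turn input cost from the cache counters.
+// The Usage shape and dollar rates here are illustrative assumptions;
+// `cacheRead`/`cacheWrite` mirror the counter names used above.
+interface Usage {
+  input: number;      // uncached input tokens, billed at the base rate
+  cacheWrite: number; // tokens written to the cache this turn
+  cacheRead: number;  // tokens served from the cache this turn
+}
+
+const BASE_RATE = 3.0 / 1_000_000;   // e.g. $3 per million input tokens
+const WRITE_RATE = BASE_RATE * 1.25; // cache writes bill at a premium
+const READ_RATE = BASE_RATE * 0.1;   // cache reads bill at a steep discount
+
+function inputCost(u: Usage): number {
+  return u.input * BASE_RATE + u.cacheWrite * WRITE_RATE + u.cacheRead * READ_RATE;
+}
+
+// Turn 1 writes a 10k-token prefix; turn 2 reads it back.
+const turn1 = inputCost({ input: 500, cacheWrite: 10_000, cacheRead: 0 });
+const turn2 = inputCost({ input: 500, cacheWrite: 0, cacheRead: 10_000 });
+console.log(turn1.toFixed(4), turn2.toFixed(4)); // 0.0390 vs 0.0045
+```
+
 This page covers all cache-related knobs that affect prompt reuse and token cost.
 
 For Anthropic pricing details, see: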