In-Memory Semantic Cache
The in-memory cache backend provides fast, local caching for development environments and single-instance deployments. It stores semantic embeddings and cached responses directly in memory for maximum performance.
Overview​
The in-memory cache is ideal for:
- Development and testing environments
 - Single-instance deployments
 - Quick prototyping and experimentation
 - Low-latency requirements where external dependencies should be minimized
 
Architecture​
Configuration​
Basic Configuration​
# config/config.yaml
semantic_cache:
  enabled: true
  backend_type: "memory"
  similarity_threshold: 0.8       # Global default threshold
  max_entries: 1000
  ttl_seconds: 3600
  eviction_policy: "fifo"