Caching
Cache LLM responses with deterministic keys and optional semantic matching.
Caching avoids redundant provider calls for identical or similar requests. Nexus generates deterministic cache keys from the request content and supports semantic matching for near-duplicate detection.
Cache Interface
type Cache interface {
	Get(ctx context.Context, key string) (*provider.CompletionResponse, error)
	Set(ctx context.Context, key string, resp *provider.CompletionResponse, ttl time.Duration) error
	Delete(ctx context.Context, key string) error
	Clear(ctx context.Context) error
}
Enabling Caching
import "github.com/xraph/nexus/cache/stores"

gw := nexus.New(
	nexus.WithCache(stores.NewMemory(1000)), // LRU, 1000 entries
)
Redis Cache
gw := nexus.New(
	nexus.WithCache(stores.NewRedis(redisClient)),
)
Cache Keys
Each key is a SHA-256 hash of the request's model, messages, temperature, and other parameters. Because the hash is deterministic, identical requests always map to the same cache entry.
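To illustrate the idea, here is a minimal sketch of deterministic key derivation. It is not the Nexus implementation: the `Request` and `Message` types, the `CacheKey` function, and the choice of JSON as the serialization format are all assumptions made for this example. The source only states that the key is a SHA-256 hash over the request parameters.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// Message and Request are hypothetical stand-ins for the real request types.
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type Request struct {
	Model       string    `json:"model"`
	Messages    []Message `json:"messages"`
	Temperature float64   `json:"temperature"`
}

// CacheKey (hypothetical) hashes the JSON encoding of req with SHA-256 and
// returns the digest as a hex string. encoding/json writes struct fields in
// declaration order, so equal requests encode to equal bytes and therefore
// produce equal keys.
func CacheKey(req Request) (string, error) {
	b, err := json.Marshal(req)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	req := Request{
		Model:       "example-model", // placeholder model name
		Messages:    []Message{{Role: "user", Content: "Hello"}},
		Temperature: 0.2,
	}
	key, err := CacheKey(req)
	if err != nil {
		panic(err)
	}
	fmt.Println(key) // 64 hex characters
}
```

Any parameter that affects the response has to be part of the hashed input. Under this sketch, changing only the temperature yields a different key, so the two requests are cached separately.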
Semantic Matching
For near-duplicate detection, provide a SemanticMatcher:
type SemanticMatcher interface {
	Match(ctx context.Context, key string, threshold float64) (string, bool, error)
}
When enabled, cache lookups first try an exact key match, then fall back to semantic similarity.