Caching

Caching avoids redundant provider calls for identical or similar requests. Nexus generates deterministic cache keys from the request content and supports semantic matching for near-duplicate detection.

Cache Interface

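// Cache is the interface a cache store implements.
type Cache interface {
    // Get returns the response stored under key.
    Get(ctx context.Context, key string) (*provider.CompletionResponse, error)
    // Set stores resp under key for the duration ttl.
    Set(ctx context.Context, key string, resp *provider.CompletionResponse, ttl time.Duration) error
    // Delete removes the entry stored under key.
    Delete(ctx context.Context, key string) error
    // Clear removes all entries.
    Clear(ctx context.Context) error
}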

Enabling Caching

import (
    "github.com/xraph/nexus"
    "github.com/xraph/nexus/cache/stores"
)

gw := nexus.New(
    nexus.WithCache(stores.NewMemory(1000)), // in-memory LRU store holding up to 1000 entries
)

Redis Cache

// redisClient is a Redis client you have already constructed elsewhere.
gw := nexus.New(
    nexus.WithCache(stores.NewRedis(redisClient)),
)

Cache Keys

Keys are a SHA-256 hash of the request's model, messages, temperature, and other parameters. Because the hash is deterministic, identical requests always map to the same cache entry.
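
The idea can be sketched as follows. This is not the Nexus key function: the `Request` and `Message` types, the choice of JSON as the canonical serialization, and the `CacheKey` name are all assumptions made for illustration. The sketch relies on `encoding/json` writing struct fields in declaration order, so two equal requests serialize to the same bytes and therefore the same digest.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// Message and Request are hypothetical stand-ins for the Nexus request types.
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type Request struct {
	Model       string    `json:"model"`
	Messages    []Message `json:"messages"`
	Temperature float64   `json:"temperature"`
}

// CacheKey serializes the request to JSON and returns the hex-encoded
// SHA-256 digest of those bytes.
func CacheKey(req Request) (string, error) {
	b, err := json.Marshal(req)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	req := Request{
		Model:       "example-model", // placeholder model name
		Messages:    []Message{{Role: "user", Content: "What is a cache?"}},
		Temperature: 0.2,
	}
	k1, _ := CacheKey(req)

	req.Temperature = 0.7 // any parameter change yields a different key
	k2, _ := CacheKey(req)

	fmt.Println(k1)
	fmt.Println(k1 == k2)
}
```

Hashing every parameter means that even a small change, such as a different temperature, produces a separate entry. That is the safe default, since those parameters can change the provider's output.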

Semantic Matching

For near-duplicate detection, provide a SemanticMatcher:

type SemanticMatcher interface {
    Match(ctx context.Context, key string, threshold float64) (string, bool, error)
}

When a SemanticMatcher is provided, a cache lookup first tries an exact key match and falls back to semantic similarity only if that misses.