Overview
The Tekimax SDK includes an optional Redis adapter that helps you reduce costs, prevent rate limit errors, enforce token budgets, and manage conversation sessions — all without adding Redis as a core dependency.
Works with any Redis-compatible client: ioredis, @upstash/redis, node-redis, etc.
Installation
The SDK defines a minimal RedisClient interface, so you bring your own Redis client rather than taking on a hard dependency.
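Conceptually, the interface only needs the handful of commands the utilities rely on. A rough sketch (the method set below is an assumption for illustration; check the RedisClient type exported by tekimax-ts for the exact shape):

// Sketch of a minimal Redis client interface. This is an assumption
// for illustration; consult the SDK's exported RedisClient type.
interface RedisClient {
  get(key: string): Promise<string | null>
  set(key: string, value: string, ...args: unknown[]): Promise<unknown>
  incrby(key: string, amount: number): Promise<number>
  expire(key: string, seconds: number): Promise<number>
  del(key: string): Promise<number>
}

Install whichever compatible client you prefer: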
npm install ioredis # or @upstash/redis, redis, etc.

import Redis from 'ioredis'
import {
ResponseCache,
RateLimiter,
TokenBudget,
SessionStore,
} from 'tekimax-ts'
const redis = new Redis(process.env.REDIS_URL)

Response Caching
Cache identical chat completion calls so you don't pay twice for the same request. The cache key is a SHA-256 hash of the model name plus the serialized messages.
Typical cost savings: 50–80% on workloads with many repeated queries.
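The key derivation is roughly equivalent to the sketch below (the 'tekimax:cache:' prefix is an assumption for illustration; the SDK's actual implementation may differ in detail):

import { createHash } from 'node:crypto'

// SHA-256 over the model name plus the serialized messages.
// The key prefix is an assumption for illustration.
function cacheKeySketch(model: string, messages: unknown[]): string {
  const hash = createHash('sha256')
    .update(model + JSON.stringify(messages))
    .digest('hex')
  return `tekimax:cache:${hash}`
}

Basic usage: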
import { Tekimax, OpenAIProvider, ResponseCache } from 'tekimax-ts'
import Redis from 'ioredis'
const redis = new Redis(process.env.REDIS_URL)
const cache = new ResponseCache(redis, { ttl: 3600 })
// Initialize
const client = new Tekimax({
provider: new OpenAIProvider({ apiKey: process.env.OPENAI_API_KEY })
})
// Check cache first
const cached = await cache.get('gpt-4o', messages)
if (cached) {
console.log('Cache hit — $0 cost')
return cached
}
// Miss → call provider
const result = await client.text.chat.completions.create({
model: 'gpt-4o',
messages
})
// Store for next time
await cache.set('gpt-4o', messages, result)

Custom Cache Keys
const cache = new ResponseCache(redis, {
ttl: 7200,
prefix: 'myapp:ai:',
keyFn: (model, messages) => {
// Cache by last user message only
const last = messages.filter(m => m.role === 'user').pop()
return `${model}:${last?.content?.slice(0, 100)}`
},
})

Rate Limiting
Track per-provider request counts with a sliding window, and check limits before making API calls to avoid 429 errors and provider-side bans.
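A sliding window is commonly implemented with a Redis sorted set: each request is stored with its timestamp as the score, entries older than the window are trimmed, and the remaining count is compared to the limit. A sketch of that pattern with ioredis (not necessarily how the SDK implements it):

import Redis from 'ioredis'

// Sliding-window check backed by a sorted set. A sketch of the general
// pattern, not the SDK's internals.
async function slidingWindowCheck(
  redis: Redis,
  key: string,
  maxRequests: number,
  windowSeconds: number,
): Promise<boolean> {
  const now = Date.now()
  // Trim entries that have left the window
  await redis.zremrangebyscore(key, 0, now - windowSeconds * 1000)
  const count = await redis.zcard(key)
  if (count >= maxRequests) return false
  // Record this request and keep the key from living forever
  await redis.zadd(key, now, `${now}:${Math.random()}`)
  await redis.expire(key, windowSeconds)
  return true
}

Basic usage with the SDK's RateLimiter: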
const limiter = new RateLimiter(redis, {
maxRequests: 60, // 60 requests
windowSeconds: 60, // per minute
})
// Check before calling
const { allowed, remaining } = await limiter.check('openai')
if (!allowed) {
console.log('Rate limit hit — waiting or switching providers')
// Use FallbackProvider to switch to another provider
}
// Record the request
const status = await limiter.record('openai')
console.log(`Remaining: ${status.remaining}/60`)

Per-Model Rate Limits
// Track limits per model, not just per provider
const modelLimiter = new RateLimiter(redis, {
maxRequests: 10,
windowSeconds: 60,
prefix: 'tekimax:model-limit:',
})
await modelLimiter.check('gpt-4o') // separate limit
await modelLimiter.check('gpt-4o-mini') // separate limit

Token Budget Tracking
Track daily or monthly token usage per API key. Prevent surprise bills by enforcing hard limits that block requests when budgets are exceeded.
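A period-scoped counter like this is commonly built from INCRBY plus an expiry started on the first write of the period. A sketch of that pattern with ioredis (not necessarily how the SDK implements it):

import Redis from 'ioredis'

// Period-scoped usage counter. A sketch of the general pattern,
// not the SDK's internals.
async function recordTokens(
  redis: Redis,
  key: string,
  tokens: number,
  periodSeconds: number,
): Promise<number> {
  const used = await redis.incrby(key, tokens)
  // First write in this period, so start the expiry clock
  if (used === tokens) {
    await redis.expire(key, periodSeconds)
  }
  return used
}

Checking and recording against the SDK's TokenBudget looks like this: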
const budget = new TokenBudget(redis, {
maxTokens: 100_000, // 100K tokens
periodSeconds: 86400, // per day (24h)
})
// Check budget before call
const { allowed, used, remaining } = await budget.check('openai-prod')
if (!allowed) {
throw new Error(`Daily token budget exhausted (${used.toLocaleString()} used)`)
}
// After a successful call, record token usage
const result = await client.text.chat.completions.create({
model: 'gpt-4o',
messages
})
await budget.record('openai-prod', result.usage?.totalTokens ?? 0)

Multi-Tier Budgets
// Per-user daily limit
const userBudget = new TokenBudget(redis, {
maxTokens: 10_000,
periodSeconds: 86400,
prefix: 'tekimax:user-budget:',
})
// Organization monthly limit
const orgBudget = new TokenBudget(redis, {
maxTokens: 1_000_000,
periodSeconds: 2_592_000, // 30 days
prefix: 'tekimax:org-budget:',
})
// Check both before making a call
const user = await userBudget.check(`user:${userId}`)
const org = await orgBudget.check(`org:${orgId}`)
if (!user.allowed) throw new Error('Personal daily limit reached')
if (!org.allowed) throw new Error('Organization monthly limit reached')

Session Storage
Lightweight conversation state for serverless and edge deployments. Store and retrieve message history with automatic TTL expiry, which is typically faster than a full database for ephemeral state.
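A store like this typically serializes the session to JSON and writes it with a TTL. A sketch of that pattern (the key prefix is an assumption, and this is not necessarily the SDK's internals):

import Redis from 'ioredis'

// JSON-with-TTL persistence. The 'tekimax:session:' prefix is an
// assumption for illustration.
async function saveSessionSketch(
  redis: Redis,
  id: string,
  session: unknown,
  ttlSeconds: number,
): Promise<void> {
  await redis.set(`tekimax:session:${id}`, JSON.stringify(session), 'EX', ttlSeconds)
}

async function loadSessionSketch<T>(redis: Redis, id: string): Promise<T | null> {
  const raw = await redis.get(`tekimax:session:${id}`)
  return raw ? (JSON.parse(raw) as T) : null
}

Using the SDK's SessionStore: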
const sessions = new SessionStore(redis, { ttl: 1800 }) // 30 min
// Save conversation after each turn
await sessions.save('user-123', {
messages: [
{ role: 'user', content: 'Hello' },
{ role: 'assistant', content: 'Hi! How can I help?' },
],
metadata: {
model: 'gpt-4o',
startedAt: Date.now(),
provider: 'openai',
},
})
// Restore on next request (e.g., in a serverless function)
const session = await sessions.load('user-123')
if (session) {
// Continue the conversation
session.messages.push({ role: 'user', content: 'Tell me a joke' })
const response = await client.text.chat.completions.create({
model: 'gpt-4o',
messages: session.messages
})
session.messages.push(response.message)
await sessions.save('user-123', session)
}
// Extend session TTL without modifying data
await sessions.touch('user-123')
// End conversation
await sessions.destroy('user-123')

Combining All Utilities
Here's a complete example that uses all four Redis utilities together:
import Redis from 'ioredis'
import {
Tekimax,
OpenAIProvider,
ResponseCache,
RateLimiter,
TokenBudget,
SessionStore,
} from 'tekimax-ts'
const redis = new Redis(process.env.REDIS_URL)
const client = new Tekimax({
provider: new OpenAIProvider({ apiKey: process.env.OPENAI_API_KEY })
})
const cache = new ResponseCache(redis, { ttl: 3600 })
const limiter = new RateLimiter(redis, { maxRequests: 60, windowSeconds: 60 })
const budget = new TokenBudget(redis, { maxTokens: 100_000, periodSeconds: 86400 })
const sessions = new SessionStore(redis, { ttl: 1800 })
async function chat(userId: string, message: string) {
// 1. Load session
const session = await sessions.load(userId) ?? { messages: [] }
session.messages.push({ role: 'user', content: message })
// 2. Check cache
const cached = await cache.get('gpt-4o', session.messages)
if (cached) return cached
// 3. Check rate limit
const { allowed: rateOk } = await limiter.check('openai')
if (!rateOk) throw new Error('Rate limit — try again shortly')
// 4. Check token budget
const { allowed: budgetOk } = await budget.check(userId)
if (!budgetOk) throw new Error('Daily token budget exceeded')
// 5. Make the call
const result = await client.text.chat.completions.create({
model: 'gpt-4o',
messages: session.messages
})
await limiter.record('openai')
await budget.record(userId, result.usage?.totalTokens ?? 0)
// 6. Cache + save session
await cache.set('gpt-4o', session.messages, result)
session.messages.push(result.message)
await sessions.save(userId, session)
return result
}
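Calling the helper then looks like this:

// Rate-limit and budget failures surface as thrown errors; map them to
// HTTP 429 (or similar) in your request handler.
const reply = await chat('user-123', 'Summarize the Redis utilities')
console.log(reply.message)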