Strategies

  • Vector similarity (semantic)
  • Keyword/BM25
  • Hybrid (best of both)

import { MossClient } from '@inferedge/moss'

// Project credentials are read from the environment
const client = new MossClient(process.env.MOSS_PROJECT_ID!, process.env.MOSS_PROJECT_KEY!)
await client.loadIndex('my-index')

// Top 5 results with the default weighting
const byVector = await client.query('my-index', 'getting started latency', 5)

// Metadata filtering (mirrors moss-samples)
const filtered = await client.query('my-index', 'refund policy', 5, {
  filters: { category: 'faq', lang: 'en' }
})

Hybrid weighting (alpha)

  • alpha = 1.0: pure semantic (embeddings)
  • alpha = 0.0: pure keyword
  • Values between 0 and 1 blend the two (the default is semantic-heavy, e.g., ~0.8)

// Blend semantic and keyword scores (60% semantic, 40% keyword)
const hybrid = await client.query('my-index', 'return policy', 3, { alpha: 0.6 })

// Pure keyword
const keywordOnly = await client.query('my-index', 'return policy', 3, { alpha: 0.0 })

// Pure semantic
const semanticOnly = await client.query('my-index', 'return policy', 3, { alpha: 1.0 })
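
To build intuition for what alpha does, here is a minimal sketch that assumes the two scores are combined by linear interpolation. This page does not state how Moss blends scores internally, so treat the formula as an illustrative assumption. The ScoredDoc shape and the blendScore and rankByAlpha helpers are hypothetical and not part of the Moss SDK.

```typescript
// Hypothetical shape: each candidate carries two scores already normalized to [0, 1].
interface ScoredDoc {
  id: string
  semantic: number
  keyword: number
}

// Assumed linear blend: alpha = 1.0 keeps only the semantic score,
// alpha = 0.0 keeps only the keyword score.
function blendScore(doc: ScoredDoc, alpha: number): number {
  if (alpha < 0 || alpha > 1) throw new RangeError('alpha must be between 0 and 1')
  return alpha * doc.semantic + (1 - alpha) * doc.keyword
}

// Rank candidates by blended score, highest first, without mutating the input.
function rankByAlpha(docs: ScoredDoc[], alpha: number): ScoredDoc[] {
  return [...docs].sort((a, b) => blendScore(b, alpha) - blendScore(a, alpha))
}

const candidates: ScoredDoc[] = [
  { id: 'paraphrase', semantic: 0.9, keyword: 0.1 },
  { id: 'exact-term', semantic: 0.3, keyword: 0.95 },
]
```

Under this assumption, a document that only paraphrases the query wins at high alpha, while a document that repeats the exact terms wins at low alpha. That is why keyword-heavy queries such as product codes or error strings tend to benefit from a lower alpha.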

Reranking

Apply a reranker to reorder the top-k results when precision matters more than raw recall.
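
As a rough illustration of the rerank step, the sketch below rescores retrieved hits with a pluggable scoring function and keeps the best few. The Hit shape, the RerankScorer type, and the rerank and termOverlap helpers are all hypothetical; this page does not describe a reranking API in the Moss SDK. The term-overlap scorer is a toy stand-in for a real model such as a cross-encoder.

```typescript
// Hypothetical result shape; the real SDK response may differ.
interface Hit {
  id: string
  text: string
  score: number
}

// Hypothetical scorer signature: a higher return value means more relevant.
type RerankScorer = (query: string, text: string) => number

// Rescore every hit against the query, sort by the new score, keep the top N.
// Returns new objects, so the original hits are left untouched.
function rerank(query: string, hits: Hit[], scorer: RerankScorer, topN: number): Hit[] {
  return hits
    .map(hit => ({ ...hit, score: scorer(query, hit.text) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topN)
}

// Toy scorer for illustration only: fraction of query terms present in the text.
const termOverlap: RerankScorer = (query, text) => {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean)
  if (terms.length === 0) return 0
  const words = new Set(text.toLowerCase().split(/\W+/))
  return terms.filter(term => words.has(term)).length / terms.length
}
```

A typical pattern is to retrieve a wider candidate set (say, 20 to 50 hits) and rerank down to the handful that will actually be shown or passed to a model.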

Tuning

  • Adjust k and score thresholds
  • Use metadata filters
  • Group queries by intent (e.g., returns, billing, onboarding) and tune per index
  • Choose a model per index: moss-minilm (fast) or moss-mediumlm (more accurate)
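
One way to combine k with a score threshold is to filter client-side after the query returns. The sketch below assumes each result exposes a numeric score field where higher is better; the ScoredHit shape and the applyThreshold helper are hypothetical, so check the actual response type before relying on it.

```typescript
// Hypothetical minimal result shape: anything with an id and a numeric score.
interface ScoredHit {
  id: string
  score: number
}

// Drop hits below minScore, then return at most k of the rest, best first.
function applyThreshold<T extends ScoredHit>(hits: T[], minScore: number, k: number): T[] {
  return hits
    .filter(hit => hit.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
}
```

Starting with a generous k and a loose threshold, then tightening both against a small set of labeled queries per intent, is usually easier than guessing values up front.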