Retrieval

Strategies

Vector similarity (semantic): matches by meaning, using embeddings
Keyword/BM25: matches on exact terms
Hybrid: blends semantic and keyword scores

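import { MossClient } from '@inferedge/moss'
// Credentials are read from the environment; `!` asserts they are set
const client = new MossClient(process.env.MOSS_PROJECT_ID!, process.env.MOSS_PROJECT_KEY!)
// Load the index before querying it
await client.loadIndex('my-index')
// Fetch the top 5 matches using the default weighting
const byVector = await client.query('my-index', 'getting started latency', 5)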

// Metadata filtering (mirrors moss-samples)
const filtered = await client.query('my-index', 'refund policy', 5, {
  filters: { category: 'faq', lang: 'en' }
})

Hybrid weighting (`alpha`)

alpha = 1.0: pure semantic (embeddings)
alpha = 0.0: pure keyword
Values between 0 and 1 blend the two (default is semantic-heavy, e.g., ~0.8)

// Blend semantic and keyword scores (60% semantic, 40% keyword)
const hybrid = await client.query('my-index', 'return policy', 3, { alpha: 0.6 })

// Pure keyword
const keywordOnly = await client.query('my-index', 'return policy', 3, { alpha: 0.0 })

// Pure semantic
const semanticOnly = await client.query('my-index', 'return policy', 3, { alpha: 1.0 })
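
To build intuition for what `alpha` does, the sketch below blends two scores with a simple weighted sum. This is an illustrative assumption, not a description of how Moss combines scores internally: the `blendScores` helper is hypothetical, and both input scores are assumed to be normalized to the range 0 to 1.

```typescript
// Hypothetical helper: linear blend of a semantic and a keyword score.
// Assumes both scores are already normalized to the range [0, 1].
function blendScores(semantic: number, keyword: number, alpha: number): number {
  if (alpha < 0 || alpha > 1) {
    throw new RangeError('alpha must be between 0 and 1')
  }
  return alpha * semantic + (1 - alpha) * keyword
}

// A document that matches strongly on meaning but weakly on exact terms.
const semanticScore = 0.9
const keywordScore = 0.2

const pureSemantic = blendScores(semanticScore, keywordScore, 1.0) // semantic score only
const pureKeyword = blendScores(semanticScore, keywordScore, 0.0) // keyword score only
const blended = blendScores(semanticScore, keywordScore, 0.6) // weighted toward semantic
```

Under this assumption, lowering `alpha` pulls the document's final score toward its weaker keyword match, which is why exact-term queries (SKUs, error codes) often benefit from a lower value.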

Reranking

Apply a reranker to reorder the top-k results for higher precision.
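
A minimal sketch of the idea, assuming a client-side second pass over results you have already retrieved. The `Candidate` shape, `overlapScore`, and `rerank` are hypothetical names for illustration and are not part of the Moss SDK. The term-overlap scorer is only a stand-in; a production reranker would typically be a cross-encoder or similar model.

```typescript
// Hypothetical shape for a retrieved candidate; not the Moss result type.
interface Candidate {
  id: string
  text: string
  score: number // first-stage retrieval score
}

// Stand-in scorer: fraction of query terms that appear in the text.
function overlapScore(query: string, text: string): number {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean)
  if (terms.length === 0) return 0
  const words = new Set(text.toLowerCase().split(/\W+/).filter(Boolean))
  const hits = terms.filter((t) => words.has(t)).length
  return hits / terms.length
}

// Re-score the first-stage candidates and keep the best `keep` of them.
function rerank(query: string, candidates: Candidate[], keep: number): Candidate[] {
  return candidates
    .map((c) => ({ ...c, score: overlapScore(query, c.text) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, keep)
}

const candidates: Candidate[] = [
  { id: 'a', text: 'Shipping times for international orders', score: 0.82 },
  { id: 'b', text: 'Our return policy allows refunds within 30 days', score: 0.8 },
  { id: 'c', text: 'Policy updates for account security', score: 0.79 },
]

// Candidate 'b' moves to the top because it contains both query terms.
const reranked = rerank('return policy', candidates, 2)
```

The usual pattern is to retrieve a larger k in the first stage, then rerank and keep only the few results you pass downstream.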

Tuning

Adjust k and score thresholds
Use metadata filters
Group queries by intent (e.g., returns, billing, onboarding) and tune per index
Choose model per index: moss-minilm (fast) or moss-mediumlm (more accurate)
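
As a sketch of the first tip, the snippet below applies a score threshold and a k cutoff to a result list on the client side. The `ScoredDoc` shape and the `applyThreshold` helper are assumptions for illustration; the actual Moss response type and score range may differ, so treat the 0.5 cutoff as a placeholder to tune against your own data.

```typescript
// Hypothetical result shape; the real Moss response may differ.
interface ScoredDoc {
  id: string
  score: number
}

// Keep at most k results whose score meets the minimum threshold.
function applyThreshold(results: ScoredDoc[], k: number, minScore: number): ScoredDoc[] {
  return results
    .filter((r) => r.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
}

const raw: ScoredDoc[] = [
  { id: 'faq-12', score: 0.91 },
  { id: 'faq-07', score: 0.64 },
  { id: 'faq-33', score: 0.38 },
]

// With a 0.5 threshold, the weakest match is dropped even though k allows 5.
const kept = applyThreshold(raw, 5, 0.5)
```

A threshold guards against padding the result list with weak matches when few documents are relevant, while k caps the list when many are.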
