Zero-hop latency
Retrieval from device memory in under 10 ms. No round-trip to a server. Answers feel instant.
Sub-10 ms retrieval for conversational AI, voice assistants, and multimodal agents. On-device, in-browser, or in the cloud.
Moss is a high-performance runtime for real-time semantic search. It delivers sub-10 ms retrieval, instant index updates, and zero infrastructure overhead. It runs wherever your intelligence lives — in-browser, on-device, or in the cloud — so search feels native and effortless.
TIP
Head to Moss Portal to set up projects and start building with sub-10ms search.
npm install @inferedge/mosspip install inferedge-mossimport { MossClient } from '@inferedge/moss'
const client = new MossClient(process.env.PROJECT_ID!, process.env.PROJECT_KEY!)
await client.createIndex('docs', [{ id: '1', text: 'Vector search in production' }])
await client.loadIndex('docs')
const results = await client.query('docs', 'production search tips')from inferedge_moss import MossClient, DocumentInfo
client = MossClient("$PROJECT_ID", "$PROJECT_KEY")
await client.create_index(
"docs",
[DocumentInfo(id="1", text="Vector search in production")],
)
await client.load_index("docs")
results = await client.query("docs", "production search tips")| Task | Link |
|---|---|
| Project setup and credentials | Getting Started |
| JavaScript usage and API docs | JavaScript SDK |
| Python usage and API docs | Python SDK |
For support, commercial licensing, or partnership inquiries: contact@moss.dev