Overview

Add a private, context-aware AI chatbot to any website with one line of HTML, and in local mode keep your visitors’ conversations on their own devices.

InferKit is a drop-in JavaScript SDK by SynaptiCortex. Paste one <script> tag, and your page gets a chat assistant whose answers are grounded in that page’s content. On capable devices the model runs entirely in the browser via WebGPU, so the conversation never leaves the user’s machine.


Why InferKit

🔒 Privacy-first: In local mode, inference runs in-browser via WebLLM. No prompts, no page content, and no conversations are sent to any server. A genuine differentiator for healthcare, finance, legal, and EU/GDPR-sensitive sites.
Drop-in: One script tag + an API key. No backend to build, no model to host, no infra to run.
💸 Cost-efficient: When the user’s GPU does the work, we bear no inference cost, and neither do you. Remote inference is a fallback, not the default.
🎯 Grounded answers: A deterministic grounding gate refuses off-page questions before calling the model, so “what’s the capital of France?” doesn’t hijack your support bot.
🔌 Multi-provider: When remote inference is used, route to OpenAI, Groq, Anthropic, or Google Gemini, or bring your own key (BYOK).
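
To make the grounding gate idea concrete, here is a minimal sketch of one way a deterministic gate could work: compare the question's content words against the page text and refuse before any model call when too few of them appear. This is an illustration only, not InferKit's actual algorithm. The names `tokenize`, `isGrounded`, `answer`, `REFUSAL`, the stopword list, and the 0.5 overlap threshold are all assumptions made for the example.

```javascript
// Hypothetical sketch of a deterministic grounding gate.
// None of these names or thresholds come from the InferKit SDK.
const STOPWORDS = new Set([
  "the", "a", "an", "of", "is", "are", "what", "how", "do", "does",
  "to", "in", "on", "for", "and", "or", "my", "your", "it", "this", "that", "can",
]);

const REFUSAL = "Sorry, I can only answer questions about this page.";

// Lowercase, split on non-alphanumerics, drop stopwords and one-letter tokens.
// ASCII-only on purpose to keep the sketch short.
function tokenize(text) {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 1 && !STOPWORDS.has(t));
}

// True when at least `minOverlap` of the question's content words occur on the page.
function isGrounded(question, pageText, minOverlap = 0.5) {
  const pageTerms = new Set(tokenize(pageText));
  const questionTerms = tokenize(question);
  if (questionTerms.length === 0) return false;
  const hits = questionTerms.filter((t) => pageTerms.has(t)).length;
  return hits / questionTerms.length >= minOverlap;
}

// The gate runs first; `callModel` is only invoked for grounded questions.
function answer(question, pageText, callModel) {
  if (!isGrounded(question, pageText)) return REFUSAL;
  return callModel(question, pageText);
}
```

Because the check is plain string matching, the same question against the same page always gets the same decision, and an off-page question costs no inference at all. A real gate would likely need something more robust than word overlap (synonyms, non-English text), which this sketch does not attempt.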

How it works

  1. Drop in the SDK. The script mounts a chat widget and reads the current page.
  2. Pick the mode automatically. Capable browser → local (in-browser WebLLM). Otherwise → remote fallback through the InferKit API.
  3. Answer in context. The page’s content is the knowledge base; the grounding gate keeps answers on-topic.

Visitor ──▶ InferKit widget ──┬─▶ Local model (WebGPU, in-browser)   ← default, private
                              └─▶ InferKit API ──▶ LLM provider       ← fallback / paid tiers
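
The automatic mode choice in step 2 can be sketched with the standard WebGPU feature check. The function name `pickMode` and the `"local"` / `"remote"` return values are hypothetical, and the SDK's real capability test may consider more than adapter availability (for example, available memory or model size). The `nav` parameter is a navigator-like object so the logic can be exercised outside a browser.

```javascript
// Hypothetical sketch: choose local inference only when a WebGPU adapter is available.
// `pickMode` is an illustrative name, not part of the InferKit SDK.
async function pickMode(nav) {
  // Browsers with WebGPU support expose it as navigator.gpu.
  if (!nav || !nav.gpu) return "remote";
  try {
    // requestAdapter() resolves to null when no suitable GPU adapter exists.
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "local" : "remote";
  } catch {
    // Treat any failure during detection as "not capable".
    return "remote";
  }
}

// In a browser this would be called as: const mode = await pickMode(navigator);
```

Defaulting to `"remote"` on any doubt keeps the widget working on older browsers, at the price of losing the local-mode privacy guarantee for those visitors.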

The platform API is the control plane — keys, tiers, usage, billing, bot protection — not a data pipe. In local mode it’s contacted exactly once, at startup, to validate the key.


Who it’s for

  • Documentation & knowledge sites — instant Q&A grounded in your docs.
  • SaaS products — in-app help that understands the current screen.
  • Privacy-sensitive sites — healthcare, finance, legal: answers without data egress.
  • Agencies — one integration, deployed across many client sites.

At a glance

  • SDK: synapticortex-chat on npm + jsDelivr/unpkg CDN
  • Models (local): Qwen, Llama 3.2, Phi-3.5, SmolLM2 (4-bit, WebGPU)
  • Providers (remote): OpenAI · Groq · Anthropic · Google Gemini · BYOK
  • Tiers: Free (local-only) · Starter · Pro · Enterprise — see pricing
  • Get started: Quickstart · Security: Security & Privacy