Zero Data Egress
MyKB runs entirely within your VPC or on your local machine. Keep proprietary code and docs off third‑party clouds.
MyKB turns your private docs and code into a secure, auditable, self-healing intelligence engine. No data egress. No vendor lock-in. Your knowledge, on your metal.
Initial OSS release coming September 2025.
Commit‑pinned citations and lineage give repeatable answers with proof — built for compliance and critical workflows.
Deploy on fixed‑cost infrastructure you already own. Avoid per‑token surprise billing and lock in predictable spend.
Combines semantic understanding (FastEmbed/ONNX embeddings) with keyword precision (BM25), then merges the two ranked result lists with Reciprocal Rank Fusion (RRF) for stronger relevance.
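To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion. It assumes each retriever returns document IDs ordered best-first and uses the commonly cited constant `k = 60`; the function name, the sample IDs, and the value of `k` are illustrative assumptions, not MyKB's actual code or configuration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in.
    k=60 is a conventional default, assumed here for illustration.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a semantic retriever and a BM25 retriever.
semantic_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
print(fused)
```

Because RRF only looks at rank positions, it does not need the semantic and BM25 scores to be on the same scale, which is one reason it is a popular choice for hybrid search.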
Content hashes detect chunk-level changes, ensuring only deltas are re-indexed. This makes ingestion efficient and idempotent.
A Markdown AST parser creates stable, meaningful chunks that respect headings, code blocks, and tables for better context retrieval.
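As a rough illustration of structure-aware chunking, the sketch below splits a document into one chunk per heading section and never treats a `#` line inside a fenced code block as a heading. It is a simplified line scanner rather than a real Markdown AST parser, and the function name and chunk fields are hypothetical.

```python
import re

FENCE = "`" * 3  # built programmatically to avoid a literal fence in this example
HEADING = re.compile(r"#{1,6}\s")


def chunk_markdown(text):
    """Split Markdown into heading-delimited chunks, keeping fenced code intact."""
    chunks, current, heading, in_fence = [], [], None, False
    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # entering or leaving a fenced code block
        # Start a new chunk only on a real heading outside fenced code.
        if not in_fence and HEADING.match(line):
            if current:
                chunks.append({"heading": heading, "text": "\n".join(current).strip()})
            heading, current = line.lstrip("#").strip(), []
        current.append(line)
    if current:
        chunks.append({"heading": heading, "text": "\n".join(current).strip()})
    return chunks


doc = "\n".join([
    "# Install",
    "Run the installer.",
    FENCE + "bash",
    "# this comment is not a heading",
    "make install",
    FENCE,
    "## Configure",
    "Edit the config file.",
])

chunks = chunk_markdown(doc)
for c in chunks:
    print(c["heading"], "->", len(c["text"]), "chars")
```

Since chunks are only cut at headings, tables and code blocks stay whole inside their section in this sketch; a production chunker would also need to handle size limits and nested structure.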
Production-grade identity using EdDSA (Ed25519) JWTs, revocable refresh tokens, device fingerprinting, and per-route rate limiting.
OSS users can add synonyms/boosts via a `patches.json` file. Enterprise adds a UI-driven feedback loop for continuous, automated improvement.
Built with Python, FastAPI, and Qdrant. Runs efficiently on CPU with optional GPU acceleration. Deploys anywhere with Docker.
MyKB's core engine powers specialized Co-Pilots—thin, secure agents that automate complex workflows.
Automates market research by ingesting external data and comparing it against internal roadmaps, securely.
kb.seed.preview, kb.ingest, kb.search
Accelerates development by providing instant, cited answers from the entire codebase and internal documentation.
kb.search_code, kb.get, kb.sources
Enforces governance by checking documents and code against policies, using curated filters and self-healing patches.
kb.search, kb.admin.patch, kb.explain_policy
| Feature | OSS (Solo & Team) | Enterprise |
|---|---|---|
| **Core Engine** | | |
| Full RAG Pipeline | ✓ | ✓ |
| Hybrid Search & Reranking | ✓ | ✓ |
| Incremental Ingestion Ledger | ✓ | ✓ |
| **Security & IAM** | | |
| JWT Auth (RBAC) | Coming Soon | ✓ |
| Zero-Trust Gateway | ✗ | ✓ |
| SSO/SAML/OIDC Integration | ✗ | ✓ |
| **Automation & Ops** | | |
| Manual Healing (patches.json) | ✓ | ✓ |
| Automated Self-Healing Loop | ✗ | ✓ |
| SLAs & Dedicated Support | ✗ | ✓ |
Never by default. MyKB is designed on-premises first: the entire RAG pipeline, from ingestion to retrieval, runs on your hardware. If you choose to connect an external LLM API for answer synthesis, only the retrieved context is sent to it. With a fully local LLM, the deployment can be 100% air-gapped.
Our incremental ledger uses content hashing (e.g., git commit hashes or file checksums) to track changes at the document and chunk level. When you re-run ingestion, only new or modified chunks are processed and embedded, so update time scales with the size of the change rather than the size of the corpus.
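The idea can be sketched in a few lines. This example assumes a ledger that is simply a mapping from chunk ID to the SHA-256 hash seen on the previous run; the function names, chunk IDs, and ledger shape are hypothetical and chosen only to illustrate delta detection.

```python
import hashlib


def chunk_hash(text: str) -> str:
    """Content hash of a chunk; identical text always yields the same digest."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_ingestion(chunks: dict, ledger: dict):
    """Compare current chunks against the previous ledger.

    chunks: chunk_id -> text for the current run
    ledger: chunk_id -> hash recorded by the previous run
    Returns (ids to embed, ids to delete, updated ledger).
    """
    to_embed, new_ledger = [], {}
    for chunk_id, text in chunks.items():
        digest = chunk_hash(text)
        new_ledger[chunk_id] = digest
        if ledger.get(chunk_id) != digest:  # new or modified chunk
            to_embed.append(chunk_id)
    to_delete = sorted(set(ledger) - set(chunks))  # chunks that disappeared
    return to_embed, to_delete, new_ledger


# First run: everything is new.
v1 = {"readme#0": "Install with Docker.", "readme#1": "Run the server."}
first_embed, first_delete, ledger = plan_ingestion(v1, {})

# Second run: only one chunk changed.
v2 = {"readme#0": "Install with Docker.", "readme#1": "Run the API server."}
second_embed, second_delete, ledger = plan_ingestion(v2, ledger)

# Third run with no changes: nothing to do, which is what makes it idempotent.
third_embed, third_delete, ledger = plan_ingestion(v2, ledger)
print(first_embed, second_embed, third_embed)
```

Re-running ingestion on unchanged content produces an empty work list, so repeated runs are safe and cheap.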
We use Qdrant as our vector store due to its performance, on-disk storage capabilities, and support for named vectors. For embeddings, we default to high-performance, small-footprint models via FastEmbed (ONNX runtime), which are ideal for on-prem CPU deployments. However, the architecture is modular, allowing you to plug in other models.
Self-healing refers to our system for improving search relevance without retraining models. When a search fails or returns a poor result, an admin can create a "patch." In the OSS version, this is a rule in a JSON file (e.g., mapping a synonym like "k8s" to "kubernetes" or boosting a specific document). The Enterprise version provides a UI for this, and uses user feedback to automatically suggest and apply these patches.
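As a sketch of how such a patch might be applied at query time, the example below loads a small JSON document with a synonym and a boost rule. The schema shown (`synonyms` and `boosts` keys), the file contents, and the helper names are assumptions for illustration only; the actual `patches.json` format may differ.

```python
import json

# Hypothetical patch file contents; the real schema is not specified here.
PATCHES_JSON = """
{
  "synonyms": {"k8s": "kubernetes"},
  "boosts": {"docs/deploy.md": 2.0}
}
"""


def expand_query(query, synonyms):
    """Append the mapped synonym for any query term that has one."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        target = synonyms.get(term)
        if target and target not in expanded:
            expanded.append(target)
    return " ".join(expanded)


def apply_boosts(results, boosts):
    """Multiply scores of boosted documents and re-sort, best first.

    results: list of (doc_id, score) pairs from the retriever.
    """
    rescored = [(doc_id, score * boosts.get(doc_id, 1.0)) for doc_id, score in results]
    return sorted(rescored, key=lambda r: r[1], reverse=True)


patches = json.loads(PATCHES_JSON)
query = expand_query("k8s rollout", patches["synonyms"])
ranked = apply_boosts(
    [("docs/intro.md", 0.6), ("docs/deploy.md", 0.4)],
    patches["boosts"],
)
print(query)
print(ranked)
```

Both rules act on the query and the result list, not on the embedding model, which is why relevance can be corrected without retraining.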
Join our community of privacy-first teams building the future of on-premise AI.