Taris: The AI Assistant That Keeps Data at the Client's Side

2026-06-26 · Sintaris · taris, rag, on-prem, ollama, vendor-neutral, smb, ai-platform

TL;DR. Taris is an AI assistant in which client data stays on the client's side and is never handed to the vendor. The core principle: the model is a plugin behind a stable interface, not the centre of the architecture. Inside: a vendor-neutral LLM dispatcher, hybrid RAG (BM25 + dense + RRF + cross-encoder rerank), multi-tenant Postgres with pgvector, and optional local models via Ollama. This article covers how Taris is built, why it is built that way, and where that delivers value for SMBs in the EU and CIS.

1. The Conflict: "Let's Get GPT-4 and Forget About It"

When a small business owner asks "which AI assistant should we install?", they usually get one of two extremes:

Taris is the third path: a productised base (model dispatcher, hybrid RAG, multi-tenant Postgres, channel adapters) that we deploy for the client and leave with the client. Not SaaS. Not "build from scratch." A half-product that is straightforward to adapt.

2. Who This Concerns

3. The Common Wrong Approach

What we see in 70% of "pilots" started before us:

4. The Engineering Approach: What's Inside Taris

Architecture — four independent layers:

flowchart LR
  subgraph Channels
    TG[Telegram Bot]
    WEB[Web UI / PWA]
    VOICE[Voice]
    API[REST API]
  end
  subgraph Core
    GW[FastAPI Gateway]
    ORCH[Agent Orchestrator]
    DISP[LLM Dispatcher]
    KB[KB Service]
    AUTH[Auth + RBAC]
  end
  subgraph Storage
    PG[(Postgres + pgvector)]
    OBJ[(MinIO / S3)]
    LOG[(Audit log)]
  end
  subgraph Models
    LOCAL[Ollama / llama.cpp]
    CLOUD[OpenAI / Anthropic / Gemini / YandexGPT]
  end
  TG --> GW
  WEB --> GW
  VOICE --> GW
  API --> GW
  GW --> AUTH --> ORCH
  ORCH --> KB --> PG
  ORCH --> DISP
  DISP --> LOCAL
  DISP --> CLOUD
  ORCH --> LOG

Each layer is replaceable, and that is the key point. Channels are adapters. The model is a plugin. Storage is a backend. The orchestrator is the only place where business logic lives. If OpenAI triples its prices tomorrow, a Taris installation switches providers by changing a single config file.

4.1. LLM Dispatcher

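from typing import Protocol

# ChatMessage, Tool and ChatCompletion are assumed to be defined elsewhere in the codebase.
class LLMProvider(Protocol):
    """Interface every model backend implements; callers depend only on this."""

    async def complete(
        self,
        messages: list[ChatMessage],
        *,
        max_tokens: int,
        temperature: float,
        tools: list[Tool] | None = None,
    ) -> ChatCompletion: ...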

Seven concrete providers: OpenAI, Anthropic, Gemini, YandexGPT, OpenRouter, Ollama, llama.cpp. Routing via YAML:

default: openrouter:openai/gpt-4o-mini
routes:
  - match: { task: rerank }
    use:   ollama:bge-reranker-base
  - match: { task: summary, locale: ru }
    use:   yandexgpt:latest
  - match: { sensitive: true }
    use:   ollama:llama3.1:8b
fallback:
  - openrouter:anthropic/claude-3-5-sonnet
  - ollama:llama3.1:8b
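How such a config might be resolved can be sketched in a few lines. This is an illustration under stated assumptions, not the actual dispatcher: it assumes routes are checked top to bottom, that the first route whose `match` keys all equal the request attributes wins, and that the fallback list is appended after the chosen model. `resolve_chain` and `split_model_id` are hypothetical helper names.

```python
from typing import Any

# Parsed form of the YAML routing config (hard-coded to avoid a YAML dependency).
CONFIG: dict[str, Any] = {
    "default": "openrouter:openai/gpt-4o-mini",
    "routes": [
        {"match": {"task": "rerank"}, "use": "ollama:bge-reranker-base"},
        {"match": {"task": "summary", "locale": "ru"}, "use": "yandexgpt:latest"},
        {"match": {"sensitive": True}, "use": "ollama:llama3.1:8b"},
    ],
    "fallback": [
        "openrouter:anthropic/claude-3-5-sonnet",
        "ollama:llama3.1:8b",
    ],
}


def resolve_chain(config: dict[str, Any], request: dict[str, Any]) -> list[str]:
    """Return the ordered list of model ids to try for a request.

    Assumption: first full match wins; otherwise the default model is used.
    """
    primary = config["default"]
    for route in config.get("routes", []):
        if all(request.get(key) == value for key, value in route["match"].items()):
            primary = route["use"]
            break
    chain = [primary]
    for candidate in config.get("fallback", []):
        if candidate not in chain:  # do not retry the model that just failed
            chain.append(candidate)
    return chain


def split_model_id(model_id: str) -> tuple[str, str]:
    """Split 'provider:model' on the first colon; model names may contain colons."""
    provider, _, model = model_id.partition(":")
    return provider, model


sensitive_chain = resolve_chain(CONFIG, {"sensitive": True})
```

Under these assumed semantics a sensitive request would still reach a cloud model if the local one fails, so a production dispatcher would likely need a rule that restricts fallbacks for sensitive traffic.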

4.2. Hybrid RAG with RRF

Retrieval runs in three stages:

  1. Lexical (BM25): Postgres FTS with a language-aware analyser for RU/EN/DE/SL.
  2. Dense: pgvector cosine similarity, using text-embedding-3-small by default and bge-m3 for on-prem deployments.
  3. Metadata boost: exact match on tags (product, section, last_updated).

Fusion — Reciprocal Rank Fusion:

$$ \text{score}(d) = \sum_{i \in \text{retrievers}} \frac{1}{k + \text{rank}_i(d)}, \quad k = 60 $$

A cross-encoder reranker (bge-reranker-base) then cuts the fused list down to the top 5. On our internal occupational-safety golden set, recall@5 rose from 0.71 with pure dense retrieval to 0.88 with RRF, and the grounding rate gained a further 0.07 after reranking. This is not "slightly better": it is the difference between "usable" and "give the client their money back."
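The fusion formula is short enough to sketch directly. The function below is a minimal illustration of Reciprocal Rank Fusion with 1-based ranks and k = 60; `rrf_fuse` and the document ids are hypothetical, and a document missing from a retriever's list simply contributes nothing for that retriever.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists of document ids with Reciprocal Rank Fusion.

    Each inner list is one retriever's output, best hit first.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by document id for deterministic output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


bm25_hits = ["d1", "d2", "d3"]   # lexical stage
dense_hits = ["d3", "d1", "d4"]  # dense stage
tag_hits = ["d1"]                # metadata stage
fused = rrf_fuse([bm25_hits, dense_hits, tag_hits])
```

In this toy input `d1` is ranked 1st, 2nd and 1st, so its score is 1/61 + 1/62 + 1/61 ≈ 0.0489, well ahead of `d3` at 1/63 + 1/61 ≈ 0.0323. Only ranks enter the formula, so BM25 and cosine scores never need to be normalised against each other.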

4.3. Multi-Tenant Postgres with RLS

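-- A policy only takes effect once row-level security is enabled on the table.
ALTER TABLE chunks ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON chunks
  USING (tenant_id = current_setting('app.tenant_id')::int);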

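Every connection sets app.tenant_id before querying. A plain SET statement cannot take a bind parameter, so one way to do this is SELECT set_config('app.tenant_id', $1, true) inside the transaction. The application then cannot accidentally read another client's data, because the database itself enforces the filter. This holds as long as the application connects as a role that is neither a superuser nor the table owner: superusers always bypass RLS, and owners do unless FORCE ROW LEVEL SECURITY is set.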
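A minimal sketch of that per-request step in Python, assuming a DB-API 2.0 connection with the `%s` parameter style (as in psycopg) and the default non-autocommit mode. `tenant_transaction` is a hypothetical helper name. It uses the transaction-local form of `set_config`, so the value does not survive on a pooled connection once the transaction ends.

```python
from contextlib import contextmanager
from typing import Any, Iterator


@contextmanager
def tenant_transaction(conn: Any, tenant_id: int) -> Iterator[Any]:
    """Run a block of queries in one transaction scoped to a single tenant.

    Assumes conn is a DB-API 2.0 connection that opens a transaction implicitly.
    """
    cur = conn.cursor()
    try:
        # Third argument true = local to the current transaction.
        # Settings are stored as text, hence str(); the policy casts back to int.
        cur.execute("SELECT set_config('app.tenant_id', %s, true)", (str(tenant_id),))
        yield cur
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()
```

Usage would look like `with tenant_transaction(conn, 42) as cur: cur.execute("SELECT id FROM chunks")`, where the query itself carries no tenant filter and relies on the policy.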

5. Table: Which Components Are Replaceable

| Layer | Default | Alternative | Switching cost |
|---|---|---|---|
| Embedding | text-embedding-3-small | bge-m3 | config + re-index |
| Reranker | bge-reranker-base | mxbai-rerank | config |
| Vector store | pgvector | Qdrant | docker-compose + migration |
| LLM | gpt-4o-mini | claude-3-5-sonnet, llama3.1:8b | config |
| Channel | Telegram | Web / VK / Slack / WhatsApp | adapter ~200 lines |
| File storage | MinIO | S3 / Nextcloud | config |
| Deployment | Docker Compose | Kubernetes / Nomad | manifests |

6. Sintaris Mini-Case

The Worksafety Superassistant product is an example of Taris in a real deployment. Task:

Technical implementation:

Metrics after 90 days:

Details: Worksafety § 6 RAG pipeline and OpenClaw § 8 AI dispatch.

7. Checklist (15 Points) When Choosing an AI Assistant for SMB

  1. Vendor lock-in verified: can you switch the LLM provider in a week?
  2. Data — where are client documents physically stored?
  3. Embeddings — where are they stored? (often forgotten: they are also PII-derived)
  4. DPA signed with every LLM provider you use.
  5. Eval set — do you have one, and how many questions are in it?
  6. Citation — does the system generate source references?
  7. Grounding rate — is it measured? (if not — nobody knows whether the model is lying)
  8. Retrieval regression tested after every prompt change?
  9. Multi-tenant security — RLS at DB level, not "agreed in code"?
  10. Local models available — is there a Plan B if the cloud is down?
  11. Cost per token — monitored in real time?
  12. DSAR + erasure — implemented as code, not a manual procedure?
  13. Audit log — present, immutable, with the required retention period?
  14. Channels — is adding a new channel < 500 lines or a core rewrite?
  15. Documentation — in what language, for whom, how often updated?
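Items 5 and 8 presuppose a metric that can be recomputed after every change. Below is a minimal sketch of recall@k over a golden set, assuming recall is defined as the share of a question's relevant chunks that appear in the top k results. The chunk ids and the `golden_set` layout are hypothetical, and the article does not say which recall definition its own figures use.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int = 5) -> float:
    """Fraction of the relevant chunk ids that appear in the top-k retrieved ids."""
    if not relevant:
        raise ValueError("each golden question needs at least one relevant chunk")
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def mean_recall_at_k(golden: list[tuple[list[str], set[str]]], k: int = 5) -> float:
    """Average recall@k over (retrieved, relevant) pairs, one pair per question."""
    return sum(recall_at_k(hits, rel, k) for hits, rel in golden) / len(golden)


golden_set = [
    (["c1", "c9", "c3", "c4", "c5", "c2"], {"c1", "c2"}),  # c2 is ranked 6th, so it is missed
    (["c7", "c8"], {"c7"}),
]
score = mean_recall_at_k(golden_set)
```

A regression check can then be as simple as comparing `score` against the value stored from the previous run and failing the build when it drops.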

8. Risks

9. What to Do Next

If you already have an AI assistant and it's time to replace it, we run an AI Audit for €900–4500. If you want to try Taris, there is a fixed-scope AI Pilot over 4–8 weeks for €3000–12000. Slovenian companies get 25% off from 1 to 30 June 2026; see packages.

If you'd rather read first, see the KB chapters Taris (full description) and OpenClaw (on-prem topology).

10. References


Sintaris runs AI process audits, AI pilots and Taris deployments for SMBs in the EU and CIS. Discovery call — free, 30 minutes.