Skip to content

Technology & models

Confident, specific, no hand-waving.

This page is for the technical evaluator. Value — custody and quality — always leads the plumbing. Here's the plumbing.

Three layers, plain language

Microfiber is one platform with three layers — we surface only the one a given buyer needs. Lead with the memory: it's the product; the utility is the foundation under it.

Live

Private inference

The utility

A compliant, stateless model endpoint. Send context, get intelligence back. The foundation everything else sits on.

Roadmap

Your document corpus

The memory

Retrieval-grounded intelligence that knows your files and reasons across years of them. This is the product — the beachhead for accounting firms.

Roadmap

Your structured data

The query layer

Ask questions of data you already hold (transactions, records) — the model calls your own query function through the gateway and reasons over the exact results, never storing anything.

How Microfiber is built

A private endpoint, not a public chatbot.

Your firm integrates against api.microfiber.[tld] with your own key. Your application never talks to a public model provider — it talks to Microfiber.

Control plane vs. data plane.

Account, billing, and portal (no client data) are separated by design from the data plane — your documents, the vector store, retrieval, and prompt assembly — which runs inside a boundary we control. Inference runs inside that boundary on the owned-hardware tier; on the managed tier it runs at our contracted provider (see Security). This separation is a security control, not just an architecture choice.

Per-tenant isolation.

Every firm gets its own isolated environment. No shared memory, no commingled data, ever.

Stateless compute.

The model performs a computation and stores nothing. Your corpus and the intelligence about it live on our controlled side; raw horsepower is a utility we direct, never a place your data rests. On the managed tier, the excerpts needed to answer a query transit to that compute under a no-retention contract and are discarded after the response; on the owned-hardware tier the compute runs inside our boundary, so nothing transits at all.

Encryption in transit and at rest. Metadata-only logging.

We log that a request happened, never its contents.

Retrieval-grounded answers — the moat

Microfiber reasons over your documents, and the retrieval is built for the way financial and legal files are actually structured:

Structure-aware chunking

We split along sections, tables, and form fields — never blind fixed-size cuts that sever a number from its label.

Tables made reasoning-ready

Financial tables become narratives the model can reason over (e.g. “Asset X: 2024 depreciation $Y, method MACRS, accumulated $Z”) — the make-or-break capability for accounting work.

Hybrid retrieval

Semantic search and exact keyword/number matching and metadata filters (by client, document type, tax year), so “find the basis on this asset” lands on the right passage.

Re-ranking

Candidate passages are ordered by true relevance before the model ever sees them.

Citations, always

Every answer points back to the source document, page, and section.

Honest by design

When the answer isn't in your files, it says “I don't see that in your documents” instead of inventing a figure. For a firm where a wrong number is a real problem, that honesty is the product.

The separation, drawn03

The control plane and the data plane, separated by design.

Your documents, the vector store, retrieval, and prompt assembly live inside a boundary we control; what reaches the model at query time depends on your tier.

Your firm

app + users · api.microfiber.[tld] with your key

Microfiber boundary · data plane per-tenant isolation
Gateway

auth · routing · metadata-only logging

Retrieval (RAG)

your documents · cited answers

Your document corpus — stays inside a boundary we control.

stateless compute calls only

prompt + retrieved chunks → ← answer

Inference backend

stateless utility · stores nothing · swappable

control plane — separated by designportal · billing · login · holds no client data
A customer's application calls Microfiber's private endpoint with their own key. Inside an Microfiber-controlled boundary — the data plane — a gateway handles auth, routing, and metadata-only logging, and a retrieval (RAG) layer assembles prompts grounded in the firm's own documents and returns cited answers. Per-tenant isolation keeps every firm separate. The firm's document corpus stays inside this boundary we control. On the managed tier, only the stateless excerpts needed to answer — a prompt plus retrieved chunks — transit to a contracted inference provider under a no-retention agreement and are discarded after the response; on the owned-hardware tier, inference runs inside the boundary, so nothing transits at all. The control plane (portal, billing, login) is separated by design and holds no client data.
Per-tenant isolation

The control-plane / data-plane split — the trust story, made architectural.

Built to drop into your stack

Live today
client = Anthropic(
  base_url="https://api.microfiber.[tld]",   # 1. point at Microfiber
  api_key="sk-ac-...",                     # 2. your Microfiber key
)
client.messages.create(model="ac-qwen3-32b", ...)   # 3. an Microfiber model

An OpenAI-shape endpoint is under evaluation; today the API speaks the Anthropic Messages format.

  • A drop-in swap, not a rebuild.

    If your tools already use the Anthropic SDK, switching to Microfiber is three lines — point the base URL at our endpoint, use your Microfiber key, name an Microfiber model. Your code doesn't change.

  • Models by alias, swapped server-side.

    You call a stable Microfiber alias (e.g. ac-qwen3-32b); we map it to the right hosted model behind the scenes, so you're never re-integrating when the model layer advances.

  • Streaming responses.

    Token-by-token streaming (server-sent events) is live and behaves exactly like the SDK's native streaming — no special handling.

  • Extended reasoning on demand.

    Flip on “thinking” per request for harder problems on models that support it; off by default for speed.

The models

Microfiber runs leading open-weight models in a private environment — the key being that an open model, grounded on your own documents with strong retrieval, is what delivers accurate, cited answers on your work. We are not dependent on any single public AI vendor, and the model layer is swappable as the field advances.

We pick the right model for your work and host it privately — and because the model layer is swappable, you're never locked to one vendor's roadmap or pricing.

Representative models available (menu evolves as the ecosystem does):

  • QwenLivestrong reasoning class — Qwen3-32B is serving in our environment today
  • Llama (Meta)general-purpose workhorse class
  • DeepSeekstrong reasoning / cost-efficient class
  • Mistral / Mixtralefficient open models
  • GPT-OSS-class open modelsas available

Why an open model wins on your documents.

Aren't the frontier models (ChatGPT, Claude) simply better? For general world knowledge, yes — they've read a huge slice of the public internet. But your client's tax return isn't on the internet. No model — frontier or not — was trained on your data; every model starts from zero on your files. So the question that actually matters isn't “which model knows the most about the world,” it's “which system reasons most accurately over your documents” — and there, research on financial documents is clear: retrieval and chunking quality drive answer accuracy more than the size of the model writing the answer. We give a strong open model your documents, precisely retrieved and cited; a frontier model handed a clumsy paste of those same files actually does worse.

Frontier models are smarter about the world; we're smarter about your data.

Enterprise / strictest tier

For firms with the highest custody requirements (e.g. healthcare PHI, or a strict closed-loop mandate), Microfiber offers a dedicated, owned-hardware deployment — full physical custody of both the data layer and the inference, with a certificate of destruction on offboarding. This is the only tier on which we can say your data never leaves the controlled environment: retrieval and inference both happen inside it, so nothing transits at query time.

For the evaluator

Take it apart on a call.

We're happy to go deep — architecture, retrieval, isolation, the model menu — and map every custody claim to your information security plan.