Caching

openFHIR Enterprise caches OPT templates, FHIR Connect contexts, tenant records, and compiled mapper indexes in memory to avoid repeated database round-trips on every mapping request. All caches use time-based expiry (Caffeine) and can be individually tuned or disabled via environment variables.

Cache overview

There are three independently configurable cache groups:

  • OPT cache (key tenant:templateId): parsed OPT entities loaded from the database.

  • FHIR Connect cache (keys tenant:templateId and tenant): FHIR Connect context entities (per-template and the full tenant list), compiled mapper indexes (CompletableFuture-per-key, thundering-herd safe), and pre-built helper trees (deep-cloned on each use to isolate request state). All four internal caches share a single TTL setting.

  • Tenant cache (key tenantId): tenant configuration records.
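
The CompletableFuture-per-key design is what makes the mapper cache thundering-herd safe: concurrent requests for the same tenant/template share one in-flight compilation instead of each hitting the database. The following is a minimal sketch of that pattern only, using hypothetical names (MapperIndexCache, MapperIndex, compileMapperIndex); it is not openFHIR's actual implementation and omits TTL and failure handling.

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Minimal sketch of a CompletableFuture-per-key cache. MapperIndexCache,
// MapperIndex, and compileMapperIndex are hypothetical names, not openFHIR's
// actual classes; TTL handling and failure recovery are omitted.
public class MapperIndexCache {

    static class MapperIndex { /* compiled mapper structures would live here */ }

    private final ConcurrentMap<String, CompletableFuture<MapperIndex>> cache =
            new ConcurrentHashMap<>();

    public CompletableFuture<MapperIndex> get(String tenant, String templateId) {
        // computeIfAbsent publishes exactly one future per key, so concurrent
        // requests for the same tenant/template wait on a single compilation
        // instead of each triggering its own (no thundering herd).
        return cache.computeIfAbsent(tenant + ":" + templateId, key ->
                CompletableFuture.supplyAsync(() -> compileMapperIndex(tenant, templateId)));
    }

    public void evict(String tenant, String templateId) {
        cache.remove(tenant + ":" + templateId);
    }

    private MapperIndex compileMapperIndex(String tenant, String templateId) {
        // Placeholder: load FHIR Connect mappings from the database and compile them.
        return new MapperIndex();
    }
}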

TTL configuration

Each cache group has an independent TTL controlled by an environment variable. The default is -1 (infinite — entries never expire and are only evicted explicitly).

Special values:

  • 0 — cache disabled; every request goes directly to the database with no in-memory layer.

  • -1 — infinite TTL; entries never expire and are only removed by explicit eviction (e.g. after an upsert) or process restart.

  • OPENFHIR_CACHE_OPT_TTL (default -1): TTL in seconds for the OPT cache. 0 = disabled, -1 = infinite.

  • OPENFHIR_CACHE_FHIR_CONNECT_TTL (default -1): TTL in seconds for all FHIR Connect caches (context, all-contexts, mapper, and helpers). 0 = disabled, -1 = infinite.

  • OPENFHIR_CACHE_TENANT_TTL (default -1): TTL in seconds for the tenant cache. 0 = disabled, -1 = infinite.

Entries expire on an expire-after-write basis: the TTL clock starts when an entry is first populated and resets whenever a cache miss triggers a reload; reads do not reset the clock.
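
For illustration, the three TTL values map naturally onto a Caffeine builder. This is a sketch under assumptions only: the buildCache helper and the Optional return for a disabled cache are invented for this example and are not openFHIR's code.

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import java.time.Duration;
import java.util.Optional;

// Illustrative only: how a TTL of 0 / -1 / n seconds could translate into a
// Caffeine cache. Names and structure are assumptions, not openFHIR's code.
public final class CacheFactory {

    public static <K, V> Optional<Cache<K, V>> buildCache(long ttlSeconds) {
        if (ttlSeconds == 0) {
            // 0 = disabled: callers fall through to the database on every request.
            return Optional.empty();
        }
        Caffeine<Object, Object> builder = Caffeine.newBuilder();
        if (ttlSeconds > 0) {
            // Positive TTL: expire-after-write, so the clock resets on reload, not on read.
            builder.expireAfterWrite(Duration.ofSeconds(ttlSeconds));
        }
        // -1 (or any negative value) = infinite: no expiry is configured;
        // entries live until explicitly evicted or the process restarts.
        Cache<K, V> cache = builder.build();
        return Optional.of(cache);
    }
}

A caller would read the TTL from the environment, for example Long.parseLong(System.getenv().getOrDefault("OPENFHIR_CACHE_OPT_TTL", "-1")), and pass it to such a factory.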

Explicit eviction

TTL expiry is a safety net, not the primary eviction mechanism. When a mapping, OPT, or tenant record is updated through the API, openFHIR immediately invalidates the relevant cache entry so that the next request reflects the change without waiting for the TTL to expire.

In a multi-node deployment, the node that processes the write broadcasts an invalidation message to all peers over the cluster channel. Peer nodes invalidate the same entry locally upon receipt. See High Availability for cluster configuration details.
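
The write path therefore follows an evict-locally-then-broadcast pattern. The sketch below illustrates that pattern with hypothetical types (InvalidationBroadcaster, ClusterChannel, InvalidationMessage); openFHIR's actual cluster channel and message format are described under High Availability and may differ.

import java.util.function.Consumer;

// Illustrative write-then-broadcast pattern. ClusterChannel and
// InvalidationMessage are hypothetical names, not openFHIR classes.
public class InvalidationBroadcaster {

    record InvalidationMessage(String cacheGroup, String key) { }

    interface ClusterChannel {
        void broadcast(InvalidationMessage message);            // fire-and-forget to all peers
        void onMessage(Consumer<InvalidationMessage> handler);  // register a receive callback
    }

    private final ClusterChannel channel;
    private final Consumer<InvalidationMessage> evictLocally;

    InvalidationBroadcaster(ClusterChannel channel, Consumer<InvalidationMessage> evictLocally) {
        this.channel = channel;
        this.evictLocally = evictLocally;
        // Peer nodes evict the same entry from their local cache on receipt.
        channel.onMessage(evictLocally);
    }

    // Called after a mapping, OPT, or tenant write has been persisted to the database.
    void afterWrite(String cacheGroup, String key) {
        InvalidationMessage msg = new InvalidationMessage(cacheGroup, key);
        evictLocally.accept(msg);  // this node reflects the change immediately
        channel.broadcast(msg);    // best effort: unreachable peers fall back to TTL expiry
    }
}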

Note

Cache invalidation is best-effort. A peer that is temporarily unreachable when a write occurs will not receive the invalidation event and may serve stale data until the TTL expires or the node is restarted. The database remains the authoritative source of truth at all times.

Disabling and pinning caches

Setting a TTL to 0 is useful in two scenarios:

  • Debugging — forces every request to hit the database, making it straightforward to verify that a mapping change is applied immediately without having to wait for expiry or trigger manual eviction.

  • Low-memory deployments — removes the in-memory overhead entirely. Expect a latency increase proportional to the database round-trip cost; see Performance for measured DB overhead by backend and resource configuration.

Setting a TTL to -1 is useful when entries should survive indefinitely and be evicted only on explicit invalidation (e.g. after an upsert or cluster broadcast):

  • Stable datasets — mappings and templates that change rarely benefit from never expiring silently; explicit eviction on upsert keeps the cache coherent without the overhead of periodic reloading.

Example (Docker Compose):

environment:
  OPENFHIR_CACHE_FHIR_CONNECT_TTL: 0    # disable all FHIR Connect caches (context, all-contexts, mapper, helpers)
  OPENFHIR_CACHE_OPT_TTL: 300           # expire OPT cache entries after 5 minutes

All other caches remain active with their default TTLs when only some variables are overridden.

Relationship to performance

The FHIR Connect cache group (mapper and helpers caches in particular) has the largest impact on latency. A cold start (no cached mappers) requires loading and compiling mapper indexes from the database on the first request for each tenant/template combination. Subsequent requests are served entirely from memory. See Performance for measured latency figures by hardware configuration.
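
If first-request latency matters, one mitigation is to pre-load the mapper cache for known tenant/template combinations at startup. This is an operational suggestion, not a documented openFHIR feature; the sketch below is deliberately generic, and the loader function stands in for whatever triggers mapper compilation in your deployment.

import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.BiFunction;

// Hypothetical warm-up helper (not an openFHIR feature): eagerly trigger
// mapper loading for known tenant/template pairs so the first real mapping
// request is served from memory.
public final class CacheWarmer {

    record Pair(String tenant, String templateId) { }

    static void warmUp(List<Pair> pairs,
                       BiFunction<String, String, CompletableFuture<?>> loader) {
        // Kick off all loads, then block until every mapper index is ready.
        CompletableFuture.allOf(
                pairs.stream()
                     .map(p -> loader.apply(p.tenant(), p.templateId()))
                     .toArray(CompletableFuture[]::new)
        ).join();
    }
}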