High Availability
openFHIR Enterprise caches mappings, templates, and related configuration in memory to reduce database load and improve response times. In a single-node deployment this is transparent. In a multi-node (HA) deployment, a change made through one node — such as uploading a new mapping or OPT — must be reflected on all peers immediately. Without coordination, peer nodes continue serving stale cached data until they are restarted.
To address this, openFHIR Enterprise propagates cache invalidation events across all running instances automatically. No external message broker or additional infrastructure is required.
Peer discovery
Nodes need to find each other at startup. Three discovery modes are available, selected via the CLUSTER_DISCOVERY environment variable:
| Mode | When to use | Notes |
|---|---|---|
| MULTICAST (default) | Local development, Docker Compose, VMs on the same network segment | Works out of the box on most networks. May be blocked in some cloud or restricted environments. |
| DNS | Kubernetes | Requires a headless Service so that the DNS name resolves to individual pod IPs. |
| TCP | VM / bare-metal, or any environment where multicast is unavailable | Requires a static list of all peer addresses provided upfront. |
Configuration reference
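| Environment variable | Default | Description |
|---|---|---|
| CLUSTER_DISCOVERY | MULTICAST | Peer discovery mode: MULTICAST, DNS, or TCP. |
| CLUSTER_DNS_QUERY | (empty) | Headless DNS name to resolve when using DNS mode. |
| CLUSTER_INITIAL_HOSTS | (empty) | Comma-separated list of peer addresses when using TCP mode. |
|  | 7800 | Port used for cluster communication. |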
Per-environment setup
Single node
No configuration is required. The node forms a cluster of one and cache invalidation operates locally as usual.
Docker Compose
Multicast works on Docker’s default bridge network, so no additional configuration is needed when running multiple replicas in Compose.
services:
  openfhir-node-1:
    image: openfhir-enterprise:latest
    # No cluster configuration needed: MULTICAST is the default
  openfhir-node-2:
    image: openfhir-enterprise:latest
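The same multicast setup can also be written as a single service scaled to several containers, which is closer to "multiple replicas" in the literal sense. The following is only a sketch: the service name openfhir and the replica count are assumptions for illustration, and it is worth checking that the Compose version in use honours deploy.replicas outside Swarm.

```yaml
services:
  openfhir:                      # hypothetical service name
    image: openfhir-enterprise:latest
    # Assumption: no cluster variables are set, so discovery
    # falls back to the MULTICAST default on the shared network.
    deploy:
      replicas: 2                # assumed replica count
```

An alternative that leaves the file unchanged is to scale at start time with docker compose up --scale openfhir=2. In both cases, avoid publishing a fixed host port on the scaled service, since the replicas would compete for it.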
If multicast is disabled on the network, switch to TCP mode and set CLUSTER_INITIAL_HOSTS to the service names of all nodes, each followed by its cluster port in square brackets. As in the VM / bare-metal setup, every node lists the full set of peers, including itself.
services:
  openfhir-node-1:
    image: openfhir-enterprise:latest
    environment:
      CLUSTER_DISCOVERY: TCP
      CLUSTER_INITIAL_HOSTS: openfhir-node-1[7800],openfhir-node-2[7800]
  openfhir-node-2:
    image: openfhir-enterprise:latest
    environment:
      CLUSTER_DISCOVERY: TCP
      CLUSTER_INITIAL_HOSTS: openfhir-node-1[7800],openfhir-node-2[7800]
Kubernetes
Use DNS mode with a headless Service. Set CLUSTER_DNS_QUERY to the fully qualified DNS name of the headless Service (e.g. openfhir-cluster.default.svc.cluster.local). The DNS name must resolve to the IPs of all running pods.
The pod’s service account requires read access to pods and endpoints in the deployment namespace so that peer IPs can be resolved at startup.
env:
  - name: CLUSTER_DISCOVERY
    value: DNS
  - name: CLUSTER_DNS_QUERY
    value: openfhir-cluster.default.svc.cluster.local
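The headless Service and the read permissions described above might look like the following sketch. The Service name and namespace follow the example DNS name. The app: openfhir selector label, the openfhir service account, the Role name, and port 7800 are assumptions that need to be adapted to the actual deployment.

```yaml
# Headless Service: clusterIP None makes the DNS name resolve to pod IPs.
apiVersion: v1
kind: Service
metadata:
  name: openfhir-cluster
  namespace: default
spec:
  clusterIP: None
  # Publish pod IPs before readiness so starting nodes can find each other.
  publishNotReadyAddresses: true
  selector:
    app: openfhir              # assumption: label carried by the openFHIR pods
  ports:
    - name: cluster
      port: 7800               # assumption: cluster port used in the examples
---
# Read-only access to pods and endpoints in the deployment namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: openfhir-peer-reader   # hypothetical name
  namespace: default
rules:
  - apiGroups: [""]
    resources: ["pods", "endpoints"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: openfhir-peer-reader
  namespace: default
subjects:
  - kind: ServiceAccount
    name: openfhir             # hypothetical service account used by the pods
    namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: openfhir-peer-reader
```

Setting publishNotReadyAddresses is a design choice rather than a stated requirement: it lets pods that have not yet passed their readiness probe appear in DNS, so the first nodes to start can still discover one another.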
VM / bare-metal
Use TCP mode and set CLUSTER_INITIAL_HOSTS to the addresses and cluster ports of all nodes. Every node must include the full list of peers, including itself.
CLUSTER_DISCOVERY=TCP
CLUSTER_INITIAL_HOSTS=10.0.0.1[7800],10.0.0.2[7800],10.0.0.3[7800]
Behaviour and guarantees
Cache invalidation is best-effort. If a peer is temporarily unreachable when a change is made, it will not receive the invalidation event for that change. It will serve stale data until the next request for the same resource triggers a reload from the database, or until the node is restarted. This is an acceptable trade-off given that the database remains the authoritative source of truth at all times.
All nodes in the cluster must run the same version of openFHIR Enterprise.