Pryv.io infrastructure procurement

This document is for system administrators provisioning virtual machines and other web resources to run a Pryv.io platform. It covers choosing the topology you need, procuring the right virtual machines, firewall rules, OS compatibility and related operational concerns.

Since Pryv.io v2 (2026) the platform runs as a single binary (bin/master.js) packaged as a single Docker image (pryvio/open-pryv.io). There is no longer a separate register, core, hfs, preview, static-web or dns service to procure — one machine runs everything. Scaling out is done by adding more instances of the same binary and joining them through an embedded rqlite cluster.

Table of contents

  1. Topology
    1. Single-core (most deployments)
    2. Multi-core for load
    3. Multi-core for geographical compliance
  2. Business requirements
    1. Granularity
    2. Data production
    3. Data consumption
  3. Sizing a core
  4. System requirements
    1. Operating systems
    2. Docker
    3. Per-core machine
    4. Database host (optional — external PostgreSQL / MongoDB)
  5. Network and firewall
  6. Operational concerns
    1. System hardening
    2. Backups
    3. Node monitoring
  7. v1 procedure (legacy)

Topology

A Pryv.io v2 deployment is a set of cores. Every core is the same binary — there are no role-specific machines. Cores coordinate through an embedded rqlite cluster that holds the platform DB (user→core mapping, registration tokens, invitations, active-core list).

Single-core (most deployments)

single-node

One VM runs bin/master.js, which runs all platform services in a single process.

This mode uses dnsLess.isActive: true — the platform is reached at a single publicUrl. No wildcard DNS or embedded DNS server is needed.
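A single-node configuration can be sketched as follows. Only the dnsLess.isActive flag is documented above; the publicUrl key name and value are illustrative assumptions:

```javascript
// Hypothetical single-node platform config excerpt.
// dnsLess.isActive comes from the docs; publicUrl is an assumed key name.
const config = {
  dnsLess: { isActive: true },          // no wildcard DNS, no embedded DNS server
  publicUrl: 'https://pryv.example.com' // the single address clients use
};
```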

See INSTALL.

Multi-core for load

cluster-load

Multiple cores share the same domain (mc.example.com). Each core hosts a subset of users and advertises its identity through rqlite. New registrations are assigned to the core with the fewest users; client SDKs discover the user’s home core via /reg/cores?username={user} and then talk to that core directly.
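The discovery step above can be sketched in client code. The /reg/cores endpoint path is from this document; the JSON response shape ({ core: { url } }) is an assumption:

```javascript
// Sketch: resolve a user's home core before talking to it directly.
function coreLookupUrl(platformUrl, username) {
  return `${platformUrl}/reg/cores?username=${encodeURIComponent(username)}`;
}

async function resolveHomeCore(platformUrl, username) {
  const res = await fetch(coreLookupUrl(platformUrl, username));
  if (!res.ok) throw new Error(`core lookup failed: HTTP ${res.status}`);
  const { core } = await res.json(); // assumed response shape: { core: { url } }
  return core.url;                   // send all API calls for this user here
}
```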

DNS is either served by each core’s embedded DNS (wildcard *.mc.example.com) or by an external DNS provider (DNSless multi-core). Rqlite peers discover each other through an lsc.{domain} DNS A record.

See single-node to multi-core upgrade and the upstream SINGLE-TO-MULTIPLE.md.

Multi-core for geographical compliance

cluster-compliance-zones

Cores can be placed in different jurisdictions to keep user data where local law requires it. Users registered in a given zone stay on the cores of that zone. The granularity of distribution is always one user account — a compliance zone can contain as few as one user.

If Pryv.io coexists with other server components (e.g. SMTP), apply the same partitioning logic to those components too.

Business requirements

The size of a deployment is driven by business requirements. The tables below list the factors that matter for Pryv.io.

Granularity

Pryv.io’s fundamental entity is the user; each user’s data is kept together on a single core rather than spread across machines. The requirements below are therefore specified per user.

Data production

Metric Your values here
Expected write requests per second (max rqps)
Attachment writes (max MB/s)
Volume (data points per day)
Volume (MB per day)
Retention of data (years)

The first two metrics bound the number of users that can be co-hosted on a single core; the last two let you estimate disk space consumed per user per day.
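As a worked example of the two volume metrics, per-user disk growth over the retention period can be estimated like this (the input numbers are illustrative, not measurements):

```javascript
// Estimate disk space one user consumes over the full retention period.
function diskGbPerUser(mbPerDay, retentionYears) {
  return (mbPerDay * 365 * retentionYears) / 1024;
}

// Example: 5 MB/day retained for 3 years.
console.log(diskGbPerUser(5, 3).toFixed(1), 'GB'); // ≈ 5.3 GB per user
```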

Data consumption

Metric Your values here
Expected read requests per second (max rqps)
Number of points retrieved per request (scalar)
Attachment reads (max rqps)
Volume (data points per day)
Volume (MB per day)

This table quantifies the load generated by reading data back per user.

Sizing a core

Use the key metrics from the previous section to decide how many cores you need. Inside each compliance zone (or for the whole platform if there’s only one), derive the number of cores from the following maximum values for a single core:

Metric Max performance of a single core
Write requests per second 2000 rqps
Attachment writes Depends heavily on network path — roughly speed of underlying storage / 2
Data points per day Sustained writes grow the total data points stored per user, consuming more disk space over time.
Volume (MB per day) See above.
Expected read requests per second 2000 rqps — latency has a long-tail distribution depending on your query.
Number of points retrieved per request Big (> 10 000 points) result sets should use paging.
Attachment reads 600 rqps

Consider load distribution across your user base. For a heterogeneous user base, add safety margins to the above numbers.
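The per-zone derivation can be sketched as a small calculation over the single-core limits above (the 1.5× safety margin and the metric names are illustrative assumptions):

```javascript
// Derive a core count for one compliance zone from peak load figures.
function coresNeeded({ writeRqps, readRqps, attachmentReadRqps }, margin = 1.5) {
  // Single-core maxima from the sizing table above.
  const limits = { writeRqps: 2000, readRqps: 2000, attachmentReadRqps: 600 };
  return Math.max(
    Math.ceil((writeRqps * margin) / limits.writeRqps),
    Math.ceil((readRqps * margin) / limits.readRqps),
    Math.ceil((attachmentReadRqps * margin) / limits.attachmentReadRqps)
  );
}

// Example: the read path dominates here.
console.log(coresNeeded({ writeRqps: 3000, readRqps: 5000, attachmentReadRqps: 200 })); // 4
```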

New users are assigned to the core with the fewest users in the same compliance zone, which yields round-robin behaviour across a stable set of cores. After user deletions, or when new cores are added, registrations concentrate on the less-loaded cores until the distribution rebalances.
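The assignment rule can be expressed as a one-liner (the core record fields zone and userCount are assumed names for illustration):

```javascript
// Pick the registration target: the least-populated core in the user's zone.
// Assumes at least one core exists in that zone.
function pickCore(cores, zone) {
  return cores
    .filter((c) => c.zone === zone)
    .reduce((least, c) => (c.userCount < least.userCount ? c : least));
}
```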

System requirements

Operating systems

Linux — any distribution supported by your chosen container runtime or by Node.js 22.

Docker

If running from the pryvio/open-pryv.io image, any Linux host with a working Docker Engine is sufficient.

Native (non-Docker) installs need Node.js 22.x.

Per-core machine

Aspect Minimal requirement
RAM 4 GB (8 GB recommended; add ~200 MB per extra API/HFS worker)
CPU cores 2 (4+ under load or with image previews)
Pryv.io binary + image 2 GB (Docker image unpacked)
Data size Depending on storage needs (see Sizing a core)
Service ports See Network and firewall

Load sensitivity:

Load situation Resource needs
Large data per user Data disk space — increase per data-usage predictions
High requests per second CPU cores — increase to 4+; raise cluster.apiWorkers
High-frequency series Raise cluster.hfsWorkers; ensure port 4000 reachable from your proxy
Image uploads / previews CPU + RAM — GraphicsMagick + sharp are CPU-bound; enable previewsWorker
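A high-load tuning excerpt might look like the following. The cluster.apiWorkers, cluster.hfsWorkers and previewsWorker key names come from the table above; their exact nesting and the values shown are assumptions:

```javascript
// Hypothetical tuning excerpt for a core under high request load
// (values are illustrative starting points, not recommendations).
const tuning = {
  cluster: {
    apiWorkers: 4,  // raise with CPU core count for high rqps
    hfsWorkers: 2   // raise for high-frequency series workloads
  },
  previewsWorker: true // enable for image upload / preview workloads
};
```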

Database host (optional — external PostgreSQL / MongoDB)

When running the base storage engine on a separate machine:

Aspect Minimal requirement
RAM 4 GB
CPU cores 2
Data size Scales with users × retention — plan from the Data production table
Service port tcp/5432 (PostgreSQL) or tcp/27017 (MongoDB) — reachable from the core

If using embedded MongoDB or PostgreSQL on the same VM as the core, add the database’s resource needs to the core requirements above.

Network and firewall

Inbound — from clients:

Port Protocol When
443 tcp HTTPS (built-in SSL or behind your reverse proxy)
53 udp Only in multi-core deployments using the embedded DNS server

Inter-core (multi-core only):

Port Protocol Purpose
4002 tcp rqlite Raft consensus — must be reachable between all cores. Mutually-authenticated TLS by default when cores are added via the bootstrap CLI; a VPN between cores is no longer required as a baseline.
4001 tcp rqlite HTTP (usually only bound to localhost)

Cores added via bin/bootstrap.js new-core ship with storages.engines.rqlite.tls.{caFile, certFile, keyFile, verifyClient: true} enabled — both ends of every Raft connection verify the peer’s cert against the cluster CA. Plain TCP attempts on port 4002 are rejected. If you opt out of mTLS (set tls: null, the default for fresh installs that have never run the bootstrap CLI), opening port 4002 still requires a private network or VPN between cores.
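The rqlite TLS settings described above translate to a config excerpt like this. The key names come from this document; the file paths are illustrative:

```javascript
// Excerpt of the mTLS settings the bootstrap CLI ships with.
// Paths are examples; key names are from storages.engines.rqlite.tls.
const storageConfig = {
  storages: {
    engines: {
      rqlite: {
        tls: {
          caFile: '/var/pryv/tls/cluster-ca.pem', // cluster CA all cores trust
          certFile: '/var/pryv/tls/core.pem',     // this core's certificate
          keyFile: '/var/pryv/tls/core-key.pem',
          verifyClient: true // both ends verify the peer against the cluster CA
        }
      }
    }
  }
};
```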

Outbound — from the core:

Operational concerns

System hardening

Follow a system-hardening guide for your chosen OS: restrictive firewall defaults, key-based SSH only (no password logins), automatic security updates, a non-root service user, and so on. Administrators of a regulated system must themselves conform to the applicable regulations and have received adequate training.

Backups

See the backup guide. Making a copy of private user data is regulated by law — make sure you understand the implications before rolling out backups.

Node monitoring

Monitor key performance metrics on every core and keep historical data for incident analysis. At minimum:

Application-level: the core exposes standard Node.js process metrics via its logs, and each of its HTTP ports (3000, 4000) responds to basic liveness checks — see the healthchecks guide.
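A minimal liveness probe against those ports could look like the following sketch; the probed path and the "any non-5xx response counts as alive" criterion are assumptions:

```javascript
// Classify an HTTP status for liveness purposes.
function looksAlive(statusCode) {
  return statusCode > 0 && statusCode < 500; // any non-5xx response counts
}

// Probe one of the core's HTTP ports (3000 = API, 4000 = HFS).
async function probe(host, port) {
  try {
    const res = await fetch(`http://${host}:${port}/`);
    return looksAlive(res.status);
  } catch {
    return false; // connection refused or timed out
  }
}
```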

v1 procedure (legacy)

Pryv.io v1 used a three-role topology — a web machine (NGINX), a two-node registry (config-leader + config-follower) and one or more core machines, partitioned by username and optionally spread across privacy zones. High availability relied on Pacemaker/Heartbeat. None of those roles exist in v2: the core binary handles registration in-process, and the operator’s reverse proxy replaces the web role.

The per-core sizing orders of magnitude from the v1 design guide are still useful as v2 baselines, since the inner workers (api / hfs / previews) remain the bottleneck:

Resource Limit per core
Read/write requests ~2000 rqps each
Image-preview requests ~100 rqps
Attachment reads ~600 rqps
Attachment writes ≈ storage throughput / 2
Users per core < 10000

Minimum starting point per core: 4 GB RAM, 2 CPU, 15 GB data disk (raise data disk in line with retained events + attachments).