The smallest Ocient system
Ocient is built for the very large end of analytics. The Ocient Hyperscale Data Warehouse™ is designed to "store and query petabytes of data," run "queries up to 100x faster" than conventional cloud warehouses, and "transform and load data at terabits per second." Its architecture targets record sets numbering "in the trillions, quadrillions, and beyond."
ociforge is the diametric opposite. It is, deliberately, about the smallest Ocient system you can build — and the fact that it works at all, with the exact same software that runs those petabyte clusters, is the point of this article.
Three layers, each scaling sideways
An Ocient system is built from three node roles, and you grow throughput by adding nodes to whichever layer is the bottleneck:
- Loader nodes ingest and transform incoming data. Need faster loading? Add loaders.
- SQL nodes parse, plan, and coordinate queries and hold client connections. Need more concurrent users? Add SQL nodes.
- Foundation (storage) nodes hold the data on NVMe drives and do the bulk of the scanning. Need more capacity or scan throughput? Add foundation nodes — and add NVMe drives within each one.
Query execution fans out across these nodes as a query tree: a probe phase fixes which nodes provide which data and which peers co-execute, "without any global coordination or synchronization," and the tree "scales up by increasing the number of clusters." That is what horizontal scalability means here — there's no central bottleneck to widen.
Compute-adjacent storage, and why drives are the unit of parallelism
Ocient's core design is Compute-Adjacent Storage Architecture (CASA): it "brings storage adjacent to compute" by collocating NVMe drives with the cores that process them. It goes further than most databases at the drive itself — each NVMe SSD is detached from the normal Linux block driver and driven directly over the PCIe/NVMe protocol from user memory, "eliminating all system calls and memory copies required to do I/O." A modern NVMe SSD can serve on the order of a million 4 KiB random reads per second when kept busy with many in-flight requests, and Ocient is written to always have "a replacement I/O operation teed up" so the drive never stalls. Cores, local RAM, and local drives are grouped into NUMA-aware silos that process their own data without cross-socket hops.
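Why a drive needs "many in-flight requests" follows from Little's law: sustained throughput equals the number of requests in flight divided by the latency of each one. The sketch below works through that arithmetic. The 100 microsecond read latency is an illustrative assumption, not a figure from Ocient's documentation, and the function names are hypothetical.

```python
# Little's law: throughput = in-flight requests / latency.
# READ_LATENCY_S is an assumed, illustrative value, not a measured Ocient figure.
READ_LATENCY_S = 100e-6      # hypothetical 100 microsecond random-read latency
TARGET_IOPS = 1_000_000      # "on the order of a million" 4 KiB reads per second
BLOCK_BYTES = 4 * 1024       # 4 KiB per read


def iops(in_flight: int, latency_s: float) -> float:
    """Sustained reads per second for a given number of in-flight requests."""
    return in_flight / latency_s


def required_in_flight(target_iops: float, latency_s: float) -> float:
    """In-flight requests needed to sustain target_iops at latency_s."""
    return target_iops * latency_s


if __name__ == "__main__":
    depth = required_in_flight(TARGET_IOPS, READ_LATENCY_S)
    print(f"in-flight requests needed: {depth:.0f}")
    print(f"IOPS with one request at a time: {iops(1, READ_LATENCY_S):.0f}")
    print(f"bandwidth at target: {TARGET_IOPS * BLOCK_BYTES / 1e9:.2f} GB/s")
```

Under that assumed latency, issuing one read at a time tops out near 10,000 reads per second, while reaching a million takes about 100 requests in flight at once. That gap is why a replacement operation has to be queued the moment one completes.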
The practical upshot: the drive is the unit of parallelism. A large deployment might run, say, 20 foundation nodes with 16 NVMe drives each — 320 compute-adjacent I/O pipelines reading and processing data at once. Double the drives and you roughly double the scan throughput.
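The pipeline arithmetic is simple enough to write down. This is a minimal sketch that treats each NVMe drive as one I/O pipeline, as the example above does, and assumes perfectly linear scaling, which is an idealization of the "roughly double" behavior. The `Cluster` class and `relative_scan_time` function are hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    foundation_nodes: int
    drives_per_node: int

    @property
    def pipelines(self) -> int:
        """One compute-adjacent I/O pipeline per NVMe drive."""
        return self.foundation_nodes * self.drives_per_node


def relative_scan_time(small: Cluster, large: Cluster) -> float:
    """Scan time of `large` relative to `small`, assuming ideal linear scaling."""
    return small.pipelines / large.pipelines


big = Cluster(foundation_nodes=20, drives_per_node=16)
doubled = Cluster(foundation_nodes=20, drives_per_node=32)

print(big.pipelines)                     # 320
print(relative_scan_time(big, doubled))  # 0.5
```

Real clusters will fall somewhat short of the ideal ratio, since coordination and result merging do not shrink with drive count.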
ociforge: one of everything
Now look at ociforge:
| Layer | Big Ocient system | ociforge |
|---|---|---|
| SQL nodes | many | 1 |
| Loader nodes | many | 1 |
| Foundation nodes | tens | 1 |
| NVMe data drives | hundreds | 1 |
| Parallel I/O pipelines | hundreds | 1 |
| Data scale | petabytes / trillions of rows | ~550M rows |
It is one SQL node, one loader, one foundation node, with a single data drive each: the minimum that still exercises every layer. Where the example deployment above has 320 storage pipelines, ociforge has exactly one. When you run a query here, the completed-queries record shows it executing on just two nodes (storage0 and sql0), with the loader sitting idle until the next load.
Why that's the impressive part
A SELECT count(*) over the 119-million-row taxi table returns 119,136,044 on ociforge, exactly what it would return on a 320-pipeline cluster. The same SQL, the same optimizer, the same execution engine, the same correctness: the big system just spreads the work over hundreds of drives instead of one, and finishes roughly in proportion faster.
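The reason a count cannot depend on the number of pipelines is that it decomposes into partial counts over disjoint partitions, which are then summed. The toy model below illustrates that property only. It is not Ocient's execution engine, and the round-robin partitioning and all function names are assumptions made for the example.

```python
from typing import Callable, Iterable, List


def partition(rows: Iterable[int], n_pipelines: int) -> List[List[int]]:
    """Deal rows round-robin into n_pipelines disjoint partitions."""
    parts: List[List[int]] = [[] for _ in range(n_pipelines)]
    for i, row in enumerate(rows):
        parts[i % n_pipelines].append(row)
    return parts


def distributed_count(
    rows: Iterable[int],
    n_pipelines: int,
    predicate: Callable[[int], bool] = lambda r: True,
) -> int:
    """Count matching rows in each partition, then sum the partial counts."""
    partials = [
        sum(1 for r in part if predicate(r))
        for part in partition(rows, n_pipelines)
    ]
    return sum(partials)


rows = range(100_000)                    # stand-in table, far smaller than the taxi data
is_even = lambda r: r % 2 == 0

single = distributed_count(rows, 1, is_even)    # one pipeline, as on ociforge
wide = distributed_count(rows, 320, is_even)    # 320 pipelines
print(single, wide)                             # 50000 50000
```

Only the amount of work per pipeline changes between the two calls; the summed result does not.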
That is the real testament to the architecture: it is functionally correct and identical across the entire scale range, from a single NVMe drive to hundreds, from half a billion rows to quadrillions. You can learn Ocient, develop against it, and validate queries on a three-node sandbox like this one, then run the same workload unchanged on a hyperscale cluster — the only thing you change is how many loaders, SQL nodes, and foundation drives you point at it.
So ociforge is small on purpose. It's a working proof that Ocient's petabyte-scale design degrades gracefully all the way down to the smallest possible footprint — and scales right back up by adding hardware, not by rewriting anything.
See Ocient Architecture and the Exabyte Scalability Design docs for the full design, and the datasets and example queries to put this little system to work.