Overview
Architecture
GreptimeDB consists of the following key components:
- Frontend, which exposes read and write services over various protocols and forwards requests to the Datanode.
- Datanode, which is responsible for storing data in persistent storage, such as local disk or cloud object storage like AWS S3 and Azure Blob Storage.
- Metasrv, a server that coordinates operations between the Frontend and the Datanode.
Concepts
To better understand GreptimeDB, a few concepts need to be introduced:
- A table is where user data is stored in GreptimeDB. A table has a schema and a totally ordered primary key. A table is split into segments called regions by its partition key.
- A region is a contiguous segment of a table, comparable to a partition in some relational databases. A region can be replicated on multiple datanodes; only one of these replicas is writable and can serve write requests, while any replica can serve read requests.
- A datanode stores regions and serves them to frontends. One datanode can serve multiple regions, and one region can be served by multiple datanodes.
- The metasrv stores the metadata of the cluster, such as tables, datanodes, and the regions of each table. It also coordinates frontends and datanodes.
- The frontend has a catalog implementation, which fetches metadata from the metasrv and tells which region of a table is served by which datanode.
- A frontend is a stateless service that serves requests from clients. It acts as a proxy, forwarding read and write requests to the corresponding datanode according to the mapping from the catalog.
- A time series of a table is identified by its primary key. Each table must have a timestamp column, as GreptimeDB is a time-series database. Data in a table is sorted by its primary key and timestamp, but the actual order is implementation-specific and may change in the future.
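The relationships between tables, regions, replicas, and datanodes can be sketched as a small data model. This is an illustrative sketch only, not GreptimeDB's actual code: the class names (`Replica`, `Region`, `Catalog`), the method names, and the `cpu_metrics` table are all hypothetical. It shows the one rule stated above, that a region may have several replicas but only one of them accepts writes.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Replica:
    """One copy of a region hosted on a datanode (hypothetical type)."""
    datanode_id: int
    writable: bool  # only one replica per region may be writable


@dataclass
class Region:
    """A contiguous segment of a table (hypothetical type)."""
    region_id: int
    replicas: list[Replica] = field(default_factory=list)

    def writable_datanode(self) -> int:
        """Return the single datanode allowed to accept writes for this region."""
        leaders = [r.datanode_id for r in self.replicas if r.writable]
        if len(leaders) != 1:
            raise ValueError(
                f"region {self.region_id} must have exactly one writable replica"
            )
        return leaders[0]

    def readable_datanodes(self) -> list[int]:
        """Any replica can serve read requests."""
        return [r.datanode_id for r in self.replicas]


@dataclass
class Catalog:
    """Stand-in for the metadata a frontend fetches from the metasrv."""
    tables: dict[str, list[Region]] = field(default_factory=dict)

    def route_write(self, table: str, region_id: int) -> int:
        """Look up which datanode should receive a write for a region."""
        for region in self.tables[table]:
            if region.region_id == region_id:
                return region.writable_datanode()
        raise KeyError(f"table {table!r} has no region {region_id}")


# Example cluster: one table, two regions, three datanodes.
catalog = Catalog(tables={
    "cpu_metrics": [
        Region(0, [Replica(1, writable=True), Replica(2, writable=False)]),
        Region(1, [Replica(2, writable=True), Replica(3, writable=False)]),
    ]
})
```

In this example datanode 2 serves two regions, and each region is served by two datanodes, which mirrors the many-to-many relationship described in the list.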
How it works
Before diving into each component, let's take a high-level view of how the database works.
- Users can interact with the database via various protocols, such as ingesting data using the InfluxDB line protocol and then exploring the data using SQL or PromQL. The frontend is the component that users or clients connect to and operate on; it hides the datanode and metasrv behind it.
- Assume a user inserts data into the database through the HTTP API by sending an HTTP request to a frontend instance. When the frontend receives the request, it parses the request body using the corresponding protocol parser and finds the table to write to from a catalog manager based on the metasrv.
- The frontend relies on a push-pull strategy to cache metadata from the metasrv, so it knows which datanode, or more precisely which region, a request should be sent to. A request may be split and sent to multiple regions if its contents need to be stored in different regions.
- When a datanode receives the request, it writes the data to the region and then sends a response back to the frontend. Writing to the region in turn writes to the underlying storage engine, which eventually puts the data on a persistent device.
- Once the frontend has received all responses from the target datanodes, it sends the result back to the user.
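The write path above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not GreptimeDB's implementation: the routing table `REGION_TO_DATANODE`, the partition rule in `region_of`, and the `FakeDatanode` class are all hypothetical, and real datanodes are reached over the network rather than by direct method calls.

```python
from collections import defaultdict

# Hypothetical routing metadata that a frontend would cache from the metasrv:
# region id -> name of the datanode that accepts writes for that region.
REGION_TO_DATANODE = {0: "datanode-1", 1: "datanode-2"}


def region_of(row):
    """Hypothetical partition rule: hosts sorting before 'm' belong to region 0."""
    return 0 if row["host"] < "m" else 1


def split_by_region(rows):
    """Group the rows of one request by the region that must store them."""
    batches = defaultdict(list)
    for row in rows:
        batches[region_of(row)].append(row)
    return dict(batches)


class FakeDatanode:
    """In-memory stand-in for a datanode; keeps rows per region."""

    def __init__(self, name):
        self.name = name
        self.regions = defaultdict(list)

    def write(self, region_id, rows):
        # A real datanode would hand the rows to its storage engine here.
        self.regions[region_id].extend(rows)
        return len(rows)  # response: number of rows written


def handle_write(rows, datanodes):
    """Frontend side: split the request, forward each batch, merge the responses."""
    affected = 0
    for region_id, batch in split_by_region(rows).items():
        datanode = datanodes[REGION_TO_DATANODE[region_id]]
        affected += datanode.write(region_id, batch)
    return affected


rows = [
    {"host": "alpha", "ts": 1, "cpu": 0.5},
    {"host": "zeta", "ts": 1, "cpu": 0.9},
    {"host": "beta", "ts": 2, "cpu": 0.1},
]
datanodes = {name: FakeDatanode(name) for name in REGION_TO_DATANODE.values()}
affected = handle_write(rows, datanodes)
```

Here a single three-row request is split into two batches, one per region, and the frontend reports the total only after both datanodes have responded.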
For more details on each component, see the following guides: