Celerity Datastore
Celerity resource type for NoSQL data stores
celerity/datastore
Spec Version: v2026-02-27-draft
Ecosystem Compatibility: v0 (Current) / v1 (Preview)
blueprint transform: celerity-2026-02-27-draft
Ecosystem Compatibility
- v0 (Current): AWS only, supporting Amazon DynamoDB.
- v1 (Preview): Multi-cloud, supporting Amazon DynamoDB, Google Cloud Firestore and Azure Cosmos DB.
Read more about Celerity versions here.
The celerity/datastore resource type defines a NoSQL data store (or table) for a Celerity application.
For Celerity applications, using a NoSQL data store instead of a SQL database will provide a number of benefits such as:
- Flexibility: NoSQL data stores are designed to be flexible and can handle unstructured data. See Structure and Data Pipelines for reasons why you might want to apply a schema for a NoSQL data store.
- Scalability: NoSQL data stores are designed to scale and can handle large amounts of data without requiring you to manage scaling infrastructure, as you would with most SQL database deployments.
- Performance: NoSQL data stores are designed to be performant and can handle high throughput without the need to fine-tune database configurations.
- Cost: NoSQL data stores are often cheaper than SQL databases. Even as managed SQL database services reduce the friction of deploying and scaling, NoSQL data stores are still cheaper in many cases.
- Maintenance: NoSQL data stores are often easier to maintain than SQL databases, especially for deployments to cloud environments.
Specification
The specification is the structure of the resource definition that comes under the spec field of the resource in a blueprint.
The rest of this section lists the fields available to configure the celerity/datastore resource, followed by examples of different configurations, details of how the data store behaves in target environments and additional documentation.
Feature Availability
- ✅ Available in v0 - Features currently supported
- 🔄 Planned for v0 - Features coming in future v0 evolution
- 🚀 Planned for v1 - Features coming in v1 release
name
A unique name to use for the data store. If a name is not provided, a unique name will be generated based on the blueprint that the data store is defined in. This will map to a table, collection or namespace in the target environment.
type
string
keys (required)
A definition of the primary and sort keys for the data store. For target environments that do not support composite keys, sort keys will be combined with the primary key to create additional composite indexes.
✅ Available in v0
type
dataStoreKeys
schema
An inline schema to document the structure of items in the data store. This is optional; however, it should be considered essential for providing a single source of truth for the structure of the data that can be used by consumers of the data, such as data teams that build data pipelines and internal systems (in your company) that make use of the data.
When schemaPath is provided, the external schema file takes precedence and this inline schema is ignored.
✅ Available in v0
type
dataStoreSchema
schemaPath
The path to an external YAML schema definition file, relative to the blueprint file. When provided, Celerity manages the datastore schema through its built-in schema management system. The external schema file supports rich metadata (owner, tags, classification, default) and integrates with schema contracts, type generation and export systems.
When schemaPath is provided, any inline schema on the same resource is ignored — the external file takes precedence.
✅ Available in v0
type
string
examples
./schemas/user-store.yaml
scriptsPath
The path to a directory containing escape hatch data scripts, relative to the blueprint file. These are versioned scripts for data operations that cannot be expressed in the schema YAML (backfilling new fields, migrating data formats, cleaning up deprecated fields).
Scripts follow the naming convention V<number>__<description>.<ext> and can be written in any language. They are tracked in schema state and surfaced in celerity schema diff, but are NOT automatically executed during deployment. See Escape Hatch Data Scripts for details.
✅ Available in v0
type
string
examples
./scripts/user-store
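For example, a scripts directory following this convention might contain the following files (the file names and languages are purely illustrative):

./scripts/user-store/
  V1__backfill_status_field.ts
  V2__migrate_preferences_format.py
  V3__cleanup_deprecated_last_login.ts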
indexes
A list of indexes to apply to the data store.
✅ Available in v0
type
array[dataStoreIndex]
timeToLive
Time to live configuration for items in the data store.
Warning
TTL functionality is only supported for some target environments.
✅ Available in v0
type
dataStoreTimeToLive
Annotations
There are no annotations required for linking other resources to a celerity/datastore resource or modifying the behaviour of a data store resource.
linkSelector.byLabel can be used to target data stores from other resource types.
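For example, a handler can target a data store by label instead of referencing it directly. The sketch below assumes the usual blueprint labelling conventions (metadata.labels on the target and linkSelector.byLabel on the linking resource); the handler's spec is omitted for brevity:

resources:
  userStore:
    type: "celerity/datastore"
    metadata:
      labels:
        app: "users"
    spec:
      name: "users"
      keys:
        partitionKey: "id"

  usersHandler:
    type: "celerity/handler"
    linkSelector:
      byLabel:
        app: "users"
    spec:
      # handler configuration omitted for brevity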
Outputs
Outputs are computed values that are accessible via the {resourceName}.spec.* field accessor in a blueprint substitution.
For example, if the resource name is myDatastore, the output would be accessible via ${myDatastore.spec.id}.
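The generated ID could then be passed to another resource through a substitution. The snippet below is an illustrative sketch; the environment variable field on the handler is an assumption and depends on the resource type consuming the value:

  ordersHandler:
    type: "celerity/handler"
    spec:
      # Illustrative field name; check the celerity/handler documentation for the exact field.
      environmentVariables:
        DATASTORE_ID: "${myDatastore.spec.id}"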
id
The ID of the created data store in the target environment.
type
string
examples
arn:aws:dynamodb:us-east-1:123456789012:table/users (AWS)
projects/my-project/databases/(default)/documents/users (Google Cloud)
my-account/my-database/users (Azure)
Data Types
dataStoreKeys
A definition of the keys that make up the primary and sort keys for the data store. Sort keys are only supported for target environments that support composite keys such as Amazon DynamoDB.
FIELDS
partitionKey (required)
The partition key of the data store; this is the field that will be the unique identifier for each item in the data store.
field type
string
sortKey
The sort key of the data store; this is the field that will be used to sort the items in the data store. This is only supported for target environments that support composite keys such as Amazon DynamoDB.
field type
string
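For example, a data store that holds items per user and sorts them by creation time could be keyed as follows (the field names are illustrative); in target environments without composite key support, the sort key is combined with the partition key as described above:

      keys:
        partitionKey: "userId"
        sortKey: "createdAt"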
dataStoreSchema
A schema to document the structure of the data in the data store.
FIELDS
description
A human-readable description of the data store schema. Used for documentation and schema exports.
field type
string
owner
The team or individual that owns the data store schema. Used for documentation, data governance and schema contracts.
field type
string
tags
Arbitrary tags for categorisation of the data store (e.g. pii, core-entity, financial). Used for documentation and schema exports.
field type
array[string]
required
A list of the fields that are required for an item in the data store.
field type
array[string]
fields
Schema definitions for the fields that make up an item in the data store.
field type
array[dataStoreFieldSchema]
dataStoreFieldSchema
Schema definition for a field in an item in the data store or in a nested object.
FIELDS
type
The type of the field.
field type
string
allowed values
string | number | boolean | object | array
description
A description of the field.
field type
string
nullable
Whether the field can be null. Different from "not required" — a nullable field may exist on an item with a null value, while a non-required field may not exist at all.
field type
boolean
default
A default value for the field. This is used by type generation and SDK validation but is not enforced by the database. When a new item is written without this field, the application should use this default.
field type
string
classification
A data classification label for the field (e.g. pii, sensitive, public, internal-id). Used for data governance, documentation and schema exports.
field type
string
tags
Arbitrary tags for categorisation of the field (e.g. business-metric, engagement-metric). Used for documentation and schema exports.
field type
array[string]
fields
Schema definitions for the fields that make up a nested object.
This should only be set when the field schema type is object.
field type
array[dataStoreFieldSchema]
items
Schema definitions for the items in an array.
This should only be set when the field schema type is array.
field type
array[dataStoreFieldSchema]
dataStoreIndex
An index to apply to the data store to allow for efficient querying of the data store based on different combinations of fields.
FIELDS
name
The name of the index.
field type
string
fields
The fields to include in the index.
The order of the fields can be important depending on the target environment. For example, when deploying to AWS (Amazon DynamoDB), the first field in the list will be used as the partition key and the second field will be used as the sort key.
The number of fields in the index is limited based on the target environment, see the Target Environments section for more information.
field type
array[string]
dataStoreTimeToLive
Time to live configuration for items in the data store.
FIELDS
fieldName
The field name to use for the time to live. Depending on the target environment, the value of this field is expected to be either a unix timestamp or a number of seconds from the current time at which the item should expire.
field type
string
enabled
Whether the time to live is enabled for the data store.
field type
boolean
Linked From
celerity/handler
A handler can link to a data store at runtime. Linking a handler to a data store resource will automatically configure the handler to be able to use the data store with appropriate permissions, environment variables/secrets and configuration.
✅ Available in v0
Read more about Celerity handlers
Links To
celerity/consumer
When a data store links out to a consumer, either a stream or pub/sub configuration will be created to allow the consumer to receive events from the data store.
✅ Available in v0
Read more about Celerity consumers
Examples
Data Store with Schema and Indexes
version: 2025-11-02
transform: celerity-2026-02-27-draft
resources:
  userStore:
    type: "celerity/datastore"
    metadata:
      displayName: "User Store"
    spec:
      name: "users"
      keys:
        partitionKey: "id"
      schema:
        required: ["id", "name", "email"]
        fields:
          id:
            type: "string"
            description: "The ID of the user"
          name:
            type: "string"
            description: "The name of the user"
          email:
            type: "string"
            description: "The email of the user"
          lastLogin:
            type: "number"
            description: "The unix timestamp of the last login"
          isActive:
            type: "boolean"
            description: "Whether the user is active"
          roles:
            type: "array"
            description: "The roles of the user"
            items:
              type: "string"
              description: "The role of the user"
      indexes:
        - name: "emailNameIndex"
          fields: ["email", "name"]
Data Store with External Schema and Indexes
This example uses schemaPath and scriptsPath to manage the data store schema through Celerity's built-in schema management system. The external schema file supports rich metadata, type generation, schema contracts and exports.
version: 2025-11-02
transform: celerity-2026-02-27-draft
resources:
  userStore:
    type: "celerity/datastore"
    metadata:
      displayName: "User Store"
    spec:
      name: "users"
      keys:
        partitionKey: "id"
      schemaPath: "./schemas/user-store.yaml"
      scriptsPath: "./scripts/user-store"
      indexes:
        - name: "emailIndex"
          fields: ["email"]
        - name: "teamRoleIndex"
          fields: ["teamId", "role"]
      timeToLive:
        fieldName: "ttl"
        enabled: true
The corresponding external schema file (schemas/user-store.yaml) would define the field-level schema with rich metadata:
description: "User accounts and profile data"
owner: "platform-team"
tags: ["pii", "core-entity"]
required: ["id", "email", "name", "status"]
fields:
id:
type: string
description: "Unique user identifier (ULID)"
email:
type: string
description: "Primary email address"
classification: pii
name:
type: string
description: "Display name"
classification: pii
status:
type: string
description: "Account status: active | suspended | deleted"
default: "active"
teamId:
type: string
description: "Team the user belongs to"
nullable: true
role:
type: string
description: "User role within their team"
default: "member"
preferences:
type: object
description: "User preferences"
fields:
theme:
type: string
default: "light"
notifications:
type: boolean
default: true
ttl:
type: number
description: "TTL for soft-deleted accounts (unix timestamp)"
nullable: trueSee the NoSQL Datastore Schema Management guide for full details on the schema format, type generation, contracts and more.
Target Environments
Local Development
✅ Available in v0
In the local development environment, data stores are backed by a cloud-specific emulator that matches the deploy target configured in app.deploy.jsonc. This ensures that cloud-specific query patterns and APIs work correctly during local development.
| Deploy Target | Local Emulator | Notes |
|---|---|---|
| aws / aws-serverless | Amazon DynamoDB Local | Full DynamoDB API compatibility including queries, GSIs, and streams |
| gcloud / gcloud-serverless | Firebase Emulator | Planned for v1 |
| azure / azure-serverless | | Planned for v1 |
A single emulator instance is shared across all data stores in the local development environment. Key schema (partition keys, sort keys, and indexes) defined in the blueprint is used to create the local tables automatically.
AWS
✅ Available in v0
In the AWS environment, data stores are backed by an Amazon DynamoDB table.
There is a one-to-one mapping between the partition and sort keys defined for the data store (table) and indexes and the primary and sort keys in DynamoDB. For indexes, the first field in the fields list will be used as the partition key and the second field (if present) will be used as the sort key.
Without any configuration, the underlying table will be created with the PAY_PER_REQUEST billing mode.
The app deploy configuration can be used to configure the data store with Amazon DynamoDB specific settings such as the read and write capacity units and types of indexes to create.
Google Cloud
🚀 Planned for v1 - This target environment is planned for the v1 release and may become available in a future v0 evolution.
Azure
🚀 Planned for v1 - This target environment is planned for the v1 release and may become available in a future v0 evolution.
App Deploy Configuration
Configuration specific to a target environment can be defined for celerity/datastore resources in the app deploy configuration file.
This section lists the configuration options that can be set in the deployTarget.config object in the app deploy configuration file.
AWS Configuration Options
✅ Available in v0
aws.dynamodb.<datastoreName>.billingMode
The billing mode to use for the DynamoDB table backing a specific data store. This can be set to PAY_PER_REQUEST or PROVISIONED and defaults to PAY_PER_REQUEST.
datastoreName is the name (key) of the data store resource in the blueprint.
Deploy Targets
aws, aws-serverless
type
string
allowed values
PAY_PER_REQUEST | PROVISIONED
default value
PAY_PER_REQUEST
examples
{
  "deployTarget": {
    "name": "aws",
    "appEnv": "production",
    "config": {
      "aws.dynamodb.userStore.billingMode": "PROVISIONED"
    }
  }
}
aws.dynamodb.<datastoreName>.readCapacityUnits
The read capacity units to use for the DynamoDB table backing a specific data store. This is only available when the billing mode is set to PROVISIONED.
datastoreName is the name (key) of the data store resource in the blueprint.
Deploy Targets
aws, aws-serverless
type
number
examples
{
  "deployTarget": {
    "name": "aws",
    "appEnv": "production",
    "config": {
      "aws.dynamodb.userStore.readCapacityUnits": 10
    }
  }
}
aws.dynamodb.<datastoreName>.writeCapacityUnits
The write capacity units to use for the DynamoDB table backing a specific data store. This is only available when the billing mode is set to PROVISIONED.
datastoreName is the name (key) of the data store resource in the blueprint.
Deploy Targets
aws, aws-serverless
type
number
examples
{
  "deployTarget": {
    "name": "aws",
    "appEnv": "production",
    "config": {
      "aws.dynamodb.userStore.writeCapacityUnits": 10
    }
  }
}
aws.dynamodb.<datastoreName>.maxReadRequestUnits
The maximum number of read request units to use for the DynamoDB table backing a specific data store. This is only available when the billing mode is set to PAY_PER_REQUEST.
datastoreName is the name (key) of the data store resource in the blueprint.
Deploy Targets
aws, aws-serverless
type
number
examples
{
  "deployTarget": {
    "name": "aws",
    "appEnv": "production",
    "config": {
      "aws.dynamodb.userStore.maxReadRequestUnits": 1000
    }
  }
}
aws.dynamodb.<datastoreName>.maxWriteRequestUnits
The maximum number of write request units to use for the DynamoDB table backing a specific data store. This is only available when the billing mode is set to PAY_PER_REQUEST.
datastoreName is the name (key) of the data store resource in the blueprint.
Deploy Targets
aws, aws-serverless
type
number
examples
{
  "deployTarget": {
    "name": "aws",
    "appEnv": "production",
    "config": {
      "aws.dynamodb.userStore.maxWriteRequestUnits": 1000
    }
  }
}
aws.dynamodb.<datastoreName>.replicaRegions
The regions to replicate the table to, making the data store a DynamoDB global table. This is expected to be a comma-separated list of region names.
Deploy Targets
aws, aws-serverless
type
string
examples
{
  "deployTarget": {
    "name": "aws",
    "appEnv": "production",
    "config": {
      "aws.dynamodb.userStore.replicaRegions": "us-east-1,us-east-2"
    }
  }
}
Google Cloud Configuration Options
🚀 Planned for v1 - The Google Cloud deployment targets are planned for v1; they may become available in a future v0 evolution.
Azure Configuration Options
🚀 Planned for v1 - The Azure deployment targets are planned for v1; they may become available in a future v0 evolution.
SDK Operations
The Celerity SDKs provide a cloud-agnostic interface for interacting with data stores. The following operations are supported across all target providers (Amazon DynamoDB, Google Cloud Firestore and Azure Cosmos DB).
For language-specific documentation, see Node.js SDK - Datastore and Python SDK - Datastore.
Core Operations
- getItem(key) — Get a single item by primary key
- putItem(item) — Put or upsert a full item
- deleteItem(key) — Delete an item by primary key
- query(options) — Query by partition key with optional sort key and filter conditions
- scan(options) — Full table scan with optional filter conditions
- batchGetItems(keys) — Get multiple items in a single call
- batchWriteItems(ops) — Batch put and delete operations
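As a rough sketch of how these operations compose in application code, the snippet below assumes a data store client has already been obtained from the Celerity SDK. The getDatastore helper, the key shape and the item fields are illustrative assumptions rather than the definitive API; see the language-specific SDK docs linked above for the exact client construction.

// Hypothetical helper for obtaining a client for the "userStore" resource;
// the real construction is documented in the Node.js SDK - Datastore guide.
const users = getDatastore("userStore");

async function example() {
  // Put (upsert) a full item keyed by the partition key.
  await users.putItem({ id: "user-123", name: "Ada", email: "ada@example.com" });

  // Fetch the item back by its primary key.
  const user = await users.getItem({ id: "user-123" });

  // Delete the item by its primary key.
  await users.deleteItem({ id: "user-123" });
}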
Condition Expressions
Queries and scans support portable condition expressions with the following operators:
eq | ne | lt | le | gt | ge | between | startsWith | contains | exists
Note
The not_exists operator is only available through provider-specific SDK classes as Google Cloud Firestore does not support querying for the absence of a field.
Multiple conditions can be combined using AND or OR logic:
- Array: [condA, condB] — implicit AND (all must match)
- Explicit AND: { and: [...] } — equivalent to the array form
- OR: { or: [...] } — at least one must match
Groups can be nested recursively for compound logic:
// (status = "active" OR status = "pending") AND age > 18
filter: { and: [
  { or: [
    { name: "status", operator: "eq", value: "active" },
    { name: "status", operator: "eq", value: "pending" },
  ]},
  { name: "age", operator: "gt", value: 18 },
]}
Range Conditions
When querying with a composite key, the following range conditions are supported for the sort key:
eq | lt | le | gt | ge | between | startsWith
Cursor Pagination
Query and scan results are returned as async iterables with cursor-based pagination, allowing you to efficiently iterate through large result sets and resume from where you left off.
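Putting range conditions and cursor pagination together, a query against a composite-key data store might look like the sketch below. The getDatastore helper and the option names (partitionKey, rangeCondition, filter) are illustrative assumptions, not the definitive API; the for await loop reflects the async iterable results described above.

// Hypothetical client handle and option names; consult the SDK reference for exact usage.
const orders = getDatastore("orderStore");

async function listShippedOrders() {
  const results = orders.query({
    partitionKey: { name: "userId", value: "user-123" },
    // Sort key range condition using one of the supported range operators.
    rangeCondition: { name: "createdAt", operator: "between", value: [1735689600, 1738368000] },
    // Portable filter condition applied to the results.
    filter: { name: "status", operator: "eq", value: "shipped" },
  });

  // Cursor-based pagination is handled internally by the async iterable.
  for await (const order of results) {
    console.log(order.id, order.total);
  }
}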
Schema Management
✅ Available in v0
Celerity provides built-in schema management for NoSQL data stores that makes the schema a first-class concern — bridging the gap between development and data teams. Unlike SQL databases, where you can introspect information_schema, NoSQL databases have no database-level schema to query. The "schema" lives only in application code, leaving data teams that build pipelines on top of the data flying blind.
Celerity solves this by treating the schema YAML as the single source of truth:
- Schema-as-code: Declarative YAML definitions with rich metadata (description, owner, tags, classification) that are version-controlled alongside your application
- Type generation: Generate TypeScript interfaces and Python Pydantic models from the schema for type-safe development (celerity schema codegen; see the sketch after this list)
- Schema validation: Layered validation — runtime (SDK validates on writes), build-time (generated types) and CI (celerity schema validate)
- Schema contracts: Data teams declare which data stores they depend on and get CI-level protection when schemas change
- Schema exports: Export as markdown, JSON Schema or Avro for pipeline tool integration (celerity schema export)
- Schema diff: See what changed between the current and desired schema state (celerity schema diff)
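To give a sense of what type generation produces, the snippet below is an illustrative sketch of a TypeScript interface generated for the user-store schema shown earlier; the exact output of celerity schema codegen may differ.

// Illustrative sketch of a generated interface for the user-store schema above;
// the exact naming and output of `celerity schema codegen` may differ.
export interface UserStoreItem {
  id: string;
  email: string;
  name: string;
  status: string;            // defaults to "active"
  teamId?: string | null;
  role?: string;             // defaults to "member"
  preferences?: {
    theme?: string;
    notifications?: boolean;
  };
  ttl?: number | null;
}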
To use schema management, set the schemaPath field on the data store resource to point to an external YAML schema file. See the Data Store with External Schema and Indexes example above.
For the full guide on NoSQL schema management, see NoSQL Datastore Schema Management.
Structure and Data Pipelines
Celerity provides a way to define schemas for NoSQL data stores to document and track the structure of the data in the data store. This is especially useful for providing a source of truth for data teams that need to know the structure of the data to feed into pipelines.
NoSQL databases are often preferred for their flexibility and ability to handle unstructured data. However, when you are building systems for a company, there will come a point where you will want to make sense of the data to help your company make decisions based on how your customers are using your product. There are also plenty of reasons other than data flexibility to opt for NoSQL databases over SQL databases, such as scalability, performance and cost.
The key benefit of defining a schema for a NoSQL data store is to provide a single source of truth for the structure of the data in the data store. Tools in the Celerity ecosystem that integrate tightly with Celerity applications can use the schema to understand the intended structure of the data, identify drift where source code or the actual data in the data store does not match the schema, and keep all parties that need to make use of the data in sync.
Defining schemas for NoSQL data stores is not required. When experimenting, you may not yet know the shape of the data you will be storing, in which case you may not want to define a schema until further down the line.
For a comprehensive guide on managing NoSQL data store schemas — including type generation, schema contracts for data teams, CI validation and export formats — see NoSQL Datastore Schema Management.