# Schema Definition

In BucketDB, schemas serve a very specific, limited purpose: **to instruct the storage engine on how to efficiently serialize your JavaScript objects into packed binary rows.**

BucketDB adheres to the philosophy that data validation (e.g., ensuring a string matches an email regex, or an integer falls within a specific range) belongs in your application's business logic layer. Thus, BucketDB schemas do *not* reject writes based on constraint validation.
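As a rough illustration of that split, the sketch below runs the two checks mentioned above (an email pattern and an integer range) in application code before anything is handed to the database. `validateUser` and its rules are hypothetical and not part of BucketDB.

```javascript
// Hypothetical application-layer validator; BucketDB does not provide this.
const EMAIL_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

function validateUser(user) {
  const errors = [];
  if (typeof user.email !== 'string' || !EMAIL_PATTERN.test(user.email)) {
    errors.push('email must look like an email address');
  }
  if (!Number.isInteger(user.age) || user.age < 0 || user.age > 255) {
    errors.push('age must be an integer between 0 and 255');
  }
  return errors;
}

// Collect problems first; only write the row when the list is empty.
const problems = validateUser({ email: 'ada@example.com', age: 300 });
console.log(problems);
```

Returning a list of messages rather than throwing lets the caller report every problem at once.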

## Defining a Schema

You must register a schema before you can write to or query a table. A schema definition consists of the table `name`, a `version` number, a `fields` object, a `primaryKey` designation, and optionally an `indexes` array.

### Basic Example

```javascript
db.registerSchema({
  name: 'users',
  version: 1,  // Increment to perform migrations
  fields: {
    id: { typeId: 10, maxLength: 36 }, // varchar
    name: { typeId: 10, maxLength: 50 }, // varchar
    active: { typeId: 6 }, // boolean
    age: { typeId: 1 } // uint8
  },
  primaryKey: 'id',
  indexes: ['name']
});
```

### Supported Data Types

When a row is written, its fields are serialized into a fixed-width binary block according to these types. BucketDB uses integer type IDs to compress schemas.

| Type ID | Type        | Description                     | Range / Storage                                  |
| ------- | ----------- | ------------------------------- | ------------------------------------------------ |
| `1`     | `uint8`     | Unsigned 8-bit integer          | 0 to 255                                         |
| `2`     | `uint32`    | Unsigned 32-bit integer         | 0 to 4,294,967,295                               |
| `3`     | `uint64`    | Unsigned 64-bit integer         | 0 to 18,446,744,073,709,551,615                  |
| `4`     | `int32`     | Signed 32-bit integer           | -2,147,483,648 to 2,147,483,647                  |
| `5`     | `float64`   | Double precision float          | Standard JS Number                               |
| `6`     | `boolean`   | Boolean                         | `true` or `false`                                |
| `7`     | `timestamp` | UTC Unix timestamp              | Milliseconds                                     |
| `10`    | `varchar`   | Variable-length UTF-8 text      | Stored in the Heap. Requires `maxLength` config. |
| `11`    | `blob`      | Managed Unstructured Blob       | Stored in the Heap as a 36-char UUID pointer     |
| `12`    | `json`      | Arbitrary nested objects/arrays | Stored in the Heap                               |
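Bare numeric IDs are easy to mistype, so one option is to give them names in your own code. The sketch below mirrors the table above; `TYPE` is a local convenience object assumed for illustration, not something BucketDB exports.

```javascript
// Named aliases for the type IDs listed in the table.
// TYPE is a local helper, not a BucketDB export.
const TYPE = Object.freeze({
  uint8: 1,
  uint32: 2,
  uint64: 3,
  int32: 4,
  float64: 5,
  boolean: 6,
  timestamp: 7,
  varchar: 10,
  blob: 11,
  json: 12,
});

// The same schema as the basic example, written with the aliases.
const usersSchema = {
  name: 'users',
  version: 1,
  fields: {
    id: { typeId: TYPE.varchar, maxLength: 36 },
    name: { typeId: TYPE.varchar, maxLength: 50 },
    active: { typeId: TYPE.boolean },
    age: { typeId: TYPE.uint8 },
  },
  primaryKey: 'id',
  indexes: ['name'],
};

// Pass usersSchema to db.registerSchema() as in the basic example.
console.log(usersSchema.fields.age.typeId);
```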

### The Heap (Variable Length Data)

To keep the primary data block fixed-width (so a row can be located in constant time by multiplying its row index by the row's byte width), heap-backed types such as `varchar`, `blob`, and `json` are not stored directly in the row. Instead, the row stores a pointer (offset and length) to a secondary section of the block called the **Heap**.
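The seek arithmetic can be sketched as follows. The integer and float widths follow from the types themselves, but the widths used for `boolean`, `timestamp`, and heap pointers are assumptions made for this example; BucketDB's actual on-disk layout may differ.

```javascript
// Bytes per field, keyed by type ID.
// uint8 = 1, uint32/int32 = 4, uint64/float64 = 8 follow from the types.
// ASSUMED for illustration: boolean = 1, timestamp = 8,
// heap pointer (varchar/blob/json) = 8 (4-byte offset + 4-byte length).
const ASSUMED_WIDTH = { 1: 1, 2: 4, 3: 8, 4: 4, 5: 8, 6: 1, 7: 8, 10: 8, 11: 8, 12: 8 };

// Total bytes one row occupies in the fixed-width section.
function rowWidth(fields) {
  return Object.values(fields).reduce(
    (sum, field) => sum + ASSUMED_WIDTH[field.typeId],
    0
  );
}

// Byte offset of a row: a single multiplication, independent of table size.
function rowOffset(rowIndex, width) {
  return rowIndex * width;
}

const fields = {
  id: { typeId: 10, maxLength: 36 },
  name: { typeId: 10, maxLength: 50 },
  active: { typeId: 6 },
  age: { typeId: 1 },
};

const width = rowWidth(fields); // 8 + 8 + 1 + 1 = 18 under the assumed widths
console.log(rowOffset(1000, width)); // 18000 under the assumed widths
```

Because every row has the same width, no per-row scanning is needed to find row 1000; only the heap-backed values require a second read.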

## Primary Keys

You must explicitly designate a `primaryKey` field.

* Its value must be unique within the table.
* Most commonly, this is a `varchar` (e.g., a UUID or KSUID).
* You must always provide this field when calling `batch.insert()`, `batch.update()`, or `batch.delete()`.
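Since every batch operation needs the primary key, a small pre-flight check in application code can catch a missing key early. `assertHasPrimaryKey` below is a hypothetical helper, not a BucketDB API, and it makes no assumption about the signature of the `batch` methods.

```javascript
// Hypothetical pre-flight check; not part of BucketDB.
// Throws if the row lacks a value for the schema's primary key field.
function assertHasPrimaryKey(schema, row) {
  const value = row[schema.primaryKey];
  if (value === undefined || value === null) {
    throw new Error(
      `Row for table "${schema.name}" is missing primary key "${schema.primaryKey}"`
    );
  }
  return row;
}

const schema = { name: 'users', primaryKey: 'id' };

// Returns the row unchanged when the key is present.
const checked = assertHasPrimaryKey(schema, { id: 'u-1', name: 'Ada' });
console.log(checked.id);
```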

## Index Design

Queries in BucketDB without an index require a full table scan, meaning the system must download every single data block for that table and scan it in memory.

By defining `indexes`, you instruct BucketDB to maintain Copy-on-Write (CoW) B+Tree files natively in the storage driver.

### Defining Indexes

Indexes are defined as an array of field names. B+Tree indexes natively support exact equality matches (`=`) as well as range queries (`>`, `>=`, `<`, `<=`). See [Querying](/usage/querying.md) for performance implications.
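To see why an ordered index serves range queries as well as equality, consider the sketch below. It uses a plain sorted array as a stand-in for the B+Tree (it is not BucketDB's implementation): one binary search finds the start of the range, and the matching entries then sit next to each other.

```javascript
// Stand-in for an ordered index: [key, primaryKey] pairs sorted by key.
// A real B+Tree keeps the same ordering across its leaf pages.
const nameIndex = [
  ['Ada', 'u-1'],
  ['Bob', 'u-7'],
  ['Cy', 'u-3'],
  ['Dee', 'u-9'],
];

// Binary search: first position whose key is >= target.
function lowerBound(index, target) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (index[mid][0] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Primary keys for min <= key < max: one search, then a contiguous walk.
function rangeScan(index, min, max) {
  const out = [];
  for (let i = lowerBound(index, min); i < index.length && index[i][0] < max; i++) {
    out.push(index[i][1]);
  }
  return out;
}

console.log(rangeScan(nameIndex, 'B', 'D')); // [ 'u-7', 'u-3' ]
```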

## Schema Versioning

Because blocks are immutable, schemas in BucketDB are permanently tied to the blocks they create. You cannot modify a registered schema version. Instead, you create a new version (e.g., `version: 2`) and initiate a zero-downtime background migration via the Storage-Level Migration Daemon.
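A new version can be written as a separate definition derived from the old one, leaving the registered version untouched. In the sketch below the added `email` field and its `maxLength` are hypothetical, and the registration call is shown only as a comment because the migration steps themselves are covered in the migrations guide.

```javascript
// Version 1 as originally registered; frozen to signal it must not change.
const usersV1 = Object.freeze({
  name: 'users',
  version: 1,
  fields: {
    id: { typeId: 10, maxLength: 36 },
    name: { typeId: 10, maxLength: 50 },
    active: { typeId: 6 },
    age: { typeId: 1 },
  },
  primaryKey: 'id',
  indexes: ['name'],
});

// Version 2: a new object with a bumped version and one extra field.
// The `email` field is a hypothetical addition for illustration.
const usersV2 = {
  ...usersV1,
  version: 2,
  fields: {
    ...usersV1.fields,
    email: { typeId: 10, maxLength: 254 },
  },
};

// db.registerSchema(usersV2);
console.log(usersV1.version, usersV2.version);
```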

See [Schema Migrations](/usage/migrations.md) for the complete guide on modifying schemas safely.

