# Querying & Indexes

BucketDB provides a fluent `QueryBuilder` API for reading data. It allows filtering, applying custom JavaScript functions, limiting results, and leveraging native Copy-on-Write (CoW) B+Tree indexes.

## Basic Execution

Every query starts from a table: pass the table name as the first argument to `db.query()`.

```javascript
const results = await db.query('users')
  .where('role', '=', 'admin')
  .limit(50)
  .execute();
```

## Supported Operators

BucketDB supports the following comparison operators. Each can be written as a `.where()` condition or with the shorthand method listed beside it:

* `where('field', '=', 'value')` or `.eq()`
* `where('field', '!=', 'value')` or `.ne()`
* `where('field', '>', 'value')` or `.gt()`
* `where('field', '>=', 'value')` or `.gte()`
* `where('field', '<', 'value')` or `.lt()`
* `where('field', '<=', 'value')` or `.lte()`
* `where('field', 'IN', ['value1', 'value2'])` or `.in()`
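
To make the operator semantics concrete, here is a minimal plain-JavaScript model of how each operator could be evaluated against a row. It is a sketch, not BucketDB's implementation: the `OPERATORS` table and the `matches` helper are hypothetical names, and strict equality (`===`) is an assumption.

```javascript
// Illustrative model only: not BucketDB's internals.
// Assumption: '=' and '!=' use strict equality, 'IN' uses Array.prototype.includes.
const OPERATORS = {
  '=': (a, b) => a === b,
  '!=': (a, b) => a !== b,
  '>': (a, b) => a > b,
  '>=': (a, b) => a >= b,
  '<': (a, b) => a < b,
  '<=': (a, b) => a <= b,
  'IN': (a, list) => list.includes(a),
};

// Hypothetical helper: does `row` satisfy a single condition?
function matches(row, field, op, value) {
  const compare = OPERATORS[op];
  if (!compare) throw new Error(`Unsupported operator: ${op}`);
  return compare(row[field], value);
}

const rows = [
  { id: 'u1', role: 'admin', age: 41 },
  { id: 'u2', role: 'guest', age: 25 },
  { id: 'u3', role: 'editor', age: 19 },
];

const adults = rows.filter((r) => matches(r, 'age', '>=', 21));
const staff = rows.filter((r) => matches(r, 'role', 'IN', ['admin', 'editor']));
console.log(adults.map((r) => r.id)); // [ 'u1', 'u2' ]
console.log(staff.map((r) => r.id)); // [ 'u1', 'u3' ]
```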

## Performance and the Query Optimizer

### The Mock Query Executor

As of v0.2.0, BucketDB includes a Mock Query Executor. The background Write-Forward Service maintains a strictly bounded `_sample` system table (e.g., 100 rows per schema version).

When you execute a query, the **Query Optimizer** does not rely on stale statistics. Instead, it runs the actual JavaScript predicates against this in-memory sample. It calculates a projected physical block read count and assigns a cost score to each potential execution plan. The plan with the lowest cost wins. Because the sample is small and bounded, this keeps index selection well informed with negligible overhead.
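
The idea of sample-based costing can be sketched in a few lines. The following is only an illustration under stated assumptions: `estimateMatchingRows`, `choosePlan`, and the cost formulas are hypothetical and are not taken from BucketDB's source. It runs a predicate against a 100-row sample, projects block reads for each plan, and picks the cheapest.

```javascript
// Illustrative sketch only: the cost formulas below are assumptions,
// not BucketDB's actual optimizer.

// Project how many rows in the full table would match, based on the sample.
function estimateMatchingRows(sampleRows, predicate, totalRows) {
  if (sampleRows.length === 0) return totalRows; // no sample: assume everything matches
  const hits = sampleRows.filter(predicate).length;
  return Math.ceil((hits * totalRows) / sampleRows.length);
}

// Score each candidate plan by projected block reads and keep the cheapest.
function choosePlan({ sampleRows, predicate, totalRows, rowsPerBlock, indexed }) {
  const totalBlocks = Math.ceil(totalRows / rowsPerBlock);
  const plans = [{ name: 'full-scan', cost: totalBlocks }];
  if (indexed) {
    const matching = estimateMatchingRows(sampleRows, predicate, totalRows);
    // Assumed model: a tree descent of about log2(totalBlocks) index blocks,
    // plus at most one data block per matching row (capped at the table size).
    const descent = Math.ceil(Math.log2(Math.max(totalBlocks, 2)));
    plans.push({ name: 'index', cost: descent + Math.min(matching, totalBlocks) });
  }
  return plans.reduce((best, plan) => (plan.cost < best.cost ? plan : best));
}

// A 100-row sample in which exactly one row is an admin.
const sampleRows = Array.from({ length: 100 }, (_, i) => ({
  id: `u${i}`,
  role: i === 0 ? 'admin' : 'guest',
}));
const table = { sampleRows, totalRows: 10000, rowsPerBlock: 10, indexed: true };

const selective = choosePlan({ ...table, predicate: (row) => row.role === 'admin' });
const broad = choosePlan({ ...table, predicate: (row) => row.role === 'guest' });
console.log(selective); // { name: 'index', cost: 110 }
console.log(broad); // { name: 'full-scan', cost: 1000 }
```

In this toy model a selective predicate favors the index, while a predicate that matches most rows makes the full scan cheaper, which is the kind of trade-off a cost-based plan choice is meant to capture.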

### Indexes vs. Full Table Scans

#### 1. B+Tree Indexes

If you define `indexes: ['role']` in your [Schema Definition](/usage/schema.md), the optimizer can use the index for equality conditions (`.where('role', '=', 'admin')`), range conditions (`>`, `>=`, `<`, `<=`), and `IN` lists. When an index plan is chosen:

* Instead of downloading all data blocks, it traverses the B+Tree blocks under the `indexes/` prefix.
* It returns precise pointers to specific rows in specific blocks, downloading only what is necessary.
* **Complexity: O(log N)**

#### 2. Full Table Scans

If you filter on an unindexed field, BucketDB cannot use a B+Tree and falls back to scanning the whole table:

* It must fetch the `Block Index` from `Block 0`.
* It must download *every single block* associated with the table into the local cache.
* It scans the blocks linearly in memory.
* **Complexity: O(N)**

*Warning: Full table scans are exceptionally slow on multi-gigabyte datasets and can spike your S3 `GET` request costs. Always define indexes for fields you query frequently.*
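
The difference between the two access paths can be illustrated with a small in-memory model. This is a sketch under assumptions, not BucketDB's storage format: a sorted array searched with binary search stands in for the B+Tree, and `buildBlocks`, `buildIndex`, `fullScan`, and `indexLookup` are hypothetical helpers. Only data block reads are counted; reads of the index itself are ignored.

```javascript
// Illustrative model only: a sorted array stands in for the B+Tree.

// Split rows into fixed-size "blocks".
function buildBlocks(rows, rowsPerBlock) {
  const blocks = [];
  for (let i = 0; i < rows.length; i += rowsPerBlock) {
    blocks.push(rows.slice(i, i + rowsPerBlock));
  }
  return blocks;
}

// Build sorted (key -> block, offset) pointers for one field.
function buildIndex(blocks, field) {
  const entries = [];
  blocks.forEach((block, blockNo) => {
    block.forEach((row, offset) => entries.push({ key: row[field], blockNo, offset }));
  });
  return entries.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

// O(N): every block is read and every row is checked.
function fullScan(blocks, field, value) {
  let blocksRead = 0;
  const rows = [];
  for (const block of blocks) {
    blocksRead += 1;
    for (const row of block) if (row[field] === value) rows.push(row);
  }
  return { rows, blocksRead };
}

// O(log N) search for the first matching key, then follow pointers.
function indexLookup(blocks, index, value) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (index[mid].key < value) lo = mid + 1;
    else hi = mid;
  }
  const touched = new Set(); // distinct data blocks actually needed
  const rows = [];
  for (let i = lo; i < index.length && index[i].key === value; i += 1) {
    touched.add(index[i].blockNo);
    rows.push(blocks[index[i].blockNo][index[i].offset]);
  }
  return { rows, blocksRead: touched.size };
}

// 1,000 users in 20 blocks of 50 rows; 4 of them are admins.
const users = Array.from({ length: 1000 }, (_, i) => ({
  id: `u${i}`,
  role: i % 250 === 0 ? 'admin' : 'guest',
}));
const blocks = buildBlocks(users, 50);
const roleIndex = buildIndex(blocks, 'role');

const scanned = fullScan(blocks, 'role', 'admin');
const indexed = indexLookup(blocks, roleIndex, 'admin');
console.log(scanned.rows.length, scanned.blocksRead); // 4 20
console.log(indexed.rows.length, indexed.blocksRead); // 4 4
```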

## Batch Overlay

BucketDB is built around eventual consistency and Write-Forward logs. You can pass an active batch to a query to overlay the unflushed data on top of the S3 state.

```javascript
const b = db.batch();
b.insert('users', { id: 'u1', email: 'test@example.com', role: 'guest' });

// This query returns the newly inserted row, even though it hasn't 
// been flushed to the S3 Write-Forward log yet!
const localGuests = await db.query('users')
  .overlay(b)
  .where('role', '=', 'guest')
  .execute();
```
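
Conceptually, an overlay merges pending rows over the flushed state before the filter is applied. The sketch below models that idea in plain JavaScript. It is an assumption-laden illustration, not BucketDB's merge logic: `overlayRows` is a hypothetical helper, rows are assumed to be keyed by `id`, and pending rows are assumed to win on a key collision.

```javascript
// Illustrative model only: not BucketDB's merge logic.
// Assumptions: rows are keyed by `id`; pending rows win on collision.
function overlayRows(flushedRows, pendingInserts) {
  const byId = new Map(flushedRows.map((row) => [row.id, row]));
  for (const row of pendingInserts) byId.set(row.id, row);
  return [...byId.values()];
}

// Rows already persisted (stand-in for the S3 state).
const flushed = [
  { id: 'u0', email: 'old@example.com', role: 'guest' },
  { id: 'u9', email: 'root@example.com', role: 'admin' },
];
// Rows sitting in an active, unflushed batch.
const pending = [{ id: 'u1', email: 'test@example.com', role: 'guest' }];

const guests = overlayRows(flushed, pending).filter((row) => row.role === 'guest');
console.log(guests.map((row) => row.id)); // [ 'u0', 'u1' ]
```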

## Custom Functions

For complex logic that cannot be expressed via simple operators (e.g., regex checks, complex mathematical formulas, or deeply nested JSON iteration), you can execute registered JavaScript functions natively during a query.

See [Custom Functions](/usage/functions.md) for a complete guide.

