The LanceDB resource mounts a LanceDB table as a filesystem: group-by columns become folders, rows become files, and semantic search is exposed through the search command. See LanceDB Resource for the full layout and command list.

Dependencies

uv add lancedb
The lancedb package ships the embedded engine and the async client that Mirage uses. It pulls in pyarrow, and no separate server is required. Semantic search needs an embedding function registered on the table. For real multimodal (CLIP) embeddings, add the model dependencies to your builder environment only (Mirage core never imports them):
uv add open-clip-torch torch

Where the data lives

A LanceDB database is a directory of Lance files. The uri decides where it is stored, and the same LanceDBConfig works for every tier.

LanceDB OSS (local disk)

from mirage import MountMode, Workspace
from mirage.resource.lancedb import LanceDBConfig, LanceDBResource

config = LanceDBConfig(
    uri="/data/fashion.lancedb",
    table="fashion",
    group_by=["gender", "articleType", "baseColour"],
    id_column="id",
    title_column="productDisplayName",
    blob_column="image_bytes",
    blob_ext="jpg",
    vector_column="vector",
)
ws = Workspace({"/fashion/": LanceDBResource(config)}, mode=MountMode.READ)
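
To make the folder mapping concrete, here is a minimal sketch of how a row could translate into a card path under the config above. The helper card_path is hypothetical and not part of Mirage. It only mirrors the documented rule that group_by columns become nested folders and id_column names the file. How Mirage handles missing values or characters that are unsafe in paths is not covered here.

```python
from pathlib import PurePosixPath


def card_path(mount, row, group_by, id_column="id"):
    """Hypothetical helper: one folder level per group_by column, then <id>.md."""
    folders = [str(row[column]) for column in group_by]
    return str(PurePosixPath(mount, *folders, f"{row[id_column]}.md"))


# Illustrative row; the values are made up for this sketch.
row = {
    "id": 3,
    "gender": "Men",
    "articleType": "Shoes",
    "baseColour": "White",
}
path = card_path("/fashion/", row, ["gender", "articleType", "baseColour"])
print(path)
```

Changing the order of group_by changes the folder nesting, so choose the order you expect to browse in.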

Object storage (S3 / GCS / Azure)

Point uri at a bucket. Credentials come from the environment by default, or pass them through storage_options.
config = LanceDBConfig(
    uri="s3://my-bucket/fashion.lancedb",
    table="fashion",
    group_by=["gender", "articleType", "baseColour"],
    id_column="id",
    vector_column="vector",
    storage_options={"region": "us-east-1"},
)
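
If you prefer explicit credentials over ambient ones, a sketch like the following builds a storage_options dict from environment variables. The helper name s3_storage_options is hypothetical. The key names (aws_access_key_id, aws_secret_access_key, aws_session_token) follow the option names LanceDB documents for S3, and it is an assumption that Mirage forwards storage_options to LanceDB unchanged.

```python
import os


def s3_storage_options(env=os.environ, region="us-east-1"):
    """Hypothetical helper: collect S3 settings from an environment mapping."""
    options = {"region": env.get("AWS_REGION", region)}
    key_id = env.get("AWS_ACCESS_KEY_ID")
    secret = env.get("AWS_SECRET_ACCESS_KEY")
    # Only pass keys when both halves are present; otherwise rely on
    # the default credential lookup from the environment.
    if key_id and secret:
        options["aws_access_key_id"] = key_id
        options["aws_secret_access_key"] = secret
    token = env.get("AWS_SESSION_TOKEN")
    if token:
        options["aws_session_token"] = token
    return options


# Demo with a fake environment so no real credentials are needed.
demo = s3_storage_options({"AWS_ACCESS_KEY_ID": "k", "AWS_SECRET_ACCESS_KEY": "s"})
print(sorted(demo))
```

The result can be passed as storage_options=... in the LanceDBConfig shown above.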

LanceDB Cloud

Use a db:// URI plus an API key and region. The API key can also come from the LANCEDB_API_KEY environment variable.
import os

config = LanceDBConfig(
    uri="db://my-database",
    api_key=os.environ["LANCEDB_API_KEY"],
    region="us-east-1",
    table="fashion",
    group_by=["gender", "articleType", "baseColour"],
    id_column="id",
    vector_column="vector",
)
ws = Workspace({"/fashion/": LanceDBResource(config)}, mode=MountMode.READ)

LanceDB Enterprise

Enterprise is the same as Cloud plus a custom endpoint via host_override.
config = LanceDBConfig(
    uri="db://my-database",
    api_key=os.environ["LANCEDB_API_KEY"],
    host_override="https://my-database.us-east-1.api.lancedb.com",
    region="us-east-1",
    table="fashion",
    group_by=["gender", "articleType", "baseColour"],
    id_column="id",
    vector_column="vector",
)
region and host_override are only applied for db:// URIs; they are ignored for local and object-storage mounts.
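
The rule above can be sketched as a small classifier on the URI scheme. Both functions are hypothetical illustrations and not Mirage APIs. They assume that a URI without a scheme is a local path and that any scheme other than db is object storage. Windows drive paths such as C:\data would need extra handling.

```python
from urllib.parse import urlparse


def storage_tier(uri):
    """Hypothetical: classify a LanceDB uri by its scheme."""
    scheme = urlparse(uri).scheme
    if scheme == "":
        return "local"
    if scheme == "db":
        return "cloud"
    return "object-storage"


def effective_cloud_options(uri, region="us-east-1", host_override=None):
    """Hypothetical: region and host_override only matter for db:// URIs."""
    if storage_tier(uri) != "cloud":
        return {}
    options = {"region": region}
    if host_override:
        options["host_override"] = host_override
    return options


for uri in ("/data/fashion.lancedb", "s3://my-bucket/fashion.lancedb", "db://my-database"):
    print(uri, storage_tier(uri), effective_cloud_options(uri))
```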

Search setup

Search is powered by the table’s own embedding function, not by Mirage. The search command is available when vector_column is set; the table must have been created with an embedding function registered on a source field. A minimal CLIP-backed table (run once in your builder environment):
import lancedb
from lancedb.embeddings import get_registry
from lancedb.pydantic import LanceModel, Vector

func = get_registry().get("open-clip").create()

class Product(LanceModel):
    id: int
    gender: str
    articleType: str
    baseColour: str
    productDisplayName: str = func.SourceField()  # text the model embeds
    image_bytes: bytes
    vector: Vector(func.ndims()) = func.VectorField()

db = lancedb.connect("/data/fashion.lancedb")
table = db.create_table("fashion", schema=Product)
table.add(rows)  # rows: dicts with every Product field except vector, which LanceDB computes from productDisplayName
Once mounted, you query the table with the search command. LanceDB embeds the query text with the same model and runs a vector search, returning ranked rows as canonical file paths with a score, followed by their cards:
search "red running shoes" /fashion          # ranked <path>.md:score + card body
cat /fashion/Men/Shoes/White/3.md             # follow a result to the real file
A runnable, dependency-free version (a lightweight keyword embedding instead of CLIP) lives in examples/python/lancedb/.
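
To show what embedding the query and ranking rows means, here is a self-contained sketch using a toy bag-of-words embedding and cosine similarity. It is not the code in examples/python/lancedb/ and not how LanceDB scores results. Every name in it (tokenize, build_vocab, embed, search) is made up for illustration, and the limit argument only mimics the documented search_limit default of 10.

```python
import math
import re


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocab(texts):
    """Assign one vector dimension to each distinct token in the corpus."""
    tokens = sorted({tok for text in texts for tok in tokenize(text)})
    return {tok: i for i, tok in enumerate(tokens)}


def embed(text, vocab):
    """Toy embedding: L2-normalized token counts over the vocabulary."""
    vec = [0.0] * len(vocab)
    for tok in tokenize(text):
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def search(query, rows, limit=10):
    """Rank rows by cosine similarity between query and title embeddings."""
    vocab = build_vocab(r["title"] for r in rows)
    q = embed(query, vocab)
    scored = [
        (r["path"], sum(a * b for a, b in zip(q, embed(r["title"], vocab))))
        for r in rows
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:limit]


# Illustrative rows; paths follow the layout shown above.
rows = [
    {"path": "/fashion/Women/Dresses/Blue/7.md", "title": "blue summer dress"},
    {"path": "/fashion/Men/Shoes/White/3.md", "title": "white running shoes"},
    {"path": "/fashion/Men/Shoes/Red/1.md", "title": "red running shoes"},
]
results = search("red running shoes", rows, limit=2)
for path, score in results:
    print(f"{path}:{score:.3f}")
```

A real embedding model such as CLIP also places related words that share no tokens close together, which this keyword sketch cannot do.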

Config reference

| Field | Required | Default | Description |
| --- | --- | --- | --- |
| uri | Yes | | Local path, s3://, gs://, az://, hf://, or db:// (Cloud) |
| api_key | No | | LanceDB Cloud/Enterprise API key (or LANCEDB_API_KEY) |
| region | No | us-east-1 | Cloud region (db:// only) |
| host_override | No | | Enterprise endpoint URL (db:// only) |
| storage_options | No | | Object-storage options/credentials |
| table | No | | Pin one table; the mount root becomes that table |
| group_by | No | [] | Columns that become nested folder levels |
| id_column | No | id | Column used to name row files |
| title_column | No | | Column used as the card heading |
| blob_column | No | | Column served as the raw blob/image file |
| blob_ext | No | bin | Extension for the blob file (jpg, png, …) |
| vector_column | No | | Vector column; presence enables the search command |
| search_limit | No | 10 | Default top-k returned by search |
| max_rows | No | 1000 | Cap on rows scanned per folder listing |

The mount is read-only. See LanceDB Resource for the filesystem layout and supported commands.