The LanceDB resource exposes a LanceDB table as a virtual filesystem mounted at a prefix such as /fashion/. Group-by columns become nested folders, each row becomes a card plus an optional blob file, and semantic search is available through the search command, which returns ranked rows as canonical file paths. For connection setup (LanceDB OSS, object storage, Cloud, Enterprise), see LanceDB Setup.

Config

from mirage import MountMode, Workspace
from mirage.resource.lancedb import LanceDBConfig, LanceDBResource

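config = LanceDBConfig(
    uri="/data/fashion.lancedb",
    table="fashion",  # pin one table so the mount root is that table
    group_by=["gender", "articleType", "baseColour"],  # one folder level per column
    id_column="id",  # names the row files: <id>.md and <id>.<ext>
    title_column="productDisplayName",  # shown as the card heading
    blob_column="image_bytes",  # raw bytes served as <id>.<ext>
    blob_ext="jpg",
    vector_column="vector",  # used by search; omitted from cards
    search_limit=5,  # default for search --top-k
)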
resource = LanceDBResource(config)
ws = Workspace({"/fashion/": resource}, mode=MountMode.READ)
The mapping is config-driven; nothing about the dataset is hardcoded. Point group_by at different columns and the folder tree changes. See the full config reference.

Filesystem layout

Every path is translated into a LanceDB query. Descending a folder adds one WHERE clause; the leaf level lists rows.
/                                  # list tables (omitted when `table` is pinned)
/<table>/                          # distinct group_by[0] values
  <v1>/                            # distinct group_by[1] WHERE group_by[0]=v1
    .../<vN>/                      # all group-by columns bound -> row files
      <id>.md                      # rendered card (text)
      <id>.<ext>                   # raw blob / image bytes
When table is set, the table level is elided and the mount root is that table:
/fashion/
  Men/
    Shoes/
      White/
        3.md
        3.jpg
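
The path-to-query translation described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Mirage's implementation: path_to_query and its return shape are hypothetical, paths are taken relative to the mount root with table pinned, and only folder paths are handled.

```python
def path_to_query(path, group_by):
    """Map a folder path to (where_clause, listing).

    Hypothetical helper for illustration; not part of Mirage.
    """
    parts = [p for p in path.strip("/").split("/") if p]
    if len(parts) > len(group_by):
        raise ValueError("path is deeper than the group_by columns")
    # Each folder level binds one more group_by column.
    clauses = [
        "{} = '{}'".format(col, val.replace("'", "''"))
        for col, val in zip(group_by, parts)
    ]
    where = " AND ".join(clauses) or None
    if len(parts) < len(group_by):
        # Some columns are still unbound: list distinct values of the next one.
        listing = "distinct " + group_by[len(parts)]
    else:
        # All columns bound: the folder lists row files.
        listing = "rows"
    return where, listing


group_by = ["gender", "articleType", "baseColour"]
print(path_to_query("/Men/Shoes", group_by))
print(path_to_query("/Men/Shoes/White", group_by))
```

The first call yields a filter on gender and articleType and asks for distinct baseColour values; the second binds all three columns and lists rows.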

Row cards

A <id>.md card renders the row’s columns as readable text and points at its blob. The vector and blob columns are omitted from the card body.
# Nike Men White Running Sneakers

id: 3
gender: Men
articleType: Shoes
baseColour: White
productDisplayName: Nike Men White Running Sneakers
blob: 3.jpg
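
As a rough sketch of how such a card could be produced from a row, assuming the row arrives as a plain dict: render_card below is a hypothetical helper written for this page, not Mirage's renderer, and the sample values for image_bytes and vector are placeholders.

```python
def render_card(row, title_column, blob_ext, id_column="id",
                omit=("vector", "image_bytes")):
    """Render a row dict as card text (hypothetical, for illustration)."""
    lines = ["# {}".format(row[title_column]), ""]
    for key, value in row.items():
        if key in omit:
            continue  # vector and blob columns stay out of the card body
        lines.append("{}: {}".format(key, value))
    # Point at the sibling blob file instead of inlining its bytes.
    lines.append("blob: {}.{}".format(row[id_column], blob_ext))
    return "\n".join(lines)


row = {
    "id": 3,
    "gender": "Men",
    "articleType": "Shoes",
    "baseColour": "White",
    "productDisplayName": "Nike Men White Running Sneakers",
    "image_bytes": b"\xff\xd8",  # placeholder bytes
    "vector": [0.1, 0.2],        # placeholder embedding
}
print(render_card(row, "productDisplayName", "jpg"))
```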

Semantic search

Search is a command, not a path. It returns each ranked row as its canonical file path (the same <id>.md you would cat while browsing) annotated with the vector distance, followed by the card body. Results point back at the real files, so search composes with cat, pipes, and wc:
$ search "white running sneakers" /fashion
/fashion/Men/Shoes/White/3.md:0.2679
# Nike Men White Running Sneakers

id: 3
gender: Men
articleType: Shoes
baseColour: White
productDisplayName: Nike Men White Running Sneakers
blob: 3.jpg
...
Flags: --top-k <n> (default search_limit), --threshold <max-distance>, --method semantic (the only supported method; grep/rg stay lexical).
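
Because each result starts with a path:distance header line, the output is straightforward to post-process. The sketch below assumes the format is exactly what the example above shows (a header line ending in .md:<number>, then the card body); parse_search_output is a hypothetical helper, and a card body that happened to contain a line of the same shape would confuse it.

```python
import re

# Matches a result header such as "/fashion/Men/Shoes/White/3.md:0.2679".
HEADER = re.compile(r"^(/.+\.md):(\d+(?:\.\d+)?)$")


def parse_search_output(text):
    """Split search output into (path, distance, card_body) tuples."""
    results = []
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            # A header line starts a new result.
            results.append([match.group(1), float(match.group(2)), []])
        elif results:
            # Any other line belongs to the current result's card body.
            results[-1][2].append(line)
    return [(p, d, "\n".join(body).strip()) for p, d, body in results]


sample = """/fashion/Men/Shoes/White/3.md:0.2679
# Nike Men White Running Sneakers

id: 3
blob: 3.jpg
"""
hits = parse_search_output(sample)
# Keep only close matches, similar in spirit to --threshold.
paths = [path for path, distance, _ in hits if distance <= 0.3]
print(paths)
```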

Supported commands

All commands delegate to Mirage’s shared implementations.
Command       Behaviour on a LanceDB mount
ls            list tables, label folders, or row files
cd            navigate (each level narrows the filter)
tree          render the label hierarchy
cat           print a row card, or dump raw blob/image bytes
stat          directory vs file, blob size, image mime type
find          walk the tree (e.g. find /fashion -name '*.md')
grep / rg     lexical search over the rendered cards
search        semantic (vector) search -> ranked canonical paths + score
head / tail   first/last lines of a card
wc            count lines/bytes of a card

grep/rg stay lexical (literal/regex). search is the semantic path: it auto-embeds the query via the table’s embedding function and returns ranked rows as canonical file paths, which compose with cat, wc, and pipes.

Access pattern

The mount is read-only (MountMode.READ); writes are not supported. The two read modes are:
  • Browse by label folders: pure metadata WHERE filters, no embedding.
  • Search by meaning: search "<query>" <path> runs vector search using the table’s embedding function and returns canonical row paths.
Folder listings scan one column with SELECT DISTINCT and are capped by max_rows, so very large tables should keep group_by to low-cardinality columns.
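
One way to check whether candidate group_by columns are low-cardinality before configuring a mount is to count distinct values over a sample of rows. The sketch below works on plain dicts; distinct_counts is a hypothetical helper, and the sample rows are made up for illustration.

```python
from collections import defaultdict


def distinct_counts(rows, columns):
    """Count distinct values per column over an iterable of row dicts."""
    seen = defaultdict(set)
    for row in rows:
        for col in columns:
            seen[col].add(row[col])
    return {col: len(seen[col]) for col in columns}


# Illustrative sample rows, not real data.
rows = [
    {"id": 1, "gender": "Men", "articleType": "Shoes"},
    {"id": 2, "gender": "Women", "articleType": "Shoes"},
    {"id": 3, "gender": "Men", "articleType": "Shirts"},
]
print(distinct_counts(rows, ["gender", "articleType", "id"]))
```

A column such as id, which has one distinct value per row, would produce one folder per row and is a poor choice for group_by.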