DatabricksVolumeResource exposes files from a Unity Catalog volume through Mirage’s standard filesystem interface. Agents can list, stat, read, stream, glob, and write files under the configured volume root. For auth and environment setup, see Databricks Volume Setup.

Config

from mirage import MountMode, Workspace
from mirage.resource.databricks_volume import (
    DatabricksVolumeConfig,
    DatabricksVolumeResource,
)

resource = DatabricksVolumeResource(DatabricksVolumeConfig(
    catalog="main",
    schema="default",
    volume="agent_files",
    root_path="/reports",
))
ws = Workspace({"/dbx/": resource}, mode=MountMode.READ)
Field      Default   Notes
catalog    required  Unity Catalog catalog name.
schema     required  Unity Catalog schema name.
volume     required  Unity Catalog volume name.
root_path  /         Subdirectory inside the volume to expose.
host       None      Optional workspace host override.
token      None      Optional PAT override. Redacted in snapshots.
profile    None      Optional Databricks SDK profile name.
timeout    30        Request timeout in seconds.

Mount mode

This resource can be mounted in read or write mode.

Filesystem layout

Mirage maps the configured mount prefix onto the configured volume subtree. Given:
DatabricksVolumeConfig(
    catalog="main",
    schema="default",
    volume="agent_files",
    root_path="/reports/2026",
)
and mount prefix /dbx/, the volume path:
/Volumes/main/default/agent_files/reports/2026/q1/summary.md
appears in Mirage as:
/dbx/q1/summary.md
root_path is normalized before use, and Mirage rejects any virtual path that would escape above that configured subtree.
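
The mapping and escape check described above can be sketched in a few lines. This is a minimal illustration built only on the standard library's posixpath module, not Mirage's actual implementation; to_volume_path and its parameters are hypothetical names introduced here.

```python
import posixpath


def to_volume_path(catalog: str, schema: str, volume: str, root_path: str,
                   mount_prefix: str, virtual_path: str) -> str:
    """Map a virtual path under mount_prefix to an absolute /Volumes/... path."""
    # Normalize root_path so "reports/2026", "/reports/2026/" and
    # "/reports/../reports/2026" all name the same subtree.
    root = posixpath.normpath("/" + root_path.strip("/"))
    base = posixpath.normpath(
        posixpath.join("/Volumes", catalog, schema, volume, root.lstrip("/")))

    # The virtual path must be the mount point itself or sit below it.
    prefix = "/" + mount_prefix.strip("/")
    if virtual_path != prefix and not virtual_path.startswith(prefix.rstrip("/") + "/"):
        raise ValueError(f"{virtual_path!r} is not under mount {prefix!r}")

    # Resolve ".." segments, then confirm the result is still inside base.
    relative = virtual_path[len(prefix):].lstrip("/")
    resolved = posixpath.normpath(posixpath.join(base, relative))
    if resolved != base and not resolved.startswith(base + "/"):
        raise ValueError(f"{virtual_path!r} escapes the configured root")
    return resolved


print(to_volume_path("main", "default", "agent_files", "/reports/2026",
                     "/dbx/", "/dbx/q1/summary.md"))
```

With the configuration above, /dbx/q1/summary.md resolves to the volume path shown earlier, while a path such as /dbx/../secrets.txt raises ValueError in this sketch.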

Supported operations

Reads: readdir, stat, exists, read_bytes, read_stream, range_read, and glob resolution.
Writes: write, create, mkdir, rmdir, unlink, recursive rm, cp, and mv.
mv and cp are non-atomic: each downloads the source and uploads it to the destination, because the Files API has no server-side rename.
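
To see why a download + upload move is non-atomic, consider the sketch below. It uses a hypothetical in-memory store (InMemoryStore, copy, and move are illustrative names, not Mirage or Databricks APIs) and assumes that a move is a copy followed by a delete of the source.

```python
class InMemoryStore:
    """Stand-in for a remote store that offers only download, upload, delete."""

    def __init__(self) -> None:
        self.files = {}  # maps path (str) to content (bytes)

    def download(self, path: str) -> bytes:
        return self.files[path]

    def upload(self, path: str, data: bytes) -> None:
        self.files[path] = data

    def delete(self, path: str) -> None:
        del self.files[path]


def copy(store: InMemoryStore, src: str, dst: str) -> None:
    # Without a server-side copy, the bytes travel through the client.
    data = store.download(src)
    store.upload(dst, data)


def move(store: InMemoryStore, src: str, dst: str) -> None:
    copy(store, src, dst)
    # If the process stops here, both src and dst exist: the move is
    # observable half-done, which is what "non-atomic" means.
    store.delete(src)


store = InMemoryStore()
store.upload("/reports/draft.md", b"# Q1 summary")
move(store, "/reports/draft.md", "/reports/final.md")
print(sorted(store.files))
```

A caller that needs stronger guarantees can verify the destination after the operation and retry or clean up as appropriate.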

Shell Commands

The Databricks Volume resource supports shell commands that operate on real file content. Reads use the Files API with range requests, so commands like head -c BYTES avoid downloading the whole object. The supported set is scoped to commands that work over the volume API (no compression, encoding, or local-only utilities).
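
As an illustration of how a byte-limited read can avoid fetching the whole object, the sketch below builds a standard HTTP Range header for the first N bytes. The helper names and the request dictionary shape are hypothetical; the source does not describe how Mirage constructs its requests internally.

```python
def range_header(offset: int, length: int) -> str:
    """Build an HTTP Range header value for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive on both ends, hence the "- 1".
    return f"bytes={offset}-{offset + length - 1}"


def head_bytes_request(path: str, count: int) -> dict:
    """Describe the request a `head -c COUNT` could issue (hypothetical shape)."""
    return {"path": path, "headers": {"Range": range_header(0, count)}}


print(head_bytes_request("/dbx/q1/summary.md", 100))
```

Because the range is inclusive, asking for the first 100 bytes yields the range 0-99 rather than 0-100.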

Read Commands

Command      Notes
cat          Read file content
head / tail  First/last N lines
grep / rg    Pattern search (file or directory level)
jq           Query JSON fields
wc           Line/word/byte counts
stat         File metadata (name, size, type, modified)
find         Recursive search with -name, -maxdepth
tree         Directory tree view
nl           Number lines

Text Processing

Command  Notes
awk      Pattern scanning and processing
sed      Stream editor
tr       Translate or delete characters
sort     Sort lines
uniq     Remove duplicate lines
cut      Extract fields/columns
diff     Compare files line by line

File Operations

Command  Notes
cp       Copy files (non-atomic download + upload)
mv       Move/rename files (non-atomic download + upload)
rm       Remove files (recursive for directories)
mkdir    Create directories
touch    Create empty file or update timestamp

Path Utilities

Command  Notes
ls       List directory contents

Snapshot behavior

The token field is redacted in saved resource state. To load a snapshot, pass an override config that provides fresh credentials, unless the runtime auth chain already supplies them.
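
The redact-then-override pattern can be sketched with plain dictionaries. Everything here is hypothetical (the REDACTED placeholder, snapshot_config, and restore_config are illustrative names); Mirage's real snapshot format and loading API are not shown in this page.

```python
REDACTED = "<redacted>"  # hypothetical placeholder; the real marker may differ


def snapshot_config(config: dict) -> dict:
    """Return a copy of the config that is safe to persist."""
    state = dict(config)
    if state.get("token") is not None:
        state["token"] = REDACTED
    return state


def restore_config(state: dict, override=None) -> dict:
    """Rebuild a usable config from saved state plus optional fresh values."""
    restored = dict(state)
    restored.update(override or {})
    if restored.get("token") == REDACTED:
        # No fresh token was supplied. Assumption: leaving token unset
        # defers to whatever the runtime auth chain provides.
        restored["token"] = None
    return restored


saved = snapshot_config({"catalog": "main", "schema": "default",
                         "volume": "agent_files", "token": "example-secret"})
print(saved["token"])
print(restore_config(saved, {"token": "fresh-token"})["token"])
```

The point of the pattern is that the secret never reaches the persisted state, so a restored workspace must obtain credentials from the override or from the environment.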

Databricks Apps

For Databricks Apps, prefer SDK-default auth and keep Mirage in-process:
config = DatabricksVolumeConfig(
    catalog="main",
    schema="default",
    volume="agent_files",
)
resource = DatabricksVolumeResource(config)
ws = Workspace({"/dbx/": resource}, mode=MountMode.READ)
This does not require FUSE. The agent can access the mounted workspace through Mirage’s backend adapters or Workspace.execute(...).

Example

import asyncio

from mirage import MountMode, Workspace
from mirage.resource.databricks_volume import (
    DatabricksVolumeConfig,
    DatabricksVolumeResource,
)

resource = DatabricksVolumeResource(DatabricksVolumeConfig(
    catalog="main",
    schema="default",
    volume="agent_files",
    root_path="/reports",
))


async def main() -> None:
    ws = Workspace({"/dbx/": resource}, mode=MountMode.READ)

    r = await ws.execute("ls /dbx/")
    print(await r.stdout_str())

    r = await ws.execute("find /dbx/ -name '*.md'")
    print(await r.stdout_str())

    r = await ws.execute('head -n 20 "/dbx/q1/summary.md"')
    print(await r.stdout_str())

    r = await ws.execute('stat "/dbx/q1/summary.md"')
    print(await r.stdout_str())


if __name__ == "__main__":
    asyncio.run(main())
See:
  • examples/python/databricks_volume/databricks_volume.py
  • examples/python/agents/langchain/databricks_volume_deepagent.py

Use Cases

  • Agents in Databricks Apps: mount a Unity Catalog volume in-process so an agent can read and write governed files without FUSE or hardcoded credentials.
  • Reading governed datasets: expose a reports or dataset subtree through root_path and query it with shell commands.
  • Sandboxed volume access: scope an agent to a single volume (and optional root_path) so it cannot read or write outside that subtree.
  • Writing agent outputs back: persist generated files to the volume with write, cp, and mv.