DatabricksVolumeResource exposes files from a Unity Catalog volume through
Mirage’s standard filesystem interface. Agents can list, stat, read, stream,
glob, and write files under the configured volume root.
For auth and environment setup, see Databricks Volume Setup.
Config
| Field | Default | Notes |
|---|---|---|
| catalog | required | Unity Catalog catalog name. |
| schema | required | Unity Catalog schema name. |
| volume | required | Unity Catalog volume name. |
| root_path | / | Subdirectory inside the volume to expose. |
| host | None | Optional workspace host override. |
| token | None | Optional PAT override. Redacted in snapshots. |
| profile | None | Optional Databricks SDK profile name. |
| timeout | 30 | Request timeout in seconds. |
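As a rough illustration of the fields above, the sketch below mirrors the table with a plain dataclass. `VolumeConfigSketch` is a hypothetical stand-in, not Mirage's own config class, and the catalog, schema, and volume values are made up; the real constructor may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VolumeConfigSketch:
    """Hypothetical stand-in mirroring the documented fields and defaults."""

    catalog: str                    # required
    schema: str                     # required
    volume: str                     # required
    root_path: str = "/"            # subdirectory inside the volume to expose
    host: Optional[str] = None      # optional workspace host override
    token: Optional[str] = None     # optional PAT override
    profile: Optional[str] = None   # optional Databricks SDK profile name
    timeout: int = 30               # request timeout in seconds


# Only the three required fields are set; everything else keeps its default.
config = VolumeConfigSketch(catalog="main", schema="analytics", volume="reports")
```

Leaving host, token, and profile unset corresponds to relying on whatever credentials the runtime auth chain supplies.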
Mount mode
The resource can be mounted in read or write mode.
Filesystem layout
Mirage maps the configured mount prefix onto the configured volume subtree. For example, with the resource mounted at /dbx/, a virtual path under /dbx/ resolves to the matching path under root_path inside the volume.
root_path is normalized before use, and Mirage rejects any virtual path that would escape above that configured subtree.
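The mapping and the escape check can be sketched as follows. This is a minimal illustration, not Mirage's implementation: `resolve_volume_path` is a hypothetical helper, the example names are made up, and it assumes the standard Unity Catalog layout `/Volumes/<catalog>/<schema>/<volume>/...` and an absolute mount prefix.

```python
import posixpath


def resolve_volume_path(virtual_path, mount_prefix, catalog, schema, volume, root_path="/"):
    """Map a virtual path under mount_prefix to an absolute /Volumes/... path."""
    prefix = mount_prefix.rstrip("/")
    if virtual_path != prefix and not virtual_path.startswith(prefix + "/"):
        raise ValueError(f"{virtual_path!r} is not under mount prefix {mount_prefix!r}")
    relative = virtual_path[len(prefix):].lstrip("/")

    # Normalize the configured subtree once, then resolve the request against it.
    base = posixpath.normpath(
        posixpath.join("/Volumes", catalog, schema, volume, root_path.strip("/"))
    )
    target = posixpath.normpath(posixpath.join(base, relative))

    # After normalization, anything that is not the base or below it has escaped.
    if target != base and not target.startswith(base + "/"):
        raise ValueError(f"{virtual_path!r} escapes the configured subtree")
    return target


resolved = resolve_volume_path(
    "/dbx/q1/summary.csv", "/dbx/", "main", "analytics", "reports", root_path="2024"
)
```

Checking the normalized result against the normalized base, rather than scanning for `..` segments, catches traversal attempts regardless of how they are spelled.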
Supported operations
Reads: readdir, stat, exists, read_bytes, read_stream, range_read, and glob resolution.
Writes: write, create, mkdir, rmdir, unlink, recursive rm, cp, and mv.
mv and cp are non-atomic: each downloads the source and uploads it to the destination, because the Files API has no server-side rename.
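To make the non-atomic behavior concrete, here is a small sketch of a move built from separate download, upload, and delete steps. `InMemoryVolume` and `move` are hypothetical stand-ins for illustration only; they are not Mirage or Databricks SDK APIs.

```python
class InMemoryVolume:
    """Hypothetical stand-in for a file store that has no rename call."""

    def __init__(self):
        self.files = {}

    def download(self, path):
        return self.files[path]

    def upload(self, path, data):
        self.files[path] = data

    def delete(self, path):
        del self.files[path]


def move(client, src, dst):
    """Move src to dst as three separate steps; not atomic."""
    if src == dst:
        return
    data = client.download(src)  # 1. pull the bytes
    client.upload(dst, data)     # 2. write the destination
    client.delete(src)           # 3. remove the source


vol = InMemoryVolume()
vol.upload("/reports/draft.csv", b"a,b\n1,2\n")
move(vol, "/reports/draft.csv", "/reports/final.csv")
```

If the process stops between steps 2 and 3, both the source and the destination exist, so callers that need exactly-once semantics should check for leftovers after a failure.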
Shell Commands
The Databricks Volume resource supports shell commands that operate on real file content. Reads use the Files API with range requests, so commands like head -c BYTES avoid downloading the whole object. The supported set is
scoped to commands that work over the volume API (no compression, encoding,
or local-only utilities).
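As an illustration of why a range request helps here, the sketch below builds the standard HTTP Range header for the first N bytes of an object. `head_bytes_range_header` is a hypothetical helper; how Mirage actually issues its requests is not described in this page.

```python
def head_bytes_range_header(num_bytes):
    """Return an HTTP Range header covering the first num_bytes bytes."""
    if num_bytes <= 0:
        raise ValueError("num_bytes must be positive")
    # HTTP byte ranges are inclusive on both ends, so the last index is N - 1.
    return {"Range": f"bytes=0-{num_bytes - 1}"}


headers = head_bytes_range_header(100)
```

A server that honors the header returns only the requested bytes, so the cost of head -c BYTES depends on BYTES rather than on the size of the file.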
Read Commands
| Command | Notes |
|---|---|
| cat | Read file content |
| head / tail | First/last N lines |
| grep / rg | Pattern search (file or directory level) |
| jq | Query JSON fields |
| wc | Line/word/byte counts |
| stat | File metadata (name, size, type, modified) |
| find | Recursive search with -name, -maxdepth |
| tree | Directory tree view |
| nl | Number lines |
Text Processing
| Command | Notes |
|---|---|
| awk | Pattern scanning and processing |
| sed | Stream editor |
| tr | Translate or delete characters |
| sort | Sort lines |
| uniq | Remove duplicate lines |
| cut | Extract fields/columns |
| diff | Compare files line by line |
File Operations
| Command | Notes |
|---|---|
| cp | Copy files (non-atomic download + upload) |
| mv | Move/rename files (non-atomic download + upload) |
| rm | Remove files (recursive for directories) |
| mkdir | Create directories |
| touch | Create empty file or update timestamp |
Path Utilities
| Command | Notes |
|---|---|
| ls | List directory contents |
Snapshot behavior
token is redacted in resource state. Loading a snapshot back requires an
override config that provides fresh credentials if the runtime auth chain does
not already supply them.
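The redact-then-override flow can be sketched with plain dictionaries. `snapshot_state`, `load_state`, and the `REDACTED` placeholder are hypothetical names for illustration; Mirage's actual snapshot format and loader are not shown in this page.

```python
REDACTED = "<redacted>"  # hypothetical placeholder value


def snapshot_state(config):
    """Copy the config for a snapshot, replacing any token with a placeholder."""
    state = dict(config)
    if state.get("token") is not None:
        state["token"] = REDACTED
    return state


def load_state(state, override=None):
    """Restore a config from a snapshot, applying fresh values from override."""
    restored = dict(state)
    if restored.get("token") == REDACTED:
        # The secret was never stored; fall back to no explicit token.
        restored["token"] = None
    restored.update(override or {})
    return restored


saved = snapshot_state({"catalog": "main", "token": "example-secret"})
restored = load_state(saved, override={"token": "fresh-secret"})
```

Without an override, the restored config carries no token, which only works when the runtime auth chain already supplies credentials.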
Databricks Apps
For Databricks Apps, prefer SDK-default auth and keep Mirage in-process, running commands through Workspace.execute(...).
Example
- examples/python/databricks_volume/databricks_volume.py
- examples/python/agents/langchain/databricks_volume_deepagent.py
Use Cases
- Agents in Databricks Apps: mount a Unity Catalog volume in-process so an agent can read and write governed files without FUSE or hardcoded credentials.
- Reading governed datasets: expose a reports or dataset subtree through root_path and query it with shell commands.
- Sandboxed volume access: scope an agent to a single volume (and optional root_path) so it cannot read or write outside that subtree.
- Writing agent outputs back: persist generated files to the volume with write, cp, and mv.