The Disk resource mounts a local directory at a path prefix such as /data/.
All operations are backed by real files on disk. Every path is resolved and
checked against the root boundary to prevent directory traversal escapes.
Config
from mirage import MountMode, Workspace
from mirage.resource.disk import DiskResource
resource = DiskResource(root="/path/to/dir")
ws = Workspace({"/data": resource}, mode=MountMode.READ)
DiskResource(root=...) takes a single root path argument pointing to the
directory to mount. Both READ and WRITE modes are supported.
Filesystem Layout
The Disk resource mirrors the structure of the root directory. For example,
if root="/srv/files" contains:
/srv/files/
notes.txt
config.json
reports/
q1.csv
q2.csv
Then mounting at /data/ exposes:
/data/
notes.txt
config.json
reports/
q1.csv
q2.csv
Paths like ../../etc/passwd are rejected: resolution is always confined
to the root boundary.
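The mapping and the boundary check described above can be sketched with the standard library alone. This is only an illustration of the general technique, not Mirage's actual implementation: `resolve_under_root` and its parameters are hypothetical names introduced here.

```python
from pathlib import Path


def resolve_under_root(root: str, mount_prefix: str, virtual_path: str) -> Path:
    """Map a mounted path to a real path, refusing anything outside root.

    Hypothetical helper for illustration; not part of the mirage API.
    """
    root_path = Path(root).resolve()
    prefix = mount_prefix.rstrip("/")

    # The virtual path must be the mount point itself or live beneath it.
    if virtual_path != prefix and not virtual_path.startswith(prefix + "/"):
        raise ValueError(f"{virtual_path!r} is not under mount {mount_prefix!r}")

    relative = virtual_path[len(prefix):].lstrip("/")

    # resolve() collapses ".." segments and follows symlinks, so the check
    # below sees the real target rather than the path as written.
    candidate = (root_path / relative).resolve()

    if candidate != root_path and root_path not in candidate.parents:
        raise PermissionError(f"{virtual_path!r} escapes the root boundary")
    return candidate
```

With root="/srv/files" mounted at /data/, a call with "/data/reports/q1.csv" would return the path of reports/q1.csv under the root, while "/data/../../etc/passwd" would raise PermissionError.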
Cache
The Disk resource uses IndexCacheStore with _index_ttl = 60 (1 minute).
Directory listings are cached for up to 60 seconds before being refreshed
from disk.
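To make the 60-second window concrete, here is a minimal sketch of a time-to-live cache for directory listings. It is not the IndexCacheStore implementation; the `ListingCache` class, its constructor arguments, and the injectable clock are all assumptions made for illustration.

```python
import os
import time
from typing import Callable, Dict, List, Tuple


class ListingCache:
    """Hypothetical TTL cache for directory listings (not mirage's IndexCacheStore)."""

    def __init__(
        self,
        ttl: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        lister: Callable[[str], List[str]] = os.listdir,
    ) -> None:
        self._ttl = ttl
        self._clock = clock
        self._lister = lister
        # path -> (time the listing was fetched, sorted entry names)
        self._entries: Dict[str, Tuple[float, List[str]]] = {}

    def listdir(self, path: str) -> List[str]:
        now = self._clock()
        cached = self._entries.get(path)
        # Serve the cached listing while it is younger than the TTL.
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        # Otherwise go back to disk and remember when we did.
        names = sorted(self._lister(path))
        self._entries[path] = (now, names)
        return names
```

The practical consequence of a cache like this is that a file created outside the workspace may not show up in a listing until the cached entry expires.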
Example
import asyncio
import shutil
import tempfile
from pathlib import Path
from mirage import MountMode, Workspace
from mirage.resource.disk import DiskResource
DATA_DIR = Path("/path/to/files")
tmp = tempfile.mkdtemp()
shutil.copytree(DATA_DIR, Path(tmp) / "files", dirs_exist_ok=True)
resource = DiskResource(root=tmp + "/files")
async def main() -> None:
    ws = Workspace({"/data/": resource}, mode=MountMode.READ)
    r = await ws.execute("ls /data/")
    print(await r.stdout_str())
    r = await ws.execute("cat /data/example.json")
    print(await r.stdout_str())
    r = await ws.execute("tree /data/")
    print(await r.stdout_str())
    r = await ws.execute("find /data/ -name '*.json'")
    print(await r.stdout_str())
    r = await ws.execute("grep example /data/example.json")
    print(await r.stdout_str())
    r = await ws.execute("stat /data/example.json")
    print(await r.stdout_str())
if __name__ == "__main__":
    asyncio.run(main())
Shell Commands
The Disk resource supports the full set of shell commands since it operates
on real file content (text, binary, JSON, CSV, etc.):
Read Commands
| Command | Notes |
|---|---|
| cat | Read file content |
| head / tail | First/last N lines |
| grep / rg | Pattern search (file or directory level) |
| jq | Query JSON fields |
| wc | Line/word/byte counts |
| stat | File metadata (name, size, type, modified) |
| find | Recursive search with -name, -maxdepth |
| tree | Directory tree view |
| nl | Number lines |
| du | Disk usage summary |
| file | Detect file type |
| strings | Extract printable strings from binary |
| xxd | Hex dump |
| md5 | MD5 checksum |
| sha256sum | SHA-256 checksum |
Text Processing
| Command | Notes |
|---|---|
| awk | Pattern scanning and processing |
| sed | Stream editor |
| tr | Translate or delete characters |
| sort | Sort lines |
| uniq | Remove duplicate lines |
| cut | Extract fields/columns |
| join | Join lines on a common field |
| paste | Merge lines side by side |
| column | Columnate output |
| fold | Wrap lines to a specified width |
| expand | Convert tabs to spaces |
| unexpand | Convert spaces to tabs |
| fmt | Simple text formatter |
| rev | Reverse lines |
| tac | Concatenate and print in reverse |
| look | Display lines beginning with a given string |
| shuf | Shuffle lines |
| tsort | Topological sort |
| comm | Compare two sorted files |
| cmp | Compare two files byte by byte |
| diff | Compare files line by line |
| patch | Apply a diff patch |
| iconv | Character encoding conversion |
File Operations
| Command | Notes |
|---|---|
| cp | Copy files |
| mv | Move/rename files |
| rm | Remove files |
| mkdir | Create directories |
| touch | Create empty file or update timestamp |
| ln | Create symbolic links |
| tee | Write stdin to file and stdout |
| mktemp | Create temporary file |
| split | Split file into pieces |
| csplit | Split file by context |
Path Utilities
| Command | Notes |
|---|---|
| basename | Strip directory from path |
| dirname | Strip filename from path |
| realpath | Resolve path |
| readlink | Print symbolic link target |
| ls | List directory contents |
Compression
| Command | Notes |
|---|---|
| gzip | Compress files |
| gunzip | Decompress gzip files |
| zip | Create zip archives |
| unzip | Extract zip archives |
| tar | Archive files |
| zcat | Cat compressed files |
| zgrep | Grep compressed files |
Encoding
| Command | Notes |
|---|---|
| base64 | Base64 encode/decode |
Commands with format-specific variants for structured data files:
| Format | Extension | Variants |
|---|---|---|
| Parquet | .parquet | cat, head, tail, wc, stat, cut, grep, ls, file |
| Feather | .feather | cat, head, tail, wc, stat, cut, grep, ls, file |
| ORC | .orc | cat, head, tail, wc, stat, cut, grep, ls, file |
| HDF5 | .hdf5 | cat, head, tail, wc, stat, cut, grep, ls, file |
These variants auto-detect the format by extension and convert to
tabular text (CSV) for processing.
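The detect-then-convert idea can be sketched as follows. The extensions come from the table above; everything else (the `TABULAR_FORMATS` mapping, `detect_tabular_format`, and `rows_to_csv`) is a hypothetical stand-in, and the step that actually decodes a Parquet, Feather, ORC, or HDF5 file into rows is deliberately left out.

```python
import csv
import io
from pathlib import PurePosixPath
from typing import Iterable, Optional, Sequence

# Extensions taken from the table above; the mapping itself is an
# illustrative assumption, not mirage's internal registry.
TABULAR_FORMATS = {
    ".parquet": "parquet",
    ".feather": "feather",
    ".orc": "orc",
    ".hdf5": "hdf5",
}


def detect_tabular_format(path: str) -> Optional[str]:
    """Return the format name for a known extension, or None for plain files."""
    return TABULAR_FORMATS.get(PurePosixPath(path).suffix)


def rows_to_csv(header: Sequence[str], rows: Iterable[Sequence[object]]) -> str:
    """Render already-decoded tabular rows as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()
```

Once a file is in CSV form, line-oriented commands such as head, wc, or grep can treat it like any other text file, which is presumably why the variants share one conversion step.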
Audio Support (Optional)
Audio commands are opt-in and require sherpa-onnx with a Whisper model.
They transcribe audio to text, enabling cat, head, tail, grep,
and stat on audio files.
| Format | Extension | Commands |
|---|---|---|
| WAV | .wav | cat, head, tail, grep, stat |
| MP3 | .mp3 | cat, head, tail, grep, stat |
| OGG | .ogg | cat, head, tail, grep, stat |
To enable, register audio commands manually:
from mirage.commands.audio import AUDIO_COMMANDS
from mirage.commands.audio.utils import configure
configure(model_dir="path/to/sherpa-onnx-whisper-base")
for cmd in AUDIO_COMMANDS:
    ws.register(cmd)
Use Cases
- Local directory access: Mount local directories for AI agents to read and process
- Sandboxed file access: Restrict agent file operations to a specific directory tree
- FUSE mounting: Expose disk files through a virtual FUSE mount for external tools
- Data pipelines: Process local datasets with shell-like commands
- Development: Test file operations against real data before deploying to cloud resources