The GitHub resource mounts a GitHub repository as a read-only virtual
filesystem.
For token setup, see GitHub Setup.
Config
import os
from mirage import MountMode, Workspace
from mirage.resource.github import GitHubConfig, GitHubResource
config = GitHubConfig(token=os.environ["GITHUB_TOKEN"])
resource = GitHubResource(
    config=config, owner="my-org", repo="my-repo", ref="main")
ws = Workspace({"/github": resource}, mode=MountMode.READ)
Filesystem Layout
/github/
  README.md
  pyproject.toml
  src/
    __init__.py
    main.py
    utils.py
    models/
      user.py
      item.py
  tests/
    test_main.py
The filesystem mirrors the repository tree. Paths contain no owner, repo,
or branch component; those are fixed at mount time.
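To make the path mapping concrete, here is a minimal sketch of how a path under the mount point could translate to a repository-relative path. The helper `to_repo_path` is hypothetical and is not part of the mirage API; it only illustrates that the mount prefix is the sole thing stripped.

```python
import posixpath


def to_repo_path(mount_path: str, mount_point: str = "/github") -> str:
    """Translate a path under the mount point to a repository-relative path.

    Hypothetical helper for illustration; not part of mirage.
    """
    normalized = posixpath.normpath(mount_path)
    root = posixpath.normpath(mount_point)
    if normalized == root:
        return ""  # the repository root itself
    if not normalized.startswith(root + "/"):
        raise ValueError(f"{mount_path!r} is not under {mount_point!r}")
    # Drop the mount prefix and the slash that follows it.
    return normalized[len(root) + 1:]


print(to_repo_path("/github/src/main.py"))
```

The owner, repo, and ref never appear in the result because they belong to the resource configuration, not to the path.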
Tree Fetching
The resource fetches the full recursive tree at init. For repos with
more than 100K entries, it falls back to per-directory fetching.
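The fallback can be sketched roughly as follows. This is an illustration and not the resource's actual implementation: it assumes responses shaped like GitHub's "get a tree" endpoint (a `tree` list plus a `truncated` flag), and `fetch_tree` is a hypothetical callable you would supply. The sketch walks every directory eagerly for simplicity, whereas a real implementation may well fetch directories lazily on access.

```python
from typing import Callable, Dict, List

# Hypothetical fetcher: (tree_sha, recursive) -> a dict shaped like
# {"tree": [{"path": ..., "type": "blob" | "tree", "sha": ...}], "truncated": bool}
FetchTree = Callable[[str, bool], Dict]


def list_all_paths(root_sha: str, fetch_tree: FetchTree) -> List[str]:
    """List every path in a tree, falling back to per-directory fetches."""
    response = fetch_tree(root_sha, True)
    if not response.get("truncated", False):
        # The recursive listing is complete; entries carry full paths.
        return sorted(entry["path"] for entry in response["tree"])

    # Fallback: the recursive listing was cut off, so walk one directory
    # at a time. Non-recursive entries carry names relative to their parent.
    paths: List[str] = []
    pending = [("", root_sha)]
    while pending:
        prefix, sha = pending.pop()
        for entry in fetch_tree(sha, False)["tree"]:
            full_path = f"{prefix}{entry['path']}"
            paths.append(full_path)
            if entry["type"] == "tree":
                pending.append((full_path + "/", entry["sha"]))
    return sorted(paths)
```

Passing the fetcher in as a parameter keeps the sketch independent of any HTTP client and makes the fallback path easy to exercise with canned responses.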
Cache
The GitHub resource uses IndexCacheStore with SHA-based fingerprinting. Content is content-addressed: if the SHA matches, the content is identical.
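The idea behind SHA-based fingerprinting can be shown with a toy content-addressed cache. This is a sketch under assumptions, not the IndexCacheStore API: the class `ShaContentCache` and its `get` method are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict


class ShaContentCache:
    """Toy content-addressed cache mapping blob SHA -> bytes.

    Hypothetical illustration; not the IndexCacheStore interface.
    """

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}
        self.misses = 0

    def get(self, sha: str, fetch: Callable[[], bytes]) -> bytes:
        # A matching SHA means identical content, so a hit never needs
        # revalidation; only an unseen SHA triggers a fetch.
        if sha not in self._blobs:
            self.misses += 1
            self._blobs[sha] = fetch()
        return self._blobs[sha]


cache = ShaContentCache()
first = cache.get("a1b2c3", lambda: b"# README\n")
second = cache.get("a1b2c3", lambda: b"never fetched")
print(cache.misses)
```

Because the key is derived from the content, two paths that hold the same blob share one cache entry, and a file that changes simply shows up under a new SHA.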
Example
import asyncio
import os
from dotenv import load_dotenv
from mirage import MountMode, Workspace
from mirage.resource.github import GitHubConfig, GitHubResource
load_dotenv(".env.development")
async def main():
    config = GitHubConfig(token=os.environ["GITHUB_TOKEN"])
    resource = GitHubResource(
        config=config, owner="my-org", repo="my-repo", ref="main")
    ws = Workspace({"/github": resource}, mode=MountMode.READ)

    # List repository root
    r = await ws.execute("ls /github/")
    print(await r.stdout_str())

    # Read a file
    r = await ws.execute("cat /github/README.md")
    print(await r.stdout_str())

    # Search for a pattern
    r = await ws.execute('rg "def main" /github/')
    print(await r.stdout_str())

    # Tree view
    r = await ws.execute("tree -L 2 /github/")
    print(await r.stdout_str())

    # File metadata with SHA
    r = await ws.execute("stat /github/README.md")
    print(await r.stdout_str())

if __name__ == "__main__":
    asyncio.run(main())
See examples/code/github.py for the full working example.
Finding SHAs
Git blob SHAs are available via the stat command:
stat /github/README.md
# -> extra={"sha": "a1b2c3d4e5f6..."}
stat /github/src/main.py
# -> extra={"sha": "f6e5d4c3b2a1..."}
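To check a local copy of a file against the SHA reported by stat, you can compute the Git blob SHA yourself. The sketch below assumes the reported value is the standard Git SHA-1 blob hash, which covers a `blob <size>` header followed by a NUL byte and then the file bytes. The function name `git_blob_sha` is an illustrative choice.

```python
import hashlib


def git_blob_sha(content: bytes) -> str:
    """Return the Git SHA-1 object ID for a blob with the given content."""
    # Git hashes a header ("blob", a space, the byte length, a NUL byte)
    # followed by the raw file bytes.
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


print(git_blob_sha(b"hello\n"))
```

Because of the header, this value differs from a plain SHA-1 of the file and is unrelated to the output of md5 or sha256sum.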
Working with Large Repos
Tips for efficient access on large repositories:
# Find files by name
find /github/ -name "*.py"
# Search with rg (uses GitHub code search API when applicable)
rg "TODO" /github/
# Read only the first lines of a file
head -n 20 /github/src/main.py
# Check file sizes
du /github/src/
# List deeply nested directories
tree -L 3 /github/src/
Shell Commands
Standard commands available on the mounted GitHub tree:
| Command | Notes |
|---|---|
| ls | List files and directories |
| cat | Read file contents |
| head / tail | First/last N lines |
| grep / rg | Pattern search; rg uses code search API |
| jq | Query JSON files |
| wc | Line/word/byte counts |
| stat | File metadata including git SHA |
| find | Recursive search with -name, -maxdepth |
| tree | Directory tree view |
| diff | Compare files |
| du | Disk usage / file sizes |
| awk | Text processing |
| sed | Stream editing |
| sort | Sort lines |
| uniq | Deduplicate lines |
| cut | Extract columns |
| tr | Translate characters |
| nl | Number lines |
| md5 | MD5 checksum |
| sha256sum | SHA-256 checksum |
| file | Detect file type |
| basename | Strip directory from path |
| dirname | Strip filename from path |
| realpath | Resolve path |
Search Optimization
rg uses the GitHub code search API when the search scope exceeds
100 files and the mounted ref is the repository’s default branch.
This avoids downloading file contents and returns results significantly
faster for large repositories.
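The rule above amounts to a two-condition check, sketched below. The function and constant names are hypothetical, and the resource's real decision logic may consider more than this. The default-branch condition is consistent with GitHub's code search indexing only a repository's default branch.

```python
# Threshold taken from the description above; the name is illustrative.
CODE_SEARCH_FILE_THRESHOLD = 100


def should_use_code_search(
    files_in_scope: int, mounted_ref: str, default_branch: str
) -> bool:
    """Decide whether a search could be served by the code search API."""
    exceeds_threshold = files_in_scope > CODE_SEARCH_FILE_THRESHOLD
    on_default_branch = mounted_ref == default_branch
    return exceeds_threshold and on_default_branch


print(should_use_code_search(5000, "main", "main"))
print(should_use_code_search(5000, "release-1.x", "main"))
```

When either condition fails, the search presumably has to read file contents through the mount, so narrowing the search path keeps small or off-branch searches fast.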