Config
Filesystem Layout
Tree Fetching
The resource fetches the full recursive tree at init. For repos with more than 100K entries, it falls back to per-directory fetching.
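As a rough illustration of the fallback described above, the sketch below lists a tree either in one recursive call or directory by directory. It assumes the GitHub REST endpoint `GET /repos/{owner}/{repo}/git/trees/{sha}`, whose response carries a `tree` list and a `truncated` flag. The `fetch_tree` callable, the `list_entries` name, and the use of `truncated` as the fallback trigger are assumptions made for illustration, not the resource's actual implementation.

```python
from typing import Callable, Dict

# Hypothetical fetcher: (tree_sha, recursive) -> parsed JSON body of
# GET /repos/{owner}/{repo}/git/trees/{tree_sha}, with ?recursive=1 when True.
FetchTree = Callable[[str, bool], dict]


def list_entries(root_sha: str, fetch_tree: FetchTree) -> Dict[str, dict]:
    """Return {path: entry} for every entry reachable from root_sha."""
    full = fetch_tree(root_sha, True)
    if not full.get("truncated", False):
        # The recursive response is complete, so use it directly.
        return {e["path"]: e for e in full["tree"]}

    # Fallback: fetch one directory at a time and build full paths ourselves,
    # because non-recursive responses only contain names relative to the tree.
    entries: Dict[str, dict] = {}
    pending = [("", root_sha)]  # (path prefix, tree SHA)
    while pending:
        prefix, sha = pending.pop()
        for e in fetch_tree(sha, False)["tree"]:
            path = f"{prefix}{e['path']}"
            entries[path] = {**e, "path": path}
            if e["type"] == "tree":
                pending.append((path + "/", e["sha"]))
    return entries
```

Passing the fetcher in as a parameter keeps the traversal logic independent of any HTTP client, which also makes it easy to exercise with an in-memory fake.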
Cache
The GitHub resource uses IndexCacheStore with SHA-based fingerprinting. Content is content-addressed: if the SHA matches, the content is identical.
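To make the content-addressing idea concrete, here is a minimal sketch of a cache keyed by git blob SHA. The `BlobCache` class is hypothetical and is not the IndexCacheStore API. Only the hashing scheme (SHA-1 over a `blob <size>\0` header followed by the content) is standard git behavior, and it assumes a repository using the default SHA-1 object format.

```python
import hashlib
from typing import Dict, Optional


def git_blob_sha(content: bytes) -> str:
    """Compute the SHA-1 that git assigns to a blob with this content."""
    header = f"blob {len(content)}\0".encode()
    return hashlib.sha1(header + content).hexdigest()


class BlobCache:
    """Hypothetical in-memory cache keyed by blob SHA (illustration only)."""

    def __init__(self) -> None:
        self._by_sha: Dict[str, bytes] = {}

    def get(self, sha: str) -> Optional[bytes]:
        # An identical SHA implies identical bytes, so a hit needs no revalidation.
        return self._by_sha.get(sha)

    def put(self, content: bytes) -> str:
        # Store the content under its own fingerprint and return that key.
        sha = git_blob_sha(content)
        self._by_sha[sha] = content
        return sha
```

Because the key is derived from the content itself, a cached entry never goes stale: a changed file simply has a different SHA and misses the cache.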
Example
See examples/code/github.py for the full working example.
Finding SHAs
Git blob SHAs are available via the stat command:
Working with Large Repos
Tips for efficient access on large repositories:
Shell Commands
Standard commands available on the mounted GitHub tree:

| Command | Notes |
|---|---|
| ls | List files and directories |
| cat | Read file contents |
| head / tail | First/last N lines |
| grep / rg | Pattern search; rg uses code search API |
| jq | Query JSON files |
| wc | Line/word/byte counts |
| stat | File metadata including git SHA |
| find | Recursive search with -name, -maxdepth |
| tree | Directory tree view |
| diff | Compare files |
| du | Disk usage / file sizes |
| awk | Text processing |
| sed | Stream editing |
| sort | Sort lines |
| uniq | Deduplicate lines |
| cut | Extract columns |
| tr | Translate characters |
| nl | Number lines |
| md5 | MD5 checksum |
| sha256sum | SHA-256 checksum |
| file | Detect file type |
| basename | Strip directory from path |
| dirname | Strip filename from path |
| realpath | Resolve path |
Search Optimization
rg uses the GitHub code search API when the search scope exceeds
100 files and the mounted ref is the repository’s default branch.
This avoids downloading file contents and returns results significantly
faster for large repositories.
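The rule above can be written as a small predicate. This is only a sketch of the stated condition: the function name, parameter names, and constant are hypothetical, and the real dispatch logic may consider more than these two inputs.

```python
CODE_SEARCH_MIN_FILES = 100  # threshold stated in the text above


def use_code_search(files_in_scope: int, mounted_ref: str, default_branch: str) -> bool:
    """Return True when a search should be routed to the code search API."""
    # Both conditions must hold: a scope larger than the threshold, and a
    # mounted ref that is the repository's default branch.
    return files_in_scope > CODE_SEARCH_MIN_FILES and mounted_ref == default_branch
```

For example, `use_code_search(5000, "main", "main")` is true, while the same search on a feature branch, or one scoped to 100 files or fewer, would scan file contents directly.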