File System Operations

Safe and efficient file system operations including atomic writes, file locks, temp file patterns, directory traversal safety, path normalization, and cross-platform handling.

You are an autonomous agent that interacts with file systems as a core part of your work. Treat file operations as potentially destructive actions that demand the same care as database writes. Every read or write should be intentional, safe, and resilient to failure.

Philosophy

File systems are shared, mutable state. Other processes, users, or even the OS itself may modify files concurrently. Your operations must account for partial failures, race conditions, and platform differences. Prefer operations that are atomic, idempotent, and reversible. When in doubt, err on the side of safety — a failed write that leaves the original intact is always preferable to a partial write that corrupts data.

Techniques

Atomic Writes

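Never write directly to a target file. Instead, write to a temporary file in the same directory, then rename it over the final path. On POSIX, rename() within one filesystem atomically replaces the destination. On Windows, a plain rename fails if the destination exists, so use a replacing move (MoveFileEx with MOVEFILE_REPLACE_EXISTING, or a wrapper such as Python's os.replace), and treat its atomicity as a weaker guarantee. This prevents readers from seeing half-written content. The pattern is: open temp file, write all content, flush, fsync, close, then rename. If any step before the rename fails, the original file remains untouched. When the rename itself must survive a power loss, also fsync the containing directory on POSIX.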
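
The write-then-rename sequence above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name atomic_write and the ".tmp-" / ".part" naming are assumptions made for the example. Note that mkstemp creates the file with owner-only permissions, so the final file inherits that mode unless you change it before the rename.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write data to path so readers see either the old or the new content."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename stays on one
    # filesystem. mkstemp picks a unique name and restricts permissions.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # push Python's buffer to the OS
            os.fsync(tmp.fileno())  # ask the OS to push it to disk
        # os.replace overwrites an existing destination on POSIX and Windows.
        os.replace(tmp_path, path)
    except BaseException:
        # Any failure leaves the original file as it was; remove the leftover.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Calling atomic_write("settings.json", b"{}") twice in a row leaves a single settings.json holding the second payload and no stray temp files in the directory.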

Temp File Patterns

Use the OS-provided temp directory for scratch work, but place temp files destined for atomic rename in the same directory as the target — cross-device renames are not atomic and may fail. Use unique suffixes (PID, timestamp, random string) to avoid collisions. Always clean up temp files in a finally block or equivalent error-handling construct. Consider using mkstemp or equivalent to create temp files with restricted permissions, preventing other users from reading sensitive intermediate data.

File Locking

Use file locks (flock or fcntl on POSIX, LockFileEx on Windows) when multiple processes may write to the same file. POSIX locks are advisory: they constrain only processes that also take the lock. LockFileEx locks are enforced by the OS against other handles. Hold locks for the shortest duration possible. Never assume a lock file's existence means the lock is held, because the owning process may have crashed. Implement lock timeouts to avoid deadlocks. When using lock files (a separate .lock file), create them atomically with O_CREAT | O_EXCL and remove them in a finally block.
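
As a rough sketch of the separate lock-file approach with a timeout, assuming Python: the name lock_file and the timeout and poll defaults are hypothetical, and stale-lock recovery (deciding that a crashed owner left the file behind) is deliberately not handled here.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def lock_file(path: str, timeout: float = 5.0, poll: float = 0.05):
    """Hold an exclusive lock represented by a separate lock file."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only one
            # process can create it.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {path} within {timeout}s")
            time.sleep(poll)
    try:
        # Record the owner PID so a human or tool can diagnose a stale lock.
        os.write(fd, str(os.getpid()).encode("ascii"))
        yield
    finally:
        os.close(fd)
        os.unlink(path)
```

Usage is `with lock_file("data.json.lock"): ...` around the critical section. A second caller waits up to the timeout and then receives TimeoutError instead of blocking forever.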

Directory Traversal Safety

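When constructing paths from user input, always resolve and normalize the path first, then verify it stays within an expected base directory. Reject paths whose .. segments escape the sandbox. Resolve with a function that follows symlinks (os.path.realpath() in Python, fs.realpath() in Node.js) before comparing; path.resolve() only normalizes the string and leaves symlinks unresolved. Compare on whole path components rather than raw string prefixes, so that /srv/data-old is not mistaken for a child of /srv/data. Be aware that on case-insensitive filesystems, path comparison must also be case-insensitive to prevent bypass via mixed-case segments.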
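
A minimal containment check along these lines, assuming Python, might look like the sketch below. The function name resolve_inside is hypothetical. os.path.normcase folds case only on Windows, so a case-insensitive macOS volume would need extra handling that is not shown.

```python
import os


def resolve_inside(base_dir: str, user_path: str) -> str:
    """Return the real path of user_path under base_dir, or raise ValueError."""
    base = os.path.realpath(base_dir)
    # An absolute user_path makes os.path.join discard base; the check below
    # then rejects it unless it really lies inside base.
    candidate = os.path.realpath(os.path.join(base, user_path))
    base_cmp = os.path.normcase(base)
    cand_cmp = os.path.normcase(candidate)
    # commonpath compares whole components, so "/srv/data-old" is not
    # treated as being inside "/srv/data".
    if os.path.commonpath([base_cmp, cand_cmp]) != base_cmp:
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate
```

Because both sides go through realpath, a symlink inside the base that points elsewhere is also rejected.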

Path Normalization

Always normalize paths before comparing or storing them. Convert backslashes to forward slashes when working cross-platform. Use path.join() or equivalent rather than string concatenation. Never hardcode path separators. Be aware of trailing separator differences: dir/ and dir may or may not refer to the same thing depending on the operation and platform.
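
One possible normalization helper for relative paths that are stored or compared across platforms is sketched here in Python. The name normalize_for_storage is an assumption, and the approach treats every backslash as a separator, which is wrong for POSIX filenames that legitimately contain a backslash. The collapsing of .. is purely lexical and does not account for symlinks.

```python
import posixpath
from pathlib import PureWindowsPath


def normalize_for_storage(raw: str) -> str:
    """Return a forward-slash, lexically normalized form of a path string."""
    # PureWindowsPath accepts both "\" and "/" as separators and never
    # touches the disk, so it can parse either style on any platform.
    as_posix = PureWindowsPath(raw).as_posix()
    # normpath collapses "." and ".." segments and drops a trailing slash.
    return posixpath.normpath(as_posix)
```

With this helper, "src\\lib//utils/./io.py" becomes "src/lib/utils/io.py", and "dir/" and "dir" normalize to the same string.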

Symlink Awareness

Before operating on a file, decide whether you intend to operate on the symlink or its target. Use lstat to inspect the link itself and stat to inspect what it points to. When deleting directories recursively, be cautious of symlinks that point outside the tree, and do not follow them blindly. When creating files, check whether the target path is a symlink that points somewhere unexpected. Symlink behavior can differ between platforms, especially on Windows, where creating symlinks requires elevated privileges by default (or Developer Mode on recent versions).
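
To illustrate recursive deletion that removes links rather than following them, here is a small Python sketch. The name remove_tree is hypothetical, and the check is still subject to a race if another process swaps a directory for a symlink mid-walk. Python's shutil.rmtree applies the same rule; the sketch only spells the logic out.

```python
import os


def remove_tree(root: str) -> None:
    """Delete root recursively, unlinking symlinks instead of following them."""
    if os.path.islink(root):
        raise ValueError(f"refusing to delete through symlink: {root}")
    with os.scandir(root) as entries:
        for entry in entries:
            # follow_symlinks=False: a link to a directory is NOT a directory
            # here, so it falls through to unlink and its target is untouched.
            if entry.is_dir(follow_symlinks=False):
                remove_tree(entry.path)
            else:
                os.unlink(entry.path)
    os.rmdir(root)
```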

Large File Handling

Stream large files rather than reading them entirely into memory. Use chunked reads with a fixed buffer size (typically 64KB to 1MB). For line-by-line processing, use readline interfaces or generators. When reporting progress, track bytes processed against total file size. For files that may grow while being read (log files), use a snapshot approach or handle the moving target explicitly. Consider memory-mapped files for random access patterns on large files.
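
Chunked reading with progress tracking could look like this in Python. The function name sha256_streaming, the 1 MB default chunk, and the on_progress callback signature are all assumptions chosen for the example; hashing stands in for whatever per-chunk work you need.

```python
import hashlib
import os
from typing import Callable, Optional


def sha256_streaming(
    path: str,
    chunk_size: int = 1024 * 1024,
    on_progress: Optional[Callable[[int, int], None]] = None,
) -> str:
    """Hash a file in fixed-size chunks, reporting (bytes_done, total_bytes)."""
    total = os.path.getsize(path)  # snapshot; the file may grow afterwards
    done = 0
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
            done += len(chunk)
            if on_progress is not None:
                on_progress(done, total)
    return digest.hexdigest()
```

Memory use stays bounded by chunk_size regardless of file size.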

File Watching

Use OS-native file watching (inotify, FSEvents, ReadDirectoryChangesW) rather than polling. Debounce rapid successive events — a save operation may trigger multiple events. Handle the case where a watched directory is deleted or renamed. Always provide a fallback polling mode for network-mounted filesystems where native events may not fire. Be aware of platform-specific event limits (inotify has a configurable maximum number of watches).
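
Native watchers need platform APIs or a third-party library, so they are not shown here. The polling fallback, however, can be sketched with the standard library alone. The names snapshot and diff_snapshots are hypothetical, and the sketch watches only the regular files directly inside one directory.

```python
import os
from typing import Dict, List, Tuple

Snapshot = Dict[str, Tuple[int, int]]


def snapshot(directory: str) -> Snapshot:
    """Map each regular file name in directory to (mtime_ns, size)."""
    state: Snapshot = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            try:
                if entry.is_file(follow_symlinks=False):
                    info = entry.stat(follow_symlinks=False)
                    state[entry.name] = (info.st_mtime_ns, info.st_size)
            except FileNotFoundError:
                continue  # file vanished between listing and stat
    return state


def diff_snapshots(old: Snapshot, new: Snapshot) -> Tuple[List[str], List[str], List[str]]:
    """Return sorted (added, removed, modified) names between two snapshots."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    modified = sorted(n for n in old.keys() & new.keys() if old[n] != new[n])
    return added, removed, modified
```

A poller would call snapshot on a modest interval and act on the diff. Comparing whole snapshots also collapses bursts of rapid changes into one report per interval.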

Cross-Platform Path Handling

Forward slashes work on Windows in most contexts, but not all (notably some Windows API calls and shell commands). Use path.sep when you need the platform-native separator. Treat paths as case-insensitive on Windows and macOS (by default), case-sensitive on Linux. Be aware of reserved filenames on Windows (CON, PRN, AUX, NUL, COM1-9, LPT1-9), which stay reserved even with an extension such as NUL.txt, and of maximum path length limitations (260 characters by default, though long paths can be enabled).
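
A conservative filename check that can run on any platform before writing output meant to be portable is sketched below in Python. The name is_portable_filename is an assumption, and the rules cover only the common Windows restrictions (reserved device names, forbidden characters, trailing dot or space); they are not an exhaustive specification.

```python
_RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)
_FORBIDDEN_CHARS = set('<>:"/\\|?*')


def is_portable_filename(name: str) -> bool:
    """Return True if name is a single path component safe on Windows too."""
    if not name or name in (".", ".."):
        return False
    # Control characters and these punctuation marks are invalid on Windows.
    if any(ch in _FORBIDDEN_CHARS or ord(ch) < 32 for ch in name):
        return False
    # Windows silently strips a trailing dot or space, which causes surprises.
    if name[-1] in " .":
        return False
    # Device names are reserved with or without an extension ("nul.txt").
    stem = name.split(".", 1)[0].rstrip(" ").upper()
    return stem not in _RESERVED
```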

Encoding and Line Endings

Always be explicit about text encoding when reading or writing files. Default to UTF-8 with BOM detection. Handle BOM (byte order mark) gracefully — strip it when reading, do not add it when writing unless specifically required. Normalize line endings consistently: use \n internally, convert to platform convention only at output boundaries.
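
In Python, explicit encoding, BOM stripping, and newline normalization can be expressed through the arguments to open, as in this sketch. The helper names read_text_normalized and write_text are hypothetical.

```python
def read_text_normalized(path: str) -> str:
    """Read UTF-8 text, dropping a leading BOM and converting newlines to \\n."""
    # "utf-8-sig" decodes UTF-8 and skips a BOM if one is present.
    # newline=None enables universal newlines: \r\n and \r become \n.
    with open(path, "r", encoding="utf-8-sig", newline=None) as fh:
        return fh.read()


def write_text(path: str, text: str, line_ending: str = "\n") -> None:
    """Write UTF-8 text without a BOM, translating \\n to line_ending."""
    # Plain "utf-8" never writes a BOM. The newline argument controls how
    # each "\n" in text is written out; pass "\r\n" for Windows convention.
    with open(path, "w", encoding="utf-8", newline=line_ending) as fh:
        fh.write(text)
```

Text stays in the \n form while in memory, and the platform convention is chosen only at the write boundary.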

Best Practices

  • Check file existence only as an optimization hint, never as a security gate. The file can appear or disappear between check and use (TOCTOU race).
  • Set appropriate file permissions at creation time, not after. On POSIX, use umask-aware creation modes.
  • Use O_CREAT | O_EXCL (or equivalent) when you need to guarantee you are creating a new file, not overwriting an existing one.
  • Flush and fsync before closing files when durability matters. A successful write() does not mean the data has reached disk.
  • Handle ENOSPC, EACCES, EMFILE, and ENAMETOOLONG gracefully with clear error messages rather than generic failure.
  • When reading configuration files, provide sensible defaults if the file does not exist. Distinguish between "missing file" and "empty file."
  • Prefer relative paths in configuration and output so artifacts remain portable across machines.
  • Log the full resolved path in error messages to aid debugging.
  • When creating directory trees, use mkdir -p (recursive creation) but verify the final permissions are correct.
  • Before writing, verify the target directory exists. Create it if needed, but never silently create deeply nested structures that may indicate a misconfigured path.
  • Use exclusive file creation when generating unique output files (reports, exports) to prevent accidental overwrites.
  • Close files in finally blocks or use with-statement / using patterns. Leaked descriptors accumulate and eventually hit the per-process limit.
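
Several of these practices (exclusive creation, permissions set at creation time, no check-then-create race) combine naturally when generating unique output files. The following Python sketch assumes a hypothetical create_unique helper and a "-N" suffix scheme, neither of which comes from any particular library.

```python
import os
from typing import TextIO, Tuple


def create_unique(directory: str, stem: str, ext: str) -> Tuple[str, TextIO]:
    """Create stem+ext, or stem-1+ext, stem-2+ext, ... without overwriting."""
    n = 0
    while True:
        name = f"{stem}{ext}" if n == 0 else f"{stem}-{n}{ext}"
        path = os.path.join(directory, name)
        try:
            # O_EXCL makes creation fail if the name is taken, so there is no
            # gap between checking and creating. The mode (0o600, further
            # limited by the umask) is applied at creation, not afterwards.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
        except FileExistsError:
            n += 1
            continue
        return path, os.fdopen(fd, "w", encoding="utf-8")
```

The caller owns the returned file object and should close it, ideally by using it in a with-statement.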

Anti-Patterns

  • Writing directly to the target file — Readers may see partial content; crashes leave corrupted files. Always use atomic write-then-rename.
  • Using string concatenation for paths — Leads to double separators, missing separators, and platform bugs. Use path.join or equivalent.
  • Catching and ignoring all file errors — Different errors demand different responses. ENOENT may be expected; EACCES is a configuration problem; ENOSPC needs operator intervention.
  • Recursive deletion without symlink checks — A symlink pointing to / inside your tree could cause catastrophic data loss if you follow it during recursive delete.
  • Hardcoding /tmp — Use os.tmpdir() or equivalent. On some systems, /tmp is a ramdisk with limited space or different cleanup policies.
  • Reading entire large files into memory — This causes memory pressure, potential OOM, and slow performance. Stream instead.
  • Polling for file changes at high frequency — Wastes CPU and battery. Use OS-native watchers with a polling fallback only when necessary.
  • Assuming UTF-8 encoding — Always be explicit about encoding. Detect or require encoding declaration for text files.
  • Not handling partial writes — Network filesystems and full disks can cause writes to succeed partially. Always verify written byte counts and use fsync to confirm durability.
  • Leaking file descriptors — Open files without corresponding closes accumulate and eventually hit the per-process file descriptor limit, causing unrelated operations to fail.
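
As a counterpart to catching and ignoring all file errors, the sketch below maps specific errno values to actionable messages in Python. The helper name explain_os_error and the wording of the hints are assumptions; a real agent would likely also decide per error whether to retry, fall back, or stop.

```python
import errno
import os

_HINTS = {
    errno.ENOENT: "file does not exist",
    errno.EACCES: "permission denied; check ownership and mode",
    errno.ENOSPC: "no space left on device; operator intervention needed",
    errno.EMFILE: "too many open files; check for leaked descriptors",
    errno.ENAMETOOLONG: "path or filename too long",
}


def explain_os_error(exc: OSError) -> str:
    """Build a message with the absolute path and an errno-specific hint."""
    hint = _HINTS.get(exc.errno, exc.strerror or "unknown error")
    if isinstance(exc.filename, (str, bytes)):
        path = os.path.abspath(os.fsdecode(exc.filename))
    else:
        path = "<unknown path>"  # e.g. the call was made on a descriptor
    return f"{path}: {hint}"
```

Wrapping file calls in `except OSError as exc:` and logging explain_os_error(exc) keeps ENOENT, EACCES, and ENOSPC distinguishable in the logs.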