File System Operations
Safe and efficient file system operations including atomic writes, file locks, temp file patterns, directory traversal safety, path normalization, and cross-platform handling.
You are an autonomous agent that interacts with file systems as a core part of your work. Treat file operations as potentially destructive actions that demand the same care as database writes. Every read or write should be intentional, safe, and resilient to failure.
Philosophy
File systems are shared, mutable state. Other processes, users, or even the OS itself may modify files concurrently. Your operations must account for partial failures, race conditions, and platform differences. Prefer operations that are atomic, idempotent, and reversible. When in doubt, err on the side of safety — a failed write that leaves the original intact is always preferable to a partial write that corrupts data.
Techniques
Atomic Writes
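Never write directly to a target file. Instead, write to a temporary file in the same directory, then rename it to the final path, so readers never see half-written content. On POSIX, rename() within one filesystem atomically replaces the target. On Windows, replacing an existing file needs a replace-capable call (for example MoveFileEx with MOVEFILE_REPLACE_EXISTING), and it can fail while another process holds the target open. The pattern is: open temp file, write all content, flush, fsync, close, then rename. If any step fails, the original file remains untouched. When the rename itself must survive a crash on POSIX, also fsync the containing directory.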
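The write-then-rename pattern can be sketched in Python as follows. This is a minimal sketch rather than a definitive implementation: the name atomic_write is hypothetical, and it assumes the target directory already exists and is writable.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` with `data` so readers see either the old or the new content."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the file contents to stable storage
        os.replace(tmp_path, path)  # rename over the target
    except BaseException:
        # Any failure leaves the original file untouched; discard the temp file.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Because mkstemp creates the temp file with owner-only permissions, the replaced file ends up with those permissions too. Copy the original mode onto the temp file before the rename if that matters.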
Temp File Patterns
Use the OS-provided temp directory for scratch work, but place temp files destined for atomic rename in the same directory as the target — cross-device renames are not atomic and may fail.
Use unique suffixes (PID, timestamp, random string) to avoid collisions.
Always clean up temp files in a finally block or equivalent error-handling construct.
Consider using mkstemp or equivalent to create temp files with restricted permissions, preventing other users from reading sensitive intermediate data.
File Locking
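Use file locks (flock or fcntl on POSIX, LockFileEx on Windows) when multiple processes may write to the same file. POSIX locks are advisory, so they only protect against processes that also take the lock; LockFileEx locks are enforced by the OS for the locked byte range.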
Hold locks for the shortest duration possible.
Never assume a lock file's existence means the lock is held — the owning process may have crashed.
Implement lock timeouts to avoid deadlocks.
When using lock files (a separate .lock file), create them atomically with O_CREAT | O_EXCL and remove them in a finally block.
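A lock-file approach with atomic creation, a timeout, and guaranteed cleanup might look like the sketch below. The names lock_file and LockTimeout are hypothetical, and the default timeout and polling interval are arbitrary assumptions.

```python
import os
import time
from contextlib import contextmanager


class LockTimeout(Exception):
    """Raised when the lock file could not be created before the deadline."""


@contextmanager
def lock_file(lock_path: str, timeout: float = 5.0, poll_interval: float = 0.05):
    """Hold an exclusive lock file for the duration of a with-block."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails with FileExistsError if the file already exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise LockTimeout(f"could not acquire {lock_path} within {timeout}s")
            time.sleep(poll_interval)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(str(os.getpid()))  # record the owner to help diagnose stale locks
        yield
    finally:
        # Remove the lock even if the protected block raised.
        try:
            os.unlink(lock_path)
        except FileNotFoundError:
            pass
```

This sketch does not detect stale locks left by a crashed process. The recorded PID is only a diagnostic aid, and a production version would need a policy for deciding when a lock is abandoned.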
Directory Traversal Safety
When constructing paths from user input, always resolve and normalize the path first, then verify it stays within an expected base directory.
Reject paths containing .. segments that escape the sandbox.
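Use path.resolve() (Node.js) or os.path.realpath() (Python) before comparison. Note that path.resolve() only normalizes the string and does not resolve symlinks (use fs.realpath() for that), and compare whole path segments rather than raw string prefixes so that /base-other is not mistaken for a child of /base.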
Be aware that on case-insensitive filesystems, path comparison must also be case-insensitive to prevent bypass via mixed-case segments.
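The resolve-then-verify check can be sketched as a small helper. The name safe_join is hypothetical. The sketch relies on os.path.normcase, which folds case on Windows only, so it does not cover case-insensitive macOS volumes.

```python
import os


def safe_join(base_dir: str, user_path: str) -> str:
    """Return the resolved path for `user_path`, or raise if it leaves `base_dir`."""
    base = os.path.realpath(base_dir)
    # realpath collapses ".." segments and resolves symlinks before the check.
    candidate = os.path.realpath(os.path.join(base, user_path))
    # normcase lowercases on Windows and leaves paths unchanged on POSIX.
    norm_base = os.path.normcase(base)
    norm_candidate = os.path.normcase(candidate)
    # commonpath compares whole segments, so "/srv/data-evil" is not inside "/srv/data".
    if os.path.commonpath([norm_base, norm_candidate]) != norm_base:
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate
```

An absolute user_path makes os.path.join discard the base, so it is rejected by the same check. The result is still subject to a check-then-use race if another process can replace path components afterwards.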
Path Normalization
Always normalize paths before comparing or storing them.
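Convert backslashes to forward slashes when handling Windows-style paths cross-platform, but remember that on POSIX a backslash is a legal filename character, so only convert paths known to come from Windows.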
Use path.join() or equivalent rather than string concatenation.
Never hardcode path separators.
Be aware of trailing separator differences: dir/ and dir may or may not refer to the same thing depending on the operation and platform.
Symlink Awareness
Before operating on a file, decide whether you intend to operate on the symlink or its target.
Use lstat vs stat accordingly.
When deleting directories recursively, be cautious of symlinks that point outside the tree — do not follow them blindly.
When creating files, check whether the target path is a symlink that points somewhere unexpected.
Symlink resolution can differ between platforms, especially on Windows where symlinks require elevated privileges by default.
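As an illustration of choosing lstat over stat, here is a sketch of a recursive delete that removes symlinks themselves instead of descending into their targets. The name remove_tree is hypothetical and the sketch is POSIX-oriented; in Python, shutil.rmtree already avoids following symlinks and is the more practical choice.

```python
import os
import stat


def remove_tree(path: str) -> None:
    """Delete `path` recursively, unlinking symlinks instead of following them."""
    st = os.lstat(path)  # lstat inspects the link itself, not its target
    if stat.S_ISDIR(st.st_mode):
        with os.scandir(path) as entries:
            for entry in entries:
                remove_tree(entry.path)
        os.rmdir(path)
    else:
        # Regular files and symlinks (including symlinks to directories) land here.
        os.unlink(path)
```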
Large File Handling
Stream large files rather than reading them entirely into memory. Use chunked reads with a fixed buffer size (typically 64KB to 1MB). For line-by-line processing, use readline interfaces or generators. When reporting progress, track bytes processed against total file size. For files that may grow while being read (log files), use a snapshot approach or handle the moving target explicitly. Consider memory-mapped files for random access patterns on large files.
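Chunked reading with progress tracking can be sketched as below, using a checksum as the example workload. The name sha256_file and the on_progress callback are hypothetical, and the 1 MB default chunk size is just one value from the range mentioned above.

```python
import hashlib
import os
from typing import Callable, Optional


def sha256_file(
    path: str,
    chunk_size: int = 1024 * 1024,
    on_progress: Optional[Callable[[int, int], None]] = None,
) -> str:
    """Hash a file in fixed-size chunks so memory use stays bounded."""
    total = os.path.getsize(path)  # size at start; the file may grow afterwards
    done = 0
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # read() returns b"" at end of file, which ends the loop.
        while chunk := f.read(chunk_size):
            digest.update(chunk)
            done += len(chunk)
            if on_progress is not None:
                on_progress(done, total)
    return digest.hexdigest()
```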
File Watching
Use OS-native file watching (inotify, FSEvents, ReadDirectoryChangesW) rather than polling. Debounce rapid successive events — a save operation may trigger multiple events. Handle the case where a watched directory is deleted or renamed. Always provide a fallback polling mode for network-mounted filesystems where native events may not fire. Be aware of platform-specific event limits (inotify has a configurable maximum number of watches).
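Debouncing is independent of which watcher delivers the events, so it can be sketched on its own. The class name Debouncer and its methods are hypothetical. Timestamps are passed in explicitly (for example from time.monotonic()) to keep the logic easy to test, and the quiet period is an arbitrary assumption.

```python
from typing import Dict, List


class Debouncer:
    """Coalesce bursts of events per path until the path has been quiet for a while."""

    def __init__(self, quiet_period: float = 0.2) -> None:
        self.quiet_period = quiet_period
        self._last_seen: Dict[str, float] = {}

    def record(self, path: str, now: float) -> None:
        # A newer event for the same path replaces the older timestamp.
        self._last_seen[path] = now

    def ready(self, now: float) -> List[str]:
        """Return (and forget) paths with no events during the last quiet period."""
        due = [p for p, t in self._last_seen.items() if now - t >= self.quiet_period]
        for p in due:
            del self._last_seen[p]
        return sorted(due)
```

A watcher loop would call record() for every raw event and periodically call ready() to get the paths that are safe to process once.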
Cross-Platform Path Handling
Forward slashes work on Windows in most contexts, but not all (notably some Windows API calls and shell commands).
Use path.sep when you need the platform-native separator.
Treat paths as case-insensitive on Windows and macOS (by default), case-sensitive on Linux.
Be aware of reserved filenames on Windows (CON, PRN, AUX, NUL, COM1-9, LPT1-9) and maximum path length limitations (260 characters by default, though long paths can be enabled).
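A portability check for Windows reserved device names could look like this sketch. The helper name is_windows_reserved is hypothetical. The set includes AUX alongside CON, PRN, NUL, COM1-9, and LPT1-9, and the check treats a reserved name followed by an extension (such as nul.txt) as reserved too.

```python
WINDOWS_RESERVED = {
    "CON", "PRN", "AUX", "NUL",
    *(f"COM{i}" for i in range(1, 10)),
    *(f"LPT{i}" for i in range(1, 10)),
}


def is_windows_reserved(filename: str) -> bool:
    """Return True if a single path component is a reserved Windows device name."""
    # Only the part before the first dot matters: "nul.txt" still refers to NUL.
    stem = filename.split(".", 1)[0].rstrip(" ").upper()
    return stem in WINDOWS_RESERVED
```

Running this check on every platform, not only on Windows, keeps generated filenames portable.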
Encoding and Line Endings
Always be explicit about text encoding when reading or writing files.
Default to UTF-8 with BOM detection.
Handle BOM (byte order mark) gracefully — strip it when reading, do not add it when writing unless specifically required.
Normalize line endings consistently: use \n internally, convert to platform convention only at output boundaries.
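These encoding and line-ending rules can be sketched with two small helpers. The names read_text_normalized and write_text are hypothetical, and the sketch assumes the input is UTF-8 with or without a BOM.

```python
def read_text_normalized(path: str) -> str:
    """Read UTF-8 text, dropping a leading BOM and normalizing line endings to \\n."""
    # "utf-8-sig" decodes UTF-8 and skips a BOM if one is present.
    # newline="" disables Python's own translation so it can be done explicitly.
    with open(path, "r", encoding="utf-8-sig", newline="") as f:
        text = f.read()
    return text.replace("\r\n", "\n").replace("\r", "\n")


def write_text(path: str, text: str, newline: str = "\n") -> None:
    """Write UTF-8 text without a BOM, converting \\n to `newline` on output."""
    # Plain "utf-8" never emits a BOM; `newline` controls how "\n" is written.
    with open(path, "w", encoding="utf-8", newline=newline) as f:
        f.write(text)
```

Passing newline="\r\n" to write_text applies the Windows convention only at the output boundary, while the in-memory text keeps \n throughout.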
Best Practices
- Check file existence only as an optimization hint, never as a security gate. The file can appear or disappear between check and use (TOCTOU race).
- Set appropriate file permissions at creation time, not after. On POSIX, use umask-aware creation modes.
- Use O_CREAT | O_EXCL (or equivalent) when you need to guarantee you are creating a new file, not overwriting an existing one.
- Flush and fsync before closing files when durability matters. A successful write() does not mean the data has reached disk.
- Handle ENOSPC, EACCES, EMFILE, and ENAMETOOLONG gracefully with clear error messages rather than generic failure.
- When reading configuration files, provide sensible defaults if the file does not exist. Distinguish between "missing file" and "empty file."
- Prefer relative paths in configuration and output so artifacts remain portable across machines.
- Log the full resolved path in error messages to aid debugging.
- When creating directory trees, use mkdir -p (recursive creation) but verify the final permissions are correct.
- Before writing, verify the target directory exists. Create it if needed, but never silently create deeply nested structures that may indicate a misconfigured path.
- Use exclusive file creation when generating unique output files (reports, exports) to prevent accidental overwrites.
- Close files in finally blocks or use with-statement / using patterns. Leaked descriptors accumulate and eventually hit the per-process limit.
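Several of these practices (permissions set at creation, exclusive creation, fsync before close, and scoped cleanup) can be combined in one sketch. The name create_new is hypothetical and the default mode of 0o600 is an assumption.

```python
import os


def create_new(path: str, data: bytes, mode: int = 0o600) -> None:
    """Create `path` with `data`, failing if it already exists."""
    # O_EXCL makes creation raise FileExistsError instead of overwriting.
    # `mode` is applied at creation (masked by umask), so the file is never
    # briefly visible with looser permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, mode)
    # The with-statement closes the descriptor on every exit path.
    with os.fdopen(fd, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to stable storage
```

When default permissions are acceptable, Python's open(path, "xb") gives the same exclusive-creation guarantee with less code.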
Anti-Patterns
- Writing directly to the target file — Readers may see partial content; crashes leave corrupted files. Always use atomic write-then-rename.
- Using string concatenation for paths — Leads to double separators, missing separators, and platform bugs. Use path.join or equivalent.
- Catching and ignoring all file errors — Different errors demand different responses. ENOENT may be expected; EACCES is a configuration problem; ENOSPC needs operator intervention.
- Recursive deletion without symlink checks — A symlink pointing to / inside your tree could cause catastrophic data loss if you follow it during recursive delete.
- Hardcoding /tmp — Use os.tmpdir() or equivalent. On some systems, /tmp is a ramdisk with limited space or different cleanup policies.
- Reading entire large files into memory — This causes memory pressure, potential OOM, and slow performance. Stream instead.
- Polling for file changes at high frequency — Wastes CPU and battery. Use OS-native watchers with a polling fallback only when necessary.
- Assuming UTF-8 encoding — Always be explicit about encoding. Detect or require encoding declaration for text files.
- Not handling partial writes — Network filesystems and full disks can cause writes to succeed partially. Always verify written byte counts and use fsync to confirm durability.
- Leaking file descriptors — Open files without corresponding closes accumulate and eventually hit the per-process file descriptor limit, causing unrelated operations to fail.
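As a counterpart to catching and ignoring all file errors, the sketch below turns specific error codes into clear messages that include the resolved path. The name describe_os_error and the hint wording are hypothetical; only the errno constants come from the standard library.

```python
import errno
import os

_HINTS = {
    errno.ENOENT: "path does not exist",
    errno.EACCES: "permission denied; check ownership and mode",
    errno.ENOSPC: "no space left on device",
    errno.EMFILE: "too many open files in this process",
    errno.ENAMETOOLONG: "file name or path too long",
}


def describe_os_error(exc: OSError) -> str:
    """Build a specific message for a file error, including the absolute path."""
    # Fall back to the OS-provided text for codes without a custom hint.
    hint = _HINTS.get(exc.errno) or exc.strerror or "unknown file error"
    if isinstance(exc.filename, str):
        target = os.path.abspath(exc.filename)
    else:
        target = "<unknown path>"
    return f"{hint}: {target}"
```

The caller still decides what to do with each case: a missing file may be expected and handled with defaults, while a full disk usually needs to be surfaced to an operator.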