CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
This is a single-file Python script (imapdown.py) that downloads all emails from an IMAP server into individual EML files, preserving the folder hierarchy. It uses only Python's standard library and has no external dependencies.
Development Environment
- Python 3.6+ required
- Virtual environment is set up in .venv - activate it before running: source .venv/bin/activate
Running the Script
Basic usage (incremental mode - only downloads new emails):
./imapdown.py --server imap.example.com --email user@example.com --user user@example.com --password "password" --ssl
Full download (ignores previous state, requires empty target directory):
./imapdown.py --server imap.example.com --email user@example.com --user user@example.com --password "password" --ssl --full
Testing/debugging with limited emails:
./imapdown.py --server imap.example.com --email user@example.com --user user@example.com --password "password" --ssl --limit 10
Custom storage directory:
./imapdown.py --server imap.example.com --email user@example.com --user user@example.com --password "password" --ssl --output /path/to/backup
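The flags used in the commands above could be declared with argparse roughly as follows. This is a minimal sketch, not the script's actual parser: the flag names come from the usage examples, while the types, defaults, help strings, and the build_parser name are assumptions (the ./download default is taken from the Output Structure section).

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Flag names mirror the usage examples; everything else is assumed.
    parser = argparse.ArgumentParser(description="Download IMAP mail into EML files")
    parser.add_argument("--server", required=True, help="IMAP server host name")
    parser.add_argument("--email", required=True, help="email address of the account")
    parser.add_argument("--user", required=True, help="IMAP login name")
    parser.add_argument("--password", required=True, help="IMAP password")
    parser.add_argument("--ssl", action="store_true", help="connect over SSL")
    parser.add_argument("--full", action="store_true", help="ignore saved state")
    parser.add_argument("--limit", type=int, default=None, help="cap on downloaded emails")
    parser.add_argument("--output", default="./download", help="storage directory")
    return parser


demo_args = build_parser().parse_args([
    "--server", "imap.example.com",
    "--email", "user@example.com",
    "--user", "user@example.com",
    "--password", "password",
    "--ssl", "--limit", "10",
])
print(demo_args.limit, demo_args.full, demo_args.output)
```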
Architecture
Single-File Design
The entire application is contained in imapdown.py (13KB). This is intentional - no modules or packages.
State Tracking
- The script maintains a .imapdown_state.json file in each email account's download folder
- Tracks the highest UID (unique identifier) downloaded per IMAP folder
- Format: {"INBOX": 19334, "INBOX.Archive": 1770, "Sent": 892}
- Enables efficient incremental downloads (default mode)
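Reading and writing a state file of this shape might look like the sketch below. The function names load_state and save_state are hypothetical, and writing through a temporary file followed by os.replace is a design choice of this example (it avoids a half-written state file if the run is interrupted), not something the script is known to do.

```python
import json
import os
import tempfile

STATE_FILE = ".imapdown_state.json"  # file name from the description above


def load_state(account_dir: str) -> dict:
    """Return {folder_name: highest_uid}; empty when no state exists yet."""
    path = os.path.join(account_dir, STATE_FILE)
    try:
        with open(path, "r", encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        return {}


def save_state(account_dir: str, state: dict) -> None:
    """Write to a temporary file first, then rename it over the old state."""
    path = os.path.join(account_dir, STATE_FILE)
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump(state, fh, indent=2, sort_keys=True)
    os.replace(tmp_path, path)


# Round trip in a throwaway directory.
with tempfile.TemporaryDirectory() as demo_dir:
    save_state(demo_dir, {"INBOX": 19334, "INBOX.Archive": 1770, "Sent": 892})
    print(load_state(demo_dir).get("INBOX", 0))
```

A folder that is missing from the state can be treated as last UID 0, so that the first incremental run downloads everything in it.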
Download Flow
- Parse arguments
- Connect to IMAP server (SSL, STARTTLS, or plain)
- List all folders and decode modified UTF-7 folder names
- For each folder:
- Load last downloaded UID from state file (if incremental mode)
- Search for new messages (UID > last_uid)
- Download each message as RFC822
- Save as .eml file with naming: {UID}_{date}_{subject}.eml
- Extract attachments into .zip file (same base name)
- Update state with highest UID
- Save state file
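The naming step in the flow above can be illustrated with the standard email package. This is a sketch under assumptions: the eml_filename helper is hypothetical, the YYYYMMDD date format is inferred from the names shown under Output Structure, and the placeholder values for a missing date or subject are invented for the example.

```python
import re
from email import message_from_bytes
from email.header import decode_header, make_header
from email.utils import parsedate_to_datetime


def eml_filename(uid: int, raw_message: bytes) -> str:
    """Build '{UID}_{date}_{subject}.eml' from a raw RFC822 message."""
    msg = message_from_bytes(raw_message)

    # Date header -> YYYYMMDD; assumed placeholder when missing or unparsable.
    try:
        date = parsedate_to_datetime(msg["Date"]).strftime("%Y%m%d")
    except (TypeError, ValueError):
        date = "00000000"

    # Decode RFC 2047 encoded words such as '=?utf-8?q?...?='.
    subject = str(make_header(decode_header(msg.get("Subject", ""))))
    # Collapse anything that is not a word character, dot or dash into '_'.
    subject = re.sub(r"[^\w.-]+", "_", subject).strip("_")[:50] or "no_subject"

    return f"{uid}_{date}_{subject}.eml"


raw = b"Date: Mon, 15 Jan 2024 10:00:00 +0000\r\nSubject: Meeting notes\r\n\r\nBody\r\n"
print(eml_filename(123, raw))  # 123_20240115_Meeting_notes.eml
```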
Key Implementation Details
Modified UTF-7 Decoding: IMAP folder names use modified UTF-7 encoding (see decode_modified_utf7() at line 39). This is not standard base64 - it uses , instead of / and omits padding, & starts an encoded run, - ends it, and &- stands for a literal &.
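For illustration, a decoder for this encoding can be written in a few lines. The sketch below follows the rules in RFC 3501 section 5.1.3 and reuses the function name mentioned above, but it is not the repository's implementation; it assumes the folder name has already been converted from bytes to str.

```python
import base64


def decode_modified_utf7(name: str) -> str:
    """Decode an IMAP modified UTF-7 folder name into a normal string."""
    out = []
    i = 0
    while i < len(name):
        ch = name[i]
        if ch != "&":
            out.append(ch)  # printable ASCII passes through unchanged
            i += 1
            continue
        end = name.find("-", i)
        if end == -1:
            out.append(name[i:])  # malformed input: keep the rest verbatim
            break
        chunk = name[i + 1:end]
        if not chunk:
            out.append("&")  # "&-" is an escaped ampersand
        else:
            b64 = chunk.replace(",", "/")   # modified alphabet uses ',' for '/'
            b64 += "=" * (-len(b64) % 4)    # restore the omitted padding
            out.append(base64.b64decode(b64).decode("utf-16-be"))
        i = end + 1
    return "".join(out)


print(decode_modified_utf7("Entw&APw-rfe"))  # Entwürfe
print(decode_modified_utf7("Tom &- Jerry"))  # Tom & Jerry
```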
Filename Sanitization: Two-stage process:
- sanitize_filename(): Removes invalid filesystem characters, max 50 chars for subjects
- sanitize_folder_path(): Converts IMAP folder separators (. or /) to OS path separators
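A rough sketch of the two stages follows. The function names match the ones listed above, but the bodies are assumptions: in particular, this version replaces invalid characters and whitespace with underscores rather than deleting them, and the "untitled" fallback and the 255-character limit for folder parts are invented for the example.

```python
import os
import re

# Characters rejected by common filesystems, plus control characters and whitespace.
_INVALID = re.compile(r'[<>:"/\\|?*\x00-\x1f\s]+')


def sanitize_filename(name: str, max_len: int = 50) -> str:
    """Make a string safe to use as a single file or directory name."""
    cleaned = _INVALID.sub("_", name).strip("._")
    return cleaned[:max_len] or "untitled"


def sanitize_folder_path(folder: str) -> str:
    """Turn an IMAP folder name such as 'INBOX.Archive' into a relative OS path."""
    parts = [sanitize_filename(p, max_len=255) for p in re.split(r"[./]", folder) if p]
    return os.path.join(*parts) if parts else "untitled"


print(sanitize_filename("RE: Question"))      # RE_Question
print(sanitize_folder_path("INBOX.Archive"))  # INBOX/Archive on POSIX systems
```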
UID-Based Incremental Updates: Uses IMAP UIDs (not sequence numbers) because UIDs are persistent. The search UID {last_uid + 1}:* fetches only new messages. Because * stands for the highest UID in the mailbox, servers return that message even when {last_uid + 1} is higher than any existing UID, so additional filtering at line 251 drops any UID that is not greater than last_uid.
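The search and the extra filter could be sketched as below. The new_uids helper is hypothetical; conn is assumed to be an imaplib connection with a folder already selected, and StubConnection is a stand-in so the example runs without a server.

```python
def new_uids(conn, last_uid: int) -> list:
    """Return UIDs greater than last_uid in the currently selected folder."""
    status, data = conn.uid("SEARCH", None, f"UID {last_uid + 1}:*")
    if status != "OK" or not data or not data[0]:
        return []
    uids = [int(token) for token in data[0].split()]
    # The range n:* always matches the message with the highest UID,
    # so anything at or below last_uid has to be dropped here.
    return sorted(uid for uid in uids if uid > last_uid)


class StubConnection:
    """Stand-in for imaplib.IMAP4 that returns a canned SEARCH response."""

    def __init__(self, payload: bytes):
        self.payload = payload

    def uid(self, command, *args):
        return "OK", [self.payload]


print(new_uids(StubConnection(b"19334"), 19334))        # []
print(new_uids(StubConnection(b"19335 19340"), 19334))  # [19335, 19340]
```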
Full Mode Safety: --full mode checks if the download folder already contains .eml files and refuses to run (line 325). This prevents accidental duplicates. Users must delete the folder first.
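A guard of this kind might be sketched as follows. The helper names are hypothetical, and searching subfolders recursively is an assumption of this example; the script's own check at line 325 may differ.

```python
import os
import tempfile


def contains_eml_files(account_dir: str) -> bool:
    """Return True if any .eml file exists anywhere below account_dir."""
    for _root, _dirs, files in os.walk(account_dir):
        if any(name.lower().endswith(".eml") for name in files):
            return True
    return False  # also reached when the directory does not exist yet


def check_full_mode(account_dir: str) -> None:
    """Abort a --full run instead of writing duplicates into existing data."""
    if contains_eml_files(account_dir):
        raise SystemExit(
            f"Refusing --full: {account_dir} already contains .eml files; "
            "delete the folder first."
        )


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as demo_dir:
    check_full_mode(demo_dir)  # empty folder: passes silently
    os.makedirs(os.path.join(demo_dir, "INBOX"))
    open(os.path.join(demo_dir, "INBOX", "1_20240115_Hello.eml"), "w").close()
    print(contains_eml_files(demo_dir))  # True
```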
Attachment Handling:
- Walks message parts looking for Content-Disposition: attachment or inline
- Handles duplicate attachment filenames by appending _{counter}
- All attachments for one email go into a single .zip file
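The attachment steps above can be sketched with the standard email and zipfile modules. The attachments_to_zip helper is hypothetical, and two details are assumptions: the counter is inserted before the file extension, and parts without a filename get the placeholder name attachment.bin.

```python
import io
import os
import zipfile
from email import message_from_bytes
from email.message import EmailMessage


def attachments_to_zip(raw_message: bytes) -> bytes:
    """Return zip archive bytes holding all attachments, or b'' if there are none."""
    msg = message_from_bytes(raw_message)
    buffer = io.BytesIO()
    used = set()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for part in msg.walk():
            if part.get_content_disposition() not in ("attachment", "inline"):
                continue
            payload = part.get_payload(decode=True)
            if payload is None:  # multipart containers carry no payload
                continue
            # Drop any directory components a sender may have put in the name.
            name = os.path.basename(part.get_filename() or "") or "attachment.bin"
            base, ext = os.path.splitext(name)
            candidate, counter = name, 1
            while candidate in used:  # duplicate name: append _{counter}
                candidate = f"{base}_{counter}{ext}"
                counter += 1
            used.add(candidate)
            archive.writestr(candidate, payload)
    return buffer.getvalue() if used else b""


# Build a small message with two attachments that share a file name.
demo = EmailMessage()
demo["Subject"] = "Report"
demo.set_content("See attached.")
for body in (b"first", b"second"):
    demo.add_attachment(body, maintype="application",
                        subtype="octet-stream", filename="data.bin")

archive_bytes = attachments_to_zip(demo.as_bytes())
print(zipfile.ZipFile(io.BytesIO(archive_bytes)).namelist())  # ['data.bin', 'data_1.bin']
```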
Output Structure
{output_dir}/                              # default: ./download
└── {email_address}/                       # sanitized email address
    ├── .imapdown_state.json
    ├── INBOX/
    │   ├── 123_20240115_Meeting_notes.eml
    │   └── 124_20240116_Report.zip
    └── Sent/
        └── 456_20240114_RE_Question.eml
Testing
No formal test suite exists. Manual testing approach:
- Use --limit 10 to download a small batch for verification
- Test SSL vs STARTTLS connections
- Test incremental mode by running twice
- Verify .eml files open correctly in email clients
- Check that folders with special characters (non-ASCII) are handled correctly