Single-file Python script to download emails from IMAP servers:

- Downloads emails as .eml files preserving folder structure
- Extracts attachments to zip files
- Supports SSL and STARTTLS connections
- Incremental updates using UID tracking (default behavior)
- Multi-account support with separate folders per email
- Safety checks to prevent duplicate downloads

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

# Implementation Plan: IMAP Downloader

## Overview

Create a single-file Python script (`imapdown.py`) that downloads all emails from an IMAP server and saves them as individual EML files in a local folder structure mirroring the IMAP mailbox hierarchy.

## Implementation Steps

### 1. Argument Parsing

Use `argparse` to handle command-line arguments:

**Mandatory arguments:**
- `--server` - IMAP server hostname
- `--email` - Email address
- `--user` - Username for authentication
- `--password` - Password for authentication

**Optional arguments:**
- `--ssl` - Use implicit SSL/TLS (typically port 993)
- `--starttls` - Use STARTTLS upgrade (typically port 143)
- `--port` - Custom port (defaults: 993 for SSL, 143 for STARTTLS/plain)

Add mutual exclusion for `--ssl` and `--starttls` (for example with an `argparse` mutually exclusive group).
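
The argument list above could be wired up roughly as follows. This is a minimal sketch, not the script's actual code: the function names `build_parser` and `parse_args` and the help texts are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the arguments listed in the plan."""
    parser = argparse.ArgumentParser(
        prog="imapdown.py",
        description="Download all emails from an IMAP server as EML files.",
    )
    # Mandatory arguments.
    parser.add_argument("--server", required=True, help="IMAP server hostname")
    parser.add_argument("--email", required=True, help="Email address")
    parser.add_argument("--user", required=True, help="Username for authentication")
    parser.add_argument("--password", required=True, help="Password for authentication")

    # --ssl and --starttls are mutually exclusive.
    security = parser.add_mutually_exclusive_group()
    security.add_argument("--ssl", action="store_true", help="Use implicit SSL/TLS")
    security.add_argument("--starttls", action="store_true", help="Use STARTTLS upgrade")

    # The port default depends on the security mode, so it is resolved after parsing.
    parser.add_argument("--port", type=int, default=None, help="Custom port")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    """Parse argv and fill in the default port (993 for SSL, otherwise 143)."""
    args = build_parser().parse_args(argv)
    if args.port is None:
        args.port = 993 if args.ssl else 143
    return args
```

Passing both `--ssl` and `--starttls` makes `argparse` print a usage error and exit, so no extra validation code is needed for that case.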

### 2. IMAP Connection

- Use Python's built-in `imaplib` module
- Connection logic:
  - If `--ssl`: Use `IMAP4_SSL` (default port 993)
  - If `--starttls`: Use `IMAP4`, then call `starttls()` (default port 143)
  - If neither: Use plain `IMAP4` (default port 143)
- Authenticate with provided credentials
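
The connection logic above might look like the following sketch. The function name `connect` and its parameters are hypothetical; the `imaplib` and `ssl` calls are standard library APIs. Using `ssl.create_default_context()` for certificate verification is an assumption, since the plan does not say how TLS should be configured.

```python
import imaplib
import ssl


def connect(server, user, password, use_ssl=False, use_starttls=False, port=None):
    """Open an IMAP connection in the requested mode and log in."""
    if port is None:
        port = 993 if use_ssl else 143

    if use_ssl:
        # Implicit TLS: the socket is encrypted from the first byte.
        conn = imaplib.IMAP4_SSL(server, port, ssl_context=ssl.create_default_context())
    else:
        conn = imaplib.IMAP4(server, port)
        if use_starttls:
            # Upgrade the plain connection to TLS before sending credentials.
            conn.starttls(ssl_context=ssl.create_default_context())

    conn.login(user, password)
    return conn
```

With neither flag set, the credentials travel unencrypted, so the plain mode is probably only appropriate for local or test servers.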

### 3. Folder Discovery

- Use `list()` method to get all mailbox folders
- Parse folder names and hierarchy delimiter
- Handle folder name encoding (IMAP uses modified UTF-7)
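
Parsing a `LIST` response line and decoding modified UTF-7 could be sketched as below. The names `Mailbox`, `decode_mutf7`, and `parse_list_line` are hypothetical, and the regular expression assumes the common single-line response form (flags, quoted delimiter or `NIL`, then the name); servers that send the name as an IMAP literal are not handled here.

```python
import base64
import re
from typing import List, NamedTuple, Optional

LIST_RE = re.compile(
    r'\((?P<flags>[^)]*)\)\s+(?P<delim>"(?:\\.|[^"\\])*"|NIL)\s+(?P<name>.*)'
)


class Mailbox(NamedTuple):
    flags: List[str]
    delimiter: Optional[str]
    raw_name: str  # name as sent by the server; use this for select()
    name: str      # decoded name; use this for local directories


def decode_mutf7(name: str) -> str:
    """Decode IMAP modified UTF-7 (RFC 3501, section 5.1.3)."""
    out, i = [], 0
    while i < len(name):
        if name[i] != "&":
            out.append(name[i])
            i += 1
            continue
        end = name.find("-", i)
        if end == -1:  # malformed shift sequence: keep the rest verbatim
            out.append(name[i:])
            break
        chunk = name[i + 1:end]
        if not chunk:
            out.append("&")  # "&-" encodes a literal ampersand
        else:
            b64 = chunk.replace(",", "/")  # modified alphabet uses "," for "/"
            b64 += "=" * (-len(b64) % 4)   # restore the stripped padding
            out.append(base64.b64decode(b64).decode("utf-16-be"))
        i = end + 1
    return "".join(out)


def _unquote(token: str) -> str:
    """Strip surrounding double quotes and undo backslash escapes."""
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return re.sub(r"\\(.)", r"\1", token[1:-1])
    return token


def parse_list_line(line: bytes) -> Optional[Mailbox]:
    """Parse one entry of the data returned by IMAP4.list()."""
    match = LIST_RE.match(line.decode("ascii", errors="replace"))
    if match is None:
        return None
    delim = None if match["delim"] == "NIL" else _unquote(match["delim"])
    raw_name = _unquote(match["name"].strip())
    return Mailbox(match["flags"].split(), delim, raw_name, decode_mutf7(raw_name))
```

Keeping both the raw and the decoded name is a design choice in this sketch: the server expects the encoded form in later commands, while the decoded form reads better on disk.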

### 4. Email Download

For each folder:
1. Create corresponding local directory structure
2. Select the folder with `select()`
3. Search for all messages with `search(None, 'ALL')` (this returns message sequence numbers; use `uid('SEARCH', None, 'ALL')` if UIDs are needed for filenames)
4. For each message:
   - Fetch the complete RFC822 message
   - Generate a unique filename (using UID or message ID + date)
   - Save as `.eml` file
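
The per-folder loop above could be sketched as follows. The names `download_folder` and `quote_mailbox` are hypothetical. Two choices here are assumptions rather than part of the plan: the sketch uses the UID variants of `SEARCH` and `FETCH` so the identifiers stay stable across sessions, and it selects the folder read-only so that fetching does not mark messages as seen. Files are named by UID alone to keep the example short.

```python
from pathlib import Path


def quote_mailbox(name: str) -> str:
    """Quote a mailbox name so names containing spaces survive as one argument."""
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def download_folder(conn, folder: str, dest_dir: Path) -> int:
    """Save every message in `folder` as <UID>.eml under dest_dir; return the count."""
    dest_dir.mkdir(parents=True, exist_ok=True)

    status, _ = conn.select(quote_mailbox(folder), readonly=True)
    if status != "OK":
        return 0

    status, data = conn.uid("SEARCH", None, "ALL")
    if status != "OK" or not data or not data[0]:
        return 0  # search failed or the folder is empty

    saved = 0
    for uid in data[0].split():
        status, parts = conn.uid("FETCH", uid, "(RFC822)")
        if status != "OK":
            continue
        # The message body is the second element of the first tuple in the response.
        raw = next((p[1] for p in parts if isinstance(p, tuple)), None)
        if raw is None:
            continue
        (dest_dir / f"{uid.decode()}.eml").write_bytes(raw)
        saved += 1
    return saved
```

Here `conn` is expected to be a logged-in `imaplib.IMAP4` (or `IMAP4_SSL`) object, and `folder` the mailbox name exactly as the server reported it.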

### 5. File Naming Strategy

Use a naming scheme that ensures uniqueness and provides useful info:
- Format: `{UID}_{date}_{subject_snippet}.eml`
- Sanitize subject for filesystem safety
- Handle duplicates by appending counter if needed
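
One way to implement this naming scheme is sketched below. The helper names (`sanitize`, `build_filename`, `unique_path`), the 40-character subject limit, and the `00000000` / `no_subject` fallbacks are all assumptions chosen for illustration; the plan itself does not specify them.

```python
import re
from email import message_from_bytes
from email.header import decode_header, make_header
from email.utils import parsedate_to_datetime
from pathlib import Path


def sanitize(text: str, max_len: int = 40) -> str:
    """Replace runs of characters other than letters, digits, '_' and '-' with '_'."""
    cleaned = re.sub(r"[^\w\-]+", "_", text).strip("_")
    return cleaned[:max_len] or "no_subject"


def build_filename(uid, raw: bytes) -> str:
    """Build '{UID}_{date}_{subject_snippet}.eml' from the raw message bytes."""
    msg = message_from_bytes(raw)
    try:
        date = parsedate_to_datetime(str(msg["Date"])).strftime("%Y%m%d")
    except (TypeError, ValueError, IndexError):
        date = "00000000"  # missing or unparseable Date header
    try:
        # Decode RFC 2047 encoded words such as =?utf-8?q?...?=
        subject = str(make_header(decode_header(msg["Subject"] or "")))
    except (LookupError, UnicodeError, ValueError):
        subject = str(msg["Subject"] or "")
    return f"{uid}_{date}_{sanitize(subject)}.eml"


def unique_path(directory: Path, filename: str) -> Path:
    """Append a numeric counter until the path does not exist yet."""
    candidate = directory / filename
    stem, suffix = candidate.stem, candidate.suffix
    counter = 1
    while candidate.exists():
        candidate = directory / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate
```

Because the UID already leads the filename, collisions within one folder should be rare; the counter mainly guards against reruns or a changed UID validity on the server.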

### 6. Error Handling

Handle the following failure cases:
- Connection failures
- Authentication errors
- Folder access issues
- Invalid/corrupt messages
- Filesystem errors (permissions, disk space)
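
As a rough illustration of how these cases map onto standard library exceptions, the sketch below sorts an exception into a coarse category. The function name `classify_error` and the category strings are hypothetical. Note that `imaplib` raises `IMAP4.error` both for failed logins and for rejected folder commands, so telling those apart would require knowing which call was in progress.

```python
import imaplib
import socket
import ssl


def classify_error(exc: BaseException) -> str:
    """Map an exception to a coarse category for logging or exit codes."""
    if isinstance(exc, imaplib.IMAP4.error):
        # Covers failed login, rejected SELECT/FETCH, and aborted sessions.
        return "imap"
    if isinstance(exc, ssl.SSLError):
        return "tls"
    if isinstance(exc, (socket.gaierror, ConnectionError, TimeoutError)):
        # DNS lookup failure, refused or reset connection, timeout.
        return "connection"
    if isinstance(exc, OSError):
        # Remaining OS-level errors, e.g. permission denied or disk full.
        return "filesystem"
    return "unknown"
```

The order of the checks matters: `ssl.SSLError` and `socket.gaierror` are both subclasses of `OSError`, so they have to be tested before the generic filesystem branch. A caller might wrap each folder or message in `try`/`except Exception` and log the category, so that one bad message does not abort the whole run.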

## Dependencies

Only Python standard library:
- `imaplib` - IMAP protocol
- `argparse` - Command line parsing
- `email` - Email message parsing
- `os` / `pathlib` - Filesystem operations
- `re` - Regex for sanitization
- `datetime` - Date handling

## Output Structure

```
./download/
├── INBOX/
│   ├── 1_20240115_Meeting_notes.eml
│   └── 2_20240116_Project_update.eml
├── Sent/
│   └── 1_20240114_RE_Question.eml
└── Archive/
    └── 2023/
        └── 1_20230501_Old_email.eml
```
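
Mapping an IMAP folder name onto this directory layout could be done along the following lines. This is a sketch under assumptions: the function name `folder_to_path` is hypothetical, and the set of characters replaced with `_` is a conservative guess at what is unsafe across common filesystems.

```python
import re
from pathlib import Path
from typing import Optional


def folder_to_path(root: Path, folder: str, delimiter: Optional[str]) -> Path:
    """Turn a decoded IMAP folder name into a directory path below root."""
    # Split on the server's hierarchy delimiter; a missing delimiter means a flat name.
    parts = folder.split(delimiter) if delimiter else [folder]
    safe_parts = []
    for part in parts:
        if not part:
            continue
        # Replace characters that are invalid or risky in file names.
        cleaned = re.sub(r'[<>:"/\\|?*\x00-\x1f]', "_", part)
        # Stripping dots and spaces also neutralizes "." and ".." components.
        cleaned = cleaned.strip(". ") or "_"
        safe_parts.append(cleaned)
    return root.joinpath(*safe_parts)
```

For example, with `.` as the delimiter the folder `INBOX.Sent` would map to `download/INBOX/Sent`, and a component such as `..` is replaced by `_` so a folder name cannot escape the download root.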

## Testing Approach

1. Test argument parsing with various combinations
2. Test connection with SSL, STARTTLS, and plain
3. Test with folders containing special characters
4. Test with empty folders
5. Verify EML files are valid and openable