Initial commit: IMAP email downloader
Single-file Python script to download emails from IMAP servers:

- Downloads emails as .eml files preserving folder structure
- Extracts attachments to zip files
- Supports SSL and STARTTLS connections
- Incremental updates using UID tracking (default behavior)
- Multi-account support with separate folders per email
- Safety checks to prevent duplicate downloads

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@@ -0,0 +1,97 @@
# Implementation Plan: IMAP Downloader

## Overview

Create a single-file Python script (`imapdown.py`) that downloads all emails from an IMAP server and saves them as individual EML files in a local folder structure mirroring the IMAP mailbox hierarchy.

## Implementation Steps

### 1. Argument Parsing

Use `argparse` to handle command line arguments:

**Mandatory arguments:**
- `--server` - IMAP server hostname
- `--email` - Email address
- `--user` - Username for authentication
- `--password` - Password for authentication

**Optional arguments:**
- `--ssl` - Use implicit SSL/TLS (typically port 993)
- `--starttls` - Use STARTTLS upgrade (typically port 143)
- `--port` - Custom port (defaults: 993 for SSL, 143 for STARTTLS/plain)

Add mutual exclusion for `--ssl` and `--starttls`.
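
The argument set above could be wired up roughly as follows. This is a minimal sketch, not the script's actual code: the function names `build_parser` and `resolve_port` are hypothetical, and the help strings are copied from the plan.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the CLI parser described in the plan (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Download emails from an IMAP server as .eml files"
    )
    # Mandatory arguments
    parser.add_argument("--server", required=True, help="IMAP server hostname")
    parser.add_argument("--email", required=True, help="Email address")
    parser.add_argument("--user", required=True, help="Username for authentication")
    parser.add_argument("--password", required=True, help="Password for authentication")

    # --ssl and --starttls cannot be combined
    security = parser.add_mutually_exclusive_group()
    security.add_argument("--ssl", action="store_true", help="Use implicit SSL/TLS")
    security.add_argument("--starttls", action="store_true", help="Use STARTTLS upgrade")

    parser.add_argument("--port", type=int, default=None, help="Custom port")
    return parser


def resolve_port(args: argparse.Namespace) -> int:
    """Apply the plan's defaults: 993 for SSL, 143 for STARTTLS/plain."""
    if args.port is not None:
        return args.port
    return 993 if args.ssl else 143
```

Leaving `--port` at `None` and resolving it afterwards keeps the default dependent on which security flag was chosen.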

### 2. IMAP Connection

- Use Python's built-in `imaplib` module
- Connection logic:
  - If `--ssl`: Use `IMAP4_SSL` (default port 993)
  - If `--starttls`: Use `IMAP4`, then call `starttls()` (default port 143)
  - If neither: Use plain `IMAP4` (default port 143)
- Authenticate with provided credentials
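
One way the connection logic above might look, as a sketch only. The function name `connect` and its keyword parameters are assumptions made for illustration; the `imaplib` calls (`IMAP4_SSL`, `IMAP4`, `starttls()`, `login()`) are the standard library's.

```python
import imaplib


def connect(server, user, password, use_ssl=False, use_starttls=False, port=None):
    """Open and authenticate an IMAP connection (hypothetical helper)."""
    if port is None:
        port = 993 if use_ssl else 143

    if use_ssl:
        # TLS is negotiated before any IMAP traffic is exchanged.
        conn = imaplib.IMAP4_SSL(server, port)
    else:
        conn = imaplib.IMAP4(server, port)
        if use_starttls:
            # Upgrade the plain connection to TLS before sending credentials.
            conn.starttls()

    conn.login(user, password)
    return conn
```

With neither flag set, the credentials travel unencrypted, so the plain mode is probably best reserved for local test servers.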

### 3. Folder Discovery

- Use `list()` method to get all mailbox folders
- Parse folder names and hierarchy delimiter
- Handle folder name encoding (IMAP uses modified UTF-7)
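
Python's standard library has no codec for IMAP's modified UTF-7, so the decoding has to be done by hand. The sketch below shows one possible approach; `decode_modified_utf7` and `parse_list_line` are hypothetical names, and the parser assumes each `LIST` response item is a single bytes line of the form `(flags) "delimiter" name` (servers that send the name as a literal are not handled here).

```python
import base64
import re

# (flags) "delimiter"|NIL name
_LIST_RE = re.compile(rb'\((?P<flags>.*?)\) (?P<delim>"(?:\\.|[^"\\])*"|NIL) (?P<name>.+)')


def decode_modified_utf7(name: str) -> str:
    """Decode IMAP modified UTF-7: '&-' is a literal '&', '&...-' is base64 UTF-16BE."""
    def _decode(match):
        chunk = match.group(1)
        if chunk == "":
            return "&"
        b64 = chunk.replace(",", "/")      # modified base64 uses ',' instead of '/'
        b64 += "=" * (-len(b64) % 4)       # restore the stripped padding
        return base64.b64decode(b64).decode("utf-16-be")
    return re.sub(r"&([^-]*)-", _decode, name)


def _unquote(text: str) -> str:
    """Strip surrounding double quotes and backslash escapes, if present."""
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        return re.sub(r"\\(.)", r"\1", text[1:-1])
    return text


def parse_list_line(line: bytes):
    """Return (flags, delimiter, decoded_name) for one LIST response line."""
    match = _LIST_RE.match(line)
    if match is None:
        raise ValueError(f"unrecognized LIST response: {line!r}")
    flags = match.group("flags").decode("ascii").split()
    delim_raw = match.group("delim")
    delimiter = None if delim_raw == b"NIL" else _unquote(delim_raw.decode("ascii"))
    name = _unquote(match.group("name").decode("ascii"))
    return flags, delimiter, decode_modified_utf7(name)
```

The decoded name is suitable for local directory names; the raw, still-encoded name is what would be passed back to the server in later commands.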

### 4. Email Download

For each folder:
1. Create corresponding local directory structure
2. Select the folder with `select()`
3. Search for all messages with `search(None, 'ALL')`. Note that `search()` returns message sequence numbers, not UIDs; use `uid('SEARCH', None, 'ALL')` when filenames or incremental tracking are keyed on UIDs.
4. For each message:
   - Fetch the complete RFC822 message
   - Generate a unique filename (using UID or message ID + date)
   - Save as `.eml` file
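
The per-folder loop above could be sketched like this. It is an illustration under assumptions: `download_folder` is a hypothetical name, it works with UIDs rather than sequence numbers, and it writes plain `<UID>.eml` files, leaving the richer naming scheme to the next step. `conn` is expected to behave like an authenticated `imaplib.IMAP4` object.

```python
from pathlib import Path


def download_folder(conn, folder: str, dest_dir: Path) -> list[Path]:
    """Save every message in `folder` as <UID>.eml under dest_dir (sketch)."""
    dest_dir.mkdir(parents=True, exist_ok=True)

    # readonly=True avoids changing message flags while fetching.
    typ, _ = conn.select(folder, readonly=True)
    if typ != "OK":
        raise RuntimeError(f"cannot select folder {folder!r}")

    typ, data = conn.uid("SEARCH", None, "ALL")
    if typ != "OK":
        raise RuntimeError(f"search failed in folder {folder!r}")

    saved = []
    for uid in data[0].split():
        uid_text = uid.decode("ascii")
        typ, msg_data = conn.uid("FETCH", uid_text, "(RFC822)")
        # A successful fetch yields a (header, payload) tuple as the first item.
        if typ != "OK" or not msg_data or not isinstance(msg_data[0], tuple):
            continue  # skip messages the server did not return
        path = dest_dir / f"{uid_text}.eml"
        path.write_bytes(msg_data[0][1])  # raw RFC822 bytes, written unchanged
        saved.append(path)
    return saved
```

Folder names containing spaces or special characters may need to be quoted before being passed to `select()`.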

### 5. File Naming Strategy

Use a naming scheme that ensures uniqueness and provides useful info:
- Format: `{UID}_{date}_{subject_snippet}.eml`
- Sanitize subject for filesystem safety
- Handle duplicates by appending counter if needed
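
A possible implementation of this naming scheme is sketched below. The helper names (`make_filename`, `unique_path`), the 40-character subject limit, and the fallback values for missing headers are all assumptions chosen for illustration.

```python
import re
from email import message_from_bytes
from email.header import decode_header, make_header
from email.utils import parsedate_to_datetime
from pathlib import Path


def make_filename(uid, raw_message: bytes, max_subject: int = 40) -> str:
    """Build '{UID}_{date}_{subject_snippet}.eml' from the raw message (sketch)."""
    msg = message_from_bytes(raw_message)

    try:
        date = parsedate_to_datetime(msg["Date"]).strftime("%Y%m%d")
    except (TypeError, ValueError):
        date = "00000000"  # assumed placeholder for a missing or unparsable Date

    # Decode RFC 2047 encoded words such as '=?utf-8?q?...?='
    subject = str(make_header(decode_header(msg.get("Subject", ""))))
    # Keep only ASCII letters and digits; collapse everything else to '_'
    snippet = re.sub(r"[^A-Za-z0-9]+", "_", subject).strip("_")[:max_subject]
    return f"{uid}_{date}_{snippet or 'no_subject'}.eml"


def unique_path(directory: Path, filename: str) -> Path:
    """Append _1, _2, ... before the suffix until the path does not exist."""
    candidate = directory / filename
    stem, suffix = candidate.stem, candidate.suffix
    counter = 1
    while candidate.exists():
        candidate = directory / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate
```

Because UIDs are unique within a folder, the counter should rarely be needed; it mainly guards against reruns into an existing directory.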

### 6. Error Handling

Handle and report the following failure cases:
- Connection failures
- Authentication errors
- Folder access issues
- Invalid/corrupt messages
- Filesystem errors (permissions, disk space)
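
As a rough illustration of how these cases could be told apart, the sketch below maps exceptions to coarse categories. The function name `classify_error` and the category labels are hypothetical; the exception classes are the standard library's. In `imaplib`, a failed login or an inaccessible folder typically surfaces as `IMAP4.error`, while a dropped connection raises `IMAP4.abort`.

```python
import imaplib
import socket
import ssl


def classify_error(exc: BaseException) -> str:
    """Map an exception to a coarse category for reporting (sketch)."""
    # IMAP4.abort subclasses IMAP4.error, so it must be checked first.
    if isinstance(exc, imaplib.IMAP4.abort):
        return "connection"
    if isinstance(exc, imaplib.IMAP4.error):
        return "imap"  # e.g. authentication failure or folder access problem
    if isinstance(exc, (socket.gaierror, socket.timeout, ConnectionError,
                        TimeoutError, ssl.SSLError)):
        return "connection"
    if isinstance(exc, OSError):
        return "filesystem"  # permissions, disk space, invalid paths
    return "unexpected"
```

A caller might catch `Exception` around each folder, log the category, and continue with the next folder for anything other than `"connection"`.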

## Dependencies

Only Python standard library:
- `imaplib` - IMAP protocol
- `argparse` - Command line parsing
- `email` - Email message parsing
- `os` / `pathlib` - Filesystem operations
- `re` - Regex for sanitization
- `datetime` - Date handling

## Output Structure

```
./download/
├── INBOX/
│   ├── 1_20240115_Meeting_notes.eml
│   └── 2_20240116_Project_update.eml
├── Sent/
│   └── 1_20240114_RE_Question.eml
└── Archive/
    └── 2023/
        └── 1_20230501_Old_email.eml
```
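
Mapping a decoded IMAP folder name onto this layout means splitting on the server's hierarchy delimiter and making each component safe for the filesystem. The helper below is a sketch; `folder_to_path` and its replacement rules are assumptions, not part of the plan.

```python
import re
from pathlib import Path


def folder_to_path(root, folder: str, delimiter) -> Path:
    """Turn an IMAP folder name into a local directory path (sketch)."""
    parts = folder.split(delimiter) if delimiter else [folder]
    safe_parts = []
    for part in parts:
        # Replace characters that are invalid on common filesystems.
        cleaned = re.sub(r'[<>:"/\\|?*\x00-\x1f]', "_", part).strip(" .")
        # An empty result (e.g. from '..') becomes '_' to avoid path traversal.
        safe_parts.append(cleaned or "_")
    return Path(root).joinpath(*safe_parts)
```

For example, with a `.` delimiter the folder `Archive.2023` would land in `download/Archive/2023`.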

## Testing Approach

1. Test argument parsing with various combinations
2. Test connection with SSL, STARTTLS, and plain
3. Test with folders containing special characters
4. Test with empty folders
5. Verify EML files are valid and openable
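
For the last item, a lightweight validity check could parse each saved file with the standard `email` package. This is only a sketch: `is_valid_eml` is a hypothetical helper, and "valid" is taken here to mean that at least one header was parsed and the parser recorded no top-level defects.

```python
from email import policy
from email.parser import BytesParser


def is_valid_eml(raw: bytes) -> bool:
    """Return True if the bytes parse as a message with headers and no defects."""
    msg = BytesParser(policy=policy.default).parsebytes(raw)
    return bool(msg.keys()) and not msg.defects
```

A test run might call this on every `.eml` file under the download directory and report the paths that fail.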