steve a0e7a494e4 Initial commit: IMAP email downloader
Single-file Python script to download emails from IMAP servers:
- Downloads emails as .eml files preserving folder structure
- Extracts attachments to zip files
- Supports SSL and STARTTLS connections
- Incremental updates using UID tracking (default behavior)
- Multi-account support with separate folders per email
- Safety checks to prevent duplicate downloads

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 17:37:11 +00:00


Implementation Plan: IMAP Downloader

Overview

Create a single-file Python script (imapdown.py) that downloads all emails from an IMAP server and saves them as individual EML files in a local folder structure mirroring the IMAP mailbox hierarchy.

Implementation Steps

1. Argument Parsing

Use argparse to handle command line arguments:

Mandatory arguments:

  • --server - IMAP server hostname
  • --email - Email address
  • --user - Username for authentication
  • --password - Password for authentication

Optional arguments:

  • --ssl - Use implicit SSL/TLS (typically port 993)
  • --starttls - Use STARTTLS upgrade (typically port 143)
  • --port - Custom port (defaults: 993 for SSL, 143 for STARTTLS/plain)

Make --ssl and --starttls mutually exclusive, so that passing both is rejected at parse time.
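
The argument handling described above could be sketched roughly as follows. The helper names build_parser and resolve_port are hypothetical and not taken from imapdown.py; the flags and port defaults are the ones listed in this section.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the command line parser (hypothetical helper name)."""
    parser = argparse.ArgumentParser(
        description="Download emails from an IMAP server as .eml files"
    )
    # Mandatory arguments
    parser.add_argument("--server", required=True, help="IMAP server hostname")
    parser.add_argument("--email", required=True, help="Email address")
    parser.add_argument("--user", required=True, help="Username for authentication")
    parser.add_argument("--password", required=True, help="Password for authentication")

    # --ssl and --starttls may not be combined
    security = parser.add_mutually_exclusive_group()
    security.add_argument("--ssl", action="store_true",
                          help="Use implicit SSL/TLS (typically port 993)")
    security.add_argument("--starttls", action="store_true",
                          help="Use STARTTLS upgrade (typically port 143)")

    parser.add_argument("--port", type=int, default=None, help="Custom port")
    return parser


def resolve_port(args: argparse.Namespace) -> int:
    """Apply the documented defaults: 993 for SSL, 143 for STARTTLS/plain."""
    if args.port is not None:
        return args.port
    return 993 if args.ssl else 143
```

Leaving --port at None and resolving it afterwards keeps the default dependent on which security flag was given.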

2. IMAP Connection

  • Use Python's built-in imaplib module
  • Connection logic:
    • If --ssl: Use IMAP4_SSL (default port 993)
    • If --starttls: Use IMAP4, then call starttls() (default port 143)
    • If neither: Use plain IMAP4 (default port 143)
  • Authenticate with provided credentials
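
A minimal sketch of this connection logic, assuming a hypothetical connect() helper whose parameter names are illustrative only. It uses imaplib's IMAP4, IMAP4_SSL, starttls() and login() as described above.

```python
import imaplib


def connect(server, user, password, port=None, use_ssl=False, use_starttls=False):
    """Open an authenticated IMAP connection (hypothetical helper)."""
    if use_ssl:
        # Implicit TLS: the socket is encrypted from the first byte
        conn = imaplib.IMAP4_SSL(server, port or imaplib.IMAP4_SSL_PORT)
    else:
        conn = imaplib.IMAP4(server, port or imaplib.IMAP4_PORT)
        if use_starttls:
            # Upgrade the plain connection before sending credentials
            conn.starttls()
    # login() raises imaplib.IMAP4.error if the server rejects the credentials
    conn.login(user, password)
    return conn
```

Calling starttls() before login() matters: otherwise the credentials would travel over the unencrypted connection.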

3. Folder Discovery

  • Use list() method to get all mailbox folders
  • Parse folder names and hierarchy delimiter
  • Handle folder name encoding (IMAP uses modified UTF-7)
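
One way to parse a line returned by list() and decode its modified UTF-7 name is sketched below. The helper names are hypothetical, and the regular expression assumes the common response shape `(flags) "delimiter" name`; it does not handle IMAP literals or unusual delimiter quoting.

```python
import base64
import re

# Matches e.g. b'(\\HasNoChildren) "/" "Archive/2023"'
LIST_RE = re.compile(rb'\((?P<flags>[^)]*)\) (?P<delim>"[^"]*"|NIL) (?P<name>.+)')


def decode_mutf7(name: str) -> str:
    """Decode IMAP modified UTF-7: '&...-' holds base64 of UTF-16BE, ',' replaces '/'."""
    def _decode(match):
        chunk = match.group(1)
        if not chunk:
            return "&"  # "&-" is an escaped literal ampersand
        b64 = chunk.replace(",", "/")
        b64 += "=" * (-len(b64) % 4)  # restore the padding that the encoding drops
        return base64.b64decode(b64).decode("utf-16-be")
    return re.sub(r"&([^-]*)-", _decode, name)


def parse_list_line(line: bytes):
    """Return (flags, delimiter, folder_name), or None if the line is not understood."""
    m = LIST_RE.match(line)
    if m is None:
        return None
    flags = m.group("flags").decode("ascii").split()
    delim = m.group("delim").decode("ascii")
    delim = None if delim == "NIL" else delim.strip('"')
    name = m.group("name").decode("ascii").strip()
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        # Drop the surrounding quotes and undo backslash escapes
        name = re.sub(r"\\(.)", r"\1", name[1:-1])
    return flags, delim, decode_mutf7(name)
```

The decoded name is meant for local directory names; commands sent back to the server such as select() still need the original encoded name.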

4. Email Download

For each folder:

  1. Create corresponding local directory structure
  2. Select the folder with select()
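  3. Search for all messages with uid('SEARCH', None, 'ALL') so the results are UIDs; search(None, 'ALL') returns sequence numbers, which can change between sessions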
  4. For each message:
    • Fetch the complete RFC822 message
    • Generate a unique filename (using UID or message ID + date)
    • Save as .eml file
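
The per-folder loop might look like the sketch below. Two things here are assumptions rather than part of the plan: it uses imaplib's uid() command instead of search(), so the identifiers stay stable between sessions, and it names files by UID alone to keep the example short. download_folder is a hypothetical helper name.

```python
from pathlib import Path


def download_folder(conn, folder: str, dest_dir: Path) -> list:
    """Save every message in `folder` as <UID>.eml under dest_dir."""
    dest_dir.mkdir(parents=True, exist_ok=True)

    # readonly=True so that fetching does not mark messages as seen.
    # `folder` is the server-side (still encoded) name, quoted for safety.
    typ, _ = conn.select(f'"{folder}"', readonly=True)
    if typ != "OK":
        raise RuntimeError(f"cannot select folder {folder!r}")

    typ, data = conn.uid("SEARCH", None, "ALL")
    if typ != "OK":
        raise RuntimeError(f"search failed in {folder!r}")
    uids = data[0].split() if data and data[0] else []

    saved = []
    for uid in uids:
        typ, msg_data = conn.uid("FETCH", uid.decode(), "(RFC822)")
        if typ != "OK":
            continue  # skip messages the server refuses to return
        # The message body arrives as the second item of a tuple in the response
        raw = next((part[1] for part in msg_data if isinstance(part, tuple)), None)
        if raw is None:
            continue
        path = dest_dir / f"{uid.decode()}.eml"
        path.write_bytes(raw)
        saved.append(path)
    return saved
```

Writing the raw bytes unchanged keeps the .eml file identical to what the server returned.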

5. File Naming Strategy

Use a naming scheme that ensures uniqueness and provides useful info:

  • Format: {UID}_{date}_{subject_snippet}.eml
  • Sanitize subject for filesystem safety
  • Handle duplicates by appending counter if needed
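
A possible implementation of this naming scheme is sketched below. The helper names, the 40-character subject limit, the "00000000" placeholder for an unparseable date, and the ASCII-only sanitization rule are all assumptions made for illustration.

```python
import re
from email import message_from_bytes
from email.header import decode_header, make_header
from email.utils import parsedate_to_datetime
from pathlib import Path


def sanitize_subject(subject: str, max_len: int = 40) -> str:
    """Collapse every run of non-alphanumeric characters into one underscore."""
    cleaned = re.sub(r"[^A-Za-z0-9]+", "_", subject).strip("_")
    return cleaned[:max_len].rstrip("_") or "no_subject"


def build_filename(uid, raw_message: bytes) -> str:
    """Return {UID}_{date}_{subject_snippet}.eml for a raw RFC822 message."""
    msg = message_from_bytes(raw_message)
    try:
        date = parsedate_to_datetime(msg["Date"]).strftime("%Y%m%d")
    except (TypeError, ValueError):
        date = "00000000"  # missing or malformed Date header
    # Decode RFC 2047 encoded words such as =?utf-8?...?= before sanitizing
    subject = str(make_header(decode_header(msg.get("Subject", ""))))
    return f"{uid}_{date}_{sanitize_subject(subject)}.eml"


def unique_path(directory: Path, filename: str) -> Path:
    """Append _1, _2, ... before the extension until the path is unused."""
    candidate = directory / filename
    stem, suffix = candidate.stem, candidate.suffix
    counter = 1
    while candidate.exists():
        candidate = directory / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate
```

With this rule, a subject written entirely in a non-Latin script falls back to "no_subject"; the UID prefix still keeps the filename distinct within a folder.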

6. Error Handling

  • Connection failures
  • Authentication errors
  • Folder access issues
  • Invalid/corrupt messages
  • Filesystem errors (permissions, disk space)
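
As a rough illustration, the categories above can be told apart by exception type. The classify_error helper and its category labels are hypothetical. Note that imaplib's login() raises IMAP4.error on rejected credentials, while select() on an inaccessible folder typically returns a "NO" status instead of raising, so folder access is better checked through the return value.

```python
import imaplib
import socket
import ssl


def classify_error(exc: BaseException) -> str:
    """Map an exception to a coarse category for reporting (hypothetical labels)."""
    # IMAP4.abort is a subclass of IMAP4.error, so it must be tested first
    if isinstance(exc, imaplib.IMAP4.abort):
        return "connection"
    if isinstance(exc, imaplib.IMAP4.error):
        return "imap"  # e.g. authentication failure or a rejected command
    # Network failures are OSError subclasses too, so test them before OSError
    if isinstance(exc, (ConnectionError, TimeoutError, socket.timeout,
                        socket.gaierror, ssl.SSLError)):
        return "connection"
    if isinstance(exc, OSError):
        return "filesystem"  # permissions, disk full, invalid path, ...
    return "unknown"
```

The order of the checks matters because both the network and the filesystem errors derive from OSError.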

Dependencies

Only Python standard library:

  • imaplib - IMAP protocol
  • argparse - Command line parsing
  • email - Email message parsing
  • os / pathlib - Filesystem operations
  • re - Regex for sanitization
  • datetime - Date handling

Output Structure

./download/
├── INBOX/
│   ├── 1_20240115_Meeting_notes.eml
│   └── 2_20240116_Project_update.eml
├── Sent/
│   └── 1_20240114_RE_Question.eml
└── Archive/
    └── 2023/
        └── 1_20230501_Old_email.eml
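
Mapping a mailbox name onto this layout could be done along these lines. folder_to_path is a hypothetical helper, and the set of characters it replaces is an assumption chosen to be safe on common filesystems.

```python
import re
from pathlib import Path


def folder_to_path(root: Path, folder: str, delimiter) -> Path:
    """Turn an IMAP folder name such as 'Archive/2023' into root/Archive/2023."""
    parts = folder.split(delimiter) if delimiter else [folder]
    safe_parts = [
        # Replace characters that are invalid or risky in file names
        re.sub(r'[<>:"/\\|?*\x00-\x1f]', "_", part)
        for part in parts
        # Drop empty and relative components so the path cannot escape root
        if part not in ("", ".", "..")
    ]
    return root.joinpath(*safe_parts)
```

Splitting on the server-reported hierarchy delimiter, rather than assuming "/", lets servers that use "." as the delimiter produce the same nested layout.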

Testing Approach

  1. Test argument parsing with various combinations
  2. Test connection with SSL, STARTTLS, and plain
  3. Test with folders containing special characters
  4. Test with empty folders
  5. Verify EML files are valid and openable
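
For the last item, a simple automated check is to parse each saved file with the standard email package. The sketch below treats a file as valid when it has at least one header and the parser records no defects on the message; that criterion and the helper name are assumptions, and they do not replace opening the file in a mail client.

```python
from email import policy
from email.parser import BytesParser


def looks_like_valid_eml(raw: bytes) -> bool:
    """Return True if the bytes parse as a message with headers and no defects."""
    msg = BytesParser(policy=policy.default).parsebytes(raw)
    return bool(msg.keys()) and not msg.defects
```

A test run could apply this to every file matched by Path("download").rglob("*.eml") and report the ones that fail.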