Taking your contacts with you when you switch jobs, without stealing from your former employer

November 23, 2022

I’ve worked almost entirely for startups for the past 30+ years. It’s what I love, but switching jobs a lot comes with the territory. One of the problems I face with each switch is how to take with me outside contacts I’ve made, without compromising my soon-to-be-ex-employer’s intellectual property. Sure, I’ve added some of them to my personal contacts already, but what about the ones I haven’t but might want to contact later?

This is moot if you get laid off and immediately locked out of your company accounts; your access is gone, and anything you might have wanted to take with you is no longer accessible.

This is moot in the other direction if you’re switching jobs because your company has run out of money and is going out of business. They’re not going to particularly care what you take with you, so it’s probably just fine to download a copy of your entire email archive and stash it somewhere for future reference.

The problem area is in the middle, when you know you’re going to be leaving a company which will persist after you’re gone, so it would be wrong to just keep your email archive, but you want to keep contact info you might need later from those emails.

Here’s my solution:

  • I download my email archive in mbox format (I use Thunderbird as my email client, so this is easy to do with the ImportExportTools NG Thunderbird add-on).
  • I run the archive file through dos2unix because who needs all those carriage returns ugh.
  • I run the archive through a script (this is one of the few tasks for which I still prefer Perl over Python) which does the following:
    • Discard nearly all the headers in each message, leaving only a few crucial ones, e.g., From, Reply-To, To, CC, Date, Message-ID, Subject.
    • Delete the message bodies! The goal here is to keep contact info, not the contents of correspondence I engaged in on behalf of the company. Yes, this means I’ll lose the contact info in email signatures, e.g., phone numbers, and yes, I could probably cobble something together to save signatures while throwing away the rest of the body, but honestly, it’s not worth the effort. If I want to contact someone later, their email address will probably be good enough.
    • Parse the sender and recipient headers and discard all messages which have only internal company addresses on them. There’s no justification for retaining any trace of internal company correspondence.
    • Discard abuse complaints: emails from me with “abuse” in one of the recipient addresses.
    • Discard auto-generated and bulk emails, detected via the Auto-Submitted and Precedence headers and “no-reply” or “noreply” From addresses. There’s not enough of value in these to justify saving them.
    • Discard emails with any correspondent addresses for the company’s customers or sales partners. While these might be useful to me, they constitute company IP and I’m not entitled to take them with me. I determine the list of email domains to exclude by reviewing the domains of all the correspondents in the archive (the script’s --domains option lists them).
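
As a rough illustration of the first two filtering steps (keep a handful of headers, throw away the body), here is a minimal sketch that uses only core Perl. It is an allowlist variant, which is an assumption of this sketch and not how the full script works; that one strips a list of known-unwanted headers instead. The strip_message name, the %keep list, and the sample addresses are all hypothetical, and the sketch handles a single message, not a whole mbox file.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Headers worth keeping; everything else is dropped (hypothetical allowlist).
my %keep = map { lc($_) => 1 }
    qw(From Reply-To To CC Date Message-ID Subject);

# Take one raw message (headers, blank line, body) and return only the
# allowlisted headers followed by a blank line. The body is discarded.
sub strip_message {
    my ($raw) = @_;
    $raw =~ s/\r\n/\n/g;                    # same effect as dos2unix
    my ($head) = split(/\n\n/, $raw, 2);    # everything before the body
    $head = '' unless defined $head;
    $head =~ s/\n[ \t]+/ /g;                # unfold continuation lines
    my @kept;
    for my $line (split(/\n/, $head)) {
        next unless $line =~ /^([^\s:]+):/; # skip anything that isn't a header
        push(@kept, $line) if $keep{lc($1)};
    }
    return join("\n", @kept) . "\n\n";
}

# Made-up sample message with CRLF line endings and a folded Subject.
my $sample = join("\r\n",
    'Received: from mail.example.net by mx.example.com',
    'From: Pat Vendor <pat@vendor.example>',
    'To: me@company.example',
    'Subject: Quote for',
    ' next quarter',
    'X-Mailer: Example 1.0',
    '',
    'Confidential body text.',
    '');

print strip_message($sample);
```

For the sample message this prints the From, To and (unfolded) Subject headers and nothing else. An allowlist is the more conservative choice, since an unfamiliar header is dropped by default, but it gives you no hint about which headers exist in the archive.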

When all this is done, what I’m left with is an mbox file that contains the headers of emails I’ve exchanged with vendors and other external contacts during my time at the company. I save this file for future reference and feed it into my email full-text indexer (I use mairix) for easy searching.

Here’s my most recent iteration of solving this problem. Share and enjoy!

#!/usr/bin/env perl

use strict;
use warnings;

use Email::Address;
use English;
use File::Basename;
use Getopt::Long;

my $my_domain = "[company email domain]";
# Remember to put a backslash before the @
my $my_email = "[my company email address]";
my(@blacklist_domains) = qw([email domains to be excluded from filtered
                            archive]);

my $whoami = basename $0;
my $usage = "Usage: $whoami [-h|--help] [--headers|--domains] [input-file]
    Specify --headers to output the header fields found in all messages instead
    of the filtered messages. Useful for identifying headers you should modify
    the script to strip out. This list will only include headers the script
    isn't already configured to strip out.

    Specify --domains to output the email domains of correspondents found in
    all messages instead of the filtered messages. Useful for identifying which
    domains you should include in \@blacklist_domains. This list will only
    include domains the script isn't already configured to strip out.

    If you don't specify an input file then input is read from stdin.

    Output goes to stdout.\n";

my($do_headers, $do_domains);
die $usage if (! GetOptions("h|help" => sub { print $usage; exit; },
                            "headers" => \$do_headers,
                            "domains" => \$do_domains));
die "$whoami: Don't specify both --headers and --domains\n$usage"
    if ($do_headers and $do_domains);
                            
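# Escape regex metacharacters (e.g., the dots) in the domain names.
map(s/(\W)/\\$1/g, @blacklist_domains);
# Group the alternatives so that the ^ and $ anchors used when matching apply
# to every domain in the list, not just to the first and last ones.
my $blacklist_domain_re = '(?:' . join('|', @blacklist_domains) . ')';
$my_domain =~ s/(\W)/\\$1/g;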
my(%items);

&read_file;

if ($do_domains or $do_headers) {
    my(@keys) = sort { $items{$b} <=> $items{$a} } keys %items;
    map(print($items{$_}, "\t", $_, "\n"), @keys);
}

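# Read the mbox input, passing the header block of each message (everything up
# to and including the first blank line) to process_message. Message bodies
# are skipped.
sub read_file {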
    my $msg = "";
    my $skipping = 0;
    
    while (<>) {
        if (/^From .* 20\d\d$/) {
            &process_message($msg) if ($msg);
            $msg = $_;
            $skipping = 0;
            next;
        }
        next if ($skipping);
        $msg .= $_;
        if (/^$/) {
            &process_message($msg);
            $msg = "";
            $skipping = 1;
        }
    }
}

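# Filter one message: drop unwanted headers, discard the message entirely if
# it matches one of the exclusion rules, and otherwise either print what
# remains or tally its header names or correspondent domains.
sub process_message {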
    local($_) = @_;
    my($header, $body, %msg_items);
    if (/\n\n/) {
        $header = $PREMATCH;
        $body = $POSTMATCH;
    }
    else {
        $header = $_;
        $body = "";
    }
    return if ($header =~
               /^(Auto-Submitted: auto-|Precedence: (Bulk|Junk|List))/mi);
    my(@headers) = split(/\n\b/, $header);
    $header = "";
    my $external = 0;
    my $abuse = 0;
    my $from;
    for (@headers) {
        next if (/^(?:Delivered-To|Received|X-.*|ARC-Seal|ARC-Message-Signature|
        ARC-Authentication-Results|Return-Path|Received-SPF|
        Authentication-Results|DKIM-Signature|MIME-Version|Content-Type|
        Content-Transfer-Encoding|AdditionalInfo|References|User-Agent|
        In-Reply-To|List-.*|Thread-.*|Accept-Language|Content-Language|BCC|
        Sender|Feedback-ID|Mailing-List|Errors-To|disclaimersource|
        message-id-hash|archived-at|atlassian.*|Importance|msip_labels|
        dkim-filter|ironport-.*|domainkey-signature|dmarc-filter|
        authentication-results-original|signature|disclaimer|campaign|
        content-disposition|suggested_attachment_session_id|content-class|
        content-length):/ix);
        if (/^(from|to|cc|reply-to):/i) {
            my(@addresses) = Email::Address->parse($POSTMATCH);
            return if grep($_->host =~ /^$blacklist_domain_re$/oi, @addresses);
            $external += grep($_->host !~ /^$my_domain$/oi, @addresses);
            if (/^from:/i) {
                # Guard against From headers with no parseable address.
                if (@addresses) {
                    return if ($addresses[0]->user =~ /\bno-?reply\b/i);
                    $from = $addresses[0]->address;
                }
            }
            else {
                $abuse += grep($_->user =~ /\babuse\b/i, @addresses);
            }
            if ($do_domains) {
                map($msg_items{$_->host}++, @addresses);
            }
        }
        if ($do_headers and /^(\S+):/) {
            # Tally header names case-insensitively.
            my $name = lc($1);
            $msg_items{$name}++;
        }
        $header .= $_ . "\n";
    }
    return if (! $external);
    # $from is undefined if the message had no parseable From header.
    return if (defined($from) and lc($from) eq lc($my_email) and $abuse);
    if ($do_domains or $do_headers) {
        map($items{$_} += $msg_items{$_}, keys %msg_items);
    }
    else {
        print($header, "\n", $body);
    }
}
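
One detail that is easy to get wrong when a list of domains is joined with | and matched between ^ and $: without a group around the alternatives, the anchors apply only to the first and last alternative. The sketch below shows the difference. The domain names are invented for illustration and are not taken from any real exclusion list.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical exclusion list, for illustration only.
my @domains = ('customer.example', 'partner.example');
my $alternation = join('|', map { quotemeta($_) } @domains);

# Ungrouped: parsed as (^customer\.example) or (partner\.example$).
my $loose = qr/^$alternation$/i;
# Grouped: both anchors apply to every alternative.
my $strict = qr/^(?:$alternation)$/i;

for my $host (qw(customer.example customer.example.evil.test
                 notpartner.example vendor.example)) {
    printf("%-28s loose=%d strict=%d\n", $host,
           ($host =~ $loose) ? 1 : 0, ($host =~ $strict) ? 1 : 0);
}
```

The two middle hosts match only the ungrouped pattern, which means an ungrouped exclusion list can discard messages from domains that merely start or end like an excluded one.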