This project started when a large customer asked InBoxer whether it would be possible to use our language technology to scan outbound mail for potential problems. At the time, we did not know if it was possible. We had success with the InBoxer Anti-Spam Filter, but that was for one type of inbound mail.
Could InBoxer be turned around for outbound mail? We needed to test. But to test, we needed a large collection of email messages with all of a company's secrets and dirty laundry. That was not going to be easy. Can you imagine going up to a client or customer and saying, "Give me at least half a million of your most confidential and incriminating email messages, please"? That wasn't going to happen, and we were stuck. That is, until FERC.
The Federal Energy Regulatory Commission did InBoxer a great favor. It released a database of hundreds of thousands of Enron emails to the public. And, while it did so to let the world see the culture that allowed for what the Department of Justice called "a criminal conspiracy to commit one of the largest corporate frauds in American history," it really helped us.
As a way of returning the favor, and frankly as a way to get potential customers to see the benefits of the InBoxer Anti-Risk Appliance, we have posted all of the available Enron email messages. You can search them for free.
To make searching easier, many problems with the data were fixed. InBoxer, Inc. owes thanks to the CALO Project (A Cognitive Assistant that Learns and Organizes), Leslie Kaelbling at MIT, and a number of folks at SRI, notably Melinda Gervasio, for their work to fix the dataset problems. Some messages have been deleted, such as spam and generic system messages (for example, "the system will be unavailable for routine maintenance starting at 2AM"). All really personal information, such as salaries and social security numbers, was scrubbed from the messages. So, don't try to steal any identities here. Also, attachments were removed from the database before we got it, so unfortunately no attachments are posted here.
Of course, we need to have at least one shameless plug. Whether moments of poor judgment are accidental or intentional, employers are responsible for messages sent on their systems. If, by chance, you are worried about what people in your company might be sending in email, please let us know. Thanks.