Here’s a good one from the department of unintended consequences.
In 2003, the Federal Energy Regulatory Commission released a million and a half Enron email messages to the public. FERC was very rightly taken to task by many for releasing the entire set of emails, because a large number were personal emails of innocent employees, some of which contained information like social security numbers, and only a small fraction of which contained anything incriminating. FERC did backtrack a bit and took down the messages containing social security numbers and employee performance evaluations. But the rest remain publicly accessible.
Putting the ethics of how the messages were obtained and released aside, the entry of this corpus into the public record has really been a boon to various branches of computer science, particularly information retrieval and knowledge discovery. There has been lots of interesting academic work with the Enron corpus: everything from spam filters to social network analysis to information retrieval to linguistic analysis. I’ve also seen demos of various commercial enterprise search products using it, and have read about some other commercial products which have used it in R&D. Really, researchers have never had anything like this: a full corporate email database captured in the wild.
And it should also be a cautionary tale: don’t put anything in your work email that you wouldn’t want anyone in the world to read. (My previous barometer for what to put in work email was to ask myself if I would mind one of the IT guys reading it.)