Extracting Attachments from an Outlook Mailbox Using C#






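How to extract attachments from an Outlook mailbox using C#
Introduction
My personal mailbox, with email dating back to the late 1990s, is full of old attachments that bloat the PST file but are no longer needed. With attachments included, the PST file is now about 40 GB.
I decided to write a simple C# console application to extract them and reduce the size of the PST file.
The application itself performs a few simple tasks:
- Find the root folder in the Outlook data store
- Recursively walk the folder structure
- Iterate over every email in each folder, looking for attachments
- When one is found, save each attachment to a folder structure on the hard disk that mirrors the Outlook folder structure
Prerequisites
First, create a C# console application in Visual Studio, targeting .NET Framework 4.5 or later.
The application uses the Microsoft.Office.Interop.Outlook assembly, so you will need to add it as a reference to your project.
The Outlook Primary Interop Assembly (PIA) reference provides help for developing managed applications for Outlook 2013 and 2016. It extends the Outlook 2013 and 2016 developer reference from the COM environment to the managed environment, allowing you to interact with Outlook from a .NET application.
You will also need Microsoft Outlook installed on your PC; otherwise the interop assembly has nothing to talk to.
Learn more on MSDN.
Enumerating Outlook Accounts
Before we can walk through every folder and email in Outlook, we need to find an actual account and build the root folder from it.
The root folder has the form \\FolderName\, with the Inbox one level below it, i.e. \\FolderName\Inbox\.
To do this, we simply iterate over the Outlook.Application.Session.Accounts collection.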
Outlook.Application Application = new Outlook.Application();
Outlook.Accounts accounts = Application.Session.Accounts;
foreach (Outlook.Account account in accounts)
{
    Console.WriteLine(account.DisplayName);
}
From these accounts, we can derive the root folder name.
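As a minimal sketch of that derivation: the helper below builds the \\FolderName and \\FolderName\Inbox forms from an account's display name. The class and method names (OutlookPaths, RootFolderPath, ChildFolderPath) are hypothetical and not part of the interop assembly. It also assumes, as the article does, that the data store's root folder is named after the account's display name, which may not hold for every profile.

```csharp
using System;

static class OutlookPaths
{
    // Builds the root folder path for an account, e.g. \\someone@example.com
    // Assumption: the store's root folder is named after the account display name.
    public static string RootFolderPath(string accountDisplayName)
    {
        if (string.IsNullOrWhiteSpace(accountDisplayName))
        {
            throw new ArgumentException("Account display name is required.", "accountDisplayName");
        }
        return @"\\" + accountDisplayName.Trim();
    }

    // Builds the path of a folder directly under the root,
    // e.g. \\someone@example.com\Inbox
    public static string ChildFolderPath(string accountDisplayName, string folderName)
    {
        return RootFolderPath(accountDisplayName) + @"\" + folderName;
    }
}
```

Calling OutlookPaths.ChildFolderPath(account.DisplayName, "Inbox") would then give the path of that account's Inbox, provided the folder is actually called "Inbox" in that mailbox.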
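Recursing Through Folders
We initially pass the root folder to the function below. It then looks for any child (sub) folders and passes them recursively to itself, working down the folder structure until it reaches the end.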
static void EnumerateFolders(Outlook.Folder folder)
{
    Outlook.Folders childFolders = folder.Folders;
    if (childFolders.Count > 0)
    {
        foreach (Outlook.Folder childFolder in childFolders)
        {
            // We only want Inbox folders - ignore Contacts and others
            if (childFolder.FolderPath.Contains("Inbox"))
            {
                Console.WriteLine(childFolder.FolderPath);
                // Call EnumerateFolders using childFolder,
                // to see if there are any sub-folders within this one
                EnumerateFolders(childFolder);
            }
        }
    }
}
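Iterating Through Emails in a Folder and Listing Their Attachments
We initially pass the current folder to the function below. It then iterates over the folder.Items object, which holds the collection of items (mostly email messages) in the Outlook folder.
Each email is returned as an item with an .Attachments.Count property, which indicates how many attachments the message has.
When this value is non-zero (!= 0), we simply list each attachment in the email. From here you can save the attachment, delete it, or process it in whatever way you wish.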
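static void IterateMessages(Outlook.Folder folder)
{
    var fi = folder.Items;
    if (fi != null)
    {
        foreach (Object item in fi)
        {
            // A folder can also hold non-mail items (meeting requests,
            // reports, etc.), so skip anything that is not a MailItem
            // instead of casting, which would throw on those items
            Outlook.MailItem mi = item as Outlook.MailItem;
            if (mi == null) continue;
            var attachments = mi.Attachments;
            if (attachments.Count != 0)
            {
                // Outlook collections are 1-based
                for (int i = 1; i <= attachments.Count; i++)
                {
                    Console.WriteLine("Attachment: " + attachments[i].FileName);
                }
            }
        }
    }
}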
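Finding Attachments of a Specific Type
Outlook often stores embedded images (such as logos in emails) and other files you would not normally want as attachments, so I created an array of the extension types I want to extract, ignoring the ones that are of no use to me.
By comparing the attachment file name against the extension array, I can determine which files to keep.
Because this is only a basic string comparison, any file whose name contains one of the strings in the array is matched. For example, helloworld.doc (Office) and helloworld.docx (the Office Open XML format in Office 2007 and later) both contain .doc, so both are matched.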
// attachment extensions to save
string[] extensionsArray = { ".pdf", ".doc",
    ".xls", ".ppt", ".vsd", ".zip",
    ".rar", ".txt", ".csv", ".proj" };
if (extensionsArray.Any(mi.Attachments[i].FileName.Contains)) {
    // the filename contains one of the extensions
}
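A plain Contains check also matches names such as notes.pdf.exe, and it is case-sensitive unless the file name is lower-cased first. The sketch below is one possible alternative, not the article's method: it looks only at the final extension, ignores case, and keeps the ".doc matches .docx" behaviour by using a prefix comparison. The class and method names (AttachmentFilter, ShouldSave) are hypothetical. On .NET Framework, Path.GetExtension can throw if the name contains characters that are invalid in a path, so real attachment names may need cleaning first.

```csharp
using System;
using System.IO;

static class AttachmentFilter
{
    // Extensions to keep (same list as the article's extensionsArray)
    static readonly string[] Wanted =
        { ".pdf", ".doc", ".xls", ".ppt", ".vsd", ".zip", ".rar", ".txt", ".csv", ".proj" };

    // Returns true when the file's final extension starts with one of the
    // wanted extensions, ignoring case (so ".DOCX" matches ".doc").
    public static bool ShouldSave(string fileName)
    {
        if (string.IsNullOrEmpty(fileName)) return false;

        string ext = Path.GetExtension(fileName); // empty string when there is no extension
        if (ext.Length == 0) return false;

        foreach (string wanted in Wanted)
        {
            if (ext.StartsWith(wanted, StringComparison.OrdinalIgnoreCase)) return true;
        }
        return false;
    }
}
```

In the loop above, the condition would become AttachmentFilter.ShouldSave(mi.Attachments[i].FileName).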
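Saving and Deleting Attachments
Saving each attachment is very easy, and the assembly provides a function that saves it to local disk. In the example below, pathToSaveFile is a local disk path that includes the file name, for example c:\temp\ followed by the attachment's file name.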
mi.Attachments[i].SaveAsFile(pathToSaveFile);
Likewise, deleting an attachment is as simple as calling the .Delete function.
mi.Attachments[i].Delete();
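One thing to watch when deleting inside a loop: once an entry is removed, the remaining entries move down by one index, so a forward loop that deletes as it goes can skip attachments. Walking the collection from the last index to the first avoids this. The sketch below shows the pattern on a plain List<string> as a stand-in so that it runs without Outlook; the names (ReverseRemoval, RemoveMatching) are hypothetical. With the 1-based Attachments collection the equivalent loop would run from Attachments.Count down to 1, and the mail item will most likely need saving afterwards for the deletion to persist.

```csharp
using System;
using System.Collections.Generic;

static class ReverseRemoval
{
    // Removes every entry for which shouldRemove returns true, walking from
    // the last index to the first so removals never shift unvisited entries.
    // Returns the number of entries removed.
    public static int RemoveMatching(List<string> names, Func<string, bool> shouldRemove)
    {
        int removed = 0;
        for (int i = names.Count - 1; i >= 0; i--)
        {
            if (shouldRemove(names[i]))
            {
                names.RemoveAt(i);
                removed++;
            }
        }
        return removed;
    }
}
```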
In the sample code below, we save each attachment to a folder based on the following structure:
(basepath)(accountname)(folderstructure)(sender)
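To illustrate that structure, here is a minimal sketch that turns an Outlook folder path such as \\account\Inbox\Sub plus a sender address into a directory under the base path. It is not the article's code: the names (SavePathBuilder, BuildDirectory, Sanitize) are hypothetical, and replacing characters that are invalid in file names with an underscore is an assumption added here for safety.

```csharp
using System;
using System.IO;
using System.Linq;

static class SavePathBuilder
{
    // Combines basePath, each segment of the Outlook folder path, and the
    // sender into one directory path: (basepath)(accountname)(folderstructure)(sender)
    public static string BuildDirectory(string basePath, string outlookFolderPath, string sender)
    {
        // Split "\\account\Inbox\Sub" on backslashes, dropping the empty
        // entries produced by the leading "\\"
        var segments = outlookFolderPath
            .Split(new[] { '\\' }, StringSplitOptions.RemoveEmptyEntries)
            .Concat(new[] { sender })
            .Select(s => Sanitize(s));

        return segments.Aggregate(basePath, (dir, seg) => Path.Combine(dir, seg));
    }

    // Replaces characters that are not allowed in a file or directory name
    // on the current platform (assumption: '_' is an acceptable substitute)
    static string Sanitize(string segment)
    {
        char[] invalid = Path.GetInvalidFileNameChars();
        var cleaned = segment.Select(c => Array.IndexOf(invalid, c) >= 0 ? '_' : c).ToArray();
        return new string(cleaned).Trim();
    }
}
```

The result can be passed to Directory.CreateDirectory, which creates any missing intermediate folders, before the attachment's file name is appended for SaveAsFile.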
Download
You can download the code for this project from GitHub, or view the code below.
Full Code
///
/// Outlook Attachment Extractor
/// Version 0.1
/// Build 2015-Oct-18
/// Written by Matthew Proctor
/// www.matthewproctor.com
///
using System;
using System.Linq;
using System.IO;
using Outlook = Microsoft.Office.Interop.Outlook;

namespace OutlookAttachmentExtractor
{
    class Program
    {
        // Path where attachments will be saved
        static string basePath = @"c:\temp\emails\";

        // Total size in bytes of all the files - displayed after processing
        // to indicate possible PST file size saving. A long is used because
        // an int would overflow once the total passes about 2 GB.
        static long totalfilesize = 0;

        static void Main(string[] args)
        {
            EnumerateAccounts();
        }

        // Uses recursion to enumerate Outlook subfolders.
        static void EnumerateFolders(Outlook.Folder folder)
        {
            Outlook.Folders childFolders = folder.Folders;
            if (childFolders.Count > 0)
            {
                // loop through each childFolder (aka sub-folder) in current folder
                foreach (Outlook.Folder childFolder in childFolders)
                {
                    // We only want Inbox folders - ignore Contacts and others
                    if (childFolder.FolderPath.Contains("Inbox"))
                    {
                        // Write the folder path.
                        Console.WriteLine(childFolder.FolderPath);
                        // Call EnumerateFolders using childFolder,
                        // to see if there are any sub-folders within this one
                        EnumerateFolders(childFolder);
                    }
                }
            }
            // pass folder to IterateMessages which processes individual email messages
            Console.WriteLine("Looking for items in " + folder.FolderPath);
            IterateMessages(folder);
        }

        // Loops through each item (aka email) in a folder
        static void IterateMessages(Outlook.Folder folder)
        {
            // attachment extensions to save
            string[] extensionsArray = { ".pdf", ".doc", ".xls", ".ppt", ".vsd", ".zip", ".rar", ".txt", ".csv", ".proj" };

            // Iterate through all items ("messages") in a folder
            var fi = folder.Items;
            if (fi != null)
            {
                try
                {
                    foreach (Object item in fi)
                    {
                        // Skip items that are not mail messages (meeting
                        // requests, reports, etc.) rather than casting, which
                        // would throw and abort the rest of the folder
                        Outlook.MailItem mi = item as Outlook.MailItem;
                        if (mi == null) continue;
                        var attachments = mi.Attachments;
                        // Only process item if it has one or more attachments
                        if (attachments.Count != 0)
                        {
                            // Create a directory to store the attachment
                            if (!Directory.Exists(basePath + folder.FolderPath))
                            {
                                Directory.CreateDirectory(basePath + folder.FolderPath);
                            }

                            // Loop through each attachment
                            for (int i = 1; i <= mi.Attachments.Count; i++)
                            {
                                // Check whether any of the strings in the
                                // extensionsArray are contained within the filename
                                var fn = mi.Attachments[i].FileName.ToLower();
                                if (extensionsArray.Any(fn.Contains))
                                {
                                    // Create a further sub-folder for the sender
                                    if (!Directory.Exists(basePath + folder.FolderPath + @"\" + mi.Sender.Address))
                                    {
                                        Directory.CreateDirectory(basePath + folder.FolderPath + @"\" + mi.Sender.Address);
                                    }
                                    totalfilesize = totalfilesize + mi.Attachments[i].Size;
                                    if (!File.Exists(basePath + folder.FolderPath + @"\" + mi.Sender.Address + @"\" + mi.Attachments[i].FileName))
                                    {
                                        Console.WriteLine("Saving " + mi.Attachments[i].FileName);
                                        mi.Attachments[i].SaveAsFile(basePath + folder.FolderPath + @"\" + mi.Sender.Address + @"\" + mi.Attachments[i].FileName);
                                        // Uncomment next line to delete attachment after saving it
                                        // mi.Attachments[i].Delete();
                                    }
                                    else
                                    {
                                        Console.WriteLine("Already saved " + mi.Attachments[i].FileName);
                                    }
                                }
                            }
                        }
                    }
                }
                catch (Exception e)
                {
                    // Report the error instead of silently swallowing it
                    Console.WriteLine("An error occurred: '{0}'", e);
                }
            }
        }

        // Retrieves the email address for a given account object
        static string EnumerateAccountEmailAddress(Outlook.Account account)
        {
            try
            {
                if (string.IsNullOrEmpty(account.SmtpAddress) || string.IsNullOrEmpty(account.UserName))
                {
                    Outlook.AddressEntry oAE = account.CurrentUser.AddressEntry as Outlook.AddressEntry;
                    if (oAE.Type == "EX")
                    {
                        Outlook.ExchangeUser oEU = oAE.GetExchangeUser() as Outlook.ExchangeUser;
                        return oEU.PrimarySmtpAddress;
                    }
                    else
                    {
                        return oAE.Address;
                    }
                }
                else
                {
                    return account.SmtpAddress;
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
                return "";
            }
        }

        // Displays introduction text, lists each Account, and prompts user to select one for processing.
        static void EnumerateAccounts()
        {
            Console.Clear();
            Console.WriteLine("Outlook Attachment Extractor v0.1");
            Console.WriteLine("---------------------------------");
            int id;
            Outlook.Application Application = new Outlook.Application();
            Outlook.Accounts accounts = Application.Session.Accounts;

            string response = "";
            while (true)
            {
                id = 1;
                foreach (Outlook.Account account in accounts)
                {
                    Console.WriteLine(id + ":" + EnumerateAccountEmailAddress(account));
                    id++;
                }
                Console.WriteLine("Q: Quit Application");

                response = Console.ReadLine().ToUpper();
                if (response == "Q")
                {
                    Console.WriteLine("Quitting");
                    return;
                }
                if (response != "")
                {
                    // TryParse avoids an exception on non-numeric input
                    int selected;
                    if (Int32.TryParse(response.Trim(), out selected) && selected >= 1 && selected < id)
                    {
                        Console.WriteLine("Processing: " + accounts[selected].DisplayName);
                        Console.WriteLine("Processing: " + EnumerateAccountEmailAddress(accounts[selected]));

                        Outlook.Folder selectedFolder = GetFolder(@"\\" + accounts[selected].DisplayName);
                        EnumerateFolders(selectedFolder);
                        Console.WriteLine("Finished Processing " + accounts[selected].DisplayName);
                        Console.WriteLine("Total size of matching attachments: " + totalfilesize + " bytes");
                        Console.WriteLine("");
                    }
                    else
                    {
                        Console.WriteLine("Invalid Account Selected");
                    }
                }
            }
        }

        // Returns Folder object based on folder path
        static Outlook.Folder GetFolder(string folderPath)
        {
            Console.WriteLine("Looking for: " + folderPath);
            Outlook.Folder folder;
            string backslash = @"\";
            try
            {
                if (folderPath.StartsWith(@"\\"))
                {
                    folderPath = folderPath.Remove(0, 2);
                }
                String[] folders = folderPath.Split(backslash.ToCharArray());
                Outlook.Application Application = new Outlook.Application();
                folder = Application.Session.Folders[folders[0]] as Outlook.Folder;
                if (folder != null)
                {
                    for (int i = 1; i <= folders.GetUpperBound(0); i++)
                    {
                        Outlook.Folders subFolders = folder.Folders;
                        folder = subFolders[folders[i]] as Outlook.Folder;
                        if (folder == null)
                        {
                            return null;
                        }
                    }
                }
                return folder;
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
                return null;
            }
        }
    }
}
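The running total of attachment sizes is meant to indicate how much the PST file could shrink. Each attachment reports its size as a 32-bit integer number of bytes, so for a mailbox approaching 40 GB the sum needs a 64-bit accumulator. The sketch below shows one way to total the sizes and print them in a readable unit; the names (SizeReport, TotalBytes, Format) are hypothetical and are not part of the project code.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

static class SizeReport
{
    // Sums 32-bit attachment sizes into a 64-bit total to avoid overflow
    public static long TotalBytes(IEnumerable<int> attachmentSizes)
    {
        long total = 0;
        foreach (int size in attachmentSizes)
        {
            total += size;
        }
        return total;
    }

    // Formats a byte count using binary multiples (1 KB = 1024 bytes)
    public static string Format(long bytes)
    {
        string[] units = { "B", "KB", "MB", "GB", "TB" };
        double value = bytes;
        int unit = 0;
        while (value >= 1024 && unit < units.Length - 1)
        {
            value /= 1024;
            unit++;
        }
        return value.ToString("0.##", CultureInfo.InvariantCulture) + " " + units[unit];
    }
}
```

For example, SizeReport.Format(1536) returns "1.5 KB".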
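Testing
I have tested this code against mailboxes hosted in an on-premises Exchange 2013 environment, Office 365, and POP3/IMAP mailboxes; it behaved identically in all of them.
Further Reading
The following links provide more information on working with the Outlook Interop services.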