电子邮件搜索引擎





5.00/5 (11投票s)
基于 MySQL 存储引擎和 API 进行复杂分析,使用 SQL 查询语言:第 5 部分。
引言
MySQL 支持许多存储引擎,它们负责存储和检索数据。MyISAM 和 InnoDB 是常见的。在创建表时可以指定引擎。基本的存储引擎允许只读表。大多数引擎从文件中读取数据并提供给 MYSQL。另一方面,我们的想法是从 API 中检索数据并将其提供给 MYSQL。这将使我们能够使用 SQL 对从 API 检索到的数据进行复杂分析。想法是为每组相关的 API 编写一个引擎。在本系列文章的第 4 部分中,我介绍了 MySQL 存储引擎的概念,该引擎在 MySQL 控制台上公开 NOSQL 数据,供您进行复杂分析。这允许对来自Parse.com的数据进行复杂分析。本系列文章继续进行,这次介绍了 MySQL 存储引擎的概念,该引擎在 MySQL 控制台上公开您的电子邮件,供您进行复杂分析。具体来说,这允许对您 Gmail 收件箱中的电子邮件进行复杂分析。
背景
一种常规方法是将电子邮件从 Gmail 导出并插入到 MySQL 中。
- 这种方法的一个缺点是,每次邮件到达或被删除时,您都必须向表中插入和删除行。
- 另一个缺点是,在查询表之前,您需要触发该日志应用程序来插入和删除行。
- 另一个缺点是,即使您只需要检索几行,所有电子邮件都保留在表中。
- 另一个缺点是,在插入或删除期间,表可能会被锁定,这会给检索实体带来性能瓶颈。
- 另一个缺点是数据重复
- 如果您的日志应用程序(负责插入和删除电子邮件)停运了一周,那么您将不得不等待一段时间才能检索到今天的电子邮件。
相反,提议的方法通过以下方式避免了这些问题:
- 无需向表中插入和删除行。
- 在查询表之前,无需触发日志应用程序。
- 不需要在只需要少量电子邮件时获取所有电子邮件。
- 没有瓶颈,因为没有插入和删除。
- 无需等待日志应用程序在一周内插入电子邮件。
此外,提议的方法优于 Gmail 的默认搜索功能,因为它利用了 MySQL 解析器的强大功能来分析您的电子邮件,这是以前从未做过的。
电子邮件搜索引擎
这个 MySQL 存储引擎与第 4 部分中介绍的存储引擎不同,因为它从 Gmail 检索数据并将其提供给 MySQL。创建了一个名为 Email 的表,其引擎为 EmailSearchEngine。
CREATE TABLE `email` (
`UID` int(10) unsigned DEFAULT NULL,
`from` varchar(1024) DEFAULT NULL,
`to` varchar(1024) DEFAULT NULL,
`cc` varchar(1024) DEFAULT NULL,
`subject` varchar(1024) DEFAULT NULL,
`body` LONGTEXT,
`messagedate` datetime DEFAULT NULL
) ENGINE=EmailSearchEngine;
当用户在 Email 表上输入 select 查询时,控制权会传递到 rnd_init 方法中的 EmailSearchEngine。此方法检查表名是否确实是“Email”。然后,它检查 where 子句解析树是否存在。如果存在,则遍历 where 子句解析树以转换为 IMAP 搜索命令。否则,收件箱中的所有电子邮件都将提供给 MySQL。为了检索电子邮件,我们使用 vmime.NET [1],这是一个面向 C++ 和 .NET Windows 程序员的库,允许通过 SMTP、POP3 和 IMAP(支持 TLS 和 SSL)发送和下载电子邮件。
在继续 rnd_init 之前,需要解释以下数据结构。
vector<Email> m_EmailVector;
//(belongs to ha_example class and maintains all emails for the query)
int m_EmailVectorIndex;
//(belongs to ha_example class and maintains the index to m_EmailVector)
struct Email
{
int uid;
string from;
string to;
string cc;
string subject;
string body;
vmime::datetime receiveTime;
};
int ha_example::rnd_init(bool scan)
{
DBUG_ENTER("ha_example::rnd_init");
if(_stricmp(table_share->table_name.str,"Email")==0)
{
THD *thd=this->ha_thd();
SELECT_LEX *select_lex=&thd->lex->select_lex;
m_EmailVectorIndex=0;
m_EmailVector.clear();
if(select_lex->where==0)
{
stack<ParseStackElement> parseStack;
ParseStackElement topElem;//=parseStack.top();
GetAllEmails(m_EmailVector);
}
else
{
stack<ParseStackElement> parseStack;
select_lex->where->traverse_cond(My_Cond_traverser,(void*)&parseStack,Item::traverse_order::POSTFIX);
if(!parseStack.empty()&&parseStack.size()==1)
{
ParseStackElement topElem=parseStack.top();
if(topElem.emailWhereClauseUnitVector.size()==1)
{
if(topElem.emailWhereClauseUnitVector[0].uid!=-1&&topElem.emailWhereClauseUnitVector[0].from==""&&topElem.emailWhereClauseUnitVector[0].to==""&&topElem.emailWhereClauseUnitVector[0].cc==""&&topElem.emailWhereClauseUnitVector[0].subject==""&&topElem.emailWhereClauseUnitVector[0].body=="")
{
GetSingleEmail(topElem.emailWhereClauseUnitVector[0].uid,m_EmailVector);
}
else
{
GetEmails(topElem.emailWhereClauseUnitVector,m_EmailVector);
}
}
else
{
GetEmails(topElem.emailWhereClauseUnitVector,m_EmailVector);
}
}
else
{
GetAllEmails(m_EmailVector);
}
}
}
DBUG_RETURN(0);
}
在 where 子句解析树的后缀遍历期间,创建并维护 ParseStackElement 结构体堆栈。
struct ParseStackElement
{
vector<EmailWhereClauseUnit> emailWhereClauseUnitVector;
Item *item;
};
ParseStackElement 包含 EmailWhereClauseUnit 类型的向量。
class EmailWhereClauseUnit
{
public:
int uid;
string from;
string to;
string cc;
string subject;
string body;
time_t emailTimeStampStart;
time_t emailTimeStampEnd;
EmailWhereClauseUnit()
{
uid=-1;
}
};
EmailWhereClauseUnit 是 where 子句的基本单元,包含 uid、from、to、cc、subject、body 和 messagedate。例如,考虑以下 where 子句:
Select * from email where `from`="example@yahoo.com" and messagedate >="2014-12-06 00:00:00" and messagedate<="2014-12-06 23:59:59";
等效的 EmailWhereClauseUnit 是:
- from=example@yahoo.com
- emailTimeStampStart=2014-12-06 00:00:00
- emailTimeStampEnd=2014-12-06 23:59:59
rnd_init 调用 traverse_cond 在 where 子句解析树的根节点上,以便以后缀顺序遍历它。My_Cond_traverser 函数被指定在遍历期间调用。将解析堆栈传递给此函数。此函数检查以下内容:
- In 操作符:在这种情况下,调用 Item_func_in_case 来获取“in 子句”中指定的“from”、“to”、“cc”、“subject”、“body”或“message date”的集合,并将其推入堆栈。
- Equality 操作符:在这种情况下,调用 Item_func_eq_case 来获取指定的“from”、“to”、“cc”、“subject”、“body”或“message date”,并将其推入堆栈。
- Less 或 Less and equal 操作符:在这种情况下,调用 Item_func_less_case 来获取指定的“message date”范围,该范围以零开头并推入堆栈。
- Greater 或 greater and equal 操作符:在这种情况下,调用 Item_func_greater_case 来获取指定的“message date”范围,该范围以 MYSQL_TIMESTAMP_MAX_VALUE 结尾并推入堆栈。
- Between 操作符:在这种情况下,调用 Item_func_between_case 来获取指定的“message date”范围并推入堆栈。
- And/Or 操作符:在这种情况下,弹出堆栈直到出现项目(My_Cond_traverser 的第一个参数)的子项。在弹出过程中,将 EmailWhereClauseUnit 类型的向量进行 AND 和 OR 操作,以获得结果集。
void My_Cond_traverser (const Item *item, void *arg)
{
stack<ParseStackElement> *parseStack=(stack<ParseStackElement> *)arg;
vector<EmailWhereClauseUnit> emailWhereClauseUnitVector;
if(dynamic_cast<const Item_func_in*>(item))
{
emailWhereClauseUnitVector=Item_func_in_case(item);
ParseStackElement elem;
elem.emailWhereClauseUnitVector=emailWhereClauseUnitVector;
elem.item=(Item *)item;
parseStack->push(elem);
}
else if(dynamic_cast<const Item_func_eq*>(item))
{
emailWhereClauseUnitVector=Item_func_eq_case(item);
ParseStackElement elem;
elem.emailWhereClauseUnitVector=emailWhereClauseUnitVector;
elem.item=(Item *)item;
parseStack->push(elem);
}
else if(dynamic_cast<const Item_func_ge*>(item)||dynamic_cast<const Item_func_gt*>(item))
{
emailWhereClauseUnitVector=Item_func_greater_case(item);
ParseStackElement elem;
elem.emailWhereClauseUnitVector=emailWhereClauseUnitVector;
elem.item=(Item *)item;
parseStack->push(elem);
}
else if(dynamic_cast<const Item_func_le*>(item)||dynamic_cast<const Item_func_lt*>(item))
{
emailWhereClauseUnitVector=Item_func_less_case(item);
ParseStackElement elem;
elem.emailWhereClauseUnitVector=emailWhereClauseUnitVector;
elem.item=(Item *)item;
parseStack->push(elem);
}
else if(dynamic_cast<const Item_func_between*>(item))
{
emailWhereClauseUnitVector=Item_func_between_case(item);
ParseStackElement elem;
elem.emailWhereClauseUnitVector=emailWhereClauseUnitVector;
elem.item=(Item *)item;
parseStack->push(elem);
}
else if(dynamic_cast<const Item_cond*>(item))
{
const Item_cond *itemCondC=dynamic_cast<const Item_cond*>(item);
Item_cond *itemCond=(Item_cond *)itemCondC;
vector<EmailWhereClauseUnit> result;
bool isAnd=false;
if(dynamic_cast<const Item_cond_and*>(item))
{
isAnd=true;
}
if (!parseStack->empty()&&isChildOf((Item_cond*)item,parseStack->top().item))
{
result=parseStack->top().emailWhereClauseUnitVector;
parseStack->pop();
}
while (!parseStack->empty()&&isChildOf((Item_cond*)item,parseStack->top().item))
{
if(result.empty()&&!parseStack->top().emailWhereClauseUnitVector.empty())
{
result=parseStack->top().emailWhereClauseUnitVector;
}
else if(!result.empty()&&parseStack->top().emailWhereClauseUnitVector.empty())
{
result=result;
}
else
{
if(isAnd)
{
result=And(result,parseStack->top().emailWhereClauseUnitVector);
}
else
{
result=Or(result,parseStack->top().emailWhereClauseUnitVector);
}
}
parseStack->pop();
}
ParseStackElement elem;
elem.emailWhereClauseUnitVector=result;
elem.item=(Item *)item;
parseStack->push(elem);
}
}
以下函数提取用户在等号子句中输入的“from”、“to”、“cc”、“subject”、“body”或“message date”。代码检查该项是否为 Item_func 类型。然后它检查第一个参数是字段,第二个不是字段。然后它检查字段是“from”、“to”、“cc”、“subject”、“body”或“message date”。
vector<EmailWhereClauseUnit> Item_func_eq_case(const Item *item)
{
vector<EmailWhereClauseUnit> emailWhereClauseUnitVector;
const Item_func * itemFunction=dynamic_cast<const Item_func*>(item);
Item **arguments=0;
if(itemFunction)
{
arguments=itemFunction->arguments();
}
else
{
return emailWhereClauseUnitVector;
}
Item *field=0;
Item *value=0;
if(dynamic_cast <Item_field*>(arguments[0]))
{
field = arguments[0];
}
else
{
return emailWhereClauseUnitVector;
}
if(dynamic_cast <Item_field*>(arguments[1]))
{
return emailWhereClauseUnitVector;
}
else
{
value = arguments[1];
}
Item_field *f;
if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"from")==0)
{
EmailWhereClauseUnit myEmailWhereClauseUnit;
myEmailWhereClauseUnit.from=value->item_name.ptr();
myEmailWhereClauseUnit.to="";
myEmailWhereClauseUnit.cc="";
myEmailWhereClauseUnit.subject="";
myEmailWhereClauseUnit.body="";
myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
}
else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"to")==0)
{
EmailWhereClauseUnit myEmailWhereClauseUnit;
myEmailWhereClauseUnit.from="";
myEmailWhereClauseUnit.to=value->item_name.ptr();
myEmailWhereClauseUnit.cc="";
myEmailWhereClauseUnit.subject="";
myEmailWhereClauseUnit.body="";
myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
}
else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"cc")==0)
{
EmailWhereClauseUnit myEmailWhereClauseUnit;
myEmailWhereClauseUnit.from="";
myEmailWhereClauseUnit.to="";
myEmailWhereClauseUnit.cc=value->item_name.ptr();
myEmailWhereClauseUnit.subject="";
myEmailWhereClauseUnit.body="";
myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
}
else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"subject")==0)
{
EmailWhereClauseUnit myEmailWhereClauseUnit;
myEmailWhereClauseUnit.from="";
myEmailWhereClauseUnit.to="";
myEmailWhereClauseUnit.cc="";
myEmailWhereClauseUnit.subject=value->item_name.ptr();
myEmailWhereClauseUnit.body="";
myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
}
else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"body")==0)
{
EmailWhereClauseUnit myEmailWhereClauseUnit;
myEmailWhereClauseUnit.from="";
myEmailWhereClauseUnit.to="";
myEmailWhereClauseUnit.cc="";
myEmailWhereClauseUnit.subject="";
myEmailWhereClauseUnit.body=value->item_name.ptr();
myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
}
else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"messagedate")==0)
{
MYSQL_TIME emailDateTime={0};
value->get_date(&emailDateTime,0);
tm tempDateTime={0};
tempDateTime.tm_year=emailDateTime.year-1900;
tempDateTime.tm_mon=emailDateTime.month-1;
tempDateTime.tm_mday=emailDateTime.day;
tempDateTime.tm_hour=emailDateTime.hour;
tempDateTime.tm_min=emailDateTime.minute;
tempDateTime.tm_sec=emailDateTime.second;
EmailWhereClauseUnit myEmailWhereClauseUnit;
myEmailWhereClauseUnit.from="";
myEmailWhereClauseUnit.to="";
myEmailWhereClauseUnit.cc="";
myEmailWhereClauseUnit.subject="";
myEmailWhereClauseUnit.body="";
myEmailWhereClauseUnit.emailTimeStampStart=mktime(&tempDateTime);
myEmailWhereClauseUnit.emailTimeStampEnd=mktime(&tempDateTime);
emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
}
else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"UID")==0)
{
EmailWhereClauseUnit myEmailWhereClauseUnit;
myEmailWhereClauseUnit.from="";
myEmailWhereClauseUnit.to="";
myEmailWhereClauseUnit.cc="";
myEmailWhereClauseUnit.subject="";
myEmailWhereClauseUnit.body="";
myEmailWhereClauseUnit.uid=atoi(value->item_name.ptr());
myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
}
return emailWhereClauseUnitVector;
}
以下函数提取“in 子句”中指定的“from”、“to”、“cc”、“subject”、“body”或“message date”的集合。首先检查该项是否为 Item_func 类型。然后检查第一个参数是字段,并且后续参数不是字段类型。
vector<EmailWhereClauseUnit> Item_func_in_case(const Item *item)
{
vector<EmailWhereClauseUnit> emailWhereClauseUnitVector;
const Item_func * itemFunction=dynamic_cast<const Item_func*>(item);
Item **arguments=0;
int inArgcount=0;
if(itemFunction)
{
arguments=itemFunction->arguments();
inArgcount=itemFunction->arg_count;
}
else
{
return emailWhereClauseUnitVector;
}
Item *field=0;
Item *value=0;
if(dynamic_cast <Item_field*>(arguments[0]))
{
field = arguments[0];
}
else
{
return emailWhereClauseUnitVector;
}
if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"from")==0)
{
LONG index;
for (index = 1; index < inArgcount; index++)
{
if(dynamic_cast <Item_field*>(arguments[index]))
{
continue;
}
else
{
value = arguments[index];
}
EmailWhereClauseUnit myEmailWhereClauseUnit;
myEmailWhereClauseUnit.from=value->item_name.ptr();
myEmailWhereClauseUnit.to="";
myEmailWhereClauseUnit.cc="";
myEmailWhereClauseUnit.subject="";
myEmailWhereClauseUnit.body="";
myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
}
}
else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"to")==0)
{
LONG index;
for (index = 1; index < inArgcount; index++)
{
if(dynamic_cast <Item_field*>(arguments[index]))
{
continue;
}
else
{
value = arguments[index];
}
EmailWhereClauseUnit myEmailWhereClauseUnit;
myEmailWhereClauseUnit.from="";
myEmailWhereClauseUnit.to=value->item_name.ptr();
myEmailWhereClauseUnit.cc="";
myEmailWhereClauseUnit.subject="";
myEmailWhereClauseUnit.body="";
myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
}
}
else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"cc")==0)
{
LONG index;
for (index = 1; index < inArgcount; index++)
{
if(dynamic_cast <Item_field*>(arguments[index]))
{
continue;
}
else
{
value = arguments[index];
}
EmailWhereClauseUnit myEmailWhereClauseUnit;
myEmailWhereClauseUnit.from="";
myEmailWhereClauseUnit.to="";
myEmailWhereClauseUnit.cc=value->item_name.ptr();
myEmailWhereClauseUnit.subject="";
myEmailWhereClauseUnit.body="";
myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
}
}
else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"subject")==0)
{
LONG index;
for (index = 1; index < inArgcount; index++)
{
if(dynamic_cast <Item_field*>(arguments[index]))
{
continue;
}
else
{
value = arguments[index];
}
EmailWhereClauseUnit myEmailWhereClauseUnit;
myEmailWhereClauseUnit.from="";
myEmailWhereClauseUnit.to="";
myEmailWhereClauseUnit.cc="";
myEmailWhereClauseUnit.subject=value->item_name.ptr();
myEmailWhereClauseUnit.body="";
myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
}
}
else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"body")==0)
{
LONG index;
for (index = 1; index < inArgcount; index++)
{
if(dynamic_cast <Item_field*>(arguments[index]))
{
continue;
}
else
{
value = arguments[index];
}
EmailWhereClauseUnit myEmailWhereClauseUnit;
myEmailWhereClauseUnit.from="";
myEmailWhereClauseUnit.to="";
myEmailWhereClauseUnit.cc="";
myEmailWhereClauseUnit.subject="";
myEmailWhereClauseUnit.body=value->item_name.ptr();
myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
}
}
else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"messagedate")==0)
{
LONG index;
for (index = 1; index < inArgcount; index++)
{
if(dynamic_cast <Item_field*>(arguments[index]))
{
continue;
}
else
{
value = arguments[index];
}
MYSQL_TIME emailDateTime={0};
value->get_date(&emailDateTime,0);
tm tempDateTime={0};
tempDateTime.tm_year=emailDateTime.year-1900;
tempDateTime.tm_mon=emailDateTime.month-1;
tempDateTime.tm_mday=emailDateTime.day;
tempDateTime.tm_hour=emailDateTime.hour;
tempDateTime.tm_min=emailDateTime.minute;
tempDateTime.tm_sec=emailDateTime.second;
EmailWhereClauseUnit myEmailWhereClauseUnit;
myEmailWhereClauseUnit.from="";
myEmailWhereClauseUnit.to="";
myEmailWhereClauseUnit.cc="";
myEmailWhereClauseUnit.subject="";
myEmailWhereClauseUnit.body="";
myEmailWhereClauseUnit.emailTimeStampStart=mktime(&tempDateTime);
myEmailWhereClauseUnit.emailTimeStampEnd=mktime(&tempDateTime);
emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
}
}
return emailWhereClauseUnitVector;
}
以下函数提取“less 或 less than equal 子句”中指定的“message date”。首先检查该项是否为 Item_func 类型。然后检查第一个参数是字段,并且后续参数不是字段类型。Less than 子句被转换为 where 子句单元,其中 emailTimeStampStart 为零,emailTimeStampEnd 等于 less 子句中指定的值。
vector<EmailWhereClauseUnit> Item_func_less_case(const Item *item)
{
vector<EmailWhereClauseUnit> emailWhereClauseUnitVector;
const Item_func * itemFunction=dynamic_cast<const Item_func*>(item);
Item **arguments=0;
if(itemFunction)
{
arguments=itemFunction->arguments();
}
else
{
return emailWhereClauseUnitVector;
}
Item *field=0;
Item *value=0;
if(dynamic_cast <Item_field*>(arguments[0]))
{
field = arguments[0];
}
else
{
return emailWhereClauseUnitVector;
}
if(dynamic_cast <Item_field*>(arguments[1]))
{
return emailWhereClauseUnitVector;
}
else
{
value = arguments[1];
}
Item_field *f;
if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"messagedate")==0)
{
MYSQL_TIME emailDateTime={0};
value->get_date(&emailDateTime,0);
tm tempDateTime={0};
tempDateTime.tm_year=emailDateTime.year-1900;
tempDateTime.tm_mon=emailDateTime.month-1;
tempDateTime.tm_mday=emailDateTime.day;
tempDateTime.tm_hour=emailDateTime.hour;
tempDateTime.tm_min=emailDateTime.minute;
tempDateTime.tm_sec=emailDateTime.second;
EmailWhereClauseUnit myEmailWhereClauseUnit;
myEmailWhereClauseUnit.from="";
myEmailWhereClauseUnit.to="";
myEmailWhereClauseUnit.cc="";
myEmailWhereClauseUnit.subject="";
myEmailWhereClauseUnit.body="";
myEmailWhereClauseUnit.emailTimeStampStart=0;
myEmailWhereClauseUnit.emailTimeStampEnd=mktime(&tempDateTime);
emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
}
return emailWhereClauseUnitVector;
}
以下函数提取“greater 或 greater than equal 子句”中指定的 message date。首先检查该项是否为 Item_func 类型。然后检查第一个参数是字段,并且后续参数不是字段类型。Greater than 子句被转换为 where 子句单元,其中 emailTimeStampStart 为 greater 子句中指定的值,emailTimeStampEnd 等于 MYSQL_TIMESTAMP_MAX_VALUE 值。
vector<EmailWhereClauseUnit> Item_func_greater_case(const Item *item)
{
vector<EmailWhereClauseUnit> emailWhereClauseUnitVector;
const Item_func * itemFunction=dynamic_cast<const Item_func*>(item);
Item **arguments=0;
if(itemFunction)
{
arguments=itemFunction->arguments();
}
else
{
return emailWhereClauseUnitVector;
}
Item *field=0;
Item *value=0;
if(dynamic_cast <Item_field*>(arguments[0]))
{
field = arguments[0];
}
else
{
return emailWhereClauseUnitVector;
}
if(dynamic_cast <Item_field*>(arguments[1]))
{
return emailWhereClauseUnitVector;
}
else
{
value = arguments[1];
}
Item_field *f;
if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"messagedate")==0)
{
MYSQL_TIME emailDateTime={0};
value->get_date(&emailDateTime,0);
tm tempDateTime={0};
tempDateTime.tm_year=emailDateTime.year-1900;
tempDateTime.tm_mon=emailDateTime.month-1;
tempDateTime.tm_mday=emailDateTime.day;
tempDateTime.tm_hour=emailDateTime.hour;
tempDateTime.tm_min=emailDateTime.minute;
tempDateTime.tm_sec=emailDateTime.second;
EmailWhereClauseUnit myEmailWhereClauseUnit;
myEmailWhereClauseUnit.from="";
myEmailWhereClauseUnit.to="";
myEmailWhereClauseUnit.cc="";
myEmailWhereClauseUnit.subject="";
myEmailWhereClauseUnit.body="";
myEmailWhereClauseUnit.emailTimeStampStart=mktime(&tempDateTime);
myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
}
return emailWhereClauseUnitVector;
}
以下函数提取“between 子句”中指定的“message date”。首先检查该项是否为 Item_func 类型。然后检查第一个参数是字段,并且后续参数不是字段类型。“between”子句被转换为 where 子句单元,其中 emailTimeStampStart 和 emailTimeStampEnd 是“between”子句中指定的值。
vector<EmailWhereClauseUnit> Item_func_between_case(const Item *item)
{
vector<EmailWhereClauseUnit> emailWhereClauseUnitVector;
const Item_func * itemFunction=dynamic_cast<const Item_func*>(item);
Item **arguments=0;
if(itemFunction)
{
arguments=itemFunction->arguments();
}
else
{
return emailWhereClauseUnitVector;
}
Item *field=0;
Item *value1=0;
Item *value2=0;
if(dynamic_cast <Item_field*>(arguments[0]))
{
field = arguments[0];
}
else
{
return emailWhereClauseUnitVector;
}
if(dynamic_cast <Item_field*>(arguments[1]))
{
return emailWhereClauseUnitVector;
}
else
{
value1 = arguments[1];
}
if(dynamic_cast <Item_field*>(arguments[2]))
{
return emailWhereClauseUnitVector;
}
else
{
value2 = arguments[2];
}
Item_field *f;
if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"messagedate")==0)
{
MYSQL_TIME emailDateTime={0};
value1->get_date(&emailDateTime,0);
tm tempDateTime={0};
tempDateTime.tm_year=emailDateTime.year-1900;
tempDateTime.tm_mon=emailDateTime.month-1;
tempDateTime.tm_mday=emailDateTime.day;
tempDateTime.tm_hour=emailDateTime.hour;
tempDateTime.tm_min=emailDateTime.minute;
tempDateTime.tm_sec=emailDateTime.second;
EmailWhereClauseUnit myEmailWhereClauseUnit;
myEmailWhereClauseUnit.from="";
myEmailWhereClauseUnit.to="";
myEmailWhereClauseUnit.cc="";
myEmailWhereClauseUnit.subject="";
myEmailWhereClauseUnit.body="";
myEmailWhereClauseUnit.emailTimeStampStart=mktime(&tempDateTime);
value2->get_date(&emailDateTime,0);
tempDateTime.tm_year=emailDateTime.year-1900;
tempDateTime.tm_mon=emailDateTime.month-1;
tempDateTime.tm_mday=emailDateTime.day;
tempDateTime.tm_hour=emailDateTime.hour;
tempDateTime.tm_min=emailDateTime.minute;
tempDateTime.tm_sec=emailDateTime.second;
myEmailWhereClauseUnit.emailTimeStampEnd=mktime(&tempDateTime);
emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
}
return emailWhereClauseUnitVector;
}
以下是 My_Cond_traverser 用来检查父子关系的一个辅助函数。
bool isChildOf(Item_cond *parent,Item *child)
{
List_iterator<Item> li(*(parent->argument_list()));
Item *it= NULL;
while ((it= li++))
{
if(child==it)
return true;
}
return false;
}
以下函数将 EmailWhereClauseUnit 的向量 A 和向量 B 进行 AND 操作。在此函数中,两个向量都通过嵌套循环进行遍历。如果 A 中类型为 EmailWhereClauseUnit 的一个结构体与 B 中类型相同的另一个结构体匹配,则将此 EmailWhereClauseUnit 添加到结果向量中。通过检查 A 和 B 中属于 EmailWhereClauseUnit 类型的结构体的“from”、“to”、“cc”、“subject”、“body”和“message date”来找到匹配。如果两个“from”都不为空,则它们必须匹配,否则跳过它们。类似地,如果两个“to”都不为空,则它们必须匹配,否则跳过它们,依此类推。如果其中一个“from”为空而另一个不为空,则公共“from”等于非空的一个。类似地,如果其中一个“to”为空而另一个不为空,则公共“to”等于非空的一个。类似地,如果 A 和 B 中的 EmailWhereClauseUnit 的“message date”范围有重叠,则将公共值插入到结果向量中。
vector<EmailWhereClauseUnit> And(vector<EmailWhereClauseUnit> A,vector<EmailWhereClauseUnit> B)
{
vector<EmailWhereClauseUnit> result;
for(vector<EmailWhereClauseUnit>::iterator iter1=A.begin();iter1!=A.end();iter1++)
{
for(vector<EmailWhereClauseUnit>::iterator iter2=B.begin();iter2!=B.end();iter2++)
{
string from="";
if(iter1->from!=""&&iter2->from!="")
{
if(iter1->from==iter2->from)
{
from=iter1->from;
}
else
{
continue;
}
}
else if(iter1->from!=""&&iter2->from=="")
{
from=iter1->from;
}
else if(iter1->from==""&&iter2->from!="")
{
from=iter2->from;
}
else if(iter1->from==""&&iter2->from=="")
{
}
string to="";
if(iter1->to!=""&&iter2->to!="")
{
if(iter1->to==iter2->to)
{
to=iter1->to;
}
else
{
continue;
}
}
else if(iter1->to!=""&&iter2->to=="")
{
to=iter1->to;
}
else if(iter1->to==""&&iter2->to!="")
{
to=iter2->to;
}
else if(iter1->to==""&&iter2->to=="")
{
}
string cc="";
if(iter1->cc!=""&&iter2->cc!="")
{
if(iter1->cc==iter2->cc)
{
cc=iter1->cc;
}
else
{
continue;
}
}
else if(iter1->cc!=""&&iter2->cc=="")
{
cc=iter1->cc;
}
else if(iter1->cc==""&&iter2->cc!="")
{
cc=iter2->cc;
}
else if(iter1->cc==""&&iter2->cc=="")
{
}
string subject="";
if(iter1->subject!=""&&iter2->subject!="")
{
if(iter1->subject==iter2->subject)
{
subject=iter1->subject;
}
else
{
continue;
}
}
else if(iter1->subject!=""&&iter2->subject=="")
{
subject=iter1->subject;
}
else if(iter1->subject==""&&iter2->subject!="")
{
subject =iter2->subject;
}
else if(iter1->subject==""&&iter2->subject=="")
{
}
string body="";
if(iter1->body!=""&&iter2->body!="")
{
if(iter1->body==iter2->body)
{
body=iter1->body;
}
else
{
continue;
}
}
else if(iter1->body!=""&&iter2->body=="")
{
body=iter1->body;
}
else if(iter1->body==""&&iter2->body!="")
{
body=iter2->body;
}
else if(iter1->body==""&&iter2->body=="")
{
}
unsigned int emailTimeStampStart;
unsigned int emailTimeStampEnd;
bool common=FindAndBetweenTwoRanges(iter1->emailTimeStampStart,iter1->emailTimeStampEnd,iter2->emailTimeStampStart,iter2->emailTimeStampEnd,emailTimeStampStart,emailTimeStampEnd);
if(common==false)
{
continue;
}
EmailWhereClauseUnit myEmailWhereClauseUnit;
myEmailWhereClauseUnit.from=from;
myEmailWhereClauseUnit.to=to;
myEmailWhereClauseUnit.cc=cc;
myEmailWhereClauseUnit.subject=subject;
myEmailWhereClauseUnit.body=body;
myEmailWhereClauseUnit.emailTimeStampStart=emailTimeStampStart;
myEmailWhereClauseUnit.emailTimeStampEnd=emailTimeStampEnd;
result.push_back(myEmailWhereClauseUnit);
}
}
return result;
}
以下函数查找两个消息日期范围之间的重叠。如果范围 1 的开始小于范围 2 的开始,并且范围 2 的开始小于范围 1 的结束,则存在重叠。结果重叠是从范围 2 的开始到范围 1 的结束或范围 2 的结束(取决于哪个较小,反之亦然)。
bool FindAndBetweenTwoRanges(unsigned int range1Start,unsigned int range1End,unsigned int range2Start,unsigned int range2End,unsigned int &resRangeStart,unsigned int &resRangeEnd)
{
bool success=false;
if(range1Start<range2Start)
{
if(range2Start<=range1End)
{
resRangeStart=range2Start;
if(range1End<range2End)
{
resRangeEnd=range1End;
}
else
{
resRangeEnd=range2End;
}
success=true;
}
}
else
{
if(range1Start<=range2End)
{
resRangeStart=range1Start;
if(range2End<range1End)
{
resRangeEnd=range2End;
}
else
{
resRangeEnd=range1End;
}
success=true;
}
}
return success;
}
以下函数将 EmailWhereClauseUnit 的向量 A 和向量 B 进行 OR 操作。在此函数中,两个向量都通过嵌套循环进行遍历。如果 A 中类型为 EmailWhereClauseUnit 的一个结构体与 B 中类型相同的另一个结构体有共同点,则将公共值添加到结果向量中。公共点通过匹配 A 和 B 中属于 EmailWhereClauseUnit 类型的结构体的“from”、“to”、“cc”、“subject”、“body”和“message date”来找到。如果两个“from”都不为空,则它们必须匹配,否则跳过它们。类似地,如果两个“to”都不为空,则它们必须匹配,否则跳过它们。如果其中一个“from”为空而另一个不为空,则公共“from”为空。类似地,如果其中一个“to”为空而另一个不为空,则公共“to”为空。类似地,如果“message date”范围之间存在重叠,则将公共值插入到结果向量中,并从向量 A 和 B 中删除。嵌套循环完成后,遍历向量 A 并将所有元素添加到结果向量中。类似地,遍历向量 B 并将所有元素添加到结果向量中。
vector<EmailWhereClauseUnit> Or(vector<EmailWhereClauseUnit> A,vector<EmailWhereClauseUnit> B)
{
vector<EmailWhereClauseUnit> result;
for(vector<EmailWhereClauseUnit>::iterator iter1=A.begin();A.size()>0&&iter1!=A.end();iter1++)
{
for(vector<EmailWhereClauseUnit>::iterator iter2=B.begin();B.size()>0&&iter2!=B.end();iter2++)
{
bool commonFound=false;
string from="";
if(iter1->from!=""&&iter2->from!="")
{
if(iter1->from==iter2->from)
{
from=iter1->from;
commonFound=true;
}
else
{
continue;
}
}
string to="";
if(iter1->to!=""&&iter2->to!="")
{
if(iter1->to==iter2->to)
{
to=iter1->to;
commonFound=true;
}
else
{
continue;
}
}
string cc="";
if(iter1->cc!=""&&iter2->cc!="")
{
if(iter1->cc==iter2->cc)
{
cc=iter1->cc;
commonFound=true;
}
else
{
continue;
}
}
string subject="";
if(iter1->subject!=""&&iter2->subject!="")
{
if(iter1->subject==iter2->subject)
{
subject=iter1->subject;
commonFound=true;
}
else
{
continue;
}
}
string body="";
if(iter1->body!=""&&iter2->body!="")
{
if(iter1->body==iter2->body)
{
body=iter1->body;
commonFound=true;
}
else
{
continue;
}
}
unsigned int emailTimeStampStart;
unsigned int emailTimeStampEnd;
if(FindOrBetweenTwoRanges(iter1->emailTimeStampStart,iter1->emailTimeStampEnd,iter2->emailTimeStampStart,iter2->emailTimeStampEnd,emailTimeStampStart,emailTimeStampEnd))
{
commonFound=true;
}
if(commonFound==true)
{
EmailWhereClauseUnit myEmailWhereClauseUnit;
myEmailWhereClauseUnit.from=from;
myEmailWhereClauseUnit.to=to;
myEmailWhereClauseUnit.cc=cc;
myEmailWhereClauseUnit.subject=subject;
myEmailWhereClauseUnit.body=body;
myEmailWhereClauseUnit.emailTimeStampStart=emailTimeStampStart;
myEmailWhereClauseUnit.emailTimeStampEnd=emailTimeStampEnd;
result.push_back(myEmailWhereClauseUnit);
A.erase(iter1);
B.erase(iter2);
iter1=A.begin();
iter2=B.begin();
}
}
}
for(vector<EmailWhereClauseUnit>::iterator iter1=A.begin();iter1!=A.end();iter1++)
{
result.push_back(*iter1);
}
for(vector<EmailWhereClauseUnit>::iterator iter2=B.begin();iter2!=B.end();iter2++)
{
result.push_back(*iter2);
}
return result;
}
以下函数查找两个薪资范围之间的 OR。如果范围 1 的开始小于范围 2 的开始,并且范围 2 的开始小于范围 1 的结束,则存在重叠。结果重叠是从范围 1 的开始到范围 1 的结束或范围 2 的结束(取决于哪个较大,反之亦然)。
bool FindOrBetweenTwoRanges(unsigned int range1Start,unsigned int range1End,unsigned int range2Start,unsigned int range2End, unsigned int &resRangeStart,unsigned int &resRangeEnd)
{
bool success=false;
if(range1Start<range2Start)
{
if(range2Start<=range1End)
{
resRangeStart=range1Start;
if(range1End>=range2End)
{
resRangeEnd=range1End;
}
else
{
resRangeEnd=range2End;
}
success=true;
}
}
else
{
if(range1Start<=range2End)
{
resRangeStart=range2Start;
if(range2End>=range1End)
{
resRangeEnd=range2End;
}
else
{
resRangeEnd=range1End;
}
success=true;
}
}
return success;
}
当 traverse_cond 函数完成时,会检查解析堆栈是否只包含一个类型为 ParseStackElement 的元素。ParseStackElement 中的 emailWhereClauseUnitVector 向量被传递给 GetEmails。
void GetEmails(vector<EmailWhereClauseUnit> &emailWhereClauseUnitVector,vector<Email> &emails)
{
set<int> emailIdList;
EmailWhereClauseUnitVectorToEmailsIdList(emailWhereClauseUnitVector,emailIdList);
cImap *i_Imap=InitImap();
if(i_Imap!=NULL)
{
try
{
for (set<int>::iterator iter=emailIdList.begin();iter!=emailIdList.end();iter++)
{
GuardPtr<cEmailParser> i_Email = i_Imap->FetchEmailAt(*iter); // Sends here the FETCH HEADER command
Email myEmail;
GetEmail(myEmail,i_Email,*iter);
emails.push_back(myEmail);
}
i_Imap->Close(); // Close the connection to the server
delete i_Imap;
}
catch(...)
{
}
}
}
GetEmails 函数调用 EmailWhereClauseUnitVectorToEmailsIdList,后者遍历 emailWhereClauseUnitVector 向量并准备 IMAP 搜索字符串。GetEmail 函数在调用该函数后,会遍历 emailIdList 并获取电子邮件,然后调用 GetEmail 函数,该函数填充 Email 结构体,然后将其推入 emails 向量。
void EmailWhereClauseUnitVectorToEmailsIdList(vector<EmailWhereClauseUnit> &emailWhereClauseUnitVector,set<int> &emailIdList)
{
vector<EmailWhereClauseUnit>::iterator iter=emailWhereClauseUnitVector.begin();
for(;iter!=emailWhereClauseUnitVector.end();iter++)
{
string search="SEARCH ";
if(!iter->from.empty())
{
search=search+"FROM ";
search=search+iter->from;
search=search+" ";
}
if(!iter->to.empty())
{
search=search+"TO ";
search=search+iter->to;
search=search+" ";
}
if(!iter->cc.empty())
{
search=search+"CC ";
search=search+iter->cc;
search=search+" ";
}
if(!iter->subject.empty())
{
search=search+"SUBJECT \"";
search=search+iter->subject;
search=search+"\" ";
}
if(!iter->body.empty())
{
search=search+"BODY ";
search=search+iter->body;
search=search+" ";
}
struct tm timeinfo={0};
char buffer [80];
iter->emailTimeStampStart=iter->emailTimeStampStart-60*60*24;
_localtime64_s( &timeinfo, &iter->emailTimeStampStart );
strftime (buffer,80,"%d-%b-%Y",&timeinfo);
search=search+"SINCE ";
search=search+buffer;
iter->emailTimeStampEnd=iter->emailTimeStampEnd+60*60*24;
_localtime64_s( &timeinfo, &iter->emailTimeStampEnd );
strftime (buffer,80,"%d-%b-%Y",&timeinfo);
search=search+" BEFORE ";
search=search+buffer;
char *response=CurlClientResponse(search);
if(response&&_stricmp(response,"* SEARCH\r\n")!=0)
{
std::string s=response;
std::istringstream iss(s);
std::string sub;
iss>>sub;
iss>>sub;
while(iss)
{
iss>>sub;
if(!sub.empty())
{
emailIdList.insert(atoi(sub.c_str())-1);
}
}
}
}
}
如果用户发出 select * from email 查询,则会调用 GetAllEmails 函数,该函数获取收件箱中的所有电子邮件。
void GetAllEmails(vector<Email> &emails)
{
cImap *i_Imap=InitImap();
if(i_Imap!=NULL)
{
try
{
int s32_EmailCount = i_Imap->GetEmailCount();
for (int M=s32_EmailCount-1; M>=0; M--)
{
GuardPtr<cEmailParser> i_Email = i_Imap->FetchEmailAt(M);
Email myEmail;
GetEmail(myEmail,i_Email,M);
emails.push_back(myEmail);
}
i_Imap->Close();
delete i_Imap;
}
catch(...)
{
}
}
}
MySQL 调用 rnd_next 来提供下一行。如果 m_EmailVectorIndex 小于 m_EmailVector 的大小,表示还有要提供的行,则调用 field->store 来存储 uid、from、to、cc、subject、body 和 message date。之后,将 m_EmailVectorIndex 递增。如果没有更多电子邮件要提供,则返回 HA_ERR_END_OF_FILE,否则返回 0。
int ha_example::rnd_next(uchar *buf)
{
int rc;
DBUG_ENTER("ha_example::rnd_next");
MYSQL_READ_ROW_START(table_share->db.str, table_share->table_name.str,
TRUE);
if(m_EmailVectorIndex<m_EmailVector.size())
{
Field **field=table->field;
bitmap_set_bit(table->write_set, (*field)->field_index);
(*field)->set_notnull();
(*field)->store(m_EmailVector[m_EmailVectorIndex].uid,true);
field++;
bitmap_set_bit(table->write_set, (*field)->field_index);
(*field)->set_notnull();
(*field)->store(m_EmailVector[m_EmailVectorIndex].from.c_str(),strlen(m_EmailVector[m_EmailVectorIndex].from.c_str()), system_charset_info);
field++;
bitmap_set_bit(table->write_set, (*field)->field_index);
(*field)->set_notnull();
(*field)->store(m_EmailVector[m_EmailVectorIndex].to.c_str(),strlen(m_EmailVector[m_EmailVectorIndex].to.c_str()), system_charset_info);
field++;
bitmap_set_bit(table->write_set, (*field)->field_index);
(*field)->set_notnull();
(*field)->store(m_EmailVector[m_EmailVectorIndex].cc.c_str(),strlen(m_EmailVector[m_EmailVectorIndex].cc.c_str()), system_charset_info);
field++;
bitmap_set_bit(table->write_set, (*field)->field_index);
(*field)->set_notnull();
(*field)->store(m_EmailVector[m_EmailVectorIndex].subject.c_str(),strlen(m_EmailVector[m_EmailVectorIndex].subject.c_str()), system_charset_info);
field++;
bitmap_set_bit(table->write_set, (*field)->field_index);
(*field)->set_notnull();
(*field)->store(m_EmailVector[m_EmailVectorIndex].body.c_str(),strlen(m_EmailVector[m_EmailVectorIndex].body.c_str()), system_charset_info);
field++;
bitmap_set_bit(table->write_set, (*field)->field_index);
(*field)->set_notnull();
MYSQL_TIME myTime={0};
myTime.year=m_EmailVector[m_EmailVectorIndex].receiveTime.getYear();
myTime.month=m_EmailVector[m_EmailVectorIndex].receiveTime.getMonth();
myTime.day=m_EmailVector[m_EmailVectorIndex].receiveTime.getDay();
myTime.hour=m_EmailVector[m_EmailVectorIndex].receiveTime.getHour();
myTime.minute=m_EmailVector[m_EmailVectorIndex].receiveTime.getMinute();
myTime.second=m_EmailVector[m_EmailVectorIndex].receiveTime.getSecond();
(*field)->store_time(&myTime);
field++;
m_EmailVectorIndex++;
rc=0;
}
else
{
rc= HA_ERR_END_OF_FILE;
}
MYSQL_READ_ROW_DONE(rc);
DBUG_RETURN(rc);
}
如何运行
- 如果您还没有 Gmail 帐户,请创建一个。
- 从以下链接下载 MySQL 5.6.22。
- https://dev.mysqlserver.cn/downloads/mysql/
- 请确保下载 32 位版本 (mysql-5.6.22-win32.zip)。
- 将 EmailSearchEngine.dll 从 EmailSearchEngineBinaries.zip 复制过来。
- 转到您解压 MYSQL 的目录 (mysql-5.6.22-win32)。
- 找到子目录 lib 并打开它。
- 找到并打开里面的 plugin 目录。
- 粘贴 EmailSearchEngine.dll。
- 将 credentials.txt 从 EmailSearchEngineBinaries.zip 复制过来。找到子目录 data 并粘贴它。
- 打开 credentials.txt,将第一行替换为您的电子邮件地址,将第二行替换为您的 Gmail 密码。
- 从 EmailSearchEngineBinaries.zip 复制 curllib.dll、libcurl.dll、libeay32.dll、libgsasl-7.dll、libidn-11.dll、librtmp.dll、libsasl.dll、libssh2.dll、openldap.dll、ssleay32.dll、vmime.NET.dll、zlib1.dll。
- 找到子目录 bin 并打开它。
- 粘贴复制的 dll 文件。
- 启动 mysqld.exe (以管理员身份运行)。
- 启动命令提示符 (以管理员身份运行)。
- 更改目录到 mysql-5.6.22-win32/bin。
- 运行以下命令
- mysql –uroot
- 这将启动 MYSQL 控制台。
- 运行以下命令。
- Install plugin EmailSearchEngine soname ' EmailSearchEngine.dll';
- 这将安装 EmailSearchEngine。
- 创建一个名为 test 的数据库(如果尚不存在)。
- 现在创建一个名为 Email 的表,并将 EmailSearchEngine 指定为存储引擎。
CREATE TABLE `email` (
`UID` int(10) unsigned DEFAULT NULL,
`from` varchar(1024) DEFAULT NULL,
`to` varchar(1024) DEFAULT NULL,
`cc` varchar(1024) DEFAULT NULL,
`subject` varchar(1024) DEFAULT NULL,
`body` LONGTEXT,
`messagedate` datetime DEFAULT NULL
) ENGINE=EmailSearchEngine;
在继续之前,如果您在使用防火墙,请允许 mysqld.exe 通过。
输入以下查询,但请将主题更改为您收件箱中的一个邮件。
Select `uid`,`from`,`to`,`cc`,`subject` from email where messagedate>="2014-12-06 00:00:00" and messagedate<="2014-12-06 23:59:59" and subject='Google Account password changed';
以下是一些示例查询:
Select subject from email where messagedate>="2015-01-09 00:00:00" and messagedate<="2015-01-09 23:59:59" and subject like '%Important%';
Select count(*) from email where messagedate>="2015-01-09 00:00:00" and messagedate<="2015-01-09 23:59:59";
现在输入您选择的任何查询,然后查看结果。
在哪里获取源代码
- 从以下链接下载 MYSQL 源代码。
- https://dev.mysqlserver.cn/downloads/mysql/
- 将源代码提取到 C:/mysql-5.6.22。
如何构建源代码
- 请按照以下地址提供的说明进行操作:
- https://dev.mysqlserver.cn/doc/refman/5.6/en/installing-source-distribution.html
- 我使用 Visual Studio 2012 按照以下说明构建了源代码:
- cmake . -G "Visual Studio 11"
- 下载本文附件名为 SourceCodeEmailSearchEngine.zip 的文件。
- 它包含 example 文件夹中的 EmailSearchEngine 源代码。
- 将 example 文件夹复制到 mysql-5.6.22\storage。
结论
我们已经找到了一种方法,可以利用 MySQL 解析器的强大功能来分析您的电子邮件,这是以前从未做过的。这为新类型的分析打开了视野。这种方法可以推广。我将在本系列文章的后续部分中继续介绍这种方法。
参考文献
[1] vmime.NET - Smtp, Pop3, Imap Library (for C++ and .NET)
https://codeproject.org.cn/Articles/719490/vmime-NET-Smtp-Pop-Imap-Library-for-Cplusplus-and
[2] Building on My SQL Storage Engines and APIs to Conduct Complex Analysis Using SQL Query Language: Part 4
https://codeproject.org.cn/Articles/796061/Building-on-My-SQL-Storage-Engines-and-APIs-to-C
[3] Building on My SQL Storage Engines and APIs to Conduct Complex Analysis Using SQL Query Language: Part 3
https://codeproject.org.cn/Articles/768518/Building-on-My-SQL-Storage-Engines-and-APIs-to-C
[4] Building on My SQL Storage Engines and APIs to Conduct Complex Analysis Using SQL Query Language: Part 2
https://codeproject.org.cn/Articles/732116/Building-on-My-SQL-Storage-Engines-and-APIs-to-Con
[5] Building on My SQL Storage Engines and API’s to conduct complex analysis using SQL query language
https://codeproject.org.cn/Articles/715914/Building-on-My-SQL-Storage-Engines-and-API-s-to-co