65.9K
CodeProject 正在变化。 阅读更多。
Home

电子邮件搜索引擎

starIconstarIconstarIconstarIconstarIcon

5.00/5 (11投票s)

2015年1月11日

CPOL

11分钟阅读

viewsIcon

23805

downloadIcon

408

基于 MySQL 存储引擎和 API 进行复杂分析,使用 SQL 查询语言:第 5 部分。

引言

MySQL 支持许多存储引擎,它们负责存储和检索数据。MyISAM 和 InnoDB 是常见的。在创建表时可以指定引擎。基本的存储引擎允许只读表。大多数引擎从文件中读取数据并提供给 MYSQL。另一方面,我们的想法是从 API 中检索数据并将其提供给 MYSQL。这将使我们能够使用 SQL 对从 API 检索到的数据进行复杂分析。想法是为每组相关的 API 编写一个引擎。在本系列文章的第 4 部分中,我介绍了 MySQL 存储引擎的概念,该引擎在 MySQL 控制台上公开 NOSQL 数据,供您进行复杂分析。这允许对来自Parse.com的数据进行复杂分析。本系列文章继续进行,这次介绍了 MySQL 存储引擎的概念,该引擎在 MySQL 控制台上公开您的电子邮件,供您进行复杂分析。具体来说,这允许对您 Gmail 收件箱中的电子邮件进行复杂分析。

背景

一种常规方法是将电子邮件从 Gmail 导出并插入到 MySQL 中。

  • 这种方法的一个缺点是,每次邮件到达或被删除时,您都必须向表中插入和删除行。
  • 另一个缺点是,在查询表之前,您需要触发该日志应用程序来插入和删除行。
  • 另一个缺点是,即使您只需要检索几行,所有电子邮件都保留在表中。
  • 另一个缺点是,在插入或删除期间,表可能会被锁定,这会给检索实体带来性能瓶颈。
  • 另一个缺点是数据重复
  • 如果您的日志应用程序(负责插入和删除电子邮件)停运了一周,那么您将不得不等待一段时间才能检索到今天的电子邮件。

相反,提议的方法通过以下方式避免了这些问题:

  • 无需向表中插入和删除行。
  • 在查询表之前,无需触发日志应用程序。
  • 不需要在只需要少量电子邮件时获取所有电子邮件。
  • 没有瓶颈,因为没有插入和删除。
  • 无需等待日志应用程序在一周内插入电子邮件。

此外,提议的方法优于 Gmail 的默认搜索功能,因为它利用了 MySQL 解析器的强大功能来分析您的电子邮件,这是以前从未做过的。

电子邮件搜索引擎

这个 MySQL 存储引擎与第 4 部分中介绍的存储引擎不同,因为它从 Gmail 检索数据并将其提供给 MySQL。创建了一个名为 Email 的表,其引擎为 EmailSearchEngine。

 

CREATE TABLE `email` (
  `UID` int(10) unsigned DEFAULT NULL,
  `from` varchar(1024) DEFAULT NULL,
  `to` varchar(1024) DEFAULT NULL,
  `cc` varchar(1024) DEFAULT NULL,
  `subject` varchar(1024) DEFAULT NULL,
  `body` LONGTEXT,
  `messagedate` datetime DEFAULT NULL
) ENGINE=EmailSearchEngine;

 当用户在 Email 表上输入 select 查询时,控制权会传递到 rnd_init 方法中的 EmailSearchEngine。此方法检查表名是否确实是“Email”。然后,它检查 where 子句解析树是否存在。如果存在,则遍历 where 子句解析树以转换为 IMAP 搜索命令。否则,收件箱中的所有电子邮件都将提供给 MySQL。为了检索电子邮件,我们使用 vmime.NET [1],这是一个面向 C++ 和 .NET Windows 程序员的库,允许通过 SMTP、POP3 和 IMAP(支持 TLS 和 SSL)发送和下载电子邮件。
在继续 rnd_init 之前,需要解释以下数据结构。

vector<Email> m_EmailVector;
//(belongs to ha_example class and maintains all emails for the query)
int m_EmailVectorIndex; 
//(belongs to ha_example class and maintains the index to m_EmailVector)

struct Email
{
    int uid;
    string from;
    string to;
    string cc;
    string subject;
    string body;
    vmime::datetime receiveTime;
};

 

int ha_example::rnd_init(bool scan)
{
    DBUG_ENTER("ha_example::rnd_init");
    
    if(_stricmp(table_share->table_name.str,"Email")==0)
    {
        THD *thd=this->ha_thd();
        SELECT_LEX *select_lex=&thd->lex->select_lex;
        m_EmailVectorIndex=0;
        m_EmailVector.clear();
        if(select_lex->where==0)
        {
            stack<ParseStackElement> parseStack;
            ParseStackElement topElem;//=parseStack.top();
            GetAllEmails(m_EmailVector);
        }
        else
        {
            stack<ParseStackElement> parseStack;
            select_lex->where->traverse_cond(My_Cond_traverser,(void*)&parseStack,Item::traverse_order::POSTFIX);
            if(!parseStack.empty()&&parseStack.size()==1)
            {
                ParseStackElement topElem=parseStack.top();
                if(topElem.emailWhereClauseUnitVector.size()==1)
                {
                    if(topElem.emailWhereClauseUnitVector[0].uid!=-1&&topElem.emailWhereClauseUnitVector[0].from==""&&topElem.emailWhereClauseUnitVector[0].to==""&&topElem.emailWhereClauseUnitVector[0].cc==""&&topElem.emailWhereClauseUnitVector[0].subject==""&&topElem.emailWhereClauseUnitVector[0].body=="")
                    {
                        GetSingleEmail(topElem.emailWhereClauseUnitVector[0].uid,m_EmailVector);
                    }
                    else
                    {
                        GetEmails(topElem.emailWhereClauseUnitVector,m_EmailVector);
                    }
                }
                else
                {
                    GetEmails(topElem.emailWhereClauseUnitVector,m_EmailVector);
                }
            }
            else
            {
                GetAllEmails(m_EmailVector);
            }
        }
    }
    DBUG_RETURN(0);
}

在 where 子句解析树的后缀遍历期间,创建并维护 ParseStackElement 结构体堆栈。

struct ParseStackElement
{
    vector<EmailWhereClauseUnit> emailWhereClauseUnitVector;
    Item *item;
};

ParseStackElement 包含 EmailWhereClauseUnit 类型的向量。

class EmailWhereClauseUnit
{
public:
    int uid;
    string from;
    string to;
    string cc;
    string subject;
    string body;
    time_t emailTimeStampStart;
    time_t emailTimeStampEnd;
    EmailWhereClauseUnit()
    {
        uid=-1;
    }
};

EmailWhereClauseUnit 是 where 子句的基本单元,包含 uid、from、to、cc、subject、body 和 messagedate。例如,考虑以下 where 子句:

Select * from email where `from`="example@yahoo.com" and messagedate >="2014-12-06 00:00:00" and messagedate<="2014-12-06 23:59:59";

等效的 EmailWhereClauseUnit 是:

  • from=example@yahoo.com
  • emailTimeStampStart=2014-12-06 00:00:00
  • emailTimeStampEnd=2014-12-06 23:59:59

rnd_init 调用 traverse_cond 在 where 子句解析树的根节点上,以便以后缀顺序遍历它。My_Cond_traverser 函数被指定在遍历期间调用。将解析堆栈传递给此函数。此函数检查以下内容:

  • In 操作符:在这种情况下,调用 Item_func_in_case 来获取“in 子句”中指定的“from”、“to”、“cc”、“subject”、“body”或“message date”的集合,并将其推入堆栈。
  • Equality 操作符:在这种情况下,调用 Item_func_eq_case 来获取指定的“from”、“to”、“cc”、“subject”、“body”或“message date”,并将其推入堆栈。
  • Less 或 Less and equal 操作符:在这种情况下,调用 Item_func_less_case 来获取指定的“message date”范围,该范围以零开头并推入堆栈。
  • Greater 或 greater and equal 操作符:在这种情况下,调用 Item_func_greater_case 来获取指定的“message date”范围,该范围以 MYSQL_TIMESTAMP_MAX_VALUE 结尾并推入堆栈。
  • Between 操作符:在这种情况下,调用 Item_func_between_case 来获取指定的“message date”范围并推入堆栈。
  • And/Or 操作符:在这种情况下,弹出堆栈直到出现项目(My_Cond_traverser 的第一个参数)的子项。在弹出过程中,将 EmailWhereClauseUnit 类型的向量进行 AND 和 OR 操作,以获得结果集。
void My_Cond_traverser (const Item *item, void *arg)
{
    stack<ParseStackElement> *parseStack=(stack<ParseStackElement> *)arg;
    vector<EmailWhereClauseUnit> emailWhereClauseUnitVector;
    if(dynamic_cast<const Item_func_in*>(item))
    {
        emailWhereClauseUnitVector=Item_func_in_case(item);
        ParseStackElement elem;
        elem.emailWhereClauseUnitVector=emailWhereClauseUnitVector;
        elem.item=(Item *)item;
        parseStack->push(elem);
    }
    else if(dynamic_cast<const Item_func_eq*>(item))
    {
        emailWhereClauseUnitVector=Item_func_eq_case(item);
        ParseStackElement elem;
        elem.emailWhereClauseUnitVector=emailWhereClauseUnitVector;
        elem.item=(Item *)item;
        parseStack->push(elem);
    }
    else if(dynamic_cast<const Item_func_ge*>(item)||dynamic_cast<const Item_func_gt*>(item))
    {
        emailWhereClauseUnitVector=Item_func_greater_case(item);
        ParseStackElement elem;
        elem.emailWhereClauseUnitVector=emailWhereClauseUnitVector;
        elem.item=(Item *)item;
        parseStack->push(elem);
    }
    else if(dynamic_cast<const Item_func_le*>(item)||dynamic_cast<const Item_func_lt*>(item))
    {
        emailWhereClauseUnitVector=Item_func_less_case(item);
        ParseStackElement elem;
        elem.emailWhereClauseUnitVector=emailWhereClauseUnitVector;
        elem.item=(Item *)item;
        parseStack->push(elem);
    }
    else if(dynamic_cast<const Item_func_between*>(item))
    {
        emailWhereClauseUnitVector=Item_func_between_case(item);
        ParseStackElement elem;
        elem.emailWhereClauseUnitVector=emailWhereClauseUnitVector;
        elem.item=(Item *)item;
        parseStack->push(elem);
    }
    else if(dynamic_cast<const Item_cond*>(item))
    {
        const Item_cond *itemCondC=dynamic_cast<const Item_cond*>(item);
        Item_cond *itemCond=(Item_cond *)itemCondC;
        
        vector<EmailWhereClauseUnit> result;
        bool isAnd=false;
        if(dynamic_cast<const Item_cond_and*>(item))
        {
            isAnd=true;
        }
        if (!parseStack->empty()&&isChildOf((Item_cond*)item,parseStack->top().item))
        {
            result=parseStack->top().emailWhereClauseUnitVector;
            parseStack->pop();
        }
        while (!parseStack->empty()&&isChildOf((Item_cond*)item,parseStack->top().item))    
        {
            if(result.empty()&&!parseStack->top().emailWhereClauseUnitVector.empty())
            {
                result=parseStack->top().emailWhereClauseUnitVector;
            }
            else if(!result.empty()&&parseStack->top().emailWhereClauseUnitVector.empty())
            {
                result=result;
            }
            else
            {
                if(isAnd)
                {
                    result=And(result,parseStack->top().emailWhereClauseUnitVector);
                }
                else
                {
                    result=Or(result,parseStack->top().emailWhereClauseUnitVector);
                }
            }
            parseStack->pop();            
        }
        ParseStackElement elem;
        elem.emailWhereClauseUnitVector=result;
        elem.item=(Item *)item;
        parseStack->push(elem);
    }
}

以下函数提取用户在等号子句中输入的“from”、“to”、“cc”、“subject”、“body”或“message date”。代码检查该项是否为 Item_func 类型。然后它检查第一个参数是字段,第二个不是字段。然后它检查字段是“from”、“to”、“cc”、“subject”、“body”或“message date”。

vector<EmailWhereClauseUnit> Item_func_eq_case(const Item *item)
{
    vector<EmailWhereClauseUnit> emailWhereClauseUnitVector;
    const Item_func * itemFunction=dynamic_cast<const Item_func*>(item);
    Item **arguments=0;
    if(itemFunction)
    {
        arguments=itemFunction->arguments();
    }
    else
    {
        return emailWhereClauseUnitVector;
    }
    Item *field=0;
    Item *value=0;
    if(dynamic_cast <Item_field*>(arguments[0]))
    {
        field = arguments[0];
    }
    else
    {
        return emailWhereClauseUnitVector;
    }
    if(dynamic_cast <Item_field*>(arguments[1]))
    {
        return emailWhereClauseUnitVector;
    }
    else
    {
        value = arguments[1];
    }
    Item_field *f;

    if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"from")==0)
    {
        EmailWhereClauseUnit myEmailWhereClauseUnit;
        myEmailWhereClauseUnit.from=value->item_name.ptr();
        myEmailWhereClauseUnit.to="";
        myEmailWhereClauseUnit.cc="";
        myEmailWhereClauseUnit.subject="";
        myEmailWhereClauseUnit.body="";
        myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
        myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
        emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
    }
    else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"to")==0)
    {
        EmailWhereClauseUnit myEmailWhereClauseUnit;
        myEmailWhereClauseUnit.from="";
        myEmailWhereClauseUnit.to=value->item_name.ptr();
        myEmailWhereClauseUnit.cc="";
        myEmailWhereClauseUnit.subject="";
        myEmailWhereClauseUnit.body="";
        myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
        myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
        emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
    }
    else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"cc")==0)
    {
        EmailWhereClauseUnit myEmailWhereClauseUnit;
        myEmailWhereClauseUnit.from="";
        myEmailWhereClauseUnit.to="";
        myEmailWhereClauseUnit.cc=value->item_name.ptr();
        myEmailWhereClauseUnit.subject="";
        myEmailWhereClauseUnit.body="";
        myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
        myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
        emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
    }
    else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"subject")==0)
    {
        EmailWhereClauseUnit myEmailWhereClauseUnit;
        myEmailWhereClauseUnit.from="";
        myEmailWhereClauseUnit.to="";
        myEmailWhereClauseUnit.cc="";
        myEmailWhereClauseUnit.subject=value->item_name.ptr();
        myEmailWhereClauseUnit.body="";
        myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
        myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
        emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
    }
    else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"body")==0)
    {
        EmailWhereClauseUnit myEmailWhereClauseUnit;
        myEmailWhereClauseUnit.from="";
        myEmailWhereClauseUnit.to="";
        myEmailWhereClauseUnit.cc="";
        myEmailWhereClauseUnit.subject="";
        myEmailWhereClauseUnit.body=value->item_name.ptr();
        myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
        myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
        emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
    }
    else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"messagedate")==0)
    {
        MYSQL_TIME emailDateTime={0};
        value->get_date(&emailDateTime,0);
        tm tempDateTime={0};
        tempDateTime.tm_year=emailDateTime.year-1900;
        tempDateTime.tm_mon=emailDateTime.month-1;
        tempDateTime.tm_mday=emailDateTime.day;
        tempDateTime.tm_hour=emailDateTime.hour;
        tempDateTime.tm_min=emailDateTime.minute;
        tempDateTime.tm_sec=emailDateTime.second;
        EmailWhereClauseUnit myEmailWhereClauseUnit;
        myEmailWhereClauseUnit.from="";
        myEmailWhereClauseUnit.to="";
        myEmailWhereClauseUnit.cc="";
        myEmailWhereClauseUnit.subject="";
        myEmailWhereClauseUnit.body="";
        myEmailWhereClauseUnit.emailTimeStampStart=mktime(&tempDateTime);
        myEmailWhereClauseUnit.emailTimeStampEnd=mktime(&tempDateTime);
        emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
    }
    else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"UID")==0)
    {
        EmailWhereClauseUnit myEmailWhereClauseUnit;
        myEmailWhereClauseUnit.from="";
        myEmailWhereClauseUnit.to="";
        myEmailWhereClauseUnit.cc="";
        myEmailWhereClauseUnit.subject="";
        myEmailWhereClauseUnit.body="";
        myEmailWhereClauseUnit.uid=atoi(value->item_name.ptr());
        myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
        myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
        emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
    }
    return emailWhereClauseUnitVector;
}

以下函数提取“in 子句”中指定的“from”、“to”、“cc”、“subject”、“body”或“message date”的集合。首先检查该项是否为 Item_func 类型。然后检查第一个参数是字段,并且后续参数不是字段类型。

vector<EmailWhereClauseUnit> Item_func_in_case(const Item *item)
{
    vector<EmailWhereClauseUnit> emailWhereClauseUnitVector;
    const Item_func * itemFunction=dynamic_cast<const Item_func*>(item);
    Item **arguments=0;
    int inArgcount=0;
    if(itemFunction)
    {
        arguments=itemFunction->arguments();
        inArgcount=itemFunction->arg_count;
    }
    else
    {
        return emailWhereClauseUnitVector;
    }
    Item *field=0;
    Item *value=0;
    if(dynamic_cast <Item_field*>(arguments[0]))
    {
        field = arguments[0];
    }
    else
    {
        return emailWhereClauseUnitVector;
    }
    if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"from")==0)
    {
        LONG index;
        for (index = 1; index < inArgcount; index++)
        {
            if(dynamic_cast <Item_field*>(arguments[index]))
            {
                continue;
            }
            else
            {
                value = arguments[index];
            }
            EmailWhereClauseUnit myEmailWhereClauseUnit;
            myEmailWhereClauseUnit.from=value->item_name.ptr();
            myEmailWhereClauseUnit.to="";
            myEmailWhereClauseUnit.cc="";
            myEmailWhereClauseUnit.subject="";
            myEmailWhereClauseUnit.body="";
            myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
            myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
            emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
        }
    }
    else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"to")==0)
    {
        LONG index;
        for (index = 1; index < inArgcount; index++)
        {
            if(dynamic_cast <Item_field*>(arguments[index]))
            {
                continue;
            }
            else
            {
                value = arguments[index];
            }
            EmailWhereClauseUnit myEmailWhereClauseUnit;
            myEmailWhereClauseUnit.from="";
            myEmailWhereClauseUnit.to=value->item_name.ptr();
            myEmailWhereClauseUnit.cc="";
            myEmailWhereClauseUnit.subject="";
            myEmailWhereClauseUnit.body="";
            myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
            myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
            emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
        }
    }
    else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"cc")==0)
    {
        LONG index;
        for (index = 1; index < inArgcount; index++)
        {
            if(dynamic_cast <Item_field*>(arguments[index]))
            {
                continue;
            }
            else
            {
                value = arguments[index];
            }
            EmailWhereClauseUnit myEmailWhereClauseUnit;
            myEmailWhereClauseUnit.from="";
            myEmailWhereClauseUnit.to="";
            myEmailWhereClauseUnit.cc=value->item_name.ptr();
            myEmailWhereClauseUnit.subject="";
            myEmailWhereClauseUnit.body="";
            myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
            myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
            emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
        }
    }
    
    else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"subject")==0)
    {
        LONG index;
        for (index = 1; index < inArgcount; index++)
        {
            if(dynamic_cast <Item_field*>(arguments[index]))
            {
                continue;
            }
            else
            {
                value = arguments[index];
            }
            EmailWhereClauseUnit myEmailWhereClauseUnit;
            myEmailWhereClauseUnit.from="";
            myEmailWhereClauseUnit.to="";
            myEmailWhereClauseUnit.cc="";
            myEmailWhereClauseUnit.subject=value->item_name.ptr();
            myEmailWhereClauseUnit.body="";
            myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
            myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
            emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
        }
    }
    else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"body")==0)
    {
        LONG index;
        for (index = 1; index < inArgcount; index++)
        {
            if(dynamic_cast <Item_field*>(arguments[index]))
            {
                continue;
            }
            else
            {
                value = arguments[index];
            }
            EmailWhereClauseUnit myEmailWhereClauseUnit;
            myEmailWhereClauseUnit.from="";
            myEmailWhereClauseUnit.to="";
            myEmailWhereClauseUnit.cc="";
            myEmailWhereClauseUnit.subject="";
            myEmailWhereClauseUnit.body=value->item_name.ptr();
            myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
            myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
            emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
        }
    }
    else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"messagedate")==0)
    {
        LONG index;
        for (index = 1; index < inArgcount; index++)
        {
            if(dynamic_cast <Item_field*>(arguments[index]))
            {
                continue;
            }
            else
            {
                value = arguments[index];
            }
            MYSQL_TIME emailDateTime={0};
            value->get_date(&emailDateTime,0);
            tm tempDateTime={0};
            tempDateTime.tm_year=emailDateTime.year-1900;
            tempDateTime.tm_mon=emailDateTime.month-1;
            tempDateTime.tm_mday=emailDateTime.day;
            tempDateTime.tm_hour=emailDateTime.hour;
            tempDateTime.tm_min=emailDateTime.minute;
            tempDateTime.tm_sec=emailDateTime.second;
            EmailWhereClauseUnit myEmailWhereClauseUnit;
            myEmailWhereClauseUnit.from="";
            myEmailWhereClauseUnit.to="";
            myEmailWhereClauseUnit.cc="";
            myEmailWhereClauseUnit.subject="";
            myEmailWhereClauseUnit.body="";
            myEmailWhereClauseUnit.emailTimeStampStart=mktime(&tempDateTime);
            myEmailWhereClauseUnit.emailTimeStampEnd=mktime(&tempDateTime);
            emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
        }
    }
    return emailWhereClauseUnitVector;
}

以下函数提取“less 或 less than equal 子句”中指定的“message date”。首先检查该项是否为 Item_func 类型。然后检查第一个参数是字段,并且后续参数不是字段类型。Less than 子句被转换为 where 子句单元,其中 emailTimeStampStart 为零,emailTimeStampEnd 等于 less 子句中指定的值。

vector<EmailWhereClauseUnit> Item_func_less_case(const Item *item)
{
    vector<EmailWhereClauseUnit> emailWhereClauseUnitVector;
    const Item_func * itemFunction=dynamic_cast<const Item_func*>(item);
    Item **arguments=0;
    if(itemFunction)
    {
        arguments=itemFunction->arguments();
    }
    else
    {
        return emailWhereClauseUnitVector;
    }
    Item *field=0;
    Item *value=0;
    if(dynamic_cast <Item_field*>(arguments[0]))
    {
        field = arguments[0];
    }
    else
    {
        return emailWhereClauseUnitVector;
    }
    if(dynamic_cast <Item_field*>(arguments[1]))
    {
        return emailWhereClauseUnitVector;
    }
    else
    {
        value = arguments[1];
    }
    Item_field *f;
    if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"messagedate")==0)
    {
        MYSQL_TIME emailDateTime={0};
        value->get_date(&emailDateTime,0);
        tm tempDateTime={0};
        tempDateTime.tm_year=emailDateTime.year-1900;
        tempDateTime.tm_mon=emailDateTime.month-1;
        tempDateTime.tm_mday=emailDateTime.day;
        tempDateTime.tm_hour=emailDateTime.hour;
        tempDateTime.tm_min=emailDateTime.minute;
        tempDateTime.tm_sec=emailDateTime.second;
        EmailWhereClauseUnit myEmailWhereClauseUnit;
        myEmailWhereClauseUnit.from="";
        myEmailWhereClauseUnit.to="";
        myEmailWhereClauseUnit.cc="";
        myEmailWhereClauseUnit.subject="";
        myEmailWhereClauseUnit.body="";
        myEmailWhereClauseUnit.emailTimeStampStart=0;
        myEmailWhereClauseUnit.emailTimeStampEnd=mktime(&tempDateTime);
        emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
    }

    return emailWhereClauseUnitVector;
}

以下函数提取“greater 或 greater than equal 子句”中指定的 message date。首先检查该项是否为 Item_func 类型。然后检查第一个参数是字段,并且后续参数不是字段类型。Greater than 子句被转换为 where 子句单元,其中 emailTimeStampStart 为 greater 子句中指定的值,emailTimeStampEnd 等于 MYSQL_TIMESTAMP_MAX_VALUE 值。

vector<EmailWhereClauseUnit> Item_func_greater_case(const Item *item)
{
    vector<EmailWhereClauseUnit> emailWhereClauseUnitVector;
    const Item_func * itemFunction=dynamic_cast<const Item_func*>(item);
    Item **arguments=0;
    if(itemFunction)
    {
        arguments=itemFunction->arguments();
    }
    else
    {
        return emailWhereClauseUnitVector;
    }
    Item *field=0;
    Item *value=0;
    if(dynamic_cast <Item_field*>(arguments[0]))
    {
        field = arguments[0];
    }
    else
    {
        return emailWhereClauseUnitVector;
    }
    if(dynamic_cast <Item_field*>(arguments[1]))
    {
        return emailWhereClauseUnitVector;
    }
    else
    {
        value = arguments[1];
    }
    Item_field *f;
    if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"messagedate")==0)
    {
        MYSQL_TIME emailDateTime={0};
        value->get_date(&emailDateTime,0);
        tm tempDateTime={0};
        tempDateTime.tm_year=emailDateTime.year-1900;
        tempDateTime.tm_mon=emailDateTime.month-1;
        tempDateTime.tm_mday=emailDateTime.day;
        tempDateTime.tm_hour=emailDateTime.hour;
        tempDateTime.tm_min=emailDateTime.minute;
        tempDateTime.tm_sec=emailDateTime.second;
        EmailWhereClauseUnit myEmailWhereClauseUnit;
        myEmailWhereClauseUnit.from="";
        myEmailWhereClauseUnit.to="";
        myEmailWhereClauseUnit.cc="";
        myEmailWhereClauseUnit.subject="";
        myEmailWhereClauseUnit.body="";
        myEmailWhereClauseUnit.emailTimeStampStart=mktime(&tempDateTime);
        myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
        emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
    }

    return emailWhereClauseUnitVector;
}

以下函数提取“between 子句”中指定的“message date”。首先检查该项是否为 Item_func 类型。然后检查第一个参数是字段,并且后续参数不是字段类型。“between”子句被转换为 where 子句单元,其中 emailTimeStampStart 和 emailTimeStampEnd 是“between”子句中指定的值。

vector<EmailWhereClauseUnit> Item_func_between_case(const Item *item)
{
    vector<EmailWhereClauseUnit> emailWhereClauseUnitVector;
    const Item_func * itemFunction=dynamic_cast<const Item_func*>(item);
    Item **arguments=0;
    if(itemFunction)
    {
        arguments=itemFunction->arguments();
    }
    else
    {
        return emailWhereClauseUnitVector;
    }
    Item *field=0;
    Item *value1=0;
    Item *value2=0;
    if(dynamic_cast <Item_field*>(arguments[0]))
    {
        field = arguments[0];
    }
    else
    {
        return emailWhereClauseUnitVector;
    }
    if(dynamic_cast <Item_field*>(arguments[1]))
    {
        return emailWhereClauseUnitVector;
    }
    else
    {
        value1 = arguments[1];
    }
    if(dynamic_cast <Item_field*>(arguments[2]))
    {
        return emailWhereClauseUnitVector;
    }
    else
    {
        value2 = arguments[2];
    }
    Item_field *f;
    if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"messagedate")==0)
    {
        MYSQL_TIME emailDateTime={0};
        value1->get_date(&emailDateTime,0);
        tm tempDateTime={0};
        tempDateTime.tm_year=emailDateTime.year-1900;
        tempDateTime.tm_mon=emailDateTime.month-1;
        tempDateTime.tm_mday=emailDateTime.day;
        tempDateTime.tm_hour=emailDateTime.hour;
        tempDateTime.tm_min=emailDateTime.minute;
        tempDateTime.tm_sec=emailDateTime.second;
        EmailWhereClauseUnit myEmailWhereClauseUnit;
        myEmailWhereClauseUnit.from="";
        myEmailWhereClauseUnit.to="";
        myEmailWhereClauseUnit.cc="";
        myEmailWhereClauseUnit.subject="";
        myEmailWhereClauseUnit.body="";
        myEmailWhereClauseUnit.emailTimeStampStart=mktime(&tempDateTime);
        value2->get_date(&emailDateTime,0);

        tempDateTime.tm_year=emailDateTime.year-1900;
        tempDateTime.tm_mon=emailDateTime.month-1;
        tempDateTime.tm_mday=emailDateTime.day;
        tempDateTime.tm_hour=emailDateTime.hour;
        tempDateTime.tm_min=emailDateTime.minute;
        tempDateTime.tm_sec=emailDateTime.second;
        myEmailWhereClauseUnit.emailTimeStampEnd=mktime(&tempDateTime);
        emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
    }

    return emailWhereClauseUnitVector;
}

以下是 My_Cond_traverser 用来检查父子关系的一个辅助函数。

bool isChildOf(Item_cond *parent,Item *child)
{
    List_iterator<Item> li(*(parent->argument_list()));
    Item *it= NULL;
    while ((it= li++))    
    {
        if(child==it)
            return true;
    }
    return false;
}

以下函数将 EmailWhereClauseUnit 的向量 A 和向量 B 进行 AND 操作。在此函数中,两个向量都通过嵌套循环进行遍历。如果 A 中类型为 EmailWhereClauseUnit 的一个结构体与 B 中类型相同的另一个结构体匹配,则将此 EmailWhereClauseUnit 添加到结果向量中。通过检查 A 和 B 中属于 EmailWhereClauseUnit 类型的结构体的“from”、“to”、“cc”、“subject”、“body”和“message date”来找到匹配。如果两个“from”都不为空,则它们必须匹配,否则跳过它们。类似地,如果两个“to”都不为空,则它们必须匹配,否则跳过它们,依此类推。如果其中一个“from”为空而另一个不为空,则公共“from”等于非空的一个。类似地,如果其中一个“to”为空而另一个不为空,则公共“to”等于非空的一个。类似地,如果 A 和 B 中的 EmailWhereClauseUnit 的“message date”范围有重叠,则将公共值插入到结果向量中。

vector<EmailWhereClauseUnit> And(vector<EmailWhereClauseUnit> A,vector<EmailWhereClauseUnit> B)
{
    vector<EmailWhereClauseUnit> result;

    for(vector<EmailWhereClauseUnit>::iterator iter1=A.begin();iter1!=A.end();iter1++)
    {
        for(vector<EmailWhereClauseUnit>::iterator iter2=B.begin();iter2!=B.end();iter2++)
        {
            string from="";
            if(iter1->from!=""&&iter2->from!="")
            {
                if(iter1->from==iter2->from)
                {
                    from=iter1->from;
                }
                else
                {
                    continue;
                }
            }
            else if(iter1->from!=""&&iter2->from=="")
            {
                from=iter1->from;
            }
            else if(iter1->from==""&&iter2->from!="")
            {
                from=iter2->from;
            }
            else if(iter1->from==""&&iter2->from=="")
            {
            }

            string to="";
            if(iter1->to!=""&&iter2->to!="")
            {
                if(iter1->to==iter2->to)
                {
                    to=iter1->to;
                }
                else
                {
                    continue;
                }
            }
            else if(iter1->to!=""&&iter2->to=="")
            {
                to=iter1->to;
            }
            else if(iter1->to==""&&iter2->to!="")
            {
                to=iter2->to;
            }
            else if(iter1->to==""&&iter2->to=="")
            {
            }

            string cc="";
            if(iter1->cc!=""&&iter2->cc!="")
            {
                if(iter1->cc==iter2->cc)
                {
                    cc=iter1->cc;
                }
                else
                {
                    continue;
                }
            }
            else if(iter1->cc!=""&&iter2->cc=="")
            {
                cc=iter1->cc;
            }
            else if(iter1->cc==""&&iter2->cc!="")
            {
                cc=iter2->cc;
            }
            else if(iter1->cc==""&&iter2->cc=="")
            {
            }

            string subject="";
            if(iter1->subject!=""&&iter2->subject!="")
            {
                if(iter1->subject==iter2->subject)
                {
                    subject=iter1->subject;
                }
                else
                {
                    continue;
                }
            }
            else if(iter1->subject!=""&&iter2->subject=="")
            {
                subject=iter1->subject;
            }
            else if(iter1->subject==""&&iter2->subject!="")
            {
                subject    =iter2->subject;
            }
            else if(iter1->subject==""&&iter2->subject=="")
            {
            }

            string body="";
            if(iter1->body!=""&&iter2->body!="")
            {
                if(iter1->body==iter2->body)
                {
                    body=iter1->body;
                }
                else
                {
                    continue;
                }
            }
            else if(iter1->body!=""&&iter2->body=="")
            {
                body=iter1->body;
            }
            else if(iter1->body==""&&iter2->body!="")
            {
                body=iter2->body;
            }
            else if(iter1->body==""&&iter2->body=="")
            {
            }
            unsigned int emailTimeStampStart;
            unsigned int emailTimeStampEnd;
            bool common=FindAndBetweenTwoRanges(iter1->emailTimeStampStart,iter1->emailTimeStampEnd,iter2->emailTimeStampStart,iter2->emailTimeStampEnd,emailTimeStampStart,emailTimeStampEnd);
            if(common==false)
            {
                continue;
            }
            EmailWhereClauseUnit myEmailWhereClauseUnit;
            myEmailWhereClauseUnit.from=from;
            myEmailWhereClauseUnit.to=to;
            myEmailWhereClauseUnit.cc=cc;
            myEmailWhereClauseUnit.subject=subject;
            myEmailWhereClauseUnit.body=body;
            myEmailWhereClauseUnit.emailTimeStampStart=emailTimeStampStart;
            myEmailWhereClauseUnit.emailTimeStampEnd=emailTimeStampEnd;
            result.push_back(myEmailWhereClauseUnit);
        }
    }
    return result;
}

以下函数查找两个消息日期范围之间的重叠。如果范围 1 的开始小于范围 2 的开始,并且范围 2 的开始小于范围 1 的结束,则存在重叠。结果重叠是从范围 2 的开始到范围 1 的结束或范围 2 的结束(取决于哪个较小,反之亦然)。

bool FindAndBetweenTwoRanges(unsigned int range1Start,unsigned int range1End,unsigned int range2Start,unsigned int range2End,unsigned int &resRangeStart,unsigned int &resRangeEnd)
{
    bool success=false;
    if(range1Start<range2Start)
    {
        if(range2Start<=range1End)
        {
            resRangeStart=range2Start;
            if(range1End<range2End)
            {
                resRangeEnd=range1End;
            }
            else
            {
                resRangeEnd=range2End;
            }
            success=true;
        }
    }
    else
    {
        if(range1Start<=range2End)
        {
            resRangeStart=range1Start;
            if(range2End<range1End)
            {
                resRangeEnd=range2End;
            }
            else
            {
                resRangeEnd=range1End;
            }
            success=true;
        }
    }
    return success;
}

以下函数将 EmailWhereClauseUnit 的向量 A 和向量 B 进行 OR 操作。在此函数中,两个向量都通过嵌套循环进行遍历。如果 A 中类型为 EmailWhereClauseUnit 的一个结构体与 B 中类型相同的另一个结构体有共同点,则将公共值添加到结果向量中。公共点通过匹配 A 和 B 中属于 EmailWhereClauseUnit 类型的结构体的“from”、“to”、“cc”、“subject”、“body”和“message date”来找到。如果两个“from”都不为空,则它们必须匹配,否则跳过它们。类似地,如果两个“to”都不为空,则它们必须匹配,否则跳过它们。如果其中一个“from”为空而另一个不为空,则公共“from”为空。类似地,如果其中一个“to”为空而另一个不为空,则公共“to”为空。类似地,如果“message date”范围之间存在重叠,则将公共值插入到结果向量中,并从向量 A 和 B 中删除。嵌套循环完成后,遍历向量 A 并将所有元素添加到结果向量中。类似地,遍历向量 B 并将所有元素添加到结果向量中。

vector<EmailWhereClauseUnit> Or(vector<EmailWhereClauseUnit> A,vector<EmailWhereClauseUnit> B)
{
    vector<EmailWhereClauseUnit> result;
    
    for(vector<EmailWhereClauseUnit>::iterator iter1=A.begin();A.size()>0&&iter1!=A.end();iter1++)
    {
        for(vector<EmailWhereClauseUnit>::iterator iter2=B.begin();B.size()>0&&iter2!=B.end();iter2++)
        {
            bool commonFound=false;
            string from="";
            if(iter1->from!=""&&iter2->from!="")
            {
                if(iter1->from==iter2->from)
                {
                    from=iter1->from;
                    commonFound=true;
                }
                else
                {
                    continue;
                }
            }
            string to="";
            if(iter1->to!=""&&iter2->to!="")
            {
                if(iter1->to==iter2->to)
                {
                    to=iter1->to;
                    commonFound=true;
                }
                else
                {
                    continue;
                }
            }
            string cc="";
            if(iter1->cc!=""&&iter2->cc!="")
            {
                if(iter1->cc==iter2->cc)
                {
                    cc=iter1->cc;
                    commonFound=true;
                }
                else
                {
                    continue;
                }
            }
            string subject="";
            if(iter1->subject!=""&&iter2->subject!="")
            {
                if(iter1->subject==iter2->subject)
                {
                    subject=iter1->subject;
                    commonFound=true;
                }
                else
                {
                    continue;
                }
            }
            string body="";
            if(iter1->body!=""&&iter2->body!="")
            {
                if(iter1->body==iter2->body)
                {
                    body=iter1->body;
                    commonFound=true;
                }
                else
                {
                    continue;
                }
            }
            unsigned int emailTimeStampStart;
            unsigned int emailTimeStampEnd;
            if(FindOrBetweenTwoRanges(iter1->emailTimeStampStart,iter1->emailTimeStampEnd,iter2->emailTimeStampStart,iter2->emailTimeStampEnd,emailTimeStampStart,emailTimeStampEnd))
            {
                commonFound=true;
            }
            if(commonFound==true)
            {
                EmailWhereClauseUnit myEmailWhereClauseUnit;
                myEmailWhereClauseUnit.from=from;
                myEmailWhereClauseUnit.to=to;
                myEmailWhereClauseUnit.cc=cc;
                myEmailWhereClauseUnit.subject=subject;
                myEmailWhereClauseUnit.body=body;

                myEmailWhereClauseUnit.emailTimeStampStart=emailTimeStampStart;
                myEmailWhereClauseUnit.emailTimeStampEnd=emailTimeStampEnd;
                result.push_back(myEmailWhereClauseUnit);
                A.erase(iter1);
                B.erase(iter2);
                iter1=A.begin();
                iter2=B.begin();
            }
        }
    }
    for(vector<EmailWhereClauseUnit>::iterator iter1=A.begin();iter1!=A.end();iter1++)
    {
        result.push_back(*iter1);
    }
    for(vector<EmailWhereClauseUnit>::iterator iter2=B.begin();iter2!=B.end();iter2++)
    {
        result.push_back(*iter2);
    }
    return result;
}

以下函数查找两个薪资范围之间的 OR。如果范围 1 的开始小于范围 2 的开始,并且范围 2 的开始小于范围 1 的结束,则存在重叠。结果重叠是从范围 1 的开始到范围 1 的结束或范围 2 的结束(取决于哪个较大,反之亦然)。

bool FindOrBetweenTwoRanges(unsigned int range1Start,unsigned int range1End,unsigned int range2Start,unsigned int range2End, unsigned int &resRangeStart,unsigned int &resRangeEnd)
{
    bool success=false;
    if(range1Start<range2Start)
    {
        if(range2Start<=range1End)
        {
            resRangeStart=range1Start;
            if(range1End>=range2End)
            {
                resRangeEnd=range1End;
            }
            else
            {
                resRangeEnd=range2End;
            }
            success=true;
        }
    }
    else
    {
        if(range1Start<=range2End)
        {
            resRangeStart=range2Start;
            if(range2End>=range1End)
            {
                resRangeEnd=range2End;
            }
            else
            {
                resRangeEnd=range1End;
            }
            success=true;
        }
    }
    return success;
}

当 traverse_cond 函数完成时,会检查解析堆栈是否只包含一个类型为 ParseStackElement 的元素。ParseStackElement 中的 emailWhereClauseUnitVector 向量被传递给 GetEmails。

void GetEmails(vector<EmailWhereClauseUnit> &emailWhereClauseUnitVector,vector<Email> &emails)
{
    set<int> emailIdList;
    EmailWhereClauseUnitVectorToEmailsIdList(emailWhereClauseUnitVector,emailIdList);

    cImap *i_Imap=InitImap();
    if(i_Imap!=NULL)
    {
        try
        {

            for (set<int>::iterator iter=emailIdList.begin();iter!=emailIdList.end();iter++)                
            {
            
                GuardPtr<cEmailParser> i_Email = i_Imap->FetchEmailAt(*iter); // Sends here the FETCH HEADER command
                Email myEmail;
                GetEmail(myEmail,i_Email,*iter);

                emails.push_back(myEmail);
                    
            }
            i_Imap->Close(); // Close the connection to the server
            delete i_Imap;
        }
        catch(...)
        {
        }
    }

}

GetEmails 函数调用 EmailWhereClauseUnitVectorToEmailsIdList,后者遍历 emailWhereClauseUnitVector 向量并准备 IMAP 搜索字符串。GetEmail 函数在调用该函数后,会遍历 emailIdList 并获取电子邮件,然后调用 GetEmail 函数,该函数填充 Email 结构体,然后将其推入 emails 向量。

void EmailWhereClauseUnitVectorToEmailsIdList(vector<EmailWhereClauseUnit> &emailWhereClauseUnitVector,set<int> &emailIdList)
{
    vector<EmailWhereClauseUnit>::iterator iter=emailWhereClauseUnitVector.begin();
    for(;iter!=emailWhereClauseUnitVector.end();iter++)
    {
        string search="SEARCH ";
        if(!iter->from.empty())
        {
            search=search+"FROM ";
            search=search+iter->from;
            search=search+" ";
        }
        if(!iter->to.empty())
        {
            search=search+"TO ";
            search=search+iter->to;
            search=search+" ";
        }
        if(!iter->cc.empty())
        {
            search=search+"CC ";
            search=search+iter->cc;
            search=search+" ";
        }
        if(!iter->subject.empty())
        {
            search=search+"SUBJECT \"";
            search=search+iter->subject;
            search=search+"\" ";
        }
        if(!iter->body.empty())
        {
            search=search+"BODY ";
            search=search+iter->body;
            search=search+" ";
        }
        struct tm timeinfo={0};
        char buffer [80];
        iter->emailTimeStampStart=iter->emailTimeStampStart-60*60*24;
        _localtime64_s( &timeinfo, &iter->emailTimeStampStart );
        strftime (buffer,80,"%d-%b-%Y",&timeinfo);
        search=search+"SINCE ";
        search=search+buffer;
        iter->emailTimeStampEnd=iter->emailTimeStampEnd+60*60*24;
         _localtime64_s( &timeinfo, &iter->emailTimeStampEnd );
        strftime (buffer,80,"%d-%b-%Y",&timeinfo);
        search=search+" BEFORE ";
        search=search+buffer;
        char *response=CurlClientResponse(search);
        if(response&&_stricmp(response,"* SEARCH\r\n")!=0)
        {
            std::string s=response;
            std::istringstream iss(s);
            std::string sub;
            iss>>sub;
            iss>>sub;
            while(iss)
            {

                iss>>sub;
                if(!sub.empty())
                {
                    emailIdList.insert(atoi(sub.c_str())-1);
                }
            }
        }
    }
}

如果用户发出 select * from email 查询,则会调用 GetAllEmails 函数,该函数获取收件箱中的所有电子邮件。

void GetAllEmails(vector<Email> &emails)
{
    cImap *i_Imap=InitImap();
    if(i_Imap!=NULL)
    {
        try
        {
            int s32_EmailCount = i_Imap->GetEmailCount();
            for (int M=s32_EmailCount-1; M>=0; M--)                
            {
                GuardPtr<cEmailParser> i_Email = i_Imap->FetchEmailAt(M);
                Email myEmail;
                GetEmail(myEmail,i_Email,M);
                emails.push_back(myEmail);
            }
            i_Imap->Close();
            delete i_Imap;
        }
        catch(...)
        {
        }
    }
}

 

MySQL 调用 rnd_next 来提供下一行。如果 m_EmailVectorIndex 小于 m_EmailVector 的大小,表示还有要提供的行,则调用 field->store 来存储 uid、from、to、cc、subject、body 和 message date。之后,将 m_EmailVectorIndex 递增。如果没有更多电子邮件要提供,则返回 HA_ERR_END_OF_FILE,否则返回 0。

int ha_example::rnd_next(uchar *buf)
{
    int rc;
    DBUG_ENTER("ha_example::rnd_next");
    MYSQL_READ_ROW_START(table_share->db.str, table_share->table_name.str,
                    TRUE);
    if(m_EmailVectorIndex<m_EmailVector.size())
    {

        Field **field=table->field;

        bitmap_set_bit(table->write_set, (*field)->field_index);
        (*field)->set_notnull();
        (*field)->store(m_EmailVector[m_EmailVectorIndex].uid,true);
        field++;

        bitmap_set_bit(table->write_set, (*field)->field_index);
        (*field)->set_notnull();
        (*field)->store(m_EmailVector[m_EmailVectorIndex].from.c_str(),strlen(m_EmailVector[m_EmailVectorIndex].from.c_str()), system_charset_info);
        field++;

        bitmap_set_bit(table->write_set, (*field)->field_index);
        (*field)->set_notnull();
        (*field)->store(m_EmailVector[m_EmailVectorIndex].to.c_str(),strlen(m_EmailVector[m_EmailVectorIndex].to.c_str()), system_charset_info);
        field++;

        bitmap_set_bit(table->write_set, (*field)->field_index);
        (*field)->set_notnull();
        (*field)->store(m_EmailVector[m_EmailVectorIndex].cc.c_str(),strlen(m_EmailVector[m_EmailVectorIndex].cc.c_str()), system_charset_info);
        field++;

        bitmap_set_bit(table->write_set, (*field)->field_index);
        (*field)->set_notnull();
        (*field)->store(m_EmailVector[m_EmailVectorIndex].subject.c_str(),strlen(m_EmailVector[m_EmailVectorIndex].subject.c_str()), system_charset_info);
        field++;

        bitmap_set_bit(table->write_set, (*field)->field_index);
        (*field)->set_notnull();
        (*field)->store(m_EmailVector[m_EmailVectorIndex].body.c_str(),strlen(m_EmailVector[m_EmailVectorIndex].body.c_str()), system_charset_info);
        field++;

        bitmap_set_bit(table->write_set, (*field)->field_index);
        (*field)->set_notnull();

        MYSQL_TIME myTime={0};
        myTime.year=m_EmailVector[m_EmailVectorIndex].receiveTime.getYear();
        myTime.month=m_EmailVector[m_EmailVectorIndex].receiveTime.getMonth();
        myTime.day=m_EmailVector[m_EmailVectorIndex].receiveTime.getDay();
        myTime.hour=m_EmailVector[m_EmailVectorIndex].receiveTime.getHour();
        myTime.minute=m_EmailVector[m_EmailVectorIndex].receiveTime.getMinute();
        myTime.second=m_EmailVector[m_EmailVectorIndex].receiveTime.getSecond();

        (*field)->store_time(&myTime);
        field++;

        m_EmailVectorIndex++;
        rc=0;
    }
    else
    {
        rc= HA_ERR_END_OF_FILE;
    }
    MYSQL_READ_ROW_DONE(rc);
    DBUG_RETURN(rc);
}

如何运行

  • 如果您还没有 Gmail 帐户,请创建一个。
  • 从以下链接下载 MySQL 5.6.22。
    • https://dev.mysqlserver.cn/downloads/mysql/
  • 请确保下载 32 位版本 (mysql-5.6.22-win32.zip)。
  • 将 EmailSearchEngine.dll 从 EmailSearchEngineBinaries.zip 复制过来。
  • 转到您解压 MYSQL 的目录 (mysql-5.6.22-win32)。
  • 找到子目录 lib 并打开它。
  • 找到并打开里面的 plugin 目录。
  • 粘贴 EmailSearchEngine.dll。
  • 将 credentials.txt 从 EmailSearchEngineBinaries.zip 复制过来。找到子目录 data 并粘贴它。
  • 打开 credentials.txt,将第一行替换为您的电子邮件地址,将第二行替换为您的 Gmail 密码。
  • 从 EmailSearchEngineBinaries.zip 复制 curllib.dll、libcurl.dll、libeay32.dll、libgsasl-7.dll、libidn-11.dll、librtmp.dll、libsasl.dll、libssh2.dll、openldap.dll、ssleay32.dll、vmime.NET.dll、zlib1.dll。
  • 找到子目录 bin 并打开它。
  • 粘贴复制的 dll 文件。
  • 启动 mysqld.exe (以管理员身份运行)。
  • 启动命令提示符 (以管理员身份运行)。
  • 更改目录到 mysql-5.6.22-win32/bin。
  • 运行以下命令
  • mysql –uroot
  • 这将启动 MYSQL 控制台。
  • 运行以下命令。
  • Install plugin EmailSearchEngine soname ' EmailSearchEngine.dll';
  • 这将安装 EmailSearchEngine。
  • 创建一个名为 test 的数据库(如果尚不存在)。
  • 现在创建一个名为 Email 的表,并将 EmailSearchEngine 指定为存储引擎。

 

CREATE TABLE `email` (
  `UID` int(10) unsigned DEFAULT NULL,
  `from` varchar(1024) DEFAULT NULL,
  `to` varchar(1024) DEFAULT NULL,
  `cc` varchar(1024) DEFAULT NULL,
  `subject` varchar(1024) DEFAULT NULL,
  `body` LONGTEXT,
  `messagedate` datetime DEFAULT NULL
) ENGINE=EmailSearchEngine;

 

在继续之前,如果您在使用防火墙,请允许 mysqld.exe 通过。

输入以下查询,但请将主题更改为您收件箱中的一个邮件。

Select `uid`,`from`,`to`,`cc`,`subject` from email where messagedate>="2014-12-06 00:00:00" and messagedate<="2014-12-06 23:59:59" and subject='Google Account password changed';

以下是一些示例查询:

Select subject from email where messagedate>="2015-01-09 00:00:00" and messagedate<="2015-01-09 23:59:59" and subject like '%Important%';

Select count(*) from email where messagedate>="2015-01-09 00:00:00" and messagedate<="2015-01-09 23:59:59";
现在输入您选择的任何查询,然后查看结果。

在哪里获取源代码

  • 从以下链接下载 MYSQL 源代码。
    • https://dev.mysqlserver.cn/downloads/mysql/
  • 将源代码提取到 C:/mysql-5.6.22。

 

如何构建源代码

  • 请按照以下地址提供的说明进行操作:
    • https://dev.mysqlserver.cn/doc/refman/5.6/en/installing-source-distribution.html
  • 我使用 Visual Studio 2012 按照以下说明构建了源代码:
    • cmake . -G "Visual Studio 11"
  • 下载本文附件名为 SourceCodeEmailSearchEngine.zip 的文件。
  • 它包含 example 文件夹中的 EmailSearchEngine 源代码。
  • 将 example 文件夹复制到 mysql-5.6.22\storage。

结论

我们已经找到了一种方法,可以利用 MySQL 解析器的强大功能来分析您的电子邮件,这是以前从未做过的。这为新类型的分析打开了视野。这种方法可以推广。我将在本系列文章的后续部分中继续介绍这种方法。

参考文献

[1] vmime.NET - Smtp, Pop3, Imap Library (for C++ and .NET)

https://codeproject.org.cn/Articles/719490/vmime-NET-Smtp-Pop-Imap-Library-for-Cplusplus-and

[2] Building on My SQL Storage Engines and APIs to Conduct Complex Analysis Using SQL Query Language: Part 4

https://codeproject.org.cn/Articles/796061/Building-on-My-SQL-Storage-Engines-and-APIs-to-C

[3] Building on My SQL Storage Engines and APIs to Conduct Complex Analysis Using SQL Query Language: Part 3

https://codeproject.org.cn/Articles/768518/Building-on-My-SQL-Storage-Engines-and-APIs-to-C

[4] Building on My SQL Storage Engines and APIs to Conduct Complex Analysis Using SQL Query Language: Part 2

https://codeproject.org.cn/Articles/732116/Building-on-My-SQL-Storage-Engines-and-APIs-to-Con

[5] Building on My SQL Storage Engines and API’s to conduct complex analysis using SQL query language

https://codeproject.org.cn/Articles/715914/Building-on-My-SQL-Storage-Engines-and-API-s-to-co

 

 

 

 

© . All rights reserved.