65.9K
CodeProject 正在变化。 阅读更多。
Home

使用访问者模式在实践中进行表达式树遍历

starIconstarIconstarIconstarIcon
emptyStarIcon
starIcon

4.96/5 (15投票s)

2018 年 5 月 8 日

CPOL

5分钟阅读

viewsIcon

29979

downloadIcon

339

如何处理 C# 中的通用 Expression 类型表达式树,以实现一些实际目标,例如将其转换为特定的语言,如 T-SQL。

问题

我将考虑一个非常普遍的问题——我们需要将 Expression 转换为.NET 域之外的某种语言。例如,使用 Entity Framework,我们可以编写代码

var order = context.Orders.Where(x => x.Customer == "Tom" && 
                  x.Amount > 1000 && x.TheDate == today).ToList();

它将返回满足给定条件的订单。该谓词的类型为:Expression<Func<Order, bool>>,表示结构如下的表达式树:

                                           ____________  && ____________
                                         /                                               \
                              _____&&______                                 ____==___
                            /                       \                              /                \
                  ___==___                  ___>____             x.TheDate       today
                /               \               /             \           
      x.Customer     "Tom"    x.Amount    1000

Entity Framework 将会将其转换为,例如,T-SQL 语法(取决于底层提供程序)。

"SELECT * FROM Orders WHERE Customer = 'Tome' and Amount > 1000 and TheDate = '2018-04-28'"

这样,我们的目标就是弄清楚它是如何实现的,并为了演示目的,执行自定义翻译到另一种语言。

有 Azure 表存储(https://docs.microsoft.com/en-us/azure/cosmos-db/table-storage-how-to-use-dotnet)云基础设施。您可以将其视为一种类似数据库的存储——带有列、行和数据的表。在从中获取数据时,我们也可以指定过滤器。规则(必需的语法)在此处呈现。它们是:

比较运算符表
逻辑运算符 描述 示例筛选器字符串
eq 等于 City eq 'Redmond'
gt 大于 Price gt 20
ge 大于或等于 Price ge 10
lt 小于 Price lt 20
le 小于等于 Price le 100
ne 不等于 City ne 'London'
并且 Price le 200 and Price gt 3.5
或者 Price le 3.5 or Price gt 200
not not isAvailable

String 属性应包装在单引号中:Name eq 'Tom'Datetime 值应如下表示:BirtDate eq datetime'2008-07-10T00:00:00Z',其余的则通过 ToString() 调用。因此,完整的条件应该是:Customer eq 'Tome' and Amount > 1000 and TheDate eq datetime'2008-07-10T00:00:00Z'。最后,我们的任务是将 .NET Expression 转换为此查询语法。

免责声明:有一个 Nuget 包——WindowsAzure.Storagehttps://nuget.net.cn/packages/WindowsAzure.Storage/),它已经具备此功能,但我们的主要目标不是创建到具体语言——Azure 存储表语法的翻译,而是理解如何做到这一点,从而学会如何构建到任意语言的查询或解决其他自定义问题。

解决方案

每个 Expression 都是一个带有节点的树,每个节点也是 Expression。此外,每个节点都属于 Expression 的特定类型,该类型继承自基类。但是,当我们创建 Expression 树时,编译器无法告诉我们具体类型,因为可能存在无限数量的组合(树结构),例如,BinaryExpression 有两个子节点,每个子节点都可以属于任何 Expression 类型。因此,根节点及其子节点的实际类型以及树的结构仅在运行时可知。在编译时,节点将作为基 Expression 类型表示,结构将通过父节点到子节点的引用表示。

如果我们直接解决问题,我们将需要递归遍历 Expression 树,根据每个节点的类型处理它,也就是说,我们需要弄清楚我们正在处理的具体类型,然后相应地处理该节点。为此,我们将需要使用反射(expression.GetType() 等)。所以这将是一个非常困难且容易出错的方法。但幸运的是,有一个类型:ExpressionVisitor

访问者模式

访问者模式的理念——我们有一个类型,并希望在不修改它的情况下为其添加更多功能。为此,我们将创建一个包含此功能的 Visitor。每个 visitor 不仅可以为特定类型添加新功能,还可以为几种类型(ElementBElementA)添加新功能,这些类型继承自基类型(Element)。所以这正是我们的情况,主要类型是ExpressionElement),继承类型有:BinaryExpressionElementB)、UnaryExpressionElementA)等;新功能是查询构建(visitElementBvisitElementA)。ExpressionElement)类型会accept visitor,然后根据自身的类型调用相应的 visitor 方法:VisitUnaryvisitElementB)或VisitBinaryvisitElementA),并将自身作为参数传递给它。这样,我们的 visitor 方法中的参数将不属于基抽象 Expression 类型,而是属于实际类型。因此,我们不必通过丑陋的 expression.GetType() 来通过反射检测类型,而且 ExpressionVisitor 提供了 以简单方式递归遍历树的功能。这意味着它是我们的最佳选择。

实现

假设我们有一个类 Order

public class Order 
{
    public DateTime TheDate { get; set; }
    public string Customer { get; set; }
    public decimal Amount { get; set; }
    public bool Discount { get; set; }
}   

并且我们想编写如下业务代码

var date = DateTime.Today;
var order = new Order { Customer = "Tom", Amount = 1000 };

Expression<Func<Order, bool>> filter = x => (x.Customer == order.Customer && x.Amount > order.Amount) 
                                            || 
                                            (x.TheDate == date && !x.Discount);
var visitor = new FilterConstructor();
var query = visitor.GetQuery(filter);
var data = AzureTableReference.Get<Order>(query);
//desired formatted query:
//(
//    (
//        (Customer eq 'Tom') 
//        and 
//        (Amount gt 1000)
//    ) 
//    or 
//    (
//        (TheDate eq datetime'2018-04-27T21:00:00.000Z') 
//        and 
//        not (Discount)
//    )
//)

让我们创建一个继承自 ExpressionVisitorFilterConstructor

public class FilterConstructor : ExpressionVisitor
{            
    private static readonly Dictionary<ExpressionType, string> _logicalOperators;
    private static readonly Dictionary<Type, Func<object, string>> _typeConverters;
    static FilterConstructor()
    {
        //mappings for table, shown above
        _logicalOperators = new Dictionary<ExpressionType, string>
        {
            [ExpressionType.Not] = "not",
            [ExpressionType.GreaterThan] = "gt",
            [ExpressionType.GreaterThanOrEqual] = "ge",
            [ExpressionType.LessThan] = "lt",
            [ExpressionType.LessThanOrEqual] = "le",
            [ExpressionType.Equal] = "eq",
            [ExpressionType.Not] = "not",
            [ExpressionType.AndAlso] = "and",
            [ExpressionType.OrElse] = "or"
        };

        //if type is string we will wrap it into single quotes
        //if it is a DateTime we will format it like datetime'2008-07-10T00:00:00Z'
        //bool.ToString() returns "True" or "False" with first capital letter, so .ToLower() is applied
        //if it is one of the rest "simple" types we will just call .ToString() method on it
        _typeConverters = new Dictionary<Type, Func<object, string>>
        {
            [typeof(string)] = x => $"'{x}'",
            [typeof(DateTime)] = 
              x => $"datetime'{((DateTime)x).ToUniversalTime().ToString("yyyy-MM-ddTHH:mm:ss.fffZ")}'",
            [typeof(bool)] = x => x.ToString().ToLower()
        };
    }

    private StringBuilder _queryStringBuilder;
    private Stack<string> _fieldNames;
    public FilterConstructor()
    {
        //here we will collect our query
        _queryStringBuilder = new StringBuilder();
        //will be discussed below
        _fieldNames = new Stack<string>();
    }

    //entry point
    public string GetQuery(LambdaExpression predicate)
    {
        //Visit transfer abstract Expression to concrete method, like VisitUnary
        //it's invocation chain (at case of unary operator) approximetely looks this way:
        //inside visitor: predicate.Body.Accept(ExpressionVisitor this)
        //inside expression(visitor is this from above): visitor.VisitUnary(this) 
        //here this is Expression
        //we not pass whole predicate, just Body, because we not need predicate.Parameters: "x =>" part
        Visit(predicate.Body);
        var query = _queryStringBuilder.ToString();
        _queryStringBuilder.Clear();
                
        return query;
    }

    protected override Expression VisitUnary(UnaryExpression node)
    {                
        //assume we only allow not (!) unary operator:
        if(node.NodeType != ExpressionType.Not)
            throw new NotSupportedException("Only not(\"!\") unary operator is supported!");

        _queryStringBuilder.Append($"{_logicalOperators[node.NodeType]} ");//!

        _queryStringBuilder.Append("(");                                   //(!
        //go down from a tree
        Visit(node.Operand);                                               //(!expression
        _queryStringBuilder.Append(")");                                   //(!expression)

        //we should return expression, it will allow to create new expression based on existing one,
        //but, at our case, it is not needed, so just return initial node argument
        return node;
    }

    //corresponds to: and, or, greater than, less than, etc.
    protected override Expression VisitBinary(BinaryExpression node)
    {
        _queryStringBuilder.Append("(");                                    //(
        //left side of binary operator
        Visit(node.Left);                                                   //(leftExpr

        _queryStringBuilder.Append($" {_logicalOperators[node.NodeType]} ");//(leftExpr and

       //right side of binary operator
        Visit(node.Right);                                                  //(leftExpr and RighExpr
        _queryStringBuilder.Append(")");                                    //(leftExpr and RighExpr)

        return node;
    }

    protected override Expression VisitMember(MemberExpression node)
    {
        //corresponds to: order.Customer, .order, today variables
        //when we pass parameters to expression via closure, CLR internally creates class:

        //class NameSpace+<>c__DisplayClass12_0
        //{
        //    public Order order;
        //    public DateTime today;
        //}

        //which contains values of parameters. When we face order.Customer, it's node.Expression
        //will not have reference to value "Tom", but instead reference to parent (.order), so we
        //will go to it via Visit(node.Expression) and also save node.Member.Name into 
        //Stack(_fieldNames) fo further usage. order.Customer has type ExpressionType.MemberAccess. 
        //.order - ExpressionType.Constant, because it's node.Expression is ExpressionType.Constant
        //(VisitConstant will be called) that is why we can get it's actual value(instance of Order). 
        //Our Stack at this point: "Customer" <- "order". Firstly we will get "order" field value, 
        //when it will be reached, on NameSpace+<>c__DisplayClass12_0 class instance
        //(type.GetField(fieldName)) then value of "Customer" property
        //(type.GetProperty(fieldName).GetValue(input)) on it. We started from 
        //order.Customer Expression then go up via reference to it's parent - "order", get it's value 
        //and then go back - get value of "Customer" property on order. Forward and backward
        //directions, at this case, reason to use Stack structure

        if(node.Expression.NodeType == ExpressionType.Constant 
           || 
           node.Expression.NodeType == ExpressionType.MemberAccess)
        {
            _fieldNames.Push(node.Member.Name);
            Visit(node.Expression);
        }                    
        else                                    
            //corresponds to: x.Customer - just write "Customer"
            _queryStringBuilder.Append(node.Member.Name);                 
        return node;
    }

    //corresponds to: 1, "Tom", instance of NameSpace+<>c__DisplayClass12_0, instance of Order, i.e.
    //any expression with value
    protected override Expression VisitConstant(ConstantExpression node)
    {
        //just write value
        _queryStringBuilder.Append(GetValue(node.Value));
        return node;
    }

    private string GetValue(object input)
    {
        var type = input.GetType();
        //if it is not simple value
        if (type.IsClass && type != typeof(string))
        {
            //proper order of selected names provided by means of Stack structure
            var fieldName = _fieldNames.Pop();
            var fieldInfo = type.GetField(fieldName);
            object value;
            if (fieldInfo != null)            
               //get instance of order    
                value = fieldInfo.GetValue(input);
            else
                //get value of "Customer" property on order
                value = type.GetProperty(fieldName).GetValue(input);
            return GetValue(value);
        }                    
        else
        {
            //our predefined _typeConverters
            if (_typeConverters.ContainsKey(type))
                return _typeConverters[type](input);
            else
            //rest types
                return input.ToString();
        }
    }

结论

我们的目标是学习如何遍历表达式树,以便将其翻译成任意语言,在本例中是 Azure 表存储条件语法。直接的方法太复杂且容易出错,但借助访问者模式,我们可以轻松实现。幸运的是,.NET Framework 为我们提供了 ExpressionVisitor ,该类专门用于 Expressions 并具有所有必要的功能。我们只需要继承它并实现我们的自定义逻辑。

© . All rights reserved.