使用访问者模式在实践中进行表达式树遍历






4.96/5 (15投票s)
如何处理 C# 中的通用 Expression 类型表达式树,以实现一些实际目标,例如将其转换为特定的语言,如 T-SQL。
问题
我将考虑一个非常普遍的问题——我们需要将 Expression 转换为.NET 域之外的某种语言。例如,使用 Entity Framework,我们可以编写代码
var order = context.Orders.Where(x => x.Customer == "Tom" &&
x.Amount > 1000 && x.TheDate == today).ToList();
它将返回满足给定条件的订单。该谓词的类型为:Expression<Func<Order, bool>>
,表示结构如下的表达式树:
Entity Framework 将会将其转换为,例如,T-SQL 语法(取决于底层提供程序)。
"SELECT * FROM Orders WHERE Customer = 'Tome' and Amount > 1000 and TheDate = '2018-04-28'"
这样,我们的目标就是弄清楚它是如何实现的,并为了演示目的,执行自定义翻译到另一种语言。
有 Azure 表存储(https://docs.microsoft.com/en-us/azure/cosmos-db/table-storage-how-to-use-dotnet)云基础设施。您可以将其视为一种类似数据库的存储——带有列、行和数据的表。在从中获取数据时,我们也可以指定过滤器。规则(必需的语法)在此处呈现。它们是:
比较运算符表
逻辑运算符 | 描述 | 示例筛选器字符串 |
eq | 等于 | City eq 'Redmond' |
gt | 大于 | Price gt 20 |
ge | 大于或等于 | Price ge 10 |
lt | 小于 | Price lt 20 |
le | 小于等于 | Price le 100 |
ne | 不等于 | City ne 'London' |
和 | 并且 | Price le 200 and Price gt 3.5 |
或 | 或者 | Price le 3.5 or Price gt 200 |
not | 非 | not isAvailable |
String
属性应包装在单引号中:Name eq 'Tom'
。Datetime
值应如下表示:BirtDate eq datetime'2008-07-10T00:00:00Z'
,其余的则通过 ToString()
调用。因此,完整的条件应该是:Customer eq 'Tome' and Amount > 1000 and TheDate eq datetime'2008-07-10T00:00:00Z'
。最后,我们的任务是将 .NET Expression 转换为此查询语法。
免责声明:有一个 Nuget 包——WindowsAzure.Storage
(https://nuget.net.cn/packages/WindowsAzure.Storage/),它已经具备此功能,但我们的主要目标不是创建到具体语言——Azure 存储表语法的翻译,而是理解如何做到这一点,从而学会如何构建到任意语言的查询或解决其他自定义问题。
解决方案
每个 Expression 都是一个带有节点的树,每个节点也是 Expression。此外,每个节点都属于 Expression 的特定类型,该类型继承自基类。但是,当我们创建 Expression 树时,编译器无法告诉我们具体类型,因为可能存在无限数量的组合(树结构),例如,BinaryExpression
有两个子节点,每个子节点都可以属于任何 Expression 类型。因此,根节点及其子节点的实际类型以及树的结构仅在运行时可知。在编译时,节点将作为基 Expression 类型表示,结构将通过父节点到子节点的引用表示。
如果我们直接解决问题,我们将需要递归遍历 Expression 树,根据每个节点的类型处理它,也就是说,我们需要弄清楚我们正在处理的具体类型,然后相应地处理该节点。为此,我们将需要使用反射(expression.GetType()
等)。所以这将是一个非常困难且容易出错的方法。但幸运的是,有一个类型:ExpressionVisitor
。
访问者模式
访问者模式的理念——我们有一个类型,并希望在不修改它的情况下为其添加更多功能。为此,我们将创建一个包含此功能的 Visitor。每个 visitor 不仅可以为特定类型添加新功能,还可以为几种类型(ElementB
、ElementA
)添加新功能,这些类型继承自基类型(Element
)。所以这正是我们的情况,主要类型是Expression
(Element
),继承类型有:BinaryExpression
(ElementB
)、UnaryExpression
(ElementA
)等;新功能是查询构建(visitElementB
、visitElementA
)。Expression
(Element
)类型会accept
visitor,然后根据自身的类型调用相应的 visitor 方法:VisitUnary
(visitElementB
)或VisitBinary
(visitElementA
),并将自身作为参数传递给它。这样,我们的 visitor 方法中的参数将不属于基抽象 Expression 类型,而是属于实际类型。因此,我们不必通过丑陋的 expression.GetType()
来通过反射检测类型,而且 ExpressionVisitor
还 提供了 以简单方式递归遍历树的功能。这意味着它是我们的最佳选择。
实现
假设我们有一个类 Order
public class Order
{
public DateTime TheDate { get; set; }
public string Customer { get; set; }
public decimal Amount { get; set; }
public bool Discount { get; set; }
}
并且我们想编写如下业务代码
var date = DateTime.Today;
var order = new Order { Customer = "Tom", Amount = 1000 };
Expression<Func<Order, bool>> filter = x => (x.Customer == order.Customer && x.Amount > order.Amount)
||
(x.TheDate == date && !x.Discount);
var visitor = new FilterConstructor();
var query = visitor.GetQuery(filter);
var data = AzureTableReference.Get<Order>(query);
//desired formatted query:
//(
// (
// (Customer eq 'Tom')
// and
// (Amount gt 1000)
// )
// or
// (
// (TheDate eq datetime'2018-04-27T21:00:00.000Z')
// and
// not (Discount)
// )
//)
让我们创建一个继承自 ExpressionVisitor
的 FilterConstructor
类
public class FilterConstructor : ExpressionVisitor
{
private static readonly Dictionary<ExpressionType, string> _logicalOperators;
private static readonly Dictionary<Type, Func<object, string>> _typeConverters;
static FilterConstructor()
{
//mappings for table, shown above
_logicalOperators = new Dictionary<ExpressionType, string>
{
[ExpressionType.Not] = "not",
[ExpressionType.GreaterThan] = "gt",
[ExpressionType.GreaterThanOrEqual] = "ge",
[ExpressionType.LessThan] = "lt",
[ExpressionType.LessThanOrEqual] = "le",
[ExpressionType.Equal] = "eq",
[ExpressionType.Not] = "not",
[ExpressionType.AndAlso] = "and",
[ExpressionType.OrElse] = "or"
};
//if type is string we will wrap it into single quotes
//if it is a DateTime we will format it like datetime'2008-07-10T00:00:00Z'
//bool.ToString() returns "True" or "False" with first capital letter, so .ToLower() is applied
//if it is one of the rest "simple" types we will just call .ToString() method on it
_typeConverters = new Dictionary<Type, Func<object, string>>
{
[typeof(string)] = x => $"'{x}'",
[typeof(DateTime)] =
x => $"datetime'{((DateTime)x).ToUniversalTime().ToString("yyyy-MM-ddTHH:mm:ss.fffZ")}'",
[typeof(bool)] = x => x.ToString().ToLower()
};
}
private StringBuilder _queryStringBuilder;
private Stack<string> _fieldNames;
public FilterConstructor()
{
//here we will collect our query
_queryStringBuilder = new StringBuilder();
//will be discussed below
_fieldNames = new Stack<string>();
}
//entry point
public string GetQuery(LambdaExpression predicate)
{
//Visit transfer abstract Expression to concrete method, like VisitUnary
//it's invocation chain (at case of unary operator) approximetely looks this way:
//inside visitor: predicate.Body.Accept(ExpressionVisitor this)
//inside expression(visitor is this from above): visitor.VisitUnary(this)
//here this is Expression
//we not pass whole predicate, just Body, because we not need predicate.Parameters: "x =>" part
Visit(predicate.Body);
var query = _queryStringBuilder.ToString();
_queryStringBuilder.Clear();
return query;
}
protected override Expression VisitUnary(UnaryExpression node)
{
//assume we only allow not (!) unary operator:
if(node.NodeType != ExpressionType.Not)
throw new NotSupportedException("Only not(\"!\") unary operator is supported!");
_queryStringBuilder.Append($"{_logicalOperators[node.NodeType]} ");//!
_queryStringBuilder.Append("("); //(!
//go down from a tree
Visit(node.Operand); //(!expression
_queryStringBuilder.Append(")"); //(!expression)
//we should return expression, it will allow to create new expression based on existing one,
//but, at our case, it is not needed, so just return initial node argument
return node;
}
//corresponds to: and, or, greater than, less than, etc.
protected override Expression VisitBinary(BinaryExpression node)
{
_queryStringBuilder.Append("("); //(
//left side of binary operator
Visit(node.Left); //(leftExpr
_queryStringBuilder.Append($" {_logicalOperators[node.NodeType]} ");//(leftExpr and
//right side of binary operator
Visit(node.Right); //(leftExpr and RighExpr
_queryStringBuilder.Append(")"); //(leftExpr and RighExpr)
return node;
}
protected override Expression VisitMember(MemberExpression node)
{
//corresponds to: order.Customer, .order, today variables
//when we pass parameters to expression via closure, CLR internally creates class:
//class NameSpace+<>c__DisplayClass12_0
//{
// public Order order;
// public DateTime today;
//}
//which contains values of parameters. When we face order.Customer, it's node.Expression
//will not have reference to value "Tom", but instead reference to parent (.order), so we
//will go to it via Visit(node.Expression) and also save node.Member.Name into
//Stack(_fieldNames) fo further usage. order.Customer has type ExpressionType.MemberAccess.
//.order - ExpressionType.Constant, because it's node.Expression is ExpressionType.Constant
//(VisitConstant will be called) that is why we can get it's actual value(instance of Order).
//Our Stack at this point: "Customer" <- "order". Firstly we will get "order" field value,
//when it will be reached, on NameSpace+<>c__DisplayClass12_0 class instance
//(type.GetField(fieldName)) then value of "Customer" property
//(type.GetProperty(fieldName).GetValue(input)) on it. We started from
//order.Customer Expression then go up via reference to it's parent - "order", get it's value
//and then go back - get value of "Customer" property on order. Forward and backward
//directions, at this case, reason to use Stack structure
if(node.Expression.NodeType == ExpressionType.Constant
||
node.Expression.NodeType == ExpressionType.MemberAccess)
{
_fieldNames.Push(node.Member.Name);
Visit(node.Expression);
}
else
//corresponds to: x.Customer - just write "Customer"
_queryStringBuilder.Append(node.Member.Name);
return node;
}
//corresponds to: 1, "Tom", instance of NameSpace+<>c__DisplayClass12_0, instance of Order, i.e.
//any expression with value
protected override Expression VisitConstant(ConstantExpression node)
{
//just write value
_queryStringBuilder.Append(GetValue(node.Value));
return node;
}
private string GetValue(object input)
{
var type = input.GetType();
//if it is not simple value
if (type.IsClass && type != typeof(string))
{
//proper order of selected names provided by means of Stack structure
var fieldName = _fieldNames.Pop();
var fieldInfo = type.GetField(fieldName);
object value;
if (fieldInfo != null)
//get instance of order
value = fieldInfo.GetValue(input);
else
//get value of "Customer" property on order
value = type.GetProperty(fieldName).GetValue(input);
return GetValue(value);
}
else
{
//our predefined _typeConverters
if (_typeConverters.ContainsKey(type))
return _typeConverters[type](input);
else
//rest types
return input.ToString();
}
}
结论
我们的目标是学习如何遍历表达式树,以便将其翻译成任意语言,在本例中是 Azure 表存储条件语法。直接的方法太复杂且容易出错,但借助访问者模式,我们可以轻松实现。幸运的是,.NET Framework 为我们提供了 ExpressionVisitor
类 ,该类专门用于 Expressions 并具有所有必要的功能。我们只需要继承它并实现我们的自定义逻辑。