Write Your Own Dynamic LINQ Parser






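This is an alternative to "Dynamically generate a LINQ query with a custom property".
Introduction
This is an alternative to the original tip, Dynamically generate a LINQ query with a custom property.
The original post implements dynamic queries by modifying the client of the query. I consider that approach quite intrusive and not reusable in the future, as noted in my comment on the original post.
Here I outline how to roll your own version of dynamic LINQ. I introduce some parsing techniques that may also be useful for other simple parsing tasks. For "real" parser work, I strongly recommend a parser generator such as Coco/R or ANTLR. See also the Simple Mathematical Expression Parser, an example implemented with Coco/R.
What it does:
From an expression given as a string, create a function that can be used in a LINQ query, e.g.: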
IEnumerable<MySourceElement> items = ...;
...
string s = GetUserEntry(); // e.g. Name == "x" || Number >= 800
...
var pred = SimpleExpression.PredicateParser<MySourceElement>.Parse(s);
var f = pred.Compile();
var query = from e in items where f(e) select e;
...
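The snippet above compiles the parsed expression into a delegate before using it in a query. As a minimal sketch of why the parser returns an expression tree rather than a delegate, the following compares both ways of consuming it. The Item class and the hand-written predicate are assumptions that stand in for MySourceElement and for the result of Parse; they are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical stand-in for MySourceElement.
public class Item
{
    public string Name { get; set; }
    public int Number { get; set; }
}

public static class PredicateUsageSketch
{
    // In-memory sequences need a delegate, so the tree is compiled first.
    public static List<string> FilterInMemory(IEnumerable<Item> items, Expression<Func<Item, bool>> pred)
    {
        Func<Item, bool> f = pred.Compile();
        return items.Where(f).Select(e => e.Name).ToList();
    }

    // Queryable.Where accepts the expression tree as-is (no Compile call).
    public static List<string> FilterQueryable(IQueryable<Item> items, Expression<Func<Item, bool>> pred)
    {
        return items.Where(pred).Select(e => e.Name).ToList();
    }

    public static void Main()
    {
        var items = new List<Item>
        {
            new Item { Name = "a", Number = 1000 },
            new Item { Name = "x", Number = 500 },
            new Item { Name = "z", Number = 300 },
        };
        // Assumption: hand-written equivalent of the tree the parser is meant to produce.
        Expression<Func<Item, bool>> pred = e => e.Name == "x" || e.Number >= 800;
        Console.WriteLine(string.Join(",", FilterInMemory(items, pred)));                 // a,x
        Console.WriteLine(string.Join(",", FilterQueryable(items.AsQueryable(), pred))); // a,x
    }
}
```

Whether a given IQueryable provider can translate every node the parser emits depends on that provider, so the second form is worth testing against the real data source.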
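Using the code
This is a very condensed piece of code that shows how to write your own dynamic LINQ parser.
- The scanner is the abstract PredicateParser base class (#region scanner)
- The code generator is the first part of PredicateParser<TData> (#region code generator)
- The parser is the second part of PredicateParser<TData> (#region parser)
Implemented features
- names that refer to properties or fields of the lambda parameter
- double or int numbers
- strings
- nested expressions
- a type system of numbers, strings, and bool, with numeric type promotion
- the operators ||, &&, ==, !=, <, <=, >=, >, !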
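Before reading the full listing, it may help to try the scanner idea in isolation. The sketch below reuses the same regular expression as the listing that follows, but wraps it in a hypothetical TokenizerSketch class (my naming, not the author's) that simply returns the token list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class TokenizerSketch
{
    // Same alternation order as the article's pattern: multi-char operators,
    // then strings, numbers, words, and finally any single non-space char.
    private static readonly string Pattern = @"\s*(" + string.Join("|", new[]
    {
        string.Join("|", new[] { "||", "&&", "==", "!=", "<=", ">=" }.Select(op => Regex.Escape(op))),
        @"""(?:\\.|[^""])*""",
        @"\d+(?:\.\d+)?",
        @"\w+",
        @"\S",
    }) + @")\s*";

    // Group 1 holds the token without the surrounding whitespace.
    public static List<string> Tokenize(string input)
    {
        return Regex.Matches(input, Pattern).Cast<Match>()
                    .Select(m => m.Groups[1].Value).ToList();
    }

    public static void Main()
    {
        // Yields the tokens: Name, ==, "x", ||, Number, >=, 800
        foreach (string token in Tokenize("Name == \"x\" || Number >= 800"))
            Console.WriteLine("[" + token + "]");
    }
}
```

Because the last alternative matches any single non-space character, unexpected input still produces tokens, and it is then up to the parser to reject them.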
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Text.RegularExpressions;
namespace SimpleExpression
{
    public abstract class PredicateParser
    {
        #region scanner
        /// <summary>tokenizer pattern: Optional-Spaces...Token...Optional-Spaces</summary>
        private static readonly string _pattern = @"\s*(" + string.Join("|", new string[]
        {
            // operators and punctuation that are longer than one char: longest first
            string.Join("|", new string[] { "||", "&&", "==", "!=", "<=", ">=" }.Select(e => Regex.Escape(e))),
            @"""(?:\\.|[^""])*""", // string
            @"\d+(?:\.\d+)?", // number with optional decimal part
            @"\w+", // word
            @"\S", // other 1-char tokens (or eat up one character in case of an error)
        }) + @")\s*";
        /// <summary>get 1st char of current token (or a Space if no 1st char is obtained)</summary>
        private char Ch { get { return string.IsNullOrEmpty(Curr) ? ' ' : Curr[0]; } }
        /// <summary>move one token ahead</summary><returns>true = moved ahead, false = end of stream</returns>
        private bool Move() { return _tokens.MoveNext(); }
        /// <summary>the token stream implemented as IEnumerator<string></summary>
        private IEnumerator<string> _tokens;
        /// <summary>constructs the scanner for the given input string</summary>
        protected PredicateParser(string s)
        {
            _tokens = Regex.Matches(s, _pattern, RegexOptions.Compiled).Cast<Match>()
                .Select(m => m.Groups[1].Value).GetEnumerator();
            Move();
        }
        protected bool IsNumber { get { return char.IsNumber(Ch); } }
        protected bool IsDouble { get { return IsNumber && Curr.Contains('.'); } }
        protected bool IsString { get { return Ch == '"'; } }
        protected bool IsIdent { get { char c = Ch; return char.IsLower(c) || char.IsUpper(c) || c == '_'; } }
        /// <summary>throw an argument exception</summary>
        protected void Abort(string msg) { throw new ArgumentException("Error: " + (msg ?? "unknown error")); }
        /// <summary>get the current item of the stream or an empty string after the end</summary>
        protected string Curr { get { return _tokens.Current ?? string.Empty; } }
        /// <summary>get current and move to the next token (error if at end of stream)</summary>
        protected string CurrAndNext { get { string s = Curr; if (!Move()) Abort("data expected"); return s; } }
        /// <summary>get current and move to the next token if available</summary>
        protected string CurrOptNext { get { string s = Curr; Move(); return s; } }
        /// <summary>moves forward if current token matches and returns that (next token must exist)</summary>
        protected string CurrOpAndNext(params string[] ops)
        {
            string s = ops.Contains(Curr) ? Curr : null;
            if (s != null && !Move()) Abort("data expected");
            return s;
        }
        #endregion
    }
    public class PredicateParser<TData>: PredicateParser
    {
        #region code generator
        private static readonly Type _bool = typeof(bool);
        private static readonly Type[] _prom = new Type[]
        { typeof(decimal), typeof(double), typeof(float), typeof(ulong), typeof(long), typeof(uint),
          typeof(int), typeof(ushort), typeof(char), typeof(short), typeof(byte), typeof(sbyte) };
        /// <summary>enforce the type on the expression (by a cast) if not already of that type</summary>
        private static Expression Coerce(Expression expr, Type type)
        {
            return expr.Type == type ? expr : Expression.Convert(expr, type);
        }
        /// <summary>casts if needed the expr to the "largest" type of both arguments</summary>
        private static Expression Coerce(Expression expr, Expression sibling)
        {
            if (expr.Type != sibling.Type)
            {
                Type maxType = MaxType(expr.Type, sibling.Type);
                if (maxType != expr.Type) expr = Expression.Convert(expr, maxType);
            }
            return expr;
        }
        /// <summary>returns the first if both are same, or the largest type of both (or the first)</summary>
        private static Type MaxType(Type a, Type b) { return a==b?a:(_prom.FirstOrDefault(t=>t==a||t==b)??a); }
        /// <summary>
        /// Code generation of binary and unary expressions, utilizing type coercion where needed
        /// </summary>
        private static readonly Dictionary<string, Func<Expression, Expression, Expression>> _binOp =
            new Dictionary<string,Func<Expression,Expression,Expression>>()
        {
            { "||", (a,b)=>Expression.OrElse(Coerce(a, _bool), Coerce(b, _bool)) },
            { "&&", (a,b)=>Expression.AndAlso(Coerce(a, _bool), Coerce(b, _bool)) },
            { "==", (a,b)=>Expression.Equal(Coerce(a,b), Coerce(b,a)) },
            { "!=", (a,b)=>Expression.NotEqual(Coerce(a,b), Coerce(b,a)) },
            { "<", (a,b)=>Expression.LessThan(Coerce(a,b), Coerce(b,a)) },
            { "<=", (a,b)=>Expression.LessThanOrEqual(Coerce(a,b), Coerce(b,a)) },
            { ">=", (a,b)=>Expression.GreaterThanOrEqual(Coerce(a,b), Coerce(b,a)) },
            { ">", (a,b)=>Expression.GreaterThan(Coerce(a,b), Coerce(b,a)) },
        };
        private static readonly Dictionary<string, Func<Expression, Expression>> _unOp =
            new Dictionary<string, Func<Expression, Expression>>()
        {
            { "!", a=>Expression.Not(Coerce(a, _bool)) },
        };
        /// <summary>create a constant of a value</summary>
        private static ConstantExpression Const(object v) { return Expression.Constant(v); }
        /// <summary>create lambda parameter field or property access</summary>
        private MemberExpression ParameterMember(string s) { return Expression.PropertyOrField(_param, s); }
        /// <summary>create lambda expression</summary>
        private Expression<Func<TData, bool>> Lambda(Expression expr) { return Expression.Lambda<Func<TData, bool>>(expr, _param); }
        /// <summary>the lambda's parameter (all names are members of this)</summary>
        private readonly ParameterExpression _param = Expression.Parameter(typeof(TData), "_p_");
        #endregion
        #region parser
        /// <summary>initialize the parser (and thus, the scanner)</summary>
        private PredicateParser(string s): base(s) { }
        /// <summary>main entry point</summary>
        public static Expression<Func<TData, bool>> Parse(string s) { return new PredicateParser<TData>(s).Parse(); }
        private Expression<Func<TData, bool>> Parse() { return Lambda(ParseExpression()); }
        private Expression ParseExpression() { return ParseOr(); }
        private Expression ParseOr() { return ParseBinary(ParseAnd, "||"); }
        private Expression ParseAnd() { return ParseBinary(ParseEquality, "&&"); }
        private Expression ParseEquality() { return ParseBinary(ParseRelation, "==", "!="); }
        private Expression ParseRelation() { return ParseBinary(ParseUnary, "<", "<=", ">=", ">"); }
        private Expression ParseUnary() { return CurrOpAndNext("!") != null ? _unOp["!"](ParseUnary())
                                                                            : ParsePrimary(); }
        private Expression ParseIdent() { return ParameterMember(CurrOptNext); }
        private Expression ParseString() { return Const(Regex.Replace(CurrOptNext, "^\"(.*)\"$",
                                                                      m => m.Groups[1].Value)); }
        private Expression ParseNumber()
        {
            // parse with the invariant culture so that "1.5" is read the same way under any current locale
            var inv = System.Globalization.CultureInfo.InvariantCulture;
            if (IsDouble) return Const(double.Parse(CurrOptNext, inv));
            return Const(int.Parse(CurrOptNext, inv));
        }
        private Expression ParsePrimary()
        {
            if (IsIdent) return ParseIdent();
            if (IsString) return ParseString();
            if (IsNumber) return ParseNumber();
            return ParseNested();
        }
        private Expression ParseNested()
        {
            if (CurrAndNext != "(") Abort("(...) expected");
            Expression expr = ParseExpression();
            if (CurrOptNext != ")") Abort("')' expected");
            return expr;
        }
        /// <summary>generic parsing of binary expressions</summary>
        private Expression ParseBinary(Func<Expression> parse, params string[] ops)
        {
            Expression expr = parse();
            string op;
            while ((op = CurrOpAndNext(ops)) != null) expr = _binOp[op](expr, parse());
            return expr;
        }
        #endregion
    }
}
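The Coerce helpers in the code generator exist because the Expression factories do not promote numeric types on their own. As a minimal sketch of that point, the following builds a comparison between an int member and a double constant by hand. The Row class and the PromotionSketch name are assumptions for illustration only.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical data type with an int member.
public class Row
{
    public int Number { get; set; }
}

public static class PromotionSketch
{
    // Builds "p => (double)p.Number >= limit" by hand.
    public static Func<Row, bool> AtLeast(double limit)
    {
        ParameterExpression p = Expression.Parameter(typeof(Row), "p");
        Expression left = Expression.PropertyOrField(p, "Number"); // type int
        Expression right = Expression.Constant(limit);             // type double
        // Promote the int operand to double, which is what Coerce(a, b) does in the listing.
        left = Expression.Convert(left, typeof(double));
        Expression body = Expression.GreaterThanOrEqual(left, right);
        return Expression.Lambda<Func<Row, bool>>(body, p).Compile();
    }

    // Shows that the factory rejects mixed int/double operands without a Convert.
    public static bool MixedTypesRejected()
    {
        try
        {
            Expression.GreaterThanOrEqual(Expression.Constant(1), Expression.Constant(1.5));
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Func<Row, bool> f = AtLeast(800.5);
        Console.WriteLine(f(new Row { Number = 801 })); // True
        Console.WriteLine(f(new Row { Number = 800 })); // False
        Console.WriteLine(MixedTypesRejected());        // True
    }
}
```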
The following program calls the parser's entry point and runs several queries that use the computed expression.
static void Main(string[] args)
{
    var items = new List<Element>()
    {
        new Element("a", 1000),
        new Element("b", 900),
        new Element("c", 800),
        new Element("d", 700),
        new Element("e", 600),
        new Element("x", 500),
        new Element("y", 400),
        new Element("z", 300),
    };
    string s = "Name == \"x\" || Number >= 800";
    var pred = SimpleExpression.PredicateParser<Element>.Parse(s);
    Console.WriteLine("User Entry: {0}", s);
    Console.WriteLine("Expr Tree: {0}", pred.ToString());
    var f = pred.Compile();
    Console.WriteLine("### mark affected items ###");
    foreach (var item in items)
    {
        Console.WriteLine("{2} Name = {0}, Number = {1}", item.Name, item.Number, f(item) ? "x" : " ");
    }
    Console.WriteLine("### where-select ###");
    var q = from e in items where f(e) select e;
    foreach (var item in q)
    {
        Console.WriteLine(" Name = {0}, Number = {1}", item.Name, item.Number);
    }
}
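The Element type used by this program is not shown in the text. A minimal sketch that fits the constructor calls and the member names used above could look as follows. This is an assumption inferred from the sample (a string Name and an int Number); the class in the original download may differ.

```csharp
// Assumed shape of the sample's data type: Name is a string and Number is an int,
// which matches the printed tree (_p_.Number >= 800 appears without a conversion).
public class Element
{
    public string Name { get; private set; }
    public int Number { get; private set; }

    public Element(string name, int number)
    {
        Name = name;
        Number = number;
    }
}
```

Since the parser resolves names with Expression.PropertyOrField, public fields would work here just as well as properties.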
The output is:
User Entry: Name == "x" || Number >= 800
Expr Tree: _p_ => ((_p_.Name == "x") OrElse (_p_.Number >= 800))
### mark affected items ###
x Name = a, Number = 1000
x Name = b, Number = 900
x Name = c, Number = 800
  Name = d, Number = 700
  Name = e, Number = 600
x Name = x, Number = 500
  Name = y, Number = 400
  Name = z, Number = 300
### where-select ###
Name = a, Number = 1000
Name = b, Number = 900
Name = c, Number = 800
Name = x, Number = 500
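Extending the parser
Adding more operations
Extending the parser is easy: you can add more to the expression, and new operators in particular are simple to add.
- To add the + and - binary operators, add them to the _binOp dictionary (similar to ==, e.g. "+": Expression.Add(...), "-": Expression.Subtract(...)). Create ParseSum() as a copy of ParseRelation, pass "+", "-" as the ops, pass ParseSum to ParseRelation (in place of ParseUnary), and pass ParseUnary to ParseSum. That's it.
- Likewise for "*", "/", "%": create ParseMul as a copy of the ParseSum above, pass the right ParseXXX action, and add the respective Expression factories to the _binOp dictionary. Done.
- A unary "-" is added to the _unOp dictionary (no coercion needed). The parsing is done in the ParseUnary() function, e.g.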
...
    { ">", (a,b)=>Expression.GreaterThan(Coerce(a,b), Coerce(b,a)) },
    { "+", (a,b)=>Expression.Add(Coerce(a,b), Coerce(b,a)) },
    { "-", (a,b)=>Expression.Subtract(Coerce(a,b), Coerce(b,a)) },
    { "*", (a,b)=>Expression.Multiply(Coerce(a,b), Coerce(b,a)) },
    { "/", (a,b)=>Expression.Divide(Coerce(a,b), Coerce(b,a)) },
    { "%", (a,b)=>Expression.Modulo(Coerce(a,b), Coerce(b,a)) },
};
private static readonly Dictionary<string, Func<Expression, Expression>> _unOp =
    new Dictionary<string, Func<Expression, Expression>>()
{
    { "!", a=>Expression.Not(Coerce(a, _bool)) },
    { "-", a=>Expression.Negate(a) },
};
...
private Expression ParseRelation() { return ParseBinary(ParseSum, "<", "<=", ">=", ">"); }
private Expression ParseSum() { return ParseBinary(ParseMul, "+", "-"); }
private Expression ParseMul() { return ParseBinary(ParseUnary, "*", "/", "%"); }
private Expression ParseUnary()
{
    if (CurrOpAndNext("!") != null) return _unOp["!"](ParseUnary());
    if (CurrOpAndNext("-") != null) return _unOp["-"](ParseUnary());
    return ParsePrimary();
}
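The chain ParseRelation, ParseSum, ParseMul encodes operator precedence: each level hands its operands to the next tighter-binding level. As a hand-built sketch of the tree such a chain is meant to produce for an input like Number + 100 * 2 > 1000, consider the following. The Row class and the ArithmeticTreeSketch name are assumptions for illustration; the tree is built directly instead of by the parser.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical data type with an int member.
public class Row
{
    public int Number { get; set; }
}

public static class ArithmeticTreeSketch
{
    // Builds the tree for "Number + 100 * 2 > 1000" with the usual precedence:
    // the multiplication binds tighter than the addition, which binds tighter than ">".
    public static Expression<Func<Row, bool>> Build()
    {
        ParameterExpression p = Expression.Parameter(typeof(Row), "p");
        Expression number = Expression.PropertyOrField(p, "Number");
        Expression product = Expression.Multiply(Expression.Constant(100), Expression.Constant(2)); // ParseMul level
        Expression sum = Expression.Add(number, product);                                          // ParseSum level
        Expression body = Expression.GreaterThan(sum, Expression.Constant(1000));                  // ParseRelation level
        return Expression.Lambda<Func<Row, bool>>(body, p);
    }

    public static void Main()
    {
        Expression<Func<Row, bool>> tree = Build();
        Console.WriteLine(tree);                    // prints the nested structure of the tree
        Func<Row, bool> f = tree.Compile();
        Console.WriteLine(f(new Row { Number = 801 })); // True  (801 + 200 > 1000)
        Console.WriteLine(f(new Row { Number = 800 })); // False (800 + 200 > 1000 does not hold)
    }
}
```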
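Basic comparisons based on IComparable<T> instead of implicit operators
The original parser performs equality and comparison checks with the direct operators. A more general approach is to call a.CompareTo(b) rel 0, e.g. a.CompareTo(b) < 0 instead of a < b. The benefit is that you can use a broader range of types in comparisons. To achieve this, add or replace the following code: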
- Add to the code generator region, next to the _bool field:
    private static readonly Expression _zero = Expression.Constant(0);
- Add after the MaxType method:
    /// <summary>produce comparison based on IComparable types</summary>
    private static Expression CompareToExpression(Expression lhs, Expression rhs, Func<Expression, Expression> rel)
    {
        lhs = Coerce(lhs, rhs);
        rhs = Coerce(rhs, lhs);
        Expression cmp = Expression.Call(
            lhs,
            lhs.Type.GetMethod("CompareTo", new Type[] { rhs.Type })
                ?? lhs.Type.GetMethod("CompareTo", new Type[] { typeof(object) }),
            rhs);
        return rel(cmp);
    }
- Replace the "==", "!=", "<", "<=", ">=", ">" entries of the _binOp dictionary with the following:
    { "==", (a,b)=>CompareToExpression(a, b, c=>Expression.Equal (c, _zero)) },
    { "!=", (a,b)=>CompareToExpression(a, b, c=>Expression.NotEqual (c, _zero)) },
    { "<", (a,b)=>CompareToExpression(a, b, c=>Expression.LessThan (c, _zero)) },
    { "<=", (a,b)=>CompareToExpression(a, b, c=>Expression.LessThanOrEqual (c, _zero)) },
    { ">=", (a,b)=>CompareToExpression(a, b, c=>Expression.GreaterThanOrEqual(c, _zero)) },
    { ">", (a,b)=>CompareToExpression(a, b, c=>Expression.GreaterThan (c, _zero)) },
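To see what the CompareTo-based approach buys, strings are a handy test case: they implement IComparable<string> but define no < operator. The sketch below builds a.CompareTo(b) < 0 by hand for two string parameters. The CompareToSketch class and its method names are assumptions for illustration and are not part of the parser.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

public static class CompareToSketch
{
    // Builds "(a, b) => a.CompareTo(b) < 0" for strings.
    public static Func<string, string, bool> StringLessThan()
    {
        ParameterExpression a = Expression.Parameter(typeof(string), "a");
        ParameterExpression b = Expression.Parameter(typeof(string), "b");
        MethodInfo compareTo = typeof(string).GetMethod("CompareTo", new[] { typeof(string) });
        Expression cmp = Expression.Call(a, compareTo, b);
        Expression body = Expression.LessThan(cmp, Expression.Constant(0));
        return Expression.Lambda<Func<string, string, bool>>(body, a, b).Compile();
    }

    // Shows that the direct operator is not available for strings.
    public static bool DirectOperatorRejected()
    {
        try
        {
            Expression.LessThan(Expression.Constant("a"), Expression.Constant("b"));
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Func<string, string, bool> less = StringLessThan();
        Console.WriteLine(less("apple", "banana")); // True
        Console.WriteLine(less("banana", "apple")); // False
        Console.WriteLine(DirectOperatorRejected()); // True
    }
}
```

Note that string.CompareTo uses culture-sensitive ordering, so results for some strings can depend on the current culture.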
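Adding support for nested identifiers
As suggested by metbone, we can extend the parser to support nested identifiers of the form a.b.c. This is best done by replacing the ParseIdent() method with the following code: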
/// parsing single or nested identifiers. EBNF: ParseIdent = ident { "." ident } .
private Expression ParseIdent()
{
    Expression expr = ParameterMember(CurrOptNext);
    while (CurrOpAndNext(".") != null && IsIdent) expr = Expression.PropertyOrField(expr, CurrOptNext);
    return expr;
}
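The loop above chains one PropertyOrField call per identifier. The same idea can be tried outside the parser with a small helper that walks a dotted path. In this sketch, Customer, Address, and NestedMemberSketch are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical nested data types.
public class Address
{
    public string City { get; set; }
}

public class Customer
{
    public Address Home { get; set; }
}

public static class NestedMemberSketch
{
    // Turns "Home.City" into root.Home.City by chaining member accesses.
    public static Expression MemberPath(Expression root, string dottedPath)
    {
        Expression expr = root;
        foreach (string name in dottedPath.Split('.'))
            expr = Expression.PropertyOrField(expr, name);
        return expr;
    }

    // Builds and compiles "p => p.Home.City == city".
    public static Func<Customer, bool> LivesIn(string city)
    {
        ParameterExpression p = Expression.Parameter(typeof(Customer), "p");
        Expression member = MemberPath(p, "Home.City");
        Expression body = Expression.Equal(member, Expression.Constant(city, typeof(string)));
        return Expression.Lambda<Func<Customer, bool>>(body, p).Compile();
    }

    public static void Main()
    {
        Func<Customer, bool> inLinz = LivesIn("Linz");
        Console.WriteLine(inLinz(new Customer { Home = new Address { City = "Linz" } }));   // True
        Console.WriteLine(inLinz(new Customer { Home = new Address { City = "Zurich" } })); // False
    }
}
```

Neither this sketch nor the ParseIdent replacement adds null checks, so a null intermediate member (Home in this example) leads to a NullReferenceException when the compiled predicate runs.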
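Points of interest
As mentioned above, have a look at the available parser generators for real parser work.
- Coco/R: http://ssw.jku.at/Coco/
- ANTLR: http://www.antlr.org/
LINQ expression trees
- LINQ expression trees: http://msdn.microsoft.com/en-us/library/bb397951.aspx
- LINQ expressions: http://msdn.microsoft.com/en-us/library/system.linq.expressions.aspx
- Expression factories: http://msdn.microsoft.com/en-us/library/system.linq.expressions.expression.aspx
Dynamic LINQ
Any feedback is very welcome.
Please note: this is an alternative to the original tip. This alternative does not require the data provider to use special types of properties or fields. For real work in this area, have a look at the available dynamic LINQ implementations.
Have fun!
Andi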
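History
V1.0 | 2012-03-28 | Initial version.
V1.1 | 2012-03-29 | Fixed a typo in line 107, simplified ParsePrimary, fixed ParseNumber to distinguish int and double correctly, made some minor code formatting adjustments, added a source code download link, updated the tags (.NET4).
V1.2 | 2012-03-29 | Added sample code showing how easily the expression parser can be extended. Early adopter of the new code rendering feature <pre ... countlines="true" countstart="93">... (thanks to Chris Maunder for adding this feature so quickly!)
V1.3 | 2012-04-06 | Fixed the regular expressions for strings (embedded double quotes are handled properly) and numbers (the decimal part is optional).
V1.4 | 2012-09-04 | Fixed some minor typos.
V1.5 | 2013-05-20 | Added a description of how to extend the parser to do basic comparisons based on IComparable<T> (instead of the direct relational and equality operators).
V1.6 | 2014-03-04 | Added a link to the Simple Mathematical Expression Parser, fixed some typos, fixed broken highlighting, changed the type.
V1.7 | 2014-10-14 | Added support for nested identifiers (function calls and index access are not supported yet).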