65.9K
CodeProject 正在变化。 阅读更多。
Home

Jace.NET:.NET 的又一个计算引擎

starIconstarIconstarIconstarIcon
emptyStarIcon
starIcon

4.82/5 (30投票s)

2013年11月18日

CPOL

5分钟阅读

viewsIcon

53724

downloadIcon

599

Jace.NET 的技术架构,一个我业余时间开发的开源框架。

引言

在本文中,我将解释 Jace.NET(https://github.com/pieterderycke/Jace)的技术架构,这是一个我业余时间开发的开源框架。Jace.NET 是一个高性能的 .NET 平台计算引擎,可以动态解释和执行包含数学函数的字符串。这些函数可以依赖于变量。如果使用了变量,可以在数学函数执行时为这些变量提供值。Jace.NET 可用于各种 .NET 版本:.NET、WinRT、WP7 和 WP8。

背景

在各种应用程序中,都需要有适用于不同业务场景的公式。例如薪资系统、科学模拟、金融应用等。然而,对于普通程序员来说,实现这样的引擎并不是一项特别简单的任务:运算优先级规则使得解析数学公式非常困难。

然而,很多开发人员都不知道,这个问题早在大约五 (!) 十年前就已经被荷兰计算机科学家 Edsger W. Dijkstra解决了。他提出了简单的“应-yard”算法,非常受到火车道岔功能的启发。这个算法很简单,并且可以在一页的伪代码中进行解释。

我并非第一个为 .NET 框架开发计算引擎的人,该引擎允许开发人员在不被解析数学公式的复杂性所困扰的情况下,为他们的应用程序引入动态公式。但我遇到了一个事实,即现有框架存在设计限制,不允许轻松移植到更新的 .NET 平台,如 Windows Phone 或 Windows RT。因此,我决定创建 Jace.NET(Just another calculation for .NET),一个基于应-yard 算法的解析器。Jace.NET 提供了一个在所有支持的平台上完全相同的公共 API。

架构

Jace.NET 的架构与现代编译器类似:解释和执行分多个步骤进行。每个步骤都专注于公式解析和解释的一个方面。这使得整体复杂性易于管理。

分词

过程从标记化阶段开始。在此阶段,输入字符串被转换为各种允许的标记:整数、双精度数、运算和变量。如果输入字符串的某部分包含与任何标记类型都不匹配的文本,则会抛出异常,Jace 将停止。

someVariable + 2 * 3 => [‘someVariable’, ‘+’, ‘2’, ‘3’]

/// <summary>
/// Read in the provided formula and convert it into a list of tokens that can be processed by the
/// Abstract Syntax Tree Builder.
/// </summary>
/// <param name="formula">
/// The formula that must be converted into a list of tokens.</param>
/// <returns>The list of tokens for the provided formula.</returns>
public List<Token> Read(string formula)
{
    if (string.IsNullOrEmpty(formula))
        throw new ArgumentNullException("formula");

    List<Token> tokens = new List<Token>();

    char[] characters = formula.ToCharArray();

    bool isFormulaSubPart = true;

    for(int i = 0; i < characters.Length; i++)
    {
        if (IsPartOfNumeric(characters[i], true, isFormulaSubPart))
        {
            string buffer = "" + characters[i];
            int startPosition = i;

            while (++i < characters.Length && 
                     IsPartOfNumeric(characters[i], false, isFormulaSubPart))
            {
                buffer += characters[i];
            }

            // Verify if we do not have an int
            int intValue;
            if (int.TryParse(buffer, out intValue))
            {
                tokens.Add(new Token() { TokenType = TokenType.Integer, 
                  Value = intValue, StartPosition = startPosition, Length = i - startPosition });
                isFormulaSubPart = false;
            }
            else
            {
                double doubleValue;
                if (double.TryParse(buffer, NumberStyles.Float | NumberStyles.AllowThousands,
                    cultureInfo, out doubleValue))
                {
                    tokens.Add(new Token() { TokenType = TokenType.FloatingPoint, 
                      Value = doubleValue, 
                      StartPosition = startPosition, Length = i - startPosition });
                    isFormulaSubPart = false;
                }
                else if (buffer == "-")
                {
                    // Verify if we have a unary minus, 
                    // we use the token '_' for a unary minus in the AST builder
                    tokens.Add(new Token() { TokenType = TokenType.Operation, 
                      Value = '_', StartPosition = startPosition, Length = 1 });
                }
                // Else we skip
            }

            if (i == characters.Length)
            {
                // Last character read
                continue;
            }
        }

        if (IsPartOfVariable(characters[i], true))
        {
            string buffer = "" + characters[i];
            int startPosition = i;

            while (++i < characters.Length && IsPartOfVariable(characters[i], false))
            {
                buffer += characters[i];
            }

            tokens.Add(new Token() { TokenType = TokenType.Text, 
              Value = buffer, StartPosition = startPosition, Length = i -startPosition });
            isFormulaSubPart = false;

            if (i == characters.Length)
            {
                // Last character read
                continue;
            }
        }
        if (characters[i] == this.argumentSeparator)
        {
            tokens.Add(new Token() { TokenType = Tokenizer.TokenType.ArgumentSeparator, 
              Value = characters[i], StartPosition = i, Length = 1 });
            isFormulaSubPart = false;
        }
        else
        {
            switch (characters[i])
            { 
                case ' ':
                    continue;
                case '+':
                case '-':
                case '*':
                case '/':
                case '^':
                case '%':
                    if (IsUnaryMinus(characters[i], tokens))
                    {
                        // We use the token '_' for a unary minus in the AST builder
                        tokens.Add(new Token() { TokenType = TokenType.Operation, 
                          Value = '_', StartPosition = i, Length = 1 });
                    }
                    else
                    {
                        tokens.Add(new Token() { TokenType = TokenType.Operation, 
                          Value = characters[i], 
                          StartPosition = i, Length = 1 });                            
                    }
                    isFormulaSubPart = true;
                    break;
                case '(':
                    tokens.Add(new Token() { TokenType = TokenType.LeftBracket, 
                      Value = characters[i], StartPosition = i, Length = 1 });
                    isFormulaSubPart = true;
                    break;
                case ')':
                    tokens.Add(new Token() { TokenType = TokenType.RightBracket, 
                      Value = characters[i], 
                      StartPosition = i, Length = 1 });
                    isFormulaSubPart = false;
                    break;
                default:
                    break;
            }
        }
    }

    return tokens;
} 

抽象语法树构建

标记化成功完成后,将构建一个抽象语法树 (AST)。这个抽象语法树是一个树状数据模型,在内存中清晰地表示数学公式。在构建抽象语法树时,会考虑所有数学运算优先级规则。Jace 使用一个受 Dijkstra 的应-yard 算法启发的算法来创建这个 AST。

优化

创建 AST 后,优化器会尝试简化抽象语法树:如果公式的某一部分不依赖于变量而仅依赖于常量。这部分树将被计算并替换为树中的常量。

public Operation Optimize(Operation operation, IFunctionRegistry functionRegistry)
{
    if (!operation.DependsOnVariables && operation.GetType() != typeof(IntegerConstant)
        && operation.GetType() != typeof(FloatingPointConstant))
    {
        double result = executor.Execute(operation, functionRegistry);
        return new FloatingPointConstant(result);
    }
    else
    {
        if (operation.GetType() == typeof(Addition))
        {
            Addition addition = (Addition)operation;
            addition.Argument1 = Optimize(addition.Argument1, functionRegistry);
            addition.Argument2 = Optimize(addition.Argument2, functionRegistry);
        }
        else if (operation.GetType() == typeof(Subtraction))
        {
            Subtraction substraction = (Subtraction)operation;
            substraction.Argument1 = Optimize(substraction.Argument1, functionRegistry);
            substraction.Argument2 = Optimize(substraction.Argument2, functionRegistry);
        }
        else if (operation.GetType() == typeof(Multiplication))
        {
            Multiplication multiplication = (Multiplication)operation;
            multiplication.Argument1 = Optimize(multiplication.Argument1, functionRegistry);
            multiplication.Argument2 = Optimize(multiplication.Argument2, functionRegistry);
        }
        else if (operation.GetType() == typeof(Division))
        {
            Division division = (Division)operation;
            division.Dividend = Optimize(division.Dividend, functionRegistry);
            division.Divisor = Optimize(division.Divisor, functionRegistry);
        }
        else if (operation.GetType() == typeof(Exponentiation))
        {
            Exponentiation division = (Exponentiation)operation;
            division.Base = Optimize(division.Base, functionRegistry);
            division.Exponent = Optimize(division.Exponent, functionRegistry);
        }

        return operation;
    }
} 

操作码生成

最后阶段是操作码生成。在此阶段,将创建一个 .NET 动态方法,并生成必要的 MSIL 来执行公式。为此,我依赖于 .NET 框架的两个高级 API:Reflection.emit 和 Expression Trees。在这两个框架中,Reflection.Emit 是最古老的,也是最低级的。它允许通过发出必要的 MSIL 指令(MSIL 是 CLR 的汇编语言)来动态创建类和方法。此框架支持 Windows Phone 和标准的 .NET 框架,但不幸的是不支持 WinRT。Expression Trees 是微软生成动态方法的较新 API,提供了一种高于 MSIL 的抽象。不幸的是,此 API (目前) 不支持 Windows Phone。

.NET 4.0 WinRT Windows Phone
Reflection.emitPOP
表达式树PPO

Jace.NET 源代码中负责操作码生成的组件称为 DynamicCompiler。该组件有两个实现:一个基于 Reflection.Emit,一个基于 Expression Trees。根据使用的平台,会使用其中一个实现。这由 Jace.NET 内部处理,对于使用 Jace.NET 的开发人员来说是透明的。

Reflection.Emit 示例

private void GenerateMethodBody(ILGenerator generator, Operation operation, 
    IFunctionRegistry functionRegistry)
{
    if (operation == null)
        throw new ArgumentNullException("operation");
 
    if (operation.GetType() == typeof(IntegerConstant))
    {
        IntegerConstant constant = (IntegerConstant)operation;
        
        generator.Emit(OpCodes.Ldc_I4, constant.Value);
        generator.Emit(OpCodes.Conv_R8);
    }
    else if (operation.GetType() == typeof(FloatingPointConstant))
    {
        FloatingPointConstant constant = (FloatingPointConstant)operation;
 
        generator.Emit(OpCodes.Ldc_R8, constant.Value);
    }
    else if (operation.GetType() == typeof(Variable))
    {
        Type dictionaryType = typeof(Dictionary<string, double>);
 
        Variable variable = (Variable)operation;
 
        Label throwExceptionLabel = generator.DefineLabel();
        Label returnLabel = generator.DefineLabel();
 
        generator.Emit(OpCodes.Ldarg_0);
        generator.Emit(OpCodes.Callvirt, 
        typeof(FormulaContext).GetProperty("Variables").GetGetMethod());
        generator.Emit(OpCodes.Ldstr, variable.Name);
        generator.Emit(OpCodes.Callvirt, dictionaryType.GetMethod
        	("ContainsKey", new Type[] { typeof(string) }));
        generator.Emit(OpCodes.Ldc_I4_0);
        generator.Emit(OpCodes.Ceq);
        generator.Emit(OpCodes.Brtrue_S, throwExceptionLabel);
 
        generator.Emit(OpCodes.Ldarg_0);
        generator.Emit(OpCodes.Callvirt, 
        typeof(FormulaContext).GetProperty("Variables").GetGetMethod());
        generator.Emit(OpCodes.Ldstr, variable.Name);
        generator.Emit(OpCodes.Callvirt, 
        dictionaryType.GetMethod("get_Item", new Type[] { typeof(string) }));
        generator.Emit(OpCodes.Br_S, returnLabel);
 
        generator.MarkLabel(throwExceptionLabel);
        generator.Emit(OpCodes.Ldstr, string.Format(
          "The variable \"{0}\" used is not defined.", variable.Name));
        generator.Emit(OpCodes.Newobj, 
          typeof(VariableNotDefinedException).GetConstructor(new Type[] { typeof(string) }));
        generator.Emit(OpCodes.Throw);
 
        generator.MarkLabel(returnLabel);
    }
    else if (operation.GetType() == typeof(Multiplication))
    {
        Multiplication multiplication = (Multiplication)operation;
        GenerateMethodBody(generator, multiplication.Argument1, functionRegistry);
        GenerateMethodBody(generator, multiplication.Argument2, functionRegistry);
 
        generator.Emit(OpCodes.Mul);
    }
    else if (operation.GetType() == typeof(Addition))
    {
        Addition addition = (Addition)operation;
        GenerateMethodBody(generator, addition.Argument1, functionRegistry);
        GenerateMethodBody(generator, addition.Argument2, functionRegistry);
 
        generator.Emit(OpCodes.Add);
    }
    
    // ...
} 

Expression Trees 示例

private Expression GenerateMethodBody
(Operation operation, ParameterExpression contextParameter,
    IFunctionRegistry functionRegistry)
{
    if (operation == null)
        throw new ArgumentNullException("operation");
 
    if (operation.GetType() == typeof(IntegerConstant))
    {
        IntegerConstant constant = (IntegerConstant)operation;
 
        return Expression.Convert
        (Expression.Constant(constant.Value, typeof(int)), typeof(double));
    }
    else if (operation.GetType() == typeof(FloatingPointConstant))
    {
        FloatingPointConstant constant = (FloatingPointConstant)operation;
 
        return Expression.Constant(constant.Value, typeof(double));
    }
    else if (operation.GetType() == typeof(Variable))
    {
        Type contextType = typeof(FormulaContext);
        Type dictionaryType = typeof(Dictionary<string, double>);
 
        Variable variable = (Variable)operation;
 
        Expression getVariables = Expression.Property(contextParameter, "Variables");
 
        Expression isInDictionaryExpression = Expression.Call(getVariables, 
            dictionaryType.GetRuntimeMethod("ContainsKey", new Type[] { typeof(string) }),
            Expression.Constant(variable.Name));
 
        Expression throwException = Expression.Throw(
            Expression.New(typeof
            (VariableNotDefinedException).GetConstructor(new Type[] { typeof(string) }),
                Expression.Constant(string.Format
                ("The variable \"{0}\" used is not defined.", variable.Name))));
 
        LabelTarget returnLabel = Expression.Label(typeof(double));
 
        return Expression.Block(
            Expression.IfThenElse(
                isInDictionaryExpression,
                Expression.Return(returnLabel, Expression.Call(getVariables, 
                    dictionaryType.GetRuntimeMethod
                    ("get_Item", new Type[] { typeof(string) }), 
                    Expression.Constant(variable.Name))),
                throwException
            ),
            Expression.Label(returnLabel, Expression.Constant(0.0))
        );
    }
    else if (operation.GetType() == typeof(Multiplication))
    {
        Multiplication multiplication = (Multiplication)operation;
        Expression argument1 = GenerateMethodBody
        (multiplication.Argument1, contextParameter, functionRegistry);
        Expression argument2 = GenerateMethodBody
        (multiplication.Argument2, contextParameter, functionRegistry);
 
        return Expression.Multiply(argument1, argument2);
    }
    else if (operation.GetType() == typeof(Addition))
    {
        Addition addition = (Addition)operation;
        Expression argument1 = GenerateMethodBody
        	(addition.Argument1, contextParameter, functionRegistry);
        Expression argument2 = GenerateMethodBody
        	(addition.Argument2, contextParameter, functionRegistry);
 
        return Expression.Add(argument1, argument2);
    }
    
    // ...
} 

生成的动态方法会缓存在内存中。如果将来使用不同的变量值再次执行相同的公式,则会跳过解释步骤,并直接执行动态方法。如果计算公式经常重复出现,Jace.NET 具有接近编译代码的性能。

使用 Jace.NET

将 Jace.NET 添加到解决方案的最简单方法是使用 NuGet。在撰写本文时,最新稳定版本是“0.8.3”:https://nuget.net.cn/packages/Jace

要开始使用 Jace,需要创建一个计算引擎类的实例。这个计算引擎将负责解释数学公式和缓存以前使用的公式。此外,它允许轻松定义新函数和常量。它充当所有 Jace 内部组件的包装器,并为开发人员提供简单的 API。

使用提供的变量直接执行给定的数学公式

CalculationEngine engine = new CalculationEngine();
 
double result1 = engine.Calculate("1+2-3*4/5+6-7*8/9+0");
 
engine.AddFunction("test", (a, b) => a + b);
double result2 = engine.Calculate("test(2,3)");
 
Dictionary<string, double> variables = new Dictionary<string, double>();
variables.Add("variable", 2);
variables.Add("otherVariable", 4.2);
double result3 = engine.Calculate("max(sin(variable), cos(otherVariable))", variables); 

构建一个接受包含每个变量值的字典作为输入的 .NET Func

CalculationEngine engine = new CalculationEngine();
Func<Dictionary<string, double>, 
double> formula = engine.Build("var1+2/(3*otherVariable)");
 
Dictionary<string, double> variables = new Dictionary<string, double>();
variables.Add("var1", 2);
variables.Add("otherVariable", 4.2);
 
double result = formula(variables); 

Jace.NET 还允许构建类型化的 Func

CalculationEngine engine = new CalculationEngine();
Func<int, double, double> formula = 
      (Func<int, double, double>)engine.Function("var1+2/(3*otherVariable)")
    .Parameter("var1", DataType.Integer)
    .Parameter("otherVariable", DataType.FloatingPoint)
    .Result(DataType.FloatingPoint)
    .Build();
 
double result = formula(2, 4.2); 

延伸阅读

历史

  • 2013/11/18:初始版本
© . All rights reserved.