使用解析导数计算求解非线性方程组

谢尔盖·L·格拉德基

5.00/5 (7投票s)

2015年10月26日

CPOL

3分钟阅读

29146

本文介绍了如何在 C# 中使用解析计算来求解非线性方程组。

下载 nle_analytical_derivatives.rar

引言

有许多库用于求解非线性方程组。它们实现了众所周知的数学算法 - 牛顿-拉弗森方法、莱文伯格-马夸特方法、鲍威尔的狗腿方法等等。这些算法不仅需要在某个点计算系统方程的值，还需要计算梯度、雅可比矩阵或系统的黑塞矩阵，这些都是系统的某种类型的导数。计算库的常见实现使用导数的数值计算 - 有限差分。本文介绍了一种替代方法，其中非线性系统以解析表达式的形式设置，并且导数表达式也以解析形式提供。解析表达式的计算和推导过程是使用 ANALYTICS C# 库进行的。

背景

为了理解非线性方程组求解的解析实现，让我们考虑一下该问题的一般编程环境。通常情况下，必须有两个主要类：NonlinearSystem – 表示非线性方程组和 NonlinearSolver – 表示非线性求解器。这些类的实现是在 MATHEMATICS 库中实现的，如下所示

    /// <summary>
    /// Base class for Nonlinear Equation system.
    /// </summary>
    public abstract class NonlinearSystem
    {
        /// <summary>
        /// Dimension (the number of equations).
        /// </summary>
        public int Dimension
        {
            get;
        }

        /// <summary>
        /// The system supports derivatives.
        /// </summary>
        public bool DerivativeSupported
        {
            get;
        }

        /// <summary>
        /// Calculates equation values for given variable values.
        /// </summary>
        /// <param name="x"></param>
        /// <param name="y"></param>
        /// <returns></returns>
        public abstract bool Calculate(double[] x, ref double[] y);

        /// <summary>
        /// Calculates the system's Jacobian.
        /// (must be overridden in inherited classes if derivative calculation supported)
        /// </summary>
        /// <param name="x"></param>
        /// <param name="J"></param>
        /// <returns></returns>
        public virtual bool Jacobian(double[] x, ref double[][] J)
        {…}
    }

    /// <summary>
    /// Abstract nonlinear solver.
    /// </summary>
    public abstract class NonlinearSolver
    {
        /// <summary>
        /// Solution method.
        /// </summary>
        /// <param name="system"></param>
        /// <param name="x0"></param>
        /// <param name="opt"></param>
        /// <param name="x"></param>
        /// <returns></returns>
        public abstract SolutionResult Solve(NonlinearSystem system, double[] x0,
    SolverOptions opt, ref double[] x);
    }

对于常见的非线性求解器，它们是迭代的，提供了继承类 IterativeSolver，它以迭代步骤序列的形式覆盖 Solve 方法

    /// <summary>
    /// Base Abstract Iterative Solver.
    /// </summary>
    public abstract class IterativeSolver : NonlinearSolver
    {
        /// <summary>
        /// Iteration.
        /// </summary>
        /// <param name="system"></param>
        /// <param name="x0">Current x values</param>
        /// <param name="x1">Next (calculated) x values</
        /// <returns></returns>
        protected abstract bool Iteration(NonlinearSystem system, double[] x0, ref double[] x1);

        /// <summary>
        /// Common Iterative Solution method.
        /// </summary>
        public override SolutionResult Solve(NonlinearSystem system, double[] x0,
    SolverOptions opt, ref double[] x)
        {
            // iteration process here
        }

    }

MATHEMATICS 库实现了几个最终的求解器类。NewtonRaphsonSolver 类在所有示例中用作求解器，它需要导数（雅可比矩阵）的计算。

醒悟

为了实现使用解析导数计算的方法，需要一个解析评估引擎。ANALYTICS C# 库用于此目的。该库不仅提供了解析表达式评估，还提供了解析导数计算。使用此库，实现了 NonlinearSystem 的以下派生类

    /// <summary>
    /// System of Nonlinear Analytical Equations.
    /// </summary>
    public class AnalyticalSystem: ConstructedSystem
    {
        #region Analytical data members
        protected Translator translator;
        protected string[] functions;
        protected string[,] dfunctions;
        protected Formula[] formulae;
        protected Formula[,] fderivatives;
        #endregion Analytical data members

        /// <summary>
        /// Creates translator and adds Real variables to it.
        /// </summary>
        /// <param name="variables"></param>
        /// <returns></returns>
        protected bool AssignVariables(string[] variables)
        {
            if (variables == null || variables.Length == 0) return false;

            translator = new Translator();

            int l = variables.Length;

            for (int i = 0; i < l; i++)
            {
                if (!translator.Add(variables[i], (double)0.0))
                {
                    throw new InvalidNameException(variables[i]);
                }
            }

            return true;
        }

        /// <summary>
        /// Creates formulae for equations and their derivatives.
        /// </summary>
        /// <param name="f"></param>
        /// <returns></returns>
        protected bool AssignFunctions(string[] f)
        {
            if (f == null || f.Length == 0) return false;

            int l = f.Length;
            functions = new string[l];
            formulae = new Formula[l];

            for (int i = 0; i < l; i++)
            {
                if (!translator.CheckSyntax(f[i]))
                {
                    functions = null;
                    formulae = null;
                    return false;
                }

                functions[i] = f[i];
                formulae[i] = translator.BuildFormula(functions[i]);
                if (formulae[i].ResultType != typeof(double))
                {
                    Type t = formulae[i].ResultType;
                    functions = null;
                    formulae = null;
                    throw new WrongArgumentException("Function must return real value.",
typeof(double), t);
                }
            }

            dfunctions = new string[l, l];
            fderivatives = new Formula[l,l];
            try
            {
                for (int i = 0; i < l; i++)
                {
                    for (int j = 0; j < l; j++)
                    {
                         dfunctions[i, j] = translator.Derivative(functions[i],
                                                                 translator.Variables[j].Name);

                         fderivatives[i, j] = translator.BuildFormula(dfunctions[i, j]);
                    }
                }
            }

            catch (Exception)
            {
                // if some derivative could not be calculated -
                // the system does not support Jacobian.
                fderivatives = null;
            }
            return true;
        }

        /// <summary>
        /// Assigns variable values.
        /// </summary>
        /// <param name="values"></param>
        protected void AssignVariableValues(double[] values)
        {
            int l = values.Length;

            for (int i = 0; i < l; i++)
            {
                translator.Variables[i].Value = values[i];
            }
        }

        /// <summary>
        /// Calculates equation result for current variable values.
        /// </summary>
        /// <param name="i">Equation number</param>
        /// <returns></returns>
        protected double Equation(int i)
        {
            return (double)formulae[i].Calculate();
        }

        /// <summary>
        /// Calculates derivative result for current variable values.
        /// </summary>
        /// <param name="i">Equation number</param>
        /// <param name="j">Variable number</param>
        /// <returns></returns>
        protected double Derivative(int i, int j)
        {
            return (double)fderivatives[i,j].Calculate();
        }

        /// <summary>
        /// Creates equation (and derivative) delegates
        /// based on the created formulae objects.
        /// </summary>
        protected void CreateEquations()
        {
            int l = formulae.Length;

            equations = new Equation[l];

            for (int i = 0; i < l; i++)
            {
                // DO NOT remove temp variable,
                // using 'i' directly leads to incorrect lambda.
                int temp = i;
                     
                equations[i] = (double[] x) =>
                    {
                        AssignVariableValues(x);
                        return Equation(temp);
                    };
            }


            // if formulae of derivatives are not assigned -
            // system will not support Jacobian.
            if (fderivatives != null)
            {
                derivatives = new Derivative[l];
                for (int i = 0; i < l; i++)
                {
                    // DO NOT remove temp variable,
                    // using 'i' directly leads to incorrect lambda.
                    int tempi = i;

                    derivatives[i] = (int j, double[] x) =>
                    {
                        AssignVariableValues(x);
                        return Derivative(tempi, j);
                    };
                }
            }
        }

        /// <summary>
        /// Constructor.
        /// </summary>
        /// <param name="variables">Variable names, the system must be resolved of.</param>
        /// <param name="f">System Equation Functions</param>
        public AnalyticalSystem(string[] variables, string[] f)
        {
            if (variables.Length != f.Length)
            {
                throw new WrongArgumentException("Variable count must be equal to equation count.",
  variables.Length, f.Length);   
            }

            if (!AssignVariables(variables))
            {
                return;
            }

            if (!AssignFunctions(f))
            {
                return;
            }

            CreateEquations();
        }
    }

系统实现的主要特征可以从构造函数和 AssignFunctions 方法中看出。系统的构造函数获取以解析表达式形式提供的方程（字符串数组）。无需为导数（雅可比矩阵）提供解析表达式（或任何 lambda），因为它们是在 AssignFunctions 方法中通过一行代码自动计算出来的

dfunctions[i, j] = translator.Derivative(functions[i], translator.Variables[j].Name);

这种方法有一些好处。首先，无需为系统方程和雅可比矩阵提供 lambda 表达式。其次，无需为雅可比矩阵函数进行推导过程，该过程由提供的解析引擎自动完成。最后，解析表达式的计算比数值有限差分给出了更精确的导数值估计，后者缺乏步长的选择。

测试示例

VS 2010 解决方案中提供了两个测试示例。这些示例演示了使用解析系统求解几何问题 - 寻找二维空间中曲线和曲面的交点。3D 案例的代码

       // variables of the analytical system
       string[] variables = { "x", "y", "z" };
       // analytical expressions of the system's equations
       string[] functions = { "x^2+y^2+z^2-1", // sphere
                              "x^2+y^2-z",     // paraboloid along the z axis
                              "-x+2*y^2+z^2"   // paraboloid along the x axis
                                          };

       // creating nonlinear system instance - Analytical System
       NonlinearSystem system = new AnalyticalSystem(variables, functions);

       double[] x0;
       SolverOptions options;
       double[] result;
       double[] expected;
       SolutionResult actual;

       // creating nonlinear solver instance - Newton-Raphson solver.
       NonlinearSolver solver = new NewtonRaphsonSolver();

       x0 = new double[] { 1.0, 1.0, 1.0 }; // initial guess for variable values
       options = new SolverOptions()        // options for solving nonlinear problem
       {
              MaxIterationCount = 100,
              SolutionPrecision = 1e-5
       };

       result = null;

       // solving the system
       actual = solver.Solve(system, x0, options, ref result);

       expected = new double[] { 0.6835507455, 0.3883199288, 0.6180339887 }; // expected values
       // printing solution result into console out
       PrintNLResult(system, actual, result, expected, options.SolutionPrecision);

测试执行的结果被打印到 VS 控制台输出。对于之前的代码，输出如下

Analytical equations system

Dimension: 3

Support derivatives: True

Equations:

[0]=x^2+y^2+z^2-1

[1]=x^2+y^2-z

[2]=-x+2*y^2+z^2

Jacobian:

[0][x]=2*x^(2-1)

[0][y]=2*y^(2-1)

[0][z]=2*z^(2-1)

[1][x]=2*x^(2-1)

[1][y]=2*y^(2-1)

[1][z]=-1

[2][x]=-1

[2][y]=(2*y^(2-1))*2

[2][z]=2*z^(2-1)

Solution

Converged: True

RESULT: 0.683550745474947 0.3883237584331 0.618033988749989

EXPECT: 0.6835507455 0.3883199288 0.6180339887

SUCCESS

该解以 1e-5 的指定精度收敛，仅需 4 次迭代。输出还显示了为求解器算法计算的雅可比矩阵的解析表达式。

结论

本文介绍了一种使用解析导数求解非线性方程组的简单实现。该实现使用“即用即用”的解析评估引擎来提供导数。在提供的代码中，提供了非线性方程组的简单类。这个类之所以简单，是因为它只在公式中使用双精度值。这仅用于解释这个想法，并不意味着这种方法不能扩展到具有其他类型数据的更一般的情况。例如，同样的方法已成功应用于为大型数值库 ILNumerics 提供解析能力。