使用解析导数计算求解非线性方程组





5.00/5 (7投票s)
本文介绍了如何在 C# 中使用解析计算来求解非线性方程组。
引言
有许多库用于求解非线性方程组。它们实现了众所周知的数学算法 - 牛顿-拉弗森方法、莱文伯格-马夸特方法、鲍威尔的狗腿方法等等。这些算法不仅需要在某个点计算系统方程的值,还需要计算梯度、雅可比矩阵或系统的黑塞矩阵,这些都是系统的某种类型的导数。计算库的常见实现使用导数的数值计算 - 有限差分。本文介绍了一种替代方法,其中非线性系统以解析表达式的形式设置,并且导数表达式也以解析形式提供。解析表达式的计算和推导过程是使用 ANALYTICS C# 库进行的。
背景
为了理解非线性方程组求解的解析实现,让我们考虑一下该问题的一般编程环境。通常情况下,必须有两个主要类:NonlinearSystem – 表示非线性方程组 和 NonlinearSolver – 表示非线性求解器。这些类的实现是在 MATHEMATICS 库中实现的,如下所示
/// <summary>
/// Base class for Nonlinear Equation system.
/// </summary>
public abstract class NonlinearSystem
{
/// <summary>
/// Dimension (the number of equations).
/// </summary>
public int Dimension
{
get;
}
/// <summary>
/// The system supports derivatives.
/// </summary>
public bool DerivativeSupported
{
get;
}
/// <summary>
/// Calculates equation values for given variable values.
/// </summary>
/// <param name="x"></param>
/// <param name="y"></param>
/// <returns></returns>
public abstract bool Calculate(double[] x, ref double[] y);
/// <summary>
/// Calculates the system's Jacobian.
/// (must be overridden in inherited classes if derivative calculation supported)
/// </summary>
/// <param name="x"></param>
/// <param name="J"></param>
/// <returns></returns>
public virtual bool Jacobian(double[] x, ref double[][] J)
{…}
}
/// <summary>
/// Abstract nonlinear solver.
/// </summary>
public abstract class NonlinearSolver
{
/// <summary>
/// Solution method.
/// </summary>
/// <param name="system"></param>
/// <param name="x0"></param>
/// <param name="opt"></param>
/// <param name="x"></param>
/// <returns></returns>
public abstract SolutionResult Solve(NonlinearSystem system, double[] x0,
SolverOptions opt, ref double[] x);
}
对于常见的非线性求解器,它们是迭代的,提供了继承类 IterativeSolver,它以迭代步骤序列的形式覆盖 Solve 方法
/// <summary>
/// Base Abstract Iterative Solver.
/// </summary>
public abstract class IterativeSolver : NonlinearSolver
{
/// <summary>
/// Iteration.
/// </summary>
/// <param name="system"></param>
/// <param name="x0">Current x values</param>
/// <param name="x1">Next (calculated) x values</
/// <returns></returns>
protected abstract bool Iteration(NonlinearSystem system, double[] x0, ref double[] x1);
/// <summary>
/// Common Iterative Solution method.
/// </summary>
public override SolutionResult Solve(NonlinearSystem system, double[] x0,
SolverOptions opt, ref double[] x)
{
// iteration process here
}
}
MATHEMATICS 库实现了几个最终的求解器类。NewtonRaphsonSolver 类在所有示例中用作求解器,它需要导数(雅可比矩阵)的计算。
醒悟
为了实现使用解析导数计算的方法,需要一个解析评估引擎。ANALYTICS C# 库用于此目的。该库不仅提供了解析表达式评估,还提供了解析导数计算。使用此库,实现了 NonlinearSystem 的以下派生类
/// <summary>
/// System of Nonlinear Analytical Equations.
/// </summary>
public class AnalyticalSystem: ConstructedSystem
{
#region Analytical data members
protected Translator translator;
protected string[] functions;
protected string[,] dfunctions;
protected Formula[] formulae;
protected Formula[,] fderivatives;
#endregion Analytical data members
/// <summary>
/// Creates translator and adds Real variables to it.
/// </summary>
/// <param name="variables"></param>
/// <returns></returns>
protected bool AssignVariables(string[] variables)
{
if (variables == null || variables.Length == 0) return false;
translator = new Translator();
int l = variables.Length;
for (int i = 0; i < l; i++)
{
if (!translator.Add(variables[i], (double)0.0))
{
throw new InvalidNameException(variables[i]);
}
}
return true;
}
/// <summary>
/// Creates formulae for equations and their derivatives.
/// </summary>
/// <param name="f"></param>
/// <returns></returns>
protected bool AssignFunctions(string[] f)
{
if (f == null || f.Length == 0) return false;
int l = f.Length;
functions = new string[l];
formulae = new Formula[l];
for (int i = 0; i < l; i++)
{
if (!translator.CheckSyntax(f[i]))
{
functions = null;
formulae = null;
return false;
}
functions[i] = f[i];
formulae[i] = translator.BuildFormula(functions[i]);
if (formulae[i].ResultType != typeof(double))
{
Type t = formulae[i].ResultType;
functions = null;
formulae = null;
throw new WrongArgumentException("Function must return real value.",
typeof(double), t);
}
}
dfunctions = new string[l, l];
fderivatives = new Formula[l,l];
try
{
for (int i = 0; i < l; i++)
{
for (int j = 0; j < l; j++)
{
dfunctions[i, j] = translator.Derivative(functions[i],
translator.Variables[j].Name);
fderivatives[i, j] = translator.BuildFormula(dfunctions[i, j]);
}
}
}
catch (Exception)
{
// if some derivative could not be calculated -
// the system does not support Jacobian.
fderivatives = null;
}
return true;
}
/// <summary>
/// Assigns variable values.
/// </summary>
/// <param name="values"></param>
protected void AssignVariableValues(double[] values)
{
int l = values.Length;
for (int i = 0; i < l; i++)
{
translator.Variables[i].Value = values[i];
}
}
/// <summary>
/// Calculates equation result for current variable values.
/// </summary>
/// <param name="i">Equation number</param>
/// <returns></returns>
protected double Equation(int i)
{
return (double)formulae[i].Calculate();
}
/// <summary>
/// Calculates derivative result for current variable values.
/// </summary>
/// <param name="i">Equation number</param>
/// <param name="j">Variable number</param>
/// <returns></returns>
protected double Derivative(int i, int j)
{
return (double)fderivatives[i,j].Calculate();
}
/// <summary>
/// Creates equation (and derivative) delegates
/// based on the created formulae objects.
/// </summary>
protected void CreateEquations()
{
int l = formulae.Length;
equations = new Equation[l];
for (int i = 0; i < l; i++)
{
// DO NOT remove temp variable,
// using 'i' directly leads to incorrect lambda.
int temp = i;
equations[i] = (double[] x) =>
{
AssignVariableValues(x);
return Equation(temp);
};
}
// if formulae of derivatives are not assigned -
// system will not support Jacobian.
if (fderivatives != null)
{
derivatives = new Derivative[l];
for (int i = 0; i < l; i++)
{
// DO NOT remove temp variable,
// using 'i' directly leads to incorrect lambda.
int tempi = i;
derivatives[i] = (int j, double[] x) =>
{
AssignVariableValues(x);
return Derivative(tempi, j);
};
}
}
}
/// <summary>
/// Constructor.
/// </summary>
/// <param name="variables">Variable names, the system must be resolved of.</param>
/// <param name="f">System Equation Functions</param>
public AnalyticalSystem(string[] variables, string[] f)
{
if (variables.Length != f.Length)
{
throw new WrongArgumentException("Variable count must be equal to equation count.",
variables.Length, f.Length);
}
if (!AssignVariables(variables))
{
return;
}
if (!AssignFunctions(f))
{
return;
}
CreateEquations();
}
}
系统实现的主要特征可以从构造函数和 AssignFunctions 方法中看出。系统的构造函数获取以解析表达式形式提供的方程(字符串数组)。无需为导数(雅可比矩阵)提供解析表达式(或任何 lambda),因为它们是在 AssignFunctions 方法中通过一行代码自动计算出来的
dfunctions[i, j] = translator.Derivative(functions[i], translator.Variables[j].Name);
这种方法有一些好处。首先,无需为系统方程和雅可比矩阵提供 lambda 表达式。其次,无需为雅可比矩阵函数进行推导过程,该过程由提供的解析引擎自动完成。最后,解析表达式的计算比数值有限差分给出了更精确的导数值估计,后者缺乏步长的选择。
测试示例
VS 2010 解决方案中提供了两个测试示例。这些示例演示了使用解析系统求解几何问题 - 寻找二维空间中曲线和曲面的交点。3D 案例的代码
// variables of the analytical system
string[] variables = { "x", "y", "z" };
// analytical expressions of the system's equations
string[] functions = { "x^2+y^2+z^2-1", // sphere
"x^2+y^2-z", // paraboloid along the z axis
"-x+2*y^2+z^2" // paraboloid along the x axis
};
// creating nonlinear system instance - Analytical System
NonlinearSystem system = new AnalyticalSystem(variables, functions);
double[] x0;
SolverOptions options;
double[] result;
double[] expected;
SolutionResult actual;
// creating nonlinear solver instance - Newton-Raphson solver.
NonlinearSolver solver = new NewtonRaphsonSolver();
x0 = new double[] { 1.0, 1.0, 1.0 }; // initial guess for variable values
options = new SolverOptions() // options for solving nonlinear problem
{
MaxIterationCount = 100,
SolutionPrecision = 1e-5
};
result = null;
// solving the system
actual = solver.Solve(system, x0, options, ref result);
expected = new double[] { 0.6835507455, 0.3883199288, 0.6180339887 }; // expected values
// printing solution result into console out
PrintNLResult(system, actual, result, expected, options.SolutionPrecision);
测试执行的结果被打印到 VS 控制台输出。对于之前的代码,输出如下
Analytical equations system
Dimension: 3
Support derivatives: True
Equations:
[0]=x^2+y^2+z^2-1
[1]=x^2+y^2-z
[2]=-x+2*y^2+z^2
Jacobian:
[0][x]=2*x^(2-1)
[0][y]=2*y^(2-1)
[0][z]=2*z^(2-1)
[1][x]=2*x^(2-1)
[1][y]=2*y^(2-1)
[1][z]=-1
[2][x]=-1
[2][y]=(2*y^(2-1))*2
[2][z]=2*z^(2-1)
Solution
Converged: True
RESULT: 0.683550745474947 0.3883237584331 0.618033988749989
EXPECT: 0.6835507455 0.3883199288 0.6180339887
SUCCESS
该解以 1e-5 的指定精度收敛,仅需 4 次迭代。输出还显示了为求解器算法计算的雅可比矩阵的解析表达式。
结论
本文介绍了一种使用解析导数求解非线性方程组的简单实现。该实现使用“即用即用”的解析评估引擎来提供导数。在提供的代码中,提供了非线性方程组的简单类。这个类之所以简单,是因为它只在公式中使用双精度值。这仅用于解释这个想法,并不意味着这种方法不能扩展到具有其他类型数据的更一般的情况。例如,同样的方法已成功应用于为大型数值库 ILNumerics 提供解析能力。