Implementing Model Constraints in .NET - II

Alex Mikunov

July 17, 2003

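Runtime MSIL code instrumentation and .NET metadata extensions

Download source files - 159 Kb

Abstract

In this article I present a working prototype of a CLR extension that provides an infrastructure for enforcing database-like data integrity constraints, such as entity integrity, domain integrity, referential integrity, and user-defined integrity.

The approach I describe builds on my previous article, in which I introduced a new set of metadata tables, the so-called metamodel tables (Alex Mikunov, "Implementing Model Constraints in .NET"). The technical part of this article also draws on techniques I describe in an upcoming MSDN article (September 2003 issue): ".NET Internals: Rewrite MSIL Code on the Fly with the .NET Framework Profiling API" (http://msdn.microsoft.com/msdnmag/issues/03/09/default.aspx).

Introduction

The previous article focused mainly on the theoretical basis of the .NET metadata extensions. In short, we introduced a new set of metadata tables (called metamodel tables) that extend the existing CLR metadata. These tables are populated by querying the associated metamodel and constraint definitions (business rules), and they describe various database-like constraints and rules, such as field-level and method-level constraints (the FieldConstraint and MethodConstraint tables), referential integrity constraints (the TypeConstraint table), and so on.

Note that the proposed approach does not prescribe any particular format for describing constraints. It can be UML/OCL or XML. The only requirement is that the constraints map correctly to metadata tables and MSIL (Microsoft Intermediate Language).

First, let me briefly recap the basic idea of the previous article and show how it works for method constraints. (I assume readers are already familiar with basic CLR concepts such as metadata, MSIL, JIT compilation, and .NET profiling. You should read up on these before continuing, although I will briefly introduce some of them here.)

Consider the following simple class C: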

// C# code
using System;
...
namespace SomeApplication
{
    public class C
    {
      public int foo( int nNumber )
      {
         // code goes here
         ...
          
         return nSomeValue;
       
      } // foo()
    } // class C
    ...
}

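Let the method C::foo() have a precondition on its input parameter of the form "0 < nNumber < 56" and a postcondition on its return value: "nSomeValue > 4". Readers familiar with the Object Constraint Language can describe it as follows: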

-- OCL code
C::foo(nNumber : int): int
pre : (nNumber > 0) and (nNumber < 56)
post: result > 4

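Suppose the method foo() is encoded by the metadata token 0x06000002 (that is, it is stored in the second row of the Method table), and tokens of the form 0x89XXXXXX represent method constraints (they are stored in the MethodConstraint table). We then have the following metadata layout (note that we have also added a new column to the Method table that points to the MethodConstraint table).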

Figure 1. Layout of Method and MethodConstraints tables for the foo method

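That is, foo's constraints are encoded by the metadata tokens 0x89000001 and 0x89000002, and each row has a proper value in its relative virtual address (RVA) column, which points to the actual IL implementation within the file image.

The general shape of the Method-MethodConstraints relationship is shown in the figure: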

Figure 2. Method and MethodConstraints relationships

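During JIT compilation, the runtime encounters the metadata token of C::foo (0x06000002) and uses it to look up the second row of the Method table. It then sees that the row holds an index into the MethodConstraint table. The runtime examines the relevant MethodConstraint records, uses them to obtain the RVAs of the MSIL that implements the precondition and postcondition, and adds the corresponding IL to the method body before the body is JIT-compiled.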
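The lookup described above can be sketched in C++, the language of this article's native pseudocode. The token layout (table id in the high byte, 1-based row index in the low three bytes) is standard CLR metadata, and the 0x89 table id comes from this article. The row fields, the `kind` encoding, and the RVA values are placeholders assumed for illustration. To stay short, the sketch finds rows by scanning for the owning method rather than following the extra Method-table column.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// 0x06 is the CLR Method table id; 0x89 is this article's
// hypothetical MethodConstraint table id.
constexpr std::uint32_t kMethodTable = 0x06;
constexpr std::uint32_t kMethodConstraintTable = 0x89;

// The high byte of a metadata token selects the table.
constexpr std::uint32_t TableOf(std::uint32_t token) { return token >> 24; }

// The low three bytes hold the 1-based row index (RID).
constexpr std::uint32_t RowOf(std::uint32_t token) { return token & 0x00FFFFFFu; }

constexpr std::uint32_t MakeToken(std::uint32_t table, std::uint32_t row) {
    return (table << 24) | (row & 0x00FFFFFFu);
}

// Assumed row shape, for illustration only.
struct MethodConstraintRow {
    std::uint32_t method;  // token of the constrained method
    std::uint32_t kind;    // assumed encoding: 1 = precondition, 2 = postcondition
    std::uint32_t rva;     // placeholder RVA of the constraint's IL
};

// Returns the tokens of every constraint row attached to `method`.
std::vector<std::uint32_t> ConstraintsFor(
    const std::vector<MethodConstraintRow>& table, std::uint32_t method) {
    std::vector<std::uint32_t> tokens;
    for (std::size_t i = 0; i < table.size(); ++i) {
        if (table[i].method == method) {
            // Row indices are 1-based, so row i of the vector is RID i + 1.
            tokens.push_back(MakeToken(kMethodConstraintTable,
                                       static_cast<std::uint32_t>(i + 1)));
        }
    }
    return tokens;
}
```

With two rows that both name 0x06000002 as their method, ConstraintsFor yields the tokens 0x89000001 and 0x89000002, matching the layout described for foo.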

In other words, if the original method has the following IL

// MSIL code
C::foo(...) // before JIT compilation
{
 method body //MSIL
}

the CLR adds the preconditions and postconditions to the method implementation as follows

// MSIL code
C::foo(...) // before JIT compilation
{
 // IL code for
 // if !(preconditions) throw an exception;
 
 method body // original IL code with replaced 'ret' opcodes 

 // IL code for 
 // if !(postconditions) throw an exception;
}

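This technique can be generalized by using a common function that is called on the function-enter and function-leave events. Suppose the runtime has an internal class ConstraintsChecker with a method CheckMethod: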

HRESULT ConstraintsChecker::CheckMethod ( ..., mdMethodDef md, 
                                     CorConstraintType ConstraintType, ...  )
{
// use Method and MethodConstraint tables
// to find constraints for a given mdMethodDef token and validate them
...
if ( ConstraintType & ctPreCondition )
{
 // check preconditions //{native code}
 // return result and/or throw an Exception
 // implementation could work like this
 if ( !(preconditions) ) 
   // set error code, and/or throw an exception;
 else
   // OK 
}

if ( ConstraintType & ctPostCondition )
{
 // check postconditions //{native code}
 // return result and/or throw an exception
}


if ( ConstraintType & ctInvariant )
{
 // check invariants //{native code}
 // return result and/or throw an exception
}
...

} // ConstraintsChecker::CheckMethod

where the CorConstraintType flags are defined as follows

typedef enum _CorConstraintType
{
    ctNone          = 0x0000,
    ctPreCondition  = 0x0001,
    ctPostCondition = 0x0002,
    ctInvariant     = 0x0004,
...
} CorConstraintType;

The generated code would then look like this

//class C, Method foo (after JIT compilation to native code)
C::foo(...)
{

// call CLR's ConstraintsChecker class for method C::foo
// to check preconditions
HRESULT hr = ConstraintsChecker.CheckMethod ( ..., 0x06000002, 
                                           ctPreCondition | ctInvariant );

 method body //{native code}

// call CLR's ConstraintsChecker class for method C::foo
// to check postconditions
hr = ConstraintsChecker.CheckMethod ( ..., 0x06000002, 
                                      ctPostCondition | ctInvariant );

} // C::foo
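To make the flag handling concrete, here is a minimal C++ sketch of such a checker. It is not the CLR's code: the `MethodChecks` struct, the `Register` method, and the choice to return a bitmask of failed constraint kinds instead of an HRESULT are all assumptions made for illustration. Only the flag values come from the enum above.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

enum CorConstraintType : unsigned {
    ctNone          = 0x0000,
    ctPreCondition  = 0x0001,
    ctPostCondition = 0x0002,
    ctInvariant     = 0x0004
};

// Hypothetical per-method checks; an empty std::function means
// "no constraint of this kind".
struct MethodChecks {
    std::function<bool()> pre;
    std::function<bool()> post;
    std::function<bool()> invariant;
};

class ConstraintsChecker {
public:
    void Register(std::uint32_t token, MethodChecks checks) {
        checks_[token] = std::move(checks);
    }

    // Returns the subset of the requested kinds that failed,
    // or ctNone when everything requested holds.
    unsigned CheckMethod(std::uint32_t token, unsigned requested) const {
        auto it = checks_.find(token);
        if (it == checks_.end()) return ctNone;  // unconstrained method
        const MethodChecks& c = it->second;
        unsigned failed = ctNone;
        if ((requested & ctPreCondition) && c.pre && !c.pre())
            failed |= ctPreCondition;
        if ((requested & ctPostCondition) && c.post && !c.post())
            failed |= ctPostCondition;
        if ((requested & ctInvariant) && c.invariant && !c.invariant())
            failed |= ctInvariant;
        return failed;
    }

private:
    std::map<std::uint32_t, MethodChecks> checks_;
};
```

Because the flags are combined with `|`, the parameter is a plain `unsigned`: in C++ the expression `ctPreCondition | ctInvariant` no longer has the enum type, so a parameter declared as `CorConstraintType` would need a cast at every call site.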

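The approach just described requires extensive changes to the existing CLR architecture.

First, we have to modify the metadata tables (add a new column to the Method table) and add new tables (MethodConstraint). This also requires changes to the metadata API so that compilers and design tools can emit the additional metadata and constraint definitions. Second, we have to change the execution engine and the CLR assembly/class loader ("fusion"). We would also have to deal with compatibility with current .NET versions.

It would be nice to find a middle way that avoids most of the changes mentioned above.

In this article I describe a simple approach, based on the .NET profiling API and run-time MSIL code rewriting, that lets us avoid any changes to the existing CLR.

I call this technique ".NET metadata extensions", or simply ".NET extensions".

The basic idea of the approach can be outlined as follows.

When the CLR loads a class and executes one of its methods, the method's IL code is compiled to native instructions during just-in-time (JIT) compilation. The profiling API, which is part of the CLR, lets us intercept this process. Before a method is JIT-compiled, we can modify its IL code. In the simplest case, we can insert a custom prolog and epilog into the method's IL and hand the resulting IL back to the JIT compiler. Depending on the application logic, the newly generated IL can do some extra work before and after the original method code runs.

In our case, the prologs and epilogs (emitted by our profiler) are just calls into a special managed DLL, CCCore.dll (I call it the ".NET extension DLL"). In other words, for a given .NET module and its methods (say, C::foo), the profiler instruments the method's IL by inserting an IL prolog that calls a special method implemented in CCCore.dll.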

public static int CCC::__CheckMethodDefOnEnter( 
                                             int mdMethodDefToken, __arglist )
{
// first parameter is a method’s metadata token
// second parameter is a collection of the actual method’s
// parameters (at the moment of the call).

// checks method’s parameters based
// on XML-encoded MethodConstraint table
...
// and returns result
}

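The first parameter is the method's metadata token (the token of C::foo), and the second is the collection of the method's actual arguments (at the moment of the call).

__CheckMethodDefOnEnter validates the parameters against a special descriptor file, which is in fact an XML-encoded representation of the MethodConstraint table (the consistency checker descriptor file, or CCD file).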
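The following C++ sketch illustrates the idea of descriptor-driven validation. It is not the CCCore.dll code, which is a C# project. It assumes the descriptor has already been loaded into memory, that every rule is an open numeric range, and that arguments are integers listed without "this". The `ParameterRule` and `Descriptor` types and the return convention are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Assumed in-memory form of one parameter rule: the argument at
// `index` must lie strictly between `low` and `high`.
struct ParameterRule {
    std::size_t index;
    long low;
    long high;
};

// Method token -> rules for that method's parameters.
using Descriptor = std::map<std::uint32_t, std::vector<ParameterRule>>;

// Returns 0 when every rule holds, otherwise the 1-based position
// of the first rule that fails.
int CheckMethodDefOnEnter(const Descriptor& ccd, std::uint32_t token,
                          const std::vector<long>& args) {
    auto it = ccd.find(token);
    if (it == ccd.end()) return 0;  // no rules recorded for this method
    int position = 1;
    for (const ParameterRule& rule : it->second) {
        if (rule.index >= args.size()) return position;  // argument missing
        const long value = args[rule.index];
        if (!(value > rule.low && value < rule.high)) return position;
        ++position;
    }
    return 0;
}
```

For foo, a single rule `{0, 0, 56}` registered under token 0x06000002 expresses the precondition 0 < nNumber < 56.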

The profiler also inserts an epilog that calls another method implemented in CCCore.dll.

public static int CCC::__CheckMethodDefOnExit( __arglist )
{
// __arglist should have two parameters:
// 1) the orig method's return value
// 2) method token

// checks return value based
// on XML-encoded MethodConstraint table
    ...
// and returns result
}

So the overall picture is as follows.

Before compilation to native code (class C, method foo), we have

C::foo(...) // ( before JIT compilation)
{
 method body //MSIL
}

The profiler makes the following changes

C::foo(...) //(before JIT compilation with profiler’s changes)
{
 // to check method’s preconditions/invariants
 call [CCCore]CCC::__CheckMethodDefOnEnter( foo’s token, params )

 method body // orig IL code with replaced 'ret'

 // to check postconditions/invariants

 call [CCCore]CCC::__CheckMethodDefOnExit( foo’s token, return value )

}

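As you can see, all of the validation logic has moved into the .NET extension DLL (CCCore.dll), which does the actual work by analyzing the method's parameters and the corresponding CCD file (the XML-encoded MethodConstraint table). See Figure 3 for details.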

Figure 3. Runtime IL code instrumentation and .NET extension

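The main advantage of this approach is that our CLR/Rotor extension lives outside the runtime code. We do not need to recompile the "Rotor" sources every time we change our code. Once the changes are complete, we can merge our code into the CLR: the profiler code becomes part of the execution engine, the CCCore DLL can either be merged into mscorlib or remain a separate library, and the XML-encoded metadata tables become CLR metadata.

Implementation details

First, the approach I propose preserves the identity of classes. Unlike other techniques that rely on custom attributes, proxy assemblies, remoting proxies (context-bound objects), and so on, we make changes only at run time. The original source code is not changed at all, so the approach is completely transparent to clients.

Below is a sketch of how I implemented the IL rewriting (method instrumentation) that adds a prolog and an epilog to a method.

A given method foo with N (< 255) parameters plus "this"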

ReturnType C::foo ( C* this /*invisible param*/, type1 param1, type2 param2, 
                    type3 param3, ..., typeN paramN )
{
     IL method body
}

is rewritten by the profiler as follows

ReturnType C::foo ( C* this /*invisible param*/, type1 param1, type2 param2,
                    type3 param3, ..., typeN paramN )
{
// prolog >> 
// Arguments of the method are loaded on the stack in
// order of their appearance
// in the method signature, with the last signature param
// being loaded last.
// So for instance methods the "this" is always the first argument:
//    ----------
//   | paramN   |
//   | ...      |
//   | param3   |
//   | param2   |
//   | param1   |
//   | this     | <-- for instance methods, goes first ( slot 0 )
//    ----------
 ldc.i4    tkMethodDef    // load C::foo's token

 ldarg 0    // load param0 on the stack ( _param0 )
 ldarg 1    // load param1 on the stack ( _param1 )
 ldarg 2    // load param2 on the stack ( _param2 )
 ...

// analyze params by calling CChecker
 call vararg int32 CCCore.CCC.__CheckMethodDefOnEnter( tkMethodDef, __arglist )
 pop        // remove CCCore.CCC.Check's result
// prolog <<

 orig method body with replaced "ret" opcode goes here

// epilog >>
 dup    // to copy method's ret value and avoid adding
     // new local vars!!!
 ldc.i4    tkMethodDef    // load method's token

// analyze the return value by calling CChecker
 call vararg int32 CCCore.CCC.__CheckMethodDefOnExit( __arglist )
// __arglist should have the orig method's return value (goes first!!!)
// + method token

 pop        // remove CCCore.CCC.Check's result
 ret        // return method's result
// epilog <<
}
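As an illustration of what emitting the prolog involves at the byte level, the C++ sketch below builds the raw IL for it. The opcode values are the standard CIL encodings. The `BuildProlog` helper and the checker token passed to it are hypothetical: a real profiler would first have to define a MemberRef (with a vararg call-site signature) for __CheckMethodDefOnEnter and pass that token, and it would also have to adjust the method header and any offsets affected by the inserted bytes. The sketch uses the short `ldarg.0` to `ldarg.3` and `ldarg.s` forms rather than the long `ldarg` form shown above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Single-byte CIL opcodes used by the prolog.
constexpr std::uint8_t kLdarg0 = 0x02;  // ldarg.0 .. ldarg.3 are 0x02 .. 0x05
constexpr std::uint8_t kLdargS = 0x0E;  // ldarg.s <uint8 index>
constexpr std::uint8_t kLdcI4  = 0x20;  // ldc.i4 <int32>
constexpr std::uint8_t kPop    = 0x26;  // pop
constexpr std::uint8_t kCall   = 0x28;  // call <metadata token>

// IL operands are stored little-endian.
void AppendUInt32(std::vector<std::uint8_t>& il, std::uint32_t value) {
    for (int shift = 0; shift < 32; shift += 8)
        il.push_back(static_cast<std::uint8_t>(value >> shift));
}

// argCount counts "this" for instance methods and must not exceed 256,
// the range addressable by ldarg.s.
std::vector<std::uint8_t> BuildProlog(std::uint32_t methodToken,
                                      std::uint32_t checkerToken,
                                      unsigned argCount) {
    std::vector<std::uint8_t> il;
    il.push_back(kLdcI4);                 // push the instrumented method's token
    AppendUInt32(il, methodToken);
    for (unsigned i = 0; i < argCount; ++i) {
        if (i < 4) {
            il.push_back(static_cast<std::uint8_t>(kLdarg0 + i));
        } else {
            il.push_back(kLdargS);
            il.push_back(static_cast<std::uint8_t>(i));
        }
    }
    il.push_back(kCall);                  // call the on-enter checker
    AppendUInt32(il, checkerToken);
    il.push_back(kPop);                   // discard the checker's int result
    return il;
}
```

For an instance method with one declared parameter (argCount = 2), the prolog is 13 bytes long: 5 for `ldc.i4`, 2 for the argument loads, 5 for `call`, and 1 for `pop`.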

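Second, the instrumented module does not have to be linked against CCCore.dll (the ".NET extension" DLL). If you look at the CC.IL sample module provided in the \Barracuda2\CChecker\IL folder, you will see that it does not reference CCCore.dll. We therefore "dynamically link" this DLL at run time.

Finally, the method's arguments at the time of the call can be serialized as XML. We can therefore validate input and output with XML Schema or XPath expressions (XPath assertions/rules, as in the Schematron assertion language) instead of CCD-style files (consistency checker descriptors).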

public static int CCC::__CheckMethodDefOnEnter( int mdMethodDefToken,
                                                __arglist )
{
// first parameter is a method’s metadata token
// second parameter is a collection of the actual method’s
// parameters (at the moment of the call).

// serialize input as XML (e.g. SOAP)
// and validate it against a schema file or a set of XPath expressions
...
// and returns result
}

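In any case, this is a step toward more standardized validation, which could also mean building an infrastructure similar to the SoapExtension framework provided by ASP.NET Web services.

Sample

The accompanying zip file contains two folders: CChecker and CCCore. The first contains the binary of a .NET profiler DLL (CChecker.DLL). The second contains a C# project that implements the .NET extension, CCCore.dll.

The CC.IL sample module is provided in the \Barracuda2\CChecker\IL folder. To see how it all works together, follow these steps:

Open an MS-DOS command prompt and change the current folder to \Barracuda2\CChecker\IL
Run \Barracuda2\CChecker\IL\cc_on.bat to turn on the profiler (make sure the profiler path is valid)
Start cc.exe (this test is written in IL)

It prints output showing various information about the methods' parameters and their validity.

To turn off the .NET extension and see the difference, just run cc_off.bat.

CCCore.dll uses the CC.exe.CCD.config file (the consistency checker descriptor file) to validate CC's methods. The descriptor file format is self-explanatory. We use the XPath expression "/ccdescriptor/methods/method/@token" to get all method tokens in the descriptor file. To get a method's constraints, we use "/ccdescriptor/methods/method[@token='sometokenvalue']/parameters/parameter".

History

August 25, 2003 - Updated source code download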