图像对齐算法

Oleg Krivtsov

4.83/5 (55投票s)

2008 年 3 月 31 日

CPOL

11分钟阅读

317504

17049

实现 Lucas-Kanade 和 Baker-Dellaert-Matthews 图像对齐算法。

demo_src

引言

要了解最新文章更新内容，请参阅历史记录。

图像对齐是指将一个称为模板（我们记为 T）的图像与另一个图像 I 进行匹配的过程（参见上图）。图像对齐有许多应用，例如视频目标跟踪、运动分析以及计算机视觉中的许多其他任务。

1981 年，Bruce D. Lucas 和 Takeo Kanade 提出了一种新技术，该技术利用图像强度梯度信息来搜索模板 T 和另一个图像 I 之间的最佳匹配。该算法在过去 20 年中被广泛应用于计算机视觉领域，并进行了许多修改和扩展。其中一种修改是 Simon Baker、Frank Dellaert 和 Iain Matthews 提出的算法。他们的算法比原始的 Lucas-Kanade 算法在计算上更有效。

在 Lucas-Kanade 20 年后：一个统一的框架系列文章中，Baker 和 Matthews 试图对图像对齐算法进行分类，并根据模板 T 的作用将其分为“前向”和“逆向”两类。此外，他们还根据形变是通过累加参数增量还是通过组合变换矩阵来更新，将算法分为“加法”或“组合”型。遵循这种分类，Lucas-Kanade 算法应称为“前向加法”算法，而 Baker-Dellaert-Matthews 算法应称为“逆向组合”算法。

在本文中，我们将使用 Intel OpenCV 开源计算机视觉库作为基础，用 C 语言实现这两种算法。我们还将尝试比较这两种算法，并看看哪种算法更快。

为什么需要这篇文章？首先，写完这篇文章后，我自己对图像对齐算法有了更深入的了解（至少我希望如此）。有大量的文章提供了关于图像对齐的海量信息。其中大多数文章都面向高级读者，而不是初学者。在这些文章中，我发现了一些我认为将困难问题描述得更简单。但我真正缺少的是可以用来实验的代码示例。因此，本文的另外两个目的是相对简单地解释图像对齐的工作原理，并提供示例源代码。

谁会对阅读本文感兴趣？我认为，如果你还记得大学课程中的一些数学知识，熟悉 C 或 C++，对学习如何跟踪视频中的对象感兴趣，并且有一点耐心，那么你可以阅读本文。

编译示例代码

运行演示非常简单——只需下载 demo_binary.zip 并运行可执行文件。

但要对代码进行实验，您需要自己编译它。首先，您需要安装 Visual Studio .NET 2005。Visual C++ Express Edition 也符合我们的需求。

下一步是从这里下载并安装 OpenCV 库。下载 OpenCV_1.0.exe 并在您的计算机上运行它。

最后，您需要按如下方式配置您的 Visual Studio

在 Visual Studio 主窗口中，选择 **工具 -> 选项...** 菜单项，然后单击 **项目和解决方案 -> VC++ 目录** 树项。
在“显示目录用于：”组合框中，选择“库文件”。在下面的列表中添加以下行

C:\Program Files\OpenCV\lib

在“显示目录用于：”组合框中，选择“包含文件”。在下面的列表中添加以下行

C:\Program Files\OpenCV\cv\include
C:\Program Files\OpenCV\cxcore\include
C:\Program Files\OpenCV\otherlibs\highgui

就这些！现在您可以打开 align_demo.sln，编译并运行它。

关于示例项目结构的一些说明。

auxfunc.h 和 auxfunc.cpp 包含辅助图像形变函数。forwadditive.h 和 forwadditive.cpp 文件包含 Lucas-Kanade 算法的实现。invcomp.h 和 invcomp.cpp 包含逆向组合算法的实现。而 main.cpp 包含主函数。

它做什么？

简而言之，这个程序试图将模板（小蝴蝶图像）与大图像（花上的蝴蝶）对齐。它不限于使用蝴蝶作为模板。例如，您可以使用人脸图像作为模板，并确定该人脸在视频序列的下一帧中的位置。

在对齐过程中，我们会对模板执行形变操作。我们假设允许三种操作：模板的平面内旋转，以及在 X 和 Y 方向上的二维平移。这些操作称为允许的形变集。

我们通过定义形变矩阵来定义允许的形变集。矩阵 W 取决于参数向量 **p** = (w_z, t_x, t_y)。这里 t_x 和 t_y 是平移分量，w_z 是旋转分量。

甚至可以使用更复杂的形变集，例如三维平移和旋转、缩放等。但是，为了简单起见，这里我们使用相对简单的形变集。

现在，关于应用程序的逻辑。

编译并运行演示后，您将看到一个控制台窗口，其中我们将显示诊断信息。输入向量 **p** = (w_z, t_x, t_y) 的分量。
之后，您可以看到两个窗口，其中包含模板 T 和输入图像 I。您还可以看到图像 I 上的一个白色矩形。这是蝴蝶位置的初始近似。我们只是假设蝴蝶在那个区域附近。
按任意键开始 Lucas-Kanade 图像对齐算法。该算法试图搜索蝴蝶的位置，通过迭代最小化模板与图像 I 之间的差异。在控制台窗口中，您将看到它的工作方式。您可以看到当前迭代次数、参数增量和平均误差值。
误差最小化过程收敛后，将在控制台窗口中显示摘要。摘要包括算法名称、计算时间、最终平均误差和参数近似值。您还可以看到显示模板在图像 I 上如何对齐的白色矩形。
现在，按任意键运行相同的过程，但使用逆向组合图像对齐算法。您可以看到当前迭代次数和平均误差值。
将显示逆向组合算法的摘要。按任意键退出程序。

让我们看看 main.cpp 的内容

// Our plan:
// 
// 1. Ask user to enter image warp parameter vector p=(wz, tx, ty)
// 2. Define our template rectangle to be bounding rectangle of the butterfly
// 3. Warp image I with warp matrix W(p)
// 4. Show template T and image I, wait for a key press
// 5. Estimate warp parameters using Lucas-Kanade method, wait for a key press
// 6. Estimate warp parameters using Baker-Dellaert-Matthews, wait for a key press
//

int main(int argc, char* argv[])
{
 // Ask user to enter image warp paremeters vector.
 // p = (wz, tx, ty)
 
 float WZ=0, TX=0, TY=0;
 printf("Please enter WZ, TX and TY, separated by space.\n");
 printf("Example: -0.01 5 -3\n");
 printf(">");
 scanf("%f %f %f", &WZ, &TX, &TY);

在 OpenCV 中，我们将图像存储在 IplImage 结构中，因此这里我们定义图像指针并将其初始值设为零。

 // Here we will store our images.
 IplImage* pColorPhoto = 0; // Photo of a butterfly on a flower.
 IplImage* pGrayPhoto = 0; // Grayscale copy of the photo.
 IplImage* pImgT = 0; // Template T.
 IplImage* pImgI = 0; // Image I.

在 OpenCV 中，矩阵存储在 CvMat 结构中。让我们定义一个指向我们的形变矩阵 W 的指针，并将其初始化为零。

 // Here we will store our warp matrix.
 CvMat* W = 0; // Warp W(p)

现在，创建两个窗口分别用于模板 T 和图像 I。OpenCV 提供了基本的 GUI 支持。GUI 支持作为 OpenCV 的一部分实现，称为 highgui 库。cvLoadImage() 函数允许从 JPG 文件加载图像。

现在，使用 cvCreateImage() 函数为图像分配内存。第一个参数是图像的像素尺寸，第二个参数是深度，第三个参数是通道数。RGB 图像的深度为 IPL_DEPTH_8U，通道数为三个。IPL_DEPTH_8U 表示使用 unsigned char 作为图像元素。IPL_DEPTH_16S 表示使用 short int 作为图像元素。

我们还需要将照片转换为灰度格式，因为图像对齐算法只处理灰度图像。

 // Create two simple windows. 
 cvNamedWindow("template"); // Here we will display T(x).
 cvNamedWindow("image"); // Here we will display I(x).

 // Load photo.
 pColorPhoto = cvLoadImage("..\\data\\photo.jpg");

 // Create other images.
 CvSize photo_size = cvSize(pColorPhoto->width,pColorPhoto->height);
 pGrayPhoto = cvCreateImage(photo_size, IPL_DEPTH_8U, 1);
 pImgT = cvCreateImage(photo_size, IPL_DEPTH_8U, 1);
 pImgI = cvCreateImage(photo_size, IPL_DEPTH_8U, 1);

 // Convert photo to grayscale, because we need intensity values instead of RGB.
 cvCvtColor(pColorPhoto, pGrayPhoto, CV_RGB2GRAY);

现在，使用 cvCreateMat() 函数为我们的形变矩阵 W 分配内存。前两个参数定义了矩阵的大小（3x3），最后一个参数是矩阵元素的类型。CV_32F 表示我们将使用 float 作为矩阵元素类型。

 // Create warp matrix, we will use it to create image I.
 W = cvCreateMat(3, 3, CV_32F);

我们定义要使用的模板。我们定义 omega 矩形，表示蝴蝶的位置。

 // Create template T
 // Set region of interest to butterfly's bounding rectangle.
 cvCopy(pGrayPhoto, pImgT);
 CvRect omega = cvRect(110, 100, 200, 150);

我们通过稍微形变模板来定义要使用的图像 I。

 // Create I by warping photo. 
 init_warp(W, WZ, TX, TY);
 warp_image(pGrayPhoto, pImgI, W);
 
 // Draw the initial template position approximation rectangle.
 cvSetIdentity(W);
 draw_warped_rect(pImgI, omega, W);

 // Show T and I and wait for key press.
 cvSetImageROI(pImgT, omega);
 cvShowImage("template", pImgT);
 cvShowImage("image", pImgI);
 printf("Press any key to start Lucas-Kanade algorithm.\n");
 cvWaitKey(0);
 cvResetImageROI(pImgT);

 // Print what parameters were entered by user.
 printf("==========================================================\n");
 printf("Ground truth: WZ=%f TX=%f TY=%f\n", WZ, TX, TY);
 printf("==========================================================\n");

 init_warp(W, WZ, TX, TY);

运行 Lucas-Kanade 算法

 // Restore image I
 warp_image(pGrayPhoto, pImgI, W);

 /* Lucas-Kanade */ 
 align_image_forwards_additive(pImgT, omega, pImgI);

现在，等待按键，然后运行逆向组合算法

 // Restore image I
 warp_image(pGrayPhoto, pImgI, W);

 printf("Press any key to start Baker-Dellaert-Matthews algorithm.\n");
 cvWaitKey();

 /* Baker-Dellaert-Matthews*/
 align_image_inverse_compositional(pImgT, omega, pImgI);

等待按键，释放所有使用的资源，然后退出。

 printf("Press any key to exit the demo.\n");
 cvWaitKey();

 // Release all used resources and exit
 cvDestroyWindow("template");
 cvDestroyWindow("image");
 cvReleaseImage(&pImgT);
 cvReleaseImage(&pImgI);
 cvReleaseMat(&W);
 
 return 0;
}

实现前向加法算法

现在，我们将详细描述 Lucas-Kanade 算法的实现。正如 Baker 和 Matthews 所说，前向加法算法包括以下步骤：

预计算

计算图像 I 的梯度。

迭代

用 W(x;p) 形变 I 以计算 I(W(x,p))。
计算误差图像 T(x) – I(W(x;p))。
用 W(x;p) 形变梯度。
在 (x;p) 处评估雅可比矩阵。
计算最陡下降图像。
计算海森矩阵
计算
计算
更新参数：

直到。

让我们来实现这个算法（参见 forwadditive.h 和 forwadditive.cpp）。

首先，让我们包含我们将要使用的函数的头文件。

#include <stdio.h> 
#include <time.h> 
#include <cv.h> // Include header for computer-vision part of OpenCV.
#include <highgui.h> // Include header for GUI part of OpenCV. 
#include "auxfunc.h" // Header for our warping functions.

让我们定义一个将实现前向加法（Lucas-Kanade）算法的函数。模板图像 pImgT、模板边界矩形 omega 和另一个图像 pImgI 代表了算法的参数。在 OpenCV 中，图像存储为 IplImage 结构，矩形可以使用 CvRect 结构定义。

// Lucas-Kanade method
// @param[in] pImgT Template image T
// @param[in] omega Rectangular template region
// @param[in] pImgI Another image I

void align_image_forwards_additive(IplImage* pImgT, CvRect omega, IplImage* pImgI)
{
 // Some constants for iterative minimization process.
 const float EPS = 1E-5f; // Threshold value for termination criteria.
 const int MAX_ITER = 100; // Maximum iteration count.

在 OpenCV 中，我们将图像存储在 IplImage 结构中，因此这里我们定义图像指针并将其初始值设为零。

 // Here we will store gradient images. 
 IplImage* pGradIx = 0; // Gradient of I in X direction.
 IplImage* pGradIy = 0; // Gradient of I in Y direction.

此外，我们将需要使用矩阵。在 OpenCV 中，矩阵存储在 CvMat 结构中。

 // Here we will store matrices.
 CvMat* W = 0; // Current value of warp W(x,p)
 CvMat* H = 0; // Hessian
 CvMat* iH = 0; // Inverse of Hessian
 CvMat* b = 0; // Vector in the right side of the system of linear equations.
 CvMat* delta_p = 0; // Parameter update value.
 CvMat* X = 0; // Point in coordinate frame of T.
 CvMat* Z = 0; // Point in coordinate frame of I.

现在，让我们使用 cvCreateMat() 函数为我们的矩阵分配内存。前两个参数定义了矩阵的大小（3x3 或 3x1），最后一个参数是矩阵元素的类型。CV_32F 表示我们将使用 float 作为矩阵元素类型。

 // Create matrices.
 W = cvCreateMat(3, 3, CV_32F);
 H = cvCreateMat(3, 3, CV_32F);
 iH = cvCreateMat(3, 3, CV_32F);
 b = cvCreateMat(3, 1, CV_32F);
 delta_p = cvCreateMat(3, 1, CV_32F);
 X = cvCreateMat(3, 1, CV_32F);
 Z = cvCreateMat(3, 1, CV_32F);

使用 cvCreateImage() 函数为图像分配内存。第一个参数是图像的像素尺寸，第二个参数是深度，第三个参数是通道数。灰度图像的深度为 IPL_DEPTH_8U，通道数为一个。IPL_DEPTH_16S 表示使用 short int 作为图像元素。

 // Create gradient images.
 CvSize image_size = cvSize(pImgI->width, pImgI->height);
 pGradIx = cvCreateImage(image_size, IPL_DEPTH_16S, 1);
 pGradIy = cvCreateImage(image_size, IPL_DEPTH_16S, 1);

保存当前时间

 // Get current time. We will use it later to obtain total calculation time.
 clock_t start_time = clock();

预计算图像梯度。请注意，cvSobel() 函数产生一个增强的图像梯度（因为我们使用了 3x1 或 1x3 的高斯核），因此我们应该调用 cvConvertScale() 来标准化梯度图像。

 /*
 * Precomputation stage.
 */
 
 // Calculate gradient of I.
 cvSobel(pImgI, pGradIx, 1, 0); // Gradient in X direction
 cvConvertScale(pGradIx, pGradIx, 0.125); // Normalize
 cvSobel(pImgI, pGradIy, 0, 1); // Gradient in Y direction
 cvConvertScale(pGradIy, pGradIy, 0.125); // Normalize

进入迭代循环。迭代直到收敛或达到最大迭代次数

 /*
 * Iteration stage.
 */

 // Here we will store parameter approximation. 
 float wz_a=0, tx_a=0, ty_a=0;
 
 // Here we will store mean error value.
 float mean_error=0;

 // Iterate
 int iter=0; // number of current iteration
 while(iter < MAX_ITER)
 {
   iter++; // Increment iteration counter

   mean_error = 0; // Set value of mean error with zero.

   int pixel_count=0; // Count of processed pixels

初始化形变 W(x,p) 并将海森矩阵设置为零

   init_warp(W, wz_a, tx_a, ty_a); // Init warp W(x, p)
   cvSet(H, cvScalar(0)); // Set Hessian with zeroes
   cvSet(b, cvScalar(0)); // Set b matrix with zeroes

   // (u,v) - pixel coordinates in the coordinate frame of T.
   int u, v;
 
   // Walk through pixels in the template T.
   int i, j;
   for(i=0; i<omega.width; i++)
   {
     u = i + omega.x;

     for(j=0; j<omega.height; j++)
     {
       v = j + omega.y;

       // Set vector X with pixel coordinates (u,v,1)
       SET_VECTOR(X, u, v);

       // Warp Z=W*X
       cvGEMM(W, X, 1, 0, 0, Z);

       // (u2,v2) - pixel coordinates in the coordinate frame of I.
       float u2, v2;
    
       // Get coordinates of warped pixel in coordinate frame of I.
       GET_VECTOR(Z, u2, v2);

       // Get the nearest integer pixel coords (u2i;v2i).
       int u2i = cvFloor(u2);
       int v2i = cvFloor(v2);

       if(u2i>=0 && u2i<pImgI->width && // check if pixel is inside I.
          v2i>=0 && v2i<pImgI->height)
       {
         pixel_count++;

         // Evaluate gradient of I at W(x,p) with subpixel accuracy
         // using bilinear interpolation.
         float Ix = interpolate<short>(pGradIx, u2, v2);
         float Iy = interpolate<short>(pGradIy, u2, v2);
 
         // Calculate steepest descent image's element.
         float stdesc[3]; // an element of steepest descent image
         stdesc[0] = (float)(-v*Ix+u*Iy);
         stdesc[1] = (float)Ix;
         stdesc[2] = (float)Iy;

     // Calculate intensity of a transformed pixel with sub-pixel accuracy
         // using bilinear interpolation.
     float I2 = interpolate<uchar>(pImgI, u2, v2);
    
     // Calculate image difference D = T(x)-I(W(x,p)).
     float D = CV_IMAGE_ELEM(pImgT, uchar, v, u) - I2;
    
     // Update mean error value.
     mean_error += fabs(D);

         // Add a term to b matrix.
         float* pb = &CV_MAT_ELEM(*b, float, 0, 0);
         pb[0] += stdesc[0] * D;
         pb[1] += stdesc[1] * D;
         pb[2] += stdesc[2] * D;
 
        // Add a term to Hessian.
        int l,m;
        for(l=0;l<3;l++)
        {
          for(m=0;m<3;m++)
          {
            CV_MAT_ELEM(*H, float, l, m) += stdesc[l]*stdesc[m];
          } 
        }
      } 
    }
  }

  // Finally, calculate resulting mean error.
  if(pixel_count!=0)
     mean_error /= pixel_count;

求海森矩阵的逆

  // Invert Hessian.
  double inv_res = cvInvert(H, iH);
  if(inv_res==0)
  {
    printf("Error: Hessian is singular.\n");
    return;
  }

更新参数：

  // Find parameter increment. 
  cvGEMM(iH, b, 1, 0, 0, delta_p);
  float delta_wz = CV_MAT_ELEM(*delta_p, float, 0, 0);
  float delta_tx = CV_MAT_ELEM(*delta_p, float, 1, 0);
  float delta_ty = CV_MAT_ELEM(*delta_p, float, 2, 0);
 
  // Update parameter vector approximation.
  wz_a += delta_wz;
  tx_a += delta_tx;
  ty_a += delta_ty;
 
  // Print diagnostic information to screen.
  printf("iter=%d dwz=%f dtx=%f dty=%f mean_error=%f\n", 
    iter, delta_wz, delta_tx, delta_ty, mean_error); 
 
  // Check termination critera.
  if(fabs(delta_wz)<EPS && fabs(delta_tx)<EPS && fabs(delta_ty)<EPS) break;
 }

显示对齐结果

 // Get current time and obtain total time of calculation.
 clock_t finish_time = clock();
 double total_time = (double)(finish_time-start_time)/CLOCKS_PER_SEC;

 // Print summary.
 printf("===============================================\n");
 printf("Algorithm: forward additive.\n");
 printf("Caclulation time: %g sec.\n", total_time);
 printf("Iteration count: %d\n", iter);
 printf("Approximation: wz_a=%f tx_a=%f ty_a=%f\n", wz_a, tx_a, ty_a);
 printf("Epsilon: %f\n", EPS);
 printf("Resulting mean error: %f\n", mean_error);
 printf("===============================================\n");
 
 // Show result of image alignment.
 init_warp(W, wz_a, tx_a, ty_a);
 draw_warped_rect(pImgI, omega, W);
 cvSetImageROI(pImgT, omega);
 cvShowImage("template",pImgT);
 cvShowImage("image",pImgI);
 cvResetImageROI(pImgT);

现在，释放所有使用的资源

 // Free used resources and exit.
 cvReleaseMat(&W);
 cvReleaseMat(&H);
 cvReleaseMat(&iH);
 cvReleaseMat(&b);
 cvReleaseMat(&delta_p);
 cvReleaseMat(&X);
 cvReleaseMat(&Z);
}

实现逆向组合图像对齐算法

在逆向组合算法中，模板 T 和图像 I 的角色互换。

预计算

计算图像 T(x) 的梯度。
在 (x;0) 处评估雅可比矩阵。
计算最陡下降图像，。
计算海森矩阵，

迭代

用 W(x;p) 形变 I 以计算 I(W(x,p))。
计算误差图像 I(W(x;p))-T(x)。
用 W(x;p) 形变梯度。
计算
计算
更新参数

直到。

此方法的代码位于 invcomp.h 和 invcomp.cpp 中。让我们看看 invcomp.cpp 包含什么。

包含所有必需的头文件

#include <stdio.h>
#include <time.h>
#include <cv.h>        // Include header for computer-vision part of OpenCV.
#include <highgui.h>    // Include header for GUI part of OpenCV.
#include "auxfunc.h"    // Header for our warping functions.

这里我们实现了 Baker-Dellaert-Matthews（逆向组合）算法

// Baker-Dellaert-Matthews inverse compositional method
// @param[in] pImgT   Template image T
// @param[in] omega   Rectangular template region
// @param[in] pImgI   Another image I

void align_image_inverse_compositional(IplImage* pImgT, CvRect omega, IplImage* pImgI)
{
    // Some constants for iterative minimization process.
    const float EPS = 1E-5f; // Threshold value for termination criteria.
    const int MAX_ITER = 100;  // Maximum iteration count.

创建所有内部使用的矩阵和图像

    // Here we will store internally used images.
    IplImage* pGradTx = 0;       // Gradient of I in X direction.
    IplImage* pGradTy = 0;       // Gradient of I in Y direction.
    IplImage* pStDesc = 0;        // Steepest descent images.

    // Here we will store matrices.
    CvMat* W = 0;  // Current value of warp W(x,p)
    CvMat* dW = 0;  // Warp update.
    CvMat* idW = 0; // Inverse of warp update.
    CvMat* X = 0;  // Point in coordinate frame of T.
    CvMat* Z = 0;  // Point in coordinate frame of I.

    CvMat* H = 0;  // Hessian.
    CvMat* iH = 0; // Inverse of Hessian.
    CvMat* b = 0;  // Vector in the right side of the system of linear equations.
    CvMat* delta_p = 0; // Parameter update value.
    
    // Create matrices.
    W = cvCreateMat(3, 3, CV_32F);
    dW = cvCreateMat(3, 3, CV_32F);
    idW = cvCreateMat(3, 3, CV_32F);
    X = cvCreateMat(3, 1, CV_32F);
    Z = cvCreateMat(3, 1, CV_32F);

    H = cvCreateMat(3, 3, CV_32F);
    iH = cvCreateMat(3, 3, CV_32F);
    b = cvCreateMat(3, 1, CV_32F);
    delta_p = cvCreateMat(3, 1, CV_32F);

IPL_DEPTH_16S 表示使用 short 作为梯度图像元素。这里我们使用 short，因为 cvSobel 产生的图像未归一化，可能会发生 uchar 溢出。为了归一化梯度图像，我们使用 cvConvertScale()。

    // Create images.
    CvSize image_size = cvSize(pImgI->width, pImgI->height);
    pGradTx = cvCreateImage(image_size, IPL_DEPTH_16S, 1);
    pGradTy = cvCreateImage(image_size, IPL_DEPTH_16S, 1);
    pStDesc = cvCreateImage(image_size, IPL_DEPTH_32F, 3);

    // Get current time. We will use it later to obtain total calculation time.
    clock_t start_time = clock();

    /*
     *  Precomputation stage.
     */

    // Calculate gradient of T.
    cvSobel(pImgT, pGradTx, 1, 0); // Gradient in X direction
    cvConvertScale(pGradTx, pGradTx, 0.125); // Normalize
    cvSobel(pImgT, pGradTy, 0, 1); // Gradient in Y direction
    cvConvertScale(pGradTy, pGradTy, 0.125); // Normalize
    

    // Compute steepest descent images and Hessian

    cvSet(H, cvScalar(0)); // Set Hessian with zeroes

现在遍历模板图像 pImgT 的像素，累加海森矩阵 H 和矩阵 b

    int u, v;    // (u,v) - pixel coordinates in the coordinate frame of T.
    float u2, v2; // (u2,v2) - pixel coordinates in the coordinate frame of I.
    
    // Walk through pixels in the template T.
    int i, j;
    for(i=0; i<omega.width; i++)
    {
        u = i + omega.x;

        for(j=0; j<omega.height; j++)
        {
            v = j + omega.y;

            // Evaluate gradient of T.
            short Tx = CV_IMAGE_ELEM(pGradTx, short, v, u);    
            short Ty = CV_IMAGE_ELEM(pGradTy, short, v, u);    
            
            // Calculate steepest descent image's element.
            // an element of steepest descent image
            float* stdesc = &CV_IMAGE_ELEM(pStDesc, float, v,u*3);
            stdesc[0] = (float)(-v*Tx+u*Ty);
            stdesc[1] = (float)Tx;
            stdesc[2] = (float)Ty;

            // Add a term to Hessian.
            int l,m;
            for(l=0;l<3;l++)
            {
                for(m=0;m<3;m++)
                {
                    CV_MAT_ELEM(*H, float, l, m) += stdesc[l]*stdesc[m];
                }
            }
        }    
    }

    // Invert Hessian.
    double inv_res = cvInvert(H, iH);
    if(inv_res==0)
    {
        printf("Error: Hessian is singular.\n");
        return;
    }

现在进入迭代循环以最小化平均误差值。

    /*
     *   Iteration stage.
     */

    // Set warp with identity.
    cvSetIdentity(W);

    // Here we will store current value of mean error.
    float mean_error=0;

    // Iterate
    int iter=0; // number of current iteration
    while(iter < MAX_ITER)
    {
        iter++; // Increment iteration counter

        mean_error = 0; // Set mean error value with zero

        int pixel_count=0; // Count of processed pixels
        
        cvSet(b, cvScalar(0)); // Set b matrix with zeroes
            
        // Walk through pixels in the template T.
        int i, j;
        for(i=0; i<omega.width; i++)
        {
            int u = i + omega.x;

            for(j=0; j<omega.height; j++)
            {
                int v = j + omega.y;

                // Set vector X with pixel coordinates (u,v,1)
                SET_VECTOR(X, u, v);

                // Warp Z=W*X
                cvGEMM(W, X, 1, 0, 0, Z);

                // Get coordinates of warped pixel in coordinate frame of I.
                GET_VECTOR(Z, u2, v2);

                // Get the nearest integer pixel coords (u2i;v2i).
                int u2i = cvFloor(u2);
                int v2i = cvFloor(v2);

                if(u2i>=0 && u2i<pImgI->width && // check if pixel is inside I.
                    v2i>=0 && v2i<pImgI->height)
                {
                    pixel_count++;

                    // Calculate intensity of a transformed pixel with sub-pixel accuracy
                    // using bilinear interpolation.
                    float I2 = interpolate<uchar>(pImgI, u2, v2);

                    // Calculate image difference D = I(W(x,p))-T(x).
                    float D = I2 - CV_IMAGE_ELEM(pImgT, uchar, v, u);

                    // Update mean error value.
                    mean_error += fabs(D);

                    // Add a term to b matrix.
                    float* stdesc = &CV_IMAGE_ELEM(pStDesc, float, v, u*3);
                    float* pb = &CV_MAT_ELEM(*b, float, 0, 0);
                    pb[0] += stdesc[0] * D;
                    pb[1] += stdesc[1] * D;
                    pb[2] += stdesc[2] * D;                    
                }    
            }
        }

        // Finally, calculate resulting mean error.
        if(pixel_count!=0)
            mean_error /= pixel_count;

        // Find parameter increment. 
        cvGEMM(iH, b, 1, 0, 0, delta_p);
        float delta_wz = CV_MAT_ELEM(*delta_p, float, 0, 0);
        float delta_tx = CV_MAT_ELEM(*delta_p, float, 1, 0);
        float delta_ty = CV_MAT_ELEM(*delta_p, float, 2, 0);

        init_warp(dW, delta_wz, delta_tx, delta_ty);
        // Invert warp.
        inv_res = cvInvert(dW, idW);
        if(inv_res==0)
        {
            printf("Error: Warp matrix is singular.\n");
            return;
        }

        cvGEMM(idW, W, 1, 0, 0, dW);
        cvCopy(dW, W);

        // Print diagnostic information to screen.
        printf("iter=%d mean_error=%f\n", iter, mean_error);

        // Check termination critera.
        if(fabs(delta_wz)<=EPS && fabs(delta_tx)<=EPS && fabs(delta_ty)<=EPS) break;
    }

    // Get current time and obtain total time of calculation.
    clock_t finish_time = clock();
    double total_time = (double)(finish_time-start_time)/CLOCKS_PER_SEC;

将摘要信息打印到 stdout

    // Print summary.
    printf("===============================================\n");
    printf("Algorithm: inverse compositional.\n");
    printf("Caclulation time: %g sec.\n", total_time);
    printf("Iteration count: %d\n", iter);
    printf("Epsilon: %f\n", EPS);
    printf("Resulting mean error: %f\n", mean_error);
    printf("===============================================\n");

    // Show result of image alignment.
    draw_warped_rect(pImgI, omega, W);
    cvSetImageROI(pImgT, omega);
    cvShowImage("template",pImgT);
    cvShowImage("image",pImgI);
    cvResetImageROI(pImgT);

    // Free used resources and exit.
    cvReleaseMat(&W);
    cvReleaseMat(&dW);
    cvReleaseMat(&idW);
    cvReleaseMat(&H);
    cvReleaseMat(&iH);
    cvReleaseMat(&b);
    cvReleaseMat(&delta_p);
    cvReleaseMat(&X);
    cvReleaseMat(&Z);
    
}

双线性插值

在这两种算法中，我们都使用双线性像素插值。插值有助于更准确地计算变换后像素的强度。变换后的像素通常具有非整数坐标，因此通过插值它们的强度，我们考虑了邻近像素的强度。这也有助于避免平均误差振荡并提高整体最小化性能。

我们使用 interpolate() 模板函数，该函数定义在 auxfunc.h 中

// Interpolates pixel intensity with subpixel accuracy.
// Abount bilinear interpolation in Wikipedia:
// http://en.wikipedia.org/wiki/Bilinear_interpolation

template <class T>
float interpolate(IplImage* pImage, float x, float y)
{
    // Get the nearest integer pixel coords (xi;yi).
    int xi = cvFloor(x);
    int yi = cvFloor(y);

    float k1 = x-xi; // Coefficients for interpolation formula.
    float k2 = y-yi;

    int f1 = xi<pImage->width-1;  // Check that pixels to the right  
    int f2 = yi<pImage->height-1; // and to down direction exist.

    T* row1 = &CV_IMAGE_ELEM(pImage, T, yi, xi);
    T* row2 = &CV_IMAGE_ELEM(pImage, T, yi+1, xi);
                
    // Interpolate pixel intensity.
    float interpolated_value = (1.0f-k1)*(1.0f-k2)*(float)row1[0] +
                (f1 ? ( k1*(1.0f-k2)*(float)row1[1] ):0) +
                (f2 ? ( (1.0f-k1)*k2*(float)row2[0] ):0) +                        
                ((f1 && f2) ? ( k1*k2*(float)row2[1] ):0) ;

    return interpolated_value;

}

结论

图像对齐在计算机视觉领域有许多应用，例如目标跟踪。在本文中，我们描述了两种图像对齐算法：Lucas-Kanade 前向加法算法和 Baker-Dellaert-Matthews 逆向组合算法。我们还看到了这些算法的 C 源代码。

现在，让我们比较一下哪种算法在我的 Pentium IV 3 Ghz（Release 配置）上运行更快。让我们输入向量 **p** 的以下分量：

-0.01 5 -3

这意味着我们将图像 I 向右平移了五个像素，向上平移了三个像素，并对其进行了轻微旋转（但很难说具体角度是多少）。

这是前向加法算法的摘要

===============================================
Algorithm: forward additive.
Caclulation time: 0.157 sec.
Iteration count: 13
Approximation: wz_a=-0.007911 tx_a=4.893255 ty_a=-3.944573
Epsilon: 0.000010
Resulting mean error: 2.898076
===============================================

这是逆向组合算法的摘要

===============================================
Algorithm: inverse compositional.
Caclulation time: 0.078 sec.
Iteration count: 11
Epsilon: 0.000010
Resulting mean error: 2.896847
===============================================

前向加法算法的计算时间为 0.157 秒。逆向组合算法的计算时间为 0.078 秒。如我们所见，逆向组合算法运行得更快，因为它在计算上更有效。

另请参阅

您可能还想阅读图像对齐算法 - 第二部分，其中我们实现了前向组合和逆向加法图像对齐算法，并比较了所有四种已实现的算法。

另一个关于二维目标跟踪和示例源代码的教程可以在实现简单的二维目标跟踪器文章中找到。

如果您正在寻找目标跟踪的教程，请访问图像对齐算法网站。

历史

2008 年 8 月 6 日

已修复：cvSobel() 未归一化。感谢 hbwang_1427 告知我这个问题。
改进：实现了双线性插值，该插值允许以亚像素精度计算形变像素的强度。这也有助于避免平均误差振荡。
在文章正文中描述了逆向组合算法的代码。