连通组件标记算法的实现

Torben Trindkaer Nielsen

4.94/5 (14投票s)

2014 年 10 月 1 日

CPOL

4分钟阅读

76026

2110

本文介绍了一种带有堆栈限制规避方法的递归连通分量标记算法。所有代码均在70行C/C++代码以内。

下载源代码 - 838 B

Sample Image of Connected Component Labelling

引言

连通分量标记算法的迭代解法在文献中有详细描述，但在实现时需要相当复杂的方法。更简单的递归解法存在的问题是，即使对于小型图像，它也会使用比通常可用的更多的堆栈空间。

本文介绍了一种带有堆栈限制规避方法的递归4连通分量标记算法。所有代码均在70行C/C++代码以内。该实现使用纯C语言编写，因此可以用于任何C/C++项目，无论平台和编译器如何。

背景

连通分量标记（connected component labeling）常用于计算机视觉和图像分析领域。

Using the Code

代码包含一个源文件。要使用它，只需将代码包含到您的C/C++项目中并调用函数LabelImage。

void LabelImage(unsigned short width, unsigned short height, unsigned char* input, int* output);

输入图像是一个字节数组，其中0值表示背景，其他值（通常是1或255）表示一个对象。这通常是通过阈值处理得到的。

输出图像（标记后的图像）是一个整数数组，其中0值表示背景，标记数字从1开始，直到找到的标签数量。

输出图像必须预先分配并初始化为零。

限制

由于无符号短整型通常是16位的，输入和输出图像的大小限制为最大宽度 x 高度 = (65535 x 65535 像素)。

输入和输出图像的大小也受可用堆大小的限制（临时分配的堆内存约为“6*宽度*高度”字节，数量级上是输出图像大小的1.5倍）。

使用示例如下所示

unsigned short W = 640;
unsigned short H = 480;
unsigned char* input  = (unsigned char*) malloc(W*H*sizeof(unsigned char));
int*           output = (int*)           malloc(W*H*sizeof(int));
/* -- TODO: You will need to initialize the input image with whatever image you want to label! */
memset(output, 0, W*H*sizeof(int));
LabelImage(W, H, input, output);

输入和输出图像的数据直接布局为连续的IPL图像或OpenCV中的矩阵（我认为这是图像数据行和列的标准分发方式）。这使得OpenCV与该算法一起使用非常方便。

在OpenCV项目中使用的示例如下所示

unsigned short W = 640;
unsigned short H = 480;
Mat input (H, W, CV_8UC1);
Mat output(H, W, CV_32SC1, 0);
/* -- TODO: You will need to initialize the input image with whatever image you want to label! */
LabelImage(W, H, (unsigned char*)input.data, (int*)output.data);

如前所述，要初始化上述示例中的输入图像，必须知道行和列的分布。

通过以下简单的演示，一个具有(宽度, 高度) = (8,6)像素的虚拟图像进行了说明。

Image of memory layout of matrix

行编号为 0 .. 5
列编号为 0 .. 7
索引为 column + row*width = 6 + 2*8 = 22 处的像素值为255，即 input[22] = 255。

实现

由于该方法是对标准递归算法的修改，我们首先展示其外观。

标准的4连通分量递归算法，用C/C++编写（递归过程LabelComponent在下一个代码片段中）

void LabelImage(unsigned short width, unsigned short height, unsigned char* input, int* output)
{
  int labelNo = 0;
  int index   = -1;
  for (unsigned short y = 0; y < height; y++)
  {
    for (unsigned short x = 0; x < width; x++)
    {
      index++;
      if (input [index] == 0) continue;   /* This pixel is not part of a component */
      if (output[index] != 0) continue;   /* This pixel has already been labelled  */
      /* New component found */
      labelNo++;
      LabelComponent(width, height, input, output, labelNo, x, y);
    }
  }
}

递归部分在此过程LabelComponent中

void LabelComponent(unsigned short width, unsigned short height, 
unsigned char* input, int* output, int labelNo, unsigned short x, unsigned short y)
{
  int index = x + width*y;
  if (input [index] == 0) return;   /* This pixel is not part of a component */
  if (output[index] != 0) return;   /* This pixel has already been labelled  */
  output[index] = labelNo;

  /* Now label the 4 neighbours: */
  if (x > 0)        LabelComponent(width, height, input, output, labelNo, x-1, y);   /* left  pixel */
  if (x < width-1)  LabelComponent(width, height, input, output, labelNo, x+1, y);   /* right pixel */
  if (y > 0)        LabelComponent(width, height, input, output, labelNo, x, y-1);   /* upper pixel */
  if (y < height-1) LabelComponent(width, height, input, output, labelNo, x, y+1);   /* lower pixel */
}

现在我们只需要重写过程LabelComponent以避免堆栈溢出。但要做到这一点，我们必须首先理解当一个过程被调用时会发生什么。

调用过程包括将过程参数与过程后的程序地址一起压入堆栈。然后跳转到过程地址（见下面的宏CALL_LabelComponent）。过程中的所有参数变量然后引用它们在堆栈上的实际位置。
当过程结束时（过程中的返回点 - 见下面的宏RETURN），弹出堆栈上的值并跳转到过程后的地址（这也已压入堆栈）。

View of stack

整个程序已用自定义实现的堆栈重写（四个宏名称CALL_LabelComponent、RETURN、X和Y应该提高可读性）。

CALL_LabelComponent对应于LabelComponent过程的递归调用。
RETURN表示LabelComponent过程调用的结束。
X和Y（大写）分别对应于局部变量x和y（小写）。

#define CALL_LabelComponent(x,y,returnLabel) { STACK[SP] = x; 
STACK[SP+1] = y; STACK[SP+2] = returnLabel; SP += 3; goto START; }
#define RETURN { SP -= 3;                \
                 switch (STACK[SP+2])    \
                 {                       \
                 case 1 : goto RETURN1;  \
                 case 2 : goto RETURN2;  \
                 case 3 : goto RETURN3;  \
                 case 4 : goto RETURN4;  \
                 default: return;        \
                 }                       \
               }
#define X (STACK[SP-3])
#define Y (STACK[SP-2])

void LabelComponent(unsigned short* STACK, unsigned short width, unsigned short height, 
unsigned char* input, int* output, int labelNo, unsigned short x, unsigned short y)
{
  STACK[0] = x;
  STACK[1] = y;
  STACK[2] = 0;  /* return - component is labelled */
  int SP   = 3;
  int index;

START: /* Recursive routine starts here */

  index = X + width*Y;
  if (input [index] == 0) RETURN;   /* This pixel is not part of a component */
  if (output[index] != 0) RETURN;   /* This pixel has already been labelled  */
  output[index] = labelNo;

  if (X > 0) CALL_LabelComponent(X-1, Y, 1);   /* left  pixel */
RETURN1:

  if (X < width-1) CALL_LabelComponent(X+1, Y, 2);   /* right pixel */
RETURN2:

  if (Y > 0) CALL_LabelComponent(X, Y-1, 3);   /* upper pixel */
RETURN3:

  if (Y < height-1) CALL_LabelComponent(X, Y+1, 4);   /* lower pixel */
RETURN4:

  RETURN;
}

void LabelImage(unsigned short width, unsigned short height, unsigned char* input, int* output)
{
  unsigned short* STACK = (unsigned short*) malloc(3*sizeof(unsigned short)*(width*height + 1));
  
  int labelNo = 0;
  int index   = -1;
  for (unsigned short y = 0; y < height; y++)
  {
    for (unsigned short x = 0; x < width; x++)
    {
      index++;
      if (input [index] == 0) continue;   /* This pixel is not part of a component */
      if (output[index] != 0) continue;   /* This pixel has already been labelled  */
      /* New component found */
      labelNo++;
      LabelComponent(STACK, width, height, input, output, labelNo, x, y);
    }
  }

  free(STACK);
}

关注点

所描述的实现非常简单，易于根据您的需求进行定制，例如：

将其更改为查找8连通分量。
根据您的需求更改输入或输出图像的数据类型。
只需修改这两行代码，测试输入图像像素是否为零，并与阈值进行比较，您就可以避免先遍历图像进行固定阈值处理。

除了易于使用和易于定制的解决连通分量标记问题的算法之外，它还提供了一种实现递归算法的方法，该算法没有简单的迭代解法，即使在这里会导致堆栈溢出。缺点是该方法不具有普遍适用性，仅适用于所需堆栈大小在通常可以在堆上分配的范围内的情况。

实现中的CALL和RETURN宏可以做得更智能。放弃跨平台和ANSI C兼容性，GNU GCC编译器允许使用标签作为值（&&运算符扩展）。因此，可以避免RETURN宏中的switch语句，但代价是在堆栈中使用指针大小的条目。

历史

2014-10-01 文章初版