Pixel Shader for Edge Detection and Cartoon Effect

Santhosh G_

19 Jul 2010

CPOL

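Sobel edge detection and a cartoon effect implemented with a pixel shader.

Introduction

This article explains how to implement the Sobel edge detection method in a pixel shader, starting from a C++ implementation, and how to use the resulting edge information to give an image a cartoon effect.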

Input image (TajMahal.bmp)

Screenshot of the edge detection output for TajMahal.bmp

Screenshot of the cartoon effect output for TajMahal.bmp

Another screenshot of the cartoon effect

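Background

Edge detection is a simple image processing technique that identifies the points in a digital image where the brightness changes sharply or, more formally, has discontinuities. The Sobel edge detection here is implemented with the Sobel operator explained at http://en.wikipedia.org/wiki/Sobel_operator.

After implementing edge detection, I found a simple way to produce a cartoon-style image by combining the input image with its edge image.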

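Using the code

First, let us look at a C++ implementation of the Sobel edge detection method. Two convolution kernels are applied to every pixel: one detects the color change [gradient] in the X direction, and the other detects the color change [gradient] in the Y direction.

The following sections describe how these matrices are applied to a sample image.

Here is the sample image, which we will use to see how the X and Y direction gradient calculations work.

Sample image used to explain the X and Y direction gradient calculations

X direction gradient calculation

The code below detects changes in the X direction. It simply computes a weighted sum of the surrounding 3*3 pixels.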

// Initializing X direction gradient kernel.
GX[0][0] = -1; GX[0][1] = 0; GX[0][2] = 1;
GX[1][0] = -2; GX[1][1] = 0; GX[1][2] = 2;
GX[2][0] = -1; GX[2][1] = 0; GX[2][2] = 1;

// Looping to find out the change in X direction.
// I is the horizontal (column) offset and J the vertical (row) offset,
// so the kernel, stored as [row][column], is indexed [J+1][I+1].
for(I=-1; I<=1; I++)
{
    for(J=-1; J<=1; J++)
    {
        sumX = sumX + (int)( (*(stOriginalImage_i.pData + nX + I +
            (nY + J)*stOriginalImage_i.nCols)) * GX[J+1][I+1]);
    }
}

Here is the output image of the X direction gradient calculation

Output of the X direction gradient calculation

Y direction gradient calculation

The code below detects changes in the Y direction. It simply computes a weighted sum of the surrounding 3*3 pixels.

// Initializing Y direction gradient kernel.
GY[0][0] =  1; GY[0][1] =  2; GY[0][2] =  1;
GY[1][0] =  0; GY[1][1] =  0; GY[1][2] =  0;
GY[2][0] = -1; GY[2][1] = -2; GY[2][2] = -1;
// Looping to find out the change in Y direction.
// As for X, the kernel is indexed [row offset + 1][column offset + 1].
for(I=-1; I<=1; I++)
{
    for(J=-1; J<=1; J++)
    {
        sumY = sumY + (int)( (*(stOriginalImage_i.pData + nX + I + 
            (nY + J)*stOriginalImage_i.nCols)) * GY[J+1][I+1]);
    }
}

Here is the output image of the Y direction gradient calculation

Output image of the Y direction gradient calculation
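To make the two kernels concrete, here is a minimal sketch that applies them to a single 3*3 neighbourhood containing an ideal step edge. The names `kSobelX`, `kSobelY`, `applyKernel` and `demoStepEdges` are illustrative only and do not appear in the article's source; the kernels are stored as [row][column].

```cpp
#include <cassert>

// Sobel kernels, stored as [row][column] (assumed layout for this sketch).
const int kSobelX[3][3] = { { -1, 0, 1 }, { -2, 0, 2 }, { -1, 0, 1 } };
const int kSobelY[3][3] = { {  1, 2, 1 }, {  0, 0, 0 }, { -1, -2, -1 } };

// Weighted sum of a 3x3 neighbourhood with a 3x3 kernel.
int applyKernel(const unsigned char patch[3][3], const int kernel[3][3])
{
    int sum = 0;
    for (int row = 0; row < 3; ++row)
    {
        for (int col = 0; col < 3; ++col)
        {
            sum += patch[row][col] * kernel[row][col];
        }
    }
    return sum;
}

// Hypothetical demo: each kernel responds only to "its" edge direction.
void demoStepEdges()
{
    // Dark on the left, bright on the right: a change along X only.
    const unsigned char verticalEdge[3][3] =
        { { 0, 0, 255 }, { 0, 0, 255 }, { 0, 0, 255 } };
    assert(applyKernel(verticalEdge, kSobelX) == 1020); // 255 * (1 + 2 + 1)
    assert(applyKernel(verticalEdge, kSobelY) == 0);

    // Bright on top, dark below: a change along Y only.
    const unsigned char horizontalEdge[3][3] =
        { { 255, 255, 255 }, { 0, 0, 0 }, { 0, 0, 0 } };
    assert(applyKernel(horizontalEdge, kSobelX) == 0);
    assert(applyKernel(horizontalEdge, kSobelY) == 1020);
}
```

A uniform patch gives 0 for both kernels, because the weights of each kernel sum to zero.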

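Final image calculation

Finally, the X and Y direction gradient values are combined using the following formula, where $G_x$ and $G_y$ are the sumX and sumY values computed above

$$G = \sqrt{G_x^2 + G_y^2}$$

and in code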

SUM =  sqrt(double(sumX * sumX) + double(sumY * sumY));

The output looks like this

Output image for the test bitmap
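As a small worked example of the combination step, the sketch below turns a pair of directional sums into one displayable byte. The helper names `edgePixel` and `demoEdgePixel` are hypothetical; the clamping to 255 and the inversion (255 minus the magnitude) mirror what the complete FindEdge listing in this article does.

```cpp
#include <cassert>
#include <cmath>

// Combine the two directional sums into one displayable byte.
unsigned char edgePixel(long sumX, long sumY)
{
    // Gradient magnitude, truncated to an integer as in the article's code.
    int magnitude = static_cast<int>(
        std::sqrt(double(sumX) * sumX + double(sumY) * sumY));

    // Clamp to the byte range.
    if (magnitude > 255) magnitude = 255;

    // Invert so that strong edges are dark on a white background.
    return static_cast<unsigned char>(255 - magnitude);
}

// Hypothetical demo values.
void demoEdgePixel()
{
    assert(edgePixel(3, 4) == 250);    // sqrt(9 + 16) = 5
    assert(edgePixel(1020, 0) == 0);   // clamped to 255, then inverted
    assert(edgePixel(0, 0) == 255);    // no gradient: white
}
```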

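Edge detection of RGB images

This logic computes the gradient of a single-channel image. I apply the algorithm separately to each component [R, G, B] of the input image and then combine the outputs.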

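// Extract each component to a separate buffer to find the gradient in
// each component. The data is read in B, G, R order, so the buffer named
// originalImageR actually receives blue; the write-back below uses the
// same order, so every channel returns to its original position.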
BYTE* pbyTemp = pbyBMP_io;
for (int n= 0; n < nWidth * nHeight; n++)
{
    originalImageR.pData[n] = *pbyTemp++; // Blue
    originalImageG.pData[n] = *pbyTemp++; // Green
    originalImageB.pData[n] = *pbyTemp++; // Red
}

// Find Gradient of each component separately.
FindEdge( originalImageR, OutputRed );
FindEdge( originalImageG, OutputGreen );
FindEdge( originalImageB, OutputBlue );

// Combine RGB gradient information to a output buffer.
pbyTemp = pbyBMP_io;
for (int n= 0; n < nWidth * nHeight; n++)
{
    *pbyTemp++ = OutputRed.pData[n];
    *pbyTemp++ = OutputGreen.pData[n];
    *pbyTemp++ = OutputBlue.pData[n];
}
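The split and merge steps can also be written with standard containers. This is only a sketch under assumptions: `ChannelPlanes`, `splitChannels` and `mergeChannels` are hypothetical names, and the input is taken to be tightly packed 24-bit data in B, G, R order with no row padding.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One buffer per colour component (hypothetical helper type).
struct ChannelPlanes
{
    std::vector<unsigned char> blue, green, red;
};

// De-interleave packed B,G,R bytes into three planes.
ChannelPlanes splitChannels(const std::vector<unsigned char>& bgr)
{
    assert(bgr.size() % 3 == 0);
    const std::size_t pixelCount = bgr.size() / 3;
    ChannelPlanes planes;
    planes.blue.resize(pixelCount);
    planes.green.resize(pixelCount);
    planes.red.resize(pixelCount);
    for (std::size_t n = 0; n < pixelCount; ++n)
    {
        planes.blue[n]  = bgr[3 * n];
        planes.green[n] = bgr[3 * n + 1];
        planes.red[n]   = bgr[3 * n + 2];
    }
    return planes;
}

// Interleave three planes back into packed B,G,R bytes.
std::vector<unsigned char> mergeChannels(const ChannelPlanes& planes)
{
    assert(planes.blue.size() == planes.green.size());
    assert(planes.green.size() == planes.red.size());
    std::vector<unsigned char> bgr(planes.blue.size() * 3);
    for (std::size_t n = 0; n < planes.blue.size(); ++n)
    {
        bgr[3 * n]     = planes.blue[n];
        bgr[3 * n + 1] = planes.green[n];
        bgr[3 * n + 2] = planes.red[n];
    }
    return bgr;
}
```

Splitting and merging without any processing in between returns the original buffer, which is a quick way to check that the channel order is handled consistently.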

Here is the complete code for the gradient calculation of one component

void EdgeDetectCPU::FindEdge( ImageInfo_t& stOriginalImage_i, 
                              ImageInfo_t& stEdgeImage_o )
{
    int        nX, nY,I, J;
    long            sumX, sumY;
    int            nColors, SUM;
    int            GX[3][3];
    int            GY[3][3];
    // Allocate output buffer
    stEdgeImage_o.pData = 
      new BYTE[stOriginalImage_i.nCols * stOriginalImage_i.nRows];

    // X Directional Gradient matrix.
    GX[0][0] = -1; GX[0][1] = 0; GX[0][2] = 1;
    GX[1][0] = -2; GX[1][1] = 0; GX[1][2] = 2;
    GX[2][0] = -1; GX[2][1] = 0; GX[2][2] = 1;

    // Y Directional Gradient matrix.
    GY[0][0] =  1; GY[0][1] =  2; GY[0][2] =  1;
    GY[1][0] =  0; GY[1][1] =  0; GY[1][2] =  0;
    GY[2][0] = -1; GY[2][1] = -2; GY[2][2] = -1;

    // Iterate over each pixel in the image.
    for(nY=0; nY<=(stOriginalImage_i.nRows-1); nY++)
    {
        for(nX=0; nX<=(stOriginalImage_i.nCols-1); nX++)
        {
            sumX = 0;
            sumY = 0;

            SUM = 0;
            // Skip top,bottom, left and right pixels.
            if( !(nX==0 || nX==stOriginalImage_i.nCols-1 || 
                  nY==0 || nY==stOriginalImage_i.nRows-1))
            {
                // Looping to find out the change in X direction.
                // I is the column offset and J the row offset, so the
                // kernels, stored as [row][column], are indexed [J+1][I+1].
                for(I=-1; I<=1; I++)
                {
                    for(J=-1; J<=1; J++)
                    {
                        sumX = sumX + (int)( (*(stOriginalImage_i.pData + nX + I + 
                            (nY + J)*stOriginalImage_i.nCols)) * GX[J+1][I+1]);
                    }
                }

                // Looping to find out the change in Y direction.
                for(I=-1; I<=1; I++)
                {
                    for(J=-1; J<=1; J++)
                    {
                        sumY = sumY + (int)( (*(stOriginalImage_i.pData + nX + I + 
                            (nY + J)*stOriginalImage_i.nCols)) * GY[J+1][I+1]);
                    }
                }
                SUM =  sqrt(double(sumX * sumX) + double(sumY * sumY));
            }

            if(SUM>255) SUM=255;
            if(SUM<0) SUM=0;
            *(stEdgeImage_o.pData + nX + nY * stOriginalImage_i.nCols) = 
                                    255 - (unsigned char)(SUM);
        }
    }
}

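Cartoon effect implementation

The edge information of the image is now available. The cartoon effect is created simply by combining the edge image with its corresponding input image. The figure below illustrates how the edge image is used to create the cartoon effect.

Cartoon effect created by combining the edge image and the input image

The code below implements the cartoon effect from the edge image. EdgeDetectCPU::FindEdge is modified to create either the cartoon effect or the edge image, depending on the m_bCartoonEffect flag.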

if(SUM>255) SUM=255;
if(SUM<0) SUM=0;
int nOut = 0;
// Checking Cartoon Effect flag to create final image.
if( m_bCartoonEffect )
{
    // Make Cartoon effect by combining edge information and original image.
    nOut = (SUM * 0.5) + (*(stOriginalImage_i.pData + nX + 
            nY * stOriginalImage_i.nCols) * 0.5);
}
else
{
    // Creating displayable edge data.
    nOut = 255 - (unsigned char)(SUM);
}
*(stEdgeImage_o.pData + nX + nY * stOriginalImage_i.nCols) = nOut;
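The blend can be isolated into a single per-pixel function. The sketch below assumes the same 50/50 weighting as the snippet above; `cartoonPixel` and `demoCartoonPixel` are hypothetical names used only for illustration.

```cpp
#include <cassert>

// Blend the (un-inverted) edge strength with the original pixel, 50/50.
unsigned char cartoonPixel(int edgeStrength, unsigned char original)
{
    assert(edgeStrength >= 0);
    if (edgeStrength > 255) edgeStrength = 255; // clamp to the byte range

    // The fractional part is truncated, as in the int assignment above.
    return static_cast<unsigned char>(edgeStrength * 0.5 + original * 0.5);
}

// Hypothetical demo values.
void demoCartoonPixel()
{
    assert(cartoonPixel(0, 200) == 100);    // flat area: half the source value
    assert(cartoonPixel(255, 100) == 177);  // 127.5 + 50 = 177.5, truncated
    assert(cartoonPixel(1020, 255) == 255); // clamped edge on a white pixel
}
```

Because the edge strength is blended without inversion, a pixel on a strong edge comes out at least as bright as the source pixel, while flat regions are darkened to half their original value.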

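Pixel shader implementation

Porting this logic to a pixel shader is straightforward, because the gradient calculation is a single function that can be applied to every pixel of the image, and the gradient of one pixel does not depend on the result computed for any other pixel (when one pixel's output depends on another pixel's output, a pixel shader is hard to write). I therefore removed the two for loops that iterate over every pixel of the bitmap. The remaining work in creating the pixel shader is to convert some data types to shader-compatible ones. For example, the shader does not support two-dimensional arrays, but it provides a matrix data type that can be used like a two-dimensional array. Here is the declaration of the X and Y gradient matrices in the shader.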

// X directional search matrix.
mat3 GX = mat3( -1.0, 0.0, 1.0,
               -2.0, 0.0, 2.0,
               -1.0, 0.0, 1.0 );
// Y directional search matrix.
mat3 GY =  mat3( 1.0,  2.0,  1.0,
                0.0,  0.0,  0.0,
                -1.0, -2.0, -1.0 );
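The independence property that makes the port possible can be checked on the CPU. In the sketch below, `shadePixel` plays the role of the shader: it reads only from the input image, so the pixels can be visited in any order (or in parallel) with the same result. `GrayImage`, `shadePixel` and `render` are hypothetical names for this illustration, not part of the article's source.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct GrayImage
{
    int width;
    int height;
    std::vector<unsigned char> pixels; // row-major, width * height bytes
};

// Edge value of one pixel; depends only on the input image.
unsigned char shadePixel(const GrayImage& in, int x, int y)
{
    static const int GX[3][3] = { { -1, 0, 1 }, { -2, 0, 2 }, { -1, 0, 1 } };
    static const int GY[3][3] = { {  1, 2, 1 }, {  0, 0, 0 }, { -1, -2, -1 } };

    // Border pixels have no full 3x3 neighbourhood: report "no edge" (white).
    if (x == 0 || y == 0 || x == in.width - 1 || y == in.height - 1)
        return 255;

    long sumX = 0, sumY = 0;
    for (int j = -1; j <= 1; ++j)
    {
        for (int i = -1; i <= 1; ++i)
        {
            const int value = in.pixels[(y + j) * in.width + (x + i)];
            sumX += value * GX[j + 1][i + 1];
            sumY += value * GY[j + 1][i + 1];
        }
    }
    int magnitude = static_cast<int>(
        std::sqrt(double(sumX) * sumX + double(sumY) * sumY));
    if (magnitude > 255) magnitude = 255;
    return static_cast<unsigned char>(255 - magnitude);
}

// Apply shadePixel to every pixel, visiting them forwards or backwards.
std::vector<unsigned char> render(const GrayImage& in, bool reverseOrder)
{
    assert(in.pixels.size() == static_cast<std::size_t>(in.width) * in.height);
    const int count = in.width * in.height;
    std::vector<unsigned char> out(count);
    for (int k = 0; k < count; ++k)
    {
        const int index = reverseOrder ? count - 1 - k : k;
        out[index] = shadePixel(in, index % in.width, index / in.width);
    }
    return out;
}
```

Calling `render(image, false)` and `render(image, true)` on the same image yields identical buffers, which is exactly the guarantee a pixel shader relies on.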

The texture coordinate received in the pixel shader is used to find the nX and nY values of the C++ for-loop code.

// Find out the X, Y index of the incoming pixel from its texture coordinate.
float fXIndex = gl_TexCoord[0].s * fWidth;
float fYIndex = gl_TexCoord[0].t * fHeight;
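The coordinate mapping is plain arithmetic and can be tried on the CPU. The sketch below uses two hypothetical helpers: `texCoordToIndex` scales a normalized coordinate to pixel space, and `neighbourTexCoord` converts a neighbouring pixel position back to a normalized coordinate using the same expression as the shader listing in this article, `(index + offset + 0.5) / size`.

```cpp
#include <cassert>

// Pixel-space position of a fragment from its normalized [0, 1] coordinate.
float texCoordToIndex(float texCoord, float size)
{
    assert(size > 0.0f);
    return texCoord * size;
}

// Normalized coordinate of the sample `offset` pixels away, written the
// same way as in the shader: (index + offset + 0.5) / size.
float neighbourTexCoord(float index, float offset, float size)
{
    assert(size > 0.0f);
    return (index + offset + 0.5f) / size;
}

// Hypothetical demo for an image that is 8 pixels wide.
void demoTexCoordMapping()
{
    const float width = 8.0f;
    const float index = texCoordToIndex(0.25f, width);
    assert(index == 2.0f);
    assert(neighbourTexCoord(index, -1.0f, width) == 0.1875f); // 1.5 / 8
    assert(neighbourTexCoord(index,  1.0f, width) == 0.4375f); // 3.5 / 8
}
```

The demo values are powers of two so that the floating-point comparisons are exact.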

The complete pixel shader for edge detection and the cartoon effect is

// Image texture.
uniform sampler2D ImageTexture;

// Width of Image.
uniform float fWidth;
// Height of Image.
uniform float fHeight;
// Indicating cartoon effect is enabled or not.
uniform float fCartoonEffect;

void main()
{
    // X directional search matrix.
    mat3 GX = mat3( -1.0, 0.0, 1.0,
                    -2.0, 0.0, 2.0,
                    -1.0, 0.0, 1.0 );
    // Y directional search matrix.
    mat3 GY =  mat3( 1.0,  2.0,  1.0,
                     0.0,  0.0,  0.0,
                    -1.0, -2.0, -1.0 );

    vec4  fSumX = vec4( 0.0,0.0,0.0,0.0 );
    vec4  fSumY = vec4( 0.0,0.0,0.0,0.0 );
    vec4 fTotalSum = vec4( 0.0,0.0,0.0,0.0 );

    // Find out the X, Y index of the incoming pixel
    // from its texture coordinate.
    float fXIndex = gl_TexCoord[0].s * fWidth;
    float fYIndex = gl_TexCoord[0].t * fHeight;

    /* image boundaries Top, Bottom, Left, Right pixels*/
    if( ! ( fYIndex < 1.0 || fYIndex > fHeight - 1.0 || 
            fXIndex < 1.0 || fXIndex > fWidth - 1.0 ))
    {
        // mat3 is column-major, so GX[J+1] is the (J+1)-th triple written
        // above. Indexing [J+1][I+1] therefore weights each sample by its
        // horizontal offset I for GX and by its vertical offset J for GY.
        // X Directional Gradient calculation.
        for(float I=-1.0; I<=1.0; I = I + 1.0)
        {
            for(float J=-1.0; J<=1.0; J = J + 1.0)
            {
                float fTempX = ( fXIndex + I + 0.5 ) / fWidth ;
                float fTempY = ( fYIndex + J + 0.5 ) / fHeight ;
                vec4 fTempSumX = texture2D( ImageTexture, vec2( fTempX, fTempY ));
                fSumX = fSumX + fTempSumX * GX[int(J+1.0)][int(I+1.0)];
            }
        }

        { // Y Directional Gradient calculation.
            for(float I=-1.0; I<=1.0; I = I + 1.0)
            {
                for(float J=-1.0; J<=1.0; J = J + 1.0)
                {
                    float fTempX = ( fXIndex + I + 0.5 ) / fWidth ;
                    float fTempY = ( fYIndex + J + 0.5 ) / fHeight ;
                    vec4 fTempSumY = texture2D( ImageTexture, vec2( fTempX, fTempY ));
                    fSumY = fSumY + fTempSumY * GY[int(J+1.0)][int(I+1.0)];
                }
            }
            // Combine X Directional and Y Directional Gradient.
            vec4 fTem = fSumX * fSumX + fSumY * fSumY;
            fTotalSum = sqrt( fTem );
        }
    }
    // Checking status of cartoon effect.
    if( 0.5 < fCartoonEffect )
    {
        // Creating cartoon effect by combining
        // edge information and original image data.
        fTotalSum = mix( fTotalSum, texture2D( ImageTexture, 
                         vec2( gl_TexCoord[0].s, gl_TexCoord[0].t)), 0.5);
    }
    else
    {
        // Creating displayable edge data.
        fTotalSum = vec4( 1.0,1.0,1.0,1.0) - fTotalSum;
    }
    
    gl_FragColor = ( fTotalSum );
}

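The output of this shader is either the cartoon effect or the edge image. The program is executed for every pixel of the image and computes the edge information of that pixel. In the final stage, the fCartoonEffect flag determines the output color of the shader. The application sets fCartoonEffect in the shader according to the state of the CartoonEffect check box.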

// Checking status of cartoon effect.
if( 0.5 < fCartoonEffect )
{
    // Creating cartoon effect by combining edge
    // information and original image data.
    fTotalSum = mix( fTotalSum, texture2D( ImageTexture, 
                     vec2( gl_TexCoord[0].s, gl_TexCoord[0].t)), 0.5);
}
else
{
    // Creating displayable edge data.
    fTotalSum = vec4( 1.0,1.0,1.0,1.0) - fTotalSum;
}

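I don't think this pixel shader needs further explanation, since the X and Y direction gradient [change] calculations were already discussed in the C++ logic above.

Save functionality

The "Save" button of the edge detection application saves the output of the edge detection algorithm to a BMP file. The save function simply reads the pixels from the screen with the glReadPixels() API.

Here is the code that reads the pixel information from the screen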

pbyData = new BYTE[stImageArea.bottom * stImageArea.right  * 3];
if( 0 == pbyData )
{
    AfxMessageBox( L"Memory Allocation failed" );
    return;
}
glReadPixels( 0, 0, stImageArea.right, stImageArea.bottom, 
              GL_BGR_EXT, GL_UNSIGNED_BYTE, pbyData );
BMPLoader SaveBmp;
SaveBmp.SaveBMP( csFileName, stImageArea.right, stImageArea.bottom, pbyData );

The code that creates the file name for saving is a little tricky

CString csFileName;
// This creates different names in CPU mode and GPU mode.
csFileName.Format( L"EdgeDetection_%s_%d.bmp", 
                 ( RUN_IN_CPU == m_nRunIn ) ? L"CPU" : L"GPU",
                 ( RUN_IN_CPU == m_nRunIn ) ? ++nCPUCount : ++nGPUCount );
CFileDialog SaveDlg( false, L"*.bmp", csFileName );

Examples of the different names generated in CPU and GPU mode: EdgeDetection_GPU_1.bmp, EdgeDetection_CPU_1.bmp.
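The tricky part is that the conditional operator evaluates only one of its two branches, so only the counter of the active mode is incremented. A portable sketch of the same idea, using std::string instead of CString, is shown below; `SaveNameGenerator` and its members are hypothetical stand-ins for the dialog's counters.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the dialog's nCPUCount / nGPUCount members.
struct SaveNameGenerator
{
    int cpuCount = 0;
    int gpuCount = 0;

    // Returns the next file name for the given mode and advances only
    // that mode's counter.
    std::string next(bool runInCpu)
    {
        int& counter = runInCpu ? cpuCount : gpuCount;
        ++counter;
        assert(counter > 0);
        return std::string("EdgeDetection_") + (runInCpu ? "CPU" : "GPU") +
               "_" + std::to_string(counter) + ".bmp";
    }
};
```

For example, calling `next(false)`, `next(true)` and `next(false)` in that order on a fresh generator produces EdgeDetection_GPU_1.bmp, EdgeDetection_CPU_1.bmp and EdgeDetection_GPU_2.bmp.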

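Points of Interest

The 4-byte boundary padding of BMP data caused some difficulty for the edge detection calculation on the CPU. The CPU edge detection needs special handling when the number of columns in the bitmap is not a multiple of 4. When edge detection ran without accounting for the padding [a BMP file contains unexpected values in its padding pixels], I got a strange output image, because the buffer passed to EdgeDetectCPU contained unwanted data in the padding pixels. I therefore followed the steps below to avoid the problem

Removed the unexpected pixels that were added at the right side of the image to make the column count a multiple of 4.
Found the edge image using EdgeDetect.EdgeDetect( m_nImageWidth, m_nImageHeight, pbyImage ).
Added the padding pixels back (making the column count a multiple of 4).

The code below is the edge detection step, which runs on the data after the padding has been removed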

// Extract each component[R,G,B] to separate buffer to find gradient in 
// each component.
BYTE* pbyTemp = pbyBMP_io;
for (int n= 0; n < nWidth * nHeight; n++)
{
    originalImageR.pData[n] = *pbyTemp++; // Blue
    originalImageG.pData[n] = *pbyTemp++; // Green
    originalImageB.pData[n] = *pbyTemp++; // Red
}

// Find Gradient of each component separately.
FindEdge( originalImageR, OutputRed );
FindEdge( originalImageG, OutputGreen );
FindEdge( originalImageB, OutputBlue );

// Combine RGB gradient information to a output buffer.
pbyTemp = pbyBMP_io;
for (int n= 0; n < nWidth * nHeight; n++)
{
    *pbyTemp++ = OutputRed.pData[n];
    *pbyTemp++ = OutputGreen.pData[n];
    *pbyTemp++ = OutputBlue.pData[n];
}
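The padding removal and restoration steps described above are not part of that listing. A minimal sketch of how they might look is given here, assuming 24-bit pixels and rows padded to a multiple of 4 bytes; `paddedStride`, `stripPadding` and `addPadding` are hypothetical helpers, not functions from the article's source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Bytes per row of a 24-bit image, rounded up to a multiple of 4.
std::size_t paddedStride(std::size_t width)
{
    return (width * 3 + 3) & ~std::size_t(3);
}

// Copy only the pixel bytes of each row, dropping the padding bytes.
std::vector<unsigned char> stripPadding(const std::vector<unsigned char>& padded,
                                        std::size_t width, std::size_t height)
{
    const std::size_t rowBytes = width * 3;
    const std::size_t stride = paddedStride(width);
    assert(padded.size() >= stride * height);
    std::vector<unsigned char> packed(rowBytes * height);
    for (std::size_t y = 0; y < height; ++y)
    {
        std::memcpy(packed.data() + y * rowBytes,
                    padded.data() + y * stride, rowBytes);
    }
    return packed;
}

// Rebuild the padded layout; the padding bytes are set to zero.
std::vector<unsigned char> addPadding(const std::vector<unsigned char>& packed,
                                      std::size_t width, std::size_t height)
{
    const std::size_t rowBytes = width * 3;
    const std::size_t stride = paddedStride(width);
    assert(packed.size() >= rowBytes * height);
    std::vector<unsigned char> padded(stride * height, 0);
    for (std::size_t y = 0; y < height; ++y)
    {
        std::memcpy(padded.data() + y * stride,
                    packed.data() + y * rowBytes, rowBytes);
    }
    return padded;
}
```

For a 3-pixel-wide image each row holds 9 pixel bytes but occupies 12 bytes, so the edge detection would otherwise read 3 bytes of padding per row as if they were image data.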

BMPLoader::LoadBMP loads the image file with GDI+. It uses the Gdiplus::Bitmap class to read the image from the file, and m_pBitmap->LockBits() provides the image data.

// Code for reading an image file using GDI+
Gdiplus::Bitmap* m_pBitmap = new Gdiplus::Bitmap(pFileName_i, true);

int nWidth = m_pBitmap->GetWidth();
int nHeight = m_pBitmap->GetHeight();
Gdiplus::Rect rect(0,0,nWidth,nHeight);
Gdiplus::BitmapData pBMPData;
// LockBits reports failure through its return status.
if( Gdiplus::Ok != m_pBitmap->LockBits( &rect, Gdiplus::ImageLockModeRead,
                        PixelFormat24bppRGB, &pBMPData ) ||
    0 == pBMPData.Scan0 )
{
    return false;
}
pbyData_o = new BYTE[nWidth * nHeight * 3];
nWidth_o = nWidth;
nHeight_o = nHeight;
BYTE* pSrc = (BYTE*)pBMPData.Scan0;
int nVert = nHeight - 1;
// nVert must reach 0, otherwise the last source row is never copied.
for( int nY = 0; nY < nHeight && nVert >= 0; nY++ )
{
    // Avoid top and bottom difference: copy the rows in reverse order.
    BYTE* pDest = pbyData_o + ( nWidth * nVert * 3 );
    memcpy( pDest, pSrc, 3 * nWidth);
    nVert--;
    // Advance by the stride, which includes any 4-byte row padding.
    pSrc += pBMPData.Stride;
}
m_pBitmap->UnlockBits( &pBMPData );

History

19 Jul 2010 - Initial version.
28 Jul 2010 - Added the AppError class to display an error message when shader creation fails.
14 Aug 2010 - Added the cartoon effect feature.
20 Aug 2010 - Added JPEG and PNG image loading using GDI+.