Part 6: Primitive Restart and OpenGL Interoperability

May 24, 2011

CPOL

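This is the sixth article in a series on portable multithreaded programming using OpenCL™. In it, Rob Farber discusses how to compute data in OpenCL™ and render it with OpenGL within the same application.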

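Topics covered include configuration, buffer interoperability, and using primitive restart to accelerate GPU rendering. The previous article in this series (Part 5) demonstrated how to tie computation to data in a multi-device, multi-GPU environment.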

Introduction

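OpenGL is a general-purpose graphics programming API (application programming interface) that is standards-based, cross-language, and cross-platform. It can be used to create applications that render 2D and 3D images on most visualization hardware. Interoperability between OpenGL and OpenCL can greatly accelerate both data generation and data visualization. Essentially, the OpenCL application maps OpenGL buffers so that they can be modified by massively parallel kernels running on the GPU. The data therefore stays on the GPU, which avoids expensive transfers across the PCIe bus.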

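Primitive restart is a feature that was added in the OpenGL 3.1 specification. In short, primitive restart lets the programmer define a value that acts as a marker telling the OpenGL state machine to restart the current rendering primitive, beginning with the next data item. This has several advantages: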

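  • For the programmer, it means that multiple lines, triangle fans, and irregular meshes can be rendered with a single command.
  • Developers of the OpenGL state machine can optimize their code so that the test for the primitive restart marker happens on the GPU, which eliminates the host/GPU communication bottleneck across the PCIe bus.
  • Rendering performance can be improved by arranging the data to achieve the highest reuse of the cache in the texture units.
  • As noted in the primitive restart specification and illustrated in the following two figures, higher-quality images can be created by alternating the direction of tessellation.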

Figure: Two triangle strips with artifacts (http://i.cmpnet.com/ddj/images/article/2010/1005/100527cuda18_f6.gif)

Figure: The same data rendered as four triangle fans, with centers marked by solid circles (http://i.cmpnet.com/ddj/images/article/2010/1005/100527cuda18_f7.gif)
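
To make the restart marker concrete, the following is a minimal CPU-side sketch of how a single index stream is partitioned into separate primitives whenever the marker value appears. The helper name splitOnRestart is hypothetical and is not part of gltest.cpp or of OpenGL; the real test is performed by the OpenGL implementation, not by application code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (illustration only): emulates how a restart index
// splits one index stream into independent primitives.
std::vector<std::vector<std::uint32_t>>
splitOnRestart(const std::vector<std::uint32_t>& indices, std::uint32_t restart)
{
  std::vector<std::vector<std::uint32_t>> primitives;
  std::vector<std::uint32_t> current;
  for (std::uint32_t idx : indices) {
    if (idx == restart) {
      // The marker closes the current primitive and is never used as a vertex.
      if (!current.empty()) primitives.push_back(current);
      current.clear();
    } else {
      current.push_back(idx);
    }
  }
  if (!current.empty()) primitives.push_back(current); // trailing primitive
  return primitives;
}
```

Feeding it two four-vertex fans separated by 0xffffffff yields two index lists, which is the grouping a single draw call sees when primitive restart is enabled.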

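This article is not an OpenGL tutorial. Many excellent books and online tutorials are available; the NeHe tutorial series is one good example. The application below is intended to demonstrate OpenCL and OpenGL interoperability and the use of primitive restart.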

Building the Application

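Copy the complete source code for gltest.cpp and paste it into a file of that name. The complete source is provided after the walkthrough, near the end of this article. Likewise, copy and paste the source code for the OpenCL kernel, sinewave.cl.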

Under Linux, the source can be compiled with the following command:

g++ -I $ATISTREAMSDKROOT/include -L $ATISTREAMSDKROOT/lib/x86_64 gltest.cpp -lglut -lGLEW -lOpenCL -o gltest

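The application runs on the GPU by default. When anything is specified on the command line, the program runs on the CPU instead. This simple application requires that the OpenCL kernel reside in the file sinewave.cl in the current working directory.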

./gltest # running on the GPU
./gltest CPU # This will run the application on the CPU

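By default, the application starts by rendering a surface. Use the mouse to rotate and zoom the image. Note that the color and shape of the image change over time, which shows that the OpenCL kernel is being recomputed repeatedly.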

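Press the "D" or "d" key to cycle through the rendering modes: surface, a collection of points, a collection of lines, and then back to the surface, as shown in the following figures.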

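Figure: The default rendering as a surface

Figure: The surface rendered as a collection of colored points

Figure: The surface rendered as a collection of colored lines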

Press the "q" key to exit the program.

Walkthrough of the gltest.cpp Source Code

The initial part of the code specifies the include files, constants, and a set of global variables.

//Code by Rob Farber
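#include <iostream>
#include <fstream>
#include <string>    // std::string
#include <iterator>  // std::istreambuf_iterator
#include <cstdio>    // sprintf
#include <cstdlib>   // malloc, exit
using namespace std;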
 
#include <CL/cl.h>
#include <CL/cl_gl.h>
#include <GL/glew.h>
#include <GL/glut.h>
#ifndef _WIN32
#include <GL/glxew.h>
#endif //!_WIN32
 
#define WIDTH  1408
#define HEIGHT 1024
 
// Globals used in the program
const unsigned int      mesh_width = 128, mesh_height = 128;
const unsigned int RestartIndex = 0xffffffff;
 
cl_platform_id          platform;         
cl_device_id            device;
cl_context              context;
cl_command_queue        queue;
cl_program              program;
cl_kernel               kernel;
size_t                  kernelsize;
size_t                  global[] = {mesh_width, mesh_height};

char                    *pathname = NULL;
char                    *source = NULL;

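A VBO (vertex buffer object) is an OpenGL memory buffer that lives in high-speed memory on the GPU. The following source shows that gltest.cpp uses two VBOs: one holds the coordinates of the surface's vertices and the other holds the color of each vertex. VBOs can also store normals, texture coordinates, indices, and other data.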

// Globals associated with the position vbo
const unsigned int p_vbo_size = mesh_width*mesh_height*4*sizeof(float); 
GLuint  p_vbo;
cl_mem  p_vbocl;
 
// Globals associated with the color vbo
const unsigned int c_vbo_size = mesh_width*mesh_height*4*sizeof(unsigned char); 
GLuint  c_vbo;
cl_mem  c_vbocl;

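Primitive restart requires a set of indices that point to each data location (or hold the primitive restart marker). The qIndices array holds these indices.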

// Globals associated with the indices for primitive restart
GLuint* qIndices=NULL;
int qIndices_size = 5*(mesh_height-1)*(mesh_width-1);
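
As a quick check of the sizes declared above, the arithmetic can be sketched as follows. The function names vboBytes and fanIndexCount are hypothetical and are not part of gltest.cpp; the sketch assumes four components per vertex and one four-vertex fan plus one restart marker per grid cell, as in the declarations above.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed to store one 4-component attribute for every vertex
// of a width x height mesh (hypothetical helper).
constexpr std::size_t vboBytes(std::size_t width, std::size_t height,
                               std::size_t bytesPerComponent)
{
  return width * height * 4 * bytesPerComponent;
}

// Number of indices when each grid cell contributes four fan vertices
// plus one restart marker (hypothetical helper).
constexpr std::size_t fanIndexCount(std::size_t width, std::size_t height)
{
  return 5 * (width - 1) * (height - 1);
}

// 128 x 128 mesh, 4-byte float components: 256 KiB of positions.
static_assert(vboBytes(128, 128, 4) == 262144, "position VBO size");
// 128 x 128 mesh, 1-byte color components: 64 KiB of colors.
static_assert(vboBytes(128, 128, 1) == 65536, "color VBO size");
// 127 x 127 cells, 5 indices each.
static_assert(fanIndexCount(128, 128) == 80645, "index count");
```

For the 128 x 128 mesh used here, the index array is therefore larger than either VBO when stored as 32-bit values, which is one reason an application might later choose to keep the indices in a buffer object as well.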

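This code uses the OpenGL Utility Toolkit (GLUT), a portable toolkit that runs on UNIX, Windows, and other operating systems. GLUT provides a simple API for registering callback functions that handle mouse, window, keyboard, and other events.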

float   anim = 0.0;
int drawMode=GL_TRIANGLE_FAN; // the default draw mode
const char* drawStr="fan";
const char* platformString="notset";

// Globals associated with the mouse controls
int mouse_old_x, mouse_old_y;
int mouse_buttons = 0;
float rotate_x = 0.0, rotate_y = 0.0;
float translate_z = -2.5;
 
// Forward references for the GLUT callbacks
void display();
void motion(int x, int y);
void mouse(int button, int state, int x, int y);
void keyboard(unsigned char key, int x, int y);
void initgl(int argc, const char** argv);

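For convenience, the window title is used to report important characteristics such as the device (CPU or GPU) and the draw mode. This information is set by the following helper function: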

// helper routine to set the window title
void setTitle()
{
  char title[256];
  sprintf(title, "GL Interop Wrapper: mode %s device %s",
         drawStr, platformString);  
  glutSetWindowTitle(title);
}
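
The helper above relies on the title fitting in 256 bytes. A slightly more defensive variant, sketched here under the assumption that a bounded format call is preferred, uses std::snprintf so that an unexpectedly long string is truncated instead of overflowing the buffer. The function name makeTitle is hypothetical and does not appear in gltest.cpp.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical variant of the title formatting: builds the same text as
// setTitle() but bounds the write to the size of the buffer.
std::string makeTitle(const char* mode, const char* device)
{
  char title[256];
  std::snprintf(title, sizeof(title),
                "GL Interop Wrapper: mode %s device %s", mode, device);
  return std::string(title);
}
```

The returned string could then be passed to glutSetWindowTitle via c_str().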

The first part of main initializes the window and provides very simple command-line handling to select the CPU or the GPU.

int main(int argc, const char **argv) 
{
  initgl(argc, argv);
  
  clGetPlatformIDs(1, &platform, NULL);
  if(argc > 1) {
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_CPU, 1, &device, NULL);
    platformString = "CPU";
  } else {
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);
    platformString = "GPU";
  }

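Next, the OpenCL context and queue are created. Note that the OpenCL context must be bound to the OpenGL context, as shown by the #ifdef preprocessor statements. Otherwise, nothing will be displayed.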

  // It is necessary to add the gl context to the properties or
  // nothing will display
#ifdef _WIN32
  HGLRC glCtx = wglGetCurrentContext();
#else //!_WIN32
  GLXContext glCtx = glXGetCurrentContext();
#endif //!_WIN32
  
  cl_context_properties props[] = { CL_CONTEXT_PLATFORM, 
                  (cl_context_properties)platform,
#ifdef _WIN32
                  CL_WGL_HDC_KHR, (intptr_t) wglGetCurrentDC(),
#else //!_WIN32
                  CL_GLX_DISPLAY_KHR, (intptr_t) glXGetCurrentDisplay(),
#endif //!_WIN32
                  CL_GL_CONTEXT_KHR, (intptr_t) glCtx, 0};
  
  // Create the context and the queue
  context = clCreateContext(props, 1, &device, NULL, NULL, NULL);
  
  queue = clCreateCommandQueue(context, device, 0, NULL);

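Both the position and color VBOs are then created. Note that the glBufferData call allocates the space on the device. Each VBO is mapped into the OpenCL memory space by calling clCreateFromGLBuffer, which returns a cl_mem object that can be passed to an OpenCL kernel as a global memory pointer.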

  // create position p_vbo
  glGenBuffers(1, &p_vbo);
  glBindBuffer(GL_ARRAY_BUFFER, p_vbo);
  // initialize buffer object
  glBufferData(GL_ARRAY_BUFFER, p_vbo_size, 0, GL_DYNAMIC_DRAW);        
  // create OpenCL buffer from GL VBO
  p_vbocl = clCreateFromGLBuffer(context, CL_MEM_WRITE_ONLY, p_vbo, NULL);
 
  // create color c_vbo (very similar to the position vbo)
  glGenBuffers(1, &c_vbo);
  glBindBuffer(GL_ARRAY_BUFFER, c_vbo);
  glBufferData(GL_ARRAY_BUFFER, c_vbo_size, 0, GL_DYNAMIC_DRAW);        
  c_vbocl = clCreateFromGLBuffer(context, CL_MEM_WRITE_ONLY, c_vbo, NULL);

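Although this is a C API OpenCL program, C++ is used to load the program source. The OpenCL program is then built, the kernel arguments are specified, and the indices for primitive restart are defined.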

  // For convenience use C++ to load the program source into memory
  ifstream file("sinewave.cl");
  string prog(istreambuf_iterator<char>(file), (istreambuf_iterator<char>()));
  file.close();
  const char* source = prog.c_str();
  const size_t kernelsize = prog.length()+1;
  program = clCreateProgramWithSource(context, 1, (const char**) &source,
                                 &kernelsize, NULL);
 
  // Build the program executable
  int err = clBuildProgram(program, 0, NULL, NULL, NULL, NULL);
  if (err != CL_SUCCESS) {
    size_t len;
    char buffer[2048];
    
    cerr << "Error: Failed to build program executable!" << endl;
    clGetProgramBuildInfo(program, device, CL_PROGRAM_BUILD_LOG,
                       sizeof(buffer), buffer, &len);
    cerr << buffer << endl;
    exit(1);
  }
  
  // Create the compute kernel in the program
  kernel = clCreateKernel(program, "sinewave", &err);
  if (!kernel || err != CL_SUCCESS) {
    cerr << "Error: Failed to create compute kernel!" << endl;
    exit(1);
  }
  
  // Set the kernel arguments. Note argument 3 is set in display
  clSetKernelArg(kernel, 0, sizeof(cl_mem), (void*)&p_vbocl);
  clSetKernelArg(kernel, 1, sizeof(unsigned int), &mesh_width);
  clSetKernelArg(kernel, 2, sizeof(unsigned int), &mesh_height);
  clSetKernelArg(kernel, 4, sizeof(cl_mem), (void*)&c_vbocl);
  
  // Generate the indices for primitive restart
  // allocate and assign triangle fan indices
  qIndices = (GLuint *) malloc(qIndices_size*sizeof(GLuint));
  int index=0;
  for(int i=1; i < mesh_height; i++) {
    for(int j=1; j < mesh_width; j++) {
      qIndices[index++] = (i)*mesh_width + j; 
      qIndices[index++] = (i)*mesh_width + j-1; 
      qIndices[index++] = (i-1)*mesh_width + j-1; 
      qIndices[index++] = (i-1)*mesh_width + j; 
      qIndices[index++] = RestartIndex;
    }
  }
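
The index layout produced by the loop above is easier to see on a tiny mesh. The following sketch reproduces the same pattern with a std::vector; the function name makeFanIndices is hypothetical and is not part of gltest.cpp.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: one 4-vertex triangle fan per grid cell, each
// followed by the restart marker, in the same order as the loop above.
std::vector<std::uint32_t> makeFanIndices(unsigned width, unsigned height,
                                          std::uint32_t restart)
{
  assert(width >= 2 && height >= 2); // need at least one cell
  std::vector<std::uint32_t> idx;
  idx.reserve(5u * (width - 1) * (height - 1));
  for (unsigned i = 1; i < height; ++i) {
    for (unsigned j = 1; j < width; ++j) {
      idx.push_back(i * width + j);           // fan center
      idx.push_back(i * width + j - 1);
      idx.push_back((i - 1) * width + j - 1);
      idx.push_back((i - 1) * width + j);
      idx.push_back(restart);                 // end of this fan
    }
  }
  return idx;
}
```

For a 3 x 3 mesh this produces 20 entries, and the first cell is described by the indices 4, 3, 0, 1 followed by the restart marker.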

The main method finishes by setting the window title and calling the GLUT main loop, which never returns.

  setTitle(); 
  glutMainLoop();
}

The initgl method does all the work of setting up the window, registering the callbacks, and defining the initial viewing conditions.

// setup the window and assign callbacks
void initgl(int argc, const char** argv) 
{
  glutInit(&argc, (char**)argv);
  glutInitDisplayMode(GLUT_RGBA | GLUT_DOUBLE);
  glutInitWindowPosition (glutGet(GLUT_SCREEN_WIDTH)/2 - WIDTH/2, 
                       glutGet(GLUT_SCREEN_HEIGHT)/2 - HEIGHT/2);
  glutInitWindowSize(WIDTH, HEIGHT);
  glutCreateWindow("");
  
  glutDisplayFunc(display);       // register GLUT callback functions
  glutKeyboardFunc(keyboard);
  glutMouseFunc(mouse);
  glutMotionFunc(motion);
 
  glewInit();
  
  glClearColor(0.0, 0.0, 0.0, 1.0);
  glDisable(GL_DEPTH_TEST);
  
  glViewport(0, 0, WIDTH, HEIGHT);
  glMatrixMode(GL_PROJECTION);
  glLoadIdentity();
  gluPerspective(60.0, (GLfloat)WIDTH / (GLfloat)HEIGHT, 0.1, 10.0);
  
  glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
  glMatrixMode(GL_MODELVIEW);
  glLoadIdentity();
  return;
}

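The display method is where all the work happens. Essentially, the OpenGL buffer objects are acquired and the kernel is queued. After the kernel completes, the OpenGL buffers are released.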

// This method is called everytime the screen is redisplayed. No
// optimization is performed as it recalculates the kernel every time.
void display() 
{
  anim += 0.01f;
  
  // map OpenGL buffer object for writing from OpenCL
  glFinish();
  clEnqueueAcquireGLObjects(queue, 1, &p_vbocl, 0,0,0);
  clEnqueueAcquireGLObjects(queue, 1, &c_vbocl, 0,0,0);
  
  // Set arg 3 and queue the kernel
  clSetKernelArg(kernel, 3, sizeof(float), &anim);
  clEnqueueNDRangeKernel(queue, kernel, 2, NULL, global, NULL, 0, 0, 0);
  
  // queue unmap buffer object
  clEnqueueReleaseGLObjects(queue, 1, &c_vbocl, 0,0,0);
  clEnqueueReleaseGLObjects(queue, 1, &p_vbocl, 0,0,0);
  clFinish(queue);
  
  // clear graphics
  glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

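The newly computed data is then rendered according to the draw mode, after which display cleans up. Note that primitive restart is enabled with a call to glEnableClientState, and that the unique index is specified to the OpenGL state machine with glPrimitiveRestartIndexNV. glDrawElements is called only once to render all of the triangle fans.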

  // Apply the image transforms
  glMatrixMode(GL_MODELVIEW);
  glLoadIdentity();
  glTranslatef(0.0, 0.0, translate_z);
  glRotatef(rotate_x, 1.0, 0.0, 0.0);
  glRotatef(rotate_y, 0.0, 1.0, 0.0);

  //render from the p_vbo
  glBindBuffer(GL_ARRAY_BUFFER, p_vbo);
  glVertexPointer(4, GL_FLOAT, 0, 0);
  glEnableClientState(GL_VERTEX_ARRAY);

  // enable colors from the c_vbo
  glBindBuffer(GL_ARRAY_BUFFER, c_vbo);
  glColorPointer(4, GL_UNSIGNED_BYTE, 0, 0);
  glEnableClientState(GL_COLOR_ARRAY);

  // draw points, lines or triangles according to the user keyboard input
  switch(drawMode) {
  case GL_LINE_STRIP:
    for(int i=0 ; i < mesh_width*mesh_height; i+= mesh_width)
      glDrawArrays(GL_LINE_STRIP, i, mesh_width);
    break;
  case GL_TRIANGLE_FAN:
    glPrimitiveRestartIndexNV(RestartIndex);
    glEnableClientState(GL_PRIMITIVE_RESTART_NV);
    glDrawElements(GL_TRIANGLE_FAN, qIndices_size, GL_UNSIGNED_INT, qIndices);
    break;
  default:
    glDrawArrays(GL_POINTS, 0, mesh_width * mesh_height);
    break;
  }
  
  // handle housekeeping and redisplay
  glDisableClientState(GL_COLOR_ARRAY);
  glDisableClientState(GL_VERTEX_ARRAY);
  glutSwapBuffers();
  glutPostRedisplay();
}

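The keyboard callback handles keyboard input. Pressing "D" or "d" cycles between rendering the surface, a set of points, a set of lines, and back to the surface. Pressing "q" or the ESC key terminates the application.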

// Keyboard events handler for GLUT
void keyboard(unsigned char key, int x, int y)
{
  switch(key) {
  case('q') :
  case(27) :
    exit(0);
  break;
  case 'd':
  case 'D':
    switch(drawMode) {
    case GL_POINTS: drawMode = GL_LINE_STRIP; drawStr = "line"; break;
    case GL_LINE_STRIP: drawMode = GL_TRIANGLE_FAN; drawStr = "fan"; break;
    default: drawMode=GL_POINTS; drawStr = "points"; break;
    }
  } 
  setTitle();
  glutPostRedisplay();
}
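
The nested switch in the keyboard callback is a three-state cycle. A minimal sketch of that cycle, independent of OpenGL, is shown below; the enum DrawMode, the struct ModeInfo, and the function nextMode are hypothetical names introduced only for illustration, standing in for the GL_* constants and the drawStr global.

```cpp
#include <cassert>

// Hypothetical stand-ins for GL_POINTS, GL_LINE_STRIP and GL_TRIANGLE_FAN.
enum class DrawMode { Points, Lines, Fan };

struct ModeInfo {
  DrawMode    mode;   // the next draw mode
  const char* label;  // text shown in the window title
};

// Mirrors the inner switch of keyboard(): points -> line -> fan -> points.
ModeInfo nextMode(DrawMode current)
{
  switch (current) {
    case DrawMode::Points: return {DrawMode::Lines, "line"};
    case DrawMode::Lines:  return {DrawMode::Fan, "fan"};
    default:               return {DrawMode::Points, "points"};
  }
}
```

Starting from the default fan mode, three key presses return to the fan mode, which matches the surface, points, lines, surface sequence described earlier.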

The mouse handlers follow.

// Mouse event handler for GLUT
void mouse(int button, int state, int x, int y)
{
  if (state == GLUT_DOWN) {
    mouse_buttons |= 1<<button;
  } else if (state == GLUT_UP) {
    mouse_buttons = 0;
  }
  
  mouse_old_x = x;
  mouse_old_y = y;
  glutPostRedisplay();
}
 
// Motion event handler for GLUT
void motion(int x, int y)
{
  float dx, dy;
  dx = x - mouse_old_x;
  dy = y - mouse_old_y;
  
  if (mouse_buttons & 1) {
    rotate_x += dy * 0.2;
    rotate_y += dx * 0.2;
  } else if (mouse_buttons & 4) {
    translate_z += dy * 0.01;
  }
  
  mouse_old_x = x;
  mouse_old_y = y;
}
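
The tests against 1 and 4 in motion() follow from the GLUT button numbering, in which the left, middle, and right buttons are 0, 1, and 2, so that 1<<button gives the masks 1, 2, and 4. The sketch below restates that logic without GLUT; the constants and function names are hypothetical and only mirror the values of GLUT_LEFT_BUTTON, GLUT_MIDDLE_BUTTON, and GLUT_RIGHT_BUTTON.

```cpp
#include <cassert>

// Hypothetical constants mirroring the GLUT button numbers.
constexpr int kLeftButton = 0, kMiddleButton = 1, kRightButton = 2;

struct MouseState { int buttons = 0; };

// Same bookkeeping as mouse(): set a bit on press, clear all on release.
void press(MouseState& s, int button) { s.buttons |= 1 << button; }
void releaseAll(MouseState& s)        { s.buttons = 0; }

// Same decisions as motion(): the left button rotates; the right button
// zooms only when the left button is not held.
bool rotating(const MouseState& s) { return (s.buttons & 1) != 0; }
bool zooming(const MouseState& s)  { return !rotating(s) && (s.buttons & 4) != 0; }
```

Note that, as in the original handler, releasing any button clears the whole mask, so holding two buttons and releasing one stops both actions.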

The complete source code for gltest.cpp follows:

//Code by Rob Farber
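#include <iostream>
#include <fstream>
#include <string>    // std::string
#include <iterator>  // std::istreambuf_iterator
#include <cstdio>    // sprintf
#include <cstdlib>   // malloc, exit
using namespace std;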
 
#include <CL/cl.h>
#include <CL/cl_gl.h>
#include <GL/glew.h>
#include <GL/glut.h>
#ifndef _WIN32
#include <GL/glxew.h>
#endif //!_WIN32
 
#define WIDTH  1408
#define HEIGHT 1024
 
// Globals used in the program
const unsigned int      mesh_width = 128, mesh_height = 128;
const unsigned int RestartIndex = 0xffffffff;
 
cl_platform_id          platform;         
cl_device_id            device;
cl_context              context;
cl_command_queue        queue;
cl_program              program;
cl_kernel               kernel;
size_t                  kernelsize;
size_t                  global[] = {mesh_width, mesh_height};
char                    *pathname = NULL;
char                    *source = NULL; 
 
// Globals associated with the position vbo
const unsigned int p_vbo_size = mesh_width*mesh_height*4*sizeof(float); 
GLuint  p_vbo;
cl_mem  p_vbocl;
 
// Globals associated with the color vbo
const unsigned int c_vbo_size = mesh_width*mesh_height*4*sizeof(unsigned char); 
GLuint  c_vbo;
cl_mem  c_vbocl;
 
// Globals associated with the indices for primitive restart
GLuint* qIndices=NULL;
int qIndices_size = 5*(mesh_height-1)*(mesh_width-1);
float   anim = 0.0;
int drawMode=GL_TRIANGLE_FAN; // the default draw mode
const char* drawStr="fan";
const char* platformString="notset";
 
// Globals associated with the mouse controls
int mouse_old_x, mouse_old_y;
int mouse_buttons = 0;
float rotate_x = 0.0, rotate_y = 0.0;
float translate_z = -2.5;
 
// Forward references for the GLUT callbacks
void display();
void motion(int x, int y);
void mouse(int button, int state, int x, int y);
void keyboard(unsigned char key, int x, int y);
void initgl(int argc, const char** argv);
 
// helper routine to set the window title
void setTitle()
{
  char title[256];
  sprintf(title, "GL Interop Wrapper: mode %s device %s",
         drawStr, platformString);  
  glutSetWindowTitle(title);
}
 
int main(int argc, const char **argv) 
{
  initgl(argc, argv);
  
  clGetPlatformIDs(1, &platform, NULL);
  if(argc > 1) {
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_CPU, 1, &device, NULL);
    platformString = "CPU";
  } else {
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);
    platformString = "GPU";
  }
  
  // It is necessary to add the gl context to the properties or
  // nothing will display
#ifdef _WIN32
  HGLRC glCtx = wglGetCurrentContext();
#else //!_WIN32
  GLXContext glCtx = glXGetCurrentContext();
#endif //!_WIN32
  
  cl_context_properties props[] = { CL_CONTEXT_PLATFORM, 
                  (cl_context_properties)platform,
#ifdef _WIN32
                  CL_WGL_HDC_KHR, (intptr_t) wglGetCurrentDC(),
#else //!_WIN32
                  CL_GLX_DISPLAY_KHR, (intptr_t) glXGetCurrentDisplay(),
#endif //!_WIN32
                  CL_GL_CONTEXT_KHR, (intptr_t) glCtx, 0};
  
  // Create the context and the queue
  context = clCreateContext(props, 1, &device, NULL, NULL, NULL);
  
  queue = clCreateCommandQueue(context, device, 0, NULL); 
  
  // create position p_vbo
  glGenBuffers(1, &p_vbo);
  glBindBuffer(GL_ARRAY_BUFFER, p_vbo);
  // initialize buffer object
  glBufferData(GL_ARRAY_BUFFER, p_vbo_size, 0, GL_DYNAMIC_DRAW);        
  // create OpenCL buffer from GL VBO
  p_vbocl = clCreateFromGLBuffer(context, CL_MEM_WRITE_ONLY, p_vbo, NULL);
 
  // create color c_vbo (very similar to the position vbo)
  glGenBuffers(1, &c_vbo);
  glBindBuffer(GL_ARRAY_BUFFER, c_vbo);
  glBufferData(GL_ARRAY_BUFFER, c_vbo_size, 0, GL_DYNAMIC_DRAW);        
  c_vbocl = clCreateFromGLBuffer(context, CL_MEM_WRITE_ONLY, c_vbo, NULL);
  
  // For convenience use C++ to load the program source into memory
  ifstream file("sinewave.cl");
  string prog(istreambuf_iterator<char>(file), (istreambuf_iterator<char>()));
  file.close();
  const char* source = prog.c_str();
  const size_t kernelsize = prog.length()+1;
  program = clCreateProgramWithSource(context, 1, (const char**) &source,
                                 &kernelsize, NULL);
 
  // Build the program executable
  int err = clBuildProgram(program, 0, NULL, NULL, NULL, NULL);
  if (err != CL_SUCCESS) {
    size_t len;
    char buffer[2048];
    
    cerr << "Error: Failed to build program executable!" << endl;
    clGetProgramBuildInfo(program, device, CL_PROGRAM_BUILD_LOG,
                       sizeof(buffer), buffer, &len);
    cerr << buffer << endl;
    exit(1);
  }
  
  // Create the compute kernel in the program
  kernel = clCreateKernel(program, "sinewave", &err);
  if (!kernel || err != CL_SUCCESS) {
    cerr << "Error: Failed to create compute kernel!" << endl;
    exit(1);
  }
  
  // Set the kernel arguments. Note argument 3 is set in display
  clSetKernelArg(kernel, 0, sizeof(cl_mem), (void*)&p_vbocl);
  clSetKernelArg(kernel, 1, sizeof(unsigned int), &mesh_width);
  clSetKernelArg(kernel, 2, sizeof(unsigned int), &mesh_height);
  clSetKernelArg(kernel, 4, sizeof(cl_mem), (void*)&c_vbocl);
  
  // Generate the indices for primitive restart
  // allocate and assign triangle fan indices
  qIndices = (GLuint *) malloc(qIndices_size*sizeof(GLuint));
  int index=0;
  for(int i=1; i < mesh_height; i++) {
    for(int j=1; j < mesh_width; j++) {
      qIndices[index++] = (i)*mesh_width + j; 
      qIndices[index++] = (i)*mesh_width + j-1; 
      qIndices[index++] = (i-1)*mesh_width + j-1; 
      qIndices[index++] = (i-1)*mesh_width + j; 
      qIndices[index++] = RestartIndex;
    }
  }
  setTitle(); 
  glutMainLoop();
}
 
// setup the window and assign callbacks
void initgl(int argc, const char** argv) 
{
  glutInit(&argc, (char**)argv);
  glutInitDisplayMode(GLUT_RGBA | GLUT_DOUBLE);
  glutInitWindowPosition (glutGet(GLUT_SCREEN_WIDTH)/2 - WIDTH/2, 
                       glutGet(GLUT_SCREEN_HEIGHT)/2 - HEIGHT/2);
  glutInitWindowSize(WIDTH, HEIGHT);
  glutCreateWindow("");
  
  glutDisplayFunc(display);       // register GLUT callback functions
  glutKeyboardFunc(keyboard);
  glutMouseFunc(mouse);
  glutMotionFunc(motion);
 
  glewInit();
  
  glClearColor(0.0, 0.0, 0.0, 1.0);
  glDisable(GL_DEPTH_TEST);
  
  glViewport(0, 0, WIDTH, HEIGHT);
  glMatrixMode(GL_PROJECTION);
  glLoadIdentity();
  gluPerspective(60.0, (GLfloat)WIDTH / (GLfloat)HEIGHT, 0.1, 10.0);
  
  glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
  glMatrixMode(GL_MODELVIEW);
  glLoadIdentity();
  return;
}
 
// This method is called everytime the screen is redisplayed. No
// optimization is performed as it recalculates the kernel every time.
void display() 
{
  anim += 0.01f;
  
  // map OpenGL buffer object for writing from OpenCL
  glFinish();
  clEnqueueAcquireGLObjects(queue, 1, &p_vbocl, 0,0,0);
  clEnqueueAcquireGLObjects(queue, 1, &c_vbocl, 0,0,0);
  
  // Set arg 3 and queue the kernel
  clSetKernelArg(kernel, 3, sizeof(float), &anim);
  clEnqueueNDRangeKernel(queue, kernel, 2, NULL, global, NULL, 0, 0, 0);
  
  // queue unmap buffer object
  clEnqueueReleaseGLObjects(queue, 1, &c_vbocl, 0,0,0);
  clEnqueueReleaseGLObjects(queue, 1, &p_vbocl, 0,0,0);
  clFinish(queue);
  
  // clear graphics
  glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);     
 
  // Apply the image transforms
  glMatrixMode(GL_MODELVIEW);
  glLoadIdentity();
  glTranslatef(0.0, 0.0, translate_z);
  glRotatef(rotate_x, 1.0, 0.0, 0.0);
  glRotatef(rotate_y, 0.0, 1.0, 0.0);
 
  //render from the p_vbo
  glBindBuffer(GL_ARRAY_BUFFER, p_vbo);
  glVertexPointer(4, GL_FLOAT, 0, 0);
  glEnableClientState(GL_VERTEX_ARRAY);
  
  // enable colors from the c_vbo
  glBindBuffer(GL_ARRAY_BUFFER, c_vbo);
  glColorPointer(4, GL_UNSIGNED_BYTE, 0, 0);
  glEnableClientState(GL_COLOR_ARRAY);
  
  // draw points, lines or triangles according to the user keyboard input
  switch(drawMode) {
  case GL_LINE_STRIP:
    for(int i=0 ; i < mesh_width*mesh_height; i+= mesh_width)
      glDrawArrays(GL_LINE_STRIP, i, mesh_width);
    break;
  case GL_TRIANGLE_FAN:
    glPrimitiveRestartIndexNV(RestartIndex);
    glEnableClientState(GL_PRIMITIVE_RESTART_NV);
    glDrawElements(GL_TRIANGLE_FAN, qIndices_size, GL_UNSIGNED_INT, qIndices);
    break;
  default:
    glDrawArrays(GL_POINTS, 0, mesh_width * mesh_height);
    break;
  }
  
  // handle housekeeping and redisplay
  glDisableClientState(GL_COLOR_ARRAY);
  glDisableClientState(GL_VERTEX_ARRAY);
  glutSwapBuffers();
  glutPostRedisplay();
}
 
// Keyboard events handler for GLUT
void keyboard(unsigned char key, int x, int y)
{
  switch(key) {
  case('q') :
  case(27) :
    exit(0);
  break;
  case 'd':
  case 'D':
    switch(drawMode) {
    case GL_POINTS: drawMode = GL_LINE_STRIP; drawStr = "line"; break;
    case GL_LINE_STRIP: drawMode = GL_TRIANGLE_FAN; drawStr = "fan"; break;
    default: drawMode=GL_POINTS; drawStr = "points"; break;
    }
  } 
  setTitle();
  glutPostRedisplay();
}
 
// Mouse event handler for GLUT
void mouse(int button, int state, int x, int y)
{
  if (state == GLUT_DOWN) {
    mouse_buttons |= 1<<button;
  } else if (state == GLUT_UP) {
    mouse_buttons = 0;
  }
  
  mouse_old_x = x;
  mouse_old_y = y;
  glutPostRedisplay();
}
 
// Motion event handler for GLUT
void motion(int x, int y)
{
  float dx, dy;
  dx = x - mouse_old_x;
  dy = y - mouse_old_y;
  
  if (mouse_buttons & 1) {
    rotate_x += dy * 0.2;
    rotate_y += dx * 0.2;
  } else if (mouse_buttons & 4) {
    translate_z += dy * 0.01;
  }
  
  mouse_old_x = x;
  mouse_old_y = y;
}

The complete source code for sinewave.cl follows. Note that it operates on __global vectors that also happen to be OpenGL buffers.

__kernel void sinewave(__global float4* pos, unsigned int width, 
                     unsigned int height, float time, __global uchar4* color)
{
  unsigned int x = get_global_id(0);
  unsigned int y = get_global_id(1);
  
  // calculate uv coordinates
  float u = x / (float) width;
  float v = y / (float) height;
  u = u*2.0f - 1.0f;
  v = v*2.0f - 1.0f;
  
  // calculate simple sine wave pattern
  float freq = 4.0f;
  float w = sin(u*freq + time) * cos(v*freq + time) * 0.5f;
  
  // write output vertex
  pos[y*width+x] = (float4)(u, w, v, 1.0f);
  color[y*width+x] = (uchar4) (
                            (uchar) 255.f *0.5f*(1.f+sin(w+x)),
                            (uchar) 255.f *0.5f*(1.f+sin(x)*cos(y)),
                            (uchar) 255.f *0.5f*(1.f+sin(w+time/10.f)), 0 );
}
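
When experimenting with the kernel, a host-side reference for the position computation can help confirm what a single work-item writes. The following is a sketch of such a reference in plain C++; the struct Float4 and the function sinewavePos are hypothetical names that are not part of gltest.cpp, and small floating-point differences from the device result should be expected.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical host-side stand-in for the OpenCL float4 type.
struct Float4 { float x, y, z, w; };

// Computes the vertex that work-item (x, y) of the sinewave kernel
// writes to pos[y*width + x].
Float4 sinewavePos(unsigned x, unsigned y, unsigned width, unsigned height,
                   float time)
{
  // map the grid coordinates to the range [-1, 1)
  float u = x / (float) width  * 2.0f - 1.0f;
  float v = y / (float) height * 2.0f - 1.0f;

  // same sine wave pattern as the kernel
  const float freq = 4.0f;
  float w = std::sin(u * freq + time) * std::cos(v * freq + time) * 0.5f;

  return Float4{u, w, v, 1.0f};
}
```

The height value w is stored in the y component, so the surface lies in the x-z plane and oscillates between -0.5 and 0.5.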

Summary

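GPGPU devices are powerful visualization and computation devices. Exploiting the OpenCL graphics interoperability features is an excellent way to accelerate visualization applications and to make full use of these devices' capabilities. Comparing the speed of visualization on the CPU and on the GPU with the example code in this article is one way to see the difference.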

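Primitive restart is an excellent way to further accelerate visualization on GPU devices, because it keeps both the command markers and the data on the GPU and so avoids transfers across the PCIe bus. It also gives developers great flexibility when working with irregular meshes and other challenging visualization tasks. Beyond speed, primitive restart can also produce higher-quality graphics.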
