Analyzing C/C++ Matrices in the gdb Debugger Using Python and Numpy

M.Mo

16 Oct 2013

CPOL

Analyze and visualize C/C++ arrays during a debugging session using the Python API of the gdb debugger.

Download source code - 3.89 KB

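Introduction

When debugging code written in Matlab or Python, you can stop at a breakpoint, manipulate local vector and matrix variables, and plot the results. In this article I show how to use the Python API of the gdb debugger to plot and manipulate C/C++ arrays and vectors while debugging. For example, suppose we want to plot the eigenvalues of a two-dimensional array mat during a debugging session: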

float mat[10][10] = ....;

Using the attached code, we can create a numpy array in Python and plot its eigenvalues:

(gdb) py
> import gdb_numpy
> import numpy as np
> import matplotlib.pyplot as plt
> mat = gdb_numpy.to_array("mat") #Creates a numpy array that corresponds to the variable mat.
> print(mat.shape) #Prints (10, 10) once the block runs.
> y = np.linalg.eigvalsh(mat) #eigvalsh assumes that mat is symmetric.
> plt.plot(y)
> plt.show() #This is needed to show the figure, see notes below.
> end

This produces a plot of the eigenvalues.

Plot of eigenvalues using python.

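A note on using matplotlib in gdb

Before continuing, it is worth pointing out that when matplotlib.pyplot or matplotlib.pylab is used inside gdb, the show method must be called to display the figure. In addition, gdb will not respond to any command until the figure is closed.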

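Using the code

The attached code depends on the Python package numpy and creates numpy arrays directly from C/C++ pointers, arrays and STL vectors, as well as nested combinations of these types. For information on installing and using numpy, see its website. This article assumes basic familiarity with numpy. (Matlab users may find the "NumPy for MATLAB users" guide in the numpy documentation helpful.)

To install the attached code, run the setup.py script in its folder with the install argument (enter the following command in a Linux shell or a Windows command prompt):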

python setup.py install

To use the code, import the gdb_numpy module in the gdb console:

(gdb) py import gdb_numpy

To create a numpy array from a C/C++ pointer, array or vector, pass its name as a string to the to_array function of the gdb_numpy module:

(gdb) py vec = gdb_numpy.to_array("vec") #vec is now a numpy array.

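If vec is an STL vector or a built-in array, this creates a numpy array of the appropriate shape. If vec is a pointer, however, the user must supply a second argument giving its dimensions. For example, if we have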

float** mat = ...;

then the dimensions must be supplied as a tuple:

(gdb) py
> mat = gdb_numpy.to_array("mat", (10,10))
> mat = gdb_numpy.to_array("mat") #error: sizes are not provided.
> mat = gdb_numpy.to_array("mat", (10,)) #error: Not all sizes are provided.
> end
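To see why every dimension must be supplied for a pointer, consider how a conversion routine might walk the data. The following is only a sketch, not the attached module's implementation: to_nested_list is a hypothetical helper, and plain Python lists stand in for the gdb.Value objects that indexing a pointer would yield inside gdb. A pointer carries no length, so the shape tuple is the only thing that bounds each loop.

```python
def to_nested_list(value, shape):
    """Index `value` once per dimension in `shape` and collect the elements.

    `value` is anything that supports value[i]; inside gdb this would be a
    gdb.Value for a pointer or array (an assumption of this sketch).
    """
    if not isinstance(shape, tuple):
        raise TypeError("shape must be a tuple, e.g. (10,) or (10, 10)")
    if len(shape) == 0:
        # No dimensions left: `value` is a single element.
        return value
    # Walk the first dimension and recurse with the remaining ones.
    return [to_nested_list(value[i], shape[1:]) for i in range(shape[0])]


# Nested lists play the role of a float** with 2 rows and 3 columns.
fake_mat = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
print(to_nested_list(fake_mat, (2, 3)))
```

With only one size given for a two-level pointer, the recursion would stop at the row pointers, which is why a partial shape cannot produce a two-dimensional array.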

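Note that even a single dimension must be passed as a tuple. In Python, (10) is just the integer 10; a one-element tuple needs a trailing comma, as in (10,):

float* vec = ...;
(gdb) py vec = gdb_numpy.to_array("vec", (10,))
(gdb) py vec = gdb_numpy.to_array("vec", 10) #error: Dimensions must be passed as tuple.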
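The difference between a parenthesized integer and a one-element tuple is easy to overlook, so here is a small check that can be run in any Python interpreter. The normalize_shape helper is hypothetical and not part of gdb_numpy; it only illustrates how a caller could guard against passing a bare integer.

```python
def normalize_shape(shape):
    """Return `shape` as a tuple, wrapping a bare integer (hypothetical helper)."""
    if isinstance(shape, int):
        return (shape,)
    return tuple(shape)


print(type((10)).__name__)    # int: the parentheses only group the expression
print(type((10,)).__name__)   # tuple: the trailing comma makes the tuple
print(normalize_shape(10))    # (10,)
print(normalize_shape([10, 10]))  # (10, 10)
```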

The method also supports nested types, for example

std::vector<std::vector<double> > mat = ...;
(gdb) py mat = gdb_numpy.to_array("mat") #mat is a 2D numpy array

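Background

We now take a quick look at the parts of the gdb Python API used by the code (available in gdb 7 and later). For full details of the Python API, see the gdb documentation. In the gdb console, the Python interpreter is reached through the python (or py) command followed by a Python statement: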

(gdb) py print(1 + 2)
3

If the python command is given no argument, gdb enters multi-line mode; the statements run once end is entered:

(gdb) py
> x = 1 + 2
> print(x)
> end
3

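Variables of the C/C++ program being debugged can be accessed with the parse_and_eval function of the gdb module. This module is imported automatically when the Python interpreter is started from gdb.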

(gdb) py my_var = gdb.parse_and_eval("my_var")

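parse_and_eval returns an instance of gdb.Value, which holds information about the C/C++ variable. For example, the C/C++ type, which prints as its name, is available through the type attribute:

(gdb) py
> my_array = gdb.parse_and_eval("my_array")
> print(my_array.type)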
> end
double[10]

Class members can be accessed with the index operator:

(gdb) py
> my_class = gdb.parse_and_eval("my_class")
> my_data = my_class['data'] #Gives my_class.data
> end

If the variable is a pointer, indexing it dereferences the pointer at the given offset:

(gdb) py print(my_data[10])
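The indexing shown above is all that is needed to copy a run of elements into Python. The following minimal sketch assumes only that the value supports indexing and conversion with float(), both of which gdb.Value provides; read_elements is a hypothetical name, and a plain list stands in for the gdb.Value so the snippet also runs outside gdb.

```python
def read_elements(value, count):
    """Return the first `count` elements of an indexable value as floats."""
    return [float(value[i]) for i in range(count)]


# Outside gdb, a list stands in for the gdb.Value of a double* or double[].
print(read_elements([1, 2, 3, 4], 3))
```

Inside a gdb session the same helper could be called as read_elements(gdb.parse_and_eval("my_array"), 10), assuming my_array holds at least ten numeric elements.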

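This covers everything we need in order to extend the attached code.

Extending the code

The module can be extended to handle custom container types by deriving from the DeRefBase class in the deref module. Suppose we have a user-defined matrix type that we want the module to work with: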

template <typename T>
class MyMatrix
{
public:
    ....
    //Index operator.
    T& operator()(int i, int j){ return data[i*columns+j]; }
    const T& operator()(int i, int j) const { return data[i*columns+j]; }
private:
    //Underlying data
    T* data;
    //Number of rows
    int rows;
    //Number of columns
    int columns;
};

The type stores its underlying data in the data member in row-major order, so if M is an instance of MyMatrix, then M(i,j) is given by *(M.data+i*M.columns+j).
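The row-major offset can be checked with a few lines of Python. This is a sketch for illustration only; flat_index and row_col are hypothetical names that do not appear in the attached code.

```python
def flat_index(i, j, columns):
    """Offset of element (i, j) in a row-major buffer with `columns` columns."""
    return i * columns + j


def row_col(offset, columns):
    """Inverse mapping: recover (i, j) from a flat offset."""
    return divmod(offset, columns)


print(flat_index(1, 2, 4))  # 6
print(row_col(6, 4))        # (1, 2)
```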

First, we need to override the deref method of DeRefBase, which dereferences the container. This can be done by dereferencing the data member of the MyMatrix instance.

#Converts a MyMatrix instance named Mat to a gdb.Value instance in python.
(gdb) py Mat = gdb.parse_and_eval("Mat")
#Gets a gdb.Value instance that corresponds to Mat.data. (Even though data is a private member)
(gdb) py
> data = Mat['data']
> columns = int(Mat['columns']) #Gets the columns and casts to an integer
> print(data[i*columns+j]) #Gives Mat(i,j); i and j must already be defined in Python
> end

The deref method is then

def deref(self, val, indices):
    data = val['data']
    columns = int(val['columns'])
    return data[indices[0] * columns + indices[1]]

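So, for example, the following code dereferences our matrix:

#derefMyMat is an instance of the appropriate DeRef class
(gdb) py print(derefMyMat.deref(Mat,(i,j))) #Gives Mat(i,j)

Note that, like gdb_numpy.to_array, this method expects the indices as a tuple or list.

Next, we need to initialize and update some members of the class. The bounds member stores the dimensions of the matrix.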

#Constructor expected to take 3 variables:
#Mat: gdb.Value instance that represents the matrix
#shape_ind: An integer for internal bookkeeping purpose.
#shape: A tuple or list, for internal use.
def __init__(self, Mat, shape_ind, shape):
    ...
    self.bounds=[Mat['rows'], Mat['columns']]
    ...

Here the dimensions of the matrix are read from the matrix instance itself.
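Together, a deref function and the bounds are enough to enumerate every element of the container. The sketch below illustrates that idea outside gdb and is not the module's actual traversal code: a dict stands in for the gdb.Value of a MyMatrix, and collect is a hypothetical helper.

```python
def deref(val, indices):
    """Same logic as the deref method above, written as a free function."""
    data = val['data']
    columns = int(val['columns'])
    return data[indices[0] * columns + indices[1]]


def collect(val, bounds):
    """Gather all elements into nested lists using deref and the bounds."""
    rows, columns = int(bounds[0]), int(bounds[1])
    return [[deref(val, (i, j)) for j in range(columns)] for i in range(rows)]


# A dict with the same field names plays the role of the gdb.Value.
fake_mat = {'data': [0, 1, 2, 3, 4, 5], 'rows': 2, 'columns': 3}
print(collect(fake_mat, (fake_mat['rows'], fake_mat['columns'])))
```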

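If, on the other hand, the dimensions are supplied by the user, as in the pointer case, then the _get_range_from_shape method should be used to extract them from the shape argument.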

self._get_range_from_shape(2) #'2' here is the number of dimensions to extract.

This initializes the shape_ind and bounds members correctly.
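The article does not show the body of _get_range_from_shape, so the following is only a guess at the kind of bookkeeping involved, not the module's code. It assumes that shape_ind marks how many entries of the user-supplied shape have already been consumed by outer containers; take_dims is a hypothetical stand-in.

```python
def take_dims(shape, shape_ind, count):
    """Return (bounds, next_shape_ind) after consuming `count` sizes.

    Assumption of this sketch: `shape_ind` is the position in `shape`
    where this container's dimensions start.
    """
    if shape is None or shape_ind + count > len(shape):
        raise ValueError("not all sizes are provided")
    bounds = list(shape[shape_ind:shape_ind + count])
    return bounds, shape_ind + count


# An outer pointer takes one size, an inner 2D container the next two.
outer_bounds, next_ind = take_dims((4, 10, 10), 0, 1)
inner_bounds, next_ind = take_dims((4, 10, 10), next_ind, 2)
print(outer_bounds, inner_bounds, next_ind)
```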

Another class member that must be updated is val. It should be a gdb.Value instance corresponding to the dereferenced object.

self.val = self.deref(Mat,(0,0))

Since the value held in self.val is never used, it does not matter which indices are passed to deref, as long as they are valid. For example, we could equally use

self.val = self.deref(Mat, (self.bounds[0]-1, 
             self.bounds[1]-1)) #Works as long as the indices are valid.

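Finally, we need to provide a regular expression that identifies our class. It should match the type name of our class, which can be obtained through the type attribute of the corresponding gdb.Value instance.

(gdb) py my_mat = gdb.parse_and_eval("my_mat")
(gdb) py print(my_mat.type)
MyMatrix

So in our case the pattern can be ^MyMatrix.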

class DeRefMyMatrix(DeRefBase):
    pattern = re.compile('^MyMatrix')
    ....
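It helps to check a pattern against the kinds of type names gdb may print. The strings below are illustrative examples, not output captured from a real session. Because the pattern is anchored with ^, it matches a type name that starts with MyMatrix but not one where MyMatrix appears only as a template argument or after a qualifier.

```python
import re

pattern = re.compile('^MyMatrix')

# Illustrative type names (assumed, not taken from a real gdb session).
candidates = [
    "MyMatrix<double>",
    "std::vector<MyMatrix<double> >",
    "const MyMatrix<float>",
]
for type_name in candidates:
    print(type_name, bool(pattern.match(type_name)))
```

If qualified or typedef'd names turn out to be a problem, gdb.Type offers unqualified() and strip_typedefs(), which could be applied before converting the type to a string; whether the attached module does this is not stated in the article.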

To summarize, the Python class we need to write is

class DeRefMyMatrix(DeRefBase):

    pattern = re.compile('^MyMatrix')

    def __init__(self, Mat, shape_ind, shape):
        super(DeRefMyMatrix, self).__init__(Mat, shape_ind, shape)
        self.val = self.deref(Mat, [0,0]) #Updates to a dereferenced type
        self.bounds=[Mat['rows'], Mat['columns']] #The dimensions of the matrix

    def deref(self, val, indices):
        data = val['data']
        columns = int(val['columns'])
        return data[indices[0] * columns + indices[1]]

To use this class with the gdb_numpy module, we register it by adding it to the module's _container_list variable.

_container_list = [... ,deref.DeRefMyMatrix]
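One plausible way a registry like this can be used is a first-match lookup on the type name. The sketch below is an assumption about the mechanism, not the module's code; FakeVectorDeRef, FakeMyMatrixDeRef and find_handler are hypothetical stand-ins for the registered classes and the lookup.

```python
import re


class FakeVectorDeRef:
    """Stand-in for a registered class that handles std::vector."""
    pattern = re.compile(r'^std::vector')


class FakeMyMatrixDeRef:
    """Stand-in for the DeRefMyMatrix class described above."""
    pattern = re.compile(r'^MyMatrix')


def find_handler(type_name, handlers):
    """Return the first handler whose pattern matches, or None."""
    for handler in handlers:
        if handler.pattern.match(type_name):
            return handler
    return None


handlers = [FakeVectorDeRef, FakeMyMatrixDeRef]
print(find_handler("MyMatrix<double>", handlers).__name__)
print(find_handler("std::vector<MyMatrix<double> >", handlers).__name__)
print(find_handler("int", handlers))
```

Under this assumption the outermost container decides which class is chosen, and the lookup can be repeated on the element type after dereferencing.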

The gdb_numpy.to_array method can now be used with the MyMatrix class. Nested types are supported automatically; for example,

MyMatrix<MyMatrix<double> > tensor4d = ...;
std::vector<MyMatrix<double> > tensor3d = ...;
MyMatrix<std::vector<double> > another_tensor3d = ...;

will all work with the gdb_numpy.to_array method.

History

Initial submission: 13/10/13.