Embedding Python in C/C++: Part I

Jun Du

28 Sep 2005

CPOL

This article shows how to embed Python modules in C/C++ applications by using the Python/C API.

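Introduction

Inspired by the article "Embedding Python in Multi-Threaded C/C++ Applications" (Linux Journal), I felt the need for a more comprehensive treatment of embedding Python. In writing this article, I had two objectives:

To write for programmers who are more experienced in C/C++ than in Python; the tutorial takes a practical approach and omits all theoretical discussion.
To try to retain Python's cross-platform compatibility when writing the embedding code.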

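Suppose you have some modules written in Python by other people and you want to use them. You are experienced in C/C++ but fairly new to Python. You might wonder whether a tool exists that converts them to C code, the way FORTRAN can be converted. The answer is no. Some tools can help you generate an executable from a Python module. Problem solved? No. Converting the code into an executable usually makes things more complicated, because you must then work out how your C/C++ application communicates with the executable "black box".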

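I am going to introduce C/C++ programmers to the Python/C API, a C library that helps embed Python modules into C/C++ applications. The API library provides a set of C routines to initialize the Python interpreter, call into your Python modules, and finish up the embedding. The library is built with Python and distributed with all recent Python releases.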

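Part I of this article series discusses the basics of Python embedding. Part II will move on to more advanced topics. This tutorial does not teach the Python language systematically, but I will briefly describe how the Python code works when it comes up. The emphasis is on how to integrate Python modules with your C/C++ applications. See the article "Embedding Python in C/C++: Part II".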

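To use the source code, you should install a recent Python release and Visual C++ (or the GCC compiler on Linux). The environment I used for testing was Python 2.4 (Windows and Linux), Visual C++ 6.0 (Windows), and GCC 3.2 (RedHat 8.0 Linux). With Visual C++, select the Release configuration to build. The Debug configuration requires the Python debug library "python24_d.lib", which is not delivered with normal distributions.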

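Background

Python is a powerful interpreted language, like Java, Perl and PHP. It supports a long list of features that any programmer would expect; my two favorites are that it is simple and that it is portable. Together with the available tools and libraries, Python makes a good language for modeling and simulation developers. Best of all, it is free, and the tools and libraries written for Python programmers are also free. For more details on the language, visit the official website.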

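Embedding Basics: Functions, Classes and Methods

First, let us start with a sample C program that calls a function within a Python module. Here is the source file "call_function.c":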

// call_function.c - A sample of calling 
// python functions from C code
// 
#include <Python.h>

int main(int argc, char *argv[])
{
    PyObject *pName, *pModule, *pDict, *pFunc, *pValue;

    if (argc < 3) 
    {
        printf("Usage: exe_name python_source function_name\n");
        return 1;
    }

    // Initialize the Python Interpreter
    Py_Initialize();

    // Build the name object
    pName = PyString_FromString(argv[1]);

    // Load the module object
    pModule = PyImport_Import(pName);

    // pDict is a borrowed reference 
    pDict = PyModule_GetDict(pModule);

    // pFunc is also a borrowed reference 
    pFunc = PyDict_GetItemString(pDict, argv[2]);

    if (PyCallable_Check(pFunc)) 
    {
        // The call returns a new reference (or NULL on error)
        pValue = PyObject_CallObject(pFunc, NULL);
        Py_XDECREF(pValue);
    } else 
    {
        PyErr_Print();
    }

    // Clean up
    Py_DECREF(pModule);
    Py_DECREF(pName);

    // Finish the Python Interpreter
    Py_Finalize();

    return 0;
}

The Python source file "py_function.py" is as follows:

'''py_function.py - Python source designed to '''
'''demonstrate the use of python embedding'''

def multiply():
    c = 12345*6789
    print 'The result of 12345 x 6789 :', c
    return c

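Note that checks for the validity of objects are omitted for brevity. On Windows, simply compile the C source to get an executable, which we call "call_function.exe". To run it, enter the command line "call_function py_function multiply". The first argument after the program name (argv[1]) is the name of the Python file without its extension, which becomes the module name once loaded. The next argument (argv[2]) is the name of the Python function to call within the module. The Python source does not take part in compiling or linking; it is only loaded and interpreted at run time. The output from the execution is: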

The result of 12345 x 6789 : 83810205

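The C code itself is self-explanatory, except that:

Everything in Python is an object. pDict and pFunc are borrowed references, so we do not need to Py_DECREF() them.
All the Py_XXX and PyXXX_XXX calls are Python/C API calls.
The code will compile and run on all the platforms that Python supports.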
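The reference-counting idea behind the first note can be observed from the Python side. The following is only an illustrative sketch and not part of the original sample code: the helper name refcount_delta is made up here, and sys.getrefcount() is specific to CPython, with absolute values that vary between versions, so the sketch looks only at differences.

```python
import sys


def refcount_delta():
    """Return how an object's reference count changes as an owner comes and goes."""
    obj = object()                      # a fresh object nothing else refers to
    before = sys.getrefcount(obj)

    holder = [obj]                      # the list now owns one more reference
    during = sys.getrefcount(obj)

    del holder                          # dropping the owner releases that reference,
    after = sys.getrefcount(obj)        # much as Py_DECREF() does in C

    return during - before, after - before


if __name__ == "__main__":
    print(refcount_delta())
```

A borrowed reference such as pDict or pFunc corresponds to looking at obj without ever becoming one of its owners, which is why the C code has nothing to release for it.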

Now we want to pass arguments to the Python function. We add a block to handle the arguments to the call:

if (PyCallable_Check(pFunc)) 
{
    // Prepare the argument list for the call
    if (argc > 3)
    {
        pArgs = PyTuple_New(argc - 3);
        for (i = 0; i < argc - 3; i++)
        {
            pValue = PyInt_FromLong(atoi(argv[i + 3]));
            if (!pValue)
            {
                PyErr_Print();
                return 1;
            }
            // PyTuple_SetItem() steals the reference to pValue
            PyTuple_SetItem(pArgs, i, pValue);
        }

        pValue = PyObject_CallObject(pFunc, pArgs);

        if (pArgs != NULL)
        {
            Py_DECREF(pArgs);
        }
    } else
    {
        pValue = PyObject_CallObject(pFunc, NULL);
    }

    if (pValue != NULL) 
    {
        // PyInt_AsLong() returns a C long, hence %ld
        printf("Return of call : %ld\n", PyInt_AsLong(pValue));
        Py_DECREF(pValue);
    }
    else 
    {
        PyErr_Print();
    }
    
    // some code omitted...
}

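The new C source adds a block that prepares the argument list for the call, plus a check of the return value. It creates a tuple (a list-like type) to store all the arguments for the call. You can run the command "call_function py_source multiply1 6 7" and get the output: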

The result of 6 x 7 : 42
Return of call : 42
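The listing of py_function.py shows only multiply(); the multiply1 function named on this command line is not listed in the article. A minimal sketch of what such a function might look like, assuming it takes two integers and otherwise mirrors multiply(). The body is a guess made for illustration, and the message is built as one formatted string so that the same code runs under both Python 2 and Python 3:

```python
def multiply1(a, b):
    """Hypothetical two-argument counterpart of multiply()."""
    c = a * b
    # One pre-formatted string keeps the output identical on Python 2 and 3.
    print('The result of %d x %d : %d' % (a, b, c))
    return c
```

Because the C code builds its argument tuple with PyInt_FromLong(), both parameters arrive as Python integers.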

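Writing classes in Python is easy. Using a Python class from C code is easy too. All you need to do is create an instance of the class and call its methods, just as you call normal functions. Here is an example: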

// call_class.c - A sample of python embedding 
// (calling python classes from C code)
//
#include <Python.h>

int main(int argc, char *argv[])
{
    PyObject *pName, *pModule, *pDict, 
                  *pClass, *pInstance, *pValue;
    int i, arg[2];

    if (argc < 4) 
    {
        printf(
          "Usage: exe_name python_file class_name function_name\n");
        return 1;
    }

    // some code omitted...
   
    // Build the name of a callable class 
    pClass = PyDict_GetItemString(pDict, argv[2]);

    // Create an instance of the class
    pInstance = NULL;
    if (PyCallable_Check(pClass))
    {
        pInstance = PyObject_CallObject(pClass, NULL); 
    }
    if (pInstance == NULL)
    {
        PyErr_Print();
        return 1;
    }

    // Build the parameter list
    if (argc > 4)
    {
        // arg[] holds two values: read at most two, default the rest to 0
        arg[0] = arg[1] = 0;
        for (i = 0; i < argc - 4 && i < 2; i++)
        {
            arg[i] = atoi(argv[i + 4]);
        }
        // Call a method of the class with two parameters
        pValue = PyObject_CallMethod(pInstance, 
                    argv[3], "(ii)", arg[0], arg[1]);
    } else
    {
        // Call a method of the class with no parameters
        pValue = PyObject_CallMethod(pInstance, argv[3], NULL);
    }
    if (pValue != NULL) 
    {
        printf("Return of call : %ld\n", PyInt_AsLong(pValue));
        Py_DECREF(pValue);
    }
    else 
    {
        PyErr_Print();
    }
   
    // some code omitted...
}

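The third argument of PyObject_CallMethod(), "(ii)", is a format string indicating that the arguments that follow are two integers. Note that PyObject_CallMethod() takes C variable types as its arguments, not Python objects. This differs from the other calls we have seen so far. The Python source file "py_class.py" is as follows: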

'''py_class.py - Python source designed to demonstrate''' 
'''the use of python embedding'''

class Multiply: 
    def __init__(self): 
        self.a = 6 
        self.b = 5 
    
    def multiply(self):
        c = self.a*self.b
        print 'The result of', self.a, 'x', self.b, ':', c
        return c
    
    def multiply2(self, a, b):
        c = a*b
        print 'The result of', a, 'x', b, ':', c
        return c

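To run the application, you add the class name between the module name and the function name, which is "Multiply" in this case. The command line becomes "call_class py_class Multiply multiply" or "call_class py_class Multiply multiply2 9 9".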
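What call_class.c does through the C API can be mirrored in a few lines of Python, which may help readers who are newer to the language follow the sequence: look the class up by name, call it to create an instance, then look the method up by name and call it with a tuple of arguments. This is an illustrative sketch only. The helper name call_method is invented here, and the class is defined inline rather than imported from py_class.py so that the snippet is self-contained.

```python
class Multiply:
    """Inline stand-in for the class in py_class.py."""

    def __init__(self):
        self.a = 6
        self.b = 5

    def multiply(self):
        return self.a * self.b

    def multiply2(self, a, b):
        return a * b


def call_method(namespace, class_name, method_name, *args):
    """Hypothetical helper: instantiate a class by name and call one of its methods."""
    cls = namespace[class_name]          # like PyDict_GetItemString(pDict, argv[2])
    if not callable(cls):                # like PyCallable_Check(pClass)
        raise TypeError('%s is not callable' % class_name)
    instance = cls()                     # like PyObject_CallObject(pClass, NULL)
    method = getattr(instance, method_name)
    return method(*args)                 # like PyObject_CallMethod(..., "(ii)", a, b)


if __name__ == "__main__":
    names = {'Multiply': Multiply}
    print(call_method(names, 'Multiply', 'multiply'))
    print(call_method(names, 'Multiply', 'multiply2', 9, 9))
```

The dictionary passed as namespace plays the role of the module dictionary that PyModule_GetDict() returns in the C code.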

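Multi-Threaded Python Embedding

With the above preparations, we are ready for some serious business. The Python module and your C/C++ application sometimes have to run at the same time. This is not uncommon in the simulation community. For instance, the Python module to be embedded is part of a real-time simulation, and you run it in parallel with the rest of your simulation. Meanwhile, it interacts with the rest at run time. One conventional technique is multi-threading. There are several options for multi-threaded embedding. We discuss two of them here.

In one approach, you create a separate thread in C and call the Python module from the thread function. This is natural and correct, except that you need to protect the Python interpreter state. Basically, we lock the Python interpreter before using it and release it afterwards, so that Python can track its state for the different calling threads. Python provides the global lock for this purpose. Let us look at some source code first. Here is the complete content of "call_thread.c":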

// call_thread.c - A sample of python embedding 
// (C thread calling python functions)
// 
#include <Python.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>

#ifdef WIN32    // Windows includes
#include <Windows.h>
#include <process.h>
#define sleep(x) Sleep(1000*(x))
HANDLE handle;
#else    // POSIX includes
#include <pthread.h>
pthread_t mythread;
#endif

void ThreadProc(void*);

#define NUM_ARGUMENTS 5
typedef struct 
{
   int argc;
   char *argv[NUM_ARGUMENTS]; 
} CMD_LINE_STRUCT;

int main(int argc, char *argv[])
{
    int i;
    CMD_LINE_STRUCT cmd;

    if (argc < 3) 
    {
        fprintf(stderr,
          "Usage: call python_filename function_name [args]\n");
        return 1;
    }

    // Copy at most NUM_ARGUMENTS arguments; unused slots are set to NULL
    cmd.argc = (argc < NUM_ARGUMENTS) ? argc : NUM_ARGUMENTS;
    for( i = 0; i < NUM_ARGUMENTS; i++ )
    {
        cmd.argv[i] = (i < argc) ? argv[i] : NULL;
    }

    // Create a thread
#ifdef WIN32
    // Windows code
    handle = (HANDLE) _beginthread( ThreadProc,0,&cmd);
#else
    // POSIX code
    pthread_create( &mythread, NULL, 
                 ThreadProc, (void*)&cmd );
#endif

    // Random testing code
    for(i = 0; i < 10; i++)
    {
        printf("Printed from the main thread.\n");
        sleep(1);
    }

    printf("Main Thread waiting for My Thread to complete...\n");

    // Join and wait for the created thread to complete...
#ifdef WIN32
    // Windows code
    WaitForSingleObject(handle,INFINITE);
#else
    // POSIX code
    pthread_join(mythread, NULL);
#endif

    printf("Main thread finished gracefully.\n");

    return 0;
}

void ThreadProc( void *data )
{
    int i;
    PyObject *pName, *pModule, *pDict, 
               *pFunc, *pInstance, *pArgs, *pValue;
    PyThreadState *mainThreadState, *myThreadState, *tempState;
    PyInterpreterState *mainInterpreterState;
    
    CMD_LINE_STRUCT* arg = (CMD_LINE_STRUCT*)data;

    // Random testing code
    for(i = 0; i < 15; i++)
    {
        printf("...Printed from my thread.\n");
        sleep(1);
    }

    // Initialize python interpreter
    Py_Initialize();
        
    // Initialize thread support
    PyEval_InitThreads();

    // Save a pointer to the main PyThreadState object
    mainThreadState = PyThreadState_Get();

    // Get a reference to the PyInterpreterState
    mainInterpreterState = mainThreadState->interp;

    // Create a thread state object for this thread
    myThreadState = PyThreadState_New(mainInterpreterState);
    
    // Release global lock
    PyEval_ReleaseLock();
    
    // Acquire global lock
    PyEval_AcquireLock();

    // Swap in my thread state
    tempState = PyThreadState_Swap(myThreadState);

    // Now execute some python code (call python functions)
    pName = PyString_FromString(arg->argv[1]);
    pModule = PyImport_Import(pName);

    // pDict and pFunc are borrowed references 
    pDict = PyModule_GetDict(pModule);
    pFunc = PyDict_GetItemString(pDict, arg->argv[2]);

    if (PyCallable_Check(pFunc)) 
    {
        // The call returns a new reference (or NULL on error)
        pValue = PyObject_CallObject(pFunc, NULL);
        Py_XDECREF(pValue);
    }
    else {
        PyErr_Print();
    }

    // Clean up
    Py_DECREF(pModule);
    Py_DECREF(pName);

    // Swap out the current thread
    PyThreadState_Swap(tempState);

    // Release global lock
    PyEval_ReleaseLock();
    
    // Clean up thread state
    PyThreadState_Clear(myThreadState);
    PyThreadState_Delete(myThreadState);

    Py_Finalize();
    printf("My thread is finishing...\n");

    // Exiting the thread
#ifdef WIN32
    // Windows code
    _endthread();
#else
    // POSIX code
    pthread_exit(NULL);
#endif
}

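The thread function needs a bit of explanation. PyEval_InitThreads() initializes Python's thread support. PyThreadState_Swap(myThreadState) swaps in the state for the current thread, and PyThreadState_Swap(tempState) swaps it out. The Python interpreter saves what happens between the two calls as the state data related to this thread. In fact, Python keeps separate data for each thread that uses the interpreter, so that the thread states do not interfere with one another. It is your responsibility, however, to create and maintain a state for each C thread. You may wonder why the first PyEval_ReleaseLock() is not preceded by a matching PyEval_AcquireLock(). The reason is that PyEval_InitThreads() acquires the global lock itself. In other cases, we do need to use PyEval_AcquireLock() and PyEval_ReleaseLock() in pairs.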

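Run "call_thread py_thread pythonFunc" and you get the output shown below. The file "py_thread.py" defines a function called pythonFunc(), in which a similar random testing block prints "print from pythonFunc..." to the screen fifteen times.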

Printed from the main thread.
...Printed from my thread.
Printed from the main thread.
...Printed from my thread.
Printed from the main thread.
...Printed from my thread.
Printed from the main thread.
...Printed from my thread.
Printed from the main thread.
...Printed from my thread.
...Printed from my thread.
Printed from the main thread.
Printed from the main thread.
...Printed from my thread.
Printed from the main thread.
...Printed from my thread.
Printed from the main thread.
...Printed from my thread.
Printed from the main thread.
...Printed from my thread.
Main Thread waiting for My Thread to complete...
...Printed from my thread.
...Printed from my thread.
...Printed from my thread.
...Printed from my thread.
...Printed from my thread.
My thread is finishing...
Main thread finished gracefully.
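py_thread.py itself is not listed in this article. A minimal sketch of pythonFunc(), assuming it does nothing more than the print-and-sleep loop described above. The count and delay parameters are additions made here so that the function can be tried quickly; their defaults reproduce the fifteen one-second iterations, and the return value is likewise only for convenience.

```python
import time


def pythonFunc(count=15, delay=1.0):
    """Hypothetical test function: print a line `count` times, pausing in between."""
    for _ in range(count):
        print('print from pythonFunc...')
        time.sleep(delay)
    return count
```

call_thread.c calls the function with no arguments, so the defaults apply there.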

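Obviously, the implementation is getting complex, because writing multi-threaded code in C/C++ is not trivial. Although the code is portable, it contains numerous platform-specific patches (the #ifdef WIN32 blocks), which require detailed knowledge of the system call interfaces on each platform. Fortunately, Python has done most of this for us, which brings up the second solution to the problem under discussion: letting Python handle the multi-threading. This time the Python code is enhanced with a threading model: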

''' Demonstrate the use of python threading'''

import time
import threading

class MyThread(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
    def run(self):
        for i in range(15):
            print 'printed from MyThread...'
            time.sleep(1)

def createThread():
    print 'Create and run MyThread'
    background = MyThread()
    background.start()
    print  'Main thread continues to run in foreground.'
    for i in range(10):
        print 'printed from Main thread.'
        time.sleep(1)
    print  'Main thread joins MyThread and waits until it is done...'
    background.join() # Wait for the background task to finish
    print  'The program completed gracefully.'

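The C code does not handle threading any more. All it needs to do is call createThread(). Try the previous "call_function.c". You can run "call_function py_thread createThread" to see the output. In this case, the second solution is cleaner and simpler. More importantly, the Python threading model is portable. While the C threading code for Unix and Windows is different, it remains the same in Python.

The Python function createThread() is not required if our C code calls the thread class's start() and join() methods. The relevant changes are listed below (from the C source file "call_thread_2.c"):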

// Create instance
pInstance = PyObject_CallObject(pClass, NULL); 

// Each call returns a new reference that must be released
pValue = PyObject_CallMethod(pInstance, "start", NULL);
Py_XDECREF(pValue);

i = 0;
while(i<10)
{
    printf("Printed from C thread...\n");

    // !!!Important!!! C thread will not release CPU to 
    // Python thread without the following call.
    pValue = PyObject_CallMethod(pInstance, "join", "(f)", 0.001);
    Py_XDECREF(pValue);
    Sleep(1000);    // Windows API call; use sleep(1) on POSIX
    i++;
}

printf(
  "C thread join and wait for Python thread to complete...\n");
pValue = PyObject_CallMethod(pInstance, "join", NULL);
Py_XDECREF(pValue);

printf("Program completed gracefully.\n");

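Basically, after you create the class instance, you call its start() method to create a new thread and execute its run() method. Note that without frequent short joins on the created thread, that thread only gets to execute at the beginning, and the main thread will not release any CPU to it until the main thread is done. You can try this out by commenting out the join call inside the while loop. The behavior is somewhat different from the previous case, where we called start() from within the Python module. This seems to be a feature of multi-threading that is not documented in the Python Library Reference. A likely explanation is that the C thread keeps holding the interpreter's global lock while it runs C code, and the brief join() re-enters the interpreter, which gives the Python thread a chance to acquire the lock.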
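The timed join has the same meaning on the Python side: join(timeout) waits at most timeout seconds and then returns whether or not the thread has finished, so it can be used to hand control to the other thread periodically. A small sketch of that polling pattern in pure Python follows. The Worker class and the poll_until_finished helper are names invented for this illustration and are not part of call_thread_2.c.

```python
import threading
import time


class Worker(threading.Thread):
    """Hypothetical background thread that counts a few short steps."""

    def __init__(self, steps, pause):
        threading.Thread.__init__(self)
        self.steps = steps
        self.pause = pause
        self.done = 0

    def run(self):
        for _ in range(self.steps):
            time.sleep(self.pause)
            self.done += 1


def poll_until_finished(worker, timeout=0.001):
    """Start the worker and join it repeatedly with a short timeout."""
    worker.start()
    polls = 0
    while worker.is_alive():
        worker.join(timeout)   # returns after `timeout` seconds even if still running
        polls += 1
    worker.join()              # final join with no timeout: wait for completion
    return polls


if __name__ == "__main__":
    w = Worker(steps=3, pause=0.01)
    print('timed joins issued: %d' % poll_until_finished(w))
    print('steps completed: %d' % w.done)
```

The number of timed joins depends on scheduling, so only the completed step count is predictable.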

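Points of Interest

I have deliberately paid attention to writing generic, portable C code for Python embedding. By encapsulating low-level system calls, Python supports platform portability and makes writing portable code easier. Most Python modules can be ported between Unix-like environments and Windows with little effort. We should keep this portability in mind when writing C/C++ wrapping code for Python modules. Writing portable C code yourself may not always be straightforward. Python has already done a lot of the hard work, as in the example above, so try to explore easier, simpler, and cleaner solutions. I stop here, however: how to write portable Python code is beyond the scope of this tutorial. It could well make a good title for a new article.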

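While embedding is a good option for using Python modules in C/C++ applications, there are alternative approaches. On Windows, some tools (for example "py2exe") can convert Python modules directly into Windows executables. You can then spawn a process to run the executable from within the C/C++ application. One drawback is that you cannot call the module directly. Instead, your application has to interact with the module through some type of IPC (inter-process communication). This requires that the Python module in question be "IPC ready", meaning that it must already implement an IPC interface to process incoming and outgoing data. Part II of this article will discuss embedding-related IPC techniques.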
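To make "IPC ready" concrete, here is a minimal sketch of the idea, using standard input and output pipes as the IPC channel. It is written in Python for brevity; a C/C++ host would do the equivalent with its platform's process and pipe calls. The child program, the CHILD_SOURCE name, and the multiply_in_child helper are all hypothetical and serve only as an illustration.

```python
import subprocess
import sys

# Hypothetical "IPC ready" module: it reads two integers from standard input
# and writes their product to standard output.
CHILD_SOURCE = (
    "import sys\n"
    "a, b = [int(x) for x in sys.stdin.read().split()]\n"
    "sys.stdout.write(str(a * b))\n"
)


def multiply_in_child(a, b):
    """Run the child module in a separate process and exchange data over pipes."""
    proc = subprocess.Popen(
        [sys.executable, '-c', CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    # Send the request, close the pipe, and collect the reply.
    out, _ = proc.communicate(('%d %d' % (a, b)).encode('ascii'))
    if proc.returncode != 0:
        raise RuntimeError('child exited with status %d' % proc.returncode)
    return int(out.decode('ascii'))


if __name__ == "__main__":
    print(multiply_in_child(6, 7))
```

The host never calls the multiplication code directly; it only sees the bytes that cross the pipes, which is the drawback described above.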

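All the source code provided in this article is simple C code for demonstration purposes. In practice, I recommend putting the Python embedding code in C++ wrapper classes. This way, high-level application developers do not have to deal with the embedding details.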

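Conclusion

In this article, I have covered Python embedding from basics such as calling functions, classes and methods to less basic topics such as multi-threaded embedding. The Python/C API provides a consistent calling interface that eases the task of integrating C/C++ and Python modules.

The discussion of multi-threaded embedding raises a question: how does your C/C++ application communicate with the embedded Python module? Part II of this article will address that question from the IPC point of view.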

History

This is the first revision of the article and the source code.