Player: A Simple Cross-Platform Audio Class for IoT

8 Apr 2023

MIT

7 min read

Mix wav files and waveforms with this simple, easy-to-use library.

Introduction

I originally wrote htcw_sfx to handle IoT audio. I made it very modular, but unfortunately, that made it more complicated than I wanted.

To remedy that, I created a simple class for playing audio on IoT devices that have an audio subsystem, such as I2S hardware.

Understanding this mess

The player works by maintaining a linked list of voices and their associated state. It provides generator functions for the different kinds of voices, such as wav files, sine waves, and triangle waves.

Periodically, it runs each function in turn over a buffer, with each function adding its result to the values already in the buffer, effectively mixing them. This would also allow you to create filters using the same mechanism, although I haven't implemented any filters yet.

Once the buffer has been composed and written, the flush callback is invoked to send it to the platform-specific audio layer.

Using this mess

The actual player part is simple, but we'll cover using it on the ESP32. The included code works with the M5 Stack Fire, the M5 Stack Core2, or the AI-Thinker ESP32 Audio Kit 2.2. If you have a different device, you'll have to modify the project, but the player code itself will be largely the same regardless.

Include and declare it

#include <player.hpp>

player sound(44100,2,16); // 44.1khz, stereo, 16-bit

In your application's setup code, initialize it.

if(!sound.initialize()) {
    printf("Sound initialization failure.\n");    
    while(1);
}

Next is where I like to initialize the platform-specific audio layer. In this case, we'll use the AI-Thinker ESP32 Audio Kit 2.2 as the example.

i2s_config_t i2s_config;
memset(&i2s_config,0,sizeof(i2s_config_t));
i2s_config.mode = (i2s_mode_t)(I2S_MODE_MASTER | I2S_MODE_TX);
i2s_config.sample_rate = 44100;
i2s_config.bits_per_sample = I2S_BITS_PER_SAMPLE_16BIT;
i2s_config.channel_format = I2S_CHANNEL_FMT_RIGHT_LEFT;
i2s_config.communication_format = I2S_COMM_FORMAT_STAND_MSB;
i2s_config.dma_buf_count = 2;
i2s_config.dma_buf_len = sound.buffer_size();
i2s_config.use_apll = true;
i2s_config.intr_alloc_flags = ESP_INTR_FLAG_LEVEL2;
i2s_driver_install((i2s_port_t)I2S_NUM_1, &i2s_config, 0, NULL);
i2s_pin_config_t pins = {
    .mck_io_num = 0,
    .bck_io_num = 27,
    .ws_io_num = 26,
    .data_out_num = 25,
    .data_in_num = I2S_PIN_NO_CHANGE};
i2s_set_pin((i2s_port_t)I2S_NUM_1,&pins);

Next, set up the callback functions. In this case, we need two of them.

sound.on_flush([](const void* buffer,size_t buffer_size,void* state){
    size_t written;
    i2s_write(I2S_NUM_1,buffer,buffer_size,&written,portMAX_DELAY);
});
sound.on_sound_disable([](void* state) {
    i2s_zero_dma_buffer(I2S_NUM_1);
});

All of this is platform and hardware specific, but it will look similar across different ESP32 MCUs. Note that we use the sound library to compute the DMA buffer size.

Our two callbacks are simple: `on_flush` writes the data out to the I2S port. `on_sound_disable` zeroes the I2S DMA buffer, silencing the output.

Now, let's play a sine wave at 440Hz and 40% amplitude.

voice_handle_t sin_handle = sound.sin(0,440,.4);

The first argument is the "port", which is an arbitrary numeric identifier indicating which pipeline the voice plays on. If you don't need pipelines (you usually don't), just use 0. Ports are useful for grouping voices together when you need to apply filters to some sets of voices but not others. All voices with the same port identifier are on the same pipeline.

The second argument is the frequency in Hz.

The third argument is the amplitude, scaled between 0 and 1.

The return value is a handle that can be used to refer to the voice later, for example, to stop it after it has been playing for a while.
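
For instance, here's a rough sketch of using the handle and ports together, assuming an Arduino-style environment where millis() is available (update() and the stop methods are covered further down):

voice_handle_t other_handle = sound.sin(1,880,.4); // an 880Hz tone, grouped onto port 1
uint32_t start = millis();
while(millis()-start<1000) {
    sound.update(); // must be called continuously to actually produce audio
}
sound.stop(sin_handle); // stop the 440Hz tone by its handle
sound.stop_port(1);     // stop everything on port 1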

voice_handle_t wav_handle = sound.wav(0,read_demo,nullptr,.4,true,seek_demo,nullptr);

The first argument is the port.

The second argument is the callback that reads the next byte of wav data.

The third argument is the state for that callback.

The fourth argument is the amplitude modifier, in this case 40%.

The fifth argument indicates whether the sound should loop.

The sixth argument is the seek callback (only used when looping).

The seventh argument is the state for the seek callback.

Let's take a look at the implementations of those callback functions.

size_t test_len = sizeof(test_data);
size_t test_pos = 0;
int read_demo(void* state) {
    if(test_pos>=test_len) {
        return -1;
    }
    return test_data[test_pos++];
}
void seek_demo(unsigned long long pos, void* state) {
    test_pos = pos;
}

These are pretty straightforward. `test_data[]` comes from *test.hpp* and contains our wav data.

`read_demo()` returns the next value in `test_data[]`, incrementing the position (`test_pos`), or returns -1 if the end has been reached.

`seek_demo()` sets the position.

If we want to stop the wav file from playing, we can call

sound.stop(wav_handle);

If we want to stop all of the sounds

sound.stop();

You can also stop all of the voices on a particular port by passing the port number to `stop_port()`.

sound.stop_port(0);

To get it to actually play anything, you need to call `update()` repeatedly.

sound.update();
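
On an Arduino-style project, that usually just means calling it from the main loop, along these lines (a minimal sketch):

void loop() {
    // mixes all active voices into the buffer and flushes it to the audio layer
    sound.update();
}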

The above roughly covers the basic functionality.

Coding this mess

Now, let's dive into how it's made. We'll be going over *player.cpp*.

#include <player.hpp>
#if __has_include(<Arduino.h>)
#include <Arduino.h>
#else
#include <inttypes.h>
#include <stddef.h>
#include <math.h>
#include <string.h>
#define PI (3.1415926535f)
#endif

This provides the core includes and defines, which vary depending on whether Arduino is available (the rest of the code is the same either way).

Now we get to some private definitions used in the implementation.

constexpr static const float player_pi = PI;
constexpr static const float player_two_pi = player_pi*2.0f;

typedef struct voice_info {
    unsigned short port;
    voice_function_t fn;
    void* fn_state;
    voice_info* next;
} voice_info_t;
typedef struct {
    float frequency;
    float amplitude;
    float phase;
    float phase_delta;
} waveform_info_t;
typedef struct wav_info {
    on_read_stream_callback on_read_stream;
    void* on_read_stream_state;
    on_seek_stream_callback on_seek_stream;
    void* on_seek_stream_state;
    float amplitude;
    bool loop;
    unsigned short channel_count;
    unsigned short bit_depth;
    unsigned long long start;
    unsigned long long length;
    unsigned long long pos;
} wav_info_t;

These structs hold the basic information for each voice, plus the state specific to each type of voice.

Next, we have some functions that read from the read callback mentioned earlier. The callback only returns 8-bit unsigned values, so we have routines that use it to read composite/multi-byte values and signed values. I'll omit the details of these implementations since they aren't particularly important or complicated.

static bool player_read32(on_read_stream_callback on_read_stream, 
                            void* on_read_stream_state,
                            uint32_t* out) {...}
static bool player_read16(on_read_stream_callback on_read_stream, 
                            void* on_read_stream_state,
                            uint16_t* out) {...}
static bool player_read8s(on_read_stream_callback on_read_stream, 
                            void* on_read_stream_state,
                            int8_t* out) {...}
static bool player_read16s(on_read_stream_callback on_read_stream, 
                            void* on_read_stream_state,
                            int16_t* out) {...}
static bool player_read_fourcc(on_read_stream_callback on_read_stream, 
                                void* on_read_stream_state, 
                                char* buf) {...}

The last function above actually reads a "fourCC" value, which is a 4-character identifier such as "WAVE" or "RIFF". These are typically used as file format indicators, but wav files also use fourCC codes in several places within the file.
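
To give a sense of what these look like, here's a sketch of two of them, assuming the little-endian byte order used by wav files (the real implementations are elided above):

static bool player_read16(on_read_stream_callback on_read_stream, 
                            void* on_read_stream_state,
                            uint16_t* out) {
    // read two bytes, least significant first
    int lo = on_read_stream(on_read_stream_state);
    if(lo<0) { return false; }
    int hi = on_read_stream(on_read_stream_state);
    if(hi<0) { return false; }
    *out = (uint16_t)((hi<<8)|lo);
    return true;
}
static bool player_read_fourcc(on_read_stream_callback on_read_stream, 
                                void* on_read_stream_state, 
                                char* buf) {
    // read four characters into buf
    for(int i = 0;i<4;++i) {
        int v = on_read_stream(on_read_stream_state);
        if(v<0) { return false; }
        buf[i] = (char)v;
    }
    return true;
}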

Next, we have some voice functions. Their purpose is to render sound data into the provided buffer in the specified format. For generating simple waveforms, we use the following form. I'll expand the implementation once and elide it for the subsequent functions, since the code is nearly identical.

static void sin_voice(const voice_function_info_t& info, void*state) {
    waveform_info_t* wi = (waveform_info_t*)state;
    for(int i = 0;i<info.frame_count;++i) {
        float f = (sinf(wi->phase) + 1.0f) * 0.5f;
        wi->phase+=wi->phase_delta;
        if(wi->phase>=player_two_pi) {
            wi->phase-=player_two_pi;
        }
        float samp = (f*wi->amplitude)*info.sample_max;
        switch(info.bit_depth) {
            case 8: {
                uint8_t* p = ((uint8_t*)info.buffer)+(i*info.channel_count);
                uint32_t tmp = *p+roundf(samp);
                if(tmp>info.sample_max) {
                    tmp = info.sample_max;
                }
                for(int j = 0;j<info.channel_count;++j) {
                    *p++=tmp;
                }
            }
            break;
            case 16: {
                uint16_t* p = ((uint16_t*)info.buffer)+(i*info.channel_count);
                uint32_t tmp = *p+roundf(samp);
                if(tmp>info.sample_max) {
                    tmp = info.sample_max;
                }
                for(int j = 0;j<info.channel_count;++j) {
                    *p++=tmp;
                }
            }
            break;
            default:
            break;
        }
    }
}
static void sqr_voice(const voice_function_info_t& info, void*state) {...}

static void saw_voice(const voice_function_info_t& info, void*state) {...}

static void tri_voice(const voice_function_info_t& info, void*state) {...}

What we're doing above is computing the sine wave into `f` and then writing it out to the provided buffer in the indicated format. Currently, these functions only support 8-bit and 16-bit output.
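
The other waveform functions differ only in how `f` is computed. For example, a square wave could be produced along these lines (a sketch; the buffer-writing switch is identical to sin_voice above):

// high for the first half of the period, low for the second
float f = (wi->phase < player_pi) ? 1.0f : 0.0f;
wi->phase += wi->phase_delta;
if(wi->phase >= player_two_pi) {
    wi->phase -= player_two_pi;
}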

We also have voice functions for wav files. To keep things fast and easy to implement, there is one function for each combination of wav and output format. I've provided one implementation and elided the rest, as above.

static void wav_voice_16_2_to_16_2(const voice_function_info_t& info, void*state) {
    wav_info_t* wi = (wav_info_t*)state;
    if(!wi->loop&&wi->pos>=wi->length) {
        return;
    }
    uint16_t* dst = (uint16_t*)info.buffer;
    for(int i = 0;i<info.frame_count;++i) {
        int16_t i16;
        
        if(wi->pos>=wi->length) {
            if(!wi->loop) {
                break;
            }
            wi->on_seek_stream(wi->start,wi->on_seek_stream_state);
            wi->pos = 0;
        }
        for(int j=0;j<info.channel_count;++j) {
            if(player_read16s(wi->on_read_stream,wi->on_read_stream_state,&i16)) {
                wi->pos+=2;
            } else {
                break;
            }
            *dst+=(uint16_t)(((i16*wi->amplitude)+32768U));
            ++dst;
        }
    }
}
static void wav_voice_16_2_to_8_1(const voice_function_info_t& info, void*state) {...}

static void wav_voice_16_1_to_16_2(const voice_function_info_t& info, void*state) {...}

static void wav_voice_16_2_to_16_1(const voice_function_info_t& info, void*state) {...}

static void wav_voice_16_1_to_16_1(const voice_function_info_t& info, void*state) {...}

static void wav_voice_16_1_to_8_1(const voice_function_info_t& info, void*state) {...}

These functions use the read and seek callbacks to read the wav data and mix it in by adding it to the provided buffer.

Note that the wav data is signed, while our internal data is unsigned.
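
To illustrate one of the elided variants, here's a sketch of the 16-bit mono wav to 16-bit stereo output case, under the assumption that it simply adds the same biased sample to both output channels:

static void wav_voice_16_1_to_16_2(const voice_function_info_t& info, void*state) {
    wav_info_t* wi = (wav_info_t*)state;
    if(!wi->loop&&wi->pos>=wi->length) {
        return;
    }
    uint16_t* dst = (uint16_t*)info.buffer;
    for(int i = 0;i<info.frame_count;++i) {
        if(wi->pos>=wi->length) {
            if(!wi->loop) {
                break;
            }
            wi->on_seek_stream(wi->start,wi->on_seek_stream_state);
            wi->pos = 0;
        }
        int16_t i16;
        if(!player_read16s(wi->on_read_stream,wi->on_read_stream_state,&i16)) {
            break;
        }
        wi->pos+=2;
        // bias the signed sample to unsigned and mix it into both channels
        uint16_t samp = (uint16_t)((i16*wi->amplitude)+32768U);
        *dst++ += samp;
        *dst++ += samp;
    }
}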

Next up is our first linked list function: the one that adds a voice.

static voice_handle_t player_add_voice(unsigned char port, 
                                        voice_handle_t* in_out_first, 
                                        voice_function_t fn, 
                                        void* fn_state, 
                                        void*(allocator)(size_t)) {
    voice_info_t* pnew;
    if(*in_out_first==nullptr) {
        pnew = (voice_info_t*)allocator(sizeof(voice_info_t));
        if(pnew==nullptr) {
            return nullptr;
        }
        pnew->port = port;
        pnew->next = nullptr;
        pnew->fn = fn;
        pnew->fn_state = fn_state;
        *in_out_first = pnew;
        return pnew;
    }
    voice_info_t* v = (voice_info_t*)*in_out_first;
    if(v->port>port) {
        pnew = (voice_info_t*)allocator(sizeof(voice_info_t));
        if(pnew==nullptr) {
            return nullptr;
        }
        pnew->port = port;
        pnew->next = v;
        pnew->fn = fn;
        pnew->fn_state = fn_state;
        *in_out_first = pnew;
        return pnew;
    }
    while(v->next!=nullptr && v->next->port<=port) {
        v=v->next;
    }
    voice_info_t* vnext = v->next;
    pnew = (voice_info_t*)allocator(sizeof(voice_info_t));
    if(pnew==nullptr) {
        return nullptr;
    }
    pnew->port = port;
    pnew->next = vnext;
    pnew->fn = fn;
    pnew->fn_state = fn_state;
    v->next = pnew;
    return pnew;
}

It takes a port, which I touched on briefly earlier; a pointer to the handle of the first element in the list, which may be changed; the function for the new voice; the state for that function; and an allocator to use for the memory, which allows a custom heap to be used.

Then we just do a sorted insert on `port`. For anyone who has implemented a linked list before, this code should look straightforward. Note that our function state must already have been allocated before this routine is called.

Now for its counterpart, the function that removes a voice.

static bool player_remove_voice(voice_handle_t* in_out_first,
                                voice_handle_t handle,
                                void(deallocator)(void*)) {
    voice_info_t** pv = (voice_info_t**)in_out_first;
    voice_info_t* v = *pv;
    if(v==nullptr) {return false;}
    if(handle==v) {
        *pv = v->next;
        if(v->fn_state!=nullptr) {
            deallocator(v->fn_state);
        }
        deallocator(v);
    } else {
        while(v->next!=handle) {
            v=v->next;
            if(v->next==nullptr) {
                return false;
            }
        }
        void* to_free = v->next;
        if(to_free==nullptr) {
            return false;
        }
        void* to_free2 = v->next->fn_state;
        if(v->next->next!=nullptr) {
            v->next = v->next->next;
        } else {
            v->next = nullptr;
        }
        deallocator(to_free);
        deallocator(to_free2);
    }
    return true;
}

Again, this should be straightforward to anyone who has implemented a linked list before. The one extra thing we do here is free the `fn_state`.

The function below is a variation of the previous one that lets you remove all of the voices on a particular port.

static bool player_remove_port(voice_handle_t* in_out_first,
                            unsigned short port,
                            void(deallocator)(void*)) {
    voice_info_t* first = (voice_info_t*)(*in_out_first);
    voice_info_t* before = nullptr;
    
    while(first!=nullptr && first->port<port) {
        before = first;
        first = first->next;
    }
    if(first==nullptr || first->port>port) {
        return false;
    }
    
    voice_info_t* after = first->next;
    while(after!=nullptr && after->port==port) {
        void* to_free = after;
        if(after->fn_state!=nullptr) {
            deallocator(after->fn_state);
        }
        after=after->next;
        deallocator(to_free);
    }
    if(before!=nullptr) {
        before->next = after;
    } else {
        *in_out_first = after;
    }

    return true;
}

Now we're finally getting to the player class implementation itself, so let's hop over to *player.hpp* for a moment to pick up the definitions.

// info used for custom voice functions
typedef struct voice_function_info {
    void* buffer;
    size_t frame_count;
    unsigned int channel_count;
    unsigned int bit_depth;
    unsigned int sample_max;
} voice_function_info_t;
// custom voice function
typedef void (*voice_function_t)(const voice_function_info_t& info, void* state);
// the handle to refer to a playing voice
typedef void* voice_handle_t;
// called when the sound output should be disabled
typedef void (*on_sound_disable_callback)(void* state);
// called when the sound output should be enabled
typedef void (*on_sound_enable_callback)(void* state);
// called when there's sound data to send to the output
typedef void (*on_flush_callback)(const void* buffer, size_t buffer_size, void* state);
// called to read a byte off a stream
typedef int (*on_read_stream_callback)(void* state);
// called to seek a stream
typedef void (*on_seek_stream_callback)(unsigned long long pos, void* state);
// represents a polyphonic player capable of playing wavs or various waveforms
class player final {
    voice_handle_t m_first;
    void* m_buffer;
    size_t m_frame_count;
    unsigned int m_sample_rate;
    unsigned int m_channel_count;
    unsigned int m_bit_depth;
    unsigned int m_sample_max;
    bool m_sound_enabled;
    on_sound_disable_callback m_on_sound_disable_cb;
    void* m_on_sound_disable_state;
    on_sound_enable_callback m_on_sound_enable_cb;
    void* m_on_sound_enable_state;
    on_flush_callback m_on_flush_cb;
    void* m_on_flush_state;
    void*(*m_allocator)(size_t);
    void*(*m_reallocator)(void*,size_t);
    void(*m_deallocator)(void*);
    player(const player& rhs)=delete;
    player& operator=(const player& rhs)=delete;
    void do_move(player& rhs);
    bool realloc_buffer();
public:
    // construct the player with the specified arguments
    player(unsigned int sample_rate = 44100, 
        unsigned short channels = 2, 
        unsigned short bit_depth = 16, 
        size_t frame_count = 256, 
        void*(allocator)(size_t)=::malloc,
        void*(reallocator)(void*,size_t)=::realloc,
        void(deallocator)(void*)=::free);
    player(player&& rhs);
    ~player();
    player& operator=(player&& rhs);
    // indicates if the player has been initialized
    bool initialized() const;
    // initializes the player
    bool initialize();
    // deinitializes the player
    void deinitialize();
    // plays a sine wave at the specified frequency and amplitude
    voice_handle_t sin(unsigned short port, float frequency, float amplitude = .8);
    // plays a square wave at the specified frequency and amplitude
    voice_handle_t sqr(unsigned short port, float frequency, float amplitude = .8);
    // plays a sawtooth wave at the specified frequency and amplitude
    voice_handle_t saw(unsigned short port, float frequency, float amplitude = .8);
    // plays a triangle wave at the specified frequency and amplitude
    voice_handle_t tri(unsigned short port, float frequency, float amplitude = .8);
    // plays RIFF PCM wav data at the specified amplitude, optionally looping
    voice_handle_t wav(unsigned short port, 
                    on_read_stream_callback on_read_stream, 
                    void* on_read_stream_state, 
                    float amplitude = .8, 
                    bool loop = false,
                    on_seek_stream_callback on_seek_stream = nullptr, 
                    void* on_seek_stream_state=nullptr);
    // plays a custom voice
    voice_handle_t voice(unsigned short port, 
                        voice_function_t fn, 
                        void* state = nullptr);
    // stops a playing voice, or all playing voices
    bool stop(voice_handle_t handle = nullptr);
    // stops all voices playing on the specified port
    bool stop(unsigned short port);
    // set the sound disable callback
    void on_sound_disable(on_sound_disable_callback cb, void* state=nullptr);
    // set the sound enable callback
    void on_sound_enable(on_sound_enable_callback cb, void* state=nullptr);
    // set the flush callback (always necessary)
    void on_flush(on_flush_callback cb, void* state=nullptr);
    // A frame is every sample for every channel at a given tick.
    // A stereo frame would have two samples.
    // This is the count of frames in the mixing buffer.
    size_t frame_count() const;
    // assign a new frame count
    bool frame_count(size_t value);
    // get the sample rate
    unsigned int sample_rate() const;
    // set the sample rate
    bool sample_rate(unsigned int value);
    // get the number of channels
    unsigned short channel_count() const;
    // set the number of channels
    bool channel_count(unsigned short value);
    // get the bit depth
    unsigned short bit_depth() const;
    // set the bit depth
    bool bit_depth(unsigned short value);
    // indicates the size of the internal audio buffer
    size_t buffer_size() const;
    // indicates the bandwidth required to play the buffer
    size_t bytes_per_second() {
        return m_sample_rate*m_channel_count*(m_bit_depth/8);
    }
    // give a timeslice to the player to update itself
    void update();
    // allocates memory for a custom voice state
    template<typename T>
    T* allocate_voice_state() const {
        return (T*)m_allocator(sizeof(T));
    }
};
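
As an aside, the voice() method and allocate_voice_state<>() above can be used to plug in your own generator. Here's a hypothetical example (not part of the library) of a simple white noise voice that follows the same pattern as the built-in ones; rand() comes from <stdlib.h>:

typedef struct {
    float amplitude;
} noise_info_t;

static void noise_voice(const voice_function_info_t& info, void* state) {
    noise_info_t* ni = (noise_info_t*)state;
    for(int i = 0;i<info.frame_count;++i) {
        // a random value scaled to [0,1], then to the output range
        float f = rand()/(float)RAND_MAX;
        uint32_t samp = (uint32_t)roundf(f*ni->amplitude*info.sample_max);
        if(info.bit_depth==16) {
            uint16_t* p = ((uint16_t*)info.buffer)+(i*info.channel_count);
            for(int j = 0;j<info.channel_count;++j) {
                uint32_t tmp = *p+samp;
                if(tmp>info.sample_max) {
                    tmp = info.sample_max;
                }
                *p++ = (uint16_t)tmp;
            }
        }
    }
}

// usage: the state is allocated with the player's allocator so the
// player can free it when the voice is stopped
noise_info_t* ni = sound.allocate_voice_state<noise_info_t>();
ni->amplitude = .2f;
voice_handle_t noise_handle = sound.voice(0, noise_voice, ni);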

Now, let's get back to the implementation, starting with some boilerplate.

void player::do_move(player& rhs) {
    m_first = rhs.m_first ;
    rhs.m_first = nullptr;
    m_buffer = rhs.m_buffer;
    rhs.m_buffer = nullptr;
    m_frame_count = rhs.m_frame_count;
    rhs.m_frame_count = 0;
    m_sample_rate = rhs.m_sample_rate;
    m_channel_count = rhs.m_channel_count;
    m_bit_depth = rhs.m_bit_depth;
    m_sample_max = rhs.m_sample_max;
    m_sound_enabled = rhs.m_sound_enabled;
    m_on_sound_disable_cb = rhs.m_on_sound_disable_cb;
    rhs.m_on_sound_disable_cb = nullptr;
    m_on_sound_disable_state = rhs.m_on_sound_disable_state;
    m_on_sound_enable_cb = rhs.m_on_sound_enable_cb;
    rhs.m_on_sound_enable_cb = nullptr;
    m_on_sound_enable_state = rhs.m_on_sound_enable_state;
    m_on_flush_cb = rhs.m_on_flush_cb;
    rhs.m_on_flush_cb = nullptr;
    m_on_flush_state = rhs.m_on_flush_state;
    m_allocator = rhs.m_allocator;
    m_reallocator = rhs.m_reallocator;
    m_deallocator = rhs.m_deallocator;
}

This function essentially implements the meat of C++ move semantics, since I don't provide a copy constructor or copy assignment operator, for reasons that should be easy enough to understand if you think about it for a moment.

Next up are the constructor and destructor. The constructor doesn't really do anything other than assign or set all of the members to their initial values, while the destructor calls `deinitialize()`.

player::player(unsigned int sample_rate, 
            unsigned short channel_count, 
            unsigned short bit_depth, 
            size_t frame_count, 
            void*(allocator)(size_t), 
            void*(reallocator)(void*,size_t), 
            void(deallocator)(void*)) :
                m_first(nullptr),
                m_buffer(nullptr),
                m_frame_count(frame_count),
                m_sample_rate(sample_rate),
                m_channel_count(channel_count),
                m_bit_depth(bit_depth),
                m_on_sound_disable_cb(nullptr),
                m_on_sound_disable_state(nullptr),
                m_on_sound_enable_cb(nullptr),
                m_on_sound_enable_state(nullptr),
                m_on_flush_cb(nullptr),
                m_on_flush_state(nullptr),
                m_allocator(allocator),
                m_reallocator(reallocator),
                m_deallocator(deallocator)
                {
}
player::~player() {
    deinitialize();
}

The next two definitions implement move semantics by delegating to `do_move()`.

player::player(player&& rhs) {
    do_move(rhs);    
}
player& player::operator=(player&& rhs) {
    do_move(rhs);
    return *this;
}

The next set of methods handles initialization and deinitialization.

bool player::initialized() const { return m_buffer!=nullptr;}
bool player::initialize() {
    if(m_buffer!=nullptr) {
        return true;
    }
    m_buffer=m_allocator(m_frame_count*m_channel_count*(m_bit_depth/8));
    if(m_buffer==nullptr) {
        return false;
    }
    m_sample_max = powf(2,m_bit_depth)-1;
    m_sound_enabled = false;
    return true;
}
void player::deinitialize() {
    if(m_buffer==nullptr) {
        return;
    }
    stop();
    m_deallocator(m_buffer);
    m_buffer = nullptr;
}

`initialize()` allocates a buffer to hold the frames and sets `m_sample_max` to the maximum value for the bit depth. It also sets the sound's initial state to disabled. With the defaults (256 frames, 2 channels, 16-bit), the allocation works out to 256 × 2 × (16/8) = 1,024 bytes.

`deinitialize()` stops any playing voices, which also frees their memory. It then deallocates the buffer that holds the frames.

The next methods handle creating the waveform voices when you call the corresponding method. They all work in roughly the same way, so they each delegate to the same helper to do most of the heavy lifting.

static voice_handle_t player_waveform(unsigned short port, 
                                    unsigned int sample_rate,
                                    voice_handle_t* in_out_first, 
                                    voice_function_t fn, 
                                    float frequency, 
                                    float amplitude, 
                                    void*(allocator)(size_t)) {
    waveform_info_t* wi = (waveform_info_t*)allocator(sizeof(waveform_info_t));
    if(wi==nullptr) {
        return nullptr;
    }
    wi->frequency = frequency;
    wi->amplitude = amplitude;
    wi->phase = 0;
    wi->phase_delta = player_two_pi*wi->frequency/(float)sample_rate;
    return player_add_voice(port, in_out_first,fn,wi,allocator);
}
voice_handle_t player::sin(unsigned short port, float frequency, float amplitude) {
    voice_handle_t result = player_waveform(port,
                                            m_sample_rate,
                                            &m_first,
                                            sin_voice,
                                            frequency,
                                            amplitude,
                                            m_allocator);
    return result;
}
voice_handle_t player::sqr(unsigned short port, float frequency, float amplitude) {
    voice_handle_t result = player_waveform(port,
                                            m_sample_rate,
                                            &m_first,
                                            sqr_voice,
                                            frequency,
                                            amplitude,
                                            m_allocator);
    return result;
}
voice_handle_t player::saw(unsigned short port, float frequency, float amplitude) {
    voice_handle_t result = player_waveform(port,
                                            m_sample_rate,
                                            &m_first,
                                            saw_voice,
                                            frequency,
                                            amplitude,
                                            m_allocator);
    return result;
}
voice_handle_t player::tri(unsigned short port, float frequency, float amplitude) {
    voice_handle_t result = player_waveform(port,
                                            m_sample_rate,
                                            &m_first,
                                            tri_voice,
                                            frequency,
                                            amplitude,
                                            m_allocator);
    return result;
}

Now on to wav files. We have to read the RIFF chunks from the header to get the start and stop points of the wav data within the file, which is what most of this routine does.

voice_handle_t player::wav(unsigned short port, 
                        on_read_stream_callback on_read_stream, 
                        void* on_read_stream_state, float amplitude, 
                        bool loop, 
                        on_seek_stream_callback on_seek_stream, 
                        void* on_seek_stream_state) {
    if(on_read_stream==nullptr) {
        return nullptr;
    }
    if(loop && on_seek_stream==nullptr) {
        return nullptr;
    }
    unsigned int sample_rate=0;
    unsigned short channel_count=0;
    unsigned short bit_depth=0;
    unsigned long long start=0;
    unsigned long long length=0;
    uint32_t size;
    uint32_t remaining;
    uint32_t pos;
    //uint32_t fmt_len;
    int v = on_read_stream(on_read_stream_state);
    if(v!='R') { 
        return nullptr;
    }
    v = on_read_stream(on_read_stream_state);
    if(v!='I') { 
        return nullptr;
    }
    v = on_read_stream(on_read_stream_state);
    if(v!='F') { 
        return nullptr;
    }
    v = on_read_stream(on_read_stream_state);
    if(v!='F') { 
        return nullptr;
    }
    pos =4;
    uint32_t t32 = 0;
    if(!player_read32(on_read_stream,on_read_stream_state,&t32)) {
        return nullptr;
    }
    size = t32;
    pos+=4;
    remaining = size-8;
    v = on_read_stream(on_read_stream_state);
    if(v!='W') { 
        return nullptr;
    }
    v = on_read_stream(on_read_stream_state);
    if(v!='A') { 
        return nullptr;
    }
    v = on_read_stream(on_read_stream_state);
    if(v!='V') { 
        return nullptr;
    }
    v = on_read_stream(on_read_stream_state);
    if(v!='E') { 
        return nullptr;
    }
    pos+=4;
    remaining-=4;
    char buf[4];
    while(remaining) {
        if(!player_read_fourcc(on_read_stream,on_read_stream_state,buf)) {
            return nullptr;
        }
        pos+=4;
        remaining-=4;    
        if(!player_read32(on_read_stream,on_read_stream_state,&t32)) {
            return nullptr;
        }
        pos+=4;
        remaining-=4;
        if(0==memcmp("fmt ",buf,4)) {
            uint16_t t16;
            if(!player_read16(on_read_stream,on_read_stream_state,&t16)) {
                return nullptr;
            }
            if(t16!=1) { // PCM format
                return nullptr;
            }
            pos+=2;
            remaining-=2;
            if(!player_read16(on_read_stream,on_read_stream_state,&t16)) {
                return nullptr;
            }
            channel_count = t16;
            if(channel_count<1 || channel_count>2) {
                return nullptr;
            }
            pos+=2;
            remaining-=2;
            if(!player_read32(on_read_stream,on_read_stream_state,&t32)) {
                return nullptr;
            }
            sample_rate = t32;
            if(sample_rate!=this->sample_rate()) {
                return nullptr;
            }
            pos+=4;
            remaining-=4;
            if(!player_read32(on_read_stream,on_read_stream_state,&t32)) {
                return nullptr;
            }
            pos+=4;
            remaining-=4;
            if(!player_read16(on_read_stream,on_read_stream_state,&t16)) {
                return nullptr;
            }
            pos+=2;
            remaining-=2;
            if(!player_read16(on_read_stream,on_read_stream_state,&t16)) {
                return nullptr;
            }
            bit_depth = t16;
            pos+=2;
            remaining-=2;
            
        } else if(0==memcmp("data",buf,4)) {
            length = t32;
            start = pos;
            break;
        } else {
            // TODO: Seek instead
            while(t32--) {
                if(0>on_read_stream(on_read_stream_state)) {
                    return nullptr;
                }
                ++pos;
                --remaining;
            }
        }

    }
    wav_info_t* wi = (wav_info_t*)m_allocator(sizeof(wav_info_t));
    if(wi==nullptr) {
        return nullptr;
    }
    wi->on_read_stream = on_read_stream;
    wi->on_read_stream_state = on_read_stream_state;
    wi->on_seek_stream = on_seek_stream;
    wi->on_seek_stream_state = on_seek_stream_state;
    wi->amplitude = amplitude;
    wi->bit_depth = bit_depth;
    wi->channel_count = channel_count;
    wi->loop = loop;
    wi->on_read_stream = on_read_stream;
    wi->on_read_stream_state = on_read_stream_state;
    wi->start = start;
    wi->length = length;
    wi->pos = 0;

    if(wi->channel_count==2 && 
        wi->bit_depth==16 && 
        m_channel_count==2 && 
        m_bit_depth==16) {
        voice_handle_t res = player_add_voice(port, 
                                            &m_first,
                                            wav_voice_16_2_to_16_2,
                                            wi,
                                            m_allocator);
        if(res==nullptr) {
            m_deallocator(wi);
        }
        return res;
    } else if(wi->channel_count==1 && 
            wi->bit_depth==16 && 
            m_channel_count==2 && 
            m_bit_depth==16) {
        voice_handle_t res = player_add_voice(port, 
                                            &m_first,
                                            wav_voice_16_1_to_16_2,
                                            wi,
                                            m_allocator);
        if(res==nullptr) {
            m_deallocator(wi);
        }
        return res;
    } else if(wi->channel_count==2 && 
            wi->bit_depth==16 && 
            m_channel_count==1 && 
            m_bit_depth==16) {
        voice_handle_t res = player_add_voice(port, 
                                            &m_first,
                                            wav_voice_16_2_to_16_1,
                                            wi,
                                            m_allocator);
        if(res==nullptr) {
            m_deallocator(wi);
        }
        return res;
    } else if(wi->channel_count==1 && 
            wi->bit_depth==16 && 
            m_channel_count==1 && 
            m_bit_depth==16) {
        voice_handle_t res = player_add_voice(port, 
                                            &m_first,
                                            wav_voice_16_1_to_16_1,
                                            wi,
                                            m_allocator);
        if(res==nullptr) {
            m_deallocator(wi);
        }
        return res;
    } else if(wi->channel_count==2 && 
            wi->bit_depth==16 && 
            m_channel_count==1 && 
            m_bit_depth==8) {
        voice_handle_t res = player_add_voice(port, 
                                            &m_first,
                                            wav_voice_16_2_to_8_1,
                                            wi,
                                            m_allocator);
        if(res==nullptr) {
            m_deallocator(wi);
        }
        return res;
    } else if(wi->channel_count==1 && 
            wi->bit_depth==16 && 
            m_channel_count==1 && 
            m_bit_depth==8) {
        voice_handle_t res = player_add_voice(port, 
                                            &m_first,
                                            wav_voice_16_1_to_8_1,
                                            wi,
                                            m_allocator);
        if(res==nullptr) {
            m_deallocator(wi);
        }
        return res;
    }
    m_deallocator(wi);
    return nullptr;    
}

As I said, most of that is reading the RIFF header information out of the wav file and then marking where the wav data itself starts and ends within the file. Note that we use different functions for the different combinations of wav and output formats. This is far more straightforward, and more efficient, than one general-purpose routine would be.

We'll skip ahead to `update()`, since the code before it is trivial.

void player::update() {
    const size_t buffer_size = m_frame_count*m_channel_count*(m_bit_depth/8);
    voice_info_t* first = (voice_info_t*)m_first;
    bool has_voices = false;
    voice_function_info_t vinf;
    vinf.buffer = m_buffer;
    vinf.frame_count = m_frame_count;
    vinf.channel_count = m_channel_count;
    vinf.bit_depth = m_bit_depth;
    vinf.sample_max = m_sample_max;
    voice_info_t* v = first;
    memset(m_buffer,0,buffer_size);
    while(v!=nullptr) {
        has_voices = true;
        v->fn(vinf, v->fn_state);
        v=v->next;
    }
    if(has_voices) {
        if(!m_sound_enabled) {
            if(m_on_sound_enable_cb!=nullptr) {
                m_on_sound_enable_cb(m_on_sound_enable_state);
            }
            m_sound_enabled = true;   
        }
    } else {
        if(m_sound_enabled) {
            if(m_on_sound_disable_cb!=nullptr) {
                m_on_sound_disable_cb(m_on_sound_disable_state);
            }
            m_sound_enabled = false;
        }
    }
    if(m_sound_enabled && m_on_flush_cb!=nullptr) {
        m_on_flush_cb(m_buffer, buffer_size, m_on_flush_state);
    }
}

What we're doing here is computing the voice info and then running through each voice, adding its data to the buffer. Along the way, we keep track of whether there are any voices. If there are voices and sound is disabled, we enable it. If there are no voices and sound is enabled, we disable it. Finally, we call flush to send the data to the sound hardware.

History

  • 8th April, 2023 - Initial submission
  • 9th April, 2023 - Bug fix