MIDI Beat Detection with NAudio

August 6, 2014

CPOL

5 min read

An overview of basic beat detection for games like Guitar Hero.

Introduction

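Over the past few weeks, I have been developing a game similar to Guitar Hero. While racking my brain over how to get each note so that I could display it on screen, I discovered that many people were trying to do something similar, but there was no clear answer online. I did find a third-party library, NAudio, which is available for download. This library has an amazing variety of useful audio tools, but here we will use its MIDI library, since MIDI is one of the easiest file formats to work with for beat detection.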

Other Tools

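Another tool that is useful when (re)using this code is Anvil Studio. It is a free MIDI editor; however, all we need it for here is to open MIDI files and save them again. This converts a single-track MIDI file into a multi-track one, which is easier to use for beat detection.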

Using the Code

The first thing we need to do is get the frequency of each MIDI note. This is easy to do using piano tuning at 440 Hz:

static double[] midi = new double[128];

static void getFrequencies()
{
    //get frequencies for midi notes 0-127 at 440 tuning (piano)
    double a = 440;
    for (int i = 0; i < 128; i++)
    {
        //floating-point math: a / 32 is 13.75 Hz, and dividing by 12.0 keeps the fractional octave
        midi[i] = (a / 32) * Math.Pow(2, (i - 9) / 12.0);
    }
}

This code converts every note on the MIDI scale (0-127) to its frequency in hertz (Hz).
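
As a quick sanity check of the formula, here is a minimal sketch that evaluates it for a few well-known notes. The class and method names (FrequencyDemo, NoteToFrequency) are made up for this illustration and are not part of the article's program. It assumes the usual convention that MIDI note 69 is the A above middle C, tuned to 440 Hz.

```csharp
using System;

static class FrequencyDemo
{
    // Equal-temperament frequency of a MIDI note number (0-127), with note 69 at 440 Hz.
    // Hypothetical helper for illustration only.
    public static double NoteToFrequency(int note)
    {
        const double a = 440.0;
        // 440 / 32 = 13.75 Hz is the A five octaves below note 69, which is note 9.
        // Each semitone above note 9 multiplies the frequency by 2^(1/12).
        return (a / 32.0) * Math.Pow(2.0, (note - 9) / 12.0);
    }

    static void Main()
    {
        // 57 and 81 are the A one octave below and above 69; 60 is middle C.
        foreach (int note in new[] { 57, 60, 69, 81 })
        {
            Console.WriteLine(note + " -> " + NoteToFrequency(note).ToString("F2") + " Hz");
        }
    }
}
```

Notes that are whole octaves apart come out as exact doublings (220, 440, 880 Hz), which is an easy way to confirm that none of the divisions is being truncated to an integer.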

Next, we need to collect various pieces of information about the MIDI file so that we can work with it:

static string getFilename()
{
    Console.Write("Path to MIDI file (relative or absolute): ");
    string filen = Console.ReadLine();
    if (!File.Exists(filen))
    {
        Console.WriteLine("That file does not exist.");
        return getFilename();
    }
    return filen;
}

static int getTrackNumber()
{
    Console.Write("Melody Track #: ");
    string track = Console.ReadLine();
    int trackN;
    if (!int.TryParse(track, out trackN))
    {
        Console.WriteLine(track + " is not a valid number.");
        return getTrackNumber();
    }
    return trackN;
}

static string getSortType()
{
    Console.Write("Sort importance by [(d)uration or (v)olume]: ");
    string sortType = Console.ReadLine();
    if (sortType != "d" && sortType != "v")
    {
        Console.WriteLine("Invalid sort type " + sortType);
        return getSortType();
    }
    return sortType;
}

static string getOutputPath()
{
    Console.Write("Path to output (*.song): ");
    string output = Console.ReadLine() + ".song";
    foreach (char c in Path.GetInvalidFileNameChars())
    {
        if (output.Contains(c))
        {
            Console.WriteLine("The character '" + c + "' is not allowed in file names.");
            return getOutputPath();
        }
    }
    return output;
}

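These functions provide the input file name, the output file name, the melody track number, and the method used to determine note importance. Importance determines the difficulty at which a note appears (high importance for easy, medium importance for medium, low importance for hard. This may seem backwards, but a note's importance describes how much it matters to the melody, which means that low-importance notes are ornamental, while high-importance notes should appear at every difficulty).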
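
To make the importance rule concrete, here is a small sketch of how a game might decide whether a note is shown at a given difficulty. The Difficulty enum and the IsPlayable helper are hypothetical names invented for this example; only the bool? encoding (true = high, false = medium, null = low) comes from the code later in the article.

```csharp
using System;

// Hypothetical difficulty levels for the game that consumes the song file.
enum Difficulty { Easy, Medium, Hard }

static class DifficultyFilter
{
    // importance: true = high, false = medium, null = low.
    public static bool IsPlayable(bool? importance, Difficulty difficulty)
    {
        switch (difficulty)
        {
            case Difficulty.Easy:
                return importance == true;     // only the notes that carry the melody
            case Difficulty.Medium:
                return importance.HasValue;    // high and medium importance
            default:
                return true;                   // hard: every note, including ornaments
        }
    }

    static void Main()
    {
        bool?[] levels = { true, false, null };
        foreach (Difficulty d in Enum.GetValues(typeof(Difficulty)))
        {
            foreach (bool? level in levels)
            {
                string name = level.HasValue ? level.Value.ToString() : "null";
                Console.WriteLine(d + " / importance " + name + ": " + IsPlayable(level, d));
            }
        }
    }
}
```

With this rule, a high-importance note passes the filter at all three difficulties, while a low-importance note only passes on hard.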

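Now we can start using NAudio to work with MIDI. Before we look at the code, we need to include all the necessary using directives, including some that are needed later in the article: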

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using NAudio;
using NAudio.Midi;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

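The main audio namespaces we need access to are NAudio and NAudio.Midi. If you only care about the beat detection and not how I apply it, you do not need the IO or XML namespaces (System.IO is still required if you keep the file prompts shown above, since they use File and Path).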

Now, let's look at the Main function. Skim it to get a rough idea of what we are doing; I will break it down in detail afterwards:

static void Main(string[] args)
{
    getFrequencies();

    string filen = getFilename();

    MidiFile file = new MidiFile(filen);
    Console.WriteLine("Here are some possible tracks for you to choose from:");
    for (int i = 0; i < file.Tracks; i++)
    {
        string instrument = "";
        var events = file.Events.GetTrackEvents(i);
        foreach (var x in events)
        {
            if (x is PatchChangeEvent)
            {
                instrument = (x as PatchChangeEvent).ToString().Split(':')[1].Replace(" " + x.Channel + " ", "");
                break;
            }
        }
        if (!string.IsNullOrWhiteSpace(instrument)) Console.WriteLine(i + ": " + instrument);
    }

    int trackN = getTrackNumber();

    string sortType = getSortType();

    string output = getOutputPath();

    var trackevents = file.Events.GetTrackEvents(trackN);
    //track 0 holds the tempo information
    var tempoGetter = file.Events.GetTrackEvents(0);
    List<MidiNote> notes = new List<MidiNote>();
    int tempo = 0;
    foreach (var e in tempoGetter)
    {
        //get the tempo and drop out of that track
        if (e is TempoEvent)
        {
            //tempo in milliseconds
            tempo = (e as TempoEvent).MicrosecondsPerQuarterNote / 1000;
            break;
        }
    }
    for (int i = 0; i < trackevents.Count; i++)
    {
        //for every note
        MidiEvent e = trackevents[i];
        //if it's a note turning ON
        if (e is NoteOnEvent)
        {
            //the note on event, contains the time, volume, and the length of note
            var On = e as NoteOnEvent;
            //the note event, contains the midi note number (pitch)
            var n = e as NoteEvent;
            //the absolute time (in delta ticks) over the delta ticks per quarter note times the number of milliseconds per quarter note = time in milliseconds
            notes.Add(new MidiNote(On.NoteLength, midi[n.NoteNumber], (long)((On.AbsoluteTime / (float)file.DeltaTicksPerQuarterNote) * tempo), On.Velocity));
        }
    }
    //uses known values to get unknown values needed for a guitar hero clone
    //get the min and max frequency
    notes.Sort((n, n2) => n.frequency.CompareTo(n2.frequency));
    int minFreq = notes.First().frequency;
    int maxFreq = notes.Last().frequency;

    //get the min and max volume
    notes.Sort((n, n2) => n.volume.CompareTo(n2.volume));
    int minVol = notes.First().volume;
    int maxVol = notes.Last().volume;

    //get the min and max note duration
    notes.Sort((n, n2) => n.duration.CompareTo(n2.duration));
    int minLen = notes.First().duration;
    int maxLen = notes.Last().duration;

    //sort by time
    notes.Sort((n, n2) => n.startTime.CompareTo(n2.startTime));

    //outputs the song data to {output}.song
    List<Note> nt = new List<Note>();
    foreach (MidiNote n in notes)
    {
        MidiNote N = n;
        //gets unknown values for button and importance based off of known values
        buttonSignificance(ref N, minFreq, maxFreq, minVol, maxVol, minLen, maxLen, sortType == "v");
        nt.Add((Note)N);
    }
    //serialize to XML document
    XmlTextWriter w = new XmlTextWriter(output, null);
    XmlSerializer serializer = new XmlSerializer(typeof(List<Note>));
    serializer.Serialize(w, nt);
    w.Close();
    Console.WriteLine("done");
    Console.ReadKey();
}

The first part of this function is optional, but still helpful:

MidiFile file = new MidiFile(filen);
Console.WriteLine("Here are some possible tracks for you to choose from:");
for (int i = 0; i < file.Tracks; i++)
{
    string instrument = "";
    var events = file.Events.GetTrackEvents(i);
    foreach (var x in events)
    {
        if (x is PatchChangeEvent)
        {
            instrument = (x as PatchChangeEvent).ToString().Split(':')[1].Replace(" " + x.Channel + " ", "");
            break;
        }
    }
    if (!string.IsNullOrWhiteSpace(instrument)) Console.WriteLine(i + ": " + instrument);
}

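This code opens the file and lists the tracks it contains. Each MIDI file formatted by Anvil Studio has multiple tracks, where track zero describes information such as the beats per minute (BPM), along with other information that helps us determine precisely when notes occur. Every other track contains a PatchChangeEvent. This tells the MIDI file what type of sound to use, such as trumpet, piano, or violin. We check each track for its one PatchChangeEvent. PatchChangeEvent.ToString() has the format "{time} PatchChange Ch: {channel number} {patch name}", so taking everything after the colon and removing the channel number gives the instrument name. Track 0 will not have any PatchChangeEvents when formatted with Anvil, since it has no notes to play. This means that if the instrument is empty, we can leave that track out of the list shown to the user.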
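
The string handling can be tried without a MIDI file. The sketch below applies the same Split and Replace steps to a hand-written sample string that follows the format described above. The sample text and the ExtractInstrument helper are assumptions made for this illustration; the exact text produced by PatchChangeEvent.ToString() may differ between NAudio versions.

```csharp
using System;

static class PatchNameDemo
{
    // Hypothetical helper: pulls the patch name out of text shaped like
    // "{time} PatchChange Ch: {channel} {patch name}".
    public static string ExtractInstrument(string eventText, int channel)
    {
        // Everything after the first colon: " {channel} {patch name}"
        string afterColon = eventText.Split(':')[1];
        // Remove the " {channel} " prefix, leaving only the patch name.
        return afterColon.Replace(" " + channel + " ", "");
    }

    static void Main()
    {
        // Hand-written sample, not captured from a real file.
        string sample = "0 PatchChange Ch: 1 Acoustic Grand Piano";
        Console.WriteLine(ExtractInstrument(sample, 1)); // prints "Acoustic Grand Piano"
    }
}
```

One thing to keep in mind with this approach is that Replace removes every occurrence of " {channel} ", so a patch name that happens to contain the channel number surrounded by spaces would be altered as well.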

Now, let's look at some timing information:

var trackevents = file.Events.GetTrackEvents(trackN);
//track 0 holds the tempo information
var tempoGetter = file.Events.GetTrackEvents(0);
List<MidiNote> notes = new List<MidiNote>();
int tempo = 0;
foreach (var e in tempoGetter)
{
    //get the tempo and drop out of that track
    if (e is TempoEvent)
    {
        //tempo in milliseconds
        tempo = (e as TempoEvent).MicrosecondsPerQuarterNote / 1000;
        break;
    }
}

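We start by getting the events for track 0 and for the melody track specified by the user. Track 0 contains all the timing information we need. We loop through every event in track 0. If it is a tempo event, we get the number of milliseconds per quarter note. Since this code assumes there is only one tempo event, the loop then breaks to save time (a song that changes tempo partway through would need more work).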
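
For reference, the tempo stored in a MIDI file is the number of microseconds per quarter note, which is why the code divides by 1000. The sketch below shows that conversion together with the equivalent BPM value. The helper names are invented for this example. It also illustrates a side effect of the integer division used above: any fraction of a millisecond is dropped, and that error accumulates over a long song, so a floating-point tempo may be preferable.

```csharp
using System;

static class TempoDemo
{
    // Microseconds per quarter note -> milliseconds per quarter note, keeping the fraction.
    public static double MillisecondsPerQuarter(int microsecondsPerQuarterNote)
    {
        return microsecondsPerQuarterNote / 1000.0;
    }

    // Microseconds per quarter note -> beats (quarter notes) per minute.
    public static double BeatsPerMinute(int microsecondsPerQuarterNote)
    {
        return 60000000.0 / microsecondsPerQuarterNote;
    }

    static void Main()
    {
        int tempo = 500000; // a common default tempo
        Console.WriteLine(MillisecondsPerQuarter(tempo)); // 500
        Console.WriteLine(BeatsPerMinute(tempo));         // 120

        // About 130 BPM: integer division truncates 461.538 ms to 461 ms.
        int fast = 461538;
        int truncated = fast / 1000;
        double exact = MillisecondsPerQuarter(fast);
        // Difference in start time after 400 quarter notes, in milliseconds.
        Console.WriteLine((exact - truncated) * 400);
    }
}
```

In the second case the truncated tempo places a note 400 quarter notes into the song roughly 215 ms early, which is noticeable in a rhythm game.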

Before we can process all of this information, we have to read each note:

for (int i = 0; i < trackevents.Count; i++)
{
    //for every note
    MidiEvent e = trackevents[i];
    //if it's a note turning ON
    if (e is NoteOnEvent)
    {
        //the note on event, contains the time, volume, and the length of note
        var On = e as NoteOnEvent;
        //the note event, contains the midi note number (pitch)
        var n = e as NoteEvent;
        //the absolute time (in delta ticks) over the delta ticks per quarter note times the number of milliseconds per quarter note = time in milliseconds
        notes.Add(new MidiNote(On.NoteLength, midi[n.NoteNumber], (long)((On.AbsoluteTime / (float)file.DeltaTicksPerQuarterNote) * tempo), On.Velocity));
    }
}

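As you may have noticed, MIDI files contain many types of events. In this block, we focus on NoteEvent and NoteOnEvent. Each time a note-on event occurs, a new MidiNote is added to the list of notes. MidiNote is defined as follows: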

public struct MidiNote
{
    public int duration;
    public int frequency;
    public long startTime;
    public int volume;
    public int button;
    public bool? importance;//null = low,false=mid,true=high
    public MidiNote(int duration, double frequency, long startTime, int volume)
    {
        this.duration = duration;
        this.frequency = (int)frequency;
        this.startTime = startTime;
        this.volume = volume;
        button = 0;
        importance = null;
    }
    public override string ToString()
    {
        return "@" + startTime + "-" + (startTime + duration) + ":" + frequency;
    }
}

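Each note defines the frequency, duration, and volume that are used to determine the button and the importance. It also defines the time at which the note starts (in milliseconds).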
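
The start time is computed from the event's absolute time in ticks. A minimal sketch of that arithmetic, using made-up names and sample values rather than values read from a real file, looks like this:

```csharp
using System;

static class TickTimeDemo
{
    // ticks / ticks-per-quarter-note = number of quarter notes elapsed;
    // multiplying by milliseconds-per-quarter-note gives the time in milliseconds.
    public static long TicksToMilliseconds(long absoluteTicks, int ticksPerQuarter, double msPerQuarter)
    {
        return (long)(absoluteTicks / (double)ticksPerQuarter * msPerQuarter);
    }

    static void Main()
    {
        int ticksPerQuarter = 480;   // sample resolution (assumed for this example)
        double msPerQuarter = 500.0; // 120 BPM

        Console.WriteLine(TicksToMilliseconds(240, ticksPerQuarter, msPerQuarter)); // 250: an eighth note in
        Console.WriteLine(TicksToMilliseconds(960, ticksPerQuarter, msPerQuarter)); // 1000: two quarter notes in
    }
}
```

If the stored duration is also measured in ticks (an assumption about the value passed to MidiNote above), the same helper would convert it to milliseconds so that start time and duration share one unit.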

Now we need the minimum and maximum of the volume, duration, and frequency so that we can derive the unknown values. Then we sort the notes by start time:

//uses known values to get unknown values needed for a guitar hero clone
//get the min and max frequency
notes.Sort((n, n2) => n.frequency.CompareTo(n2.frequency));
int minFreq = notes.First().frequency;
int maxFreq = notes.Last().frequency;

//get the min and max volume
notes.Sort((n, n2) => n.volume.CompareTo(n2.volume));
int minVol = notes.First().volume;
int maxVol = notes.Last().volume;

//get the min and max note duration
notes.Sort((n, n2) => n.duration.CompareTo(n2.duration));
int minLen = notes.First().duration;
int maxLen = notes.Last().duration;

//sort by time
notes.Sort((n, n2) => n.startTime.CompareTo(n2.startTime));
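
Sorting the whole list three times works, but the extremes can also be read directly. The following sketch shows one alternative using LINQ's Min and Max. SimpleNote is a stand-in type defined only for this example so that the block is self-contained; it is not the article's MidiNote.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class RangeDemo
{
    // Stand-in for MidiNote with just the fields used here (hypothetical type).
    public struct SimpleNote
    {
        public int Frequency;
        public int Volume;
        public long StartTime;
    }

    // Returns (min, max) of whichever field the selector picks.
    // Like First() and Last(), this throws if the list is empty.
    public static Tuple<int, int> Range(IEnumerable<SimpleNote> notes, Func<SimpleNote, int> selector)
    {
        return Tuple.Create(notes.Min(selector), notes.Max(selector));
    }

    static void Main()
    {
        var notes = new List<SimpleNote>
        {
            new SimpleNote { Frequency = 440, Volume = 90, StartTime = 500 },
            new SimpleNote { Frequency = 220, Volume = 127, StartTime = 0 },
            new SimpleNote { Frequency = 880, Volume = 64, StartTime = 250 },
        };

        Tuple<int, int> freq = Range(notes, n => n.Frequency);
        Tuple<int, int> vol = Range(notes, n => n.Volume);
        Console.WriteLine("frequency " + freq.Item1 + ".." + freq.Item2); // frequency 220..880
        Console.WriteLine("volume " + vol.Item1 + ".." + vol.Item2);      // volume 64..127

        // Only one sort is still needed: chronological order for output.
        notes.Sort((a, b) => a.StartTime.CompareTo(b.StartTime));
        Console.WriteLine("first note starts at " + notes[0].StartTime);  // first note starts at 0
    }
}
```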

Now we have to convert each MidiNote to a Note:

List<Note> nt = new List<Note>();
foreach (MidiNote n in notes)
{
    MidiNote N = n;
    //gets unknown values for button and importance based off of known values
    buttonSignificance(ref N, minFreq, maxFreq, minVol, maxVol, minLen, maxLen, sortType == "v");
    nt.Add((Note)N);
}

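What is the difference between MidiNote and Note? The main difference is that Note lacks the duration and frequency. They are not needed, since the frequency becomes the button and the duration is only used for sorting by importance. Note can also be serialized to XML: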

[Serializable]
public class Note : IXmlSerializable
{
    int m, s, ms, b;
    bool? sig;

    public Note()
    { }

    public Note(int minute, int second, int milli, int button, bool? significance)
    {
        m = minute;
        s = second;
        ms = milli;
        sig = significance;
        b = button;
    }

    public System.Xml.Schema.XmlSchema GetSchema()
    {
        //GetSchema should always return null
        return null;
    }

    public void ReadXml(XmlReader reader)
    {
        //move to the next node. If it's a note, get the data
        if (reader.MoveToContent() == XmlNodeType.Element && reader.LocalName == "Note")
        {
            ms = int.Parse(reader.GetAttribute("milliseconds"));
            s = int.Parse(reader.GetAttribute("seconds"));
            m = int.Parse(reader.GetAttribute("minutes"));
            b = int.Parse(reader.GetAttribute("button"));
            string input = reader.GetAttribute("significance");
            bool sn;
            if (bool.TryParse(input, out sn))
            {
                sig = sn;
            }
            else
            {
                sig = null;
            }
            reader.Read();
        }
    }

    public void WriteXml(XmlWriter writer)
    {
        //write values to XML
        writer.WriteAttributeString("milliseconds", ms.ToString());
        writer.WriteAttributeString("seconds", s.ToString());
        writer.WriteAttributeString("minutes", m.ToString());
        writer.WriteAttributeString("significance", !sig.HasValue ? "null" : sig.ToString());
        writer.WriteAttributeString("button", b.ToString());
    }

    public static explicit operator Note(MidiNote n)
    {
        Note ret = new Note();
        ret.b = n.button;
        ret.sig = n.importance;
        long time = n.startTime;

        TimeSpan span = new TimeSpan(0, 0, 0, 0, (int)time);
        ret.m = span.Minutes;
        ret.s = span.Seconds;
        ret.ms = span.Milliseconds;

        return ret;
    }
}
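
The explicit conversion above splits the start time into minutes, seconds, and milliseconds with TimeSpan. The sketch below shows that split on its own, with a made-up helper name. It also shows one detail worth knowing: TimeSpan.Minutes is the minutes component (0-59), so for a start time past one hour it wraps around, while TotalMinutes keeps counting. This sketch uses TotalMinutes, which is a deliberate difference from the article's code.

```csharp
using System;

static class TimeSplitDemo
{
    // Splits a time in milliseconds into { whole minutes, seconds, milliseconds }.
    public static int[] Split(long milliseconds)
    {
        TimeSpan span = TimeSpan.FromMilliseconds(milliseconds);
        // TotalMinutes does not wrap at 60, unlike the Minutes component.
        return new[] { (int)span.TotalMinutes, span.Seconds, span.Milliseconds };
    }

    static void Main()
    {
        int[] a = Split(83456);
        Console.WriteLine(a[0] + " min " + a[1] + " s " + a[2] + " ms"); // 1 min 23 s 456 ms

        // 1 hour, 1 minute, 1 second: the Minutes component alone would report 1 here.
        int[] b = Split(3661000);
        Console.WriteLine(b[0] + " min " + b[1] + " s " + b[2] + " ms"); // 61 min 1 s 0 ms
    }
}
```

For songs of ordinary length the two approaches give the same result, so this only matters for very long tracks.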

We also define the buttonSignificance(ref MidiNote, int, int, int, int, int, int, bool) function:

static void buttonSignificance(ref MidiNote note, int minFreq, int maxFreq, int minVel, int maxVel, int minLen, int maxLen, bool sortVol)
{
    //divide the frequencies into five steps
    float btnStep = (maxFreq - minFreq) / 5.0f;

    float button1Min = minFreq;
    float button2Min = button1Min + btnStep;
    float button3Min = button2Min + btnStep;
    float button4Min = button3Min + btnStep;
    float button5Min = button4Min + btnStep;
    //based off of the note frequency, get the button it needs to be
    if (note.frequency >= button1Min && note.frequency < button2Min) note.button = 0;
    if (note.frequency >= button2Min && note.frequency < button3Min) note.button = 1;
    if (note.frequency >= button3Min && note.frequency < button4Min) note.button = 2;
    if (note.frequency >= button4Min && note.frequency < button5Min) note.button = 3;
    if (note.frequency >= button5Min && note.frequency <= maxFreq) note.button = 4;

    if (sortVol)
    {
        //if sorting by volume, split volume into three steps and get importance
        float vStep = (maxVel - minVel) / 3.0f;
        float v1 = minVel;
        float v2 = v1 + vStep;
        float v3 = v2 + vStep;
        if (note.volume >= v1 && note.volume < v2) note.importance = null;
        if (note.volume >= v2 && note.volume < v3) note.importance = false;
        if (note.volume >= v3 && note.volume <= maxVel) note.importance = true;
    }
    else
    {
        //if sorting by duration, split duration into three steps and get importance
        float lStep = (maxLen - minLen) / 3.0f;
        float l1 = minLen;
        float l2 = l1 + lStep;
        float l3 = l2 + lStep;
        if (note.duration >= l1 && note.duration < l2) note.importance = null;
        if (note.duration >= l2 && note.duration < l3) note.importance = false;
        if (note.duration >= l3 && note.duration <= maxLen) note.importance = true;
    }
}

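This function uses the note's frequency relative to the song's frequency range to get the button (0-4), and determines the note's importance from its duration or volume relative to the longest or loudest note. Personally, I prefer to sort by duration; volume is perfectly valid too, it just does not work as well as duration.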
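
The same equal-width split can be written as one small helper, which makes it easier to check the boundaries by hand. This is a sketch with hypothetical names (Bucket, ToImportance), not the article's function. It uses integer arithmetic instead of float steps, and when every note has the same value it puts them all in bucket 0, which is a choice made for this example.

```csharp
using System;

static class ButtonDemo
{
    // Maps a value in [min, max] onto equal-width buckets numbered 0..buckets-1.
    public static int Bucket(int value, int min, int max, int buckets)
    {
        // Degenerate range: nothing to divide, so use the first bucket (choice made for this sketch).
        if (max <= min) return 0;
        long index = (long)(value - min) * buckets / (max - min);
        // The maximum itself would land one past the end, so clamp it into the last bucket.
        return (int)Math.Min(index, buckets - 1);
    }

    // Three importance levels, encoded as in the article: null = low, false = mid, true = high.
    public static bool? ToImportance(int bucket)
    {
        if (bucket == 0) return null;
        return bucket == 2;
    }

    static void Main()
    {
        // Sample frequency range of 100..600 Hz split across five buttons (step of 100 Hz).
        foreach (int freq in new[] { 100, 250, 599, 600 })
        {
            Console.WriteLine(freq + " Hz -> button " + Bucket(freq, 100, 600, 5));
        }
        // Sample duration range of 100..400 split into three importance levels.
        foreach (int len in new[] { 100, 250, 400 })
        {
            bool? importance = ToImportance(Bucket(len, 100, 400, 3));
            Console.WriteLine(len + " -> " + (importance.HasValue ? importance.Value.ToString() : "null"));
        }
    }
}
```

With these sample ranges, 100 Hz maps to button 0, 250 Hz to button 1, and both 599 Hz and 600 Hz to button 4, while the durations 100, 250, and 400 map to low, medium, and high importance respectively.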

Finally, we have to serialize it to an XML document:

//serialize to XML document
XmlTextWriter w = new XmlTextWriter(output, null);
XmlSerializer serializer = new XmlSerializer(typeof(List<Note>));
serializer.Serialize(w, nt);
w.Close();
Console.WriteLine("done");
Console.ReadKey();

That wraps everything up. The MIDI file will be converted to a song file (XML).

Points of Interest

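Overall, beat detection is a complicated process. However, it can be as simple as what we did here, or as complex as we want it to be (real-time beat detection, for example). This is just one of many approaches to beat detection. If you are truly interested in this field, I encourage you to try this with WAV files. Unless you do signal processing for a living, that can be very difficult. However, if you are not constrained by time (like me) and are very persistent, you can do great things with beat detection.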

Thank you for reading my first CodeProject article; your feedback is greatly appreciated.

History

August 5, 2014: Original post
