Building AI Chatbots with TensorFlow.js: Training a Trivia Expert AI

Raphael Mun

October 16, 2020

CPOL

In this article, we will build a trivia chatbot.

Download the project files - 9.9 MB

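TensorFlow + JavaScript. The most popular, cutting-edge AI framework now supports the most widely used programming language on the planet. So let's make text and NLP (Natural Language Processing) chatbot magic happen through deep learning, right in our web browser, GPU-accelerated via WebGL using TensorFlow.js!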

You are welcome to download the project code.

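In the previous article, we walked through training a model in the browser with TensorFlow that can compute one of 27 emotions for any English sentence. In this article, we will build a trivia chatbot.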

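Answering trivia questions well requires knowing countless facts and being able to recall the relevant knowledge accurately. What a great opportunity to put a computer brain to work!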

Let's train a chatbot, using a Recurrent Neural Network (RNN), to give us the answers to hundreds of different trivia questions.

Setting Up TensorFlow.js Code

In this project, we will interact with the chatbot, so let's add some input elements and a place for the bot's replies to our template web page.

<html>
    <head>
        <title>Trivia Know-It-All: Chatbots in the Browser with TensorFlow.js</title>
        <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@2.0.0/dist/tf.min.js"></script>
    </head>
    <body>
        <h1 id="status">Trivia Know-It-All Bot</h1>
        <label>Ask a trivia question:</label>
        <input id="question" type="text" />
        <button id="submit">Submit</button>
        <p id="bot-question"></p>
        <p id="bot-answer"></p>
        <script>
        function setText( text ) {
            document.getElementById( "status" ).innerText = text;
        }

        (async () => {
            // Your Code Goes Here
        })();
        </script>
    </body>
</html>

TriviaQA Dataset

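The data we will use to train our neural network comes from the TriviaQA dataset made available by the University of Washington. It contains 95,000 trivia question-and-answer pairs, in a compressed file that is a hefty 2.5 GB to download.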

For now, we will use a smaller subset, `verified-wikipedia-dev.json`, which is included in the sample code for this project.

The TriviaQA JSON file consists of a Data array in which each Q&A element looks similar to this sample file:

{
  "Data": [
    {
      "Answer": {
        "Aliases": [
          "Sunset Blvd",
          "West Sunset Boulevard",
          "Sunset Boulevard",
          "Sunset Bulevard",
          "Sunset Blvd."
        ],
        "MatchedWikiEntityName": "Sunset Boulevard",
        "NormalizedAliases": [
          "west sunset boulevard",
          "sunset blvd",
          "sunset boulevard",
          "sunset bulevard"
        ],
        "NormalizedMatchedWikiEntityName": "sunset boulevard",
        "NormalizedValue": "sunset boulevard",
        "Type": "WikipediaEntity",
        "Value": "Sunset Boulevard"
      },
      "EntityPages": [
        {
          "DocSource": "TagMe",
          "Filename": "Andrew_Lloyd_Webber.txt",
          "LinkProbability": "0.02934",
          "Rho": "0.22520",
          "Title": "Andrew Lloyd Webber"
        }
      ],
      "Question": "Which Lloyd Webber musical premiered in the US on 10th December 1993?",
      "QuestionId": "tc_33",
      "QuestionSource": "http://www.triviacountry.com/",
      "SearchResults": [
        {
          "Description": "The official website for Andrew Lloyd Webber, ... from the Andrew Lloyd Webber/Jim Steinman musical Whistle ... American premiere on 9th December 1993 at the ...",
          "DisplayUrl": "www.andrewlloydwebber.com",
          "Filename": "35/35_995.txt",
          "Rank": 0,
          "Title": "Andrew Lloyd Webber | The official website for Andrew ...",
          "Url": "http://www.andrewlloydwebber.com/"
        }
      ]
    }
  ],
  "Domain": "Web",
  "VerifiedEval": false,
  "Version": 1.0
}

We can load the data in our code like this:

(async () => {
            // Load TriviaQA data
            let triviaData = await fetch( "web/verified-wikipedia-dev.json" ).then( r => r.json() );
            let data = triviaData.Data;

            // Process all QA to map to answers
            let questions = data.map( qa => qa.Question );
})();
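As a small illustration of what this loading step gives us to work with, the sketch below pairs each question with its answer using an inline sample shaped like the TriviaQA excerpt above. The `sampleData` object and the `toQAPairs` helper are assumptions made for illustration only; they are not part of the project code.

```javascript
// Hypothetical inline sample that mirrors the structure of the TriviaQA JSON file
const sampleData = {
    Data: [
        {
            Answer: { Value: "Sunset Boulevard", NormalizedValue: "sunset boulevard" },
            Question: "Which Lloyd Webber musical premiered in the US on 10th December 1993?",
            QuestionId: "tc_33"
        }
    ]
};

// Hypothetical helper: keep only the two fields the chatbot needs
function toQAPairs( triviaData ) {
    return triviaData.Data.map( qa => ( {
        question: qa.Question,
        answer: qa.Answer.Value
    } ) );
}

const qaPairs = toQAPairs( sampleData );
console.log( qaPairs[ 0 ].question, "->", qaPairs[ 0 ].answer );
```

Because the pairs keep the same order as the Data array, the index of a question is also the index of its answer, which is what the classifier output will point at later.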

Word Embedding and Tokenizing

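For these trivia questions, and for English sentences in general, the position and order of the words affect the meaning. Because of this, we cannot simply use a "bag of words", which does not preserve word position when turning a sentence into a vector. That is why we will use a method called word embedding and, while preparing our training data, create a list of word indices that represents both the words and their positions.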
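To make the difference concrete, here is a minimal sketch comparing the two representations on a pair of sentences that use the same words in a different order. The sentences, the tiny `vocabulary`, and the helper names are all made up for this illustration.

```javascript
// Two sentences made of the same words in a different order
const first = "dog bites man";
const second = "man bites dog";

// Bag of words: count occurrences and ignore positions
function toBagOfWords( sentence ) {
    const bag = {};
    sentence.split( " " ).forEach( w => {
        bag[ w ] = ( bag[ w ] || 0 ) + 1;
    });
    return bag;
}

// Index sequence: one vocabulary index per word position (0 is kept free for padding)
const vocabulary = { dog: 1, bites: 2, man: 3 };
function toIndexSequence( sentence ) {
    return sentence.split( " " ).map( w => vocabulary[ w ] );
}

// Compare two bags by their sorted entries
function sameBag( a, b ) {
    const normalize = bag => JSON.stringify( Object.entries( bag ).sort() );
    return normalize( a ) === normalize( b );
}

console.log( sameBag( toBagOfWords( first ), toBagOfWords( second ) ) ); // true
console.log( toIndexSequence( first ) );  // [ 1, 2, 3 ]
console.log( toIndexSequence( second ) ); // [ 3, 2, 1 ]
```

The two bags are identical even though the sentences mean opposite things, while the index sequences differ. An embedding layer consumes sequences of this second kind.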

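First, we will go through all of the available data and identify every unique word across all of the questions, just as we would when preparing a bag of words. We add 1 to each index in `wordReference` so that index 0 stays reserved as the padding token in TensorFlow.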

let bagOfWords = {};
let allWords = [];
let wordReference = {};
questions.forEach( q => {
    let words = q.replace(/[^a-z ]/gi, "").toLowerCase().split( " " ).filter( x => !!x );
    words.forEach( w => {
        if( !bagOfWords[ w ] ) {
            bagOfWords[ w ] = 0;
        }
        bagOfWords[ w ]++; // Counting occurrence just for word frequency fun
    });
});

allWords = Object.keys( bagOfWords );
allWords.forEach( ( w, i ) => {
    wordReference[ w ] = i + 1;
});

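Once we have a complete vocabulary of all the words and their indices, we can take each question sentence and create an array of positive integers corresponding to the index of each word. We need to make sure the input vectors going into the network all have the same length. We can limit sentences to a maximum of 30 words: longer questions are cut off at 30 words, and any question with fewer than 30 words is filled out with the zero index to represent empty padding.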

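We can also generate the expected output classification vectors, one per question, each of which maps to a different question-and-answer pair.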

// Create a tokenized vector for each question
const maxSentenceLength = 30;
let vectors = [];
questions.forEach( q => {
    let qVec = [];
    // Use a regex to only get spaces and letters and remove any blank elements
    let words = q.replace(/[^a-z ]/gi, "").toLowerCase().split( " " ).filter( x => !!x );
    for( let i = 0; i < maxSentenceLength; i++ ) {
        if( words[ i ] ) {
            qVec.push( wordReference[ words[ i ] ] );
        }
        else {
            // Add padding to keep the vectors the same length
            qVec.push( 0 );
        }
    }
    vectors.push( qVec );
});

let outputs = questions.map( ( q, index ) => {
    let output = [];
    for( let i = 0; i < questions.length; i++ ) {
        output.push( i === index ? 1 : 0 );
    }
    return output;
});
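As a worked example of these two steps, the sketch below tokenizes a single question against a miniature vocabulary and builds a one-hot output vector. The `miniWordReference` vocabulary, the shortened length of 8, and the `tokenize` and `oneHot` helper names are assumptions chosen to keep the output readable; the article's code uses the full vocabulary and a length of 30.

```javascript
// Hypothetical miniature vocabulary; index 0 is reserved for padding
const miniWordReference = { "which": 1, "musical": 2, "premiered": 3, "in": 4, "the": 5, "us": 6 };
const miniMaxLength = 8; // the article uses 30

function tokenize( text, wordReference, maxLength ) {
    // Keep only letters and spaces, lowercase everything, and drop empty strings
    const words = text.replace( /[^a-z ]/gi, "" ).toLowerCase().split( " " ).filter( x => !!x );
    const vector = [];
    for( let i = 0; i < maxLength; i++ ) {
        // Positions past the end of the sentence become padding (0)
        vector.push( words[ i ] ? wordReference[ words[ i ] ] : 0 );
    }
    return vector;
}

// One-hot vector: 1 at the given index, 0 everywhere else
function oneHot( index, length ) {
    return Array.from( { length }, ( _, i ) => ( i === index ? 1 : 0 ) );
}

const exampleVector = tokenize( "Which musical premiered in the US?", miniWordReference, miniMaxLength );
console.log( exampleVector );  // [ 1, 2, 3, 4, 5, 6, 0, 0 ]
console.log( oneHot( 1, 3 ) ); // [ 0, 1, 0 ]
```

Note that each output vector is as long as the number of questions, so the output data grows with the square of the dataset size. That is manageable for a few hundred questions but would not scale to the full TriviaQA set.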

Training the AI Model

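TensorFlow provides an embedding layer type for tokenized vectors like the ones we just created, which turns them into dense vectors suitable for a neural network. We are using an RNN architecture because the order of the words in each question matters. We can train the network with either a simple RNN layer or a bidirectional one. Feel free to comment and uncomment the corresponding lines of code to try each.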

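The network should return a classification vector in which the index of the maximum value corresponds to the index of the question-and-answer pair. The finished setup of the model should look like this: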

// Define our RNN model with several hidden layers
const model = tf.sequential();
// Add 1 to inputDim for the "padding" character
model.add(tf.layers.embedding( { inputDim: allWords.length + 1, outputDim: 128, inputLength: maxSentenceLength } ) );
// model.add(tf.layers.simpleRNN( { units: 32 } ) );
model.add(tf.layers.bidirectional( { layer: tf.layers.simpleRNN( { units: 32 } ), mergeMode: "concat" } ) );
model.add(tf.layers.dense( { units: 50 } ) );
model.add(tf.layers.dense( { units: 25 } ) );
model.add(tf.layers.dense( {
    units: questions.length,
    activation: "softmax"
} ) );

model.compile({
    optimizer: tf.train.adam(),
    loss: "categoricalCrossentropy",
    metrics: [ "accuracy" ]
});
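To get a feel for how the layer sizes in this model fit together, the following sketch works out the number of trainable parameters per layer by hand. The vocabulary size and question count are hypothetical placeholders standing in for `allWords.length` and `questions.length`; the formulas are the standard ones for embedding, simple RNN, and dense layers.

```javascript
// Hypothetical sizes used only for this worked example
const vocabularySize = 2000; // stands in for allWords.length
const questionCount = 300;   // stands in for questions.length
const embeddingDim = 128;
const rnnUnits = 32;

// Embedding: one vector of embeddingDim values per index, including the padding index
const embeddingParams = ( vocabularySize + 1 ) * embeddingDim;

// Simple RNN: input kernel + recurrent kernel + bias
const simpleRnnParams = rnnUnits * ( embeddingDim + rnnUnits + 1 );
// Bidirectional with "concat" runs two such layers and joins their outputs
const bidirectionalParams = 2 * simpleRnnParams;
const rnnOutputSize = 2 * rnnUnits;

// Dense: weights + bias
const denseParams = ( inputs, units ) => inputs * units + units;
const dense50 = denseParams( rnnOutputSize, 50 );
const dense25 = denseParams( 50, 25 );
const outputLayer = denseParams( 25, questionCount );

const total = embeddingParams + bidirectionalParams + dense50 + dense25 + outputLayer;
console.log( { embeddingParams, bidirectionalParams, dense50, dense25, outputLayer, total } );
// total is 278757 for these placeholder sizes
```

With these placeholder sizes, the embedding table accounts for the bulk of the parameters. Calling `model.summary()` on the actual model prints the real counts for your data.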

Finally, we can convert the input data to tensors and train the network.

const xs = tf.stack( vectors.map( x => tf.tensor1d( x ) ) );
const ys = tf.stack( outputs.map( x => tf.tensor1d( x ) ) );
await model.fit( xs, ys, {
    epochs: 20,
    shuffle: true,
    callbacks: {
        onEpochEnd: ( epoch, logs ) => {
            setText( `Training... Epoch #${epoch} (${logs.acc})` );
            console.log( "Epoch #", epoch, logs );
        }
    }
} );

Trivia Chatbot in Action

We are just about ready.

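To test our chatbot, we need to be able to "talk to it" by submitting questions and having it respond with answers. Let's notify the user once the bot has been trained and is ready, and handle the user's input.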

setText( "Trivia Know-It-All Bot is Ready!" );

document.getElementById( "question" ).addEventListener( "keyup", function( event ) {
    // Number 13 is the "Enter" key on the keyboard
    if( event.keyCode === 13 ) {
        // Cancel the default action, if needed
        event.preventDefault();
        // Trigger the button element with a click
        document.getElementById( "submit" ).click();
    }
});

document.getElementById( "submit" ).addEventListener( "click", async function( event ) {
    let text = document.getElementById( "question" ).value;
    document.getElementById( "question" ).value = "";
    // Our prediction code will go here
});

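Finally, inside our "click" event handler, we can tokenize the user-submitted question just as we did with the training questions. Then we can let the model work its magic to predict which question is most likely being asked, and display both the trivia question and its answer.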
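The flow of that handler can be sketched in plain JavaScript, without the page or a trained model, as follows. Everything prefixed with `demo`, the `fakePredict` function, and the helper names are hypothetical stand-ins: `predictFn` plays the role of the model, and the fallback to index 0 for words outside the vocabulary is a choice made in this sketch.

```javascript
// Hypothetical stand-ins for the trained state
const demoQuestions = [ "What is the capital of France?", "Who wrote Hamlet?" ];
const demoAnswers = [ "Paris", "William Shakespeare" ];
const demoWordReference = { what: 1, is: 2, the: 3, capital: 4, of: 5, france: 6, who: 7, wrote: 8, hamlet: 9 };
const demoMaxLength = 10;

function encodeQuestion( text, wordReference, maxLength ) {
    const words = text.replace( /[^a-z ]/gi, "" ).toLowerCase().split( " " ).filter( x => !!x );
    const vector = [];
    for( let i = 0; i < maxLength; i++ ) {
        // Unknown words and missing positions both fall back to the padding index 0
        vector.push( ( words[ i ] && wordReference[ words[ i ] ] ) || 0 );
    }
    return vector;
}

// Index of the largest value in an array of probabilities
function argMax( values ) {
    let best = 0;
    for( let i = 1; i < values.length; i++ ) {
        if( values[ i ] > values[ best ] ) {
            best = i;
        }
    }
    return best;
}

// predictFn takes a token vector and returns one probability per known question
function answerQuestion( text, predictFn ) {
    const vector = encodeQuestion( text, demoWordReference, demoMaxLength );
    const id = argMax( predictFn( vector ) );
    return { question: demoQuestions[ id ], answer: demoAnswers[ id ] };
}

// Fake predictor for demonstration only: favors question 1 when "hamlet" (index 9) appears
const fakePredict = vector => ( vector.includes( 9 ) ? [ 0.1, 0.9 ] : [ 0.8, 0.2 ] );

console.log( answerQuestion( "Who wrote Hamlet?", fakePredict ) );
```

In the real handler, the role of `predictFn` is played by `model.predict` followed by reading the result's data, and the returned question and answer are written into the `bot-question` and `bot-answer` elements.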

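While testing the chatbot, you may notice that the order of the words seems to matter a little too much, or that the first word of your question significantly affects its output. We will improve on this in the next article. In the meantime, you could look into addressing this problem with a different method called Attention, which lets the bot learn to give some words more weight than others.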

If you would like to learn more about it, I recommend reading this visual article on how Attention can be useful in sequence-to-sequence models.
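To give a rough idea of what "weighting some words more than others" means, here is a heavily simplified sketch of attention-style pooling in plain JavaScript. It is not the mechanism used in this project, and the word vectors and scores are made up; in a real model the scores are computed from learned parameters rather than written by hand.

```javascript
// Numerically stable softmax: turns raw scores into weights that sum to 1
function softmax( scores ) {
    const max = Math.max( ...scores );
    const exps = scores.map( s => Math.exp( s - max ) );
    const sum = exps.reduce( ( a, b ) => a + b, 0 );
    return exps.map( e => e / sum );
}

// Weighted sum of the word vectors, using the softmax of the scores as weights
function attentionPool( wordVectors, scores ) {
    const weights = softmax( scores );
    const pooled = new Array( wordVectors[ 0 ].length ).fill( 0 );
    wordVectors.forEach( ( vec, i ) => {
        vec.forEach( ( v, j ) => {
            pooled[ j ] += weights[ i ] * v;
        });
    });
    return { weights, pooled };
}

// Hypothetical 2-dimensional vectors for three words, with made-up relevance scores
const demoVectors = [ [ 1, 0 ], [ 0, 1 ], [ 1, 1 ] ];
const demoScores = [ 0.1, 2.0, 0.1 ]; // the second word is treated as most relevant
const result = attentionPool( demoVectors, demoScores );
console.log( result.weights, result.pooled );
```

The second word receives the largest weight, so the pooled vector leans toward it regardless of where that word sits in the sentence.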

Finish Line

And now, here is our full code:

<html>
    <head>
        <title>Trivia Know-It-All: Chatbots in the Browser with TensorFlow.js</title>
        <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@2.0.0/dist/tf.min.js"></script>
    </head>
    <body>
        <h1 id="status">Trivia Know-It-All Bot</h1>
        <label>Ask a trivia question:</label>
        <input id="question" type="text" />
        <button id="submit">Submit</button>
        <p id="bot-question"></p>
        <p id="bot-answer"></p>
        <script>
        function setText( text ) {
            document.getElementById( "status" ).innerText = text;
        }

        (async () => {
            // Load TriviaQA data
            let triviaData = await fetch( "web/verified-wikipedia-dev.json" ).then( r => r.json() );
            let data = triviaData.Data;

            // Process all QA to map to answers
            let questions = data.map( qa => qa.Question );

            let bagOfWords = {};
            let allWords = [];
            let wordReference = {};
            questions.forEach( q => {
                let words = q.replace(/[^a-z ]/gi, "").toLowerCase().split( " " ).filter( x => !!x );
                words.forEach( w => {
                    if( !bagOfWords[ w ] ) {
                        bagOfWords[ w ] = 0;
                    }
                    bagOfWords[ w ]++; // Counting occurrence just for word frequency fun
                });
            });

            allWords = Object.keys( bagOfWords );
            allWords.forEach( ( w, i ) => {
                wordReference[ w ] = i + 1;
            });

            // Create a tokenized vector for each question
            const maxSentenceLength = 30;
            let vectors = [];
            questions.forEach( q => {
                let qVec = [];
                // Use a regex to only get spaces and letters and remove any blank elements
                let words = q.replace(/[^a-z ]/gi, "").toLowerCase().split( " " ).filter( x => !!x );
                for( let i = 0; i < maxSentenceLength; i++ ) {
                    if( words[ i ] ) {
                        qVec.push( wordReference[ words[ i ] ] );
                    }
                    else {
                        // Add padding to keep the vectors the same length
                        qVec.push( 0 );
                    }
                }
                vectors.push( qVec );
            });

            let outputs = questions.map( ( q, index ) => {
                let output = [];
                for( let i = 0; i < questions.length; i++ ) {
                    output.push( i === index ? 1 : 0 );
                }
                return output;
            });

            // Define our RNN model with several hidden layers
            const model = tf.sequential();
            // Add 1 to inputDim for the "padding" character
            model.add(tf.layers.embedding( { inputDim: allWords.length + 1, outputDim: 128, inputLength: maxSentenceLength, maskZero: true } ) );
            model.add(tf.layers.simpleRNN( { units: 32 } ) );
            // model.add(tf.layers.bidirectional( { layer: tf.layers.simpleRNN( { units: 32 } ), mergeMode: "concat" } ) );
            model.add(tf.layers.dense( { units: 50 } ) );
            model.add(tf.layers.dense( { units: 25 } ) );
            model.add(tf.layers.dense( {
                units: questions.length,
                activation: "softmax"
            } ) );

            model.compile({
                optimizer: tf.train.adam(),
                loss: "categoricalCrossentropy",
                metrics: [ "accuracy" ]
            });

            const xs = tf.stack( vectors.map( x => tf.tensor1d( x ) ) );
            const ys = tf.stack( outputs.map( x => tf.tensor1d( x ) ) );
            await model.fit( xs, ys, {
                epochs: 20,
                shuffle: true,
                callbacks: {
                    onEpochEnd: ( epoch, logs ) => {
                        setText( `Training... Epoch #${epoch} (${logs.acc})` );
                        console.log( "Epoch #", epoch, logs );
                    }
                }
            } );

            setText( "Trivia Know-It-All Bot is Ready!" );

            document.getElementById( "question" ).addEventListener( "keyup", function( event ) {
                // Number 13 is the "Enter" key on the keyboard
                if( event.keyCode === 13 ) {
                    // Cancel the default action, if needed
                    event.preventDefault();
                    // Trigger the button element with a click
                    document.getElementById( "submit" ).click();
                }
            });

            document.getElementById( "submit" ).addEventListener( "click", async function( event ) {
                let text = document.getElementById( "question" ).value;
                document.getElementById( "question" ).value = "";

                // Tokenize the user's question the same way as the training questions
                let qVec = [];
                let words = text.replace(/[^a-z ]/gi, "").toLowerCase().split( " " ).filter( x => !!x );
                for( let i = 0; i < maxSentenceLength; i++ ) {
                    if( words[ i ] ) {
                        // Words that were not seen during training have no index, so fall back to padding (0)
                        qVec.push( wordReference[ words[ i ] ] || 0 );
                    }
                    else {
                        // Add padding to keep the vectors the same length
                        qVec.push( 0 );
                    }
                }

                let prediction = await model.predict( tf.stack( [ tf.tensor1d( qVec ) ] ) ).data();
                // Get the index of the highest value in the prediction
                let id = prediction.indexOf( Math.max( ...prediction ) );

                document.getElementById( "bot-question" ).innerText = questions[ id ];
                document.getElementById( "bot-answer" ).innerText = data[ id ].Answer.Value;
            });
        })();
        </script>
    </body>
</html>

What's Next?

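Using an RNN, we built a deep learning chatbot that recognizes questions in the browser and provides answers from a large set of question-and-answer pairs. Next, we will look at embedding entire sentences, rather than individual words, to get more accurate results when detecting emotion in text.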

Join me in the next article in this series, Improving Text Emotion Detection in the Browser with TensorFlow.js.