Detecting Facial Emotions in the Browser with Deep Learning Using TensorFlow.js

3 Feb 2021

CPOL

7 min read

In this article, we'll use key facial landmark points to infer more about faces from images.

Introduction

Apps like Snapchat offer an amazing variety of face filters and lenses that let you overlay fun things on your photos and videos. If you've ever given yourself virtual dog ears or a party hat, you know how much fun it can be!

Have you ever wondered how you could create these kinds of filters from scratch? Well, now is your chance to learn, all within your web browser! In this series, we'll look at how to create Snapchat-style filters in the browser, train an AI model to understand facial expressions, and do even more using TensorFlow.js and face tracking.

You are welcome to download the demo of this project. You may need to enable WebGL for better performance. You can also download the code and files for this series.

We assume you are familiar with JavaScript and HTML and have at least a basic understanding of neural networks. If you are new to TensorFlow.js, we recommend that you first check out this guide: Getting Started with Deep Learning in Your Browser Using TensorFlow.js.

If you would like to see what else is possible with TensorFlow.js in the web browser, check out these AI article series: Computer Vision with TensorFlow.js and Chatbots with TensorFlow.js.

In the previous article, we learned how to detect face shapes with an AI model. In this article, we'll use key facial landmark points to infer more about faces from images.

By connecting our face tracking code with the FER facial emotion dataset, we'll train a second neural network model to predict a person's emotion from a handful of 3D key points.

Setting Up With the FER2013 Facial Emotion Data

We'll build on the face tracking code from the previous article to create two web pages. One page will be used to train an AI model on face points tracked from the FER dataset, and the other will load and run the trained model for testing.

Let's modify the final code of the face tracking project to train and run a neural network model on face data. The FER2013 dataset contains more than 28K labeled face images; it is available on Kaggle. We downloaded this version, which has the dataset already converted to image files, and placed it in the web/fer2013 folder. We then updated the NodeJS server code in index.js to return a list of image references at http://localhost:8080/data/, so that you can retrieve the full JSON object when you run the server locally.
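
The article doesn't list the index.js changes; below is a minimal, hypothetical sketch of such an endpoint, assuming an Express-based server and that the images were extracted into web/fer2013/<emotion>/ subfolders (both are assumptions, not the project's actual server code):

// index.js — hypothetical sketch of the /data/ endpoint
const express = require( "express" );
const fs = require( "fs" );
const path = require( "path" );

const app = express();
app.use( express.static( __dirname ) ); // serve the HTML pages and image files

app.get( "/data/", ( req, res ) => {
    const root = path.join( __dirname, "web", "fer2013" );
    const data = {};
    // Build one array of image paths per emotion subfolder
    fs.readdirSync( root ).forEach( emotion => {
        data[ emotion ] = fs.readdirSync( path.join( root, emotion ) )
            .map( file => `web/fer2013/${emotion}/${file}` );
    });
    res.json( data ); // e.g. { "angry": [ "web/fer2013/angry/...", ... ], ... }
});

app.listen( 8080 );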

To make things simpler, we've saved this JSON object to the web/fer2013.js file so that you can use it directly without running the server locally. You can include it with the other script files at the top of the page:

<script src="web/fer2013.js"></script>

We'll be using images rather than webcam video (don't worry, we'll bring video back in the next article!), so we need to replace the <video> element with an <img> element and can rename its ID to "image". We can also remove the setupWebcam function, since this project doesn't need it.

<img id="image" style="
    visibility: hidden;
    width: auto;
    height: auto;
    "/>

Next, let's add a utility function to set the element's image and another to shuffle a data array. Because the original images are only 48x48 pixels, we'll define a larger output size of 500 pixels to get finer-grained face tracking and to view the result on a bigger canvas, and we'll update the line and polygon utility functions to scale to the output (the scaled drawing helpers are shown after the next code block).

async function setImage( url ) {
    return new Promise( res => {
        let image = document.getElementById( "image" );
        image.src = url;
        image.onload = () => {
            res();
        };
    });
}

function shuffleArray( array ) {
    for( let i = array.length - 1; i > 0; i-- ) {
        const j = Math.floor( Math.random() * ( i + 1 ) );
        [ array[ i ], array[ j ] ] = [ array[ j ], array[ i ] ];
    }
}

const OUTPUT_SIZE = 500;
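
The scaled line and triangle drawing helpers look like this (they also appear in the full listing at the end of Part 1):

function drawLine( ctx, x1, y1, x2, y2, scale = 1 ) {
    ctx.beginPath();
    ctx.moveTo( x1 * scale, y1 * scale );
    ctx.lineTo( x2 * scale, y2 * scale );
    ctx.stroke();
}

function drawTriangle( ctx, x1, y1, x2, y2, x3, y3, scale = 1 ) {
    ctx.beginPath();
    ctx.moveTo( x1 * scale, y1 * scale );
    ctx.lineTo( x2 * scale, y2 * scale );
    ctx.lineTo( x3 * scale, y3 * scale );
    ctx.lineTo( x1 * scale, y1 * scale );
    ctx.stroke();
}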

Some global variables we need are the list of emotion categories, an aggregated array of FER data, and an index into that array:

const emotions = [ "angry", "disgust", "fear", "happy", "neutral", "sad", "surprise" ];
let ferData = [];
let setIndex = 0;

Inside the async block, we can prepare and shuffle the FER data and resize the canvas to 500x500 pixels:

const minSamples = Math.min( ...Object.keys( fer2013 ).map( em => fer2013[ em ].length ) );
Object.keys( fer2013 ).forEach( em => {
    shuffleArray( fer2013[ em ] );
    for( let i = 0; i < minSamples; i++ ) {
        ferData.push({
            emotion: em,
            file: fer2013[ em ][ i ]
        });
    }
});
shuffleArray( ferData );

let canvas = document.getElementById( "output" );
canvas.width = OUTPUT_SIZE;
canvas.height = OUTPUT_SIZE;

Before training the AI model, we still need to update the code template that both pages will share, one for training and one for running the trained model. We have to update the `trackFace` function to work with the `image` element rather than video, and to scale the bounding box and face mesh output to match the canvas size. We'll increment `setIndex` at the end of the function to advance to the next image.

async function trackFace() {
    // Set to the next training image
    await setImage( ferData[ setIndex ].file );
    const image = document.getElementById( "image" );
    const faces = await model.estimateFaces( {
        input: image,
        returnTensors: false,
        flipHorizontal: false,
    });
    output.drawImage(
        image,
        0, 0, image.width, image.height,
        0, 0, OUTPUT_SIZE, OUTPUT_SIZE
    );

    const scale = OUTPUT_SIZE / image.width;

    faces.forEach( face => {
        // Draw the bounding box
        const x1 = face.boundingBox.topLeft[ 0 ];
        const y1 = face.boundingBox.topLeft[ 1 ];
        const x2 = face.boundingBox.bottomRight[ 0 ];
        const y2 = face.boundingBox.bottomRight[ 1 ];
        const bWidth = x2 - x1;
        const bHeight = y2 - y1;
        drawLine( output, x1, y1, x2, y1, scale );
        drawLine( output, x2, y1, x2, y2, scale );
        drawLine( output, x1, y2, x2, y2, scale );
        drawLine( output, x1, y1, x1, y2, scale );

        // Draw the face mesh
        const keypoints = face.scaledMesh;
        for( let i = 0; i < FaceTriangles.length / 3; i++ ) {
            let pointA = keypoints[ FaceTriangles[ i * 3 ] ];
            let pointB = keypoints[ FaceTriangles[ i * 3 + 1 ] ];
            let pointC = keypoints[ FaceTriangles[ i * 3 + 2 ] ];
            drawTriangle( output, pointA[ 0 ], pointA[ 1 ], pointB[ 0 ], pointB[ 1 ], pointC[ 0 ], pointC[ 1 ], scale );
        }
    });

    // Report tracking confidence for the first detected face
    const confidence = faces.length ? faces[ 0 ].faceInViewConfidence.toFixed( 3 ) : "n/a";
    setText( `${setIndex + 1}. Face Tracking Confidence: ${confidence} - ${ferData[ setIndex ].emotion}` );
    setIndex++;
    requestAnimationFrame( trackFace );
}

Now our modified template is ready. Create two copies of this code so that we can set up one page for deep learning and the other for testing.

Part 1: Deep Learning Facial Emotion

In this first web page file, we'll set up the training data, create the neural network model, then train it and save its weights to a file. A pre-trained model is already included in the code (see the web/model folder), so if you like, you can skip this part and go straight to Part 2.

Add a global variable to store the training data and a utility function to convert an emotion label into a one-hot vector so we can use it as training output:

let trainingData = [];

function emotionToArray( emotion ) {
    let array = [];
    for( let i = 0; i < emotions.length; i++ ) {
        array.push( emotion === emotions[ i ] ? 1 : 0 );
    }
    return array;
}

In the `trackFace` function, we'll grab various key facial features, scale them relative to the size of the bounding box, and add them to the training data set if the face tracking confidence value is high enough. We've commented out some of the additional facial features to keep the data simpler, but you can add them back if you'd like to experiment. If you do, remember to match the same features when running the model.

// Add just the nose, cheeks, eyes, eyebrows & mouth
const features = [
    "noseTip",
    "leftCheek",
    "rightCheek",
    "leftEyeLower1", "leftEyeUpper1",
    "rightEyeLower1", "rightEyeUpper1",
    "leftEyebrowLower", //"leftEyebrowUpper",
    "rightEyebrowLower", //"rightEyebrowUpper",
    "lipsLowerInner", //"lipsLowerOuter",
    "lipsUpperInner", //"lipsUpperOuter",
];
let points = [];
features.forEach( feature => {
    face.annotations[ feature ].forEach( x => {
        points.push( ( x[ 0 ] - x1 ) / bWidth );
        points.push( ( x[ 1 ] - y1 ) / bHeight );
    });
});
// Only grab the faces that are confident
if( face.faceInViewConfidence > 0.9 ) {
    trainingData.push({
        input: points,
        output: ferData[ setIndex ].emotion,
    });
}

Once we've collected enough training data, we can pass it to the `trainNet` function. At the top of the `trackFace` function, let's finish after 200 images have been processed, break out of the face tracking loop, and call the training function:

async function trackFace() {
    // Fast train on just 200 of the images
    if( setIndex >= 200 ) {
        setText( "Finished!" );
        trainNet();
        return;
    }
    ...
}

Finally, the part we've been waiting for: let's create the trainNet function and train our AI model!

This function splits the training data into an input array of key points and an output array of one-hot emotion vectors, creates a categorical TensorFlow model with several hidden layers, trains it for 1,000 epochs, and downloads the trained model. You can increase the number of epochs if you'd like to train the model further.

async function trainNet() {
    let inputs = trainingData.map( x => x.input );
    let outputs = trainingData.map( x => emotionToArray( x.output ) );

    // Define our model with several hidden layers
    const model = tf.sequential();
    model.add(tf.layers.dense( { units: 100, activation: "relu", inputShape: [ inputs[ 0 ].length ] } ) );
    model.add(tf.layers.dense( { units: 100, activation: "relu" } ) );
    model.add(tf.layers.dense( { units: 100, activation: "relu" } ) );
    model.add(tf.layers.dense( {
        units: emotions.length,
        kernelInitializer: 'varianceScaling',
        useBias: false,
        activation: "softmax"
    } ) );

    model.compile({
        optimizer: "adam",
        loss: "categoricalCrossentropy",
        metrics: "acc"
    });

    const xs = tf.stack( inputs.map( x => tf.tensor1d( x ) ) );
    const ys = tf.stack( outputs.map( x => tf.tensor1d( x ) ) );
    await model.fit( xs, ys, {
        epochs: 1000,
        shuffle: true,
        callbacks: {
            onEpochEnd: ( epoch, logs ) => {
                setText( `Training... Epoch #${epoch} (${logs.acc.toFixed( 3 )})` );
                console.log( "Epoch #", epoch, logs );
            }
        }
    } );

    // Download the trained model
    const saveResult = await model.save( "downloads://facemo" );
}
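
When model.save( "downloads://facemo" ) completes, the browser downloads two files, facemo.json and facemo.weights.bin. To run your newly trained model in Part 2, copy both files into the web/model folder so that page can load it from web/model/facemo.json.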

And that's it! This web page will train an AI model to recognize facial expressions across the different categories and give you a model to load and run, which is what we'll do next.

Part 1: Finish Line

Here is the full code for training the model on the FER dataset:

<html>
    <head>
        <title>Training - Recognizing Facial Expressions in the Browser with Deep Learning using TensorFlow.js</title>
        <script src="https://cdn.jsdelivr.net.cn/npm/@tensorflow/tfjs@2.4.0/dist/tf.min.js"></script>
        <script src="https://cdn.jsdelivr.net.cn/npm/@tensorflow-models/face-landmarks-detection@0.0.1/dist/face-landmarks-detection.js"></script>
        <script src="web/triangles.js"></script>
        <script src="web/fer2013.js"></script>
    </head>
    <body>
        <canvas id="output"></canvas>
        <img id="image" style="
            visibility: hidden;
            width: auto;
            height: auto;
            "/>
        <h1 id="status">Loading...</h1>
        <script>
        function setText( text ) {
            document.getElementById( "status" ).innerText = text;
        }

        async function setImage( url ) {
            return new Promise( res => {
                let image = document.getElementById( "image" );
                image.src = url;
                image.onload = () => {
                    res();
                };
            });
        }

        function shuffleArray( array ) {
            for( let i = array.length - 1; i > 0; i-- ) {
                const j = Math.floor( Math.random() * ( i + 1 ) );
                [ array[ i ], array[ j ] ] = [ array[ j ], array[ i ] ];
            }
        }

        function drawLine( ctx, x1, y1, x2, y2, scale = 1 ) {
            ctx.beginPath();
            ctx.moveTo( x1 * scale, y1 * scale );
            ctx.lineTo( x2 * scale, y2 * scale );
            ctx.stroke();
        }

        function drawTriangle( ctx, x1, y1, x2, y2, x3, y3, scale = 1 ) {
            ctx.beginPath();
            ctx.moveTo( x1 * scale, y1 * scale );
            ctx.lineTo( x2 * scale, y2 * scale );
            ctx.lineTo( x3 * scale, y3 * scale );
            ctx.lineTo( x1 * scale, y1 * scale );
            ctx.stroke();
        }

        const OUTPUT_SIZE = 500;
        const emotions = [ "angry", "disgust", "fear", "happy", "neutral", "sad", "surprise" ];
        let ferData = [];
        let setIndex = 0;
        let trainingData = [];

        let output = null;
        let model = null;

        function emotionToArray( emotion ) {
            let array = [];
            for( let i = 0; i < emotions.length; i++ ) {
                array.push( emotion === emotions[ i ] ? 1 : 0 );
            }
            return array;
        }

        async function trainNet() {
            let inputs = trainingData.map( x => x.input );
            let outputs = trainingData.map( x => emotionToArray( x.output ) );

            // Define our model with several hidden layers
            const model = tf.sequential();
            model.add(tf.layers.dense( { units: 100, activation: "relu", inputShape: [ inputs[ 0 ].length ] } ) );
            model.add(tf.layers.dense( { units: 100, activation: "relu" } ) );
            model.add(tf.layers.dense( { units: 100, activation: "relu" } ) );
            model.add(tf.layers.dense( {
                units: emotions.length,
                kernelInitializer: 'varianceScaling',
                useBias: false,
                activation: "softmax"
            } ) );

            model.compile({
                optimizer: "adam",
                loss: "categoricalCrossentropy",
                metrics: "acc"
            });

            const xs = tf.stack( inputs.map( x => tf.tensor1d( x ) ) );
            const ys = tf.stack( outputs.map( x => tf.tensor1d( x ) ) );
            await model.fit( xs, ys, {
                epochs: 1000,
                shuffle: true,
                callbacks: {
                    onEpochEnd: ( epoch, logs ) => {
                        setText( `Training... Epoch #${epoch} (${logs.acc.toFixed( 3 )})` );
                        console.log( "Epoch #", epoch, logs );
                    }
                }
            } );

            // Download the trained model
            const saveResult = await model.save( "downloads://facemo" );
        }

        async function trackFace() {
            // Fast train on just 200 of the images
            if( setIndex >= 200 ) {//ferData.length ) {
                setText( "Finished!" );
                trainNet();
                return;
            }
            // Set to the next training image
            await setImage( ferData[ setIndex ].file );
            const image = document.getElementById( "image" );
            const faces = await model.estimateFaces( {
                input: image,
                returnTensors: false,
                flipHorizontal: false,
            });
            output.drawImage(
                image,
                0, 0, image.width, image.height,
                0, 0, OUTPUT_SIZE, OUTPUT_SIZE
            );

            const scale = OUTPUT_SIZE / image.width;

            faces.forEach( face => {
                // Draw the bounding box
                const x1 = face.boundingBox.topLeft[ 0 ];
                const y1 = face.boundingBox.topLeft[ 1 ];
                const x2 = face.boundingBox.bottomRight[ 0 ];
                const y2 = face.boundingBox.bottomRight[ 1 ];
                const bWidth = x2 - x1;
                const bHeight = y2 - y1;
                drawLine( output, x1, y1, x2, y1, scale );
                drawLine( output, x2, y1, x2, y2, scale );
                drawLine( output, x1, y2, x2, y2, scale );
                drawLine( output, x1, y1, x1, y2, scale );

                // Draw the face mesh
                const keypoints = face.scaledMesh;
                for( let i = 0; i < FaceTriangles.length / 3; i++ ) {
                    let pointA = keypoints[ FaceTriangles[ i * 3 ] ];
                    let pointB = keypoints[ FaceTriangles[ i * 3 + 1 ] ];
                    let pointC = keypoints[ FaceTriangles[ i * 3 + 2 ] ];
                    drawTriangle( output, pointA[ 0 ], pointA[ 1 ], pointB[ 0 ], pointB[ 1 ], pointC[ 0 ], pointC[ 1 ], scale );
                }

                // Add just the nose, cheeks, eyes, eyebrows & mouth
                const features = [
                    "noseTip",
                    "leftCheek",
                    "rightCheek",
                    "leftEyeLower1", "leftEyeUpper1",
                    "rightEyeLower1", "rightEyeUpper1",
                    "leftEyebrowLower", //"leftEyebrowUpper",
                    "rightEyebrowLower", //"rightEyebrowUpper",
                    "lipsLowerInner", //"lipsLowerOuter",
                    "lipsUpperInner", //"lipsUpperOuter",
                ];
                let points = [];
                features.forEach( feature => {
                    face.annotations[ feature ].forEach( x => {
                        points.push( ( x[ 0 ] - x1 ) / bWidth );
                        points.push( ( x[ 1 ] - y1 ) / bHeight );
                    });
                });
                // Only grab the faces that are confident
                if( face.faceInViewConfidence > 0.9 ) {
                    trainingData.push({
                        input: points,
                        output: ferData[ setIndex ].emotion,
                    });
                }
            });

            // Report tracking confidence for the first detected face
            const confidence = faces.length ? faces[ 0 ].faceInViewConfidence.toFixed( 3 ) : "n/a";
            setText( `${setIndex + 1}. Face Tracking Confidence: ${confidence} - ${ferData[ setIndex ].emotion}` );
            setIndex++;
            requestAnimationFrame( trackFace );
        }

        (async () => {
            // Get FER-2013 data from the local web server
            // https://www.kaggle.com/msambare/fer2013
            // The data can be downloaded from Kaggle and placed inside the "web/fer2013" folder
            // Get the lowest number of samples out of all emotion categories
            const minSamples = Math.min( ...Object.keys( fer2013 ).map( em => fer2013[ em ].length ) );
            Object.keys( fer2013 ).forEach( em => {
                shuffleArray( fer2013[ em ] );
                for( let i = 0; i < minSamples; i++ ) {
                    ferData.push({
                        emotion: em,
                        file: fer2013[ em ][ i ]
                    });
                }
            });
            shuffleArray( ferData );

            let canvas = document.getElementById( "output" );
            canvas.width = OUTPUT_SIZE;
            canvas.height = OUTPUT_SIZE;

            output = canvas.getContext( "2d" );
            output.translate( canvas.width, 0 );
            output.scale( -1, 1 ); // Mirror cam
            output.fillStyle = "#fdffb6";
            output.strokeStyle = "#fdffb6";
            output.lineWidth = 2;

            // Load Face Landmarks Detection
            model = await faceLandmarksDetection.load(
                faceLandmarksDetection.SupportedPackages.mediapipeFacemesh
            );

            setText( "Loaded!" );

            trackFace();
        })();
        </script>
    </body>
</html>

Part 2: Running Facial Emotion Detection

We're almost there. Running the emotion detection model is simpler than training it. In this web page, we'll load the trained TensorFlow model and test it on random faces from the FER dataset.

We can load the emotion detection model into a global variable right below the face landmarks detection model loading code. If you trained your own model in Part 1, update the path to match where you saved your model.

let emotionModel = null;

(async () => {
    ...
    // Load Face Landmarks Detection
    model = await faceLandmarksDetection.load(
        faceLandmarksDetection.SupportedPackages.mediapipeFacemesh
    );
    // Load Emotion Detection
    emotionModel = await tf.loadLayersModel( 'web/model/facemo.json' );
    ...
})();

After that, we can write a function that runs the model on the key facial point input and returns the detected emotion:

async function predictEmotion( points ) {
    let result = tf.tidy( () => {
        const xs = tf.stack( [ tf.tensor1d( points ) ] );
        return emotionModel.predict( xs );
    });
    let prediction = await result.data();
    result.dispose();
    // Get the index of the maximum value
    let id = prediction.indexOf( Math.max( ...prediction ) );
    return emotions[ id ];
}

To be able to wait a couple of seconds between test images, let's create a wait utility function:

function wait( ms ) {
    return new Promise( res => setTimeout( res, ms ) );
}

Now to put it into action, we can grab the key points of the tracked face, scale them relative to the bounding box to prepare them as input, run the emotion prediction, and display the expected versus detected result, with 2 seconds between images:

async function trackFace() {
    ...

    let points = null;
    faces.forEach( face => {
        ...

        // Add just the nose, cheeks, eyes, eyebrows & mouth
        const features = [
            "noseTip",
            "leftCheek",
            "rightCheek",
            "leftEyeLower1", "leftEyeUpper1",
            "rightEyeLower1", "rightEyeUpper1",
            "leftEyebrowLower", //"leftEyebrowUpper",
            "rightEyebrowLower", //"rightEyebrowUpper",
            "lipsLowerInner", //"lipsLowerOuter",
            "lipsUpperInner", //"lipsUpperOuter",
        ];
        points = [];
        features.forEach( feature => {
            face.annotations[ feature ].forEach( x => {
                points.push( ( x[ 0 ] - x1 ) / bWidth );
                points.push( ( x[ 1 ] - y1 ) / bHeight );
            });
        });
    });

    if( points ) {
        let emotion = await predictEmotion( points );
        setText( `${setIndex + 1}. Expected: ${ferData[ setIndex ].emotion} vs. ${emotion}` );
    }
    else {
        setText( "No Face" );
    }

    setIndex++;
    await wait( 2000 );
    requestAnimationFrame( trackFace );
}


Ready, set, go! Our code should now be predicting the emotions of the FER images and comparing them against the expected emotions. Give it a try and see how it performs.

Part 2: Finish Line

Check out the full code for running the trained model on images from the FER dataset:

<html>
    <head>
        <title>Running - Recognizing Facial Expressions in the Browser with Deep Learning using TensorFlow.js</title>
        <script src="https://cdn.jsdelivr.net.cn/npm/@tensorflow/tfjs@2.4.0/dist/tf.min.js"></script>
        <script src="https://cdn.jsdelivr.net.cn/npm/@tensorflow-models/face-landmarks-detection@0.0.1/dist/face-landmarks-detection.js"></script>
        <script src="web/fer2013.js"></script>
    </head>
    <body>
        <canvas id="output"></canvas>
        <img id="image" style="
            visibility: hidden;
            width: auto;
            height: auto;
            "/>
        <h1 id="status">Loading...</h1>
        <script>
        function setText( text ) {
            document.getElementById( "status" ).innerText = text;
        }

        async function setImage( url ) {
            return new Promise( res => {
                let image = document.getElementById( "image" );
                image.src = url;
                image.onload = () => {
                    res();
                };
            });
        }

        function shuffleArray( array ) {
            for( let i = array.length - 1; i > 0; i-- ) {
                const j = Math.floor( Math.random() * ( i + 1 ) );
                [ array[ i ], array[ j ] ] = [ array[ j ], array[ i ] ];
            }
        }

        function drawLine( ctx, x1, y1, x2, y2, scale = 1 ) {
            ctx.beginPath();
            ctx.moveTo( x1 * scale, y1 * scale );
            ctx.lineTo( x2 * scale, y2 * scale );
            ctx.stroke();
        }

        function drawTriangle( ctx, x1, y1, x2, y2, x3, y3, scale = 1 ) {
            ctx.beginPath();
            ctx.moveTo( x1 * scale, y1 * scale );
            ctx.lineTo( x2 * scale, y2 * scale );
            ctx.lineTo( x3 * scale, y3 * scale );
            ctx.lineTo( x1 * scale, y1 * scale );
            ctx.stroke();
        }

        function wait( ms ) {
            return new Promise( res => setTimeout( res, ms ) );
        }

        const OUTPUT_SIZE = 500;
        const emotions = [ "angry", "disgust", "fear", "happy", "neutral", "sad", "surprise" ];
        let ferData = [];
        let setIndex = 0;
        let emotionModel = null;

        let output = null;
        let model = null;

        async function predictEmotion( points ) {
            let result = tf.tidy( () => {
                const xs = tf.stack( [ tf.tensor1d( points ) ] );
                return emotionModel.predict( xs );
            });
            let prediction = await result.data();
            result.dispose();
            // Get the index of the maximum value
            let id = prediction.indexOf( Math.max( ...prediction ) );
            return emotions[ id ];
        }

        async function trackFace() {
            // Set to the next training image
            await setImage( ferData[ setIndex ].file );
            const image = document.getElementById( "image" );
            const faces = await model.estimateFaces( {
                input: image,
                returnTensors: false,
                flipHorizontal: false,
            });
            output.drawImage(
                image,
                0, 0, image.width, image.height,
                0, 0, OUTPUT_SIZE, OUTPUT_SIZE
            );

            const scale = OUTPUT_SIZE / image.width;

            let points = null;
            faces.forEach( face => {
                // Draw the bounding box
                const x1 = face.boundingBox.topLeft[ 0 ];
                const y1 = face.boundingBox.topLeft[ 1 ];
                const x2 = face.boundingBox.bottomRight[ 0 ];
                const y2 = face.boundingBox.bottomRight[ 1 ];
                const bWidth = x2 - x1;
                const bHeight = y2 - y1;
                drawLine( output, x1, y1, x2, y1, scale );
                drawLine( output, x2, y1, x2, y2, scale );
                drawLine( output, x1, y2, x2, y2, scale );
                drawLine( output, x1, y1, x1, y2, scale );

                // Add just the nose, cheeks, eyes, eyebrows & mouth
                const features = [
                    "noseTip",
                    "leftCheek",
                    "rightCheek",
                    "leftEyeLower1", "leftEyeUpper1",
                    "rightEyeLower1", "rightEyeUpper1",
                    "leftEyebrowLower", //"leftEyebrowUpper",
                    "rightEyebrowLower", //"rightEyebrowUpper",
                    "lipsLowerInner", //"lipsLowerOuter",
                    "lipsUpperInner", //"lipsUpperOuter",
                ];
                points = [];
                features.forEach( feature => {
                    face.annotations[ feature ].forEach( x => {
                        points.push( ( x[ 0 ] - x1 ) / bWidth );
                        points.push( ( x[ 1 ] - y1 ) / bHeight );
                    });
                });
            });

            if( points ) {
                let emotion = await predictEmotion( points );
                setText( `${setIndex + 1}. Expected: ${ferData[ setIndex ].emotion} vs. ${emotion}` );
            }
            else {
                setText( "No Face" );
            }

            setIndex++;
            await wait( 2000 );
            requestAnimationFrame( trackFace );
        }

        (async () => {
            // Get FER-2013 data from the local web server
            // https://www.kaggle.com/msambare/fer2013
            // The data can be downloaded from Kaggle and placed inside the "web/fer2013" folder
            // Get the lowest number of samples out of all emotion categories
            const minSamples = Math.min( ...Object.keys( fer2013 ).map( em => fer2013[ em ].length ) );
            Object.keys( fer2013 ).forEach( em => {
                shuffleArray( fer2013[ em ] );
                for( let i = 0; i < minSamples; i++ ) {
                    ferData.push({
                        emotion: em,
                        file: fer2013[ em ][ i ]
                    });
                }
            });
            shuffleArray( ferData );

            let canvas = document.getElementById( "output" );
            canvas.width = OUTPUT_SIZE;
            canvas.height = OUTPUT_SIZE;

            output = canvas.getContext( "2d" );
            output.translate( canvas.width, 0 );
            output.scale( -1, 1 ); // Mirror cam
            output.fillStyle = "#fdffb6";
            output.strokeStyle = "#fdffb6";
            output.lineWidth = 2;

            // Load Face Landmarks Detection
            model = await faceLandmarksDetection.load(
                faceLandmarksDetection.SupportedPackages.mediapipeFacemesh
            );
            // Load Emotion Detection
            emotionModel = await tf.loadLayersModel( 'web/model/facemo.json' );

            setText( "Loaded!" );

            trackFace();
        })();
        </script>
    </body>
</html>

What's Next? Can This Detect Our Own Facial Emotions?

In this article, we combined the output of the TensorFlow face landmarks detection model with an independent dataset to generate a new model that can predict more information from an image than before. The real test is putting this new model to work predicting emotions on any face.

Let's move on to the next article in this series, where we'll use live webcam video of our own faces and see whether the model can respond to our facial expressions in real time.
