图像识别、翻译和语音合成 - 3合1 Web API

Jalle

5.00/5 (6投票s)

2016年12月5日

CPOL

2分钟阅读

20821

WebAPI、Azure、Android

引言

在本教程中，我将演示如何使用强大的 Google API 来创建一些有用的应用程序。本教程分为两部分
第一部分：构建处理图像标注和翻译成不同语言的 WebAPI 服务。
第二部分：从 Android 应用程序消费此 RESTful 服务。

使用代码

我们将从创建一个新的 WebAPI 项目开始。启动 Visual Studio，选择“新建项目”-> C# -> Web -> ASP.NET Web 应用程序 - 模板。勾选 WebAPI，并选择“在云中托管”以便以后可以发布此项目。

packages.config 将包含此项目所需的所有库。

    <?xml version="1.0" encoding="utf-8"?>
<packages>
  <package id="BouncyCastle" version="1.7.0" targetFramework="net45" />
  <package id="Google.Apis" version="1.19.0" targetFramework="net45" />
  <package id="Google.Apis.Auth" version="1.19.0" targetFramework="net45" />
  <package id="Google.Apis.Core" version="1.19.0" targetFramework="net45" />
  <package id="Google.Apis.Translate.v2" version="1.19.0.543" targetFramework="net45" />
  <package id="Google.Apis.Vision.v1" version="1.19.0.683" targetFramework="net45" />
  <package id="GoogleApi" version="2.0.13" targetFramework="net45" />
  <package id="log4net" version="2.0.3" targetFramework="net45" />
  <package id="Microsoft.AspNet.WebApi" version="5.2.3" targetFramework="net45" />
  <package id="Microsoft.AspNet.WebApi.Client" version="5.2.3" targetFramework="net45" />
  <package id="Microsoft.AspNet.WebApi.Core" version="5.2.3" targetFramework="net45" />
  <package id="Microsoft.AspNet.WebApi.WebHost" version="5.2.3" targetFramework="net45" />
  <package id="Microsoft.CodeDom.Providers.DotNetCompilerPlatform" version="1.0.0" targetFramework="net45" />
  <package id="Microsoft.Net.Compilers" version="1.0.0" targetFramework="net45" developmentDependency="true" />
  <package id="Newtonsoft.Json" version="7.0.1" targetFramework="net45" />
  <package id="Zlib.Portable.Signed" version="1.11.0" targetFramework="net45" />
</packages>

设置 API 密钥

由于我们将使用 Google API，因此我们需要先设置 Google Cloud Vision API 项目。

1. 对于 Google Vision API，下载 VisionAPI-xxxxxx.json 文件并将其保存在项目的根目录中
2. 对于翻译 API，从同一页面获取 API 密钥

回到代码中，我们将首先调用这些 API 变量。将值替换为上面获取的密钥。

 using System;
using System.Configuration;
using System.Diagnostics;
using System.IO;
using System.Web.Http;

namespace ThingTranslatorAPI2 {
  public class Global : System.Web.HttpApplication {

    public static String apiKey;
    protected void Application_Start() {
      GlobalConfiguration.Configure(WebApiConfig.Register);

      apiKey = "API-KEY";

      createEnvVar();
    }

    private static void createEnvVar() {
      var GAC = Environment.GetEnvironmentVariable("GOOGLE_APPLICATION_CREDENTIALS");

        if (GAC == null) {
        var VisionApiKey = ConfigurationManager.AppSettings["VisionApiKey"]; 
        if (VisionApiKey != null) {
          var path = System.Web.Hosting.HostingEnvironment.MapPath("~/") + "YOUR-API-KEY.json";

          Trace.TraceError("path: " + path);

          File.WriteAllText(path,VisionApiKey );
          Environment.SetEnvironmentVariable("GOOGLE_APPLICATION_CREDENTIALS", path);
        }
      }
    }
  }
}

位于 Ap_Start 文件夹中的 WebApiConfig 将包含这些内容。我们告诉服务器使用属性路由处理路由，而不是默认的路由器配置。

using System.Web.Http;
namespace ThingTranslatorAPI2
{
    public static class WebApiConfig
    {
    public static void Register(HttpConfiguration config)
        {
            // Web API routes
            config.MapHttpAttributeRoutes();
        }
    }
}

API 控制器

我们需要一个 API 控制器来处理请求并处理它们。请求应包含图像文件和语言代码，用于我们希望进行翻译的语言。图像将在内存中处理，因此无需将其保存到磁盘。

TranslatorController.cs

using System;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using System.Web.Http;
using System.Web.Http.Results;
using GoogleApi;
using GoogleApi.Entities.Translate.Translate.Request;
using TranslationsResource = Google.Apis.Translate.v2.Data.TranslationsResource;

namespace ThingTranslatorAPI2.Controllers {
   
  [RoutePrefix("api")]
  public class TranslatorController : ApiController
  {

    [Route("upload")]
    [HttpPost]
    public async Task<jsonresult<response>> Upload() {
      if (!Request.Content.IsMimeMultipartContent())
        throw new HttpResponseException(HttpStatusCode.UnsupportedMediaType);

      String langCode = string.Empty;
      var response = new Response();
      byte[] buffer = null;

      var provider = new MultipartMemoryStreamProvider();
      await Request.Content.ReadAsMultipartAsync(provider);

      foreach (var content in provider.Contents)
      {
        if (content.Headers.ContentType !=null && content.Headers.ContentType.MediaType.Contains("image"))
           buffer = await content.ReadAsByteArrayAsync();
        else
          langCode = await content.ReadAsStringAsync();
      }

      var labels = LabelDetectior.GetLabels(buffer);
      
      try {
        //Take the first label  that has the best match
        var bestMatch = labels[0].LabelAnnotations.FirstOrDefault()?.Description;
        String translateText;
        if (langCode == "en")
          translateText = bestMatch;
        else
          translateText = TranslateText(bestMatch, "en", langCode);

        //original is our text in English
        response.Original = bestMatch;
        response.Translation = translateText;

      } catch (Exception ex) {
        response.Error = ex.Message;
        return Json(response);
      }

      return Json(response);
    }

   //Translate text from source to target language
    private String TranslateText(String text, String source, String target) {

      var _request = new TranslateRequest {
        Source = source,
        Target = target,
        Qs = new[] { text },
        Key = Global.apiKey
      };

      try {
        var _result = GoogleTranslate.Translate.Query(_request);
        return _result.Data.Translations.First().TranslatedText;
      } catch (Exception ex) {
        return ex.Message;
      }
    }
 }
}

对于图像标注，我们需要这个类

LabelDetect or.cs

    using Google.Apis.Auth.OAuth2;
using Google.Apis.Services;
using Google.Apis.Vision.v1;
using Google.Apis.Vision.v1.Data;
using System;
using System.Collections.Generic;
using System.Diagnostics;

namespace ThingTranslatorAPI2 {
  public class LabelDetectior {
    
   // Get labels from image in memory
    public static IList<AnnotateImageResponse> GetLabels(byte[] imageArray) {
      try
      {
        VisionService vision = CreateAuthorizedClient();
        // Convert image to Base64 encoded for JSON ASCII text based request   
        string imageContent = Convert.ToBase64String(imageArray);
        // Post label detection request to the Vision API
        var responses = vision.Images.Annotate(
            new BatchAnnotateImagesRequest() {
              Requests = new[] {
                    new AnnotateImageRequest() {
                        Features = new [] { new Feature() { Type = "LABEL_DETECTION"}},
                        Image = new Image() { Content = imageContent }
                    }
           }
            }).Execute();
        return responses.Responses;
      }
      catch (Exception ex)
      {
        Trace.TraceError(ex.StackTrace);
      }
      return null;
    }

    // returns an authorized Cloud Vision client. 
    public static VisionService CreateAuthorizedClient() {
      try {
        GoogleCredential credential = GoogleCredential.GetApplicationDefaultAsync().Result;
        // Inject the Cloud Vision scopes
        if (credential.IsCreateScopedRequired) {
          credential = credential.CreateScoped(new[]
          {
                    VisionService.Scope.CloudPlatform
                });
        }
        return new VisionService(new BaseClientService.Initializer {
          HttpClientInitializer = credential,
          GZipEnabled = false
        });
      } catch (Exception ex) {
        Trace.TraceError("CreateAuthorizedClient: " + ex.StackTrace);
      }
      return null;
    }
  }
}<annotateimageresponse>

Response.cs 看起来像这样

namespace ThingTranslatorAPI2.Controllers
{
  public class Response
  {
    public string Original { get; set; }
    public string Translation { get; set; }
    public string Error { get; set; }
  }
}

如果您在编译代码时遇到任何问题，请查看此处附加的源代码。

现在让我们将其发布到 Azure Cloud。转到“生成” - “发布”，并填写所有 4 个输入框以匹配您的 Azure 设置。

现在我们可以使用 Postman 来测试它。

响应
{"Original":"mouse","Translation":"miš","Error":null}

我们收到了包含英文图像标签和翻译成我们使用 langCode 参数指定的语言版本的响应。

关注点

如今有几个 API 可以处理图像标注。其中一个我发现非常奇特的是 CloudSight。虽然它比其他 API 更准确，但它依赖于人工标注。缺点是完成工作比机器花费更多的时间。通常在 10-30 秒后收到响应。
我可以想象如果我们运行我们的应用程序并且发生超时。我们应该将其称为连接超时还是咖啡休息超时？ ;-)

本教程到此结束。在下一篇文章中，我们将构建一个Thing Translator 应用程序，该应用程序使用此 API。

祝您编码愉快！