Bing Visual Search and Entity Search APIs for video apps

In this blog post, I will go over how you can use the Bing Visual Search API in combination with the Bing Entity Search API to build an enhanced viewing experience in your video app.

General availability (GA) of the Bing Visual Search API was announced at Build 2018 in this blog. The Bing Visual Search API lets you use an image as a query to find out which entities are in the image, along with a list of visually similar images from the image index built by Bing. GA of the Bing Entity Search API was announced in this blog, published on March 1st, 2018. The Bing Entity Search API enables you to bring rich contextual information about people, places, things, and local businesses to any application, blog, or website for a more engaging user experience.

By combining these two APIs, you can build a more engaging experience in your video app by following the steps listed below:

  • Write a JavaScript function that triggers when the user clicks the pause button in your video app. In this JavaScript function, grab the paused video frame as an image. Take a look at this discussion to learn more about how to do this.
  • Pass the captured video frame via AJAX to a server-side function.
  • In your server-side function, you can now call the Bing Visual Search API with that image as input. The Bing Visual Search documentation page provides code samples in multiple languages for this.
  • The Bing Visual Search API returns a JSON string with insights for the image. The response format is described on this documentation page.
  • The entities can be found in the “actions” entries with “_type” = “ImageEntityAction”. As an example, for this image, which shows Tom Hanks and his wife, you will see two “actions” entries with “_type” = “ImageEntityAction”.
  • You can now use the “displayName” of each of these entities to call the Entity Search API, then pass the results back to the video app to render them in the UI. Code samples for the Bing Entity Search API can be found on this documentation page.
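
The last two steps can be sketched in JavaScript roughly as follows. This is a minimal sketch, not the official sample: it assumes the parsed Visual Search response nests its “actions” inside a top-level “tags” array, and that the Entity Search endpoint is the v7.0 URL shown. The function names `extractEntityNames` and `buildEntitySearchUrl` are hypothetical, and the sample response is hand-written rather than real API output.

```javascript
// Assumed Entity Search endpoint; check the documentation for your subscription.
const ENTITY_SEARCH_ENDPOINT =
  "https://api.cognitive.microsoft.com/bing/v7.0/entities";

// Collect the unique displayName of every ImageEntityAction found in a
// parsed Visual Search response. The tags -> actions nesting is an assumption.
function extractEntityNames(visualSearchResponse) {
  const names = new Set();
  for (const tag of visualSearchResponse.tags || []) {
    for (const action of tag.actions || []) {
      if (action._type === "ImageEntityAction" && action.displayName) {
        names.add(action.displayName);
      }
    }
  }
  return [...names];
}

// Build the Entity Search request URL for one entity name.
function buildEntitySearchUrl(name, market = "en-US") {
  const params = new URLSearchParams({ q: name, mkt: market });
  return `${ENTITY_SEARCH_ENDPOINT}?${params}`;
}

// Hand-written sample shaped like the response described above (not real output).
const sampleResponse = {
  tags: [
    {
      displayName: "",
      actions: [
        { _type: "ImageModuleAction", actionType: "VisualSearch" },
        { _type: "ImageEntityAction", displayName: "Tom Hanks" },
      ],
    },
    {
      displayName: "",
      actions: [
        { _type: "ImageEntityAction", displayName: "Yosemite National Park" },
      ],
    },
  ],
};

const entityNames = extractEntityNames(sampleResponse);
const entityUrls = entityNames.map((name) => buildEntitySearchUrl(name));
console.log(entityNames);
console.log(entityUrls);
```

Each URL would then be requested from your server-side function with your subscription key in the request headers, so that the key never reaches the browser.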

Entities in Bing are not limited to people. As an example, the Visual Search API identifies “Yosemite National Park” as an entity in this image. Depending on the types of videos you intend to showcase in your video app, you can decide whether to focus on certain entity types or on all of them.
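
One way to focus on certain entity types is to filter the Entity Search results before sending them back to the video app. The sketch below assumes each result carries its type hints under `entityPresentationInfo.entityTypeHints` inside `entities.value`; verify these field names against the Entity Search response documentation. The `filterEntitiesByType` helper and the sample data, including the type hint values, are hypothetical.

```javascript
// Keep only the entities whose type hints overlap with allowedTypes.
// An empty allowedTypes list means "keep every entity".
function filterEntitiesByType(entitySearchResponse, allowedTypes = []) {
  const container = entitySearchResponse.entities || {};
  const entities = container.value || [];
  if (allowedTypes.length === 0) {
    return entities;
  }
  const allowed = new Set(allowedTypes);
  return entities.filter((entity) => {
    const info = entity.entityPresentationInfo || {};
    const hints = info.entityTypeHints || [];
    return hints.some((hint) => allowed.has(hint));
  });
}

// Hand-written sample data (not real API output).
const sampleEntityResponse = {
  entities: {
    value: [
      {
        name: "Tom Hanks",
        entityPresentationInfo: { entityTypeHints: ["Person"] },
      },
      {
        name: "Yosemite National Park",
        entityPresentationInfo: { entityTypeHints: ["Attraction"] },
      },
    ],
  },
};

const peopleOnly = filterEntitiesByType(sampleEntityResponse, ["Person"]);
const everything = filterEntitiesByType(sampleEntityResponse);
console.log(peopleOnly.map((entity) => entity.name));
console.log(everything.map((entity) => entity.name));
```

A movie-focused app might keep only people, while a travel-focused app might keep places and attractions instead.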

Source: Azure Blog Feed
