04 The Consumerisation of Artificial Intelligence



We live in very exciting times. Artificial Intelligence has developed relatively quickly over the last 70 years. Its roots go back to the invention of the first mechanical computer in the early 19th century, but it was the invention of the first programmable digital computer in the 1940s, which enabled more complex mathematical operations, that set things in motion. The improved efficiencies and computational possibilities inspired a handful of scientists to begin talking about the construction of an electronic brain.

Fast forward from the 1940s to now: only 70 years on, Artificial Intelligence is evolving into something that people can use without requiring a degree in mathematics, access to enormous amounts of processing hardware or access to enormous amounts of data.

Where we are now

There is still a long way to go, but progress has been accelerating. Instead of being able to solve just mathematical problems, we can now detect emotion, perform facial recognition, identify landmarks or celebrities in images, understand spoken language and provide textual descriptions of images.

More importantly, these advanced services are now available to a broad general audience. With cloud technology providing these complex services, we are now at the tipping point where vast numbers of people are empowered to create solutions or applications that leverage this power.

The vehicle for this is predominantly (but not limited to) the exposure of Application Programming Interfaces (APIs) for consumption by integrators or solution providers. The simplicity of these APIs lowers the barrier to operations that were previously complex and hard to leverage.

Offerings

There are quite a few vendors providing offerings in this space: Google, Microsoft, AWS and Salesforce, to name a few. All provide different levels and types of usage of these advanced services. Microsoft is well positioned here with its Azure cloud and cognitive services technology stack. The remainder of this article will concentrate on the Microsoft offerings.

It is not all easy

While advanced artificial intelligence services are available for use today, lower-level technology is widely used to build these consumer-friendly services. To show the spectrum on offer, the following image shows the broad Microsoft suite, from least complex on the left to most complex on the right.

 

Cognitive Services

A set of APIs built using pre-trained models and example data, with easy-to-consume interfaces. The APIs allow developers to build more intelligence into their applications with little to no knowledge of how to build specific machine learning or artificial intelligence solutions.

Machine learning and bots

This features a toolset called the “Cortana Intelligence Suite”, which provides a relatively easy interface for building and training machine learning models. In addition, a “bot framework” is available that developers can leverage to build applications with intelligent bots. These tools are a little more advanced than the cognitive services in that a higher degree of investment and development effort is required.

Cognitive toolkit

This provides a framework to access the lower levels of machine learning through neural networks. It is predominantly script driven and very complex. This toolset provides the basis from which cognitive services, the Cortana Intelligence Suite and the bot framework are built. Large amounts of sample data and a good understanding of the mathematics behind machine learning are required. As such, the barrier to entry is typically quite high.

An example

Now that you have an idea of the scope of what is provided, we will show an actual example of how easy it is to consume the powerful services that cognitive services has to offer.

Setup

To utilise Microsoft cognitive services, you must first provision a service you wish to use in Azure. To do that you simply go into the Azure portal, and add a specific cognitive service to your selected resource group. The following image shows a list of some of the possible cognitive services you may add.

 



Once this is done, select the service and click on the ‘Keys’ option to bring up the ‘Keys’ blade. This lists the access keys that allow you to use the service and by which your usage is measured. In the image below, we have set up an instance of the ‘Computer Vision’ cognitive service, which will allow us to analyse an image (amongst other things).

 

 

You are now all set up to use the cognitive service.

Let’s have a play

For this example, we will analyse the following image which is located at https://www.planwallpaper.com/static/images/Child-Girl-with-Sunflowers-Images.jpg

 

 

In order to analyse this image, we send a request to the cognitive service vision API at the following URL: https://southeastasia.api.cognitive.microsoft.com/vision/v1.0/analyze?visualFeatures=Categories,tags,description,adult&language=en

In that URL, you can see that there are some options specified, specifically “visualFeatures=Categories,tags,description,adult”. This tells the API that we want the response to include the following for this image: categories, tags associated with the image, a plain text description, and an indication of whether there is any adult content.
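As a rough illustration of how that URL is put together, the following Python sketch composes it from its parts. The names BASE_URL and build_analyze_url are made up for this example, and the region and version in the base URL are simply the ones used in this article; yours will depend on where the service was provisioned.

```python
from urllib.parse import urlencode

# Taken from the URL used in this article; the region ("southeastasia")
# depends on where your own service was provisioned.
BASE_URL = "https://southeastasia.api.cognitive.microsoft.com/vision/v1.0/analyze"


def build_analyze_url(features, language="en"):
    """Return the analyze URL asking for the given visual features.

    Hypothetical helper for illustration only.
    """
    query = urlencode(
        {"visualFeatures": ",".join(features), "language": language},
        safe=",",  # leave commas as-is rather than encoding them as %2C
    )
    return f"{BASE_URL}?{query}"


url = build_analyze_url(["Categories", "tags", "description", "adult"])
print(url)
```

Dropping a feature from the list (for example "adult") should simply leave that section out of what is requested.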

As part of that request, we need to specify the access key that was provisioned when we first set up this API. This is provided as part of the HTTP header collection, in a header named “Ocp-Apim-Subscription-Key” whose value is one of the keys listed in the Azure portal for that service.

In addition, we need to tell the service where the image to analyse is located. We do this by including the following content as part of the body:

{ "url": "https://www.planwallpaper.com/static/images/Child-Girl-with-Sunflowers-Images.jpg" }

This request is then posted to the API endpoint via a standard HTTP POST.
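The pieces described above (URL, key header and JSON body) could be assembled along the following lines. This is a minimal sketch using only the Python standard library, not the only way to call the service. The function names build_request and analyze are hypothetical, and the “Content-Type: application/json” header is an assumption on our part since it is not mentioned above.

```python
import json
import urllib.request

ANALYZE_URL = (
    "https://southeastasia.api.cognitive.microsoft.com/vision/v1.0/analyze"
    "?visualFeatures=Categories,tags,description,adult&language=en"
)
IMAGE_URL = (
    "https://www.planwallpaper.com/static/images/"
    "Child-Girl-with-Sunflowers-Images.jpg"
)


def build_request(subscription_key, image_url=IMAGE_URL):
    """Build (but do not send) the POST request described in the article."""
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        ANALYZE_URL,
        data=body,
        method="POST",
        headers={
            # One of the keys from the 'Keys' blade in the Azure portal.
            "Ocp-Apim-Subscription-Key": subscription_key,
            # Assumption: the body is declared as JSON.
            "Content-Type": "application/json",
        },
    )


def analyze(subscription_key, image_url=IMAGE_URL):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_request(subscription_key, image_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (needs a real key and network access, so it is left commented):
# result = analyze("<one-of-your-access-keys>")
```

Keeping the request construction separate from the sending step makes it easy to inspect exactly what would be posted before spending any of your usage quota.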

Once this is done, the following response is received (with some sections removed for brevity): 

{
    "categories": [
        {
            "name": "people_young",
            "score": 0.765625
        }
    ],
    "adult": {
        "isAdultContent": false,
        "isRacyContent": false,
        "adultScore": 0.0080434931442141533,
        "racyScore": 0.012916238978505135
    },
    "tags": [
        {
            "name": "outdoor",
            "confidence": 0.99058502912521362
        },
        {
            "name": "tree",
            "confidence": 0.98921263217926025
        },
        {
            "name": "flower",
            "confidence": 0.91467201709747314
        },
        {
            "name": "person",
            "confidence": 0.90480440855026245
        },
        {
            "name": "plant",
            "confidence": 0.89624994993209839
        },
        {
            "name": "little",
            "confidence": 0.87906831502914429
        },
        {
            "name": "yellow",
            "confidence": 0.87603884935379028
        }
    ],
    "description": {
        "tags": [
            "outdoor",
            "flower",
            "person",
            "plant",
            "little",
….more responses here
            "grass",
            "yellow",
            "holding",
            "garden"
        ],
        "captions": [
            {
                "text": "a little girl wearing a yellow flower",
                "confidence": 0.89666165480076254
            }
        ]
    }
}


Looking at this response, there is a wealth of information that was taken from the image. We can see that:

  • The category was “people_young”, with a confidence that this is the correct answer of 0.765625, or approximately 77%.

  • There is no adult or racy content within the image.

  • The tags show that the image contains items relating to outdoor, trees, flowers, persons, plants, little and yellow, all with a relatively high degree of confidence.

  • The description tags follow closely with the requested overall tags but also contain a plain text caption of “a little girl wearing a yellow flower” with a confidence score of 0.89666165480076254 or approximately 90%.

From a simple API call, we have been able to determine a vast amount of information with varying confidence levels. This allows the developer or consumer to use only the information they feel is accurate enough, based on the confidence level. In addition, we have been able to convert a static image into plain text with no manual intervention required!
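The idea of keeping only the information you trust can be sketched as follows. The dictionary is a trimmed copy of the response shown earlier; the helper names confident_tags and best_caption, and the 0.9 threshold, are illustrative choices rather than anything prescribed by the service.

```python
# Trimmed copy of the response shown earlier in this article.
response = {
    "categories": [{"name": "people_young", "score": 0.765625}],
    "tags": [
        {"name": "outdoor", "confidence": 0.99058502912521362},
        {"name": "tree", "confidence": 0.98921263217926025},
        {"name": "flower", "confidence": 0.91467201709747314},
        {"name": "person", "confidence": 0.90480440855026245},
        {"name": "plant", "confidence": 0.89624994993209839},
        {"name": "little", "confidence": 0.87906831502914429},
        {"name": "yellow", "confidence": 0.87603884935379028},
    ],
    "description": {
        "captions": [
            {
                "text": "a little girl wearing a yellow flower",
                "confidence": 0.89666165480076254,
            }
        ]
    },
}


def confident_tags(analysis, threshold):
    """Return the names of tags whose confidence is at least `threshold`."""
    return [
        tag["name"]
        for tag in analysis.get("tags", [])
        if tag["confidence"] >= threshold
    ]


def best_caption(analysis):
    """Return the text of the highest-confidence caption, or None if absent."""
    captions = analysis.get("description", {}).get("captions", [])
    if not captions:
        return None
    return max(captions, key=lambda c: c["confidence"])["text"]


# 0.9 is an arbitrary cut-off chosen for this example.
print(confident_tags(response, 0.9))  # ['outdoor', 'tree', 'flower', 'person']
print(best_caption(response))         # a little girl wearing a yellow flower
```

Where to set the threshold is a judgement call for each application: a higher value gives fewer but more reliable tags.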

This is just one example from the set of APIs in the Computer Vision family. There are many more families of APIs, such as text analysis, face recognition, emotion recognition and a range of others.

Conclusion

The development of artificial intelligence and the application of machine learning have come a long way in a relatively short amount of time, not just in the development of the algorithms and associated models, but also in the ease of consumption of these technologies. Using an industry-standard HTTP request, we can leverage the huge amount of work done by specialists in the artificial intelligence space, harnessing the large scale of the cloud.

With such a vast amount of resources available in such an easy way, it is no longer about how to harness this technology, but about where we can apply it to make our applications and systems more intelligent. End users see immediate benefit as applications seamlessly become more “human like” or intelligent and, as a result, those users become more engaged.

The examples and information presented here are just the tip of the iceberg though. For more information, head over to https://azure.microsoft.com/en-us/services/cognitive-services/directory/ to see the full list of cognitive services available. In addition, there is the ability to try interactive versions of a few of the cognitive services to get a feel for what they can do.

--

Author: Paul Glavich, 12th December 2017
Originally posted at: Glavs Blog

Image - Gregor Cresnar