Public preview: Named Entity Recognition in the Cognitive Services Text Analytics API

Today, we are happy to announce the public preview of Named Entity Recognition as part of the Text Analytics Cognitive Service. Named Entity Recognition (NER) is the ability to take free-form text and identify the occurrences of entities such as people, locations, organizations, and more. With a single API call, NER in Text Analytics uses robust machine learning models to find and categorize more than twenty types of named entities in any text document.

Many organizations have messy piles of unstructured text in the form of customer feedback, enterprise documents, social media feeds, and more. However, it is challenging to understand what information these ever-growing stacks of documents contain. Text Analytics has long been helping customers make sense of these troves of text with capabilities such as Key Phrase Extraction, Sentiment Analysis, and Language Detection. Today's announcement adds Named Entity Recognition to this suite of easy-to-use natural language processing capabilities.

Named Entity Recognition and Entity Linking

Building upon the Entity Linking feature that was announced at Build earlier this year, the new Entities API processes text using both NER and Entity Linking capabilities. This makes it a powerful way to extract as much structured information as possible from unstructured text.
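As a rough illustration of what a call to the Entities API might look like, the sketch below builds (but does not send) a request. The region, the `v2.1-preview` path, the `Ocp-Apim-Subscription-Key` header, and the `documents` payload shape are assumptions not stated in this post, and `build_entities_request` is a hypothetical helper; check the service documentation for the actual values.

```python
import json
import urllib.request

# Assumed endpoint: the region and version path are placeholders for illustration.
ENDPOINT = "https://westus.api.cognitive.microsoft.com/text/analytics/v2.1-preview/entities"


def build_entities_request(texts, subscription_key, language="en", endpoint=ENDPOINT):
    """Build a POST request carrying one document per input text (hypothetical helper)."""
    # Assumed payload shape: a list of documents, each with an id, language, and text.
    documents = [
        {"id": str(i), "language": language, "text": text}
        for i, text in enumerate(texts, start=1)
    ]
    body = json.dumps({"documents": documents}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Assumed header name for the subscription key.
        "Ocp-Apim-Subscription-Key": subscription_key,
    }
    return urllib.request.Request(endpoint, data=body, headers=headers, method="POST")


request = build_entities_request(["Mars is the fourth planet from the Sun."], "YOUR_KEY")
# To send it, pass `request` to urllib.request.urlopen() and parse the JSON reply.
```

Keeping request construction separate from sending makes it easy to inspect the payload before using a real subscription key.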

Entity Linking is the ability to identify and disambiguate the well-known identity of an entity found in the text, for example, determining whether the word "Mars" is being used as the planet or as the Roman god of war. This process requires a knowledge base to which recognized entities are linked. Text Analytics uses knowledge bases from Bing and Wikipedia. When the Text Analytics Entities API recognizes an entity using Entity Linking, it will provide links to more information about the entity on the web.

Named Entity Recognition, in contrast, can identify the entities in unstructured text regardless of whether the entities are well-known or exist in a knowledge base. When Text Analytics identifies an entity using NER, it will provide the type of entity (such as person, location, or organization) in the API response. In some cases, it will also provide a subtype.

In cases where an entity is recognized using both Entity Linking and Named Entity Recognition, the API will return the entity's type as well as web links to more information about the entity.
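The three cases described above (NER only, Entity Linking only, or both) can be told apart by looking at which fields an entity carries. The sketch below assumes the response exposes a `type` field for NER results and a `wikipediaUrl` field for linked entities; those field names, the response shape, and the hand-written sample data are illustrative assumptions, not output from the service.

```python
# Hand-written sample shaped like an assumed Entities API response; not real output.
sample_response = {
    "documents": [
        {
            "id": "1",
            "entities": [
                # Recognized by both capabilities: has a type and a web link.
                {
                    "name": "Microsoft",
                    "type": "Organization",
                    "wikipediaUrl": "https://en.wikipedia.org/wiki/Microsoft",
                },
                # Recognized by NER only: has a type but no knowledge-base link.
                {"name": "Ashish Makadia", "type": "Person"},
                # Recognized by Entity Linking only: has a link but no type.
                {"name": "Mars", "wikipediaUrl": "https://en.wikipedia.org/wiki/Mars"},
            ],
        }
    ]
}


def describe_entity(entity):
    """Report which capability appears to have produced an entity (hypothetical helper)."""
    has_type = "type" in entity
    has_link = "wikipediaUrl" in entity
    if has_type and has_link:
        return "NER and Entity Linking"
    if has_type:
        return "NER only"
    if has_link:
        return "Entity Linking only"
    return "unknown"


for document in sample_response["documents"]:
    for entity in document["entities"]:
        print(entity["name"], "->", describe_entity(entity))
```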

Supported entity types

Using the Text Analytics Cognitive Service, it's currently possible to recognize more than twenty types of entities in both Spanish and English; see the documentation for the most current list of supported languages. The supported entity types and subtypes are:

Type          SubType      Example
Person        N/A*         "Jeff", "Ashish Makadia"
Location      N/A*         "Redmond, Washington", "Paris"
Organization  N/A*         "Microsoft"
Quantity      Number       "6", "six"
Quantity      Percentage   "50%", "fifty percent"
Quantity      Ordinal      "2nd", "second"
Quantity      NumberRange  "4 to 8"
Quantity      Age          "90 days old", "30 years old"
Quantity      Currency     "$10.99"
Quantity      Dimension    "10 miles", "40 cm"
Quantity      Temperature  "32 degrees"
DateTime      N/A*         "6:30PM February 4, 2012"
DateTime      Date         "May 2nd, 2017", "05/02/2017"
DateTime      Time         "8am", "8:00"
DateTime      DateRange    "May 2nd to May 5th"
DateTime      TimeRange    "6pm to 7pm"
DateTime      Duration     "1 minute and 45 seconds"
DateTime      Set          "every Tuesday"
DateTime      TimeZone     "UTC-7", "CST"
URL           N/A*         "http://www.bing.com"
Email         N/A*         "support@microsoft.com"

* Depending on the input and the extracted entities, certain entities may omit the SubType.
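Because the SubType may be omitted, code that consumes the results should treat it as optional. The sketch below tallies entities by type and, where present, subtype. The `type` and `subType` field names and the sample entities are assumptions made for illustration, with values borrowed from the table above.

```python
from collections import Counter

# Illustrative entities using values from the table; field names are assumed.
entities = [
    {"name": "$10.99", "type": "Quantity", "subType": "Currency"},
    {"name": "Paris", "type": "Location"},
    {"name": "50%", "type": "Quantity", "subType": "Percentage"},
    {"name": "Redmond, Washington", "type": "Location"},
]


def entity_label(entity):
    """Return 'Type/SubType' when a subtype is present, otherwise just 'Type'."""
    # .get() returns None instead of raising KeyError when the subtype is omitted.
    subtype = entity.get("subType")
    return f"{entity['type']}/{subtype}" if subtype else entity["type"]


counts = Counter(entity_label(entity) for entity in entities)
for label, count in sorted(counts.items()):
    print(label, count)
```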

Next steps

To learn more about Text Analytics and its capabilities, visit our documentation. Our pricing page describes the various tiers of service available to fit your needs.

Source: Azure Blog Feed
