A Medley of Potpourri

Wednesday, July 15, 2020

Windows Speech Recognition

From Wikipedia, the free encyclopedia

https://en.wikipedia.org/wiki/Windows_Speech_Recognition

The tutorial for Windows Speech Recognition in Windows Vista depicting the selection of text in WordPad for deletion.
Developer(s)	Microsoft
Initial release	January 30, 2007; 13 years ago
Operating system	Windows Vista and later
Type	Speech recognition

Windows Speech Recognition (WSR) is speech recognition developed by Microsoft for Windows Vista that enables voice commands to control the desktop user interface; dictate text in electronic documents and email; navigate websites; perform keyboard shortcuts; and operate the mouse cursor. It supports custom macros to perform additional or supplementary tasks.

WSR is a locally processed speech recognition platform; it does not rely on cloud computing for accuracy, dictation, or recognition, but adapts based on contexts, grammars, speech samples, training sessions, and vocabularies. It provides a personal dictionary that allows users to include or exclude words or expressions from dictation and to record pronunciations to increase recognition accuracy. Custom language models are also supported.

With Windows Vista, WSR was developed to be part of Windows, as speech recognition was previously exclusive to applications such as Windows Media Player. It is present in Windows 7, Windows 8, Windows 8.1, Windows RT, and Windows 10.

History

Microsoft was involved in speech recognition and speech synthesis research for many years before WSR. In 1993, Microsoft hired Xuedong Huang from Carnegie Mellon University to lead its speech development efforts; the company's research led to the development of the Speech API (SAPI) introduced in 1994. Speech recognition had also been used in previous Microsoft products. Office XP and Office 2003 provided speech recognition capabilities in Internet Explorer and Microsoft Office applications; they also enabled limited speech functionality in Windows 98, Windows ME, Windows NT 4.0, and Windows 2000. Windows XP Tablet PC Edition 2002 included speech recognition capabilities with the Tablet PC Input Panel, and Microsoft Plus! for Windows XP enabled voice commands for Windows Media Player. However, these all required installation of speech recognition as a separate component; before Windows Vista, Windows did not include integrated or extensive speech recognition. Office 2007 and later versions rely on WSR for speech recognition services.

Windows Vista

A prototype speech recognition Aero Wizard in Windows Vista (then known as "Longhorn") build 4093.

At WinHEC 2002 Microsoft announced that Windows Vista (codenamed "Longhorn") would include advances in speech recognition and in features such as microphone array support as part of an effort to "provide a consistent quality audio infrastructure for natural (continuous) speech recognition and (discrete) command and control." Bill Gates stated during PDC 2003 that Microsoft would "build speech capabilities into the system — a big advance for that in 'Longhorn,' in both recognition and synthesis, real-time"; and pre-release builds during the development of Windows Vista included a speech engine with training features. A PDC 2003 developer presentation stated Windows Vista would also include a user interface for microphone feedback and control, and user configuration and training features. Microsoft clarified the extent to which speech recognition would be integrated when it stated in a pre-release software development kit that "the common speech scenarios, like speech-enabling menus and buttons, will be enabled system-wide."

During WinHEC 2004 Microsoft included WSR as part of a strategy to improve productivity on mobile PCs. Microsoft later emphasized accessibility, new mobility scenarios, support for additional languages, and improvements to the speech user experience at WinHEC 2005. Unlike the speech support included in Windows XP, which was integrated with the Tablet PC Input Panel and required switching between separate Commanding and Dictation modes, Windows Vista would introduce a dedicated interface for speech input on the desktop and would unify the separate speech modes; users previously could not speak a command after dictating or vice versa without first switching between these two modes. Windows Vista Beta 1 included integrated speech recognition. To incentivize company employees to analyze WSR for software glitches and to provide feedback, Microsoft offered an opportunity for its testers to win a Premium model of the Xbox 360.

During a demonstration by Microsoft on July 27, 2006—before Windows Vista's release to manufacturing (RTM)—a notable incident involving WSR occurred that resulted in an unintended output of "Dear aunt, let's set so double the killer delete select all" when several attempts to dictate led to consecutive output errors; the incident was a subject of significant derision among analysts and journalists in the audience, despite another demonstration for application management and navigation being successful. Microsoft revealed these issues were due to an audio gain glitch that caused the recognizer to distort commands and dictations; the glitch was fixed before Windows Vista's release.

Reports from early 2007 indicated that WSR is vulnerable to attackers using speech recognition for malicious operations by playing certain audio commands through a target's speakers; it was the first vulnerability discovered after Windows Vista's general availability. Microsoft stated that although such an attack is theoretically possible, a number of mitigating factors and prerequisites would limit its effectiveness or prevent it altogether: a target would need the recognizer to be active and configured to properly interpret such commands; microphones and speakers would both need to be enabled and at sufficient volume levels; and an attack would require the computer to perform visible operations and produce audible feedback without users noticing. User Account Control would also prohibit the occurrence of privileged operations.

Windows 7

The dictation scratchpad in Windows 7 replaces the "enable dictation everywhere" option of Windows Vista.

WSR was updated to use Microsoft UI Automation and its engine now uses the WASAPI audio stack, substantially enhancing its performance and enabling support for echo cancellation, respectively. The document harvester, which can analyze and collect text in email and documents to contextualize user terms, has improved performance and now runs periodically in the background instead of only after recognizer startup. Sleep mode has also seen performance improvements and, to address security issues, the recognizer is turned off by default after users speak "stop listening" instead of being suspended. Windows 7 also introduces an option to submit speech training data to Microsoft to improve future recognizer versions.

A new dictation scratchpad interface functions as a temporary document into which users can dictate or type text for insertion into applications that are not compatible with the Text Services Framework. Windows Vista previously provided an "enable dictation everywhere" option for such applications.

Windows 8.x and Windows RT

WSR can be used to control the Metro user interface in Windows 8, Windows 8.1, and Windows RT with commands to open the Charms bar ("Press Windows C"); to dictate or display commands in Metro-style apps ("Press Windows Z"); to perform tasks in apps (e.g., "Change to Celsius" in MSN Weather); and to display all installed apps listed by the Start screen ("Apps").

Windows 10

WSR is featured in the Settings application starting with the Windows 10 April 2018 Update (Version 1803); the change first appeared in Insider Preview Build 17083. The April 2018 Update also introduces a new ⊞ Win+Ctrl+S keyboard shortcut to activate WSR.

Overview and features

WSR allows a user to control applications and the Windows desktop user interface through voice commands. Users can dictate text within documents, email, and forms; control the operating system user interface; perform keyboard shortcuts; and move the mouse cursor. The majority of integrated applications in Windows Vista can be controlled; third-party applications must support the Text Services Framework for dictation. English (U.S.), English (U.K.), French, German, Japanese, Mandarin Chinese, and Spanish are supported languages.

When started for the first time, WSR presents a microphone setup wizard and an optional interactive step-by-step tutorial that users can commence to learn basic commands while adapting the recognizer to their specific voice characteristics; the tutorial is estimated to require approximately 10 minutes to complete. The accuracy of the recognizer increases through regular use, which adapts it to contexts, grammars, patterns, and vocabularies. Custom language models for the specific contexts, phonetics, and terminologies of users in particular occupational fields such as legal or medical are also supported. With Windows Search, the recognizer can also optionally harvest text in documents and email, as well as handwritten tablet PC input, to contextualize and disambiguate terms to improve accuracy; no information is sent to Microsoft.

WSR is a locally processed speech recognition platform; it does not rely on cloud computing for accuracy, dictation, or recognition. Speech profiles that store information about users are retained locally. Backups and transfers of profiles can be performed via Windows Easy Transfer.

Interface

The speech recognizer displaying information based on different modes; the color of the recognizer button changes based on user interaction.

The WSR interface consists of a status area that displays instructions, information about commands (e.g., if a command is not heard by the recognizer), and the status of the recognizer; a voice meter displays visual feedback about volume levels. The status area represents the current state of WSR in a total of three modes, listed below with their respective meanings:

Listening: The recognizer is active and waiting for user input
Sleeping: The recognizer will not listen for or respond to commands other than "Start listening"
Off: The recognizer will not listen or respond to any commands; this mode can be enabled by speaking "Stop listening"

Colors of the recognizer listening mode button denote its various modes of operation: blue when listening; blue-gray when sleeping; gray when turned off; and yellow when the user switches context (e.g., from the desktop to the taskbar) or when a voice command is misinterpreted. The status area can also display custom user information as part of Windows Speech Recognition Macros.
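
The three modes and their button colors can be summarized as a small state machine. The following Python sketch is illustrative only: the names `Mode`, `BUTTON_COLOR`, and `next_mode` are hypothetical and not part of WSR, and the yellow state shown on context switches or misinterpreted commands is left out. The list above does not say how the recognizer enters Sleeping, so the sketch simply takes the current mode as an input.

```python
from enum import Enum


class Mode(Enum):
    LISTENING = "listening"
    SLEEPING = "sleeping"
    OFF = "off"


# Button colors as described in the text (yellow is intentionally not modeled).
BUTTON_COLOR = {
    Mode.LISTENING: "blue",
    Mode.SLEEPING: "blue-gray",
    Mode.OFF: "gray",
}


def next_mode(mode: Mode, phrase: str) -> Mode:
    """Return the mode after a spoken phrase, following the list above."""
    phrase = phrase.strip().lower()
    if mode is Mode.OFF:
        # Off: no voice command is heard at all.
        return Mode.OFF
    if mode is Mode.SLEEPING:
        # Sleeping: only "Start listening" is acted on.
        return Mode.LISTENING if phrase == "start listening" else Mode.SLEEPING
    # Listening: "Stop listening" turns the recognizer off.
    return Mode.OFF if phrase == "stop listening" else Mode.LISTENING
```

For example, `next_mode(Mode.SLEEPING, "Start listening")` yields `Mode.LISTENING`, while any phrase spoken in `Mode.OFF` leaves the mode unchanged.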

The alternates panel displaying suggestions for a phrase.

Alternates panel

An alternates panel disambiguation interface lists items interpreted as being relevant to a user's spoken word(s); if the word or phrase that a user desired to insert into an application is listed among results, a user can speak the corresponding number of the word or phrase in the results and confirm this choice by speaking "OK" to insert it within the application. The alternates panel also appears when launching applications or speaking commands that refer to more than one item (e.g., speaking "Start Internet Explorer" may list both the web browser and a separate version with add-ons disabled). An ExactMatchOverPartialMatch entry in the Windows Registry can limit commands to items with exact names if there is more than one instance included in results.
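
The effect of preferring exact names over partial matches can be illustrated with a short Python sketch. This is an assumption about the behavior described above, not WSR's actual matching logic; the function `resolve_targets`, its `exact_over_partial` flag, and the item names are hypothetical.

```python
def resolve_targets(spoken, item_names, exact_over_partial=False):
    """Return the items a spoken name could refer to.

    Partial matching is modeled as a case-insensitive substring test,
    which is an assumption made for this illustration.
    """
    key = spoken.casefold()
    partial = [name for name in item_names if key in name.casefold()]
    if exact_over_partial:
        exact = [name for name in partial if name.casefold() == key]
        if exact:
            # An exact name exists, so the partial matches are dropped.
            return exact
    return partial


items = ["Internet Explorer", "Internet Explorer (No Add-ons)", "WordPad"]
ambiguous = resolve_targets("internet explorer", items)
preferred = resolve_targets("internet explorer", items, exact_over_partial=True)
```

In this sketch `ambiguous` holds both Internet Explorer entries, which would call for a disambiguation panel, while `preferred` holds only the exactly named item.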

Common commands

Listed below are common WSR commands. Words in italics indicate a word that can be substituted for the desired item (e.g., "direction" in "scroll direction" can be substituted with the word "down"). A "start typing" command enables WSR to interpret all dictation commands as keyboard shortcuts.

Dictation commands: "New line"; "New paragraph"; "Tab"; "Literal word"; "Numeral number"; "Go to word"; "Go after word"; "No space"; "Go to start of sentence"; "Go to end of sentence"; "Go to start of paragraph"; "Go to end of paragraph"; "Go to start of document"; "Go to end of document"; "Go to field name" (e.g., go to address, cc, or subject). Special characters such as a comma are dictated by speaking the name of the special character.

Navigation commands:

Keyboard shortcuts: "Press keyboard key"; "Press ⇧ Shift plus a"; "Press capital b."

Keys that can be pressed without first giving the press command include: ← Backspace, Delete, End, ↵ Enter, Home, Page Down, Page Up, and Tab ↹.

Mouse commands: "Click"; "Click that"; "Double-click"; "Double-click that"; "Mark"; "Mark that"; "Right-click"; "Right-click that"; "MouseGrid".

Window management commands: "Close (alternatively maximize, minimize, or restore) window"; "Close that"; "Close name of open application"; "Switch applications"; "Switch to name of open application"; "Scroll direction"; "Scroll direction in number of pages"; "Show desktop"; "Show Numbers."

Speech recognition commands: "Start listening"; "Stop listening"; "Show speech options"; "Open speech dictionary"; "Move speech recognition"; "Minimize speech recognition"; "Restore speech recognition". In the English language, applicable commands can be shown by speaking "What can I say?" Users can also query the recognizer about tasks in Windows by speaking "How do I task name" (e.g., "How do I install a printer?"), which opens related help documentation.

The MouseGrid command displaying a grid of numbers on the Windows Vista desktop.

MouseGrid

MouseGrid enables users to control the mouse cursor by overlaying numbers across nine regions on the screen; these regions gradually narrow as a user speaks the number(s) of the region on which to focus until the desired interface element is reached. Users can then issue commands including "Click number of region," which moves the mouse cursor to the desired region and then clicks it; and "Mark number of region", which allows an item (such as a computer icon) in a region to be selected, which can then be clicked with the previous click command. Users also can interact with multiple regions at once.
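
The narrowing behavior can be sketched as repeated subdivision of a rectangle into a 3×3 grid. The Python below is a minimal illustration, not Microsoft's implementation: the function names are hypothetical, and the row-major numbering (1 at the top left, 9 at the bottom right) is an assumption, since the text only states that nine regions are numbered.

```python
def mousegrid_region(bounds, number):
    """Return the sub-rectangle for a spoken region number (1-9).

    bounds is (left, top, width, height). Regions are assumed to be
    numbered row by row starting at the top-left corner.
    """
    if not 1 <= number <= 9:
        raise ValueError("region number must be between 1 and 9")
    left, top, width, height = bounds
    row, col = divmod(number - 1, 3)
    cell_w, cell_h = width / 3, height / 3
    return (left + col * cell_w, top + row * cell_h, cell_w, cell_h)


def mousegrid_target(screen, spoken_numbers):
    """Narrow the grid once per spoken number and return the final center."""
    region = screen
    for number in spoken_numbers:
        region = mousegrid_region(region, number)
    left, top, width, height = region
    return (left + width / 2, top + height / 2)
```

On a hypothetical 900×900 screen, speaking "1" and then "9" narrows the grid to the 100×100 cell at (200, 200), so `mousegrid_target((0, 0, 900, 900), [1, 9])` returns its center.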

Show Numbers

Applications and interface elements that do not present identifiable commands can still be controlled by asking the system to overlay numbers on top of them through a Show Numbers command. Once active, speaking the overlaid number selects that item so a user can open it or perform other operations. Show Numbers was designed so that users could interact with items that are not readily identifiable.

The Show Numbers command overlaying numbers in the Games Explorer.

Dictation

WSR enables dictation of text in applications and Windows. If a dictation mistake occurs, it can be corrected by speaking "Correct word" or "Correct that"; the alternates panel then appears and provides suggestions for correction, which can be selected by speaking the number of the suggestion followed by "OK." If the desired item is not listed among suggestions, a user can speak it so that it might appear. Alternatively, users can speak "Spell it" or "I'll spell it myself" to speak the desired word on a letter-by-letter basis; users can use their personal alphabet or the NATO phonetic alphabet (e.g., "N as in November") when spelling.
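
Letter-by-letter spelling with phrases such as "N as in November" can be modeled as a simple parser. The sketch below is an illustration under stated assumptions rather than WSR's behavior: `spelled_letter` and `spell_word` are hypothetical helpers, and the rule that the example word must begin with the spoken letter is assumed.

```python
def spelled_letter(item):
    """Convert 'n' or 'N as in November' into a single lowercase letter."""
    text = item.strip().lower()
    if " as in " in text:
        letter, word = (part.strip() for part in text.split(" as in ", 1))
        # Assumed rule: the example word has to start with the letter.
        if len(letter) != 1 or not letter.isalpha() or not word.startswith(letter):
            raise ValueError(f"cannot interpret {item!r}")
        return letter
    if len(text) == 1 and text.isalpha():
        return text
    raise ValueError(f"cannot interpret {item!r}")


def spell_word(items):
    """Join a sequence of spoken letters into a word."""
    return "".join(spelled_letter(item) for item in items)


word = spell_word(["N as in November", "a", "T as in Tango", "o"])
```

Because any word starting with the right letter is accepted, the same parser covers both the NATO alphabet and a user's personal alphabet (for example, "B as in banana").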

Multiple words in a sentence can be corrected simultaneously (for example, if a user speaks "dictating" but the recognizer interprets this word as "the thing," a user can state "correct the thing" to correct both words at once). In the English language over 100,000 words are recognized by default.

Speech dictionary

A personal dictionary allows users to include or exclude certain words or expressions from dictation. When a user adds a word beginning with a capital letter to the dictionary, a user can specify whether it should always be capitalized or if capitalization depends on the context in which the word is spoken. Users can also record pronunciations for words added to the dictionary to increase recognition accuracy; words written via a stylus on a tablet PC for the Windows handwriting recognition feature are also stored. Information stored within a dictionary is included as part of a user's speech profile. Users can open the speech dictionary by speaking the "show speech dictionary" command.

Macros

An Aero Wizard interface displaying options to create speech recognition macros.

WSR supports custom macros through a supplementary application by Microsoft that enables additional natural language commands. As an example of this functionality, an email macro released by Microsoft enables a natural language command where a user can speak "send email to contact about subject," which opens Microsoft Outlook to compose a new message with the designated contact and subject automatically inserted. Microsoft has also released sample macros for the speech dictionary, for Windows Media Player, for Microsoft PowerPoint, for speech synthesis, to switch between multiple microphones, to customize various aspects of audio device configuration such as volume levels, and for general natural language queries such as "What is the weather forecast?" "What time is it?" and "What's the date?" Responses to these user inquiries are spoken back to the user in the active Microsoft text-to-speech voice installed on the machine.

Application or item	Sample macro phrases (italics indicate substitutable words)
Microsoft Outlook	Send email	Send email to	Send email to Makoto	Send email to Makoto Yamagishi	Send email to Makoto Yamagishi about	Send email to Makoto Yamagishi about This week's meeting	Refresh Outlook email contacts
Microsoft PowerPoint	Next slide	Previous slide	Next	Previous	Go forward 5 slides	Go back 3 slides	Go to slide 8
Windows Media Player	Next track	Previous song	Play Beethoven	Play something by Mozart	Play the CD that has In the Hall of the Mountain King	Play something written in 1930	Pause music
Microphones in Windows	Microphone	Switch microphone	Microphone Array microphone	Switch to Line	Switch to Microphone Array	Switch to Line microphone	Switch to Microphone Array microphone
Volume levels in Windows	Mute the speakers	Unmute the speakers	Turn off the audio	Increase the volume	Increase the volume by 2 times	Decrease the volume by 50	Set the volume to 66
WSR Speech Dictionary	Export the speech dictionary	Add a pronunciation	Add that [selected text] to the speech dictionary	Block that [selected text] from the speech dictionary	Remove that [selected text]	[Selected text] sounds like...	What does that [selected text] sound like?
Speech Synthesis	Read that [selected text]	Read the next 3 paragraphs	Read the previous sentence	Please stop reading	What time is it?	What's today's date?	Tell me the weather forecast for Redmond

Users and developers can create their own macros based on text transcription and substitution; application execution (with support for command-line arguments); keyboard shortcuts; emulation of existing voice commands; or a combination of these items. XML, JScript and VBScript are supported. Macros can be limited to specific applications and rules for macros can be defined programmatically. For a macro to load, it must be stored in a Speech Macros folder within the active user's Documents directory. All macros are digitally signed by default if a user certificate is available to ensure that stored commands are not altered or loaded by third parties; if a certificate is not available, an administrator can create one. Configurable security levels can prohibit unsigned macros from being loaded, prompt users to sign macros after creation, or allow unsigned macros to load.
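
The idea of a macro phrase with substitutable words, such as "send email to contact about subject", can be illustrated with a small pattern matcher. This Python sketch does not reproduce the WSR macro format; the `{name}` placeholder syntax and the functions `compile_phrase` and `match_phrase` are hypothetical and serve only to show how substitutable words might be extracted from an utterance.

```python
import re


def compile_phrase(template):
    """Turn 'send email to {contact} about {subject}' into a regex.

    Text inside braces becomes a named group; everything else must
    match literally, ignoring case.
    """
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)       # literal text
        else:
            pattern += f"(?P<{part}>.+?)"    # substitutable word(s)
    return re.compile(f"^{pattern}$", re.IGNORECASE)


def match_phrase(template, utterance):
    """Return the substituted words as a dict, or None if no match."""
    match = compile_phrase(template).match(utterance.strip())
    return match.groupdict() if match else None


slots = match_phrase(
    "send email to {contact} about {subject}",
    "Send email to Makoto Yamagishi about This week's meeting",
)
```

Here `slots` maps `contact` to "Makoto Yamagishi" and `subject` to "This week's meeting"; an utterance without the word "about" does not match this template and yields `None`.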

Performance

As of 2017 WSR uses Microsoft Speech Recognizer 8.0, the version introduced in Windows Vista. Mark Hachman, a Senior Editor of PC World, found it to be 93.6% accurate for dictation without training, a rate lower than that of competing software. According to Microsoft, the rate of accuracy when trained is 99%. Hachman opined that Microsoft does not publicly discuss the feature because of the 2006 incident during the development of Windows Vista, with the result being that few users knew that documents could be dictated within Windows before the introduction of Cortana.
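
To put the two figures in perspective, the following Python sketch converts an accuracy percentage into an expected number of misrecognized words. It assumes that accuracy is measured per word, which the sources above do not state; the function name and the 1,000-word document length are hypothetical.

```python
def expected_errors(word_count, accuracy_percent):
    """Estimate misrecognized words, assuming per-word accuracy."""
    return round(word_count * (100 - accuracy_percent) / 100)


untrained = expected_errors(1000, 93.6)  # figure reported by Hachman
trained = expected_errors(1000, 99)      # figure stated by Microsoft
```

Under that assumption, a 1,000-word dictation would contain roughly 64 errors untrained and roughly 10 errors when trained.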

Cortana

From Wikipedia, the free encyclopedia

https://en.wikipedia.org/wiki/Cortana

Cortana on Windows 10

Developer(s)	Microsoft
Initial release	April 2, 2014; 6 years ago
Stable release(s)	Android	3.3.3.2753-enus-release / November 29, 2019; 7 months ago
	iOS	3.3.3 / November 30, 2019; 7 months ago
Preview release(s)	Android	3.3.3.2753 / November 29, 2019
Operating system	Windows, iOS, Android, Xbox OS
Platform	(Coming Soon)
Available in
Type	Intelligent personal assistant
License	Proprietary
Website	www.microsoft.com/en-us/windows/cortana

Cortana is a virtual assistant developed by Microsoft, which uses the Bing search engine to perform tasks such as setting reminders and answering questions for the user.

Cortana is currently available in English, Portuguese, French, German, Italian, Spanish, Chinese, and Japanese language editions, depending on the software platform and region in which it is used.

Microsoft began reducing the prevalence of Cortana and converting it from an assistant into different software integrations in 2019. It was split from Windows 10's search bar in April 2019, and was removed from iOS and Android in certain markets on January 31, 2020.

History

Cortana was demonstrated for the first time at the Microsoft BUILD Developer Conference in San Francisco. It was launched as a key ingredient of Microsoft's planned "makeover" of future operating systems for Windows Phone and Windows.

It is named after Cortana, a synthetic intelligence character in Microsoft's Halo video game franchise originating in Bungie folklore, with Jen Taylor, the character's voice actress, returning to voice the personal assistant's US-specific version.

Development

The development of Cortana started in 2009 in the Microsoft Speech products team with general manager Zig Serafin and Chief Scientist Larry Heck. Heck and Serafin established the vision, mission, and long-range plan for Microsoft's digital personal assistant and they built a team with the expertise to create the initial prototypes for Cortana. Some of the key researchers in these early efforts included Microsoft Research researchers Dilek Hakkani-Tür, Gokhan Tur, Andreas Stolcke, and Malcolm Slaney, research software developer Madhu Chinthakunta, and user experience designer Lisa Stifelman. To develop the Cortana digital assistant, the team interviewed human personal assistants. These interviews inspired a number of unique features in Cortana, including the assistant's "notebook" feature. Originally Cortana was only meant to be a codename, but a petition on Windows Phone's UserVoice site that proved to be popular made the codename official.

Expansion to other platforms

Cortana white interface on Windows 10 Mobile

In January 2015, Microsoft announced the availability of Cortana for Windows 10 desktops and mobile devices as part of merging Windows Phone into the operating system at large.

On May 26, 2015, Microsoft announced that Cortana would also be available on other mobile platforms. An Android release was set for July 2015, but an Android APK file containing Cortana was leaked ahead of its release. It was officially released, along with an iOS version, in December 2015.

During E3 2015, Microsoft announced that Cortana would come to the Xbox One as part of a universally designed Windows 10 update for the console.

Cortana in other services

Cortana integrated in Microsoft Edge

Microsoft has integrated Cortana into numerous products, most deeply into Microsoft Edge, the browser bundled with Windows 10. Cortana can find opening hours when on restaurant sites, show retail coupons for websites, or show weather information in the address bar. At the Worldwide Partners Conference 2015 Microsoft demonstrated Cortana integration with products such as GigJam. Conversely, Microsoft announced in late April 2016 that it would block anything other than Bing and Edge from being used to complete Cortana searches, again raising questions of anticompetitive behavior by the company.

Microsoft's "Windows in the car" concept includes Cortana. The concept makes it possible for drivers to make restaurant reservations and see places before they go there.

At Microsoft Build 2016, Microsoft announced plans to integrate Cortana into Skype (Microsoft's instant messaging service) as a bot to allow users to order food, book trips, transcribe video messages and make calendar appointments through Cortana in addition to other bots. As of 2016, Cortana can underline certain words and phrases in Skype conversations that relate to contacts and corporations. A writer from Engadget has criticised the Cortana integration in Skype for only responding to very specific keywords, feeling as if she was "chatting with a search engine" due to the impersonal way the bots replied to certain words such as "Hello" causing the Bing Music bot to bring up Adele's song of that name.

Microsoft also announced at Microsoft Build 2016 that Cortana would be able to cloud-synchronise notifications between Windows 10 Mobile's and Windows 10's Action Center, as well as notifications from Android devices.

In December 2016, Microsoft announced the preview of Calendar.help, a service that enabled people to delegate the scheduling of meetings to Cortana. Users interacted with Cortana by including it in email conversations. Cortana would then check people's availability in Outlook Calendar or Google Calendar, and work with others Cc'd on the email to schedule the meeting. The service relied on automation and human-based computation.

In May 2017, Microsoft in collaboration with Harman Kardon announced INVOKE, a voice-activated speaker featuring Cortana. The premium speaker has a cylindrical design and offers 360 degree sound, the ability to make and receive calls with Skype, and all of the other features currently available with Cortana.

The Harman Kardon Invoke speaker, powered by Cortana

Functionality

Cortana can set reminders, recognize natural voice without the requirement for keyboard input, and answer questions using information from the Bing search engine (e.g., current weather and traffic conditions, sports scores, biographies). Searches using Windows 10 are made only with the Microsoft Bing search engine, and all links open in Microsoft Edge, except when a screen reader such as Narrator is in use, in which case links open in Internet Explorer. Windows Phone 8.1's universal Bing SmartSearch features are incorporated into Cortana, which replaces the previous Bing Search app that was activated when a user pressed the "Search" button on their device. Cortana includes a music recognition service and can simulate rolling dice and flipping a coin. Cortana's "Concert Watch" monitors Bing searches to determine which bands or musicians the user is interested in. Cortana also integrates with the Microsoft Band watch band for Windows Phone devices; when connected via Bluetooth, it can make reminders and phone notifications.

With the Lumia Denim update, launched in October 2014, active listening was added to Cortana, enabling it to be invoked with the phrase "Hey Cortana"; it can then be controlled as usual. Some O2 devices in the United Kingdom received the Lumia Denim update without the feature, but this was later clarified to be a bug, which Microsoft has since fixed.

Cortana integrates with services such as Foursquare to provide restaurant and local attraction recommendations and LIFX to control smart light bulbs.

Notebook

Cortana stores personal information such as interests, location data, reminders, and contacts in the "Notebook". It can draw upon and add to this data to learn a user's specific patterns and behaviors. Users can view and specify what information is collected to allow some control over privacy, said to be "a level of control that goes beyond comparable assistants". Users can delete information from the "Notebook".

Reminders

Cortana has a built-in system of reminders which, for example, can be associated with a specific contact; it will then remind the user when in communication with that contact, possibly at a specific time or when the phone is in a specific location. Originally these reminders were specific to the device Cortana was installed on, but since Windows 10, Microsoft synchronizes reminders across devices.

Design

Most versions of Cortana take the form of two nested circles, which are animated to indicate activities such as searching or talking. The main color scheme includes a black or white background and shades of blue for the respective circles.

Phone notification syncing

Cortana on Windows Mobile and Android is capable of capturing device notifications and sending them to a Windows 10 device. This allows a computer user to view notifications from their phone in the Windows 10 Action Center. The feature was announced in early 2016 and released later in the year.

Miscellaneous

Cortana has a "do-not-disturb" mode in which users can specify "quiet hours", as was available for Windows Phone 8.1 users. Users can change the settings so that Cortana calls users by their names or nicknames. It also has a library of "Easter Eggs", pre-determined remarks.

When asked for a prediction, Cortana correctly predicted the winners of the first 14 matches of the 2014 FIFA World Cup knockout stage, including the semi-finals, before it incorrectly picked Brazil over the Netherlands in the third place play-off match; this streak topped Paul the Octopus, who correctly predicted all 7 of Germany's 2010 FIFA World Cup matches as well as the Final. Cortana can forecast results in various other sports such as the NBA, the NFL, the Super Bowl, the ICC Cricket World Cup and various European football leagues. Cortana can solve mathematical equations, convert units of measurement, and determine the exchange rates between currencies including Bitcoin.

Integrations

Cortana can integrate with third-party apps on Windows 10 or directly through the service. Starting in late 2016, Cortana integrated with Microsoft's Wunderlist service, allowing Cortana to add and act on reminders.

At Microsoft's Build 2017 conference, Microsoft announced that Cortana would get a consumer third-party skills capability, similar to that in Amazon Alexa.

On February 16, 2018, Microsoft announced that connected home skills had been added for ecobee, Honeywell Lyric, Honeywell Total Connect Comfort, LIFX, TP-Link Kasa, and Geeni, as well as support for IFTTT. At Microsoft's Ignite 2018 conference, Microsoft announced a Technology Adopters Program through which enterprises could build skills and deploy them into Azure tenants, accessible by organizational units or security groups.

Privacy concerns

Cortana indexes and stores user information. It can be disabled; this will cause Windows search to search the Web as well as the local computer, but this can be turned off. Turning Cortana off does not in itself delete user data stored on Microsoft's servers, but it can be deleted by user action. Microsoft has further been criticized because searches made through the Start Menu send requests to Bing's website for a file called "threshold.appcache", which contains Cortana's information, even when Cortana is disabled on Windows 10.

As of April 2014, Cortana was disabled for users aged under 13 years.

Regions and languages

The Chinese version of Cortana, Xiao Na

The British version of Cortana speaks with a British accent and uses British idioms, while the Chinese version, known as Xiao Na, speaks Mandarin Chinese and has an icon featuring a face and two eyes, which is not used in other regions.

As of 2020, the English version of Cortana on Windows devices is available to all users in the United States (American English), Canada (French/English), Australia, India, and the United Kingdom (British English). Other language versions of Cortana are available in France (French), China (Simplified Chinese), Japan (Japanese), Germany (German), Italy (Italian), Brazil (Portuguese), Mexico, and Spain (Spanish). Cortana generally listens for the hot word "Hey Cortana", along with customized versions in certain languages, such as "Hola Cortana" in Spanish.

The English United Kingdom localised version of Cortana is voiced by voice actress Ginnie Watson, while the United States localised version is voiced by Jen Taylor. Taylor is the voice actress who voices Cortana, the namesake of the virtual assistant, in the Halo video game series.

The following table identifies the localized versions of Cortana currently available. Except where indicated, this applies to both Windows Mobile and Windows 10 versions of the assistant.

Language	Region	Variant	Status	Platforms
English	United States	American English	Available	Windows, Android, iOS
	United Kingdom	British English	Available	Windows, Android
	Canada	Canadian English	Available	Windows, Android, iOS
	Australia	Australian English	Available	Windows, Android, iOS
	India	Indian English	Available	Windows
French	France	French of France	Available	Windows, Android, iOS
French	Canada	Canadian French	Available	Windows, Android, iOS
German	Germany	Standard German	Available	Windows
Italian	Italy	Standard Italian	Available	Windows
Spanish	Spain	Peninsular Spanish	Available	Windows
Spanish	Mexico	Mexican Spanish	Available	Windows
Traditional Chinese	Taiwan	Taiwanese Mandarin	Not Available
	Hong Kong	Cantonese	Not Available
	Macau	Cantonese	Not Available
Simplified Chinese	China	Mandarin Chinese	Available	Windows, Android, iOS
Portuguese	Brazil	Brazilian Portuguese	Available	Windows
Japanese	Japan	Standard Japanese	Available	Windows, iOS
Russian	Russia	Standard Russian	Not Available	Windows, iOS

Technology

The natural language processing capabilities of Cortana are derived from Tellme Networks (bought by Microsoft in 2007) and are coupled with a semantic search database called Satori.

While many of Cortana's U.S. English responses are voiced by Jen Taylor, organic responses require the use of a text-to-speech engine. Microsoft Eva is the name of the text-to-speech voice used for organic responses in Cortana's U.S. English version.

Updating

Cortana updates were delivered independently of those to the main Windows Phone OS, allowing Microsoft to provide new features at a faster pace. Not all Cortana-related features could be updated in this manner as some features such as "Hey Cortana" required the Windows Phone update service and the Qualcomm Snapdragon SensorCore Technology.

Google Voice Search

From Wikipedia, the free encyclopedia

https://en.wikipedia.org/wiki/Google_Voice_Search

Google Voice Search or Search by Voice is a Google product that allows users to use Google Search by speaking on a mobile phone or computer; that is, the device searches for information based on a spoken query.

It was initially named Voice Action, which allowed one to give speech commands to an Android phone. Once available only for the U.S. English locale, commands were later recognized and replied to in American, British, and Indian English, as well as Filipino, French, Italian, German, and Spanish.

In Android 4.1+ (Jelly Bean), it was merged with Google Now.

In August 2014, a new feature was added to Google Voice Search, allowing users to choose up to five languages, with the app automatically recognizing which one is spoken.

Google Voice Search on Google.com

On June 14, 2011, Google announced at its Inside Google Search event that it would start to roll out Voice Search on Google.com during the coming days.

Google rolled out the support, but only for the Google Chrome browser.

History

Google Voice Search was a tool from Google Labs that allowed someone to use their phone to make a Google query. After the user called (650) 623-6706, the number of Google Voice's search system, they would wait for the words Say your Search Keywords and then say the keywords. Next, they would either wait to have the page updated, or click on a link to bring up the search page the user requested. At the moment, both the demo of this service and the page have been shut down. Since the introduction of the service, products from Google, such as GOOG-411, Google Maps and Google Mobile App, have been developed to use speech recognition technology in various ways.

On October 30, 2012, Google released a new Google Search app for iOS, which featured an enhanced Google Voice Search function, similar to the Voice Search function found in Google's Android Jelly Bean and aimed at competing with Apple's own Siri voice assistant. The new app has been compared favorably by reviewers to Siri, and The Unofficial Apple Weblog's side-by-side comparison said that Google's Voice Search on iOS is "amazingly quick and relevant, and has more depth [than Siri]". As of May 2016, 20% of search queries on mobile devices were made by voice, with the number expected to grow.

Supported languages

The following languages and variants are partially supported in Google Voice Search:

Abaza since 2021
Afrikaans since 2010
Albanian since 2020
Amharic since 2017
Armenian since 2017
Azerbaijani since 2017
Basque since 2012
Bangla since 2017
Bulgarian since 2012
Burmese since 2018
Catalan since 2012
Czech since 2010
Danish since 2014
Dutch since 2010
English (Australia, Canada, India, New Zealand, South Africa, UK, US), some variants since 2008 launch
Filipino since 2013
Finnish since 2012
French since 2010
Galician since 2012
Georgian since 2017
German since 2010
Greek since 2014
Gujarati since 2017
Hebrew since 2011
Hungarian since 2012
Icelandic since 2012
Italian since 2010
Indonesian since 2011
Japanese since 2009
Javanese since 2017
Kannada since 2017
Korean since 2010
Khmer since 2017
Kurdish since 2021
Kyrgyz since 2022
Lao since 2017
Latin
Latvian since 2017
Lithuanian since 2015
Luxembourgish since 2020
Macedonian since 2020
Mandarin Chinese (Traditional Taiwan, Simplified China, Traditional Hong Kong) since 2009
Malay since 2011
Malayalam since 2017
Marathi since 2017
Mongolian since 2020
Nepali since 2017
Norwegian since 2012
Persian since 2013
Polish since 2010
Pig Latin since April 1, 2011 (added as a working feature rather than only as an April Fools' Day joke, although it is not officially listed)
Portuguese (Brazilian; European since 2012)
Punjabi since 2020
Romanian since 2012
Russian since 2010
Serbian since 2012
Sindhi since 2021
Sinhala since 2017
Slovak since 2012
Spanish (Argentina, Bolivia, Chile, Colombia, Costa Rica, Dominican Republic, Ecuador, El Salvador, Guatemala, Honduras, Mexico, Nicaragua, Panama, Paraguay, Peru, Puerto Rico, Spain, US, Uruguay, Venezuela) since 2010 and Latin American Spanish since 2011
Sundanese since 2017
Swahili since 2017
Swedish since 2012
Tamil since 2017
Telugu since 2017
Turkish since 2010
Urdu since 2017
Uzbek since 2018
Yue Chinese (Traditional Hong Kong) since 2010
Zulu since 2010
Vietnamese since 2015

Integration in other Google products

Google Maps with voice search

In the summer of 2008, Google added voice search to the BlackBerry Pearl version of Google Maps for mobile, allowing Pearl users to say their searches in addition to typing them.

Google Mobile App with voice search

The Google Mobile app for BlackBerry and Nokia (Symbian) mobiles allows users to search Google by voice at the touch of a button by speaking their queries. See http://www.google.com/mobile/apple/app.html for more information. Google also introduced voice search to all "Google Experience" Android phones with the 1.1 platform update, which includes the functionality on board the built-in Google Search widget.

In November 2008, Google added voice search to Google Mobile App on iPhone. With a later update, Google announced Voice Search for iPod touch, which requires a third-party microphone. On August 5, 2009, T-Mobile launched the MyTouch 3G with Google, which features one-touch Google Voice Search.

Google Voice Search in YouTube

Since March 2010, a beta-grade derivation of Google Voice Search is used on YouTube to provide optional automatic text caption annotations of videos in the case that annotations are not provided. This feature is geared to the hearing-impaired and, at present, is only available for use by English-speaking users.


Fine-tuned universe