Enggar Rahayu Safitri 1841720008


Before going further into AI for speech recognition, let us first define what AI
(Artificial Intelligence) and speech recognition are. Artificial Intelligence is the
theory and development of computer systems able to perform tasks that normally require
human intelligence, such as visual perception, speech recognition, decision-making, and
translation between languages. Speech recognition is the ability of a machine to identify
words and phrases in spoken language and convert them to a machine-readable format. It is
commonly known as ASR, or Automatic Speech Recognition.
Speech recognition allows you to speak input into systems. You talk to your
computer, phone, or device, and it uses what you said as input to trigger some action.
Speech recognition technology is being used to replace other methods of input such as
typing, clicking, or selecting in other ways.
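
The idea of using recognized speech as input to trigger an action can be sketched in a few lines of Python. This is only an illustration, assuming the speech recognizer has already produced a text transcript. The command phrases, the action functions, and the name handle_transcript are all hypothetical and do not come from any real assistant.

```python
from typing import Callable, Dict


# Hypothetical actions that a device might perform.
def turn_on_lights() -> str:
    return "lights on"


def play_music() -> str:
    return "music playing"


# Hypothetical mapping from spoken phrases to actions.
COMMANDS: Dict[str, Callable[[], str]] = {
    "turn on the lights": turn_on_lights,
    "play music": play_music,
}


def handle_transcript(transcript: str) -> str:
    """Trigger the first action whose phrase appears in the transcript."""
    # Normalize case and whitespace so "TURN  ON the lights" still matches.
    text = " ".join(transcript.lower().split())
    for phrase, action in COMMANDS.items():
        if phrase in text:
            return action()
    return "unrecognized command"


print(handle_transcript("Please turn on the lights now"))
```

Real assistants interpret intent with natural language processing rather than simple phrase matching, so this sketch shows only the general flow from spoken words to an action.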

Technology Companies Developing Speech Recognition

Technology companies have recognized the interest in speech recognition technologies and
are working toward making voice recognition a standard for most products. One goal of
these companies may be to make voice assistants speak and reply with greater accuracy
around context and content. Research firm Research and Markets reported that the speech
recognition market will be worth $18 billion by 2023. As voice recognition technology gets
bigger and better, the research estimates that it could be incorporated into everything
from phones to refrigerators to cars. A glimpse of that was seen at the annual CES 2017
show in Las Vegas, where new devices with voice were either launched or announced.

For a company to build a robust speech recognition
experience, the artificial intelligence behind it has to become better at
handling challenges such as accents and background noise. Background noise in particular
can cause a whole system to fail, so speech recognition fails in many cases due to noise
that is outside the user's control. There is also a problem with language itself. Not all
languages are supported in speech recognition, and those that are supported are often not
supported as well as English. As a result, most devices that run speech recognition
software perform reasonably well only in English. Today, however, developments in natural
language processing and neural network technology have improved speech and voice
technology so much that it is reportedly on par with humans. In 2017, for example, the
word error rate for Microsoft’s voice technology was recorded at 5.1 percent by the
company, while Google reported that it had reduced its rate to 4.9 percent.
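
To make the word error rate figures concrete: the rate is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the recognizer's output into the correct transcript, divided by the number of words in the correct transcript. A rate of 5.1 percent therefore means roughly 5 errors per 100 spoken words. The Python sketch below follows this conventional definition. It is not the exact evaluation procedure used by Microsoft or Google, and the function name word_error_rate and the sample sentences are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word was missed
                curr[j - 1] + 1,     # insertion: extra word was recognized
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Two of the five reference words are recognized incorrectly.
print(word_error_rate("turn on the kitchen lights", "turn on a kitchen light"))
```

In this example two of the five reference words are wrong, giving a rate of 0.4, or 40 percent.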

Applications of Speech Recognition

This section discusses some applications of speech recognition, grouped by what the
research points to as the primary focus area of each device. The groups are Smart Speaker
and Smart Home, highlighting Google, and Mobile Device Applications, highlighting
Facebook’s speech recognition integrations.

1. Smart Speaker and Smart Home (Google Assistant)

Google Assistant is Google’s voice-activated virtual assistant whose skills
include tasks such as sending and requesting payments via Google Pay.
Assistant is available on devices such as Android or iOS phones, smart
watches, Pixelbook laptops, Android smart TVs/displays and Android
Auto-enabled cars. Users can also type commands to Assistant when quiet is
needed in places like libraries.

For children and families, the Google Assistant offers 50 voice-related
games. For example, children can command Assistant to play space trivia
with them.
Google claims that the speaker works with more than 5,000 smart home
devices — such as coffee machines, lights, and thermostats — from more
than 150 brands including Sony, Philips, LG and Toshiba. To make the

Google has opened the software development kit through Actions, which
allows developers to build voice into their own products that support artificial
intelligence. Google also recently launched the Assistant Investments program.
Under the program, Google will provide support in technical, business
development, and product areas. Startups in the program will also receive first
access to Assistant’s new features and programs. Another of Google’s
speech-recognition products is the AI-driven Cloud Speech-to-Text tool, which
enables developers to convert audio to text through deep learning neural
network algorithms. Working in 120 languages, the tool enables voice
command-and-control, transcription of audio from call centers, and processing
of real-time streaming or pre-recorded audio.

2. Mobile Device Application (Facebook Speech Recognition Project)

While Facebook has expanded on and refined its facial recognition
capabilities, it also purchased a company that offers a natural language
development tool in 2015. Facebook today has the capability
to automatically caption video ads through speech recognition. Adding
subtitles to video ads enables Facebook users to see the topic of the ad as
they scroll down the news feed. Facebook advertisers can add the subtitles by
going to Power Editor and choosing “generate automatically”. Facebook also
acquired Oculus, a virtual reality headset maker, for $2 billion in 2014. In
March 2017, Oculus announced that it had integrated voice and speech
recognition into its headset to enable users to easily navigate virtual
reality. The application, available in English on Rift and Gear VR headsets,
allows wearers to conduct voice searches from Oculus Home to navigate games,
apps, and experiences.

