Professional Voice UI Integration Tools

The Creoir EdgeVUI™ Software Development Kit simplifies the development of voice user interfaces (VUI) by providing advanced Natural Language Understanding (NLU) tools that use Automatic Speech Recognition (ASR), Text-to-Speech (TTS), and Speech Signal Enhancement (SSE) technologies from Cerence.

High Accuracy

  • Proven Cerence speech technologies used in more than 500 million cars
  • Fully SW-based Speech Signal Enhancement (SSE) algorithms reducing environmental noise impact

On-The-Edge Privacy

  • No audio data sent to cloud or 3rd party
  • Reliable operation without internet connection

Easy Integration

  • Fast development cycles with wide range of Voice UI creation tools
  • Simple-to-use MQTT and REST-like API customer interface for Linux, Android, and Windows

EdgeVUI™ Software Development Kit

Features
Requirements

Domain-specific, speaker-independent speech recognition and voice feedback

  • Built-in run-time libraries and API
  • Tools and documentation
  • Step-by-step instructions and sample applications
  • Method for domain-specific Natural Language Understanding

Built-in interface options for Linux, Windows and Android

Programming-language-independent action code implementation

  • Python, C, C++, ReactJS, etc. for Linux and Windows
  • Kotlin and Java for Android

Simple-to-use customer interface with

  • MQTT (Linux & Windows)
  • REST-like API (Android)
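
As an illustration of programming-language-independent action code, the sketch below shows how an application might react to a recognized intent delivered as a message payload, for example over MQTT. The JSON payload fields (`intent`, `slots`), the intent names, and the handler functions are assumptions made for illustration only; the actual message format is defined by the EdgeVUI SDK documentation. The function is kept independent of any MQTT library so it could be called from any client's message callback.

```python
import json

# Hypothetical action handlers; names and behavior are illustrative only.
def set_light(slots):
    return f"light {slots.get('state', 'on')}"

def set_temperature(slots):
    return f"temperature {slots.get('value', '?')}"

# Hypothetical mapping from intent name to action code.
ACTIONS = {
    "SetLight": set_light,
    "SetTemperature": set_temperature,
}

def handle_message(payload: bytes) -> str:
    """Decode an assumed JSON payload and run the matching action.

    Returns a short status string; malformed or unknown messages
    are reported instead of raising an exception.
    """
    try:
        message = json.loads(payload.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return "ignored: malformed payload"
    if not isinstance(message, dict):
        return "ignored: malformed payload"
    intent = message.get("intent")
    slots = message.get("slots")
    action = ACTIONS.get(intent) if isinstance(intent, str) else None
    if action is None:
        return "ignored: unknown intent"
    return action(slots if isinstance(slots, dict) else {})
```

Keeping the dispatch table separate from the transport means the same action code could serve both the MQTT and the REST-like interface.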

Hardware requirements

  • Embedded: ARM Cortex @ 1 GHz, 256 MB RAM, 2 GB file system
  • PC (Linux, Windows): x86-64, SSE2, AVX2
  • Mobile: Android 8 (API level 26)
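
The PC requirement can be checked before deployment. The sketch below, assuming a Linux host that exposes CPU features through `/proc/cpuinfo`, tests for the SSE2 and AVX2 instruction-set extensions listed above (here SSE2 is the x86 SIMD extension, not Speech Signal Enhancement). The function name is illustrative and not part of the SDK.

```python
REQUIRED_FLAGS = {"sse2", "avx2"}

def missing_cpu_flags(cpuinfo_text: str) -> set:
    """Return the required flags that are absent from /proc/cpuinfo text."""
    present = set()
    for line in cpuinfo_text.splitlines():
        # Each line has the form "key : value"; x86 CPUs list
        # their features on the "flags" line.
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            present.update(value.split())
    return REQUIRED_FLAGS - present
```

On a target machine, the function could be given the contents of `/proc/cpuinfo`; an empty result means both extensions are available.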

Memory requirements

  • Flash: Data models ~7 MB/language
  • Executable code ~50 MB depending on features and complexity of Voice UI
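
As a rough worked example of the figures above, and assuming both refer to flash storage, the footprint can be estimated as about 50 MB of executable code plus about 7 MB per language. The sketch only restates that arithmetic; the actual footprint depends on the features and complexity of the Voice UI.

```python
EXECUTABLE_MB = 50          # approximate, from the figures above
MODEL_MB_PER_LANGUAGE = 7   # approximate, from the figures above

def estimate_flash_mb(num_languages: int) -> int:
    """Approximate flash footprint in MB for a given number of languages."""
    if num_languages < 0:
        raise ValueError("num_languages must be non-negative")
    return EXECUTABLE_MB + MODEL_MB_PER_LANGUAGE * num_languages

print(estimate_flash_mb(3))  # 50 + 7 * 3 = 71
```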

RAM Usage

  • 150 – 250 MB depending on features
  • Required runtime footprint depends on the requested feature set and the specific project

Cerence Speech Technologies

Creoir is a technology partner of Cerence, the developer and manufacturer of speech technologies used in nearly 500 million cars worldwide.

Creoir EdgeVUI™ Software Development Kit utilizes the following Cerence core speech technologies:

Automatic Speech Recognition (ASR)

Automatic Speech Recognition (ASR) language models support 40 languages (on request). The advanced Cerence speech recognition engine delivers speaker-independent, continuous speech recognition with unique features for voice-enabled applications:

  • Large vocabulary support: Enable embedded recognition for large lists up to millions of items
  • Wake-up words: Always-listening mode with keyword activation removes the need for a “press to talk” button
  • Barge-in: Allows user to speak over spoken dialog prompts and be recognized
  • Global language support: Global support for over 40 languages provides universal functionality

Natural Text-to-Speech (TTS)

Natural Text-to-Speech (TTS) is available for up to 65 languages and 147 voices (on request). Cerence Text-to-Speech (TTS) is a suite of speech output solutions that generate high-quality speech, with seamless blending of dynamic text-to-speech, pre-recorded audio and tuned prompts. It is optimized to read long texts in a natural, human way. New, deep learning-based algorithms deliver higher smoothness and more natural prosody, resulting in a unique voice experience. Key features include:

  • Emotional TTS: Developers can choose from four different speaking styles: neutral, lively, forceful, and apologetic.
  • Prosody control: Volume, pitch, speaking rate, and timbre can be changed at run time for more dynamic and lively effects.
  • Languages and Voices: A truly universal voice portfolio offers 65 languages and 147 voices for creation of global structures using a single engine.
  • Accuracy: High linguistic accuracy offers correct readout for all types of text, including a large set of personal names.

Speech Signal Enhancement (SSE)

Speech Signal Enhancement (SSE) improves the quality and clarity of spoken voice commands by reducing noise and distortions. In environments where background noise is prevalent, various audio processing techniques are employed for optimal speech recognition. These include noise suppression through adaptive filtering and spectral subtraction. SSE optimizes speech intelligibility in adverse acoustic conditions. Several software algorithms are used in SSE tuning for microphone arrays of 1 to 16 microphones.
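
Spectral subtraction, one of the noise suppression techniques named above, can be illustrated on a single frame of magnitude-spectrum values. This is a generic textbook sketch, not the Cerence implementation: the over-subtraction factor `alpha`, the spectral floor `beta`, and the function name are assumptions chosen for illustration.

```python
def spectral_subtract(noisy_mag, noise_mag, alpha=1.0, beta=0.01):
    """Subtract an estimated noise power spectrum from a noisy one.

    noisy_mag, noise_mag: per-frequency-bin magnitudes of one frame.
    alpha: over-subtraction factor (assumed tuning parameter).
    beta: floor as a fraction of the noisy power, which keeps the
          result from going negative.
    Returns the cleaned magnitudes.
    """
    if len(noisy_mag) != len(noise_mag):
        raise ValueError("spectra must have the same number of bins")
    cleaned = []
    for y, n in zip(noisy_mag, noise_mag):
        power = y * y - alpha * n * n   # subtract noise power
        floor = beta * y * y            # minimum power kept per bin
        cleaned.append(max(power, floor) ** 0.5)
    return cleaned
```

Bins where the noise estimate exceeds the measured power are clamped to the floor rather than set to zero, a common way to limit audible artifacts in this family of methods.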

  • Acoustic Speaker Localization: Determines the location and direction of the speaker from the audio signals by analyzing the arrival times, phase, and amplitude of the sound.
  • Wind Buffet Suppression: Removes wind-induced vibration noise from the overall collected audio signal.
  • Noise Reduction: Minimizes unwanted sounds in an audio signal to increase the clarity of the desired speech signal.
  • Software-Based Solution: Creoir EdgeVUI™ SSE is continuously developed with new features and improved performance, and updates do not require any hardware modifications.
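
The arrival-time analysis mentioned under Acoustic Speaker Localization can be sketched for two microphones: the lag that maximizes the cross-correlation of the two signals estimates the time difference of arrival, which maps to an angle under a far-field assumption. This is a generic textbook illustration, not the Cerence algorithm; the function names and the sample rate and microphone spacing passed in are assumed values.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at room temperature

def estimate_delay(mic_a, mic_b, max_lag):
    """Return the lag (in samples) by which mic_b trails mic_a.

    Picks the lag in [-max_lag, max_lag] that maximizes the
    cross-correlation of two equally long sample sequences.
    """
    best_lag, best_score = 0, float("-inf")
    n = len(mic_a)
    for lag in range(-max_lag, max_lag + 1):
        score = sum(
            mic_a[i] * mic_b[i + lag]
            for i in range(n)
            if 0 <= i + lag < n
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def arrival_angle_deg(lag, sample_rate, mic_distance):
    """Convert a delay to an angle from broadside (far-field assumption)."""
    delay_s = lag / sample_rate
    ratio = SPEED_OF_SOUND * delay_s / mic_distance
    ratio = max(-1.0, min(1.0, ratio))  # guard against values past +/-1
    return math.degrees(math.asin(ratio))
```

For example, a delay of 2 samples at an assumed 16 kHz sample rate and 0.1 m spacing corresponds to an angle of roughly 25 degrees from broadside. Arrays with more microphones would combine several such pairwise estimates.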