User Adaptation of AAC Device Voices

Award Information

Agency: Department of Health and Human Services

Branch: National Institutes of Health

Contract: 2R42DC008712-02

Agency Tracking Number: R42DC008712

Amount: $947,065.00

Phase: Phase II

Program: STTR

Solicitation Topic Code: NIDCD

Solicitation Number: PHS2010-2

Timeline

Solicitation Year: 2010

Award Year: 2010

Award Start Date (Proposal Award Date): N/A

Award End Date (Contract End Date): N/A

Small Business Information

BIOSPEECH INC

940 UPPER DEVON LANE

LAKE OSWEGO, OR -

United States

DUNS: 144815151

HUBZone Owned: No

Woman Owned: No

Socially and Economically Disadvantaged: No

Principal Investigator

Name: ESTHER KLABBERS
Phone: (503) 341-1192
Email: KLABBERS@CSLU.OGI.EDU

Business Contact

Name: JAN VAN
Phone: (503) 494-7784
Email: enotices@ohsu.edu

Research Institution

Name: Oregon Health And Science University
Address:

3181 SW Sam Jackson Pk Rd

PORTLAND, OR 97239-3098

United States

Phone: N/A
Type: Nonprofit College or University

Abstract

DESCRIPTION (provided by applicant): Augmentative and alternative communication devices with voice output (also known as Speech Generating Devices, or SGDs) enable individuals to speak by electronic means. Typical users of SGDs are individuals who have suffered a stroke or traumatic brain injury, or who have neurodegenerative or neurodevelopmental disorders. In most cases, the user was able to speak previously, or had or still has intermittent speech. In these cases, the user's relatives and friends are familiar with the user's voice. An often-expressed desire is for the SGD to sound like the user. However, typical SGDs do not mimic any characteristics of the user's speech; in fact, they typically have an extremely limited array of synthetic voices, and the prosodic patterns of these voices are not customizable. As a result, the synthetic voice is impersonal, which may be a factor in discouraging impaired speakers and their communication partners from using the SGD. To address this impersonal character of current SGDs, we propose to offer a system with a wide range of personal customization options, by making use of (1) a substantial number of synthetic voices to choose from; (2) customizable prosody; and, most important, (3) Speaker Mimicry (SM) technology to mimic the user. Phase I of this project established the feasibility of using SM technology to adapt an existing Text-to-Speech (TTS) synthesis system to mimic a specific target speaker, requiring only a small set of training recordings of the target speaker. Mimicry of the spectral aspects of the target speaker was achieved with two Voice Transformation (VT) technologies: one that required extremely few recordings, which moreover did not need to be of high acoustic quality (hence, pre-morbid home videos could in principle be used); and a second that required more and better-quality recordings, but also provided better results.
Mimicry of the prosodic aspects of the target speaker (Prosody Mimicry, or PM) was achieved by estimating static and dynamic parameters of the target speaker's intonational and durational patterns, which were then incorporated into the TTS system. The deliverables of this Phase II STTR project consist of: (1) a complete SM-capable SGD, comprising an SM-capable TTS system and a built-in touch-screen Graphical User Interface (GUI) for user input, installed on a low-cost, dedicated touch-screen netbook (alternative keyboards or special input devices will also be supported); and (2) efficient software tools and processes to be used by BioSpeech personnel to compute the individual user data needed for the SM capability. The SM capability will have multiple options, depending on the availability, quantity, and quality of user recordings. The goals of this Phase II proposal are to develop a complete prototype of this product concept, and to co-develop and field-test the system with a group of SGD users. Moreover, we will show that, even with these unique features, the system can be made available at a far lower cost than most current SGDs, thanks to BioSpeech's complete ownership of the technology, to minimal ROI pressures, and to the availability of low-cost netbooks.

PUBLIC HEALTH RELEVANCE: Millions of Americans with impaired or absent speech communication ability rely on Augmentative and Alternative Communication devices with voice output (Speech Generating Devices) to communicate. A psychologically important feature that no currently available system has is the ability to speak with the user's voice, i.e., the ability to produce speech that mimics the individual's pre-morbid speech or speech that the individual may be able to produce intermittently.
Phase I of this project established the feasibility of using Speaker Mimicry (SM) technology to adapt an existing SGD to mimic a specific target speaker, requiring only a small set of training recordings to be made of the target speaker; the goal of this Phase II proposal is to develop, further improve, and evaluate a complete prototype of this concept, and to deploy it in a limited release to a select group of individuals for in-the-field use.

* Information listed above is at the time of submission. *
