
User Adaptation of AAC Device Voices

Award Information
Agency: Department of Health and Human Services
Branch: National Institutes of Health
Contract: 2R42DC008712-02
Agency Tracking Number: R42DC008712
Amount: $947,065.00
Phase: Phase II
Program: STTR
Solicitation Topic Code: NIDCD
Solicitation Number: PHS2010-2
Timeline
Solicitation Year: 2010
Award Year: 2010
Award Start Date (Proposal Award Date): N/A
Award End Date (Contract End Date): N/A
Small Business Information
940 UPPER DEVON LANE
LAKE OSWEGO, OR -
United States
DUNS: 144815151
HUBZone Owned: No
Woman Owned: No
Socially and Economically Disadvantaged: No
Principal Investigator
 ESTHER KLABBERS
 (503) 341-1192
 KLABBERS@CSLU.OGI.EDU
Business Contact
 JAN VAN
Phone: (503) 494-7784
Email: enotices@ohsu.edu
Research Institution
 Oregon Health And Science University
 
3181 SW Sam Jackson Pk Rd
PORTLAND, OR 97239-3098
United States

 Nonprofit College or University
Abstract

DESCRIPTION (provided by applicant): Augmentative and alternative communication devices with voice output (also known as Speech Generating Devices, or SGDs) enable individuals to speak by electronic means. Typical users of SGDs are individuals who have suffered a stroke or traumatic brain injury, or who have neurodegenerative or neurodevelopmental disorders. In most cases, the user was able to speak previously, or had or still has intermittent speech. In these cases, the user's relatives and friends are familiar with the user's voice. An often-expressed desire is for the SGD to sound like the user. However, typical SGDs do not mimic any characteristics of the user's speech; in fact, they typically have an extremely limited array of synthetic voices, and the prosodic patterns of these voices are not customizable. As a result, the synthetic voice is impersonal, which may be a factor in discouraging impaired speakers and their communication partners from using the SGD. To address this impersonal character of current SGDs, we propose to offer a system with a wide range of personal customization options, by making use of (1) a substantial number of synthetic voices to choose from; (2) customizable prosody; and, most important, (3) Speaker Mimicry (SM) technology to mimic the user. Phase I of this project established the feasibility of using SM technology to adapt an existing Text-to-Speech (TTS) synthesis system to mimic a specific target speaker, requiring only a small set of training recordings to be made of the target speaker. Mimicry of the spectral aspects of the target speaker was achieved with two Voice Transformation (VT) technologies: one that required extremely few recordings that, moreover, did not need to be of high acoustic quality (hence, pre-morbid home videos could in principle be used), and a second that required more and better-quality recordings but also provided better results.
Mimicry of the prosodic aspects of the target speaker (Prosody Mimicry, or PM) was achieved by estimating static and dynamic parameters of the target speaker's intonational and durational patterns, which were then incorporated into the TTS system. The deliverables of this Phase II STTR project consist of: (1) a complete SM-capable SGD, comprising an SM-capable TTS system and a built-in touch-screen Graphical User Interface (GUI) for user input, installed on a low-cost touch-screen dedicated netbook (alternative keyboards or special input devices will also be supported); and (2) efficient software tools and processes to be used by BioSpeech personnel to compute the individual user data needed for the SM capability. The SM capability will have multiple options, depending on the availability, quantity, and quality of user recordings. The goals of this Phase II proposal are to develop a complete prototype of this product concept, and to co-develop and field-test the system with a group of SGD users. Moreover, we will show that, even with these unique features, the system can be made available at a far lower cost than most current SGDs, thanks to BioSpeech's complete ownership of the technology, to minimal ROI pressures, and to the availability of low-cost netbooks.

PUBLIC HEALTH RELEVANCE: Millions of Americans with impaired or absent speech communication ability rely on Augmentative and Alternative Communication devices with voice output (Speech Generating Devices) to communicate. A psychologically important feature that no currently available system has is the ability to speak with the user's voice, i.e., the ability to produce speech that mimics the individual's pre-morbid speech or speech that the individual may be able to intermittently produce. Phase I of this project established the feasibility of using Speaker Mimicry (SM) technology to adapt an existing SGD to mimic a specific target speaker, requiring only a small set of training recordings to be made of the target speaker; the goal of this Phase II proposal is to develop, further improve, and evaluate a complete prototype of this concept, and to deploy it in a limited release to a select group of individuals for in-the-field use.

* Information listed above is at the time of submission. *
