DSA (Digital Speech Aid) - a New Device to Decrease or Eliminate Stuttering

by Marek Roland-Mieszkowski, M.Sc., Ph.D., Digital Recordings and
Andrzej Czyzewski, Ph.D. and Bozena Kostek, Ph.D., Technical University of Gdansk, Poland
This paper was presented in Munich, Germany, during the 1st World Congress on Fluency Disorders (August 8-12, 1994). It is published in the Proceedings of the Congress.
Copyright 1989-2014, Digital Recordings. All Rights Reserved.

Content

Introduction
Origins of Stuttering and Standard Treatment
New Models and Electronic Speech Aids
Digital Speech Aid (DSA)
Facts about Stuttering and Implications for Treatment
How to Correct Stuttering via Signal Processing?
Hypothesis about SCS
Conclusions
References

Introduction
New models and a new approach to the stuttering disorder have been developed. Elements of this new theory were applied to explain research results and many features of the disorder, and they formed the basis for the design and construction of a new electronic device for eliminating or reducing stuttering. DSA is based on advanced DSdP (Digital Sound Processing) of the speech signal in the auditory feedback loop (Fig.1). The device uses modern digital signal processing circuitry, designed according to algorithms developed by the authors, to produce the desired response characteristics of the feedback signal.
The authors believe that stuttering is a physiological disorder, in most cases of a neurological nature. Stutterers become nervous because they stutter; they do not, as some believe, stutter because they are nervous. DSA is intended to correct stuttering in "classical stutterers", who show the usual patterns of stuttering, have no problem with speech synthesis and do not stutter when whispering. In some cases subjects outside this group can also benefit from this technology, but the results are less predictable. With DSA many people who stutter can speak fluently (or more fluently) in any fashion and at any rate.

[ Fig.1. Photograph of the DSA (Digital Speech Aid), third-generation model ]

Long-term effects were tested on several patients. Because DSA has a relaxing and reassuring effect, the baseline stuttering level decreases. The effectiveness of the device in removing the "natural level" of stuttering appears to remain the same over time. In a sense, DSA works as a prosthetic device, similar to a hearing aid.

Origins of Stuttering and Standard Treatment
In many cases stuttering was believed to be caused by psychological disorders and nervousness. This stigma still exists in a large part of society and in a large part of the medical and health community. Many stutterers themselves believe it very strongly. The outcome of this belief is not only great suffering for the person, who is labeled "weird", "nervous" and "psychologically unstable", but also, very often, a wrong approach to treating the disorder.
Treatment offered by speech pathologists involves various techniques to slow down speech, coordinate speech production with breathing, change the way of speaking and pronouncing words, etc. It also involves some counselling and relaxation therapy, which very often overlaps with the work of a psychologist. These techniques work to a certain degree, and the results depend very much on the particular case. They also often work better in a clinical setting than in the real world, where the person cannot concentrate as much on speech production. Unfortunately, many of these techniques require a conscious effort on the part of the stutterer. Many people give up speech therapy because in some cases they feel that fluent but unnatural-sounding speech is worse than stuttering itself. It is estimated that about 5% to 10% of stutterers are receiving some form of therapy (indicative of the effectiveness of current treatments) [1,2].

New Models and Electronic Speech Aids
If stuttering is a physiological disorder (of a neurological nature in most cases), telling a person to control it does not make much sense. It is like telling a person with faulty vision to take off their glasses and concentrate on seeing better [2].
This approach to the problem has been supported by many researchers. Among others, Fairbanks [3] noted the role of feedback monitoring in speech production. According to this theory, the outgoing speech movements are controlled by a cybernetic system that depends on feedback from sensory information to maintain correct performance. Besides the sensory, or kinesthetic-tactile, mechanism, there is another feedback loop acting at the acoustic level. This auditory feedback plays a very important role in speech production.
Wiener's cybernetic theory, presented in 1948, gave rise to many theoretical models of the origin and mechanism of stuttering. These theories are based on the assumption that the speech output is returned to the central integrating system through the airborne side tone, the bone-conducted side tone, the tissue-connected side tone and kinesthetic-tactile sensors on both sides of the body. Stromsta [4] noted in 1962 that the auditory feedback signals in these different channels arrive at different moments and that the resulting information received by the brain becomes very complex. Subsequently, many authors [5,6] dealt with possible sources of distortion in the feedback systems used to monitor speech. These include asynchrony of the feedback signals arriving in the right and left hemispheres, as well as differential delays in the bone and airborne feedback loops.
Consequently, the following question needs to be raised: how could the distorted feedback loop be modified to improve the speaker's performance? It has been known since the 1950s that a strong effect occurs when auditory delays are introduced while a person is speaking (delayed auditory feedback, DAF) [6-8]. While many explanations proposed by Stromsta [4] may have some partial validity, it is worth noting that the observed fluency may also arise because the patient tries to suppress monitoring of his own delayed voice, since the delayed speech is disturbing. Other known effects are masking noise (MAF) [8] and frequency-altered auditory feedback (FAF) [5].
However, nobody has noted the role of discorrelation of the signals as a more universal approach to modifying the auditory feedback loop. The discorrelation in question concerns the signals arriving at the sensory system of the brain from the tissue (afferent channels) and acoustic loops. Looking at the known electronic techniques of stuttering suppression, one can see an obvious link between the degree of expected discorrelation and the efficiency of the method. For example, total masking (by very loud noise, or as a result of deafness) proves to be very effective. This result is not surprising, because it causes total discorrelation through the loss of one of the signals. Frequency alteration in the auditory feedback loop (FAF) also seriously discorrelates the sensed signals. Delayed feedback (DAF) causes less discorrelation, so it is not surprising that this method performs worse in many cases [9,10]. Moreover, some methods are disadvantageous because of their unpleasant effects. For example, the echo slows speech, and the masking noise is annoying and may even be dangerous to hearing.
Consequently, the authors obtained the best results with known methods when using FAF, or FAF combined with small delays in DAF [9,10]. An interesting question arises: are there other methods of discorrelating the signals received by the speech monitoring system that are more efficient and distort the speech heard by the stutterer less? The authors' expertise in signal processing allows them to answer this question positively. Accordingly, various new techniques were investigated in thorough experiments conducted at the Sound Engineering Department of the Technical University of Gdansk, among others by master's degree candidates [9,10].
Some of the newly proposed DSdP and filtering techniques are realizable only in the digital domain, so a suitable digital processing platform, namely DSA, had to be designed and built to perform these experiments. These experiments are now in progress, and the authors' new results will be published soon.
The experiments performed allow the conclusion that digital filtering and similar techniques based on discorrelation affect the speech production process very strongly. Depending on the parameters of the algorithm, it is possible to cause either an increase or a decrease in stuttering frequency. The parameters and the results depend very strongly on the individual stutterer. Consequently, one of DSA's software versions was provided with 256 settings of filter parameters selectable by the user or, more precisely at this stage, by the investigators implementing the device. These program options allow the filtering parameters to be matched to the individual needs of patients. However, such experiments are very time-consuming, and optimizing the proposed algorithms is not an easy task. The hypothesis that discorrelation may be considered a common feature of the most effective methods of electronic stuttering elimination was confirmed during these initial tests.
Based on these experiments, new concepts for the DSdP algorithms of DSA are being implemented. The experiments also open up new interest in the search for more effective methods of altering the auditory feedback loop to reduce stuttering.
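As an illustration of the discorrelation idea discussed in this section, the following Python sketch compares a crude synthetic "voice" with three altered versions of itself: a delayed copy (DAF), a copy shifted up by half an octave (FAF) and a copy buried in loud noise (masking). The measure used here, the peak of the normalized cross-correlation over a range of lags, is an assumption made for this sketch; the paper does not define how discorrelation is quantified. The sample rate, the three-harmonic test signal, the noise level and all function names are likewise hypothetical, and the frequency shift is obtained by resynthesizing the harmonics rather than by a real-time pitch shifter.

```python
import math
import random

FS = 4000        # sample rate in Hz (assumed for this sketch)
N = 1000         # 0.25 s of signal
MAX_LAG = 100    # search lags from 0 to 25 ms


def voiced(f0, n=N, fs=FS):
    """Crude stand-in for a voiced sound: the first three harmonics of f0."""
    return [sum(math.sin(2 * math.pi * h * f0 * i / fs) for h in (1, 2, 3))
            for i in range(n)]


def peak_correlation(x, y, max_lag=MAX_LAG):
    """Largest |normalized cross-correlation| between x and y over lags 0..max_lag."""
    m = len(x) - max_lag
    xs = x[:m]
    ex = sum(v * v for v in xs)
    best = 0.0
    for lag in range(max_lag + 1):
        seg = y[lag:lag + m]
        ey = sum(v * v for v in seg)
        if ey == 0.0:
            continue  # silent segment: correlation undefined, skip it
        r = sum(a * b for a, b in zip(xs, seg)) / math.sqrt(ex * ey)
        best = max(best, abs(r))
    return best


speech = voiced(120.0)

# DAF: the same signal delayed by 60 samples (15 ms at 4000 Hz).
delay = 60
daf = [0.0] * delay + speech[:-delay]

# FAF: all harmonics moved up by half an octave (ratio 2 ** 0.5).
faf = voiced(120.0 * 2 ** 0.5)

# Masking: the voice plus much louder uniform noise (fixed seed for repeatability).
rng = random.Random(0)
masked = [v + rng.uniform(-20.0, 20.0) for v in speech]

for name, signal in (("DAF", daf), ("FAF", faf), ("masking", masked)):
    print(name, round(peak_correlation(speech, signal), 3))
```

Under these assumptions the delayed copy still reaches a peak correlation of essentially 1 at the lag equal to the delay, while the frequency-shifted and masked copies stay far below that, which is consistent with the ordering of methods described in this section.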

Digital Speech Aid (DSA)
The third-generation DSA model (Fig.1) is a small (11 cm x 6 cm x 3 cm, L x W x H), sophisticated electronic device with 256 different program settings and 5 independently variable parameters. All known algorithms and many new ones can easily be implemented in the existing hardware, because the DSdP (Digital Sound Processing) software is stored in EPROM, which can easily be replaced and reprogrammed. DSA uses a microphone and a pair of earphones. It operates on batteries.
The device is relaxing and non-disturbing. With DSA a person can speak in any fashion and at any rate. DSA is most effective for "classical stutterers", who make up about 80% to 90% of the stuttering population. Significant improvement or total fluency is observed in about 40% to 60% of "classical stutterers". The rest also improve to various degrees.
The device was, and still is, tested in real life, not in an artificial laboratory situation. Improvements were observed in all situations: in the office, at home, on the telephone, during public meetings and presentations, on good and bad days, etc. Improvement is instant; however, we observe an increase in effectiveness during the first 2 to 6 weeks. This is consistent with other similar observations and could be explained, in the authors' opinion, on the basis of neural networks, which could be used as a model of the inner workings of the auditory system. After that, effectiveness seems to remain the same. In the majority of cases there is a carryover effect: the person remains more fluent for between 2 hours and 2 weeks after using DSA.
Significant improvements in self-esteem and self-confidence are observed. People like to use DSA and say that it is relaxing. Many people also report feeling that they cannot stutter (in the authors' opinion, the speech production/monitoring system indicates to them that the conditions for fluent speech are fulfilled). Long-term effects seem to support the authors' theory and expectations: DSA is still effective (at the same level) after being used for 10 months [2].

Facts about Stuttering and Implications for Treatment
About 4% of children and 1% of adults stutter. Stuttering changes with age. People stutter to varying degrees and in different ways. People often stutter on particular sounds. The rate of stuttering often varies for a given individual, depending on various factors. Stuttering usually depends on the language used by the person. Males stutter 3 times more often than females. Stuttering starts at an early age and in some cases goes away later in life. In many cases food and alcohol (or other chemicals) change stuttering, for better or worse. In many cases exercise and physical activity change stuttering. In many cases stress changes the rate of stuttering, for better or worse.
Becoming suddenly deaf leads to total fluency. With shadow speech (whispering) or choral speech (with another person), the majority of cases are fluent (90% ?). When singing or talking in noise (cafeteria, bar, music), the majority of cases are fluent (90% ?). Lowering or raising the pitch of one's voice, assuming a foreign accent and slowing down the rate of speech production also result in increased fluency.
The following also increase fluency: amplification or attenuation of the voice, delay of the voice in the range 1-100 ms, white or other types of noise, frequency shifting of the voice in the range -1 to +1 octave, reverberation of the voice, and combinations of the above.
From the above facts it is obvious that hearing plays a very important role in speech production and control. It is also clear that stuttering is caused by a physiological disorder, neurological in nature in most cases. The speed of propagation of neural signals seems to play an important role (facts and experiments), lower frequencies are more important than higher ones (facts and experiments), and vocal tone is very important (facts and experiments). Stuttering seems to be correctable by the processing of sound (facts and experiments).
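The delay and frequency-shift ranges quoted above can be made concrete with a short Python sketch. It converts a delay in milliseconds into a number of samples, implements the delay as a circular buffer of the kind a real-time feedback device could use, and converts a shift in octaves into a frequency ratio. The 8000 Hz sample rate and the names DelayLine and shift_ratio are assumptions for illustration only; the paper does not describe the DSA implementation at this level.

```python
class DelayLine:
    """Fixed delay implemented as a circular buffer, one sample at a time."""

    def __init__(self, delay_ms, sample_rate_hz):
        # Delay in whole samples, e.g. 50 ms at 8000 Hz -> 400 samples.
        self.delay = round(delay_ms * sample_rate_hz / 1000)
        self.buffer = [0.0] * self.delay
        self.pos = 0

    def process(self, sample):
        """Store the new sample and return the one recorded `delay` samples ago."""
        if self.delay == 0:
            return sample
        delayed = self.buffer[self.pos]
        self.buffer[self.pos] = sample
        self.pos = (self.pos + 1) % self.delay
        return delayed


def shift_ratio(octaves):
    """Frequency ratio for a shift given in octaves (-1 -> 0.5, +1 -> 2.0)."""
    return 2.0 ** octaves


SAMPLE_RATE_HZ = 8000  # assumed; not stated in the paper

line = DelayLine(delay_ms=50, sample_rate_hz=SAMPLE_RATE_HZ)
impulse = [1.0] + [0.0] * 499
echoed = [line.process(s) for s in impulse]

print(echoed.index(1.0))         # position of the delayed impulse, in samples
print(120.0 * shift_ratio(0.5))  # a 120 Hz tone shifted up by half an octave
```

At this assumed sample rate the 1-100 ms range corresponds to 8 to 800 samples of delay, and the -1 to +1 octave range corresponds to frequency ratios between 0.5 and 2.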

How to Correct Stuttering via Signal Processing?
Methods of stuttering correction can be divided into three broad categories:

Masking - for example, various types of noise make the signal unusable for control in the Speech Control System (SCS). In this case SCS relies on other afferent channels (Fig.2).

Non-Masking - discorrelation of the signals, for example DAF, FAF or reverberation. This seems to be better, since the signal is not as disturbing and is comprehensible as voice by the higher levels of the Speech Synthesis System (SSS), therefore helping in this synthesis (Fig.2). However, SCS is probably not using this signal for control (servo) purposes.

Correction - the signal is shaped via DSdP (Digital Sound Processing) in such a way as to correct for deficiencies while remaining acceptable to SCS for control (servo) purposes. This is the preferred way of correcting, since it will be more effective and more pleasant for the stutterer to use. The authors' hypothesis, based on their theory and experiments, is that this type of system can be used in certain cases of stuttering. In certain cases it should also be possible to adjust the parameters of the DSdP algorithms gradually in order to retrain the neural network associated with the speech production / auditory system. It is expected that in some cases of stuttering the damage or pathology is at the level of neural network programming, and in others at the level of neural network structure (or "hardware"). The authors are currently engaged in applications of neural networks for DSdP. Further tests and experiments are in progress.


Hypothesis about SCS
In stutterers the auditory signal is used by SCS, but from time to time the voice signal is not accepted, leading to prolongations and other observed stuttering effects (Fig.2). By manipulating the signal via DSdP, one can obtain auditory feedback that is, on the one hand, acceptable to SCS for control (servo) purposes and, on the other hand, leads to correction of speech production. This should in turn lead to fluency. Ideally this could be done for all sounds produced by the stutterer, and the correction would work over the whole range of variability (stress, alcohol, etc.).

[ Fig.2. Block diagram of the speech production system. ]

Introducing some kind of processing (ear plugs, amplification, equalization, filtering, DAF, FAF, DSA, noise, etc.) changes the auditory feedback. SCS compares the remembered signal (stored in the already formed and "fixed" neural network) with the produced signal and corrects its shape. If SCS cannot do this, it gets stuck (the stuttering effect): the system tries to overcome the problem over and over again. Since most of the work of SCS is under subconscious control, the stutterer has very little control over it.
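The "getting stuck" behaviour described above can be pictured with a deliberately simple toy model in Python. It is only a sketch of the hypothesis, not a model proposed or tested by the authors: the function names, the syllable list, the retry limit and the acceptance rules are all hypothetical. Each syllable is attempted until a caller-supplied check, standing in for SCS accepting the auditory feedback, succeeds; every rejected attempt is emitted as a repetition.

```python
def speak(syllables, feedback_accepted, max_retries=5):
    """Toy loop: repeat a syllable until its feedback is accepted.

    feedback_accepted(syllable, attempt) -> bool is a hypothetical stand-in
    for SCS judging whether the heard signal matches the remembered one.
    """
    produced = []
    for syllable in syllables:
        attempt = 0
        while attempt < max_retries and not feedback_accepted(syllable, attempt):
            produced.append(syllable)  # a rejected attempt is heard as a repetition
            attempt += 1
        produced.append(syllable)      # accepted (or retries exhausted): move on
    return produced


word = ["st", "u", "tter"]

# Unprocessed feedback (assumed): "st" is rejected on its first two attempts.
unaided = speak(word, lambda s, attempt: s != "st" or attempt >= 2)

# Processed feedback (assumed): every syllable is accepted at once.
aided = speak(word, lambda s, attempt: True)

print("-".join(unaided))  # st-st-st-u-tter
print("-".join(aided))    # st-u-tter
```

In this picture, the aim of feedback processing is to change the acceptance check so that no retries are triggered, rather than to ask the speaker to control the loop consciously.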

Conclusions
The experiments and research performed allow the authors to conclude that digital filtering methods based on discorrelation techniques affect the speech production process very strongly. Depending on the parameters of the algorithm, it is possible to cause either an increase or a decrease in stuttering frequency. However, the optimal algorithm parameters depend very strongly on the individual stutterer.
Our short-term and long-term experiments with DSA and various DSdP algorithms indicate that auditory conditions play a paramount role in speech production. The fact that some people can speak at any rate and in any fashion with certain forms of DSdP shows that the stuttering disorder should be correctable, at least in certain cases, in the same way that hearing impairment is corrected with hearing aids.
In the authors' opinion, any realistic experiment should involve a wearable and portable device such as DSA, to allow testing in various real-life environments and situations. Long-term performance, too, can be investigated only with a portable device such as DSA.
The authors believe that we are very close to finding new, more efficient methods of eliminating the stuttering disorder through electronic speech aids. Progress in this domain may come from the implementation of modern DSdP technology.
Our hope is that scientists from different fields will join forces to advance our knowledge of this disorder and its treatment. Without this approach, progress will be as slow as in the last several decades.

References

[1] Hermes Electronics (1993). The Digital Speech Aid, internal report, Halifax, Nova Scotia, Canada, December 1993.
[2] Roland-Mieszkowski, M. Career Assassination, book, Halifax, Nova Scotia, Canada, to be published.
[3] Fairbanks, G. (1955). Selective vocal effects of delayed auditory feedback, Journal of Speech and Hearing Disorders, 20.
[4] Stromsta, C. (1962). Delays associated with certain sidetone pathways, Journal of the Acoustical Society of America, 34.
[5] Howell, P. (1990). Changes in voice level caused by several forms of altered feedback in fluent speakers and stutterers, Language and Speech, Vol. 33, No. 4, 325-338.
[6] Smolka, E., Adamczyk, B. (1983). Pomiar czestosci podstawowej przy oddzialywaniu echa i poglosu na proces mowienia jakajacych sie, Logopedia 14/15.
[7] Webster, R.L. (1991). Manipulation of vocal tone: implications for stuttering. In Peters, Hulstijn and Starkweather (Eds.), Speech Motor Control and Stuttering, pp. 535-545, Elsevier Science Publishers B.V.
[8] Bloodstein, O. (1987). A Handbook on Stuttering, Brooklyn College of The City University of New York, National Easter Seal Society.
[9] Turowski, G. (1993). Elektroniczne metody korekcji wad wymowy, M.Sc. thesis, Technical University of Gdansk, Faculty of Electronics, Department of Sound Engineering, Gdansk, Poland.
[10] Piorecki, P. (1994). Korektor wad wymowy oparty na cyfrowym procesorze sygnalowym, M.Sc. thesis, Technical University of Gdansk, Faculty of Electronics, Department of Sound Engineering, Gdansk, Poland.
