
ICRA noise


SPECIFICATION OF ICRA-NOISE

The ICRA noise was developed for the International Collegium of Rehabilitative Audiology (ICRA) by the HACTES work group (Hearing Aid Clinical Test Environment Standardisation). The purpose was to establish a collection of noise signals to be used as background noise in clinical tests of hearing aids and possibly for measuring the characteristics of non-linear instruments. By composing signals with well-defined spectral and temporal characteristics similar to those typically found in real-life speech signals and babble noise, the HACTES group hoped that these signals could eventually become an international de facto standard for these two purposes.

 

Specifications:

The signals are based on live English speech from the EUROM database (Chan, 1995), in which a female speaker explains the system of arithmetical notation. Two signals were generated using essentially the same process, resulting in speech-spectrum-shaped noise corresponding to female and male speech.

The speech signal was sampled at a rate of 44.1 kHz. The process first splits the signal into three bands with crossover frequencies of 850 Hz and 2500 Hz, using IIR filters with a slope exceeding 100 dB/octave and more than 50 dB attenuation outside the pass band. The crossover frequencies were chosen so that the first formant of vowels falls within the low band, the second formant within the mid band and the unvoiced fricatives within the high band.

Next, each of the three bands was scrambled according to a process described by Schroeder (1968): the sign of each sample is reversed with a probability of 0.5 and otherwise kept unaltered. Since the absolute value of every sample is preserved by this process, each of the modified signals has the same modulation properties as the original speech, but is completely unintelligible and has a flat, white spectrum. Each Schroeder-processed band is then filtered again with the same filter that originally separated it. The three signals are scaled to have the same spectral density level and added together, forming one signal with a white spectrum and with the original modulation preserved in each of the three frequency ranges.

To obtain the desired spectrum, this signal is filtered, resulting in signals with spectra corresponding to male and female speech in close accordance with the LTASS (Byrne et al., 1994) and the ANSI S3.5 (1997) standard (for the calculation of the SII). However, since the resulting signals had an unpleasant scratchy sound, their phase was smoothed in a 512-point FFT procedure by randomising the phase and then (after an inverse FFT) overlap-adding the segments with 7/8 overlap. The resulting signals have long-term spectra according to the LTASS and modulation characteristics like those of natural speech.
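The sign-randomisation step can be illustrated with a minimal sketch. It is not the HACTES implementation: the test signal (an amplitude-modulated 100 Hz tone standing in for one speech band), the function names and the fixed random seed are all assumptions made for illustration. The sketch checks two properties stated above: the magnitude of every sample is unchanged, so the envelope survives, while the correlation between neighbouring samples drops from nearly 1 to nearly 0, as expected for a white spectrum.

```python
import math
import random


def schroeder_scramble(samples, rng):
    """Reverse the sign of each sample with probability 0.5 (after Schroeder, 1968)."""
    return [-x if rng.random() < 0.5 else x for x in samples]


def lag1_autocorrelation(samples):
    """Normalised correlation between neighbouring samples (about 0 for white noise)."""
    energy = sum(x * x for x in samples)
    cross = sum(a * b for a, b in zip(samples, samples[1:]))
    return cross / energy


# Hypothetical stand-in for one speech band: a 100 Hz tone whose amplitude
# is modulated at 4 Hz, sampled at 44.1 kHz for half a second.
fs = 44100
tone = []
for n in range(fs // 2):
    t = n / fs
    envelope = 1.0 + 0.8 * math.sin(2 * math.pi * 4 * t)
    tone.append(envelope * math.sin(2 * math.pi * 100 * t))

rng = random.Random(0)  # fixed seed (an assumption) so the run is repeatable
scrambled = schroeder_scramble(tone, rng)

# The magnitude of every sample, and therefore the envelope, is unchanged.
same_magnitudes = all(abs(a) == abs(b) for a, b in zip(tone, scrambled))

print("magnitudes preserved:", same_magnitudes)
print("lag-1 autocorrelation before:", round(lag1_autocorrelation(tone), 3))
print("lag-1 autocorrelation after: ", round(lag1_autocorrelation(scrambled), 3))
```

In the actual procedure this step is applied to each of the three filtered bands separately, which is why the bands have to be filtered again afterwards.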

These signals are more representative of normal speech than filtered stationary Gaussian noise, since both the spectrum and the modulation of speech are preserved. Furthermore, signals representative of raised and loud voices were generated by including the difference between normal, raised and loud speech according to ANSI S3.5 (1997) in the filter characteristic.

 

For a detailed description of the ICRA noises, see Dreschler et al. (2001).

 

Contents of the ICRA CD 1:

In each channel: Unmodulated random Gaussian noise.

Male weighted (note 1) idealized speech spectrum (note 2). Normal effort.

Level: Lref

2 min.

 

In each channel: Unmodulated random Gaussian noise.

Male weighted idealized speech spectrum. Raised effort (note 3).

Level: Lref + 5.7 dB

2 min.

 

In each channel: Unmodulated random Gaussian noise.

Male weighted idealized speech spectrum. Loud effort (note 3).

Level: Lref + 12.1 dB

2 min.

 

In each channel: 3 band speech modulated noise (3bSMN).

Female weighted (note 1) idealized speech spectrum. Normal effort.

Level: Lref

5 min.

 

In each channel: 3 band speech modulated noise (3bSMN).

Male weighted idealized speech spectrum. Normal effort.

Level: Lref

5 min.

 

In each channel: 2 persons babble, 1 female 3bSMN + 1 male 3bSMN.

Idealized speech spectrum. Normal effort.

Level: Lref + 3 dB

10 min.

 

In each channel: 6 persons babble, 1f + 1m + 2f at -6 dB + 2m at -6 dB, all 3bSMN.

Idealized speech spectrum. Normal effort.

Level: Lref + 4.7 dB

20 min.

 

In each channel: 6 persons babble, 1f + 1m + 2f at -6 dB + 2m at -6 dB, all 3bSMN.

Idealized speech spectrum. Raised effort.

Level: Lref + 10.7 dB

10 min.

 

In each channel: 6 persons babble, 1f + 1m + 2f at -6 dB + 2m at -6 dB, all 3bSMN.

Idealized speech spectrum. Loud effort.

Level: Lref + 17.2 dB

10 min.

 

In each channel: Calibration Tone : 1 kHz

Level: Lref

2 min.

 

The ICRA noise is distributed through WeTransfer. To obtain it, please contact Wouter Dreschler, email: w.a.dreschler@amc.uva.nl

 

Notes:

  1. Male weighted spectrum: high-pass at 100 Hz, 12 dB/oct.; female weighted spectrum: high-pass at 200 Hz, 12 dB/oct.
  2. Idealized speech spectrum according to ANSI S3.5 (1997)
  3. Raised and loud effort according to ANSI S3.5 (1997)
  4. All signals in both channels are uncorrelated.
  5. All levels are measured as long term RMS.
  6. All voices on each track are uncorrelated.
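Because the voices on a track are uncorrelated and levels are measured as long-term RMS, the level of a normal-effort babble track can be estimated by adding the powers of its voices. The following sketch does this arithmetic under the assumption (not stated explicitly above) that each unattenuated voice is at Lref; the function name is hypothetical. It gives about Lref + 3.0 dB for the 2 persons babble and about Lref + 4.8 dB for the 6 persons babble, close to the listed Lref + 3 dB and Lref + 4.7 dB. The raised and loud effort levels also depend on the ANSI S3.5 effort corrections and are not reproduced here.

```python
import math


def combined_level_db(relative_levels_db):
    """Level of a sum of uncorrelated signals, in dB relative to one signal at 0 dB."""
    total_power = sum(10 ** (level / 10) for level in relative_levels_db)
    return 10 * math.log10(total_power)


# 2 persons babble: one female and one male voice, both assumed to be at Lref.
two_person = combined_level_db([0.0, 0.0])

# 6 persons babble: 1f + 1m assumed at Lref, 2f + 2m attenuated by 6 dB.
six_person = combined_level_db([0.0, 0.0, -6.0, -6.0, -6.0, -6.0])

print(f"2 persons: Lref + {two_person:.2f} dB")
print(f"6 persons: Lref + {six_person:.2f} dB")
```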

 

References

  • ANSI S3.5 (1997) “American National Standard Methods for the Calculation of the Speech Intelligibility Index,” American National Standards Institute, New York.
  • Byrne, D., Dillon, H., Tran, K., Arlinger, S., Wilbraham, K., Cox, R., Hagerman, B., Heto, R., Kei, J., Lui, C., Kiessling, J., Kotby, M.N., Nasser, N.H.A., El Kholy, W.A.H., Nakanishi, Y., Oyer, H., Powell, R., Stephens, D., Meredith, R., Sirimanna, T., Tavartkiladze, G., Frolenkov, G.I., Westermann, S., & Ludvigsen, C. (1994). An international comparison of long-term average speech spectra. J. Acoust. Soc. Am., 96, 2108-2120.
  • Chan, D. (1995). EUROM – a spoken language resource for the EU. In Eurospeech, Volume I, pp. 867-870.
  • Dreschler, W.A., Verschuure, H., Ludvigsen, C., & Westermann, S. (2001). ICRA Noises: Artificial noise signals with speech-like spectral and temporal properties for hearing aid assessment. Audiology, 40, 148-157.
  • Schroeder, M. (1968). Reference signal for signal quality studies. J. Acoust. Soc. Am., 44, 1735-1736.

Last modified 07-11-2015