Auditory Perception and Cognition (PECA)

Workshop of the Réseau de sciences cognitives d'Île-de-France (Île-de-France Cognitive Science Network)

Organizers: Carolyn Drake and Daniel Pressnitzer

PECA website:

Venue: IRCAM, 1 place Igor-Stravinsky, 75004 Paris, Stravinsky room
Metro stations: Hôtel de Ville, Châtelet, Les Halles, Rambuteau

2001-2002 Program

Tuesday, September 11: R.P. Carlyon (MRC-CBU, Cambridge, UK), "The Continuity Illusion: Vowel Perception, Frequency Modulation, and Hearing Backwards in Time" (in English).


When a 'target' sound is turned off and then resumed a short time later, it can be heard as continuous, provided that the silent interval is filled by another sound that would have masked the target if it had actually remained uninterrupted. Hence both the level and the frequency content of the 'inducing sound' are crucial. We performed a series of experiments investigating this 'continuity illusion' and its relationship to other aspects of auditory processing.
In one study, we generated four different vowels, each consisting of two formants (F1 and F2). When the two formants were presented simultaneously, identification performance was very good. In a second condition, they were alternated for one second, so that F1 and F2 were never present at the same time; the duration of each formant presentation was 100 or 200 ms. Performance in this condition was close to chance. In a third condition, the F1s and F2s still alternated, but the silent intervals in each formant region were filled by noise bursts. The same noise burst was used to fill the gaps for all the F2s used, and its level was set in a preliminary experiment to induce the illusion of continuity for all F2s presented in isolation, and to fail to do so for all F1s. Similarly, the noise used to fill all F1 gaps induced continuity for all F1s in isolation, but for no F2s. Performance in this condition was substantially better than in the condition with no noise, and than in other conditions in which noise was added only to the F1 gaps or only to the F2 gaps. This demonstrates that the neural mechanisms responsible for vowel perception receive input from those underlying the continuity illusion.
A second study investigated the finding that, when a frequency-modulated (FM) tone is interrupted and the interruption is filled by noise, listeners not only hear the tone as continuous, but also hear the modulation continue through the noise. We wondered whether the phase of the FM would be preserved during the illusion. To test this, we asked subjects to discriminate between two stimuli, both of which consisted of two portions of a 1-kHz tone modulated at a rate of 5 Hz and separated by a 200-ms interval filled by noise. The level and frequency content of the noise were sufficient to induce the continuity illusion. In one of the two sounds, the FM phase after the noise was the same as it would have been if the tone had been uninterrupted. Subjects could not discriminate between this sound and one in which the FM phase after the noise was shifted by 180°. This shows that FM phase is not preserved in the illusion, and demonstrates a paradoxical percept: subjects hear a modulation as continuous, but do not notice what would otherwise be an obvious phase reversal in that modulation.
Finally, we presented listeners with a 300-ms wideband noise, which was immediately followed (without interruption) by a 300-ms narrowband noise. When asked to adjust the duration of a second narrowband noise, presented 500 ms later, to match the first, they adjusted it to a duration of about 370 ms. This is consistent with the onset of the first narrowband noise being perceived as occurring before the end of the wideband noise. We will present additional data investigating this explanation. If correct, it is an example of 'hearing backwards in time': a subsequent sound (the narrowband noise) affects what is heard before the end of a preceding sound (the wideband noise).
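As an illustration of the FM-phase manipulation described in this abstract, here is a minimal Python sketch of the tone portion that follows the noise-filled gap, in both the phase-continuous and the 180°-shifted versions. The carrier frequency (1 kHz), FM rate (5 Hz), and gap duration (200 ms) come from the abstract. The FM depth, the duration of each tone portion, the sample rate, and all function names are assumptions made only for this example; the abstract does not state them, and this is not the authors' stimulus code. One consequence of the stated values is that the gap spans exactly one modulation cycle (5 Hz × 0.2 s = 1), so in the phase-continuous version the modulation resumes at the same phase at which it stopped.

```python
import math

# Values stated in the abstract
CARRIER_HZ = 1000.0   # 1-kHz tone
FM_RATE_HZ = 5.0      # modulation rate
GAP_S = 0.200         # noise-filled interval

# Assumptions for illustration only (not given in the abstract)
FM_DEPTH_HZ = 50.0    # peak frequency deviation
PORTION_S = 0.500     # duration of each tone portion
SAMPLE_RATE = 16000   # samples per second


def inst_freq(t, shift=0.0):
    """Instantaneous frequency (Hz) at time t, with an optional FM phase shift (rad)."""
    return CARRIER_HZ + FM_DEPTH_HZ * math.sin(2 * math.pi * FM_RATE_HZ * t + shift)


def post_gap_tone(shifted):
    """Samples of the tone portion that follows the gap.

    shifted=False: FM phase is what it would have been without interruption.
    shifted=True:  FM phase is offset by 180 degrees (pi radians).
    """
    shift = math.pi if shifted else 0.0
    start = PORTION_S + GAP_S          # time at which the tone resumes
    n = int(round(PORTION_S * SAMPLE_RATE))
    samples = []
    carrier_phase = 0.0
    for i in range(n):
        t = start + i / SAMPLE_RATE
        samples.append(math.sin(carrier_phase))
        # Integrate instantaneous frequency to advance the carrier phase
        carrier_phase += 2 * math.pi * inst_freq(t, shift) / SAMPLE_RATE
    return samples


# Number of modulation cycles spanned by the gap
gap_cycles = FM_RATE_HZ * GAP_S

continuous = post_gap_tone(shifted=False)
reversed_phase = post_gap_tone(shifted=True)
```

The two versions have frequency trajectories that are mirror images of each other around the carrier, which is the difference listeners in the study could not detect.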

Monday, October 22

D. Wesley Grantham, David Chandler (Department of Hearing and Speech Sciences, Vanderbilt Bill Wilkerson Center for Otolaryngology and Communication Sciences) "Effects of uncertainty on auditory spatial resolution in the horizontal plane"

Wednesday, June 19: Nicolas Grimault, "Comparative roles of spectral and temporal cues in pitch perception and the sequential analysis of auditory scenes"

Abstract:

However powerful the spectral analysis performed by the auditory system may be, it cannot by itself account for human auditory perception as a whole. Temporal analysis of signals by the auditory system complements this spectral analysis and compensates for its shortcomings, and can by itself explain many perceptual phenomena observed in psychoacoustics. Sensorineural hearing loss is most often accompanied by reduced frequency selectivity. The temporal analysis of the signal performed by the auditory system therefore takes on particular importance for most hearing-impaired listeners.
Drawing on experimental results, I will show that these two types of mechanisms (spectral and temporal) can give rise to a single perceptual continuum (pitch). In the same way, I will compare the roles of spectral and temporal cues in the sequential analysis of auditory scenes, and I will describe recent results identifying relevant temporal cues.

Barbara Tillmann, "Musical expectations and attentive listening: two aspects of music perception studied with functional brain imaging"

Abstract:

My presentation brings together fMRI studies that address either musical expectations in context or attentional processes during music listening. The patterns of brain activation observed for these aspects of music perception are similar to those observed for the processing of other materials (auditory, visual, verbal).
Regarding musical expectations, the harmonic priming paradigm has previously shown that listeners develop expectations about upcoming musical events. Depending on the case, these expectations facilitate or delay processing of the event. Based on this priming paradigm, our study analyzes the neurophysiological correlates of processing a musical target that is related or unrelated to the preceding context. The behavioral data acquired during the fMRI session replicate the processing facilitation for a harmonically related target. The brain activations associated with processing the target involve, among others, the bilateral inferior frontal regions, with stronger activation for an unrelated target. These results agree with other data in the literature showing that these frontal regions are not specialized solely for language processing.
Regarding attentive listening to music, polyphonic musical pieces combine several auditory streams and create complex auditory scenes. This type of material makes it possible to study the neural mechanisms that guide attention in natural auditory contexts. We manipulated the musical excerpts and the experimental tasks (for example, listening selectively to one instrument) in two fMRI studies. The results show networks of brain activation that involve frontal, parietal, and temporal regions, and indicate that attentive listening to music recruits neural circuits also involved in other tasks (memory, attention, detection, etc.).

Tuesday, July 2: two talks by
Douglas Eck (IDSIA, Lugano, Switzerland)

Learning Long-Timescale Musical Structure: Music Composition using LSTM Recurrent Neural Networks


Unlike "feed-forward" neural networks, recurrent neural networks (RNNs) can learn datasets having rich temporal dynamics, making them good candidates for music composition. Unfortunately, previous attempts at composing music using RNNs have been disappointing. Though networks do learn note-by-note transition probabilities and even capture some phrasal structure, they have been unable to find the global musical structure that defines a particular genre. In short, RNNs write music that sounds nice on the surface but "goes nowhere." In this talk I will demonstrate that a recent hybrid RNN called LSTM overcomes this fundamental limitation and learns to compose proper pieces in a given musical form. In the talk I will provide a brief overview of LSTM and show how it solves some problems plaguing traditional RNNs. I will then present details of a music composition model built using LSTM as the learning device. I will present simulation results showing that LSTM successfully learns a form of blues music and is able to improvise novel melodies in that style. (Though I demonstrate that the music does indeed "go somewhere", I shall refrain from claiming that it sounds nice). Finally I will discuss how this work might be used as part of an intelligent interactive musical device that teaches itself to play through exposure to other musicians.


Beat Induction with Spiking Neural Networks


Beat induction is best described by analogy to the activities of hand clapping or foot tapping, and involves finding important metrical components in an auditory signal, usually music. Though beat induction is intuitively easy to understand, it is difficult to define and still more difficult to perform automatically. I will present a model of beat induction that uses a spiking neural network (SNN) as the underlying synchronization mechanism. This approach has some advantages over existing methods: it runs online, responds at many levels in the metrical hierarchy, and produces good results on performed music (Beatles piano performances encoded as MIDI). Furthermore, the synchronization properties of SNNs have been described analytically, providing a theoretical framework for understanding model performance. In the talk I will describe the model in some detail and discuss simulation results. I will also relate the work to the more ambitious goal of building flexible intelligent music devices that interact with musicians in real time. Time permitting, I will comment on an important limitation in the model, namely that it has little prior knowledge about meter and rhythm and has no way to learn from example. Because the model performs quite well, the importance of this limitation is unclear. I will discuss ways to implement a learning algorithm for the model that would allow further exploration.


Contacts:

Carolyn Drake, tel. 01 55 20 59 30 or 06 83 82 68 24, Laboratoire de psychologie expérimentale, Institut de psychologie, Centre universitaire de Boulogne, 71, avenue Edouard Vaillant, 92774 Boulogne-Billancourt Cedex
Daniel Pressnitzer

