Melodic similarity - a computational model

Computational modeling of melodic similarity judgments

Two experiments on isochronous melodic fragments

L. Hofmann-Engl & R. Parncutt

Keele University (UK, 1998)

chameleongroup online publication 1998

An in-depth discussion on melodic similarity including an evaluation of all major similarity models can be found here:

An evaluation of melodic similarity models (Hofmann-Engl, 2005)

Composers and music analysts alike have often referred to the concept of melodic similarity, either directly or in the guise of melodic variation, transformation and thematic development. Schönberg (1967) stressed the importance of motivic coherence within a composition, as did Toch (1948), implying that different motives should be somewhat related or similar. Reti (1951) and later Nattiez (1975) based their main analytical tool on motivic analysis, employing an intuitive concept of melodic closeness and relatedness. However, what establishes closeness or relatedness between motives remains a matter of subjective judgment. In general, there has been little investigation into what, exactly, makes two motives similar. In fact, we are not aware of a single study that has investigated melodic similarity extensively in an experimental setting. There have, however, been a number of attempts to approach the issue, if not extensively, then at least partially.

Possibly the most thoroughly investigated factor is transposition (e.g. Francès, 1988; van Egmond, Povel & Maris, 1996): measured similarity decreases with increasing transposition interval. There have also been some investigations into a possible link between melodic similarity and key relatedness (e.g. Cuddy, Cohen & Miller, 1979; Trainor & Trehub, 1993). But, as van Egmond, Povel & Maris (1996) report, the results obtained in these experiments are often ambiguous.

Similarity and recognition are clearly linked: two motives that are recognized as being the same must be highly similar and will also belong to the same category. Experiments investigating melodic recognition are therefore directly relevant to similarity. Here, the most commonly investigated factor is contour (e.g. White, 1960; Dowling & Harwood, 1986); these studies demonstrate that a melody can generally be recognized if the size of the intervals is changed while contour is maintained. However, Edworthy (1982, 1985) has shown that this applies to short motives (5 sounds) significantly better than it does to longer motives (15 sounds). Another factor that has been investigated is tempo (Gabrielsson, 1973). His findings suggest that the greater the tempo change, the lower the measured similarity. To the knowledge of the authors, however, no study has investigated melodic similarity systematically. There have been a number of attempts to approach the issue of similarity from a more theoretical perspective.

Palmer (1983) related similarity to the complexity of the transformation process involved in mapping one object onto the other. We know of no application of this model to the perception of melodic similarity. Shepard (1987) established a concept of similarity by measuring the distance between the attributes of two objects. Kluge (1997) uses this model to evaluate the similarity of two musical objects. Here, similarity is represented by the closeness of two musical objects with respect to all their attributes. However, Kluge does not specify which musical attributes he is referring to. Cambouropoulos (1998) develops a concept of melodic similarity based on a model by Tversky (1977), relating similarity to the number of coinciding attributes (e.g. pitch classes and inversion) of two melodies.

Skepticism has also been expressed regarding one-dimensional concepts of melodic similarity. Clarke & Dibben (1997) have suggested that the idea of similarity cannot be considered in isolation from musical context. On this basis they proposed replacing similarity by the idea of relatedness and functional equivalence. However, they do not develop their concept in detail. Their criticism is in part based on the observation that researchers like Shepard have not been able to produce a working model of similarity perception.

This paper aims to investigate the perception of melodic similarity in specific cases and, on the basis of experimental data obtained, to lay the foundations for a computational model of melodic and motivic similarity suitable for application in music theory, analysis and composition.

Two experiments tested the effect of pitch and tempo variations on isochronous melodic fragments (1 to 5 tones). In each trial of each experiment, a pair of fragments was presented, and participants were asked to estimate the similarity of the pair. The sound stimuli were synthesized flute sounds. All intervals were multiples of 250 cents.

In the first experiment similarity was measured as a function of tempo, transposition, inversion, and contour. Each trial was presented in two different orders (fragment a followed by fragment b, and fragment b followed by fragment a).

Sixteen participants (8 piano students, 2 professional composers, 2 professional dancers and 2 participants with negligible musical training) took part in the experiment. They were asked to assess similarity on a 9-point scale. No suggestions were made regarding the specific interpretation of the word “similarity” in this context.

Tempo change: Tempo was varied by factors of 1.2 and 2.4 (affecting both the duration of tones and the silent intervals between them). No significant tempo effect was found.

Transposition: Fragments were transposed by 250 cents and 500 cents. Similarity decreased significantly with increasing transposition interval.
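As a minimal sketch of what transposition by a fixed number of cents means for a fragment, the following assumes a fragment is represented as a list of pitches in cents above an arbitrary reference tone. The representation and the function names (`transpose`, `intervals`, `frequency_ratio`) are illustrative assumptions, not part of the original study. The only fact relied on is the standard definition of the cent (1200 cents per octave).

```python
def transpose(pitches_cents, shift_cents):
    """Shift every pitch of a fragment by the same number of cents."""
    return [p + shift_cents for p in pitches_cents]


def intervals(pitches_cents):
    """Successive intervals of a fragment, in cents (positive = upward)."""
    return [b - a for a, b in zip(pitches_cents, pitches_cents[1:])]


def frequency_ratio(cents):
    """Frequency ratio corresponding to an interval in cents (1200 cents = 1 octave)."""
    return 2.0 ** (cents / 1200.0)


# Hypothetical four-tone fragment built from multiples of 250 cents.
fragment = [0, 250, 500, 250]
shifted = transpose(fragment, 250)

# Transposition changes every pitch but leaves the interval sequence untouched.
print(intervals(fragment))
print(intervals(shifted))
print(frequency_ratio(250))
```

Because the interval sequence is identical before and after transposition, any effect of transposition on similarity ratings has to come from the pitch shift itself rather than from a change in intervals.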

Inversion: No significant difference was observed between pairs of trials that were equivalent under inversion. No preference for ascending or descending fragments was observed.

Order: There was no consistently significant effect of order of fragments in a trial.

Contour: Melodic contour plays a central role in the recognition of familiar melodies (Dowling & Fujitani, 1971), and has often been used as a criterion for the classification of melodic material in ethnomusicology (Adams, 1976). This suggests that contour will be a major factor in the assessment of melodic similarity. To test this idea we correlated the mean similarity judgments of all pairs of trials (except those designed to test the effect of transposition and tempo change) against their normalized contour difference, defined as the number of interval positions at which the two fragments move in different directions, divided by the number of intervals. For example, comparing fragment a (up - up - down) with fragment b (up - down - up), the first interval has the same direction (up) and the second and third have opposite directions, so the contour difference is 2 and the normalized contour difference is 2/3. The correlation between contour change and mean similarity was low but significant, with r = 0.48, p < 0.001.
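The contour measure in the example above can be sketched as follows. This is an illustrative reading, not the authors' implementation: fragments are assumed to be given as lists of signed intervals in cents, and a repeated tone (interval 0) is treated as its own direction, a case the text does not discuss.

```python
def direction(interval):
    """Return +1 for an upward interval, -1 for a downward one, 0 for a repeated tone."""
    return (interval > 0) - (interval < 0)


def normalized_contour_difference(intervals_a, intervals_b):
    """Fraction of interval positions at which two fragments move in different directions."""
    if not intervals_a or len(intervals_a) != len(intervals_b):
        raise ValueError("fragments must have the same non-zero number of intervals")
    mismatches = sum(
        1 for a, b in zip(intervals_a, intervals_b) if direction(a) != direction(b)
    )
    return mismatches / len(intervals_a)


# The example from the text, with an assumed interval size of 250 cents.
fragment_a = [250, 250, -250]   # up - up - down
fragment_b = [250, -250, 250]   # up - down - up

print(normalized_contour_difference(fragment_a, fragment_b))  # 2 of 3 positions differ
```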

In an attempt to develop a more powerful, but still mathematically simple model of melodic similarity, we compared similarity judgments with the normalized interval difference between two fragments, defined as the sum of the absolute differences between corresponding intervals of the two fragments, divided by the number of intervals: normalized interval difference = [|(interval 1, fragment 1) - (interval 1, fragment 2)| + |(interval 2, fragment 1) - (interval 2, fragment 2)| + ... + |(interval n, fragment 1) - (interval n, fragment 2)|]/n, where n is the number of intervals. This model accounted for 76% of the variance of the data for 18 trials in which the mean pitch of the fragments was equal and tempo was held constant. When both interval difference and contour difference were entered into a multiple regression, the contribution of contour was not significant (p > 0.3). This suggested that normalized contour difference was embedded in normalized interval difference as defined.
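A minimal sketch of this measure, reading the definition as the sum of absolute differences between corresponding intervals divided by n. Expressing intervals as signed values in cents is an assumption made here for illustration; the text does not state the unit used in the analysis, and the example fragments are hypothetical.

```python
def normalized_interval_difference(intervals_a, intervals_b):
    """Mean absolute difference between corresponding signed intervals of two fragments."""
    n = len(intervals_a)
    if n == 0 or n != len(intervals_b):
        raise ValueError("fragments must have the same non-zero number of intervals")
    return sum(abs(a - b) for a, b in zip(intervals_a, intervals_b)) / n


# Same contour, different interval sizes: the difference is non-zero.
print(normalized_interval_difference([250, 500], [500, 250]))   # (250 + 250) / 2

# Different contour: a change of direction also changes the signed interval,
# so it necessarily shows up in the interval difference as well.
print(normalized_interval_difference([250, 250, -250], [250, -250, 250]))
```

With signed intervals, two fragments can only differ in contour if at least one pair of corresponding intervals differs in value, which is one way to see why contour difference may carry no information beyond interval difference.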

The second experiment addressed issues raised by the first. Three trials addressed the question of how larger tempo changes affect similarity assessments. Six trials were designed to address the effect of transposition: Does it depend only on the transposition interval, or does it also depend on the length of the fragment? An additional 10 trials were designed to pit contour difference against interval difference.

Twenty people (7 piano students, 1 professional composer, 2 professional dancers and 10 participants with negligible musical knowledge) participated in the experiment.

Transposition: The transposition interval (250 and 500 cents) and the length of the fragments (two to four notes) were systematically varied. For the 10 trials testing the effect of transposition, a multiple regression showed that both transposition interval and length were significant predictors (p(interval) < 0.001, p(length) < 0.01). The correlation obtained was r = 0.89. Similarity ratings decreased with increasing transposition interval and decreasing length of fragment.

Tempo: Surprisingly, tempo changes by factors of 2, 4 and even 6 did not have a significant effect on similarity judgments, suggesting that the listeners understood similarity as tempo invariant in the context of isochronous fragments and were able to clearly separate tempo from other parameters. This is in accordance with the informal comments given by the participants after completing the experiment.

Contour versus interval difference: To compare the effect of contour change with that of interval difference, pairs of trials were designed so that trial a contained two fragments with zero contour difference and an interval difference of a given arbitrary non-zero value x. Trial b then contained two fragments with non-zero contour difference and the same interval difference x. We hypothesized that, if contour difference contributes to similarity independently of interval difference, the similarity assessments of trials a and b will differ. However, the data did not support this hypothesis. Hence we concluded that interval difference is the most prominent predictor, and that contour difference is embedded within it.
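The logic of this paired design can be illustrated with a small sketch. The two trials below are hypothetical examples constructed to satisfy the stated constraints; they are not the stimuli used in the experiment. Intervals are assumed to be signed values in cents, and both helper functions are illustrative.

```python
def direction(interval):
    """+1 for up, -1 for down, 0 for a repeated tone."""
    return (interval > 0) - (interval < 0)


def normalized_contour_difference(a, b):
    """Fraction of interval positions whose directions differ."""
    return sum(direction(x) != direction(y) for x, y in zip(a, b)) / len(a)


def normalized_interval_difference(a, b):
    """Mean absolute difference between corresponding signed intervals."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical trial a: same contour (up - up), different interval sizes.
trial_a = ([500, 250], [250, 500])
# Hypothetical trial b: different contour (up - up versus down - up).
trial_b = ([250, 250], [-250, 250])

for name, (first, second) in (("a", trial_a), ("b", trial_b)):
    print(
        name,
        normalized_contour_difference(first, second),
        normalized_interval_difference(first, second),
    )
```

Both trials share the same interval difference x while only trial b has a non-zero contour difference, so a systematic gap in ratings between the two would point to an independent contribution of contour.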

In summary, we investigated the perception of similarity of pairs of isochronous melodic fragments in two experiments. The significance of two effects was established. In pairs of fragments where the fragments were transpositions of each other, it was found that the size of the transposition interval and the length of the fragment are sufficient predictors (the larger the size of the transposition interval and the shorter the fragment, the smaller the mean similarity). It seems that the listener adapts to a transposition after a certain number of tones, so that equality of interval size becomes the dominant factor. In pairs of fragments where the fragments had the same average pitch, “normalized interval difference” was found to be a sufficient predictor of similarity. A model based on contour changes did not contribute significantly beyond the contribution of interval difference, and led to several incorrect predictions. Tempo changes did not contribute significantly to similarity judgments, presumably because participants understood melodic similarity as independent of tempo.

Further experiments are needed to investigate the melodic similarity of fragments where the rhythm and dynamics are changed. It would also be of interest to check whether predictions according to the interval difference model are correct when applied to real musical stimuli (i.e., excerpts from existing compositions). Finally, the role of recency in similarity judgments of longer melodic fragments requires investigation. An appropriately refined and tested model could have a range of applications in music theory, composition and ethnomusicology.

References

Cambouropoulos, E. & Smaill A. (1995). A computational model for the Discovery of Parallel Melodic Passages. Proceedings of the 11 Colloquio di Informatica Musicale, Bologna

Clarke, E. & Dibben N. (1997). An Ecological Approach to Similarity and Categorization in Music, Proceedings of Simcat workshop Edinburgh 1997, 37-41

Cuddy, L., Cohen, A. & Miller, J. (1979). Melodic recognition: The experimental application of musical rules. Canadian Journal of Psychology, vol. 33, 148-156

Dowling, W. J. & Fujitani, D. S. (1971). Contour, interval, and pitch recognition in memory for melodies. Journal of the Acoustical Society of America, 49, 524-531

Dowling, W. J. & Harwood, D., (1986). Music Cognition. New York: Academic Press

Edworthy, J., (1982). Pitch and contour in music processing. Psychomusicology, vol. 2, 44-46

Edworthy, J., (1985). Interval and contour in melody processing. Music Perception, vol. 2, 375-388

Egmond van, R., Povel, D.-J. & Maris, E. (1996). The influence of height and key on the perceptual similarity of transposed melodies. Perception & Psychophysics, vol. 58, 1252-1259

Francès, R., (1988). The Perception of Music (Dowling, Transl.). Hillsdale, NJ: Erlbaum (original work published 1958)

Gabrielsson, A., (1973). Similarity ratings and dimension analysis of auditory rhythm patterns: I & II, Scandinavian Journal of Psychology, vol. 14, 138-176

Kluge, R., (1996). Ähnlichkeitskriterien für Melodieanalyse. Systematische Musikwissenschaft, 4.1-2, 91-99

Mazzola, G., (1990). Geometrie der Töne. Basel: Birkhäuser

Nattiez, J.-J., (1975). Fondements d’une sémiologie de la musique. Paris

Reti, R., (1951). The Thematic Process in Music. New York: Macmillan Company

Shepard, R. N., (1987). Toward a universal law of generalization for psychological science. Science, 237, 1317-1323

Schönberg, A., (1967). Fundamentals of Musical Composition. New York: St. Martin’s Press

Palmer, S., (1983). The psychology of perceptual organization: A transformational approach. In (ed.), Human and Machine Vision (pp. 269-339). New York: Academic Press

Toch, E., (1948). The Shaping Forces in Music. New York: Criterion Music Corp, pp. 78-101

Trainor, L. & Trehub, S. (1993). Musical context effects in infants and adults: Key distance. Journal of Experimental Psychology: Human Perception & Performance, 19, 615-626

Tversky, A., (1977). Features of Similarity. Psychological Review, 84, 327-352

White, B., (1960). Recognition of distorted melodies. American Journal of Psychology, vol. 73, 100-107