- require "base64" ~ "\xEF\xBB\xBF" - def quellen opts - etc = opts.key? :etc - if etc - opts.delete :etc - etc = "\n" - else - etc = '' - "#{opts.map {|k, v| "#{k}" }.join "\n"}#{etc}" - def link link - "#{link}" - def import_data file - mime_type = IO.popen(["file", "--brief", "--mime-type", file], in: :close, err: :close) { |io| io.read.chomp } - content = Base64.urlsafe_encode64 File.read( file) - "data:#{mime_type};base64,#{content}" !!! 5 %html(lang='en') %head -#%meta(charset="utf-8") %title Decoding the sound of 'hardness' and 'darkness' as perceptual dimensions of music -#%link(rel="stylesheet" href="fonts/Roboto.css") -#%link(rel="stylesheet" href="fonts/RobotoSlab.css") -#%link(rel="stylesheet" href="fonts/PT_Mono.css") -#%link(rel="stylesheet" href="fonts/PT_Sans.css") -#%link(rel="stylesheet" href="fonts/Vollkorn.css") -#%link(rel="stylesheet" href="fonts/Asset.css") -#%link(rel="stylesheet" href="fonts/WithinDestruction.css") -#%link(rel="stylesheet" href="fonts/BlackDahlia.css") -#%link(rel="stylesheet" href="fonts/ThroughStruggleDEMO.css") -#%link(rel="stylesheet" href="fonts/TheDefiler.css") %link(rel="stylesheet" href="fonts/Cardo.css") %link(rel="stylesheet" href="fonts/Italianno.css") -#%link(rel="stylesheet" href="fonts/CinzelDecorative.css") %link(rel="stylesheet" href="style.css") %meta(name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=no") %body %header(style="") %figure.logos(style="margin-top:0.3cm")<> %img#tagungs-logo(style="float:right" src="files/icmpc15_logo.jpg") %img#uni-logo(src="files/univie_logo.png") -#%div(style="font-size:0.8em;margin-top:1.31cm") 44. Jahrestagung für Akustik %br<> Technische Universität München %br<> 19. März 2018 .. 22. März 2018 -#.grabstein .grabstein-was DAGA .grabstein-wo Technische Universität München .grabstein-von ✦ 19. März 2018 .grabstein-bis ✝ 22. März 2018 -#%img(style="height:7cm;top:3cm;right:24cm;position:absolute" alt="Dunkle Nacht" src="files/Candle.png") %h1 Decoding the sound of hardness and darkness as perceptual dimensions of music %p#authors<> %span.author(data-mark="1,2")<> Isabella Czedik-Eysenberg %span.author(data-mark="1")<> Christoph Reuter %span.author(data-mark="2")<> Denis Knauf %p#institutions<> %span.institution(data-mark="1")<> University of Vienna, Austria %span.institution(data-mark="2")<> Student at Technical University of Vienna, Austria %main #column1_1 %section#heavy_features :markdown Sound Features ============== Considering Bonferroni correction, 65 significant feature correlations were found for the concept of hardness. The characterizing attributes of hardness include high tempo and sound density, less focus on clear melodic lines than noise-like sounds and especially the occurrence of strong percussive components. 
          %ol
            %li percussive energy / rhythmic density
            %figure
              %img(style="width:50%" src="files/sonagramm_blunt_log.png")
              %img(style="width:50%" src="files/sonagramm_decap_log.png")
            %li dynamic distribution
            %figure
              %img(style="width:50%" src="files/blunt_envelope.png")
              %img(style="width:50%" src="files/decap_envelope.png")
            %figure
              %img(style="width:50%" src="files/blunt_dyndist.png")
              %img(style="width:50%" src="files/decap_dyndist.png")
            %li melodic content / harmonic entropy
            %figure
              %img(style="width:50%" src="files/blunt_chromagram.png")
              %img(style="width:50%" src="files/decap_chromagram.png")
        %section#heavy_model
          %h1 Model
          :markdown
            Sequential feature selection

            * set of 5 features
            * predictive linear regression model

            Measure   | Value
            ----------|------
            RMSE      | 0.64
            R-Squared | 0.80
            MSE       | 0.40
            MAE       | 0.49
            r         | 0.900
          %figure
            %img(src="scatter_hardness_model5.png")
        %section#heavy_rater_agreement
          :markdown
            Rater Agreement
            ===============
            Intraclass Correlation Coefficient (Two-Way Model, Consistency): **0.653**
      #column1_2
        -#%section#aims
          %h1 Aims
          %p Based on computationally obtainable signal features, the creation of models for the perceptual concepts of hardness and darkness in music is aimed for. Furthermore it shall be explored if there are interactions between the two factors and to which extent it is possible to classify musical genres based on these dimensions.
        %section#method
          %h1 Method
          %figure.right(style="width:50%")
            %img(src="files/LastFM.png")
          :markdown
            Based on last.fm listener statistics, 150 pieces of music were selected from 10 different subgenres of metal, techno, gothic and pop music.
            In an online listening test, 40 participants were asked to rate the refrain of each example in terms of hardness and darkness.
            These ratings served as ground truth for examining the two concepts with a machine learning approach: taking into account 230 features describing spectral distribution as well as temporal and dynamic properties, relevant dimensions were identified and combined into models. Predictors were trained using five-fold cross-validation.
          %figure.right(style="width:50%")
            %img(src="files/einhorn/diagramm_vorgang_english.png")
        %section#data
          %h1 Data
          %figure.right(style="width:50%")
            %img(src="files/scatter_hard_dark_dashedline_2017-09-05.png")
        %section#hardness
          %h1 Hardness
          %p Hardness is often considered a distinctive feature of (heavy) metal music, as well as of genres like hardcore techno or Neue Deutsche Härte. In a previous investigation, the concept of hardness in music was examined in terms of its acoustic correlates and its suitability as a descriptor for music #{quellen 'Czedik-Eysenberg et al.' => 2017}.
      #column1_3
        %section#darkness
          %h1 Darkness
          %p Certain kinds of music are sometimes described as dark in a metaphorical sense, especially in genres like gothic or doom metal. According to classifications of musical adjectives, dark belongs to the same cluster as gloomy, sad or depressing #{quellen Hevner: 1936}, which was later adopted in computational musical affect detection #{quellen 'Li & Ogihara' => 2003}. This would suggest the relevance of sound attributes that correspond to the expression of sadness, e.g. lower pitch, small pitch movement and dark timbre #{quellen Huron: 2008}. In timbre research, brightness is often considered one of the central perceptual axes #{quellen Grey: 1975, 'Siddiq et al.' => 2014}, which raises the question of whether darkness in music is also reflected as the inverse of this timbral brightness concept.
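        -#
          The hardness and darkness models are both described as sequential feature selection feeding a
          predictive linear regression evaluated with five-fold cross-validation (see the Method and Model
          sections). A minimal plain-Ruby sketch of that procedure follows; the data layout, the unshuffled
          fold split and the feature limit are illustrative assumptions, not the study's implementation.

            require "matrix"

            # Ordinary least squares fit; returns the coefficient vector (intercept first).
            def ols_fit rows, targets
              x = Matrix[*rows.map { |r| [1.0] + r }]
              y = Vector[*targets]
              ((x.transpose * x).inverse * x.transpose * y).to_a
            end

            def predict coeffs, row
              ([1.0] + row).zip(coeffs).sum { |value, coeff| value * coeff }
            end

            # Cross-validated RMSE for one feature subset: rows holds one feature array per excerpt,
            # ratings the mean ratings in the same order.
            def cv_rmse rows, ratings, folds = 5
              fold_size = (rows.size / folds.to_f).ceil
              squared_errors = (0...rows.size).to_a.each_slice(fold_size).flat_map do |test|
                train  = (0...rows.size).to_a - test
                coeffs = ols_fit(train.map { |i| rows[i] }, train.map { |i| ratings[i] })
                test.map { |i| (predict(coeffs, rows[i]) - ratings[i])**2 }
              end
              Math.sqrt(squared_errors.sum / squared_errors.size)
            end

            # Greedy forward selection: keep adding the feature whose inclusion yields the lowest
            # cross-validated RMSE until `limit` features are chosen (5 for hardness, 8 for darkness).
            def forward_select rows, ratings, limit
              selected  = []
              remaining = (0...rows.first.size).to_a
              until selected.size == limit || remaining.empty?
                best = remaining.min_by do |candidate|
                  columns = selected + [candidate]
                  cv_rmse(rows.map { |row| columns.map { |c| row[c] } }, ratings)
                end
                selected << best
                remaining.delete best
              end
              selected
            end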
        %section#darkness_features
          :markdown
            Sound Features
            ==============
            Considering Bonferroni correction, 35 significant feature correlations were found for the darkness ratings. While a suspected negative correlation with **timbral brightness** could not be confirmed, darkness appears to be associated with high **spectral complexity** and harmonic traits such as **major or minor mode**.
          %figure
            %img(src="files/scatter_spectral_centroid_essentia_darkness.png")
          :markdown
            Correlations between the darkness rating and measures of brightness:

            Feature                 | r      | p
            ------------------------|--------|-------
            Spectral centroid       | 0.3340 | <0.01
            High frequency content  | 0.1526 | 0.0631
          %figure
            %img(src="files/violin_keyEdma_darkMean_blaugelb.png")
          %p Musical excerpts in minor mode were rated as significantly darker than those in major mode (p < 0.01 according to a t-test).
        %section#darkness_model
          %h1 Model
          %figure
            %img(src="files/scatter_darkness_model8.png")
          :markdown
            Sequential feature selection:

            * combination of 8 features
            * predictive linear regression model

            Measure   | Value
            ----------|-------
            RMSE      | 0.81
            R-Squared | 0.60
            MSE       | 0.65
            MAE       | 0.64
            r         | 0.7978
        %section#darkness_rater_agreement
          :markdown
            Rater Agreement
            ===============
            Intraclass Correlation Coefficient (Two-Way Model, Consistency): **0.498**
    %footer
      %section#further_resultes_conclusion
        :markdown
          Further Results & Conclusions
          =============================
          Comparison
          ----------
          When comparing darkness and hardness, the results indicate that the latter concept can be described and modeled more efficiently by specific sound attributes:

          * The consistency between ratings given by different raters is higher for hardness (see the Intraclass Correlation Coefficients).
          * For the hardness dimension, a model can be based on a more compact set of features and at the same time achieves a better prediction rate.

          Further application
          -------------------
          Although a considerable linear relation (r = 0.65, p < 0.01) is present between the two dimensions within the studied dataset, the concepts prove to be useful criteria for distinguishing music examples from different genres. For example, a simple decision tree can be constructed for classification into broad genre categories (Pop, Techno, Metal, Gothic) with an accuracy of 74%.
        %img(src="files/predictionTree_genreAgg2.png")
        %img(src="files/confusionMatrix_simpleTree_genreAgg2.png")
      %section#conclusion
        :markdown
          Conclusion
          ==========
          Hardness and darkness constitute perceptually relevant dimensions for a high-level description of music. By decoding the sound characteristics associated with these concepts, they can be used for analyzing and indexing music collections and, for example, in a decision tree for automatic genre prediction.
    -#%section#ergebnisse1(style="height:96.35cm")
      %h1 4. Ergebnisse
      %figure.right(style="width:70%")
        %img(alt='Verwelkter Mohn' src='files/violin_genre_darkMean.svg')
      %p Es zeigt sich ein Bezug zwischen dem Genre und der durchschnittlichen Düsterkeitsbewertung der jeweiligen Stimuli.
      %figure.right(style="width:35%")
        %img(alt='Ernstes Indigo' src='files/scatter_spectral_centroid_essentia_darkness.svg')
      %p Eine Antiproportionalität zu klangfarblicher Helligkeit lässt sich (mit der vorliegenden Messmethode) nicht nachweisen. Es liegt im Gegenteil sogar eine leicht positive Korrelation vor – womöglich u.a. bedingt durch erhöhte dissonante Klanganteile im Hochfrequenzbereich (z.B. Schlagzeugvorkommen). Werden die perkussiven Signalanteile zuvor ausgefiltert, verringert sich dieser Effekt bereits deutlich.
      %figure.nobrtd(style="width:24em")
        :markdown
          Merkmal|r|p
          ---|---|---
          Spectral Centroid|0,3340|< 0,0001
          Hochfrequenzanteil (> 1500 Hz)|0,1526|0,0631
          Spectral Centroid (harmonischer Teil)|0,2094|0,0101
          Hochfrequenzanteil (harmonischer Teil)|0,1270|0,1215
          {:.merkmale}
        %figcaption Korrelation der durchschnittlichen Düsterkeitsbewertung mit Maßen für klangfarbliche Helligkeit.
      .clear
      %figure.left(style="width:41.1%")
        %img(alt='Trauriges Purpur' src='files/violin_keyEdma_darkMean_blaugelb.svg')
      %figure
      %figure.right(style="width:12em")
        %img(alt="lilien grau" src="files/meanspectra_10khz_600dpi.png")
      %figure.right
        :markdown
          Merkmal|r|p
          ---|---|---
          RMS Gammatone 1|- 0,3989|< 0,0001
          RMS Gammatone 4|- 0,3427|< 0,0001
          RMS Gammatone 5|- 0,3126|0,0001
          {:.merkmale}
      %p(style="clear:right") Zwischen den 30 am düstersten bzw. am wenigsten düster bewerteten Klangbeispielen zeigen sich charakteristische Unterschiede in der spektralen Verteilung (insbesondere im Bereich der Gammatone-Filterbank-Bänder 1, 4 und 5).
      %p(style="clear:right") Ein deutlicher Zusammenhang zeigt sich mit der Tonart der jeweiligen Ausschnitte: Moll-Beispiele wurden im Durchschnitt als düsterer bewertet als Stücke in Dur-Tonarten (p < 0.0001 laut t-Test).
      %p(style="clear:right") Teilweise eher statische Tonchroma-Veränderungen im Fall der als düster bewerteten Beispiele könnten die Theorie geringerer Tonhöhenbewegungen in Zusammenhang mit einem Ausdruck von Trauer bestätigen (siehe z.B. Chromagramm Sunn O)))).
      %figure.right(style="width:58.2%")
        %img(style="width:49%" alt='Schrumpeliges Gelb' src='files/chromagramm_sunn.svg')
        %img(style="width:49%" alt='Vergängliches Weiß' src='files/chromagramm_abba.svg')
      %p(style="clear:left;max-width: 50%") Der stärkste Zusammenhang lässt sich zur Spectral Complexity feststellen, welche die Komplexität des Signals in Bezug auf seine Frequenzkomponenten anhand der Anzahl spektraler Peaks im Bereich zwischen 100 Hz und 5 kHz beschreibt. Dies ist interessant mit den Ergebnissen von #{quellen 'Laurier et al.' => 2010} in Bezug zu setzen, welche beobachteten, dass entspannte (relaxed) Stücke eine niedrigere spektrale Komplexität aufweisen, fröhliche (happy) Stücke jedoch eine leicht höhere spektrale Komplexität als nicht fröhliche.
      %figure.left(style="width:59.83%;position:relative")
        %img(alt='Totes Grün' src='files/scatter_model8_mit_beschriftung_gross.svg')
        %img(alt="Farbiges Beispiel" style="width:5cm;opacity:0.7;position:absolute;top:0;left:3cm" src="files/bat.png")
      %p(style="clear:right") Nach sequentieller Merkmalsauswahl wurden 8 Signaldeskriptoren zur Bildung eines Modells zu Rate gezogen:
      :markdown
        Merkmal|r|p
        ----|----|----
        Spectral Complexity (mean)| 0,6224| < 0,0001
        HPCP Entropy (mean)| 0,5355| < 0,0001
        Dynamic Complexity| - 0,4855| < 0,0001
        Onset Rate| - 0,4837| < 0,0001
        Pitch Salience| 0,4835| < 0,0001
        MFCC 3 (mean)| 0,4657| < 0,0001
        Spectral Centroid (mean)| 0,3340| < 0,0001
        RMS Energy Gammatone 4| - 0,3427| < 0,0001
        {:.merkmale}
      %p Anhand dieser wurde unter 5-facher Kreuzvalidierung ein lineares Regressionsmodell zur Abschätzung der Düsterkeitsbewertung erstellt.
      :markdown
        Merkmal|Wert
        ----|----
        Root-mean-squared error (RMSE)|0,81
        Bestimmtheitsmaß (R2)|0,60
        Mean Squared Error (MSE)|0,65
        Mean Absolute Error (MAE)|0,64
        Korrelation (insgesamt)|0,7978
        {:.merkmale}
      %div(style="clear:left")
    .clear
    %section#references
      -#(style="width:44.5%;display:inline-block;float:right")
      %h1 References
      %ul.literatur
        %li
          %span.author Czedik-Eysenberg, I., Knauf, D., & Reuter, C.
          %span.year 2017
          %span.title Hardness as a semantic audio descriptor for music using automatic feature extraction
          %span.herausgeber Gesellschaft für Informatik, Bonn
          %span.link
            %a(href="https://doi.org/10.18420/in2017_06") https://doi.org/10.18420/in2017_06
        %li
          %span.author Grey, J. M.
          %span.year 1975
          %span.title An Exploration of Musical Timbre
          %span.herausgeber Stanford University, CCRMA Report No. STAN-M-2
        %li
          %span.author Hevner, K.
          %span.year 1936
          %span.title Experimental studies of the elements of expression in music
          %nobr
            %span.herausgeber The American Journal of Psychology, 48
            %span.pages 246-268
        %li
          %span.author Li, T., & Ogihara, M.
          %span.year 2003
          %span.title Detecting emotion in music
          %nobr
            %span.herausgeber 4th ISMIR, Washington & Baltimore
            %span.pages 239-240
        %li
          %span.author Huron, D.
          %span.year 2008
          %span.title A comparison of average pitch height and interval size in major- and minor-key themes
          %nobr
            %span.herausgeber Empirical Musicology Review, 3
            %span.pages 59-63
        %li
          %span.author Siddiq, S., et al.
          %span.year 2014
          %span.title Kein Raum für Klangfarben - Timbre Spaces im Vergleich
          %nobr
            %span.herausgeber 40. DAGA
            %span.pages 56-57
    .clear
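    -#
      The "Further application" paragraph above reports that a simple decision tree over the two modeled
      dimensions separates broad genre categories (Pop, Techno, Metal, Gothic) with 74% accuracy. The
      sketch below only illustrates the shape such a two-feature tree could take; the thresholds and the
      leaf assignments are placeholders, not the splits shown in the tree figure.

        # Classifies one excerpt from its predicted hardness and darkness scores.
        # Thresholds and genre leaves are illustrative placeholders (e.g. assuming standardized scores).
        def classify_genre hardness, darkness
          if hardness > 0.0
            darkness > 0.0 ? :metal : :techno
          else
            darkness > 0.0 ? :gothic : :pop
          end
        end

        # Accuracy over a labelled test set given as [[hardness, darkness, true_genre], ...].
        def accuracy examples
          correct = examples.count { |h, d, genre| classify_genre(h, d) == genre }
          correct.to_f / examples.size
        end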