- require "base64"
- def quellen opts
  - etc = opts.key? :etc
  - if etc
    - opts.delete :etc
    - etc = "\n<etc/>"
  - else
    - etc = ''
  - "<quellen>#{opts.map {|k, v| "<quelle jahr=#{v}>#{k}</quelle>" }.join "\n"}#{etc}</quellen>"
- def link link
  - "<a href=\"#{link}\">#{link}</a>"
|
|
- def import_data file
  - mime_type = IO.popen(["file", "--brief", "--mime-type", file], in: :close, err: :close) { |io| io.read.chomp }
  -# data: URIs expect the standard Base64 alphabet, not the URL-safe one; read the file as binary
  - content = Base64.strict_encode64 File.binread(file)
  - "data:#{mime_type};base64,#{content}"
|
|
~ "\xEF\xBB\xBF"
|
|
!!! 5
|
|
%html(lang='en')
|
|
%head
|
|
-#%meta(charset="utf-8")
|
|
%title Decoding the sound of 'hardness' and 'darkness' as perceptual dimensions of music
|
|
-#%link(rel="stylesheet" href="fonts/Roboto.css")
|
|
-#%link(rel="stylesheet" href="fonts/RobotoSlab.css")
|
|
-#%link(rel="stylesheet" href="fonts/PT_Mono.css")
|
|
-#%link(rel="stylesheet" href="fonts/PT_Sans.css")
|
|
-#%link(rel="stylesheet" href="fonts/Vollkorn.css")
|
|
-#%link(rel="stylesheet" href="fonts/Asset.css")
|
|
-#%link(rel="stylesheet" href="fonts/WithinDestruction.css")
|
|
-#%link(rel="stylesheet" href="fonts/BlackDahlia.css")
|
|
-#%link(rel="stylesheet" href="fonts/ThroughStruggleDEMO.css")
|
|
-#%link(rel="stylesheet" href="fonts/TheDefiler.css")
|
|
%link(rel="stylesheet" href="fonts/Cardo.css")
|
|
%link(rel="stylesheet" href="fonts/Italianno.css")
|
|
-#%link(rel="stylesheet" href="fonts/CinzelDecorative.css")
|
|
%link(rel="stylesheet" href="style.css")
|
|
%meta(name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=no")
|
|
|
|
|
|
%body
|
|
%header(style="")
|
|
%figure.logos(style="margin-top:0.3cm")<>
|
|
%img#tagungs-logo(style="float:right" src="files/icmpc15_logo.png")
|
|
%img#uni-logo(src="files/Uni_Logo_2016_ausschnitt.gif")
|
|
-#%div(style="font-size:0.8em;margin-top:1.31cm")
|
|
44th Annual Conference on Acoustics
%br<>
Technische Universität München
%br<>
19 March 2018 .. 22 March 2018
|
|
-#.grabstein
|
|
.grabstein-was DAGA
|
|
.grabstein-wo Technische Universität München
|
|
.grabstein-von ✦ 19 March 2018
.grabstein-bis ✝ 22 March 2018
|
|
-#%img(style="height:7cm;top:3cm;right:24cm;position:absolute" alt="Dark night" src="files/Candle.png")
|
|
%h1
|
|
Decoding the sound of <q>hardness</q> and <q>darkness</q> as perceptual dimensions of music
|
|
%p#authors<>
|
|
%span.author(data-mark="1,2")<> Isabella Czedik-Eysenberg
|
|
%span.author(data-mark="1")<> Christoph Reuter
|
|
%span.author(data-mark="2")<> Denis Knauf
|
|
%p#institutions<>
|
|
%span.institution(data-mark="1")<> University of Vienna, Austria
|
|
%span.institution(data-mark="2")<> Student at Technical University of Vienna, Austria
|
|
|
|
%main
|
|
#column1_1
|
|
%section#hardness
|
|
%h1 Hardness
|
|
%p
|
|
<q>Hardness</q> is often considered a distinctive feature of (heavy)
|
|
metal music, as well as of genres like hardcore techno or <q>Neue
|
|
Deutsche Härte</q>.
|
|
In a previous investigation, the concept of <q>hardness</q> in music
was examined in terms of its acoustic correlates and its suitability as
a descriptor for music #{quellen 'Czedik-Eysenberg et al.' => 2017}.
|
|
|
|
:markdown
|
|
Sound Features
|
|
--------------
|
|
|
|
After Bonferroni correction, 65 significant feature
correlations were found for the concept of <q>hardness</q>.

The characterizing attributes of <q>hardness</q> include high
tempo and sound density, a stronger focus on noise-like sounds than
on clear melodic lines, and especially the occurrence of strong
percussive components.
|
|
%ol
|
|
%li
|
|
percussive energy / rhythmic density
|
|
%figure
|
|
%img.fifty(src="files/sonagramm_blunt_log.png")
|
|
%img.fifty(src="files/sonagramm_decap_log.png")
|
|
%li
|
|
dynamic distribution
|
|
%figure
|
|
%img.fifty(src="files/blunt_envelope.png")
|
|
%img.fifty(src="files/decap_envelope.png")
|
|
%figure
|
|
%img.fifty(src="files/blunt_dyndist.png")
|
|
%img.fifty(src="files/decap_dyndist.png")
|
|
%li
|
|
melodic content / harmonic entropy
|
|
%figure
|
|
%img.fifty(src="files/blunt_chromagram.png")
|
|
%img.fifty(src="files/decap_chromagram.png")
|
|
:markdown
|
|
Model
|
|
-----
|
|
|
|
Sequential feature selection
|
|
|
|
* set of 5 features
|
|
* predictive linear regression model
|
|
|
|
RMSE      | 0.64
R-Squared | 0.80
MSE       | 0.40
MAE       | 0.49
r         | 0.900
|
|
%figure
|
|
%img(src="scatter_hardness_model5.png")
|
|
:markdown
|
|
Rater Agreement
|
|
---------------
|
|
|
|
Intraclass Correlation Coefficient (Two-Way Model, Consistency): <b>0.653</b>
|
|
.clear
|
|
|
|
#column1_2
|
|
-#%section#aims
|
|
%h1 Aims
|
|
%p
|
|
The aim is to create models for the perceptual concepts of
<q>hardness</q> and <q>darkness</q> in music, based on
computationally obtainable signal features. Furthermore, it is
explored whether the two factors interact and to what extent
musical genres can be classified based on these dimensions.
|
|
%section#method
|
|
%h1 Method
|
|
%figure.right(style="width:50%")
|
|
%img(src="files/LastFM.png")
|
|
%p
|
|
Based on last.fm listener statistics, 150 pieces of music were selected
|
|
from 10 different subgenres of metal, techno, gothic and pop music.
|
|
%p
|
|
In an online listening test, 40 participants were asked to rate the
|
|
refrain of each example in terms of <q>hardness</q> and <q>darkness</q>.
|
|
These ratings served as a ground truth for examining the two
|
|
concepts using a machine learning approach:
|
|
|
|
%figure.right(style="width:50%")
|
|
%img(src="files/diagramm_vorgang_english.png")
|
|
%p
|
|
Taking into account 230 features describing spectral distribution,
|
|
temporal and dynamic properties, relevant dimensions were
|
|
investigated and combined into models.
|
|
Predictors were trained using five-fold cross-validation.
|
|
.clear
|
|
%h2 Data
|
|
%figure
|
|
%img(src="files/scatter_hard_dark_dashedline_2017-09-05.png")
|
|
.clear
|
|
%section#further_resultes_conclusion
|
|
%h1 Further Results & Conclusions
|
|
%figure.fifty
|
|
%img.right(src="files/predictionTree_genreAgg2.png")
|
|
%img.right(src="files/confusionMatrix_simpleTree_genreAgg2.png")
|
|
:markdown
|
|
Comparison
|
|
----------
|
|
|
|
When comparing <q>darkness</q> and <q>hardness</q>, the results
|
|
indicate that the latter concept can be more efficiently described
|
|
and modeled by specific sound attributes:
|
|
|
|
* The consistency between ratings given by different raters is
  higher for <q>hardness</q> (Intraclass Correlation Coefficient
  0.653 vs. 0.498)
* For the <q>hardness</q> dimension, a model can be based on a more
  compact set of features (5 vs. 8) and at the same time yields
  more accurate predictions (R-Squared 0.80 vs. 0.60)
|
|
|
|
Further application
|
|
-------------------
|
|
|
|
Although a considerable linear relation
|
|
(<nobr>r = 0.65</nobr>, <nobr>p < 0.01</nobr>) is present between
|
|
the two dimensions within the studied dataset, the concepts prove to
|
|
be useful criteria for distinguishing music examples from different
|
|
genres.
|
|
|
|
For example, a simple decision tree can be constructed that
classifies the examples into broad genre categories (Pop, Techno,
Metal, Gothic) with an accuracy of 74%.
|
|
.clear
|
|
|
|
|
|
#column1_3
|
|
%section#darkness
|
|
%h1 Darkness
|
|
%p
|
|
Certain kinds of music are sometimes described as <q>dark</q> in a
|
|
metaphorical sense, especially in genres like gothic or doom metal.
|
|
According to musical adjective classifications, <q>dark</q> is part
of the same cluster as <q>gloomy</q>, <q>sad</q> or
<q>depressing</q> #{quellen Hevner: 1936}, a grouping that was later
adopted in computational musical affect detection
#{quellen 'Li & Ogihara' => 2003}.
This would suggest the
relevance of sound attributes that correspond with the expression
of sadness, e.g. lower pitch, small pitch movement and <q>dark</q>
timbre #{quellen Huron: 2008}. In timbre research, <q>brightness</q>
is often considered one of the central perceptual axes
#{quellen Grey: 1975, 'Siddiq et al.' => 2014}, which raises the
question of whether <q>darkness</q> in music is also reflected as the
inverse of this timbral <q>brightness</q> concept.
|
|
:markdown
|
|
Sound Features
|
|
--------------
|
|
|
|
After Bonferroni correction, 35 significant feature
correlations were found for the <q>darkness</q> ratings.

While the suspected negative correlation with **timbral
<q>brightness</q>** cannot be confirmed, <q>darkness</q> appears to
be associated with high **spectral complexity** and harmonic
traits like **major or minor mode**.
|
|
%figure.fifty
|
|
%img(src="files/scatter_spectral_centroid_essentia_darkness.png")
|
|
:markdown
|
|
Correlations between darkness rating and measures for brightness:
|
|
|
|
Feature                | r      | p
-----------------------|--------|----------
Spectral centroid      | 0.3340 | <0.01
High frequency content | 0.1526 | 0.0631
|
|
%figure.fifty
|
|
%img(src="files/violin_keyEdma_darkMean_blaugelb.png")
|
|
%p
|
|
Musical excerpts in minor mode were rated as significantly
<q>darker</q> than those in major mode (<nobr>p < 0.01</nobr>
according to a t-test).
|
|
%h2 Model
|
|
%figure.fifty
|
|
%img(src="files/scatter_darkness_model8.png")
|
|
:markdown
|
|
Sequential feature selection:
|
|
|
|
* combination of 8 features
|
|
* predictive linear regression model
|
|
|
|
RMSE      | 0.81
R-Squared | 0.60
MSE       | 0.65
MAE       | 0.64
r         | 0.7978
|
|
:markdown
|
|
Rater Agreement
|
|
---------------
|
|
|
|
Intraclass Correlation Coefficient (Two-Way Model, Consistency):
|
|
**0.498**
|
|
.clear
|
|
|
|
%footer
|
|
%section#conclusion
|
|
:markdown
|
|
Conclusion
|
|
==========
|
|
|
|
<q>Hardness</q> and <q>darkness</q> constitute perceptually relevant
|
|
dimensions for a high-level description of music. Once the sound
characteristics associated with these concepts are decoded, the two
dimensions can be used for analyzing and indexing music collections
and, for example, in a decision tree for automatic genre prediction.
|
|
|
|
%section#references
|
|
-#(style="width:44.5%;display:inline-block;float:right")
|
|
%h1 References
|
|
%ul.literatur
|
|
%li
|
|
%span.author Czedik-Eysenberg, I., Knauf, D., & Reuter, C.
|
|
%span.year 2017
|
|
%span.title <q>Hardness</q> as a semantic audio descriptor for music using automatic feature extraction
|
|
%span.herausgeber Gesellschaft für Informatik, Bonn
|
|
%span.link= link 'https://doi.org/10.18420/in2017_06'
|
|
%li
|
|
%span.author Grey, J.M.
|
|
%span.year 1975
|
|
%span.title An Exploration of Musical Timbre
|
|
%span.herausgeber Stanford University, CCRMA Report No. STAN-M-2
|
|
%li
|
|
%span.author Li, T., & Ogihara, M.
|
|
%span.year 2003
|
|
%span.title Detecting emotion in music
|
|
%nobr
|
|
%span.herausgeber 4th ISMIR Washington & Baltimore
|
|
%span.pages 239-240
|
|
%li
|
|
%span.author Huron, D.
|
|
%span.year 2008
|
|
%span.title A comparison of average pitch height and interval size in major- and minor-key themes
|
|
%nobr
|
|
%span.herausgeber Empirical Musicology Review, 3
|
|
%span.pages 59-63
|
|
%li
|
|
%span.author Siddiq, S. et al.
|
|
%span.year 2014
|
|
%span.title Kein Raum für Klangfarben - Timbre Spaces im Vergleich
|
|
%nobr
|
|
%span.herausgeber 40. DAGA
|
|
%span.pages 56-57
|
|
.clear
|