Yes, fair enough.
But the point here is that people do hear a difference.
To which the measurement mafia replies: but not in an ABX test!!!
So you are onto something: the measurement mafia finds this odd. They regard it as proof, and scientific proof at that.
Then... you have also postulated that whatever the listening mafia says will ALWAYS be dismissed.
What you are saying is: we know we are right, and once you submit to ABX you will finally hear it too...
Then the debate is over, as far as I am concerned. Whatever I say will never be listened to...
Or am I wrong in my assumption?
And if this is also I_L's position ... then it is no wonder we quarrel until the fur flies.
And never agree!!!
Then the measurement mafia is really saying just one thing: come to the slaughter bench, we have sharpened the knives...
But we can turn it around:
What one "hears" during an ABX test cannot be trusted...
Regards
An ABX test is designed to minimize bias. There is a lot to watch out for in making such a test valid as well; this is discussed in more detail in the article mentioned below. Done correctly, however, it is considered to have good validity.
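To make concrete how an ABX result is usually judged, here is a minimal Python sketch. It assumes that each trial is independent and that a listener who hears no difference guesses correctly half the time; the function name `abx_p_value` and the 16-trial session are my own illustration, not something taken from the article.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """One-sided binomial p-value for an ABX session.

    Returns the probability of getting at least `correct` right answers
    in `trials` trials if the listener is purely guessing (p = 0.5).
    """
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count all outcomes with `correct` or more hits, out of 2**trials
    # equally likely guess sequences.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # Hypothetical session of 16 trials (an assumed number, chosen for illustration).
    for correct in (8, 12, 14):
        print(f"{correct}/16 correct: p = {abx_p_value(correct, 16):.4f}")
```

A small p-value means the score would be unlikely from guessing alone. A large one does not prove that no difference exists, only that this session did not demonstrate one.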
I think the fact that we cannot agree has a lot to do with "confirmation bias", which incidentally was also demonstrated in the NRK programme "Folkeopplysningen". You can read more about confirmation bias here:
Confirmation bias - Wikipedia, the free encyclopedia: “Confirmation bias (also called confirmatory bias or myside bias) is a tendency of people to favor information that confirms their beliefs or hypotheses. People display this bias when they gather or remember information selectively, or when they interpret it in a biased way. The effect is stronger for emotionally charged issues and for deeply entrenched beliefs. For example, in reading about gun control, people usually prefer sources that affirm their existing attitudes. They also tend to interpret ambiguous evidence as supporting their existing position. Biased search, interpretation and memory have been invoked to explain attitude polarization (when a disagreement becomes more extreme even though the different parties are exposed to the same evidence), belief perseverance (when beliefs persist after the evidence for them is shown to be false), the irrational primacy effect (a greater reliance on information encountered early in a series) and illusory correlation (when people falsely perceive an association between two events or situations).”
I think this matches very well what we are experiencing in this debate.
You think one can hear differences even though there are no measurable differences? OK, let us look at what systematic errors can creep into, for example, a listening test. Asbjørn has already explained illusory correlations in post #899. Bias is a term widely used in research, and in this context it can be defined as systematic errors that affect the results of listening tests. There are certainly more possible biases, but I have picked out some from:
http://www.acourate.com/Download/BiasesInModernAudioQualityListeningTests.pdf
I am cutting and pasting a bit from this article:
Recency effect
…There is another problem related to using long, time-varying stimuli, which potentially can give rise to a systematic error. As mentioned, listeners face problems when evaluating the audio quality of long program material. It was observed that listeners are not reliable at “averaging” quality as it changes over the duration of the whole excerpt, and their judgments are biased toward the quality of that part of the recording that is auditioned last (the end of the recording if the recording is not looped). This psychological effect is related to the dominance of short-term memory over long-term memory and is often referred to as a recency effect, as the assessors tend to be biased toward recent events. For example, Gros et al. [22] conducted a study evaluating telephone speech quality and observed a systematic shift in scores due to the recency effect of a magnitude of up to 23% of the total range of the scale. Moreover, the recency effect was studied extensively by Aldridge et al. in the context of picture quality evaluation [23]. This phenomenon is sometimes referred to as a forgiveness effect as the assessors tend to “forgive” occasional imperfections in the quality, provided that the final part of the evaluated excerpt is unimpaired. For example, in the study conducted by Seferidis et al. [24] it was observed that for some stimuli the recency effect biased the results of the subjective evaluation by almost 50%.
Biases Due to Appearance, Branding, Expectation, and Personal Preference
In general listeners’ affective judgments can be biased by the appearance of the equipment, price, and branding. For example, Toole and Olive demonstrated that in preference tests (affective judgments) both experienced and inexperienced listeners were biased by the appearance and the brand names of the loudspeakers evaluated [38]. When the scores from the listening tests were averaged across the listeners it was found that the results were different depending on whether the participants could see the loudspeakers or not. The maximum observed difference equaled 1.2 points (loudspeaker D, location 1) on a scale ranging from 0 to 10, which constitutes 12% of the range of the scale. This study is often quoted as a classical example of how important it is to undertake blind listening tests in order to reduce nonacoustic bias.
Listeners can also be biased by different labeling of the equipment. An interesting example is provided by Bentler et al. [39]. In their experiment a group of listeners were asked to assess the audio quality of two identical types of hearing aids, labeled as either digital or conventional. They found that out of 40 participants 33 listeners preferred the hearing aids labeled digital, 3 preferred the conventional ones, and only 4 participants did not hear the difference between the two. Like Toole and Olive, Bentler et al. also emphasized the importance of undertaking blind listening tests in order to minimize this type of bias.
For a given object under evaluation assessors may give different scores depending on whether the object meets their expectations or not. It is likely that the participants will like the objects that meet their expectations and dislike any object that departs from their internal standard of expectation. For example, Fastl reported that the brand name of a car can trigger expectations about the sound character produced by a closing door [33]. Another interesting example is provided by Beidl and Stücklschwaiger [40]. They asked the listeners to do paired comparisons of different car noises. One group of listeners preferred quieter noises, whereas another group preferred louder noises, which was an unexpected outcome of the investigation. When asked for justification, the second group argued that “the higher the speed, the more powerful, the more sport, the more dynamic and, therefore, better.”
A more recent example of how the expectation of listeners may affect the results of an audio quality evaluation is given by Västfjäll [42]. In his experiment the expectation of the participants was controlled directly by asking them to read different consumer reports (either positive or negative). In the listening test participants were asked to evaluate the annoyance (affective judgment) of two different aircraft sounds. It was found that participants who had low expectations on average rated the unpleasant sound as less annoying than people who had high expectations. The difference was equal to about 13% of the total range of the scale.
An interesting example, demonstrating how nonacoustical factors, such as the meaning of the sound, can influence affective judgments is presented by Zimmer et al. [43]. They undertook a listening test investigating the unpleasantness of environmental sounds and then applied a probabilistic choice model based on physical characteristics of the stimuli in order to predict the data. Their model predicted the scores well for all sounds investigated except the “wasp” sound. They concluded that in this case the listeners’ judgments of unpleasantness could have been governed by nonacoustical factors: “Based on the comments by participants some of whom reported to instinctively have ducked, or tried to wave off ‘the bee from their left ear,’ the excess annoyance of this sound may be tentatively characterized as being due to its intrusiveness.” Another example demonstrating how the meaning of the sound may influence the affective judgments is provided by Fastl [33]. He reported that the bell sound may be interpreted by German subjects as “pleasant” and “safe,” due to its association with the sound of a church bell. On the contrary, for Japanese listeners the sound of a bell may lead to feelings denoted by the terms “dangerous” or “unpleasant,” since it could be associated with the sound of a fire engine or a railroad crossing.
Another problem related to the listening tests involving affective judgments is their poor long-term stability of results. Although in the experiments conducted by Choisel it was observed that the results of preference judgments were stable over a period of approximately six months [44], in general the listeners’ preferences may drift over time due to, for instance, changes in fashion, changes in listening habits, or changes in the technical quality of available products. Consequently this may prevent researchers from drawing conclusions that would hold true for a long time. For example, in 1957 Kirk undertook a study investigating people’s preferences in audio quality. According to his results, 210 college students preferred a narrow-band reproduction mode (limited to 90–9000 Hz) compared to the unrestricted frequency range [45]. Had this experiment been undertaken nowadays, it is likely that the results would have been different. Kirk also found that listeners’ preferences can be changed by continued exposure to a particular audio system. The issue of the long-term stability of affective judgments in listening tests requires further studies.
Bias Due to Emotions and Mood
It is important to distinguish between emotion and mood. The former is a specific reaction to a stimulus, whereas the latter is a general “background” feeling. Both emotions and mood can have some effect on affective judgments, and there is some evidence that “happy” people make more positive judgments. For example, Västfjäll and Kleiner [37] investigated the effect of emotion on the perception of sound quality using the annoyance scale and found out that for some listeners the mood biased the results by as much as 40% with respect to the total range of the scale. Moreover, in a more recent experiment Västfjäll observed that listeners who had a positive frame of mind judged the pleasantness of sound significantly higher than people who had a negative attitude. In addition it was found that those listeners who were annoyed evaluated sounds higher on the annoyance scale compared to the listeners in a neutral mood [42]. The magnitude of this effect varied across listeners and ranged from 10% to almost 40% of the total range of the scale. These examples illustrate to what extent affective judgments of audio quality are prone to nonacoustic factors such as mood or emotional state. This is one of the reasons why it is advantageous to use many listeners. Using a large population of listeners, preferably at different times of the day, may help to average this bias out. If a listening panel contains a similar number of listeners with a positive frame of mind compared to those with a negative one, this bias may cancel out.
Situational Context Bias
Food scientists have observed that affective judgments may change depending on the situational context. For example, some food or beverage products are more liked in a restaurant than in a home setting. In other words, the same product may fit one situation and not another. This may imply that for a given sound stimulus, its quality may be evaluated differently depending on the situational context, and hence it might be advisable to conduct situation oriented studies. For example, some levels of audio quality may be unacceptable in a carefully designed listening room but may be tolerable in a kitchen. Although some authors seem to support this hypothesis (see the definition of audio quality proposed by Blauert and Jekosch [36]), there are no direct data available in its support. On the contrary, there is some evidence contradicting this hypothesis. For example, Gros et al. [22] showed that the context of environmental or visual cues has a very weak influence on the audio quality of speech recordings. This is also in accordance with the findings of Beresford et al., who conducted a study investigating contextual effects of listening room and automotive environments on sound quality evaluation [46]. The result of their investigation showed that the listening context had no effect on scores when using the single judgment method. There was some evidence, however, that the null result was due to contraction bias, which will be discussed in Section 4.3. When the experiment was repeated with a multiple-stimulus method, some differences between the environments were found, but they proved to be very small [47].
These are probably only some of the possible sources of error. Can one trust what one experiences in an uncontrolled listening test? No. At the very least there is good reason to be skeptical of such listening tests when they do not agree with measurements.
The analogy between parts of the hi-fi community and the alternative-medicine industry still holds...