Which type of validity is subjective based on judgment?


Validity

Validity is the extent to which an instrument, such as a survey or test, measures what it is intended to measure. Valid measurement underpins the internal validity of a study, that is, how far its findings are correct for the population studied. This is important if the results of a study are to be meaningful and relevant to the wider population. There are four main types of validity:

  • Construct validity
    Construct validity is the extent to which the instrument specifically measures what it is intended to measure, and avoids measuring other things. For example, a measure of intelligence should only assess factors relevant to intelligence and not, for instance, whether someone is a hard worker. Construct validity subsumes the other types of validity.
     
  • Content validity
    Content validity describes whether an instrument is systematically and comprehensively representative of the trait it is measuring. For example, a questionnaire aiming to score anxiety should include questions aimed at a broad range of features of anxiety.
     
  • Face validity
    Face validity is the degree to which a test is subjectively thought to measure what it intends to measure. In other words, does it “look like” it will measure what it should? The subjective opinion for face validity can come from experts, from those administering the instrument, or from those using the instrument.
     
  • Criterion validity
    Criterion validity involves comparing the instrument in question with another criterion which is taken to be representative of the measure. This can take the form of concurrent validity (where the instrument results are correlated with those of an established, or gold standard, instrument), or predictive validity (where the instrument results are correlated with future outcomes, whether they be measured by the same instrument or a different one).
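
As a rough illustration of concurrent validity, the sketch below correlates scores from a new instrument with scores from an established instrument on the same subjects, using Pearson's correlation coefficient. The function name `pearson_r` and both sets of scores are hypothetical, and other correlation measures may suit some data better.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the sums of squared deviations
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cross / (spread_x * spread_y)

# Hypothetical scores for five subjects (illustrative values only)
new_instrument = [10, 12, 15, 18, 20]
gold_standard = [11, 13, 14, 19, 21]

r = pearson_r(new_instrument, gold_standard)
print(round(r, 2))
```

A coefficient close to 1 would suggest that the new instrument ranks subjects in much the same way as the established one. For predictive validity the second list would instead hold outcomes measured at a later date.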

 

Reliability 

Reliability is the overall consistency of a measure. A highly reliable measure produces similar results under similar conditions so, all things being equal, repeated testing should produce similar results. Reliability is also known as reproducibility or repeatability. There are different means for testing the reliability of an instrument:

  • Inter-rater (or inter-observer) reliability
    The degree of agreement between the results when two or more observers administer the instrument on the same subject under the same conditions.
     
  • Intra-rater (or intra-observer) reliability
    Closely related to test-retest reliability, this describes the agreement between results when the instrument is used by the same observer on two or more occasions (under the same conditions and in the same test population).
     
  • Inter-method reliability
    This is the degree to which two or more instruments, that are used to measure the same thing, agree on the result. This is also known as equivalence.
     
  • Internal consistency reliability
    This is the degree of agreement, or consistency, between different parts of a single instrument.
     

Internal consistency can be measured using Cronbach’s alpha (α), a statistic derived from pairwise correlations between items that should produce similar results. Alpha usually lies between zero and one, with values above 0.7 generally deemed acceptable, and a value of one indicating perfect internal consistency. A negative value can occur if the choice of items is poor and there is inconsistency between them, or if the sampling method is faulty. In these cases the items chosen need to be reviewed, possibly along with the sampling methods used.
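
As a rough illustration, the sketch below computes alpha using the variance form commonly used in practice: α = k/(k − 1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the questionnaire scores are hypothetical.

```python
from statistics import variance

def cronbach_alpha(scores):
    """scores: one row per respondent, one column per item."""
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Transpose rows so that each tuple holds all responses to one item
    items = list(zip(*scores))
    item_variance_sum = sum(variance(item) for item in items)
    # Variance of each respondent's total score across all items
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

# Hypothetical responses: five respondents, three items (illustrative only)
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 2))  # about 0.93 for these made-up scores
```

Here the three items rise and fall together across respondents, so alpha is high. Items that moved in opposite directions would push the item variance sum above the total variance and give a negative value.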

Inter-rater reliability can be measured using the Cohen’s kappa (κ) statistic. Kappa indicates how well two sets of (categorical) measurements agree. It is more robust than simple percentage agreement as it accounts for the possibility that a repeated measure agrees by chance. Kappa values range from -1 to 1, where values ≤0 indicate no agreement other than that which would be expected by chance, and 1 is perfect agreement. Values above 0.6 are often taken to represent at least moderate agreement, although the cut-offs differ between published interpretation scales. Limitations of Cohen’s kappa are that it can underestimate agreement for rare outcomes, and that it requires the two raters to be independent.
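
The chance correction can be sketched as follows, using the standard definition κ = (observed agreement − expected agreement) / (1 − expected agreement), where expected agreement is worked out from how often each rater uses each category. The function name `cohen_kappa` and the ratings are hypothetical.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical ratings of the same subjects."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(rater_a)
    # Observed agreement: proportion of subjects rated identically
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's category frequencies
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of ten subjects by two observers (illustrative only)
observer_1 = ["yes"] * 5 + ["no"] * 5
observer_2 = ["yes"] * 4 + ["no"] * 5 + ["yes"]

kappa = cohen_kappa(observer_1, observer_2)
print(round(kappa, 2))  # 0.6, although the raw agreement is 80%
```

In this made-up example the observers agree on 8 of 10 subjects, but half of that agreement would be expected by chance, so kappa is noticeably lower than the percentage agreement.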

 

Generalisability

Generalisability is the extent to which the findings of a study can be applied to other settings. It is also known as external validity. Generalisability requires internal validity as well as a judgement on whether the findings of a study are applicable to a particular group. In making such a judgement, you can consider factors such as the characteristics of the participants (including the demographic and clinical characteristics, as affected by the source population, response rate, inclusion criteria, etc.), the setting of the study, and the interventions or exposures studied. Threats to external validity that may result in an incorrect generalisation include restrictions within the original study (eligibility criteria), and pre-test/post-test effects (where cause-effect relationships within a study are only found when pre-tests or post-tests are also carried out).

What type of validity is subjective?

The assessment of content validity is subjective. That is, content validity relies on people's judgments of the extent to which an item or measure covers the intended content. Two methods for assessing content validity exist: face validity and logical validity.

Which of the following types of validity are measured based on subjective judgment?

Content validity. Content validity is whether or not the measure used in the research covers all of the content in the underlying construct (the thing you are trying to measure). This is also a subjective measure, but unlike face validity we ask whether the content of a measure covers the full domain of the content.

Which types of validity rely on judgments?

The type of validity that relies on subjective judgments and empirical data (i.e., data based on observations) is construct validity.

Is the criterion validity subjective?

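No. Criterion validity is assessed empirically, by correlating the instrument's results with an established criterion or with future outcomes. The subjective criterion, which usually requires a human researcher to examine the content of the data to assess whether on its “face” it appears to be related to what the researcher intends to measure, is face validity.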