In psychometrics, content validity (also called logical validity) refers to the extent to which a measurement instrument represents all facets of the psychological trait it is intended to measure. For example, if a depression scale assesses only the emotional aspects of depression and ignores the behavioral aspects, the scale may lack content validity. Because the definition of any given trait (e.g., extraversion) involves some degree of subjectivity, reaching consensus among experts is key to establishing content validity.
Content validity differs from face validity, which concerns whether a test merely appears valid on the surface, regardless of what it actually measures.
Assessing content validity usually requires subject matter experts to judge whether the test items cover the defined content domain, and it may be supplemented with more rigorous statistical analysis. This process is particularly important in academic and vocational testing, where the items must reflect the knowledge or skills the test is meant to certify.
In clinical settings, content validity refers to the correspondence between test items and the symptom content of a disorder. Ensuring the validity of such a test requires a careful analysis of whether the items capture the full range of relevant symptoms.
A widely used index of content validity was proposed by C. H. Lawshe; it quantifies the agreement among raters or judges about how essential a particular item is.
In a 1975 article, Lawshe proposed asking each subject matter expert to rate each item on whether the skill or knowledge it measures is "essential," "useful, but not essential," or "not necessary" to the performance of the job. According to Lawshe, if more than half of the panelists indicate that an item is essential, that item has at least some content validity.
Based on these assumptions, Lawshe developed a formula called the content validity ratio (CVR) to quantify content validity.
The formula is: CVR = (n_e - N/2) / (N/2), where CVR is the content validity ratio, n_e is the number of panelists who rated the item "essential," and N is the total number of panelists. The ratio ranges from -1 to +1; a value of 0 means exactly half of the panelists rated the item essential, and a positive value means more than half did.
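As an illustration, the calculation can be written as a short function. This is a minimal sketch; the names content_validity_ratio, n_essential, and n_experts are chosen here for readability and are not part of Lawshe's article.

```python
def content_validity_ratio(n_essential: int, n_experts: int) -> float:
    """Compute Lawshe's content validity ratio (CVR) for one item.

    n_essential: number of panelists who rated the item "essential"
    n_experts:   total number of panelists

    Returns a value between -1 and +1; CVR > 0 means more than half
    of the panel rated the item essential.
    """
    return (n_essential - n_experts / 2) / (n_experts / 2)


# Example: 7 of 10 experts rate an item essential -> (7 - 5) / 5 = 0.4
print(content_validity_ratio(7, 10))
```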
The behavior of the index, however, is not uniform: as the number of panelists changes, some unexpected mathematical anomalies can appear. For example, in the table of critical values published with Lawshe's article, the entry for a panel of eight experts does not fit the pattern of the surrounding values, which has attracted the attention of subsequent scholars.
Researchers who re-examined the model found that Lawshe and Schipper's table was labeled as a one-tailed test when its values actually correspond to a normal approximation of a two-tailed test.
As Wilson, Pan, and Schumsky point out in their study, a recalculated table of critical content validity ratio values better reflects the intended test and supplies critical values at several significance levels. This revision not only makes the assessment of content validity more precise, but also gives future researchers a more solid basis for designing tests.
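To make the idea of a critical value concrete, the sketch below derives a minimum significant CVR for a given panel size from an exact one-tailed binomial test, under the null hypothesis that each expert independently rates an item "essential" with probability 0.5. This is only one way to construct such a table, not necessarily the procedure used by Lawshe, Schipper, or Wilson, Pan, and Schumsky, and it assumes SciPy is available.

```python
from scipy.stats import binom


def critical_cvr(n_experts: int, alpha: float = 0.05) -> float:
    """Smallest CVR that is significant under an exact one-tailed binomial test.

    Null hypothesis: each expert rates the item "essential" with probability 0.5.
    We find the smallest count of "essential" ratings whose upper-tail probability
    is at most alpha, then convert that count into a CVR value.
    (Illustrative only; published tables may rely on a normal approximation.)
    """
    for n_essential in range(n_experts // 2 + 1, n_experts + 1):
        # P(X >= n_essential) under Binomial(n_experts, 0.5)
        p_upper = binom.sf(n_essential - 1, n_experts, 0.5)
        if p_upper <= alpha:
            return (n_essential - n_experts / 2) / (n_experts / 2)
    return 1.0  # unanimous agreement is required for very small panels


for n in (5, 8, 10, 15, 20):
    print(n, round(critical_cvr(n), 2))
```

Running the loop for several panel sizes shows how the required level of agreement falls as the panel grows, which is the pattern a corrected critical value table is meant to capture.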
As we work to improve content validity in practice, it is crucial to understand what each test item is meant to measure. In psychology and related fields in particular, a sound test does not stop at surface-level face validity; it ensures that every facet of the targeted psychological trait is represented.
Content validity is therefore not just a testing standard; it concerns how we correctly understand and measure the psychological phenomena we study. How its principles can be applied appropriately across different settings remains a question worth exploring in future psychometric research.