Authors: Rubio-Aparicio, M.; Núñez-Núñez, R. M.; Sánchez-Meca, J.; López-Pina, J.; Marín-Martínez, F.; Lopez-Lopez, J. A. · Research
How Reliable is the Padua Inventory-Washington State University Revision for Assessing Obsessive-Compulsive Symptoms?
A meta-analysis examining the reliability of a commonly used obsessive-compulsive disorder assessment scale across different populations and contexts.
Source: Rubio-Aparicio, M., Núñez-Núñez, R. M., Sánchez-Meca, J., López-Pina, J., Marín-Martínez, F., & Lopez-Lopez, J. A. (2018). The Padua Inventory-Washington State University Revision of Obsessions and Compulsions: A Reliability Generalization Meta-analysis. Journal of Personality Assessment. Advance online publication. https://doi.org/10.1080/00223891.2018.1483378
What you need to know
- The Padua Inventory-Washington State University Revision (PI-WSUR) is a commonly used questionnaire to assess obsessive-compulsive symptoms.
- This study combined data from many previous studies to evaluate how reliable the PI-WSUR is across different populations and contexts.
- The PI-WSUR total score showed excellent reliability overall, but reliability varied depending on factors like sample characteristics.
- Researchers should report reliability estimates when using the PI-WSUR rather than assuming reliability from previous studies.
Background on the PI-WSUR
The Padua Inventory-Washington State University Revision (PI-WSUR) is a questionnaire used to measure symptoms of obsessive-compulsive disorder (OCD). It consists of 39 items that ask about different obsessive and compulsive symptoms. The PI-WSUR produces a total score as well as scores on five subscales:
- Obsessive thoughts about harm to self/others
- Obsessive impulses to harm self/others
- Contamination obsessions and washing compulsions
- Checking compulsions
- Dressing/grooming compulsions
The PI-WSUR is frequently used both for research purposes and in clinical settings to screen for OCD symptoms or assess symptom severity. Given how widely it is used, it’s important to understand how reliable and consistent the scores are across different contexts.
What is reliability and why does it matter?
Reliability refers to how consistent and stable the scores from a questionnaire or test are. A highly reliable measure will produce similar scores if given to the same person multiple times, assuming their true level of symptoms hasn’t changed. Reliability is important because unreliable scores can lead to inaccurate conclusions in research or misdiagnosis in clinical practice.
There are different types of reliability. This study looked at:
Internal consistency reliability - how well the different items on the questionnaire relate to each other and measure the same underlying construct. This is typically measured by a statistic called Cronbach’s alpha.
Test-retest reliability - how consistent scores are when the same person takes the questionnaire at two different time points. This is measured by the correlation between scores at Time 1 and Time 2.
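As a rough illustration of how these two statistics are computed, the following Python sketch implements the standard Cronbach’s alpha formula and a Pearson correlation for test-retest data. The function names and the small data set are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])                      # number of items
    item_columns = list(zip(*responses))       # regroup scores item by item
    item_variances = [variance(col) for col in item_columns]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def retest_correlation(time1, time2):
    """Pearson correlation between total scores at two time points."""
    m1, m2 = mean(time1), mean(time2)
    cov = sum((a - m1) * (b - m2) for a, b in zip(time1, time2))
    ss1 = sum((a - m1) ** 2 for a in time1)
    ss2 = sum((b - m2) ** 2 for b in time2)
    return cov / math.sqrt(ss1 * ss2)


# Hypothetical data: four respondents answering three items (0-4 scale).
responses = [[1, 2, 1], [2, 3, 3], [4, 4, 3], [0, 1, 1]]
print(f"alpha = {cronbach_alpha(responses):.3f}")

# Hypothetical total scores for the same four people on two occasions.
print(f"test-retest r = {retest_correlation([4, 8, 11, 2], [5, 7, 12, 3]):.3f}")
```

A real analysis would use all 39 items and a far larger sample; the point here is only that alpha depends on the item variances relative to the total-score variance, while test-retest reliability is a correlation between two administrations.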
Higher reliability values indicate more consistent, dependable scores. Generally, values above 0.70 are considered acceptable, values above 0.80 good, and values above 0.90 excellent.
How the researchers assessed PI-WSUR reliability
The researchers conducted what’s called a reliability generalization meta-analysis. This involved:
Searching for all published and unpublished studies that had used the PI-WSUR and reported reliability information.
Extracting the reliability values (Cronbach’s alpha and test-retest correlations) from each study, along with information about the study characteristics.
Statistically combining and analyzing all the reliability estimates to determine:
- The average reliability of the PI-WSUR
- How much reliability varied across studies
- What factors were associated with higher or lower reliability
This method allows researchers to get a comprehensive picture of a questionnaire’s reliability across many different samples and contexts.
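The combining step can be sketched with a generic random-effects model. The Python code below uses the DerSimonian-Laird estimator, which is one common choice and not necessarily the model the authors fitted. Published reliability generalization analyses often transform the coefficients before pooling; that step is omitted here for simplicity. All names and input values are hypothetical.

```python
import math


def pool_random_effects(estimates, variances):
    """Pool study-level estimates with a DerSimonian-Laird random-effects model.

    estimates: one reliability estimate per study
    variances: the sampling variance of each estimate
    """
    k = len(estimates)
    w = [1.0 / v for v in variances]                 # inverse-variance weights
    sum_w = sum(w)
    fixed_mean = sum(wi * y for wi, y in zip(w, estimates)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (y - fixed_mean) ** 2 for wi, y in zip(w, estimates))

    # Between-study variance (tau^2), truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each study's sampling variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * y for wi, y in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))

    # I^2: percentage of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    return {"pooled": pooled, "se": se, "tau2": tau2, "Q": q, "I2": i2}


# Hypothetical alphas and sampling variances from five imaginary studies.
result = pool_random_effects(
    [0.91, 0.94, 0.89, 0.95, 0.92],
    [0.0004, 0.0002, 0.0006, 0.0001, 0.0003],
)
print(result)
```

The pooled value corresponds to the average reliability, while Q, tau2, and I2 describe how much reliability varies across studies. Examining which study characteristics explain that variation would require an additional moderator analysis, which this sketch does not cover.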
Key findings on PI-WSUR reliability
Internal consistency reliability
- The average Cronbach’s alpha for the PI-WSUR total score was 0.929, indicating excellent internal consistency overall.
- For the subscales, average alphas ranged from 0.792 to 0.900, indicating acceptable to good internal consistency.
- However, there was significant variability in alpha values across studies.
Test-retest reliability
- The average test-retest reliability for the total score was 0.767, indicating acceptable stability over time.
- For subscales, average test-retest correlations ranged from 0.540 to 0.790.
- There were very few studies reporting test-retest reliability, so these estimates should be interpreted cautiously.
Factors affecting reliability
Several factors were associated with higher or lower reliability:
Studies with more variable PI-WSUR scores (larger standard deviations) tended to have higher reliability. This makes sense statistically: when participants differ more from one another, measurement error accounts for a smaller share of the total score variance, so reliability coefficients come out higher.
The original English version of the PI-WSUR showed slightly higher reliability than translated versions.
Studies conducted in North America tended to have higher reliability than those in Europe.
For the contamination/washing subscale, studies with more clinical participants (vs. general population) showed higher reliability.
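The link between score variability and reliability can be illustrated with the classical test theory definition of reliability as the share of observed-score variance that is true-score variance. The Python sketch below assumes a fixed amount of measurement error and varies how much participants truly differ; the function name and all numbers are hypothetical and are not taken from the study.

```python
def ctt_reliability(true_sd, error_sd):
    """Reliability under classical test theory: true variance / observed variance."""
    true_var = true_sd ** 2
    error_var = error_sd ** 2
    return true_var / (true_var + error_var)


# Hold measurement error constant and widen the spread of true scores.
ERROR_SD = 5.0  # hypothetical error standard deviation
for true_sd in (5.0, 10.0, 15.0):
    rel = ctt_reliability(true_sd, ERROR_SD)
    print(f"true SD = {true_sd:4.1f}  ->  reliability = {rel:.2f}")
```

With the error held at a standard deviation of 5, reliability rises from 0.50 to 0.80 to 0.90 as the true-score standard deviation grows from 5 to 10 to 15. The same questionnaire can therefore yield different reliability coefficients in samples that are more or less heterogeneous.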
Implications for using the PI-WSUR
Overall, this study supports the PI-WSUR as a reliable measure of obsessive-compulsive symptoms, particularly for the total score. However, the variability in reliability across studies has some important implications:
Researchers and clinicians should not assume the PI-WSUR will be equally reliable in every context. It’s important to calculate and report reliability statistics when using the questionnaire, rather than relying on previously published values.
Extra caution may be warranted when using translated versions of the PI-WSUR or when using it in cultural contexts very different from where it was developed.
The subscales, particularly the shorter ones, may be less reliable than the total score. They should be interpreted more cautiously, especially in non-clinical samples.
More research on test-retest reliability of the PI-WSUR is needed, as very few studies have examined this important aspect of reliability.
Conclusions
- The PI-WSUR total score demonstrates excellent internal consistency reliability overall.
- Reliability varies depending on factors like score variability, language version, and study location.
- Researchers should report reliability estimates with their own data rather than assuming reliability based on previous studies.
- More research is needed on test-retest reliability of the PI-WSUR.