Psychological Test and Assessment Modeling

Published under Creative Commons: CC-BY-NC Licence



Contents, Volume 51, 2009, Issue 2

HERBERT POINSTINGL
The Linear Logistic Test Model (LLTM) as the methodological foundation of item generating rules for a new verbal reasoning test
Abstract | PDF of the full article

TRACEY PLATT, RENÉ PROYER & WILLIBALD RUCH
Gelotophobia and bullying: The assessment of the fear of being laughed at and its application among bullying victims
Abstract | PDF of the full article

JEANNE A. TERESI, KATJA OCEPEK-WELIKSON, MARJORIE KLEINMAN, JOSEPH P. EIMICKE, PAUL K. CRANE, RICHARD N. JONES, JIN-SHEI LAI, SEUNG W. CHOI, RON D. HAYS, BRYCE B. REEVE, STEVEN P. REISE, PAUL A. PILKONIS & DAVID CELLA
Analysis of differential item functioning in the depression item bank from the Patient Reported Outcome Measurement Information System (PROMIS): An item response theory approach
Abstract | PDF of the full article

HARTMANN H. SCHEIBLECHNER
Rasch and pseudo-Rasch models: suitableness for practical test applications
Abstract | PDF of the full article

PETER H. SCHÖNEMANN & MORITZ HEENE
Predictive validities: figures of merit or veils of deception?
Abstract | PDF of the full article

REGINA DITTRICH & REINHOLD HATZINGER
Fitting loglinear Bradley-Terry models (LLBT) for paired comparisons using the R package prefmod
Abstract | PDF of the full article

 


The Linear Logistic Test Model (LLTM) as the methodological foundation of item generating rules for a new verbal reasoning test
HERBERT POINSTINGL

Abstract
Based on the demand for new verbal reasoning tests to enrich the psychological test inventory, a pilot version of a new test was analysed: the 'Family Relation Reasoning Test' (FRRT; Poinstingl, Kubinger, Skoda & Schechtner, forthcoming), in which several basic cognitive operations (logical rules) are embedded. Given family relationships of varying complexity embedded in short stories, testees had to infer the correct relationship between two individuals within a family. Using empirical data, the construct validity of the test was examined with the linear logistic test model (LLTM; Fischer, 1972), a special case of the Rasch model: the hypothetically assumed basic cognitive operations had to explain the Rasch model's item difficulty parameters. After being specified in the LLTM's matrix of weights ((q_ij)), none of these operations could be corroborated by means of Andersen's Likelihood Ratio Test.
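In the LLTM, each Rasch item difficulty is decomposed into a weighted sum of basic parameters, beta_i = sum_j q_ij * eta_j + c, where q_ij states how often cognitive operation j occurs in item i. The following R sketch shows how such a decomposition can be checked with the eRm package; the simulated data, the number of items and the weight matrix are invented placeholders, not the FRRT material.

library(eRm)
set.seed(42)

# Placeholder data: 300 persons, 6 dichotomous items simulated from a Rasch
# model, plus a hypothetical weight matrix W (items x operations) stating how
# often each of two cognitive operations occurs in each item.
X <- sim.rasch(persons = rnorm(300), items = seq(-1.5, 1.5, length.out = 6))
W <- matrix(c(1, 0,
              2, 0,
              1, 1,
              0, 1,
              2, 1,
              0, 2), ncol = 2, byrow = TRUE)

rasch_fit <- RM(X)        # unrestricted Rasch model
lltm_fit  <- LLTM(X, W)   # item difficulties restricted to the weighted operations

# Conditional likelihood-ratio comparison of the restricted LLTM against the
# Rasch model: a small p-value means the hypothesised operations do not
# explain the item difficulty parameters.
lr_stat <- -2 * (lltm_fit$loglik - rasch_fit$loglik)
df      <- length(rasch_fit$etapar) - length(lltm_fit$etapar)
pchisq(lr_stat, df, lower.tail = FALSE)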

Key words: Rasch model; LLTM; item generating rules; reasoning test


Herbert Poinstingl, PhD
University of Vienna
Faculty of Psychology
Department of Developmental Psychology and Psychological Assessment
Liebiggasse 5
A-1010 Vienna, Austria
E-Mail: herbert.poinstingl@univie.ac.at



Gelotophobia and bullying: The assessment of the fear of being laughed at and its application among bullying victims
TRACEY PLATT, RENÉ T. PROYER & WILLIBALD RUCH

Abstract
Within the framework of social interaction, this paper relates experiences of being bullied to the fear of being laughed at (gelotophobia) in two empirical studies. Study 1 (N = 252) describes the adaptation into English of a German-language instrument for the assessment of gelotophobia (the GELOPH<15>). The English version showed good psychometric properties (high reliability, α = .90), and the one-factor solution of the original version could be replicated. Gelotophobia existed independently of age and gender but was more prevalent among those who were single; 13% of respondents exceeded the cut-off score indicating a slight expression of gelotophobic symptoms. Study 2 (N = 102) used the English GELOPH<15> together with an instrument assessing emotional reactions to mean-spirited ridicule and good-natured teasing situations (the Ridicule Teasing Scenario questionnaire; Platt, 2008). Results indicated that victims of bullying showed higher shame responses to teasing scenarios, and lower happiness and higher fear in response to both types of laughter situations. Stepwise multiple regression showed that self-reported experiences of having been a victim of bullying were best predicted by low happiness during teasing and high fear in response to ridicule, but gelotophobia accounted for most of these effects. Results are discussed in the context of future studies on the relationship between gelotophobia and bullying.

Key words: bullying; gelotophobia; humour; test adaptation


Tracey Platt, MSc
University of Hull, UK (now at the University of Zurich)
Department of Psychology
Personality and Assessment
Binzmühlestrasse 14/7
8050 Zürich, Switzerland
E-Mail: tracey.platt@psychologie.uzh.ch



Analysis of differential item functioning in the depression item bank from the Patient Reported Outcome Measurement Information System (PROMIS): An item response theory approach
JEANNE A. TERESI, KATJA OCEPEK-WELIKSON, MARJORIE KLEINMAN, JOSEPH P. EIMICKE, PAUL K. CRANE, RICHARD N. JONES, JIN-SHEI LAI, SEUNG W. CHOI, RON D. HAYS, BRYCE B. REEVE, STEVEN P. REISE, PAUL A. PILKONIS & DAVID CELLA

Abstract
The aims of this paper are to present findings related to differential item functioning (DIF) in the Patient Reported Outcome Measurement Information System (PROMIS) depression item bank, and to discuss potential threats to the validity of results from studies of DIF. The 32 depression items studied were modified from several widely used instruments. DIF analyses with respect to gender, age and education were performed using a sample of 735 individuals recruited by a survey polling firm. DIF hypotheses were generated by asking content experts to indicate whether they expected DIF to be present, and the direction of the DIF with respect to the studied comparison groups. Primary analyses were conducted using the graded item response model (for polytomous, ordered response category data) with likelihood ratio tests of DIF, accompanied by magnitude measures. Sensitivity analyses were performed using other item response models and approaches to DIF detection. Despite some caveats, the items recommended for exclusion or separate calibration were "I felt like crying" and "I had trouble enjoying things that I used to enjoy." The item "I felt I had no energy" was also flagged as showing DIF and recommended for additional review. On the one hand, false DIF detection (Type I error) was controlled to the extent possible by ensuring model fit and purification. On the other hand, power for DIF detection might have been compromised by several factors, including sparse data and small sample sizes. Nonetheless, practical and not just statistical significance should be considered; in this case the overall magnitude and impact of DIF was small for the groups studied, although the impact was relatively large for some individuals.
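For readers who want to experiment with this type of analysis, the sketch below shows one way to run IRT likelihood-ratio DIF tests under a graded response model using the R package mirt; it is not the authors' code, and the simulated item parameters, sample sizes and group labels are placeholders only.

library(mirt)
set.seed(1)

# Simulate 8 four-category items for two groups of 400 respondents each
# (no DIF built in; a real analysis would use the observed item responses).
a   <- matrix(rep(1.5, 8))                        # discrimination parameters
d   <- matrix(rep(c(2, 0.5, -1), each = 8), 8)    # category intercepts
grp <- rep(c("group1", "group2"), each = 400)
dat <- rbind(simdata(a, d, 400, itemtype = "graded"),
             simdata(a, d, 400, itemtype = "graded"))

# Baseline: graded response model with all item parameters constrained equal
# across groups; latent means and variances of the focal group are free.
mod <- multipleGroup(dat, 1, group = grp, itemtype = "graded",
                     invariance = c("slopes", "intercepts",
                                    "free_means", "free_var"))

# Likelihood-ratio DIF tests: free each item's slope and intercepts in turn
# and compare against the constrained baseline model.
DIF(mod, which.par = c("a1", "d1", "d2", "d3"), scheme = "drop")

The actual PROMIS analyses additionally reported DIF magnitude and impact measures, which this sketch omits.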

Key words: patient reported outcomes measurement information system; item response theory; differential item functioning; depression


Jeanne A. Teresi, EdD, PhD
Research Division, HHAR
5901 Palisade Avenue
Riverdale, New York 10471, USA
E-Mail: teresimeas@aol.com



Rasch and pseudo-Rasch models: suitableness for practical test applications
HARTMANN H. SCHEIBLECHNER

Abstract
The Rasch model has been suggested for psychological test data (subjects × items) for various scales of measurement. It is defined to be specifically objective. If the data are dichotomous, the use of the dichotomous Rasch model for psychological test construction is almost inevitable. The two- and three-parameter logistic models of Birnbaum and further models with additional parameters are not always identifiable. The linear logistic model is useful for the construction of item pools. For polytomous graded response data, there are useful models (Samejima, 1969; Tutz, 1990; and again by Rasch, cf. Fischer, 1974, or Kubinger, 1989) which, however, are not specifically objective. The partial credit model (Masters, 1982) is not meaningful in a measurement theory sense. For polytomous nominal data, the multicategorical Rasch model is much too rarely applied. There are limited possibilities for locally dependent data. The mixed Rasch model is not a true Rasch model, but useful for model controls and heuristic purposes. The models for frequency data and continuous data are not discussed here. The nonparametric ISOP models are "sample independent" (ordinally specifically objective) models for (up to 3 dependent) graded responses providing ordinal scales or interval scales for subject-, item- and response-scale parameters. The true achievement of sample-independent Rasch models is an extraordinary generalizability of psychological assessment procedures.
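For reference (a standard textbook formulation, not part of the abstract), the dichotomous Rasch model gives the probability of a correct response of person v to item i as

P(X_{vi} = 1 \mid \theta_v, \beta_i) = \frac{\exp(\theta_v - \beta_i)}{1 + \exp(\theta_v - \beta_i)},

and specific objectivity means that comparisons within one set of parameters are free of the other set; e.g., conditional on exactly one of two items being solved,

P(X_{vi} = 1 \mid X_{vi} + X_{vj} = 1) = \frac{\exp(\beta_j - \beta_i)}{1 + \exp(\beta_j - \beta_i)},

which no longer contains the person parameter \theta_v.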

Key words: specific objectivity; measurement structures; graded responses; local dependence; generalized assessment procedures


Hartmann Scheiblechner, PhD
Philipps-Universität Marburg
Biegenstraße 10/12
35037 Marburg, Germany
E-Mail: scheible@mailer.uni-marburg.de



Predictive validities: figures of merit or veils of deception?
PETER H. SCHÖNEMANN & MORITZ HEENE

Abstract
The ETS has recently released new estimates of the validity of the GRE for predicting cumulative graduate GPA. These estimates average in the mid-thirties, twice as high as those previously reported by a number of independent investigators.
The first part of this paper shows that this unexpected finding can be traced to a flawed methodology that tends to inflate multiple correlation estimates, especially when population values are near zero.
Secondly, the issue of upward corrections of validity estimates for restriction of range is taken up. It is shown that such corrections depend on assumptions that are rarely met by the data.
Finally, it is argued more generally that conventional test theory, which is couched in terms of correlations and variances, is not only unnecessarily abstract but, more importantly, incomplete, since the practical utility of a test depends not only on its validity but also on base rates and admission quotas. A more direct and conclusive method for gauging the utility of a test involves misclassification rates, and dispenses entirely with questionable assumptions and post-hoc "corrections".
On applying this approach to the GRE, it emerges (1) that the GRE discriminates against ethnic and economic minorities, and (2) that it often produces more erroneous decisions than a purely random admissions policy would.
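To make the base-rate and admission-quota argument concrete, here is a small simulation sketch in R (the numbers, the bivariate-normal assumption and the variable names are illustrative choices, not the authors' analysis): it compares the misclassification rate of test-based selection with that of purely random admission for a given validity, base rate and admission quota.

set.seed(1)
r <- 0.35   # predictive validity (correlation between test score and criterion)
b <- 0.50   # base rate of success
q <- 0.20   # admission quota
N <- 1e6    # number of simulated applicants

x <- rnorm(N)                              # test score
y <- r * x + sqrt(1 - r^2) * rnorm(N)      # criterion, cor(x, y) = r
success  <- y > qnorm(1 - b)               # top b fraction succeed
admitted <- x > qnorm(1 - q)               # top q fraction admitted

miss_test   <- mean(admitted != success)   # false admits plus false rejects
miss_random <- q * (1 - b) + (1 - q) * b   # expected rate under random admission
c(test = miss_test, random = miss_random)

Varying r, b and q in this sketch shows how the practical value of a given validity depends on the base rate and the admission quota, which is the comparison the abstract argues should replace correlation-based figures of merit.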

Key words: validity; test theory; test construction; IQ tests; test bias; classification errors


Moritz Heene, PhD
Senior Lecturer
Ludwig Maximilian University of Munich
Department of Psychology
Unit Methodology and Psychological Assessment
Leopoldstr. 13
80802 Munich, Germany
E-Mail: heene@psy.lmu.de



Fitting loglinear Bradley-Terry models (LLBT) for paired comparisons using the R package prefmod
REGINA DITTRICH & REINHOLD HATZINGER

Abstract
This paper introduces the R package prefmod (Hatzinger, 2009), which allows the user to fit various models to paired comparison data. These models yield estimated overall rankings of objects (items) from data in which each subject (respondent or judge) makes one or more comparisons between pairs of objects. The focus is on the loglinear Bradley-Terry (LLBT) model, a loglinear formulation of the Bradley-Terry(-Luce) model; both assume independence between comparisons. Five types of data structures are covered: (i) simple paired comparisons, (ii) paired comparisons including an undecided category, (iii) categorical subject covariates (for estimating different overall rankings for different subject groups), (iv) object covariates for reparameterizing objects, and (v) order (position) effects. Additionally, the discussion briefly addresses other response formats such as ratings and (partial) rankings.
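The loglinear formulation can be illustrated with a hand-rolled toy example in base R (the objects and counts below are invented; the prefmod package described in the paper constructs such designs, and far more general ones, automatically): each pair of objects contributes two Poisson cells, and the object parameters enter with opposite signs depending on which object was preferred.

# Aggregated preference counts for three hypothetical objects A, B, C;
# in each pair, +1 marks the preferred object and -1 the rejected one.
llbt_dat <- data.frame(
  pair  = factor(c("AB", "AB", "AC", "AC", "BC", "BC")),
  count = c(6, 4, 7, 3, 5, 5),
  A     = c(+1, -1, +1, -1,  0,  0),
  B     = c(-1, +1,  0,  0, +1, -1),
  C     = c( 0,  0, -1, +1, -1, +1)
)

# Loglinear Bradley-Terry model: log(mu) = pair nuisance parameter
# + lambda_A * A + lambda_B * B, with object C as reference (lambda_C = 0).
fit <- glm(count ~ pair + A + B, family = poisson, data = llbt_dat)

# Worth (preference scale) parameters, proportional to exp(2 * lambda):
lambda <- c(A = unname(coef(fit)["A"]), B = unname(coef(fit)["B"]), C = 0)
worth  <- exp(2 * lambda) / sum(exp(2 * lambda))
worth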

Key words: Bradley-Terry(-Luce) model; loglinear Bradley-Terry model; prefmod; paired comparison; preference scale


Reinhold Hatzinger, PhD
Department of Statistics and Mathematics
WU (Vienna University of Economics and Business)
Augasse 2-6
1090 Vienna, Austria
E-Mail: reinhold.hatzinger@wu-wien.ac.at



 
