Psychological Test and Assessment Modeling


Published under Creative Commons: CC-BY-NC Licence


2024-1

CONTENTS

 

Disentangling fluid and crystallized intelligence by means of Bayesian structural equation modeling and correlation-preserving mean plausible values
André Beauducel, Richard Bruntsch, Martin Kersting
DOI: https://doi.org/10.2440/001-0010

 

Examining the viability of the Continuous Matching Task in mobile assessment compared to laboratory testing
Johann-Christoph Münscher
DOI: https://doi.org/10.2440/001-0011

 

The impact of filtering out rapid-guessing examinees on PISA 2015 country rankings
Michalis P. Michaelides, Militsa G. Ivanova, Demetris Avraam
DOI: https://doi.org/10.2440/001-0012

 

Unraveling Performance on Multiple-Choice and Free-Response University Exams: A Multilevel Analysis of Study Time, Lecture Attendance and Personality Traits
Tuulia M. Ortner, Verena Keneder, Sonja Breuer, Freya M. Gruber, Thomas Scherndl
DOI: https://doi.org/10.2440/001-0013

 

+++ to be continued +++

 

 


Disentangling fluid and crystallized intelligence by means of Bayesian structural equation modeling and correlation-preserving mean plausible values

André Beauducel, Richard Bruntsch, Martin Kersting


Abstract:
The present study reports Bayesian confirmatory factor analyses of data from an extensive computer-based intelligence test battery used in the applied field of assessment in Switzerland. Bayesian confirmatory factor analysis makes it possible to constrain the variability and distribution of model parameters according to theoretical expectations by means of priors. Posterior distributions of the model parameters are then obtained by means of a Bayesian estimation procedure. A large sample of 4,677 participants completed the test battery comprising 21 different tasks. Factors for crystallized intelligence, fluid intelligence, memory, and basic skills/clerical speed were obtained. The latter factor differs from the speed factors of several other tests in that it encompasses speeded performance on moderately complex tasks. Three types of models were compared: in the first type, only the expected salient loadings were freely estimated and all cross-loadings were fixed to zero (i.e., independent clusters), whereas in the other two types normally distributed priors with a zero mean were defined for the cross-loadings. The latter two types differed in the amount of prior variance allowed for the cross-loadings. Results show that defining substantial prior variances for the cross-loadings in Bayesian confirmatory factor analysis makes it possible to overcome limitations of the independent clusters model. In order to estimate individual scores for the factors, mean plausible values were computed. However, the intercorrelations of the mean plausible values substantially overestimated the true correlations of the factors. To improve the discriminant validity of the individual score estimates, it was therefore proposed to compute correlation-preserving mean plausible values. The findings can be applied to derive estimates for factorial scoring of a test battery, especially when cross-loadings of subtests are to be expected.
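
For illustration, the following Python sketch shows one way the correlation-preserving idea could be realized: mean plausible values are linearly rescaled so that their intercorrelations match a target factor correlation matrix (e.g., the posterior factor correlations from the Bayesian model). The function name, the simulated data, and the target correlation of .50 are hypothetical and are not taken from the article.

# Minimal sketch: rescale factor score estimates so that their correlation
# matrix equals a given target correlation matrix. Illustrative only, not the
# authors' implementation.
import numpy as np

def correlation_preserving_scores(scores: np.ndarray, target_corr: np.ndarray) -> np.ndarray:
    """scores      : (n_persons, n_factors) mean plausible values
       target_corr : (n_factors, n_factors) factor correlations to preserve"""
    # Standardize columns (z-scores).
    z = (scores - scores.mean(axis=0)) / scores.std(axis=0, ddof=1)
    # Empirical correlation of the (typically over-correlated) score estimates.
    r_emp = np.corrcoef(z, rowvar=False)
    # Cholesky factors of the empirical and the target correlation matrices.
    l_emp = np.linalg.cholesky(r_emp)
    l_tgt = np.linalg.cholesky(target_corr)
    # Decorrelate with the empirical factor, re-correlate with the target factor.
    return z @ np.linalg.inv(l_emp).T @ l_tgt.T

# Example with made-up numbers: two factors whose score estimates correlate .80
# although the intended factor correlation is .50.
rng = np.random.default_rng(0)
raw = rng.multivariate_normal([0, 0], [[1.0, 0.8], [0.8, 1.0]], size=1000)
adjusted = correlation_preserving_scores(raw, np.array([[1.0, 0.5], [0.5, 1.0]]))
print(np.corrcoef(adjusted, rowvar=False).round(2))  # ~[[1.0, 0.5], [0.5, 1.0]]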

 

Keywords: Fluid intelligence, crystallized intelligence, Bayesian confirmatory factor analysis, mean plausible values

Correspondence:
André Beauducel, University of Bonn, Department of Psychology, Kaiser-Karl-Ring 9, 53111 Bonn, Germany, email: beauducel@uni-bonn.de

 


 

Examining the viability of the Continuous Matching Task in mobile assessment compared to laboratory testing

Johann-Christoph Münscher


Abstract:
Two measures of attention, the Continuous Matching Task (CMT, measuring alertness) and the Stroop task (measuring selective attention), were administered under two conditions: in the laboratory using a standardized apparatus, and in mobile measurements using participants' smart devices. Both are cognitive performance tasks reliant on processing speed. In past research, implementing this type of measurement on mobile devices was called into question and its psychometric quality was assumed to be low. The present study aims to evaluate whether the CMT can yield equivalent results in guided laboratory testing and self-administered mobile measurement. The Stroop task results are evaluated in the same way, and the results of the two tasks are compared. Both tasks were implemented identically in the two conditions, with only slight modifications to the input methods. Comparing and analyzing the results revealed that the CMT is not consistent across conditions and is prone to age effects on mobile devices; consequently, it is largely unsuited for mobile assessment. The Stroop task showed more consistent measurements, although characteristic shortcomings were also observed. Generally, mobile assessment using response-time-based measurements appears to be problematic when tasks are more technically demanding.
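
As a purely illustrative aside, a cross-condition consistency check of the kind the abstract alludes to could be sketched in Python as follows; the data frame and column names are assumptions, not the study's analysis plan.

# Hypothetical sketch: correlate each participant's laboratory score with their
# mobile score and compare mean performance between the two conditions.
import pandas as pd
from scipy import stats

def condition_consistency(scores: pd.DataFrame) -> dict:
    """scores: one row per participant with columns 'lab_rt' and 'mobile_rt'
    (mean reaction times per condition)."""
    r, p_r = stats.pearsonr(scores["lab_rt"], scores["mobile_rt"])
    t, p_t = stats.ttest_rel(scores["lab_rt"], scores["mobile_rt"])
    return {"cross_condition_r": r, "r_pvalue": p_r,
            "paired_t": t, "t_pvalue": p_t}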

Keywords: Mobile Assessment, Continuous Performance Tasks, Computer Applications, Attention, Reaction Times

Correspondence:
Johann-Christoph Münscher
orcid.org/0000-0002-8434-7970
Department of Aviation and Space Psychology, German Aerospace Center (DLR) Institute of Aerospace Medicine.
E-mail: Johann-Christoph.Muenscher@dlr.de


 

The impact of filtering out rapid-guessing examinees on PISA 2015 country rankings

Michalis P. Michaelides, Militsa G. Ivanova, Demetris Avraam

Abstract:
International large-scale assessments are low-stakes tests for examinees, and their motivation to perform at their best may not be high. These programs have therefore been criticized as invalid for accurately depicting individual and aggregate achievement levels. In this paper, we examine whether filtering out examinees who engage in rapid guessing impacts country score averages and rankings. Building on an earlier analysis that identified rapid guessers using two different methods, we re-estimated country average scores and rankings in three subject tests of PISA 2015 (Science, Mathematics, Reading) after filtering out rapid-guessing examinees. Results suggest that country mean scores increase for all countries after filtering, but in most conditions the change in rankings is minimal, if any. A few exceptions with considerable changes in rankings were observed in the Science and Reading tests with methods that were more liberal in identifying rapid guessing. Lack of engagement and effort is a validity concern for individual scores, but it has a minor impact on aggregate scores and country rankings.
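
For readers who want a concrete picture of the filtering step, the following Python sketch flags responses below a fixed response-time threshold as rapid guesses, removes examinees whose share of rapid guesses exceeds a cut-off, and re-computes country means. The 3-second threshold, the 10% cut-off, and the column names are assumptions for this illustration, not the criteria used in the paper.

# Hypothetical sketch of rapid-guess filtering and country mean re-estimation.
import pandas as pd

def filtered_country_means(responses: pd.DataFrame,
                           rt_threshold: float = 3.0,
                           max_rapid_share: float = 0.10) -> pd.DataFrame:
    """responses: one row per examinee-item with columns
    'country', 'examinee_id', 'response_time', 'score'."""
    responses = responses.copy()
    # Flag responses faster than the threshold as rapid guesses.
    responses["rapid"] = responses["response_time"] < rt_threshold
    # Share of rapid-guessed items per examinee.
    rapid_share = responses.groupby("examinee_id")["rapid"].mean()
    keep = rapid_share[rapid_share <= max_rapid_share].index
    filtered = responses[responses["examinee_id"].isin(keep)]
    # Mean score per examinee first, then averaged per country.
    person_scores = filtered.groupby(["country", "examinee_id"])["score"].mean()
    return (person_scores.groupby("country").mean()
            .sort_values(ascending=False)
            .to_frame("mean_score"))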

Keywords: Rapid guessing, response time effort, PISA, filtering

Correspondence:
Michalis P. Michaelides
Institutional address: Dept. of Psychology, 1 Panepistimiou Avenue, 2109 Aglantzia, P.O. Box 20537, 1678 Nicosia, Cyprus
Email: Michaelides.michalis@ucy.ac.cy


 

Unraveling Performance on Multiple-Choice and Free-Response University Exams: A Multilevel Analysis of Study Time, Lecture Attendance and Personality Traits

Tuulia M. Ortner, Verena Keneder, Sonja Breuer, Freya M. Gruber, Thomas Scherndl

Abstract:
Assessment methods impact student learning and performance. Various recommendations address the challenges of assessment in education by emphasizing test validity and reliability, in line with ongoing efforts in psychological assessment to prevent test bias, a concern that is also relevant when evaluating student learning outcomes. Examinations in education commonly use either free-response (FR) or multiple-choice (MC) response formats, each with its advantages and disadvantages. Despite frequent reports of high construct equivalence between them, certain group differences based on differing person characteristics still need to be explained. In this study, we aimed to investigate how test takers' characteristics and behavior, particularly test anxiety, risk propensity, conscientiousness, lecture attendance, and study time, impact test scores on exams with FR and MC formats. Data were collected from 376 students enrolled in one of two Psychology lectures at a large Austrian university at the beginning of the semester and post-exam in a real-life setting. Multilevel analyses revealed that, overall, students achieved higher scores on FR items compared to MC items. Less test anxiety, higher conscientiousness, and more study time significantly increased students' examination performance. Lecture attendance impacted performance differently according to the response format of the exam items: students who attended more lectures scored higher on the MC items compared to the FR items. Risk propensity exhibited no significant effect on exam scores. The results offer deeper insights into the nuanced interplay between academic performance, personality, and other influencing factors, with the aim of establishing more reliable and valid performance tests in the future. Limitations and implications of the results are discussed.
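
As an illustration of the kind of multilevel specification described above, the following Python sketch fits item scores nested within students, with a random intercept per student and a cross-level interaction between response format and lecture attendance. The variable names and the model formula are assumed for illustration and do not reproduce the authors' exact specification.

# Hypothetical sketch of a two-level model: exam items (level 1) nested within
# students (level 2), estimated with statsmodels' mixed linear model.
import pandas as pd
import statsmodels.formula.api as smf

# exam_items: one row per student-item with columns 'student_id', 'score',
# 'format' ('MC' or 'FR'), 'test_anxiety', 'conscientiousness', 'study_time',
# and 'lecture_attendance'.
def fit_exam_model(exam_items: pd.DataFrame):
    model = smf.mixedlm(
        "score ~ C(format) * lecture_attendance"
        " + test_anxiety + conscientiousness + study_time",
        data=exam_items,
        groups=exam_items["student_id"],  # random intercept per student
    )
    return model.fit()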

Keywords: evaluation methods, student evaluation, test performance, response format, personality

Correspondence:
Univ. Prof. Dr. Tuulia M. Ortner, Department of Psychology, University of Salzburg. tuulia.ortner@sbg.ac.at


 

Psychological Test and Assessment Modeling
Volume 66 · 2024 · Issue 1

Pabst, 2024
ISSN 2190-0493 (Print)
ISSN 2190-0507 (Internet)







 
