13th European Conference on Psychological Assessment - ConfTool Pro Printout

Session Overview

Date: Wednesday, 22/Jul/2015
9:00am - 12:30pm	W1: What the Face Reveals: The Facial Action Coding System as an Assessment Tool and its Applications Organizers: Tracey Platt and Jennifer Hofmann (University of Zurich, Switzerland)
KOL-G-204 (Ⅱ)
1:00pm - 4:30pm	W3: Diagnostics of Personality Pathology – Past, Present, and Future Organizer: Daniel Leising (Technische Universität Dresden, Germany)
KOL-G-204 (Ⅱ)

Date: Thursday, 23/Jul/2015
9:45am - 11:15am	PA1: Measurement 1 Session Chair: Klaus D. Kubinger
KOL-G-204 (Ⅱ)
	On designing data-sampling for Rasch Model calibrating an achievement test
	Klaus D. Kubinger¹, Dieter Rasch², Takuya Yanagida³
	¹University of Vienna, Austria; ²University of Natural Resources and Applied Life Sciences, Vienna; ³University of Applied Sciences, Austria; klaus.kubinger@univie.ac.at
	Though achievement tests in psychological and educational contexts are very often calibrated with the Rasch model, data sampling is hardly ever designed according to the model’s statistical foundations. Kubinger, Rasch, and Yanagida (2009) suggested an approach for determining the sample size for given Type I and Type II risks and a certain effect of model misfit when testing the Rasch model; this approach is supported by some new results. The approach uses a three-way analysis of variance design (A > B) x C with mixed classification. There is a (fixed) group factor A, a (random) factor B of testees within A, and a (fixed) factor C of items cross-classified with (A > B). In accordance with Andersen’s Likelihood-Ratio test, the testees must be divided into at least two groups according to some criterion suspected of causing differential item functioning (DIF). The Rasch model’s quality of specific objective measurement corresponds to the absence of an interaction effect A x C. The results of the simulation studies are: the approach works given several restrictions, and its main aim, the determination of the sample size, is attained. Additionally, the power of our approach is consistently higher than that of Andersen’s test.

	Examining fit in covariance modeling with ordinal data
	Christine DiStefano¹, Grant Morgan², Phillip Sherlock¹
	¹University of South Carolina, USA; ²Baylor University, USA; distefan@mailbox.sc.edu
	Fit indices are routinely used with covariance modeling to provide information about the goodness of fit between the hypothesized model and the data. These indices include absolute fit indices (e.g., Goodness of Fit Index, Root Mean Square Error of Approximation, Standardized Root Mean Square Residual) and incremental fit indices (e.g., Tucker-Lewis Index (Nonnormed Fit Index), Comparative Fit Index, Incremental Fit Index). Recommendations and rules of thumb for interpreting various fit indices have been presented in the literature; however, these guidelines are largely built from investigations using continuous, multivariate normal data and normal theory estimators (maximum likelihood or generalized least squares). As most of the data used in empirical studies are not continuous and may not be normally distributed, these recommendations may not hold when ordered category data are analyzed and/or robust estimators are used. Little is known about how ad hoc fit indices behave with non-normal and/or ordinal data. The purpose of this study is to examine the performance of fit indices under conditions of categorical data and non-normality. Conditions such as sample size, number of ordered categories, non-normality, and estimation technique will be manipulated to examine the performance of fit indices.

	Statistical and theoretical reductionism in research on scientific thinking: How much can the Rasch model tell us?
	Peter Adriaan Edelsbrunner¹, Fabian Dablander²
	¹ETH Zurich, Switzerland; ²University of Tübingen, Germany; dostodabsi@gmail.com
	In recent research on scientific thinking, Rasch modeling was employed to investigate the dimensionality of items that were meant to cover a wide variety of skills. Based on generic fit statistics and model comparisons, it was concluded that scientific thinking represents a unidimensional psychological construct. Using simulations, we argue that generic fit statistics and model comparisons based on the Rasch model merely warrant crude conclusions about the use of composite scores for practical assessments. Without strong prior theory, results from the Rasch model do not warrant theoretical conclusions about the dimensionality of the underlying psychological construct. In the simulations we compare the adequacy of various alternative measurement models for examining structural assumptions about scientific thinking. Based on the simulations, crucial assumptions of the Rasch model and their implications for theory development are discussed, expanding the discussion by drawing parallels to reductionism in intelligence and psychiatry research. We conclude that an undue reliance on Rasch models might not benefit and might even hinder theory development in research on scientific thinking. Alternative measurement models and experimental studies might provide more thorough insight into the structure of scientific thinking. Finally, we discuss our study’s implications for other fields with frequent application of Rasch models.

	Teaching statistical inference and the null hypothesis significance controversy
	Ernest Kwan, Irene R. R. Lu
	Carleton University, Canada; ernest.kwan@carleton.ca
	Null hypothesis significance testing (NHST) is the predominant procedure for statistical inference in the social sciences. Quantitative methodologists, however, have debated the legitimacy of NHST, and the American Psychological Association convened a task force to evaluate the role of NHST in quantitative research. We describe an approach to teaching statistical inference that illustrates the problems of NHST and reviews the recommendations for reform made by the task force and other renowned methodologists. This pedagogical approach is designed for a statistics course taken by graduate students in a research-oriented doctoral program. Accordingly, our approach also illustrates how NHST should and should not be used to evaluate substantive theories or hypotheses of interest.
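As a rough illustration of the fit indices named in the covariance-modeling abstract of this session, the sketch below computes RMSEA, CFI, and TLI from model and baseline chi-square statistics using common textbook formulas. The function names and the example numbers are hypothetical and are not taken from any abstract. Some software divides by N rather than N - 1 in the RMSEA, so this is one convention among several and not necessarily what the authors used.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Root Mean Square Error of Approximation (N - 1 convention, assumed here)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_m: float, df_m: int, chi2_b: float, df_b: int) -> float:
    """Comparative Fit Index from model (m) and baseline (b) chi-squares."""
    num = max(chi2_m - df_m, 0.0)
    den = max(chi2_m - df_m, chi2_b - df_b, 0.0)
    # If neither model shows misfit beyond its df, treat fit as perfect.
    return 1.0 if den == 0.0 else 1.0 - num / den


def tli(chi2_m: float, df_m: int, chi2_b: float, df_b: int) -> float:
    """Tucker-Lewis Index (Nonnormed Fit Index); may fall outside [0, 1]."""
    ratio_b = chi2_b / df_b
    ratio_m = chi2_m / df_m
    return (ratio_b - ratio_m) / (ratio_b - 1.0)


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    chi2_m, df_m, chi2_b, df_b, n = 85.0, 40, 900.0, 55, 300
    print(f"RMSEA = {rmsea(chi2_m, df_m, n):.3f}")
    print(f"CFI   = {cfi(chi2_m, df_m, chi2_b, df_b):.3f}")
    print(f"TLI   = {tli(chi2_m, df_m, chi2_b, df_b):.3f}")
```

Because the RMSEA depends on the chi-square only through the ratio χ²/df, it can be checked against summary figures: a ratio of 1.846 with N = 203, as reported in the BSS-12 abstract of session PA15, gives about .065 under this convention, which is consistent with the RMSEA stated there.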
11:45am - 1:15pm	PA5: Clinical Assessment 2 Session Chair: Anna Barbara Słysz
KOL-G-204 (Ⅱ)
	The structure of thinking of novices and experienced diagnosticians - The report of the research
	Anna Barbara Słysz
	Adam Mickiewicz University, Poland; aslysz@amu.edu.pl
	The aim of the presentation is to present a detailed analysis of the structure of thinking of novices and experienced diagnosticians, using case conceptualisation as an example. Creating such a conceptualisation for psychological diagnostics is a complex thought process, which requires processing a wide range of data and formulating hypotheses about the psychological onset and maintenance mechanisms of the client’s/patient’s problem. A group of 30 psychotherapists served as subjects of the research. The research was designed to obtain a graphical representation of complex, cause-and-effect diagnostic inference. In order to study the structure of professional thinking, a complex diagnostic task (case conceptualisation) was employed. Footage was prepared showing a 40-minute conversation between a psychotherapist and a client. The diagnostic task of the psychotherapists consisted of categorising the client’s statements and presenting relations between the categories. In order to analyse similarities between concept maps visualising the structure of diagnostic thinking, a dedicated software application was developed. The characteristics of the structure of diagnostician knowledge (e.g., coherence, complexity, relations between the individual elements of the structure) vary depending on the factors defining the professional profile of psychotherapists.

	Psychometric properties of Everyday-life Fatigue Questionnaire (EFQ)
	Joanna Urbańska
	Adam Mickiewicz University, Poland; joanna.urbanska@amu.edu.pl
	The aim of this study is to investigate the psychometric properties of the Everyday-life Fatigue Questionnaire (EFQ; Urbańska, 2010). The EFQ is a self-report inventory assessing everyday-life fatigue, constructed on the basis of theoretical principles of classical test theory (AERA/APA/NCME, 1999/2007). The EFQ is a paper-and-pencil instrument with 24 items and three scales: subjective physical fatigue, subjective mental fatigue, and subjective social fatigue. The total sample consisted of 454 adult participants aged 24 to 85 (M = 60); 295 females and 159 males. Results of the study indicate that the reliability (Cronbach’s alpha) for the total scale was .89 and that the three subscales demonstrated high reliability as well. The EFQ has also been used in other studies by different researchers, yielding similar results. The good psychometric properties of the EFQ allow for the conclusion that it is a suitable instrument for the assessment of everyday-life fatigue in adults. Moreover, the EFQ showed interesting statistically significant relationships with the WHOQOL-bref and the Fatigue Assessment Scale, especially in longitudinal studies.

	Do psychiatric symptoms diminish response quality to self-rated personality tests? Evidences from the PsyCoLaus study
	Marc Dupuis¹, Emanuele Meier¹, Caroline Vandeleur², Roland Capel¹
	¹University of Lausanne, Switzerland; ²Lausanne University Hospital, Switzerland; marc.dupuis@unil.ch
	The purpose of this study was to examine the relationships between psychiatric symptoms and the response quality to personality questionnaires. The study sample consisted of 1,981 participants from the Swiss cohort study “CoLaus PsyCoLaus” in Lausanne who completed both the NEO Five-Factor Inventory (NEO-FFI) and the Symptom Checklist 90 revised (SCL-90-R). Based on Gendre’s functional method, different indices measuring the quality of the entire set of responses to the NEO-FFI were calculated: response coherence, reliability, response level, variability, modality, normativity, positivity and negativity. Multiple linear regression analyses were performed to measure how much of the variance of these indices of response quality could be explained by the SCL-90-R factors. Determination coefficients ranging from 2.4% to 37.2% were measured for the response indices, indicating that some aspects of response quality are explained by psychiatric symptoms. Response normativity, positivity, and negativity were the indices most strongly associated with the SCL-90-R factors, while reliability was only related to paranoid and oppositional symptoms. Our findings suggest that an important part of the variance in response quality to self-rated questionnaires can be explained by the presence or absence of psychiatric symptoms. These findings call for further research in identifying populations unable to provide sufficiently valid responses to self-rated questionnaires.

	The PsicAP Project: A randomized controlled trial to improve psychological assessment and treatment with based-evidence psychological techniques of emotional disorders in Spanish primary care centers
	Antonio Cano Vindel², Roger Muñoz Navarro¹, Paloma Ruíz Rodriguez², Cristina Mae Wood², Benigna Díaz-Ovejero², Esperanza Dongil¹, Itziar Iruarrizaga², Mar García Moreno¹, Fernando Chacón³, Francisco Santolaya³, María Dolores Gómez Castillo³, Patricia Tomás Tomás¹, PsicAP Research Group³
	¹University of Valencia, Spain; ²University of Madrid, Spain; ³Spanish Council of Psychologists, Spain; roger.munoz@uv.es
	Emotional disorders (ED), such as anxiety, mood, and somatoform disorders, overwhelm existing resources in Spanish Primary Care (PC) centers. They are poorly detected and seldom receive adequate treatment, generating a higher use of health care services than physical illnesses. Other countries have provided Cognitive-Behavioral Therapy (CBT) programs to treat ED in PC, demonstrating high cost-effectiveness when compared to treatment as usual (TAU). The PsicAP Project is a pilot study that seeks to implement an evidence-based psychological group treatment protocol for ED in PC. A randomized controlled trial with two parallel groups will be conducted with a sample of 1126 participants: an experimental group (CBT) compared to a control group (TAU). Clinical symptoms, level of disability, quality of life, cognitive-emotional factors, treatment satisfaction, as well as data on attendance, drug use and other variables that reflect cost-effectiveness will be measured. Follow-up assessments will be completed at 3, 6, and 12 months. Also, the psychometric properties of the PHQ will be studied to improve the assessment of ED in Spanish PC. As in other countries, this treatment may help improve the mental health of these patients and reduce costs.

	Changes in personality functioning as a result of group psychotherapy with elements of individual psychotherapy in persons with neurotic and personality disorders – MMPI-2
	Katarzyna Cyranka, Krzysztof Rutkowski, Michał Mielimąka, Jerzy A. Sobański, Łukasz Müldner-Nieckowski, Edyta Dembińska, Katarzyna Klasa, Bogna Smiatek-Mazgaj, Paweł Rodziński
	Jagiellonian University Medical College, Poland; katarzyna.cyranka@interia.pl
	This study analyses the influence of group psychotherapy on the personality functioning of patients treated for neurotic disorders and selected personality disorders (F4-F6 under ICD-10). The study included 82 patients (61 women and 21 men) who underwent intensive short-term group psychotherapy in a day hospital. A comprehensive assessment of the patients’ personality functioning was carried out at the outset and the end of the psychotherapy using the MMPI-2 questionnaire. At the treatment outset, the majority of the patients demonstrated a considerable level of symptoms on five MMPI-2 clinical scales (Depression, Hysteria, Psychopathic Deviate, Psychasthenia, Schizophrenia) and moderate pathology on Hypochondriasis. On the Hypomania scale, most patients obtained results comparable to the healthy population when the treatment commenced. After the psychotherapy, the majority of the patients showed positive changes in those areas of personality functioning that had been classified as severe or moderate pathology. Short-term intensive comprehensive group psychotherapy with elements of individual psychotherapy leads to desirable changes in personality functioning.
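Several abstracts in this session report internal consistency as Cronbach's alpha (for example, .89 for the EFQ total scale). As a minimal sketch of how that coefficient is obtained from item-level data, the function below implements the standard formula alpha = k/(k - 1) * (1 - sum of item variances / variance of total scores). The function name and the tiny data set are hypothetical and are not data from any study in this program.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items table.

    scores: list of rows, one per respondent, each holding k item scores.
    Uses sample variances (n - 1 in the denominator) throughout.
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    items = list(zip(*scores))  # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


if __name__ == "__main__":
    # Hypothetical responses of four people to two items.
    demo = [[1, 1], [2, 3], [3, 2], [4, 4]]
    print(round(cronbach_alpha(demo), 3))
```

The sketch assumes complete data and a total-score variance greater than zero; real questionnaire data would need missing-value handling before this step.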
4:30pm - 6:00pm	PA8: Educational Assessment 1 Session Chair: Anne-Kathrin Mayer
KOL-G-204 (Ⅱ)
	Equivalence of computerized versus paper-and-pencil testing of information literacy under controlled versus uncontrolled conditions: An experimental study
	Anne-Kathrin Mayer¹, Günter Krampen¹,²
	¹ZPID - Leibniz Institute for Psychology Information, Germany; ²University of Trier, Germany; mayer@zpid.de
	Achievement tests, as well as self-report questionnaires, may provide reliable and valid results regardless of medium (e.g., paper-and-pencil vs. computerized testing) or mode (e.g., supervised vs. unsupervised testing) of test administration. However, because evidence is inconsistent and test-specific, it is recommended to review the equivalence of each assessment tool before applying it in various formats. Thus, the present study examines the equivalence of two information literacy measures by comparing their a) psychometric properties (internal consistencies, item-total correlations), b) means and standard deviations, and c) intercorrelations under different conditions. In an experimental study, education students (n = 141) completed a knowledge test which aims to assess individuals’ ability to find and evaluate scholarly information, and a questionnaire assessing information literacy related self-efficacy beliefs. Medium and mode of test administration were varied in a 2 x 2 between-subjects design. Testing was conducted in a paper-based or a computer-based format, either individually under supervision or under uncontrolled conditions. While the self-efficacy scale yielded comparable results under the different experimental conditions, the knowledge test appeared to be more susceptible to variations of test administration. Results are discussed with respect to general differences in measurement equivalence of test versus questionnaire data.

	The predictive validity and stability of standardized assessment in early childhood education
	Niek Frans¹, Wendy J. Post¹, Mark Huisman¹, C.E. (Ineke) Oenema-Mostert¹,², Anne L. Keegstra³, Alexander E.M.G. Minnaert¹
	¹University of Groningen, The Netherlands; ²Stenden University of Applied Sciences, The Netherlands; ³University Medical Center Groningen, The Netherlands; N.Frans@rug.nl
	There has been a lot of controversy surrounding the use of standardized achievement tests in preschool. Several researchers claim that the performance of young children is too fickle to be reliably and validly tested. The goal of this study was to examine the predictive validity for future performance and the score stability of two widely administered Dutch preschool tests. Language and arithmetic scores of 431 children were collected retrospectively over a four-year period. First, percentile scores of low scoring children were plotted to assess the stability of scores over time. Second, predictive validity of arithmetic and language scores was assessed by means of a multilevel model. Both the language and arithmetic tests were poor identifiers of low scoring children in first and second grade. The majority of low scoring first and second graders scored above average in preschool, or fluctuated between top and bottom range scores. A small group did not show large fluctuations in scores. Low correlations (r = .09 to .30) between the preschool tests and subsequent tests indicated that both tests are weakly to moderately associated with first and second grade performance. The results are discussed in light of practical applications of these tests.

	A cross-cultural study of Curriculum-Based Measurement in Brazil
	Maria Cristina Joly¹, Suzanne Bamonto²
	¹University of Brasilia, Brazil; ²Rochester Institute of Technology, USA; mcrisjoly@gmail.com
	Curriculum-Based Measurement (CBM) is a set of assessment procedures designed to provide accurate and reliable, yet efficient, indicators of student performance in the basic skill areas of reading, writing, and mathematics. Educators use CBM for universal screening and progress monitoring, supported by a number of studies establishing good reliability and validity and linking performance on the measures, particularly the oral reading measure, to performance on state-administered high stakes tests. Such a measurement system currently does not exist in Brazil; therefore, teachers often rely on their informal measures of student performance to guide instruction, as do school administrators and policymakers to guide programmatic decisions. The purpose of this session is to describe a cross-cultural project aimed at investigating the suitability of CBM for schools in Brazil. The paper presentation will include an overview of the education systems in the U.S. and Brazil and how CBM fits in. Results of some initial reliability and validity studies using two mathematics probes administered to a group of third-grade students will be presented, including test-retest and alternate-form reliability coefficients and exploratory factor analysis. Preliminary implications for implementation and plans for follow-up studies will be discussed.

	Longitudinal factorial invariance of a Childhood Career Exploration measure
	Iris Martins Oliveira¹, Maria do Ceu Taveira¹, Erik J. Porfeli²
	¹University of Minho, Portugal; ²Northeast Ohio Medical University, USA; ioliveira@psi.uminho.pt
	Career exploration is a central process of childhood career development, sustaining an emerging sense of self and learning about life-roles and work. Self-report measures have been used with middle-school children, but often present theoretical and psychometric limitations, lacking evidence of temporal validity. This study examines the longitudinal factorial invariance of the Childhood Career Exploration Inventory (CCEI) over a 14-month period spanning fifth and sixth grade. The CCEI is a self-report measure of middle-school children’s career exploration, yielding scores for three subscales and total career exploration. Attrition was not related to gender, region, or previous CCEI scores. Analyses were based on a final data set of 437 Portuguese children of both genders (M(age) first wave = 10.23). A hierarchical factor model constituted the baseline model. Results suggested configural and metric invariance of the first- and second-order factors over time for both genders. The CCEI presented acceptable reliability at each time and relative construct stability, except from the third to the fourth measurement occasion. These results support the use of the CCEI with girls and boys to investigate change in career exploration over fifth and sixth grade. Possible reasons for the relative construct instability from the third to the fourth measurement occasion are discussed.
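Two abstracts in this session rest on correlation coefficients: the preschool study reports r = .09 to .30 between preschool and later scores, and the CBM study reports test-retest reliability coefficients. The sketch below shows, under simple assumptions, how a Pearson correlation is computed from paired scores and how it translates into shared variance (r squared). Function names and the sample scores are hypothetical and serve only as illustration.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation for two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def shared_variance(r):
    """Proportion of variance two measures share, given their correlation."""
    return r * r


if __name__ == "__main__":
    # Hypothetical scores of four children at two time points.
    time1 = [1, 2, 3, 4]
    time2 = [1, 3, 2, 4]
    r = pearson_r(time1, time2)
    print(f"r = {r:.2f}, shared variance = {shared_variance(r):.0%}")
    # Range reported in the preschool abstract: r = .09 to .30
    for reported in (0.09, 0.30):
        print(f"r = {reported:.2f} -> {shared_variance(reported):.1%} shared variance")
```

Squaring the reported range gives roughly 0.8% to 9% shared variance, which helps put the "weakly to moderately associated" wording in perspective.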

Date: Friday, 24/Jul/2015
9:45am - 11:15am	PA11: Educational Assessment 2 Session Chair: Carl-Walter Kohlmann
KOL-G-204 (Ⅱ)
	Anxiety in children and adolescents: Need for school-specific contexts for the assessment of worry and emotionality
	Carl-Walter Kohlmann¹, Heike Eschenbeck¹, Uwe Heim-Dreger¹, Michael Hock²
	¹University of Education Schwäbisch Gmünd, Germany; ²University of Bamberg, Germany; carl-walter.kohlmann@ph-gmuend.de
	The Multidimensional Anxiety Inventory (MAI) was developed to assess emotionality and worry in three school-specific situations which vary according to academic threat (AT) and social threat (ST): performing a test (AT high, ST low), presenting in front of class (AT high, ST high), and meeting in the schoolyard (AT low, ST low). The aim of the study was to analyze factor structure, psychometric properties, and validity. Analyses are based on a sample of German students (N > 7000, age 9 to 16 years). Reliability coefficients were good for all subscales. The postulated factor structure was supported by both exploratory and confirmatory factor analyses. Emotionality was similarly relevant across the three contexts, whereas worry appeared to be more specific: a) performance-related worry occurred while writing a test and while presenting in front of class, b) social-related worry was related to presenting in front of class and meeting in the schoolyard. Differential associations with criterion variables (e.g., grades, well-being, gelotophobia) support the validity of the MAI. Incorporating academic as well as social situations in an anxiety questionnaire allows for a more comprehensive assessment of anxiety in children and adolescents. Applications for educational and clinical psychology will be discussed.

	The Scale "Openness for Information" (SOFI) – A new assessment tool for research on information behavior
	Anne-Kathrin Mayer¹, Günter Krampen¹,²
	¹ZPID - Leibniz Institute for Psychology Information, Germany; ²University of Trier, Germany; mayer@zpid.de
	The readiness to approach processes of informational search and evaluation in a specific manner is essential for solving complex and ill-defined information problems in scholarly contexts as well as in everyday life. The paper introduces the construct “Openness for Information” (OI) as a corresponding cognitive-motivational disposition. OI is thought to be rooted in personality and to be affected by epistemic beliefs, i.e., assumptions about the nature of knowledge. Additionally, associations with other aspects of information behavior are expected. To assess OI, the Scale Openness for Information (SOFI) was applied in four studies together with other self-report measures. The 12-item scale proved to be internally consistent (Cronbach’s alpha = .82 to .87) in three samples of university students (n = 112 law, n = 116 psychology, n = 101 educational sciences) and an opportunity sample of adults aged 18 to 72 years (n = 86). Scores were significantly correlated with Need for Cognitive Closure, Openness for Experiences, and Conscientiousness. Students high in OI reported more sophisticated epistemological beliefs, more active planning of information seeking, and more reliance on scientific criteria when judging the quality of information. It is concluded that the SOFI is a useful tool for studying information behavior.

	Confirmatory study of the Multidimensional Scales of Perceived Self-Efficacy with middle-school children
	Iris Martins Oliveira¹, Maria do Ceu Taveira¹, Erik J. Porfeli²
	¹University of Minho, Portugal; ²Northeast Ohio Medical University, USA; ioliveira@psi.uminho.pt
	The Social Cognitive Career Theory (SCCT) presents childhood as a foundational period for the development of self-efficacy expectations. Self-efficacy expectations constitute one’s judgments of prospective capabilities to successfully perform a task, which impact children’s career preferences. The Multidimensional Scales of Perceived Self-Efficacy (MSPSE) has been validated for Portugal and has served the assessment of career self-efficacy from seventh grade to college. In addition to the lack of research using the MSPSE before seventh grade, there is no evidence of factor equivalence across genders and school levels. This work examines the applicability of the MSPSE to Portuguese fifth- and sixth-grade girls and boys. The self-efficacy expectations for academic success, self-regulated learning, and leisure and extracurricular activities scales were used due to their alignment with career development in these grades. Participants were 313 Portuguese children (137 female and 176 male; 47.9% fifth and 51.8% sixth graders, M(age) = 10.80). Confirmatory factor analyses suggested a good fit of a hierarchical measurement model, including three first-order factors and a second-order factor (composite score). Multi-group results suggested factor equivalence across genders and school levels. These results support the use of the MSPSE scales with Portuguese fifth- and sixth-grade girls and boys, which might sustain further SCCT-based research in middle-school years.

	Primary school students’ social and emotional school experiences across grade levels: Adaptation and validation of the Social and Emotional School Experiences Survey – Short form (SESES-S)
	Tanja Gabriele Baudson¹, Rachel Wollschläger², Isabelle Schmidt², Vsevolod Scherrer², Samuel Greiff³, Sascha Wüstenberg³, Franzis Preckel²
	¹University of Duisburg-Essen, Germany; ²University of Trier, Germany; ³Université du Luxembourg, Luxembourg; tanja.baudson@uni-due.de
	Students’ school-related attitudes, relationships with classmates and teachers, academic self-concept, and other social-emotional school experiences influence both students’ wellbeing and their academic development. The younger students are, the more important these “soft factors” prove in the long run. However, brief assessments that can be administered across grades are still lacking. We propose an abbreviated and enhanced 36-item adaptation of the Social and Emotional School Experiences Survey (Fragebogen zur Erfassung emotionaler und sozialer Schulerfahrungen, FEESS; Rauer & Schuck, 2003, 2004), assessing the original seven factors (academic self-concept, social integration, class climate, attitude towards school, attitude towards learning, joy of learning, and feeling accepted by the teacher) plus facets of academic self-concept (mathematical, reading, and writing) not considered in the original. Based on the German norming sample of the THINK (Baudson, Wollschläger & Preckel, 2015; N > 2,000 students from grades 1–4 from five German federal states), evidence on criterion validity (IQ, grades, other-ratings) and model fit (first-order factor models with correlated factors) of the SESES-S will be presented along with invariance test results across grades and findings from longitudinal data.

	Cognitively diagnostic feedback: Mediating factors and remedial effects
	Eunice Eunhee Jang
	University of Toronto, Canada; eun.jang@utoronto.ca
	Latent trait classification methods, including diagnostic classification methods (Rupp, Templin, & Henson, 2012) and latent class modeling (Magidson & Vermunt, 2002), offer opportunities to observe, classify, and profile individual learners’ strengths and areas for improvement in detail. Resulting learner profiles can be used in the form of cognitively diagnostic feedback (CDF) for immediate intervention. However, there is a paucity of empirical research that explains the mechanism of how feedback from assessment interacts with learners’ minds. Their beliefs about intelligence and orientations to learning can powerfully influence their attention to information and further learning success (Dweck & Sorich, 1999). The paper discusses the mechanism of how CDF is perceived and used by young learners with different psycho-social profiles based on research with 105 children in Grades 5 and 6, their parents, and their teachers. The study results indicated that the use of CDF is influenced not only by actual ability, but also by the beliefs about intelligence and goal orientations that students bring to their assessment situation. Learners are sensitive to the mismatch between their expected and actual outcome. CDF indicating such conflicts can stimulate learners’ cognitive engagement and prompt them to use feedback for planning learning.
11:45am - 1:15pm	PA15: Positive Traits and Positive Emotions Session Chair: Ingrid Koller
KOL-G-204 (Ⅱ)
	What do you think you are measuring? A new mixed-methods procedure for assessing content validity and theory-based scaling with an example on wisdom Ingrid Koller¹, Michael R. Levenson², Judith Glück¹ ¹Alpen-Adria-Universität Klagenfurt, Austria; ²Oregon State University, USA; ingrid.koller@aau.at ingrid.koller@aau.at The valid measurement of latent constructs is a crucial issue for psychological research. Precise definitions of constructs are a very important foundation for content-valid item generation, for examining other aspects of validity (e.g., convergent validity), and for theory-based scaling. Although this sounds trivial, many researchers pay too little attention to the precise definition of latent constructs. In the first part of this presentation, we present a new mixed-methodology approach for improving construct definitions, supporting item generation, determining the content validity of existing items, and theory-based scaling. We illustrate our approach using an analysis of the items of the Adult Self-Transcendence Inventory, a self-report measure of wisdom (ASTI, Levenson, Jennings, & Shiraishi, 2005). The results of this analysis were used as the basis of a psychometric evaluation of the ASTI in a sample of 1215 participants using multidimensional item response theory models. We found that the new procedure produced important suggestions concerning five sub-dimensions of the ASTI that were not identifiable using exploratory methods. Further research questions, possible adaptations, and some critical issues are discussed. Validation of a self-report scale measuring wisdom resources Michaela Pötscher-Gareiss, Judith Glück Alpen-Adria-Universität Klagenfurt, Austria; michaela.gareiss@aau.at michaela.gareiss@aau.at Throughout our life, we all confront difficult life events. Internal and external resources play an important role in helping individuals to overcome, reflect on, and integrate such events. 
The MORE-Life-Experience-Model (Glück & Bluck, 2014) postulates that five internal resources are crucial for the successful processing of difficult life events and, in the long run, for the development of wisdom: Mastery, Openness, Reflectivity, Empathy, and Emotion Regulation. The aim of this study was to develop a self-report scale measuring the MORE-resources, with a focus on content validity. Each of the resources is a relatively complex construct, and self-report measures of positive constructs tend to be heavily biased by self-presentation and self-perception issues. Therefore, a large set of items was generated on the basis of construct definitions, and each item was analyzed by an expert panel with respect to construct validity as well as agreement probability. After a large number of items were removed on this basis, the revised instrument (86 items) was evaluated empirically using factor-analytic methods (n = 522). The resulting scale (25 items) has convincing subscale reliability; first validity analyses support the theoretical assumptions concerning the MORE-resources. The results also emphasize the advantages of our theory-based mixed-methods procedure for item generation and evaluation. A three-dimensional screening tool for strengths Samuel M.Y. Ho¹, Bowie P.Y. Siu² ¹City University of Hong Kong, Hong Kong; ²University of Hong Kong, Hong Kong; munyinho@cityu.edu.hk munyinho@cityu.edu.hk Twenty-four self-developed items assessing strengths were administered to 149 service recipients of a psychiatric rehabilitation organization along with the Hospital Anxiety and Depression Scale. The Minimum Average Partial (MAP) test showed that the minimum Velicer's Average Squared Correlation of .020 was obtained for a three-factor solution. 
Accordingly, twelve items were selected from principal component factor analysis with oblimin rotation to form the Brief Strengths Scale (BSS-12) to measure the three strengths, namely, Temperance Strength, Intellectual Strength, and Interpersonal Strength, with internal consistency coefficients ranging from .76 to .84. The Intellectual Strength and Temperance Strength had significant negative correlations with both depression and anxiety, whereas the Interpersonal Strength was significantly and negatively related to depression only. The BSS-12 was also administered to 203 university undergraduates to examine the factorial invariance of the scale in a different population. Confirmatory Factor Analysis revealed satisfactory goodness-of-fit indices (χ²/df = 1.846; CFI = .905; RMSEA = .065; SRMR = .059). We concluded that the BSS-12 was a useful screening tool for strengths among people with and without mental health issues. General issues in adopting psychological assessment inventories in different cultures will be discussed towards the end of the presentation. The assessment of emotional states induced by clowns and nurses Sarah Auerbach, Jennifer Hofmann, Tracey Platt, Willibald Ruch, Annette Fehling University of Zurich, Switzerland; s.auerbach@psychologie.uzh.ch s.auerbach@psychologie.uzh.ch Clowns have visited hospitals and nursing homes for quite some time. However, up until now, there has been no instrument available for the assessment of the various and unique emotional states induced in individuals by hospital clowns. The present research identified the dimensionality of emotional states induced in observers of clown interventions, and investigated the difference between clowns and nurses. In Study 1, 183 adults watched 15 videos of hospital clowns, circus clowns, and nurses, and filled in the 29-item Clown Emotion List (CLEM-29; Auerbach et al., 2014). Four factors emerged from a factor analysis: amusement, transcendence, arousal, and uneasiness. 
Both circus and hospital clowns elicited amusement, but only hospital clowns additionally elicited transcendence (i.e., feeling privileged, appreciated). Nurses also elicited transcendent experiences without being amusing. In Study 2 with 42 patients involved in a hospital clown intervention, the incremental validity of the dimensions of the CLEM-29 over and above a general funniness judgment of clowns was investigated. Global positive feelings toward the clowns were best predicted by funniness of clown performances in general and a higher level of felt transcendence. The CLEM-29 has proven to be useful in identifying the core components of hospital clown interventions: humor and transcendence.
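The incremental validity question in Study 2 above can be illustrated with a small sketch. The abstract does not state which analysis was used; the sketch assumes a two-step (hierarchical) regression, a common way to test incremental validity, and uses the closed-form R² for two predictors. All scores and variable names are hypothetical and are not data from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical ratings for six patients (illustration only, not study data)
funniness = [3, 5, 4, 6, 2, 5]          # step 1 predictor
transcendence = [2, 4, 5, 5, 1, 3]      # predictor added in step 2
positive_feelings = [3, 6, 6, 7, 2, 5]  # criterion

r_y1 = pearson_r(positive_feelings, funniness)
r_y2 = pearson_r(positive_feelings, transcendence)
r_12 = pearson_r(funniness, transcendence)

# Step 1: R^2 with funniness only (equals the squared correlation)
r2_step1 = r_y1 ** 2

# Step 2: R^2 with both predictors (closed form for two predictors)
r2_full = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

# Incremental validity: variance explained beyond the funniness judgment
delta_r2 = r2_full - r2_step1
print(f"R2 step 1 = {r2_step1:.3f}, R2 full = {r2_full:.3f}, delta = {delta_r2:.3f}")
```

A clearly positive `delta_r2` would indicate that the added dimension explains criterion variance beyond the general funniness judgment; in practice the increment would also be tested for significance.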
4:30pm - 6:00pm	PA20: Intelligence Session Chair: Klaus D. Kubinger
KOL-G-204 (Ⅱ)
	Practical assertion of paper-and-pencil adaptive testing: 30 years of experience with the intelligence test-battery Adaptive Intelligence Diagnosticum (AID) Klaus D. Kubinger University of Vienna, Austria; klaus.kubinger@univie.ac.at klaus.kubinger@univie.ac.at Adaptive testing has stood the test of practice for over forty years where computerized tailored testing applies (for a current review, see Kubinger, 2015, in press). However, very young children in particular may not be tested with a computer but individually under a psychologist’s supervision, and in these cases a branched testing design may be the method of choice. In such a design, the items are clustered in advance according to intended cluster averages of item difficulty parameters. After each administered cluster of items, the optimally informative next item cluster is presented to the testee. As a consequence, no online ability parameter estimation is needed after each administered item (cluster); all ability parameter estimations can be done in advance. However, this approach is not well known. Here the respective conceptualization of the Adaptive Intelligence Diagnosticum (AID) is given. Apart from 30 years of approval by practitioners, this branched adaptive test-battery has proven to have a shorter administration time, accompanied by a much higher accuracy of measurement (reliability). Lastly, achievement-motivation problems hardly ever occur in testees, whether due to many items that are too easy or many that are too difficult. External validation and reliability of The Indonesian Wechsler Adult Intelligence Scale – Fourth Edition (WAIS-IV-IDN) Magdalena S. Halim¹, Christiany Suwartono¹, Lidia L. Hidajat¹, Marc P.H. Hendriks²,³, Roy P.C. 
Kessels²,⁴ ¹ATMA JAYA Catholic University of Indonesia, Indonesia; ²Radboud University Nijmegen, The Netherlands; ³Academic Centre of Epileptology Kempenhaeghe, The Netherlands; ⁴Radboud University Medical Center, The Netherlands; magdalena.halim@atmajaya.ac.id magdalena.halim@atmajaya.ac.id The internal structure of the standardized Indonesian version of the Wechsler Adult Intelligence Scale – Fourth Edition (WAIS-IV-IDN) supports the expected internal structure of four first-order factors and the full scale score as second order. In addition, we evaluated the external validity and the scale’s reproducibility over time. For validation, we correlated the WAIS-IV-IDN with Raven’s Standard Progressive Matrices (SPM), Cattell’s Culture Fair Intelligence Test (CFIT), and the Modified Mini Mental State Examination (3MS). Furthermore, to investigate the scale’s reproducibility over time, we measured the test-retest reliability. There were 125 participants for the validation with SPM and CFIT, 90 participants for the validation with 3MS, and 77 participants for test-retest reliability. The Pearson product-moment correlation was used for analyzing the data. We found positive and significant correlation coefficients between the WAIS-IV-IDN and all other tests (SPM, CFIT, and 3MS). The correlation coefficients ranged from .26 to .66. The test-retest reliability of all subtests, indexes, and Full IQ of the WAIS-IV-IDN ranged from .47 to .92. In summary, the WAIS-IV-IDN is considered acceptable for use in Indonesia, although there are still a few limitations regarding the test-retest coefficients. These findings will be discussed further. 
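As a minimal sketch of the analysis named in the abstract above, the Pearson product-moment correlation between two administrations of the same test gives a test-retest coefficient. The function name and the five pairs of scores below are hypothetical and serve only as illustration; they are not data from the WAIS-IV-IDN study.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical scores of five participants at two sessions (not study data)
first_session = [98, 105, 110, 92, 120]
second_session = [101, 103, 114, 95, 118]

retest_r = pearson_r(first_session, second_session)
print(f"test-retest r = {retest_r:.2f}")
```

The same function applies to the validation correlations, with the second list replaced by scores on the external test.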
Speedy assessment of speeded reasoning with the intelligence screening “mini-q” Tanja Gabriele Baudson¹, Martina Kaufmann², Franzis Preckel², Carolin Räihälä² ¹University of Duisburg-Essen, Germany; ²University of Trier, Germany; tanja.baudson@uni-due.de tanja.baudson@uni-due.de Economical yet valid assessment of cognitive ability is useful under time constraints, especially in large research projects where time is limited. The intelligence screening "mini-q" allows the assessment of speeded reasoning in three (adults) to five minutes (children). Based on Baddeley's Test of Verbal Reasoning (1968), the mini-q includes not only verbal but also visuospatial aspects. We will present evidence for the reliability and validity of the paper-and-pencil version in 478 adults (e.g., correlations with other IQ and speed tests, and comparisons between gifted and average-ability participants). Findings on the suitability of the mini-q for children and adolescents will be presented as well. One question that arose with respect to the material (where geometrical figures show human characteristics such as "preferring" or "refusing" other geometrical figures) was whether social abilities influence mini-q results substantially. Using an online version of the test, relationships with diverse measures of inter- and intrapersonal abilities will therefore be examined to further clarify what the mini-q measures. The interplay of working memory, processing speed, and attention with intelligence in children George Spanoudis, Anna Tourva University of Cyprus, Cyprus; spanoud@ucy.ac.cy spanoud@ucy.ac.cy The distinction between fluid (gf) and crystallized (gc) intelligence is important because it helps us to explain how intellectual ability develops and interacts with fundamental cognitive processes like memory and attention. 
The present study examined the relations of fluid and crystallized intelligence with three cognitive processes, namely speed of processing, attention, and working memory (WM) in 158 7- to 18-year-old children and adolescents (mean age in years=12.68, SD=3.16). Multiple measures of each of these cognitive processes were obtained. Structural equation modeling was performed to investigate: i) the relations between intelligence and its main correlates, and ii) whether developmental changes in each of the above three cognitive processes lead directly to developmental increases in intelligence. The results suggested that only WM predicted fluid and crystallized intelligence when controlling for the other two cognitive processes. The data indicate that WM is the main cognitive function underlying fluid and crystallized intelligence in children and adolescents. Also, our findings suggested that age-related changes in WM pave the way for developmental changes in intelligence. The discussion focuses on the construct validity of tests for the measurement of gc, gf and working memory and the interpretation of WM as a predictive variable of intelligence in children and adolescents.

Date: Saturday, 25/Jul/2015
10:15am - 11:45am	PA24: Educational Assessment 3 Session Chair: Fariha Asif
KOL-G-204 (Ⅱ)
	Identifying the profiles of social and emotional development among first grade pupils in Russia Ekaterina Orel, Alena Ponomareva, Christina Bekmukhametova National Research University Higher School of Economics, Russia; aponomareva@hse.ru aponomareva@hse.ru Personal, social, and emotional development (PSED) of young children is particularly important at the beginning of primary education. This paper is aimed at constructing PSED profiles (in the framework of the iPIPS study of schooling progress) of Russian school children during their first year at school and determining the relationship between PSED and cognitive abilities. Children's PSED was assessed by teachers using 11 scales. The scales themselves were arranged in three sections: adjustment to the school environment, personal development, and social and emotional development. Data were collected in one large region of the Russian Federation, the Republic of Tatarstan. The sample, representative of Tatarstan, consisted of 1,218 primary school students (409 boys and 447 girls, mean age 7.3 years; gender data for 362 students are missing because of the low parent response rate) assessed by 68 teachers. Four PSED profiles were identified using cluster analysis (k-means method). In group one, children scored high on all scales, and in the other three groups children had certain characteristic strengths and weaknesses. A relationship between profiles and cognitive abilities was found. This information can help teachers find the most effective ways to work with children during their first year at school. An ecological assessment framework for tracking learner growths in self-regulation, emotional engagement, and feedback responses in a virtual diagnosis learning program Eunice Eunhee Jang University of Toronto, Canada; eun.jang@utoronto.ca eun.jang@utoronto.ca The paper discusses assessment principles for learning in technology-rich learning environments, in this case BioWorld (Lajoie, 2009). 
BioWorld is a patient simulation program designed to support medical students’ clinical reasoning. Students are invited to diagnose virtual patients by gathering and evaluating medical evidence. The framework highlights a shift away from discrete content domain knowledge to cognitively, metacognitively, and emotionally competent applications of target knowledge. In this framework, learners’ current state of knowledge and skill mastery are constantly changing as a result of interactions with elements of learning contexts, which contradicts a static view of learning. The present study gathered multiple types of behavioral, affective, cognitive, and contextual data from students’ self-reports, computer logs, and think-aloud verbal protocols. Latent class analyses (Vermunt & Magidson, 2005) and cluster analyses were used to identify distinct latent classes among students in terms of their self-regulation, emotional engagement (Pekrun, Goetz, & Perry, 2005), and responses to expert feedback. Logistic multiple regression analyses were used to examine the predictive probabilities of successful diagnosis of virtual patients using different learner trait profiles. I discuss the viability of the ecological assessment framework for tracking learning progressions and providing intervention support tailored to individual learners’ profiles. The anxiety factors in the Saudi EFL learners: A study from English language teachers’ perspective Fariha Asif King Abdul Aziz University, Saudi Arabia; farihaa83@yahoo.com farihaa83@yahoo.com The purpose of the study is to explore the factors that cause language anxiety in the Saudi EFL learners and the influence it casts on communication as observed and perceived by EFL teachers. 
The study seeks to identify the psycholinguistic and socio-cultural factors that, from the teachers’ perspective, cause language anxiety among ESL/EFL learners while learning and speaking English, especially in the context of Saudi students. It also examines which strategies can be used to cope successfully with language anxiety. The scope of the study is limited to college and university English teachers in Saudi Arabia. The sample size is 115 (mean age = 35 years). One hundred university English teachers (both males and females) were selected from various cultural backgrounds. A five-point Likert-scale questionnaire comprising twenty items was administered to these 100 English teachers. In addition, 15 structured interviews were conducted. Some English teachers believe that anxiety has a positive effect on learners by giving them extra motivation to do their best in English language learning.