13th European Conference on Psychological Assessment

Date: Thursday, 23/Jul/2015
9:45am - 11:15am	S1: Opportunities and Challenges of Longitudinal Perspectives Session Chair: Grégoire Bollmann Discussant: Martin Tomasik
KOL-G-217 (Ⅳ)
	Opportunities and challenges of longitudinal perspectives
Chair(s): Grégoire Bollmann (University of Lausanne, Switzerland)
Discussant(s): Martin Tomasik (University of Zurich, Switzerland)
Time poses several challenges to longitudinal perspectives, for example when ensuring measurement invariance of constructs or assessing people’s evaluations of past events. This symposium brings together researchers from the Swiss National Center of Competence in Research LIVES and those interested in longitudinal perspectives to explore these challenges and discuss the opportunities they also entail. First, introducing the issue of measurement invariance, Brodbeck and colleagues examine standardized inventories of marital satisfaction and psychopathological symptoms. In two two-wave studies, one on married individuals and one on patients before and after psychotherapy, this team presents the evolution of these constructs over time. Sarrasin then showcases the invariance of a 5-item self-esteem scale with multigroup confirmatory factor analyses in the tumultuous context of late adolescence and young adulthood. Her results highlight that changes in self-esteem of this vulnerable population are mainly related to changes in their satisfaction with their body image. Finally, Morselli and colleagues present life-history calendars as a means to approach past events. Their work compares respondents’ subjective evaluations of their personal trajectory obtained with graphical representations or a differential scale and pinpoints advantages of life-history calendars.
Presentations of the Symposium
Longitudinal measurement invariance issues illustrated by examples of marital satisfaction in later life and the structure of psychopathology before and after psychotherapy
Jeannette Brodbeck, Hansjörg Znoj, Pasqualina Perrig-Chiello; jeannette.brodbeck@psy.unibe.ch
University of Bern, Switzerland
When questionnaires are administered repeatedly over time, measurement invariance needs to be established in order to determine whether the same construct is measured with similar precision. After a short introduction to cross-sectional and longitudinal measurement invariance, these concepts are illustrated by two examples employing a confirmatory factor analysis framework for categorical data. The first example is a 10-item version of the Marital Satisfaction Inventory (Whisman, Snyder, & Beach, 2009) administered at baseline and two years later to a population sample of 1275 married individuals aged 40+ (NCCR LIVES, IP-12). Measurement invariance held only for a modified one-factor model but not for the three-factor model, which fitted the data best at baseline. The second example is the latent structure of psychopathology assessed with the Brief Symptom Inventory (Franke, 2000) before and after psychotherapy in 526 patients. Configural factorial invariance was not confirmed and the latent structure of psychopathology was simpler after psychotherapy. Implications of these measurement invariance issues for theory, statistical analyses, and adaptation of the measures will be discussed.
Measuring self-esteem among young adults in different educational tracks: A longitudinal perspective
Oriane Sarrasin; oriane.sarrasin@gmail.com
University of Lausanne, Switzerland
Self-esteem is subject to strong variations during late adolescence and young adulthood: Not only does it drop and then rise again gradually, but marked differences across educational groups are also often found.
To ensure that such within- and between-individual comparisons are unbiased, it is necessary to verify in preliminary analyses that the measurement of self-esteem is invariant. To illustrate this, data from young adults (M = 18.7, SD = 2.61) in academic (N(2013) = 147; N(2014) = 115) and vocational (N(2013) = 160; N(2014) = 65) tracks from the last two waves of the Longitudinal Lausanne Youth Study (NCCR LIVES) were selected. Cross-sectional multigroup confirmatory factor analyses (MGCFA) showed no difference in the measurement of the five-item self-esteem scale across the two groups, indicating that in both years unbiased mean comparisons can be conducted. Contrary to previous research, participants in academic and vocational tracks did not differ significantly in their self-esteem. In contrast, longitudinal MGCFAs revealed one within-individual difference: While all other items remained stable, participants reported being less satisfied with their body image as they grew older. The inclusion of this item in a composite score can lead to the erroneous conclusion that young adults’ general self-esteem decreases over time.
The use of Life-History Calendar Methods (LHC) to assess subjective evaluation of the personal life trajectory
Davide Morselli, Dario Spini, Nora Dasoki, Elenya Page; Davide.Morselli@unil.ch
University of Lausanne, Switzerland
Life-history calendar (LHC) methods have been increasingly used in life-course research as well as in other domains that are interested in the timing of events and trajectories. There is indeed a consensus that the highly structured but flexible approach of the LHC facilitates the recall of past events. Respondents’ experiences provide a context for the retrieval of autobiographical memories and are used as anchoring points and time landmarks for recollecting events.
The literature has shown that this method provides more reliable answers than conventional question lists on biographical retrospective data. Although the LHC method has mainly been used to collect data on factual (i.e., objective) events, a few studies have adapted it for tapping subjective dimensions and assessing the psychological impact of events. In this study we investigate whether the LHC method can be used to assess respondents' evaluation of their own personal trajectory by comparing two different methods. The first relies on a differential scale on which respondents indicate their evaluation. The second maximizes the visual potential of the LHC: respondents are asked to express their evaluation graphically.
11:45am - 1:15pm	PA4: Organizational Assessment 2 Session Chair: Olaf Ringelband
KOL-G-217 (Ⅳ)
	A different sort of pedigree: Top-managers’ personality structures, career success, and derailment risks
Olaf Ringelband
md gesellschaft für management-diagnostik, Germany; ringelband@management-diagnostik.de
Top-managers’ personalities differ significantly from those of other people. Two samples of top-managers (n = 1,052 and n = 495) completed two different personality inventories (BIP and CPI, respectively). The results showed that, in general, managers are more assertive, sociable, and self-confident, and show stronger performance motivation, than other professionals. The ramifications of these personality traits for professional success and career development are discussed. Special attention is paid to the derailment risks associated with these traits and to their connection to top-managers’ psychopathic behavior (“The Dark Triad”, Paulhus, 2010). Finally, possible measures for reducing top-managers’ derailment risks are discussed.
The development of an instrument to assess organizational learning in small and medium enterprises in Asia
Yu-Lin Wang
National Cheng Kung University, Republic of China (Taiwan); ywang@mail.ncku.edu.tw
Organizational learning has been examined since the 1950s, and the literature on the topic has expanded conceptually, theoretically, and to some extent empirically during the past decades. However, few instruments for measuring organizational learning processes exist. Scholars have indicated that part of the reason is that it is difficult to develop a quantitative measure of organizational learning. This scarcity of instruments has hampered empirical research on organizational learning. In addition, current empirical studies of organizational learning have tended to focus on large firms.
Unlike large firms, small and medium enterprises, with limited financial and human resources, usually adopt different approaches to learning and obtaining knowledge. Moreover, existing organizational learning instruments developed in Western countries may not fit the Asian context. It is necessary to develop an indigenous organizational learning instrument that captures organizational learning processes that may be unique to an Asian country. The purpose of this study is therefore to develop a valid and reliable instrument to measure organizational learning in small and medium enterprises in Asia in order to understand and explain organizational learning phenomena.
Unmasking ethical leadership: Quantitative research on the characteristics that describe ethical leaders at work
Eirini Marina Mitropoulou¹, Ioannis Tsaousis¹, Despoina Xanthopoulou², Konstantinos Petrides³
¹University of Crete, Greece; ²Aristoteleio University of Thessaloniki, Greece; ³University College London, UK; psyp165@psy.soc.uoc.gr
The appropriate definition and assessment of ethical leadership has been a source of conceptual confusion in the leadership literature. During the last decade, different theories have evolved, each including different types and numbers of leadership characteristics. Consequently, none of the existing theories provides a full understanding of the concept of ethical leadership. In this study, all ethical leadership characteristics present in the international literature are evaluated. In total, twenty-seven characteristics were derived, and their relevance was tested in a quantitative study using a multi-source sample (both employers and employees in the public and private sectors in Greece). A new factor structure was investigated for all twenty-seven characteristics with CFA, testing one-, two-, three-, four-, five-, and bi-factor models. Fit indices showed that a four-factor model had the most acceptable fit to the data.
The four factors that emerged were named Ethical Virtue, Solidarity, Ethical Practices, and Fulfillment of Ethical Goals. This new four-factor model will be the groundwork for creating a new psychometric scale that will assess ethical leadership at work.
Personality-based Person-Organization (PO) fit: A new direction for personality assessments
Punya V. Iyer¹,², Alec W. Serlie², Janneke K. Oostrom³, Marise Ph. Born¹
¹Erasmus University Rotterdam, The Netherlands; ²GITP, The Netherlands; ³VU University, The Netherlands; iyer@fsw.eur.nl
This study aims to demonstrate the value of personality assessments from a Person-Organization (PO) fit perspective. Organizations could improve the utility of HR assessments by using personality questionnaires to assess individual personality as well as personality-based PO fit. We initially hypothesized that personality-based PO fit predicts satisfaction and intention to stay. Furthermore, we hypothesized that the fit relationships are stable over time (two years apart). In phase I (T0), 636 employees in the Netherlands completed questionnaires on their individual personality, perceived organizational personality, and the criteria (satisfaction and intention to stay). In phase II (T1), 202 of the original respondents completed the same questionnaires. The personality dimensions measured were agreeableness, enterprise, competence, chic, ruthlessness, innovativeness, and stability. Polynomial regression analysis revealed that, at T0, PO fit led to satisfaction for all dimensions except agreeableness. Similarly, PO fit at T0 predicted an intention to stay for all dimensions except agreeableness and competence. PO fit at T1 led to satisfaction for competence, chic, ruthlessness, and stability, whereas PO fit predicted an intention to stay for enterprise and chic.
We conclude that personality questionnaires can be used to assess PO fit and that, specifically, personality-based PO fit is a valuable and stable predictor of an individual’s future attitudes and behaviors.
Assessment of consumer heterogeneity: A comparison of two multidimensional latent modeling approaches
Irene R. R. Lu, Ernest Kwan, D. Roland Thomas, Louise A. Heslop
Carleton University, Canada; irene.lu@carleton.ca
The assessment of consumer heterogeneity is essential for marketing segmentation in both profit and nonprofit organizations. We explore two methods that capture consumer heterogeneity within and between groups: latent class modeling and diagnostic classification modeling. The paper also discusses the advantages and limitations of the application of each method in marketing.
4:30pm - 6:00pm	S2: Online Assessment and Internet-Based Research Session Chair: Ulf-Dietrich Reips Session Chair: Stefan Stieger
KOL-G-217 (Ⅳ)
	Online assessment and internet-based research
Chair(s): Ulf-Dietrich Reips (University of Konstanz, Germany), Stefan Stieger (University of Konstanz, Germany; University of Vienna, Austria)
During the last decades, online assessment and testing have become an indispensable data source for research, and not only in the fields of personality psychology, intelligence, and achievement. The Internet provides a powerful infrastructure for data collection and many researchers have been taking advantage of it to conduct basic and applied research. This symposium will cover new developments and present tools and examples of how to use the Internet for online assessment and research. This session intends to give an overview of the online assessment/testing expansion in recent years, including topics such as the rise of mobile computing (smartphones) in research, relationships between self-reported executive problems, personality, and cognitive performance, self-ratings versus observers' ratings of personality on Facebook profiles, and how to handle and analyze dropout in Internet-based research. The session will also explore the technical and ethical issues in the use of data generated by online assessment/testing as well as the added value and benefits of such data. Special attention will be given to the development of the International Personality Item Pool (IPIP) as a possible blueprint for online cross-cultural personality assessment in the public domain.
Presentations of the Symposium
Smartphone apps in psychological science: Results from an experience sampling method study
Stefan Stieger¹, Ulf-Dietrich Reips²; stefan.stieger@uni-konstanz.de
¹University of Konstanz, Germany; University of Vienna, Austria, ²University of Konstanz, Germany
Data collection methods in the social and behavioral sciences have always been inspired by new technologies.
The introduction of the Internet had a major impact in advancing the methodological repertoire of researchers, with Internet-based experiments, online questionnaires, and non-reactive online data collection methods, to name just a few. Meanwhile, the next major impact from technology is hitting research – smartphones. The penetration rate of these small mobile devices is increasing rapidly, and they offer a multitude of new sensors that can be used for scientific research (e.g., GPS, gyroscope, accelerometer, temperature sensors). We report a smartphone app field study about well-being conducted in German-speaking countries (n = 219). It ran for 14 days with three measurements per day (8000+ well-being judgments). Based on this study, we discuss important aspects of the planning of a smartphone study (e.g., programming, implementation, pitfalls, and recruitment strategies). The presentation aims not only to present empirical data about an exemplary smartphone study, but also to present the unique aspects of smartphone studies compared to traditional research methods of data collection.
What do self-report measures of problems with executive function actually measure? Data from internet and laboratory studies
Tom Buchanan; T.Buchanan@westminster.ac.uk
University of Westminster, United Kingdom
The measurement of executive function is of interest to researchers and practitioners in a number of psychological fields. Self-report measures of executive problems may have considerable value, especially for research conducted via the internet. They are easier to implement online than traditional cognitive tests, and arguably have greater ecological validity as indices of everyday problems. However, there are questions about whether they actually measure executive function, or other constructs such as personality.
Relationships between self-reported executive problems, personality, and cognitive performance were assessed in three correlational studies using non-clinical samples. In Study 1, 49398 participants completed online measures of personality and self-reported executive problems. In Study 2, 345 participants additionally completed an online Digit Span task. In Study 3, 103 participants in a traditional laboratory setting completed multiple measures of personality, self-reported executive problems, and objective cognitive tests. Across all three studies, self-reported problems correlated with neuroticism and with low conscientiousness, with medium to large effect sizes. However, self-reported problems did not correlate with performance on Trail Making, Phonemic Fluency, Semantic Fluency, or Digit Span tests tapping aspects of executive function. These findings raise questions about self-report measures of executive problems, both on the Internet and offline.
Self-ratings of personality and observers' ratings based on Facebook profiles
Boris Mlačić, Goran Milas, Ivna Sladić; Boris.Mlacic@pilar.hr
Institute of Social Sciences Ivo Pilar, Croatia
The aim of the study was to investigate the relationship between self-ratings of personality and expert observers’ ratings of personality based on Facebook profiles of target persons. The self-rating sample consisted of 177 participants with active Facebook profiles between March and June 2014. Expert observers were students in the final year of a master’s course in psychology with training in personality psychology. Personality traits from the Big-Five model were assessed by the IPIP50 (Goldberg, 1999; Mlačić & Goldberg, 2007), while Facebook usage was assessed by the Questionnaire of Facebook use (Ross et al., 2009).
Observers’ ratings of personality were based on data from Facebook profiles where each of the five personality dimensions (Extraversion, Agreeableness, Conscientiousness, Emotional Stability, and Intellect) was briefly defined. The results showed significant relations between Facebook usage and Agreeableness and Extraversion, respectively. Observers’ personality ratings correlated significantly with self-ratings of Conscientiousness and Intellect, while agreement between observers’ ratings was highest for the dimensions of Extraversion, Conscientiousness, and Emotional Stability.
Dropout analysis with DropR: An R-based web app to analyze and visualize dropout
Ulf-Dietrich Reips¹, Matthias Bannert²; reips@uni-konstanz.de
¹University of Konstanz, Germany, ²ETH Zurich, Switzerland
In Internet-based research, non-response, such as lack of responses to particular items, and dropout have become interesting dependent variables, due to highly voluntary participation and large numbers of participants (Reips, 2000, 2002). In this paper we develop and discuss the methodology of using and analyzing dropout in Internet-based research, and we present DropR, a Web App to analyze and visualize dropout. The Web App was written in R, a free software environment for statistical computing and graphics. Among other features, DropR turns input from datasets in various formats into visual displays of dropout curves. It calculates parameters relevant to dropout analysis, such as Chi Square values and odds ratios for points of difference, initial drop, and percent remaining in stable states. With automated inferential components, it identifies critical points in dropout and critical differences between dropout curves for different experimental conditions and produces related statistical copy. The visual displays are interactive: users can use mouse-over, mouse drag, and click to identify regions within a display for further analysis.
DropR is provided as a free R package (http://cran.r-project.org/web/licenses/GPL-2) and Web service (http://dropr.eu) from researchers for researchers.
Measuring narcissism online: Development and validation of a brief web-based instrument
Tim Kuhlmann, Michael Dantlgraber, Ulf-Dietrich Reips; tim.kuhlmann@uni-konstanz.de
University of Konstanz, Germany
Narcissism continues to be a widely researched topic in psychology, and the scientific community is in need of validated online instruments. The present paper describes the development and validation of a questionnaire for the web-based assessment of sub-clinical narcissism. Several versions were developed, including items from the original NPI-40 (Raskin & Terry, 1988) and from the open item database IPIP. Using the multiple-site-entry technique (Reips, 2000), a sample of 1972 participants was recruited. They answered the original 40 items of the NPI-40 in either choice or Likert-type format as well as 80 items from the IPIP with a Likert-type answer format. The NPI-40 in original choice format showed unsatisfactory fit characteristics in a CFA. After factor analysis of all Likert-type items, an 18-item questionnaire for narcissism with three intercorrelated subscales emerged. These were labeled importance, manipulation, and vanity. The overall narcissism score had good internal consistency (α = .91), with the subscales showing acceptable reliabilities (α = .78 - .83). The final scale was validated in a separate sample with 549 participants. The three-factor structure was replicated and similar psychometric properties were shown. The questionnaire provides researchers with a brief and validated instrument for the web-based assessment of narcissism and its sub-facets.
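For readers less familiar with the internal-consistency coefficient reported in the abstract above (Cronbach's α), the following is a minimal sketch of how it is commonly computed from item-level responses. It is not the authors' analysis code; the function name `cronbach_alpha` and the toy data are illustrative assumptions, and only the Python standard library is used.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items table.

    responses: list of respondents, each a list of k numeric item scores
    (hypothetical input format chosen for this illustration).
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (one column per item).
    item_variances = [variance(column) for column in zip(*responses)]
    # Sample variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Made-up data: three respondents, three perfectly consistent items.
toy_responses = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
alpha = cronbach_alpha(toy_responses)
```

Values closer to 1 indicate that the items covary strongly relative to their individual variances; in this toy table the items are perfectly consistent, so the coefficient reaches its maximum.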

Date: Friday, 24/Jul/2015
9:45am - 11:15am	S3: Vulnerabilities and Resources at Work and in Career Development Session Chair: Grégoire Bollmann
KOL-G-217 (Ⅳ)
	Vulnerabilities and resources at work and in career development
Chair(s): Grégoire Bollmann (University of Lausanne, Switzerland)
Within the Swiss National Center of Competence in Research LIVES, vulnerabilities and resources can be conceived at multiple levels. Here we bring together researchers interested in tracking down various forms and sources of these two concepts in the domains of career development and work. This symposium will showcase a collection of newly developed instruments investigating the multiple levels at which vulnerabilities and resources can be experienced and respectively garnered, namely within individuals, in their interpersonal relationships, or in the broader normative context. Starting within individuals in career development, Rochat and Rossier explore the validity of the career decision-making difficulties scale and its relationship with various forms of self-esteem. Sgaramella and colleagues then identify future orientation and resilience as relevant resources for individuals’ career and life paths. The next two talks then proceed with vulnerabilities and resources in individuals’ interpersonal and normative context. Introducing humor at work, Hofmann and Ruch validate a short instrument of dispositions toward ridicule and laughter and present their relations with work-related outcomes. Finally, Bollmann and Mena examine people’s endorsement of the free market system as an institution permeating society and its implications for the self and decision-making at work.
Presentations of the Symposium
Validation of the Career Decision-Making Difficulties Scale (CDDQ) in a Francophone context
Shékina Rochat, Jérôme Rossier; shekina.rochat@unil.ch
University of Lausanne, Switzerland
Indecision may be understood as a normative part of the developmental process if not, to a certain extent, an adaptive attitude toward the career choice.
However, encountering severe career decision-making difficulties can also threaten career paths. This study presents the validation of the French-language version of the Career Decision-Making Difficulties Questionnaire (CDDQ) among 1,750 French-speaking adolescents and young adults. The structure of the CDDQ-French form was verified through confirmatory factor analysis (CFA), and multigroup CFAs were used to test the measurement equivalence across a general sample and a clinical sample. Relationships with the short form of the Career Decision-Making Self-Efficacy Scale (CDMES-SF) and the Self-Esteem Scale (SES) were also explored. Implications of these findings for the assessment and support of vulnerabilities associated with career choice are discussed.
More complex times require more attention to future orientation, resilience, and methodological choices in Life Design approach
Teresa M. Sgaramella, Laura Nota, Lea Ferrari, Maria Cristina Ginevra, I. DiMaggio; teresamaria.sgaramella@unipd.it
Università degli Studi di Padova, Italy
The complex times that people are currently living in, and the challenges they frequently face, raise new questions and draw attention to dimensions such as future orientation and resilience (Soresi et al., 2015), and to their possible role in Life Design (Nota et al., 2014; Savickas et al., 2009). A further, more compelling, challenge comes from the increasingly large number of marginalized and vulnerable individuals (from unemployed people to people with disabilities, addiction, or psychopathological problems) who experience difficulties and raise relevant questions about the determinants and resources available to them for successful Life Designing (Sgaramella et al., 2015).
To face these challenges, additional quantitative and qualitative measures (besides career adaptability) have recently been introduced in research conducted in the LARIOS laboratory, such as Design My Future, Vision about the future (Soresi et al., 2012a, b), and My Future Interview (Sgaramella et al., 2014). After examining their psychometric properties, patterns of association with other relevant resources in life designing have been analyzed. Results from large groups of young people and adults, and particularly those coming from individuals experiencing vulnerabilities, support the relevance of these dimensions in Life Design studies. Their usefulness in counseling, and more specifically in career counseling, is also underscored.
Validation of the PhoPhiKat-9 (Short Form) in a workplace context
Jennifer Hofmann, Willibald Ruch; j.hofmann@psychologie.uzh.ch
University of Zurich, Switzerland
Three dispositions towards ridicule and laughter have been put forward and investigated: Gelotophobia (the fear of being laughed at), gelotophilia (the joy of being laughed at), and katagelasticism (the joy of laughing at others). Within the NCCR LIVES project, gelotophobia has been postulated to be a potential vulnerability in the workplace context, where the misperception of being laughed at and bullied can have detrimental effects on work stress and work satisfaction. For an economical, large-scale assessment of the three dispositions, we first developed and validated a short form (PhoPhiKat-9) of the standard self-report instrument (PhoPhiKat-45) in two independent samples. Second, the PhoPhiKat-9 was validated in a third, representative sample of Swiss employees, relating gelotophobia to relevant behaviors and perceptions at the workplace. Results and implications are discussed.
Believing in a free market system: Implications for the self and the society
Grégoire Bollmann¹, Sébastien Mena²; gregoire.bollmann@unil.ch
¹University of Lausanne, Switzerland, ²City University London, United Kingdom
We conceptualize the belief in a free market system (BFM) as people’s endorsement of basic assumptions about the economy. The free market system is an institution permeating Western societies in which individuals freely pursue their interests, organizations maximize their profits, the State does not intervene, and competition rules market exchanges. In our sense-making-intuitionist framework, the belief is an amoral cognition that people endorse to satisfy fundamental motives and that lets them go about their lives unaware of the moral stakes of their choices. In 5 studies involving samples of executives, students, and the general population (N(total) = 1374), we develop and validate a measure of BFM and then longitudinally and cross-sectionally test its predictive power on relevant outcomes for individuals and society. BFM is a one-dimensional, reliable concept, and is positively associated with social dominance and meritocracy, negatively related to need-based allocations and, crucially, unrelated to moral identity. It might serve people’s satisfaction with their life but increases the likelihood of amoral decision-making at work. As such, it constitutes simultaneously a resource and a vulnerability depending on the context in which it is applied.
11:45am - 1:15pm	S4: Response Styles in Personality Assessment Session Chair: Daniel Danner
KOL-G-217 (Ⅳ)
	Response styles in personality assessment
Chair(s): Daniel Danner (GESIS - Leibniz Institute for the Social Sciences, Germany)
Response styles such as acquiescence or extreme responding can bias correlations and factor structures and prevent measurement invariance of personality inventories. Based on empirical data, we will discuss to what extent response styles are relevant, how response styles can be measured and controlled, and what the determinants of response styles are. Beatrice Rammstedt will illustrate that acquiescence biases the comparability of Big Five measures across countries. Julian Aichholzer and Meike Morren will introduce statistical models that allow controlling for acquiescence and extreme responding. Daniel Danner will demonstrate that acquiescence is not a general, uni-dimensional response style but that it in fact also depends on the item domain (such as personality or attitude items). Finally, Clemens Lechner will present data suggesting that acquiescence is not only associated with education but also with age-related decline in cognitive functioning.
Presentations of the Symposium
Measurement equivalence of personality measures across educational groups – The moderating role of acquiescence
Beatrice Rammstedt; beatrice.rammstedt@gesis.org
GESIS – Leibniz-Institute for the Social Sciences, Germany
Effects of response sets are often neglected in research investigating differences among groups. In contrast to individual diagnostics, which often control for effects of social desirability (for example, in personality assessments), investigations of group differences usually do not take effects of response styles into account. In this talk I will show that response styles, and in particular acquiescence, indeed have strong biasing effects on personality assessments.
In several large-scale, population-representative samples (n = 888 to n = 25,509), we showed that item responding is strongly affected by acquiescence, particularly in less-educated groups, with blurring effects on the resulting factor structure. This effect was shown to generalize across questionnaires, item formats, assessment modes, and numerous, mainly Western, countries. Implications of the findings for personality assessment are discussed.
Controlling acquiescence bias in measurement invariance tests
Julian Aichholzer; julian.aichholzer@univie.ac.at
University of Vienna, Austria
Assessing measurement invariance (MI) is an important cornerstone in establishing equivalence of instruments and comparability of measured constructs. This study investigates how acquiescence response style (ARS) impacts the level of MI achieved (configural, metric, scalar). Data from a German representative sample (n = 3,118) were analyzed. The random intercept method is combined with multiple-group factor analysis to assess MI in a Big Five personality scale. Initial results suggest that if groups differ in ARS, neglecting that bias leads to different conclusions regarding the level of MI of the instrument. Implications and further applications are discussed.
Extreme response style and personality traits
Meike Morren; meike.morren@vu.nl
VU University Amsterdam, The Netherlands
Since the 1950s, extreme response style (ERS) has been associated with personality traits, such as anxiety, neuroticism, extraversion, conscientiousness, defensiveness, self-esteem, and depression. Inconsistent results have been obtained: for example, some studies find that extraversion relates positively to ERS, others find a negative relationship, and some find no relationship. This inconsistency might result from serious methodological challenges in exploring the relationship between response styles and personality.
First, ERS can be measured by a sum score, a standard deviation score or a latent variable. Second, the modeling approaches to detect and correct for ERS diverge across studies. Third, most research uses the personality assessments both for measuring personality traits and response styles, which inevitably leads to confounding style with content. Fourth, the personality assessments are affected by response styles themselves and need to be corrected. We propose a latent class factor approach that detects ERS using a validated scale and corrects for the influence of ERS on the personality assessments by simultaneously estimating the Big Five and ERS. Additionally, we assess the influence of the methodological issues outlined above by comparing other modeling approaches to our model. We illustrate our approach using student data (n=200) from the Netherlands. Facets of acquiescence Daniel Danner; daniel.danner@gesis.org GESIS – Leibniz-Institute for the Social Sciences, Germany The present research investigates two facets of acquiescence: agreement and acceptance. Agreement has been defined as agreeing to all items (e.g., I am reserved, I am outgoing, I am not reserved, I am not outgoing) whereas acceptance has been defined as accepting opposite but non-negated items (e.g., I am reserved, I am outgoing) but not negated items (e.g., I am not reserved, I am not outgoing). Participants (n=398, 20-82 years old) completed a survey containing 96 items of different domains (personality, attitude, and knowledge items). The data were analyzed using hierarchical structural equation models. The results indicate that (1) there is a general agreement factor that can explain about 2% of total item variance, (2) there are also domain-specific agreement factors that can explain up to 29% of total item variance, and (3) there is no general acceptance factor but domain-specific acceptance factors that can explain up to 4% of total item variance. 
This suggests that acquiescence is not a general, uni-dimensional response style but has different facets that have different impacts on items. Implications for research and assessments are discussed. Cognitive ability, acquiescence, and the structure of personality in a sample of older adults Clemens Lechner¹, Beatrice Rammstedt²; clemens.lechner@uni-jena.de ¹University of Jena, Germany, ²GESIS – Leibniz-Institute for the Social Sciences, Germany Acquiescence, or the tendency to respond to descriptions of conceptually distinct personality attributes with agreement/affirmation, constitutes a major challenge in the assessment of personality. The aim of this study was to shed light on cognitive ability as a potential source of individual differences in acquiescent responding. We hypothesized that respondents with lower cognitive ability exhibit stronger acquiescent response tendencies; this leads to problems in establishing the Big Five factor structure among these respondents, as opposed to respondents with higher cognitive ability. Further, we hypothesized that after controlling for acquiescence by using mean-corrected instead of raw item scores, the Big Five structure holds even at lower levels of cognitive ability. Analyses in a sample of 1,071 German adults aged 56 to 75 years using the Digit Symbol Substitution Test (DSST) as a measure of cognitive ability and the BFI-10, an abbreviated version of the Big Five Inventory, as a measure of personality, corroborated these hypotheses. This suggests that lower cognitive ability, and age-related declines in cognitive functioning more specifically, are associated with higher acquiescent responding in personality inventories, but that the problems this poses for establishing the five-factor structure can be resolved by statistically controlling for acquiescence.
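The mean correction mentioned in the last abstract can be illustrated with a minimal sketch. It assumes that each respondent's mean raw rating over a balanced item set (equal numbers of positively and negatively keyed items, not yet reverse-scored) serves as the acquiescence index and is subtracted from every item rating. The function name `mean_correct` and the toy ratings are hypothetical and are not taken from the study.

```python
def mean_correct(responses):
    """Subtract each respondent's own mean rating from their raw item ratings.

    `responses` is a list of rows, one row of raw (not reverse-scored)
    ratings per respondent. Assumption: the item set is balanced, so the
    row mean mainly reflects a general tendency to agree.
    """
    corrected = []
    for row in responses:
        person_mean = sum(row) / len(row)
        corrected.append([score - person_mean for score in row])
    return corrected


# Hypothetical ratings on a 1-5 scale for two respondents and four items.
raw = [
    [4, 4, 5, 4],  # agrees with almost everything
    [2, 4, 1, 5],  # differentiated response pattern
]
corrected = mean_correct(raw)
```

After correction every respondent's ratings sum to zero, so differences in the overall tendency to agree no longer contribute to the inter-item correlations that enter a factor analysis.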
4:30pm - 6:00pm	PA19: Measurement 2 Session Chair: Frank M. Goldhammer
KOL-G-217 (Ⅳ)
	Assessing test-taking engagement using response times Frank Goldhammer¹,³, Thomas Martens¹, Oliver Lüdtke²,³ ¹DIPF - German Institute for International Educational Research, Germany; ²IPN - Leibniz-Institut für die Pädagogik der Naturwissenschaften und Mathematik, Germany; ³ZIB - Centre for International Student Assessment, Germany; goldhammer@dipf.de A problem of low-stakes assessments is low test-taking engagement, which threatens the validity of test score interpretations. Therefore, we addressed the question of how indicators of test-taking engagement can be defined and validated in the context of the OECD Programme for the International Assessment of Adult Competencies (PIAAC). The approach was to identify disengaged response behavior by means of response time thresholds (cf. Lee & Jia, 2014). Constant thresholds were considered as well as item-specific thresholds based on the visual inspection of (bimodal) response time distributions (VI method) and the proportion correct conditioning on response time (P+>0% method). Results based on 152,514 participants from 22 countries showed that the VI method could only be applied to a portion of items. Overall, the validity checks comparing the proportion correct of engaged and disengaged response behavior revealed that the P+>0% method performed slightly better than the other methods. Finally, we computed the proportion of disengaged responses across items and countries by domain. Overall, this proportion was quite low. The results also revealed an increase in disengaged response behavior from part 1 to part 2 of the assessment, suggesting a drop in test-taking motivation over the course of the test. 
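A minimal sketch of the constant-threshold idea and of the validity check described in the abstract above: responses faster than a fixed threshold are flagged as disengaged, and the proportion correct is then compared between flagged and unflagged responses. The 5-second threshold, the function names, and the toy data are assumptions for illustration only. The item-specific VI and P+>0% methods are not implemented here.

```python
def flag_disengaged(response_times, threshold_seconds=5.0):
    """Return True for each response faster than the (assumed) constant threshold."""
    return [t < threshold_seconds for t in response_times]


def proportion_correct(correct, mask):
    """Proportion of correct answers among responses where mask is True."""
    selected = [c for c, m in zip(correct, mask) if m]
    return sum(selected) / len(selected) if selected else None


# Hypothetical response times (seconds) and scored answers (1 = correct).
times = [1.2, 30.5, 2.0, 45.0, 3.1, 25.0]
correct = [0, 1, 0, 1, 1, 1]

disengaged = flag_disengaged(times)
engaged = [not d for d in disengaged]

p_disengaged = proportion_correct(correct, disengaged)
p_engaged = proportion_correct(correct, engaged)
```

If the threshold separates the two kinds of behavior well, the proportion correct among flagged responses should sit near chance level and clearly below that of unflagged responses.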
Examining test items for Differential Distractor Functioning (DDF) across different groups Ioannis Tsaousis University of Crete, Greece; tsaousis@uoc.gr The aim of this study was to examine the effectiveness of the alternative false responses on multiple-choice items in a cognitively based test. In particular, using Item Response Theory (e.g., Differential Distractor Analysis) as a methodological framework, we were interested in examining whether the distractors, or incorrect option choices, used in each item increase the probability of differential item functioning (DIF) effects across different groups. Data were sampled from approximately 600 students from the Greek Military Academies (i.e., Air Force, Army and Navy Academy) who completed the Army Numerical Reasoning Test. To examine possible DDF effects we used the odds ratio approach, whereby the DDF effect of each distractor is obtained using a generalization of the Mantel-Haenszel common odds ratio estimator adapted to each distractor. The results from the analysis revealed that there are some items that exhibit DDF across different groups. Results also suggested that items showing DDF were more likely to be located in the second half of the test than in the first half. The findings from this study allow us to determine which items need further examination, and designate DDF analysis as a useful tool for better understanding why a particular item exhibits DIF across groups. Controlling time-related individual differences in test-taking behavior by presenting time information Miriam Hacker, Frank Goldhammer, Ulf Kröhne Educational Research and Educational Information (DIPF), Germany; hacker@dipf.de Generally speaking, in ability or competence assessments, test takers answer the questions in a self-paced way. As a result, test takers can differ considerably in the amount of time they spend on a task. 
Such (construct-irrelevant) individual differences in test-taking behavior can produce differences in test performance although test takers may be equally able or skilled. Thus, time-related test-taking behavior can influence the measurement and affect comparability of ability scores. Previous findings on this measurement problem relating to the so-called ‘speed-accuracy-tradeoff’ originate from speed test studies. The present study aims to address the research questions with regard to power tests and to develop appropriate measurement approaches. For this study, reading competence tests were administered in a control condition with no influence on the timing behavior and several experimental conditions differing in how the timing behavior was influenced. The impact of the conditions on individual differences in timing behavior and performance, as well as on the tests' reliability and validity, was assessed. Additional covariates were assessed to further explore performance differences within experimental conditions. The random sample consists of 1065 German students (521 female, 544 male; M = 20.51 years). First results show, among other things, that presenting time information can reduce rapid guessing behavior and decrease the number of missing responses. Gender differences in general knowledge tests: Caused by unbalanced interest domains? Philipp Meinolf Engelberg, Ralf Schulze Bergische Universität Wuppertal, Germany; engelberg@uni-wuppertal.de Robust gender differences in standardized psychological tests of general knowledge, favoring men, have been repeatedly reported in test manuals and the pertinent literature. For example, the norm sample of the frequently used German general knowledge test I-S-T 2000 R evidenced an effect size of d = 0.30. 
In the present study, gender differences in interests as well as an imbalanced representation of interest domains between men and women in knowledge tests were both investigated as potential causes for these findings. Based on the results from an assessment of both male and female interests (n = 507), a knowledge test consisting of 121 items that tap exclusively female interest domains was created. A total of 202 participants completed both this new test and the I-S-T knowledge test. Subsequent factor analyses yielded a 2-factor solution with opposing gender differences. The I-S-T indicators showed substantial loadings on the factor with male advantage only. The results support the hypothesis that gender differences in knowledge tests are not based on gender differences in true general knowledge but may – at least partially – be attributed to an unbalanced item selection from predominantly male interest domains. Are student evaluations of teaching really reliable? A Bayesian meta-analysis Sherin Natalia Bopp, Sven Hug, Rüdiger Mutz ETH Zurich, Switzerland; sherin.bopp@gess.ethz.ch Student evaluation of teaching (SET) has become a fixed part of most university quality assurance systems in order to assess teaching performance. Numerous primary studies on different topics of SET reflect the strong development in research on SET, especially in the last 30 years. In the face of this huge literature it is still not possible to integrate the results of primary studies into conclusive overall statements, even in comprehensive reviews. Therefore, for the first time in research on SET, more sophisticated Bayesian meta-analysis techniques have been used here to establish general quantitative statements about SET in order to address the complex problems of data analysis (e.g., multilevel data, different teaching dimensions). Of major concern in research on SET are the key concepts of test theory (e.g., reliability, validity). 
In a first step, the reliability of SET was investigated using 218 primary studies. We address the following questions: Which kinds of reliability concepts were used in the studies? Are SET scores on average actually sufficiently reliable, as Marsh (1984) has claimed? How much do SET results vary across and within studies? What are the determinants of reliability of SET? Initial results and conclusions will be presented.

Date: Saturday, 25/Jul/2015
10:15am - 11:45am	PA23: Creativity and Emotional Intelligence Session Chair: Johnny Fontaine
KOL-G-217 (Ⅳ)
	Assessment of emotional intelligence: A plea for unscored ratings Elke Veirman, Johnny Fontaine Ghent University, Belgium; Johnny.Fontaine@UGent.be How emotional intelligence ability items should be scored has been vigorously debated. In the present study we investigate the possibility of directly deriving emotional intelligence from the raw item ratings as given by the participant without any post hoc scoring of the items. We do this by investigating the internal structure of emotional intelligence subscales at item level. We hypothesized that rating-based emotional intelligence scales would be structured by two factors: a bipolar ability factor with right items loading positively and wrong items loading negatively, and a unipolar acquiescent response style factor with all items loading positively on it. This hypothesis was investigated on the Mayer-Salovey-Caruso Emotional Intelligence Test - Youth Version in a first sample of 630 Flemish pupils and a second sample of 664 Flemish pupils. In the first sample the original instrument, with three rating subscales, was applied; in the second sample an adapted version was applied with all four scales presented in rating format. In both samples the hypothesized structure was confirmed for all subtests using rating scales. Across the factors of all subtests a general emotional intelligence and a general acquiescence factor emerged. The nomological network further confirmed the interpretation of the factors. Measuring emotional intelligence in early adolescents: An application of the latent change variable models Vesna Buško¹, Ana Babić Čikeš² ¹University of Zagreb, Croatia; ²J. J. Strossmayer University of Osijek, Croatia; vbusko@ffzg.hr This study is focused on the analysis of intraindividual changes in emotional intelligence (EI) conceptualized within the ability-based model. 
Particular classes of structural equation models were applied to the study of correlates of inter- and intraindividual variations in the proposed performance-based measures of the three EI dimensions. The data to be presented are derived from a longitudinal study of EI development conducted at three time points on a sample of 517 primary school students aged 10 to 15 years. Following the assumptions of the latent state-trait theory (e.g., Steyer et al., 1992), the degree to which variations in EI measures are due to individual dispositions and/or to occasion-related factors will be presented. Several single- and multi-construct latent state-trait models were tested against the EI data. According to the parameter estimates obtained, the portions of variance attributable to situational and/or interactional effects varied with the point of measurement and the EI operationalization used. Further, the true change modeling procedures (e.g., Steyer, Eid, and Schwenkmezger, 1997) employed confirmed the significant role of gender and cognitive ability measures as moderator and antecedents of interindividual differences in changes on EI measures, respectively. Validation through inhouse-meta-analysis exemplified on an inventory of creative activities and achievements Jennifer Diedrich¹,², Mathias Benedek¹, Emanuel Jauk¹, Aljoscha Neubauer¹ ¹University of Graz, Austria; ²Federal Academy of Lower Austria (Niederösterreichische Landesakademie), Austria; jennifer.diedrich@noe-lak.at Creative activities and achievements can be reliably assessed using self-report inventories (Silvia, Wigert, Reiter-Palmon & Kaufman, 2011). These measures differ in their focus on personal as opposed to public achievements. The inventory to be presented – the inventory of creative activities and achievements (ICAA; Jauk et al., 2013a, 2013b) – is constructed to assess both levels of achievement in eight different domains. 
The ICAA has been employed in eight studies along with tests of creative potential, personality, and intelligence. This inventory’s reliability and validity were estimated in two different ways: First, internal consistency analyses and measurement models were performed on a compound dataset comprising the ICAA variables of all eight studies. Second, convergent and divergent validity with measures of personality, intelligence, and creative potential were assessed via meta-analyses of these eight datasets. This two-tier approach was chosen due to the reasonable homogeneity of ICAA variables but not of the validity variables. The advantages of this two-tier approach shall be presented at the conference. Assessing creativity by meaning Shulamith Kreitler Tel-Aviv University, Israel; krit@netvision.net.il The purpose was to develop a procedure of assessing creativity in terms of the meaning system (Kreitler), which is a psychosemantically-grounded methodology for assessing meaning. Three studies will be presented, describing the meaning variables found to differentiate significantly between more and less creative participants, in different samples of children, and different measures of creativity. In study 1, 158 children (ages 7.2-9.4) were administered the meaning test, the Wechsler IQ test, and the Torrance test of creativity. In study 2, 71 children (mean age 10.9) were administered the meaning test and their drawings and paintings were evaluated for creativity. In study 3, 238 Bedouin children (mean age 13.7) were administered the meaning test and the questionnaire “The Things Done on your Own” (Torrance). In each study the meaning variables differentiating between the more and less creative were identified. 
The set as a whole indicates the following tendencies characterizing the more creative children: focusing on dynamic, objective and experiential aspects, using nonverbal and verbal forms of expression, considering present inputs and distant ones, and emphasizing both the personal-subjective and the interpersonally-shared meanings. The meaning variables characterizing the more creative children could be used for developing an assessment instrument for creativity.