• Binary Modeling of Action Sequences in Problem-solving Tasks: One- and Two-parameter Action Sequence Model

    Subjects: Psychology >> Psychological Measurement. Submitted: 2023-01-05

    Abstract: Process data refers to the human-computer or human-human interaction data recorded in computerized learning and assessment systems that reflect respondents’ problem-solving processes. Among process data, action sequences are the most typical because they show how respondents solve a problem step by step. However, the non-standardized format of action sequences (i.e., different data lengths for different participants) makes it difficult to apply traditional psychometric models directly. Han et al. (2021) proposed the SRM, which combines dynamic Bayesian networks with the nominal response model (NRM), to address the shortcomings of existing methods. Like the NRM, the SRM uses multinomial logistic modeling and therefore assigns separate parameters to each possible action sequence in the task, leading to high model complexity. Given that action sequences in problem-solving tasks have correct and incorrect outcomes rather than unordered nominal categories, this paper proposes two action sequence models based on binary logistic modeling with relatively low model complexity: the one- and two-parameter action sequence models (1P-ASM and 2P-ASM). Whereas the SRM adapts the NRM to action sequence analysis, the 1P-ASM and 2P-ASM adapt the simpler one- and two-parameter IRT models, respectively. An illustrative example compared the performance of the SRM and the two ASMs on a real-world interactive assessment item, “Tickets,” from PISA 2012.
The results mainly showed that: (1) the latent ability estimates of the two ASMs and the SRM were highly correlated; (2) the ASMs took less computing time than the SRM; (3) participants who were solving the problem correctly tended to continue producing correct action sequences, and vice versa; and (4) compared with the fixed discrimination parameter of the SRM, the freely estimated discrimination parameter of the 2P-ASM helped us better understand the task. A simulation study further explored the psychometric performance of the proposed models in different test scenarios. Two factors were manipulated: sample size (100, 200, and 500) and average problem state transition sequence length (short and long). The SRM was used to generate the state transition sequences, and the problem-solving task structure from the empirical study was retained. The results showed that: (1) the two ASMs provided accurate parameter estimates even though they were not the data-generating model; (2) the computation time of both ASMs was lower than that of the SRM, especially with small sample sizes; (3) the problem-solving ability estimates of both ASMs agreed closely with those of the SRM, with the 2P-ASM agreeing relatively more; and (4) the longer the problem state transition sequence, the better the recovery of the problem-solving ability parameter for both ASMs and the SRM. Overall, the two ASMs proposed in this paper, based on binary logistic modeling, can analyze action sequences effectively and provide almost identical estimates of participants' problem-solving ability to the SRM while significantly reducing computation time.
Meanwhile, combining the results of the simulation and empirical studies, we believe that the 2P-ASM has better overall performance than the 1P-ASM; however, the more parsimonious 1P-ASM is recommended when the sample size is small (e.g., 100 participants) or the task is simple (i.e., fewer operations are required to solve the problem).
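The binary logistic form the ASMs adapt from IRT can be sketched as follows. This is a minimal illustration, assuming each step of a respondent's action sequence has been scored correct/incorrect; the function names and the exact parameterization used in the paper are assumptions, with the 1P form recovered by fixing the discrimination at 1.

```python
import math

def p_correct(theta, a, b):
    """2PL-style probability that a respondent with ability theta produces
    a correct action/state transition at a step with difficulty b and
    discrimination a (fixing a = 1 gives the one-parameter form)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def sequence_log_likelihood(theta, steps):
    """Log-likelihood of one scored action sequence.
    steps: list of (a, b, y) tuples, y = 1 if the transition was correct.
    Sequences of different lengths simply contribute different numbers
    of terms, which is how a binary model sidesteps the
    non-standardized-length problem."""
    ll = 0.0
    for a, b, y in steps:
        p = p_correct(theta, a, b)
        ll += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return ll
```

Because each step contributes one Bernoulli term, the per-task parameter count grows with the number of steps rather than with the number of possible action sequences, which is the complexity reduction relative to the multinomial SRM described above.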

  • Longitudinal Hamming Distance Discrimination: Developmental Tracking of Latent Attributes

    Subjects: Psychology >> Psychological Measurement. Submitted: 2022-10-06

    Abstract: Longitudinal cognitive diagnosis can assess students' strengths and weaknesses over time, profile their developmental trajectories, and be used to evaluate the effectiveness of teaching methods and optimize the teaching process. Researchers have proposed different longitudinal diagnostic classification models, which provide methodological support for analyzing longitudinal cognitive diagnostic data. Although these parametric longitudinal cognitive diagnostic models can effectively assess students' growth trajectories, their demands on coding skill and sample size hinder their adoption by frontline educators, and they are time-consuming and thus ill-suited to providing timely feedback. A nonparametric approach, by contrast, is easy to compute, efficient to apply, and provides timely feedback; it is also free from dependence on sample size and is particularly suitable for analyzing assessment data at the classroom or school level. Therefore, this paper proposed a longitudinal nonparametric approach to track changes in students' attribute mastery. This study extended the Hamming distance discriminant (HDD) to a longitudinal version (Long-HDD), which uses the Hamming distance to represent the dependence between attribute mastery patterns of the same student at adjacent time points. To explore the performance of the Long-HDD on longitudinal cognitive diagnostic data, we conducted a simulation study and an empirical study and compared the classification accuracy of the HDD, Long-HDD, and Long-DINA models. In the simulation study, five independent variables were manipulated: (1) sample size N = 25, 50, 100, and 300; (2) number of items I = 25 and 50; (3) number of time points T = 2 and 3; (4) number of attributes measured at each time point K = 3 and 5; and (5) data analysis method M = HDD, Long-HDD, and Long-DINA.
Students' true attribute mastery patterns were drawn with equal probability from all possible attribute patterns, and the transition probabilities between attributes at adjacent time points were set to be equal (e.g., p(0→0) = 0.8, p(0→1) = 0.2, p(1→0) = 0.05, p(1→1) = 0.95). The first K items in the Q-matrix at each time point formed an identity matrix and served as anchor items, and the item parameters were set to be moderately negatively correlated, generated from a bivariate normal distribution. For the empirical study, the results of three parallel tests, each with 18 items measuring six attributes, were used for 90 seventh graders; the Q-matrix was the same for each test. The simulation results showed that (1) the Long-HDD had higher classification accuracy in longitudinal diagnostic data analysis; (2) the Long-HDD performed almost independently of sample size and outperformed Long-DINA with smaller samples; and (3) the Long-HDD consumed much less computation time than Long-DINA. In addition, the empirical results showed good consistency between the Long-HDD and the Long-DINA model in tracking changes in attribute development, and the percentage of students mastering each attribute increased over time points. In summary, the Long-HDD proposed in this study extends nonparametric methods to longitudinal cognitive diagnostic data and provides high classification accuracy. Compared with parameterized longitudinal DCMs (e.g., Long-DINA), it can provide timely diagnostic feedback because it is unaffected by sample size, simple to compute, and less time-consuming, making it better suited to small-scale longitudinal assessments at the class and school levels.
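The core of a Hamming-distance discriminant can be sketched in a few lines. This is an illustrative minimal version under DINA-style ideal responses; the weight `w` on the previous time point's pattern is a hypothetical device standing in for the longitudinal dependence the Long-HDD exploits, not the paper's exact formulation.

```python
from itertools import product

def hamming(u, v):
    """Number of positions at which two equal-length 0/1 vectors differ."""
    return sum(a != b for a, b in zip(u, v))

def ideal_response(pattern, q_matrix):
    """DINA-style ideal response: an item is answered correctly iff the
    pattern masters every attribute the item requires (rows of q_matrix)."""
    return [int(all(p >= q for p, q in zip(pattern, row))) for row in q_matrix]

def classify(responses, q_matrix, prev_pattern=None, w=0.0):
    """Pick the attribute pattern whose ideal responses are closest (in
    Hamming distance) to the observed responses; optionally add a weighted
    distance to the previous time point's pattern, mimicking the
    adjacent-time-point dependence of the Long-HDD (w is illustrative)."""
    K = len(q_matrix[0])
    best, best_d = None, float("inf")
    for pattern in product((0, 1), repeat=K):
        d = hamming(responses, ideal_response(pattern, q_matrix))
        if prev_pattern is not None:
            d += w * hamming(pattern, prev_pattern)
        if d < best_d:
            best, best_d = pattern, d
    return best
```

The brute-force search over all 2^K patterns is exactly why such nonparametric classifiers need no estimation run or large sample: classification is a direct distance computation per student, which accounts for the speed advantage over Long-DINA reported above.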

  • Joint-Cross-Loading Multimodal Cognitive Diagnostic Modeling Incorporating Visual Fixation Counts

    Subjects: Psychology >> Psychological Measurement. Submitted: 2021-11-30

    Abstract: Students' observed behavior (e.g., learning behavior and problem-solving behavior) comprises activities that reflect complicated cognitive processes and latent conceptions that are frequently systematically related to one another. Cognitive characteristics such as cognitive styles and fluency may differ between students with the same cognitive/knowledge structure. However, practically all current cognitive diagnosis models (CDMs), which assess only item response accuracy (RA) data, are incapable of estimating or inferring such individual differences in cognitive traits. With advances in technology-enhanced assessments, it is now possible to capture multimodal data automatically and simultaneously during problem-solving, such as outcome data (e.g., response accuracy), process data (e.g., response times (RTs)), and biometric data (e.g., visual fixation counts (FCs)). Multimodal data allow for precise diagnosis of cognitive structure as well as comprehensive feedback on various cognitive characteristics. First, using the joint analysis of RA, RT, and FC data as an example, this study elaborated three multimodal data analysis methods and models: separate modeling (S-MCDM), joint-hierarchical modeling (H-MCDM) (Zhan et al., 2021), and joint-cross-loading modeling (C-MCDM). Three C-MCDMs with distinct hypotheses were then presented based on joint-cross-loading modeling, namely the C-MCDM-θ, C-MCDM-D, and C-MCDM-C. Compared with the H-MCDM, the three C-MCDMs introduce two item-level weight parameters (i.e., φi and λi) into the RT and FC measurement models, respectively, to quantify the impact of latent ability or latent attributes on RT and FC. Model parameters were estimated with a full Bayesian approach via the Markov chain Monte Carlo method.
To illustrate the application of the three proposed models and compare them with the S-MCDM and H-MCDM, multimodal data from a real-world mathematics test were used. Data were gathered in an eye-tracking lab at a prominent university on the East Coast of the United States. A test of I = 10 mathematics items was given to N = 93 university students with normal or corrected vision. The test measured K = 4 attributes, and the corresponding Q-matrix is shown in Figure 3. The data comprise three modalities, RA, RT, and FC, all collected simultaneously, and were fitted with all five multimodal models. In addition, two simulation studies were conducted to further explore the psychometric performance of the proposed models. Simulation study 1 explored whether the parameter estimates of the proposed models converge effectively and how well parameters are recovered under different simulated test situations. Simulation study 2 explored the relative merits of the C-MCDMs and the H-MCDM, that is, the necessity of considering cross-loading in multimodal data analysis. The empirical results showed that (1) the C-MCDM-θ had the best model-data fit, followed by the H-MCDM and the S-MCDM; although the DIC showed that the C-MCDM-D and C-MCDM-C also fitted the data well, those results are only for reference because some parameter estimates in these two models did not converge; (2) the correlation between latent ability and latent processing speed and that between latent ability and latent concentration were weak, making it difficult to fully exploit the theoretical advantages of the H-MCDM over the S-MCDM (Ranger, 2013).
By contrast, since the C-MCDM-θ can directly utilize information from the RT and FC data, the standard errors of its latent ability estimates were significantly lower than those of the two competing models; and (3) the median of the φi estimates was below 0, indicating that for most items, the higher a participant's latent ability, the longer it took to solve the item; the median of the λi estimates was above 0, indicating that for most items, the higher a participant's latent ability, the more fixations he or she showed during problem-solving. Notably, the estimates of φi and λi do not always share the same sign across items, indicating that the influence of latent ability on RT and FC can run in different directions (i.e., facilitation or inhibition) for different items. Furthermore, simulation study 1 indicated that the parameter estimation of the three proposed models converged effectively and that model parameters were recovered well under different simulated test situations. Simulation study 2 indicated that ignoring possible cross-loadings has more severe adverse effects than redundantly considering them. Overall, the results of this study indicate that (1) fusion analysis is more suitable than separate analysis for multimodal data that provide parallel information; (2) through cross-loading, the proposed models can directly use information from RT and FC data to improve the estimation accuracy of latent ability or latent attributes; (3) the results of the proposed models can be used to diagnose cognitive structure and infer other cognitive characteristics such as cognitive styles and fluency; and (4) the proposed models are more compatible with different test situations than the H-MCDM.
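The cross-loading idea can be sketched as follows. This is a rough illustration only: the exact measurement models and sign conventions in the paper may differ, and all names and parameterizations here are assumptions chosen so that φi < 0 corresponds to longer RTs and λi > 0 to more fixations for higher-ability participants, matching the empirical pattern described above.

```python
import math

def log_rt_mean(beta_i, phi_i, theta):
    """Illustrative lognormal-RT mean with a cross-loading on latent
    ability theta: E[log T_i] = beta_i - phi_i * theta.
    Under this assumed convention, phi_i < 0 means higher ability
    lengthens RT on item i; phi_i > 0 means it shortens RT."""
    return beta_i - phi_i * theta

def expected_fc(gamma_i, lambda_i, theta):
    """Illustrative log-linear mean for a count model of fixation counts
    (e.g., Poisson): E[FC_i] = exp(gamma_i + lambda_i * theta),
    so lambda_i > 0 means more fixations for higher ability."""
    return math.exp(gamma_i + lambda_i * theta)
```

Because φi and λi are item-level, their signs can differ across items, which is how a cross-loading formulation accommodates the item-by-item facilitation/inhibition effects noted above.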

  • The Measurement of Problem-Solving Competence Using Process Data

    Subjects: Psychology >> Psychological Measurement. Submitted: 2021-10-04

    Abstract: Problem-solving competence is an individual’s capacity to engage in cognitive processing to understand and resolve problem situations where a method of solution is not immediately obvious. Measuring it requires relatively complex and realistic problem situations that elicit problem-solving behaviors, which challenges both the measurement methods and the corresponding data analysis methods. Using virtual assessments to capture process data during problem-solving and mining the information contained therein is a new trend in measuring problem-solving competence in psychometrics. We first reviewed the development of measurement methods for problem-solving competence, from paper-and-pencil tests to virtual assessments. We then summarized two types of process data analysis methods: data mining and statistical modeling. Finally, we outlined possible future research directions from five perspectives: the influence of non-cognitive factors on problem-solving competence, the use of multimodal data, the measurement of the development of problem-solving competence, the measurement of other higher-order thinking competencies, and the definition of the concept and structure of problem-solving competence.