Author: 刘彦楼
  • On the reliability of point estimation of model parameters: Taking cognitive diagnostic models as an example

    Subjects: Other Disciplines >> Synthetic discipline submitted time 2023-10-09 Cooperative journals: 《心理学报》

    Abstract: Cognitive diagnostic models (CDMs) are psychometric models that have received increasing attention within fields such as psychology, education, sociology, and biology. It has been argued that an inappropriate convergence criterion for maximum likelihood estimation using the expectation maximization (MLE-EM) algorithm could result in unpredictable and inaccurate model parameter estimates. Thus, inappropriate convergence criteria may yield unstable and misleading conclusions from the fitted CDMs. Although several convergence criteria have been developed, how to specify an appropriate convergence criterion for fitted CDMs remains an unexplored question. A comprehensive method for assessing convergence is proposed in this study. To minimize the influence of the model parameter estimation framework, a new framework adopting the multiple starting values strategy (mCDM) is introduced. To examine the performance of the convergence criterion for MLE-EM in CDMs, a simulation study under various conditions was conducted. Five convergence assessment methods were examined: the maximum absolute change in model parameters, the maximum absolute change in item endorsement probabilities and structural parameters, the absolute change in log-likelihood, the relative log-likelihood, and the comprehensive method. The data-generating models were the saturated CDM and the hierarchical CDM. The number of items was set to J = 16 and 32. Three levels of sample size were considered: 500, 1000, and 4000. The three convergence tolerance values were 10^-4, 10^-6, and 10^-8. The simulated response data were fitted by the saturated CDM using the mCDM and the R package GDINA. The maximum number of iterations was set to 50000. The simulation results suggest the following. (1) The saturated CDM converged under all conditions. However, the actual number of iterations exceeded 30000 under some conditions, implying that when the predefined maximum number of iterations is less than 30000, the MLE-EM algorithm might stop prematurely. (2) The model parameter estimation framework affected the performance of the convergence criteria. The performance of the convergence criteria under the mCDM framework was comparable or superior to that under the GDINA framework. (3) Among the convergence tolerance values considered in this study, 10^-8 consistently had the best performance in providing the maximum value of the log-likelihood and 10^-4 had the worst. Compared with all other convergence assessment methods, the comprehensive method in general had the best performance, especially under the mCDM framework. The performance of the maximum absolute change in model parameters was similar to that of the comprehensive method, but this good performance was not consistent. In contrast, the relative log-likelihood had the worst performance under both the mCDM and GDINA frameworks. The simulation results showed that the most appropriate convergence criterion for MLE-EM in CDMs was the comprehensive method with tolerance 10^-8 under the mCDM framework. The results from the real data analysis also demonstrated that the proposed comprehensive method and mCDM framework had good performance.
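
    As a rough illustration of how the stopping rules compared above differ, the convergence criteria can be written as simple functions of successive parameter vectors and log-likelihood values. This Python sketch is not the authors' implementation; the `comprehensive_converged` rule shown here, which requires the parameter-based and likelihood-based checks to pass simultaneously, is one plausible reading of the abstract.

```python
def max_abs_param_change(theta_old, theta_new):
    # Criterion 1: largest absolute change in any model parameter.
    # Criterion 2 is the same quantity restricted to the item endorsement
    # probabilities and structural parameters.
    return max(abs(a - b) for a, b in zip(theta_old, theta_new))

def abs_loglik_change(ll_old, ll_new):
    # Criterion 3: absolute change in the log-likelihood between iterations
    return abs(ll_new - ll_old)

def rel_loglik_change(ll_old, ll_new):
    # Criterion 4: log-likelihood change relative to its magnitude
    return abs(ll_new - ll_old) / abs(ll_old)

def comprehensive_converged(theta_old, theta_new, ll_old, ll_new, tol=1e-8):
    # Comprehensive rule (our reading of the abstract): declare convergence
    # only when the parameter-based and likelihood-based checks both fall
    # below the tolerance, so neither signal can trigger an early stop alone
    return (max_abs_param_change(theta_old, theta_new) < tol
            and abs_loglik_change(ll_old, ll_new) < tol)
```

A stricter tolerance such as 10^-8 simply makes each of these inequalities harder to satisfy, which is why it trades extra iterations for a log-likelihood closer to the maximum.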

  • On the reliability of point estimation of model parameters: Taking CDMs as an example

    Subjects: Psychology >> Psychological Measurement Subjects: Psychology >> Statistics in Psychology submitted time 2023-05-11

    Abstract: Cognitive diagnostic models (CDMs) are psychometric models that have received increasing attention within fields such as psychology, education, sociology, and biology. It has been argued that an inappropriate convergence criterion for the MLE-EM (maximum likelihood estimation using the expectation maximization) algorithm could result in unpredictably distorted model parameter estimates, and thus may yield unstable and misleading conclusions drawn from the fitted CDMs. Although several convergence criteria have been developed, how to specify an appropriate convergence criterion for the fitted CDMs remains an unexplored question.
    A comprehensive method for assessing convergence is proposed in this study. To minimize the influence of the model parameter estimation framework, a new framework adopting the multiple starting values strategy (mCDM) is introduced. To examine the performance of the convergence criterion for MLE-EM in CDMs, a simulation study under various conditions was conducted. Five convergence assessment methods were examined: the maximum absolute change in model parameters, the maximum absolute change in item endorsement probabilities and structural parameters, the absolute change in log-likelihood, the relative log-likelihood, and the comprehensive method. The data-generating models were the saturated CDM and the hierarchical CDM. The number of items was set to J = 16 and 32. Three levels of sample size were considered: 500, 1000, and 4000. The three convergence tolerance values were 10^-4, 10^-6, and 10^-8. The simulated response data were fitted by the saturated CDM using the mCDM and the R package GDINA. The maximum number of iterations was set to 50000.
    The simulation results suggest the following:
    (1) The saturated CDM converged under all conditions. However, the actual number of iterations exceeded 30000 under some conditions, implying that when the predefined maximum number of iterations is less than 30000, the MLE-EM algorithm might stop prematurely.
    (2) The model parameter estimation framework affected the performance of the convergence criteria. The performance of the convergence criteria under the mCDM framework was comparable or superior to that of the GDINA framework.
    (3) Among the convergence tolerance values considered in this study, 10^-8 consistently had the best performance in providing the maximum value of the log-likelihood, and 10^-4 had the worst. Compared with all other convergence assessment methods, the comprehensive method in general had the best performance, especially under the mCDM framework. The performance of the maximum absolute change in model parameters was similar to that of the comprehensive method; however, its good performance was not guaranteed. In contrast, the relative log-likelihood had the worst performance under both the mCDM and GDINA frameworks.
    The simulation results showed that the most appropriate convergence criterion for MLE-EM in CDMs was the comprehensive method with tolerance 10^-8 under the mCDM framework. Results from the real data analysis also demonstrated the good performance of the proposed comprehensive method and mCDM framework.
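
    The multiple-starting-values strategy behind mCDM can be illustrated on a toy model. The sketch below runs EM for a two-class latent class model on binary items (a deliberately reduced stand-in for a CDM) from several random starts and keeps the solution with the highest log-likelihood; it is illustrative only, not the mCDM code.

```python
import math
import random

def loglik_and_posteriors(data, w, p):
    # E-step quantities for a 2-class latent class model on binary items:
    # total log-likelihood and each person's posterior class probabilities
    ll, post = 0.0, []
    for x in data:
        like = []
        for c in range(2):
            l = w[c]
            for j, xj in enumerate(x):
                l *= p[c][j] if xj else (1.0 - p[c][j])
            like.append(l)
        tot = like[0] + like[1]
        ll += math.log(tot)
        post.append([like[0] / tot, like[1] / tot])
    return ll, post

def fit_em(data, rng, max_iter=500, tol=1e-8):
    # One EM run from a random starting point; tol is the absolute
    # log-likelihood change criterion discussed in the abstract
    J = len(data[0])
    w = [0.5, 0.5]
    p = [[rng.uniform(0.2, 0.8) for _ in range(J)] for _ in range(2)]
    old_ll = -math.inf
    for _ in range(max_iter):
        ll, post = loglik_and_posteriors(data, w, p)
        if abs(ll - old_ll) < tol:
            break
        old_ll = ll
        n = [sum(r[c] for r in post) for c in range(2)]       # M-step
        w = [n[c] / len(data) for c in range(2)]
        p = [[min(0.999, max(0.001,
                 sum(r[c] * x[j] for r, x in zip(post, data)) / n[c]))
              for j in range(J)] for c in range(2)]
    ll, _ = loglik_and_posteriors(data, w, p)
    return ll, w, p

def multi_start_em(data, n_starts=5, seed=1):
    # Multiple-starting-values strategy: rerun EM from several random
    # initializations and keep the solution with the highest log-likelihood,
    # reducing the risk of reporting a local maximum
    rng = random.Random(seed)
    return max((fit_em(data, rng) for _ in range(n_starts)),
               key=lambda fit: fit[0])
```

By construction, the best-of-n-starts log-likelihood can never be worse than that of any single run, which is the point of the strategy.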

  • Standard error and confidence interval estimation for cognitive diagnostic models: The parallel bootstrap method

    Subjects: Psychology >> Social Psychology submitted time 2023-03-27 Cooperative journals: 《心理学报》

    Abstract: The model parameter standard error (SE; or variance-covariance matrix), which provides an estimate of the uncertainty associated with the model parameter estimate, has both theoretical and practical implications in cognitive diagnostic models (CDMs). The drawbacks of the analytic methods, such as the empirical cross-product information matrix, observed information matrix, and “robust” sandwich-type information matrix, are that they require the positive definiteness of the information matrix and may suffer from boundary problems. Another method for estimating model parameter SEs is to use the computer-intensive bootstrap method, and consequently, no study has systematically explored the performance of the bootstrap in calculating model parameter SEs and confidence intervals (CIs) in CDMs. The purpose of this research is to present two new highly efficient bootstrap methods to calculate model parameter SEs and CIs in CDMs, namely the parallel parametric bootstrap (pPB) and parallel non-parametric bootstrap (pNPB) methods. A simulation study was conducted to evaluate the performance of the pPB and pNPB methods. Five factors that may influence the performance of the model parameter SEs and CIs were manipulated. The two model specification scenarios considered in this simulation were the correctly specified and over-specified models. The sample size was set to two levels: 1,000 and 3,000. Three bootstrap sample sizes were manipulated: 200, 500, and 3,000. Three levels of item quality were considered: high [P(0) = 0.1, P(1) = 0.9], moderate [P(0) = 0.2, P(1) = 0.8], and low quality [P(0) = 0.3, P(1) = 0.7]. The pPB and pNPB methods were used to estimate model parameter SEs and CIs. The simulation results indicated the following.
(1) For the correctly specified CDMs, under the high- or moderate-item-quality conditions, the coverage rates of the 95% CIs of the model parameter SEs based on the pNPB or pPB method were reasonably close to the expected coverage rate, and the bias for each model parameter SE converged to zero, meaning that the estimated SE was almost identical to the empirical SE. The increase in the bootstrap sample size had only a slight effect on the performance of the pNPB or pPB method. Under the low-item-quality condition, the pNPB method tended to over-estimate the SE, whereas the opposite trend was observed for the pPB method. (2) For the over-specified CDMs, most of the permissible item parameter SEs and almost all of the permissible structural parameter SEs exhibited good performance in terms of the 95% CI coverage rates and bias. Under most of the simulation conditions, the impermissible model parameter SEs did not exhibit good performance in approximating the empirical SEs. To the best of our knowledge, this is the first study in which the performance of the bootstrap method in estimating model parameter SEs and CIs in CDMs is systematically investigated. The pNPB or pPB method appears to be a useful tool for researchers interested in evaluating the uncertainty of the model parameter point estimates. As a time-saving computational strategy, the pNPB or pPB method is substantially faster than the usual bootstrap method. The simulation and real data studies showed that 3,000 re-samples might be adequate for the bootstrap method in calculating model parameter SEs and CIs in CDMs.
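
    The division of labor behind a parallel non-parametric bootstrap can be sketched as follows. Here a hypothetical `estimate` function (the sample mean) stands in for refitting the CDM, and threads stand in for the process-level parallelism of pNPB; the SE and percentile-CI computations follow the usual bootstrap recipe.

```python
import random
import statistics
from concurrent.futures import ThreadPoolExecutor

def estimate(sample):
    # Stand-in for refitting the model: a real pNPB run would re-estimate
    # every CDM parameter on each bootstrap sample (hypothetical placeholder)
    return statistics.fmean(sample)

def bootstrap_chunk(data, n_reps, seed):
    # One worker's share of the bootstrap replicates: resample persons
    # with replacement, then re-estimate on each resample
    rng = random.Random(seed)
    return [estimate(rng.choices(data, k=len(data))) for _ in range(n_reps)]

def parallel_npb(data, n_boot=3000, n_workers=4, seed=7):
    # Split the B replicates evenly across workers; threads keep this
    # sketch self-contained, whereas the pNPB method farms the refits
    # out across processes/cores for the real speed-up
    per_worker = n_boot // n_workers
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        chunks = pool.map(bootstrap_chunk,
                          [data] * n_workers,
                          [per_worker] * n_workers,
                          range(seed, seed + n_workers))
    reps = sorted(r for chunk in chunks for r in chunk)
    se = statistics.stdev(reps)                      # bootstrap SE
    lo = reps[int(0.025 * len(reps))]                # percentile 95% CI
    hi = reps[int(0.975 * len(reps)) - 1]
    return se, (lo, hi)
```

Because the replicates are independent, the work parallelizes with no communication between workers, which is why the parallel variant scales almost linearly with the number of cores.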

  • Q-matrix validation for cognitive diagnostic models: The role of the complete information matrix

    Subjects: Psychology >> Social Psychology submitted time 2023-03-27 Cooperative journals: 《心理学报》

    Abstract: A Q-matrix, which defines the relations between latent attributes and items, is a central building block of cognitive diagnostic models (CDMs). In practice, a Q-matrix is usually specified subjectively by domain experts, and it might therefore contain some misspecifications. A misspecified Q-matrix can cause several serious problems, such as inaccurate model parameters and erroneous attribute profile classifications. Several Q-matrix validation methods have been developed in the literature, such as the G-DINA discrimination index (GDI), the Wald test based on an incomplete information matrix (Wald-IC), and Hull methods. Although these methods have shown promising results on the Q-matrix recovery rate (QRR) and true positive rate (TPR), a common drawback is that they obtain poor results on the true negative rate (TNR). It is important to note that the poor performance of the Wald-IC method on TNR might be caused by the incorrect computation of the information matrix. A new Q-matrix validation method is proposed in this paper that constructs a Wald test with a complete empirical cross-product information matrix (XPD). A simulation study was conducted to evaluate the performance of the Wald-XPD method and compare it with the GDI, Wald-IC, and Hull methods. Five factors that may influence the performance of Q-matrix validation were manipulated. Attribute patterns were generated following either a uniform distribution or a higher-order distribution. The misspecification rate was set to two levels: QM = 0.15 and QM = 0.3. Two sample sizes were manipulated: 500 and 1000. The three levels of IQ were defined as high IQ, Pj(0) ~ U(0, 0.2) and Pj(1) ~ U(0.8, 1); medium IQ, Pj(0) ~ U(0.1, 0.3) and Pj(1) ~ U(0.7, 0.9); and low IQ, Pj(0) ~ U(0.2, 0.4) and Pj(1) ~ U(0.6, 0.8). The number of attributes was fixed at K = 4. Two ratios of the number of items to attributes were considered in the study: J = 16 [(K = 4) × (J/K = 4)] and J = 32 [(K = 4) × (J/K = 8)].
The simulation results showed the following. (1) The Wald-XPD method always provided the best results or was close to the best-performing method across the different factor levels, especially in terms of TNR. The HullP and Wald-IC methods produced larger values of QRR and TPR but smaller values of TNR. A similar pattern was observed between HullP and HullR, with HullP being better than HullR. Among the Q-matrix validation methods considered in this study, the GDI method was the worst performer. (2) The comparison of the HullP, Wald-IC, and Wald-XPD methods suggested that the Wald-XPD method is preferable for Q-matrix validation. Even though the HullP and Wald-IC methods could provide higher TPR values when the conditions were particularly unfavorable (e.g., low item quality, short test length, and small sample size), they obtained very low TNR values. The practical application of the Wald-XPD method was illustrated using real data. In conclusion, the Wald-XPD method has excellent power to detect and correct misspecified q-entries. In addition, it is a generic method that can serve as an important complement to domain experts’ judgement, which could reduce their workload.
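
    The empirical cross-product (outer-product-of-scores) information matrix that distinguishes Wald-XPD can be illustrated in the simplest scalar case, a Bernoulli proportion. This is our illustration of the XPD idea only, not the paper's multi-parameter implementation, which applies the same construction to the full vector of CDM parameters.

```python
def xpd_information(data, p_hat):
    # Empirical cross-product (XPD) information: the sum over observations
    # of the squared score contribution. For a Bernoulli log-likelihood the
    # per-person score is (x_i - p) / (p * (1 - p)); in the multi-parameter
    # case these would be outer products of score vectors.
    return sum(((x - p_hat) / (p_hat * (1.0 - p_hat))) ** 2 for x in data)

def wald_statistic(data, p0):
    # Wald test of H0: p = p0, with the parameter variance taken from the
    # inverse of the XPD information (a scalar analogue of Wald-XPD)
    p_hat = sum(data) / len(data)
    variance = 1.0 / xpd_information(data, p_hat)
    return (p_hat - p0) ** 2 / variance
```

Under the null, the statistic is compared with a chi-square quantile (3.84 at the 5% level with one degree of freedom); using the complete information matrix is what the abstract argues repairs the Wald-IC method's TNR.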

  • Q-matrix validation for cognitive diagnostic models: The role of the complete information matrix

    Subjects: Psychology >> Statistics in Psychology Subjects: Psychology >> Psychological Measurement submitted time 2022-07-15

    Abstract:

    A Q-matrix, which defines the relations between latent attributes and items, is a central building block of cognitive diagnostic models (CDMs). In practice, a Q-matrix is usually specified subjectively by domain experts, and it might therefore contain some misspecifications. A misspecified Q-matrix can cause several serious problems, such as inaccurate model parameters and erroneous attribute profile classifications. Several Q-matrix validation methods have been developed in the literature, such as the G-DINA discrimination index (GDI), the Wald test based on an incomplete information matrix (Wald-IC), and Hull methods. Although these methods have shown promising results on the Q-matrix recovery rate (QRR) and true positive rate (TPR), a common drawback is that they obtain poor results on the true negative rate (TNR). It is important to note that the poor performance of the Wald-IC method on TNR might be caused by the incorrect computation of the information matrix.

    A new Q-matrix validation method is proposed in this paper that constructs a Wald test with a complete empirical cross-product information matrix (XPD). A simulation study was conducted to evaluate the performance of the Wald-XPD method and compare it with the GDI, Wald-IC, and Hull methods. Five factors that may influence the performance of Q-matrix validation were manipulated. Attribute patterns were generated following either a uniform distribution or a higher-order distribution. The misspecification rate was set to two levels: QM = 0.15 and QM = 0.3. Two sample sizes were manipulated: 500 and 1000. The three levels of IQ were defined as high IQ, Pj(0) ~ U(0, 0.2) and Pj(1) ~ U(0.8, 1); medium IQ, Pj(0) ~ U(0.1, 0.3) and Pj(1) ~ U(0.7, 0.9); and low IQ, Pj(0) ~ U(0.2, 0.4) and Pj(1) ~ U(0.6, 0.8). The number of attributes was fixed at K = 4. Two ratios of the number of items to attributes were considered in the study: J = 16 [(K = 4) × (J/K = 4)] and J = 32 [(K = 4) × (J/K = 8)].

    The simulation results showed the following.

    (1) The Wald-XPD method always provided the best results or was close to the best-performing method across the different factor levels, especially in terms of TNR. The HullP and Wald-IC methods produced larger values of QRR and TPR but smaller values of TNR. A similar pattern was observed between HullP and HullR, with HullP being better than HullR. Among the Q-matrix validation methods considered in this study, the GDI method was the worst performer.

    (2) The comparison of the HullP, Wald-IC, and Wald-XPD methods suggested that the Wald-XPD method is preferable for Q-matrix validation. Even though the HullP and Wald-IC methods could provide higher TPR values when the conditions were particularly unfavorable (e.g., low item quality, short test length, and small sample size), they obtained very low TNR values. The practical application of the Wald-XPD method was illustrated using real data.

    In conclusion, the Wald-XPD method has excellent power to detect and correct misspecified q-entries. In addition, it is a generic method that can serve as an important complement to domain experts’ judgement, which could reduce their workload.
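
    The three evaluation rates used throughout these abstracts can be made concrete. The sketch below uses the definitions common in the Q-matrix validation literature (TPR: misspecified q-entries that validation corrected; TNR: correctly specified q-entries that validation retained; QRR: whole Q-matrices recovered across replications); the paper's exact operationalization may differ.

```python
def tpr_tnr(q_true, q_mis, q_validated):
    # TPR: share of truly misspecified q-entries that the validation
    # method corrected back to the true value.
    # TNR: share of correctly specified q-entries that the validation
    # method left intact (did not "over-correct").
    tp = tn = n_mis = n_ok = 0
    for i in range(len(q_true)):
        for j in range(len(q_true[0])):
            if q_mis[i][j] != q_true[i][j]:
                n_mis += 1
                tp += q_validated[i][j] == q_true[i][j]
            else:
                n_ok += 1
                tn += q_validated[i][j] == q_true[i][j]
    return tp / n_mis, tn / n_ok

def qrr(q_true, validated_qs):
    # QRR: proportion of replications in which the entire Q-matrix
    # is recovered exactly
    return sum(q == q_true for q in validated_qs) / len(validated_qs)
```

The trade-off the abstract describes is visible in these definitions: a method that aggressively flags entries can raise TPR while destroying TNR, which is why TNR is reported separately.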

  • Standard error and confidence interval estimation for cognitive diagnostic models: The parallel bootstrap method

    Subjects: Psychology >> Statistics in Psychology submitted time 2022-01-26

    Abstract:

    The model parameter standard error (SE; or variance-covariance matrix), which provides an estimate of the uncertainty associated with the model parameter estimate, has both theoretical and practical implications in cognitive diagnostic models (CDMs). The drawbacks of the analytic methods, such as the empirical cross-product information matrix, observed information matrix, and “robust” sandwich-type information matrix, are that they require the positive definiteness of the information matrix and may suffer from boundary problems. Another method for estimating model parameter SEs is to use the computer-intensive bootstrap method, and consequently, no study has systematically explored the performance of the bootstrap in calculating model parameter SEs and confidence intervals (CIs) in CDMs.

    The purpose of this research is to present two new highly efficient bootstrap methods to calculate model parameter SEs and CIs in CDMs, namely the parallel parametric bootstrap (pPB) and parallel non-parametric bootstrap (pNPB) methods. A simulation study was conducted to evaluate the performance of the pPB and pNPB methods. Five factors that may influence the performance of the model parameter SEs and CIs were manipulated. The two model specification scenarios considered in this simulation were the correctly specified and over-specified models. The sample size was set to two levels: 1,000 and 3,000. Three bootstrap sample sizes were manipulated: 200, 500, and 3,000. Three levels of item quality were considered: high [P(0) = 0.1, P(1) = 0.9], moderate [P(0) = 0.2, P(1) = 0.8], and low quality [P(0) = 0.3, P(1) = 0.7]. The pPB and pNPB methods were used to estimate model parameter SEs and CIs.

    The simulation results indicated the following.

    (1) For the correctly specified CDMs, under the high- or moderate-item-quality conditions, the coverage rates of the 95% CIs of the model parameter SEs based on the pNPB or pPB method were reasonably close to the expected coverage rate, and the bias for each model parameter SE converged to zero, meaning that the estimated SE was almost identical to the empirical SE. The increase in the bootstrap sample size had only a slight effect on the performance of the pNPB or pPB method. Under the low-item-quality condition, the pNPB method tended to over-estimate SE, whereas a contrary trend was observed for the pPB method.

    (2) For the over-specified CDMs, most of the permissible item parameter SEs and almost all of the permissible structural parameter SEs exhibited good performance in terms of the 95% CI coverage rates and bias. Under most of the simulation conditions, the impermissible model parameter SEs did not exhibit good performance in approximating the empirical SEs.

    To the best of our knowledge, this is the first study in which the performance of the bootstrap method in estimating model parameter SEs and CIs in CDMs is systematically investigated. The pNPB or pPB method appears to be a useful tool for researchers interested in evaluating the uncertainty of the model parameter point estimates. As a time-saving computational strategy, the pNPB or pPB method is substantially faster than the usual bootstrap method. The simulation and real data studies showed that 3,000 re-samples might be adequate for the bootstrap method in calculating model parameter SEs and CIs in CDMs.
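
    The coverage and bias summaries used above to judge SE quality can be sketched directly: given replicated point estimates and their estimated SEs, compare the average estimated SE with the empirical SE (the SD of the estimates) and count how often 95% CIs cover the true value. This is an illustrative sketch; the paper's intervals are bootstrap-based, and normal-theory intervals are used here only to keep the example short.

```python
import statistics

def evaluate_se(estimates, ses, true_value, z=1.96):
    # Empirical SE: the standard deviation of the point estimates
    # across simulation replications
    empirical_se = statistics.stdev(estimates)
    # Bias of the estimated SEs relative to the empirical SE; values
    # near zero mean the estimated SE tracks the empirical SE
    bias = statistics.fmean(ses) - empirical_se
    # Coverage: share of 95% CIs (est +/- z * SE) containing the truth;
    # well-calibrated SEs should give coverage near 0.95
    coverage = statistics.fmean(
        abs(est - true_value) <= z * se for est, se in zip(estimates, ses))
    return empirical_se, bias, coverage
```

Over-estimated SEs (as the abstract reports for pNPB under low item quality) push coverage above the nominal 95%, while under-estimated SEs push it below.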