Your conditions: 汤勇韬
  • Span Classification Based Model For Clinical Concept Extraction

    Subjects: Computer Science >> Natural Language Understanding and Machine Translation; submitted 2020-10-27

    Abstract: Recently, structuring electronic medical records (EMRs) has attracted considerable attention from researchers. Extracting clinical concepts from EMRs is a critical part of EMR structuring, and the performance of clinical concept extraction directly affects the downstream tasks that depend on it. However, the mainstream method, the sequence labeling model, has some shortcomings. Clinical concept extraction based on sequence labeling does not conform to the human cognitive model of language. In addition, its extraction results are difficult to couple with downstream tasks, which causes error propagation and degrades downstream performance. To address these problems, we propose a span classification based method that improves the performance of clinical concept extraction by considering the overall semantics of a token sequence instead of the semantics of each individual token. We call this model the span classification model. Experiments show that the span classification model achieves the best micro-average F1 score (81.22%) on the corpus of the 2012 i2b2 NLP challenge, and obtains an F1 score (89.25%) comparable to the state of the art on the corpus of the 2010 i2b2 NLP challenge. Furthermore, our approach consistently outperforms sequence labeling models such as the BiLSTM-CRF model and the softmax classifier.
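    The general idea of classifying whole spans rather than tagging individual tokens can be illustrated with a minimal sketch. It is not the authors' model: the names `enumerate_spans`, `classify_spans`, `toy_classifier`, and `TOY_LEXICON`, the `max_width` limit, and the example labels are all assumptions made for illustration, and the lookup table stands in for whatever learned classifier scores each span.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Span = Tuple[int, int]  # half-open token interval [start, end)


def enumerate_spans(tokens: Sequence[str], max_width: int) -> List[Span]:
    """List every contiguous span of at most max_width tokens."""
    spans: List[Span] = []
    for start in range(len(tokens)):
        last_end = min(start + max_width, len(tokens))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end))
    return spans


def classify_spans(
    tokens: Sequence[str],
    max_width: int,
    classify: Callable[[Sequence[str]], Optional[str]],
) -> List[Tuple[int, int, str]]:
    """Label each candidate span as a whole; None means 'not a concept'."""
    results = []
    for start, end in enumerate_spans(tokens, max_width):
        label = classify(tokens[start:end])
        if label is not None:
            results.append((start, end, label))
    return results


# Hypothetical stand-in for a learned span classifier (assumption):
# a tiny lookup table keyed on the lowercased span text.
TOY_LEXICON = {("chest", "pain"): "PROBLEM", ("aspirin",): "TREATMENT"}


def toy_classifier(span_tokens: Sequence[str]) -> Optional[str]:
    return TOY_LEXICON.get(tuple(t.lower() for t in span_tokens))


tokens = "Patient reports chest pain and takes aspirin".split()
concepts = classify_spans(tokens, max_width=3, classify=toy_classifier)
print(concepts)
```

    Each output item is a complete (start, end, label) triple, so a downstream component can consume concepts directly without first decoding per-token tags into segments.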