• A Basic Probability Assignment Generation Method Based on Kernel Density Estimation

    Subjects: Computer Science >> Integration Theory of Computer Science · Submitted: 2019-05-10 · Cooperative journal: 《计算机应用研究》

    Abstract: D-S evidence theory is an effective method for processing uncertain information and is widely used in information fusion. However, determining the BPA (Basic Probability Assignment), the object on which D-S fusion operates, remains an open problem in applying D-S theory. This paper proposes a BPA determination method based on kernel density estimation (KDE). The method uses training data to construct a data attribute model with an optimized bandwidth based on kernel density estimation; it then computes the density-distance-distribution (Tri-D) value of each test sample using the training data's kernel density model. Next, it obtains the BPA of the test sample by assigning the Tri-D value with a nested method. Finally, the BPAs are fused by the D-S rule to obtain the final result, and the validity of the BPA generation method is judged by the classification accuracy. A comparison of classification accuracy with other methods on UCI data sets shows the effectiveness of the method.
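
    A minimal sketch of the bandwidth-optimized KDE step in Python, assuming scikit-learn, Gaussian kernels, and the Iris data as stand-ins. The paper's Tri-D nesting is simplified here to normalizing per-class densities into singleton masses, so this illustrates the idea rather than the authors' exact assignment scheme.

    ```python
    # Sketch only: per-class KDE with cross-validated bandwidth, then a
    # simplified BPA that normalizes class densities into singleton masses.
    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.model_selection import GridSearchCV
    from sklearn.neighbors import KernelDensity

    X, y = load_iris(return_X_y=True)

    def fit_class_kdes(X, y):
        """Fit one KDE per class, selecting the bandwidth by cross-validation."""
        kdes = {}
        for label in np.unique(y):
            grid = GridSearchCV(KernelDensity(kernel="gaussian"),
                                {"bandwidth": np.logspace(-1, 1, 20)}, cv=5)
            grid.fit(X[y == label])
            kdes[label] = grid.best_estimator_
        return kdes

    def singleton_bpa(kdes, x):
        """Turn per-class densities at x into a normalized BPA over singletons."""
        dens = np.array([np.exp(kde.score_samples(x[None]))[0]
                         for kde in kdes.values()])
        return dens / dens.sum()   # masses m({class_i})

    kdes = fit_class_kdes(X, y)
    print(singleton_bpa(kdes, X[0]))
    ```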

  • NLOF: A Two-Stage Outlier Detection Algorithm Based on Grid Filtering

    Subjects: Computer Science >> Integration Theory of Computer Science · Submitted: 2019-01-28 · Cooperative journal: 《计算机应用研究》

    Abstract: The purpose of outlier detection is to identify anomalous data effectively and to mine meaningful latent information in a data set. Existing outlier detection algorithms do not preprocess the original data, which leads to high time complexity and unsatisfactory detection results. This paper proposes NLOF, a two-stage outlier detection algorithm based on grid filtering. First, grid filtering screens the original data: points whose cell density is below a threshold are placed into a candidate outlier subset. Then, to further optimize the density-based stage, the density of each data point is computed from its k-neighborhood as the ratio of the number of points in the neighborhood to the area of the circle the neighborhood forms, and outlier detection on this density yields a more accurate outlier set. Experiments on a variety of public data sets show that the method achieves good anomaly-detection performance while reducing the time complexity of the algorithm.
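
    A rough sketch of the two stages under stated assumptions: 2-D points, a uniform grid whose cells are filtered by a count threshold, and a k-NN density d(p) = k / (pi * r_k^2) whose smallest values are flagged as outliers. The exact thresholds and scoring in the paper may differ.

    ```python
    # Sketch only: stage 1 keeps points from sparse grid cells as candidates,
    # stage 2 ranks candidates by a circle-area k-NN density estimate.
    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (200, 2)), rng.uniform(-6, 6, (10, 2))])

    def grid_filter(X, cells=10, min_count=3):
        """Stage 1: keep only points in sparsely populated grid cells."""
        mins, maxs = X.min(0), X.max(0)
        idx = np.floor((X - mins) / (maxs - mins + 1e-9) * cells).astype(int)
        keys, counts = np.unique(idx, axis=0, return_counts=True)
        sparse = {tuple(k) for k, c in zip(keys, counts) if c < min_count}
        mask = np.array([tuple(i) in sparse for i in idx])
        return np.where(mask)[0]   # indices of candidate outliers

    def knn_density_outliers(X, candidates, k=5, n_out=10):
        """Stage 2: rank candidates by k-NN circle density; lowest = outliers."""
        nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
        dist, _ = nn.kneighbors(X[candidates])
        r_k = dist[:, -1]                    # distance to the k-th neighbor
        density = k / (np.pi * r_k ** 2)     # points per unit area
        return candidates[np.argsort(density)[:n_out]]

    cand = grid_filter(X)
    print(knn_density_outliers(X, cand))
    ```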

  • Target-Specific Sentiment Analysis with a Hybrid Neural Network Based on the CRT Mechanism

    Subjects: Computer Science >> Integration Theory of Computer Science · Submitted: 2018-12-13 · Cooperative journal: 《计算机应用研究》

    Abstract: The purpose of target-specific sentiment analysis is to predict the sentiment of a text from the perspective of different target words; the key is to associate the appropriate sentiment words with a given target. When a sentence contains several sentiment words describing the sentiments of multiple targets, a sentiment word may be matched to the wrong target. This paper proposes a hybrid neural network based on a CRT mechanism for target-specific sentiment analysis. The model uses a CNN layer to extract features from word representations transformed by a BiLSTM; the CRT component generates a target-specific representation of each word while preserving the original context information from the BiLSTM layer. Experiments on three open data sets show that, compared with previous models, the proposed model significantly improves the accuracy and stability of target-specific sentiment analysis, demonstrating that the CRT mechanism integrates the advantages of CNN and LSTM well.
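
    Since the abstract does not specify the CRT component's internals, the PyTorch skeleton below is only illustrative: a BiLSTM context encoder, a CNN feature extractor on top, and a simple sigmoid gate standing in for the CRT mixing of target-aware and original BiLSTM states. All dimensions and the gating form are assumptions.

    ```python
    # Illustrative skeleton, not the authors' exact CRT architecture.
    import torch
    import torch.nn as nn

    class CRTHybrid(nn.Module):
        def __init__(self, vocab=10000, emb=100, hid=64, n_cls=3):
            super().__init__()
            self.emb = nn.Embedding(vocab, emb)
            self.bilstm = nn.LSTM(emb, hid, batch_first=True, bidirectional=True)
            self.gate = nn.Linear(4 * hid, 2 * hid)   # stand-in for CRT mixing
            self.conv = nn.Conv1d(2 * hid, 128, kernel_size=3, padding=1)
            self.fc = nn.Linear(128, n_cls)

        def forward(self, tokens, target_vec):
            h, _ = self.bilstm(self.emb(tokens))          # (B, T, 2*hid)
            tgt = target_vec.unsqueeze(1).expand_as(h)    # broadcast target state
            g = torch.sigmoid(self.gate(torch.cat([h, tgt], -1)))
            mixed = g * h + (1 - g) * tgt                 # keep context, add target
            feats = torch.relu(self.conv(mixed.transpose(1, 2))).max(-1).values
            return self.fc(feats)

    model = CRTHybrid()
    tokens = torch.randint(0, 10000, (2, 20))
    target = torch.zeros(2, 128)               # assumed target state, 2*hid dims
    print(model(tokens, target).shape)         # torch.Size([2, 3])
    ```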

  • A Sentence Classification Model Based on Convolutional Neural Networks and a Bayesian Classifier

    Subjects: Computer Science >> Integration Theory of Computer Science · Submitted: 2018-12-13 · Cooperative journal: 《计算机应用研究》

    Abstract: Traditional sentence classification models suffer from a complex feature extraction process and low classification accuracy. This paper exploits the feature extraction strength of the convolutional neural network, a popular deep learning model, and combines it with a traditional sentence classification method, proposing a sentence classification model based on a convolutional neural network and a Bayesian classifier. The model first uses a convolutional neural network to extract text features, then applies principal component analysis to reduce their dimensionality, and finally uses a Bayesian classifier to classify the sentences. Experimental results show that on Cornell University's public movie review data set and the Stanford Sentiment Treebank, the proposed method outperforms both models that use deep learning alone and traditional sentence classification models.
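
    A minimal sketch of the second half of the pipeline, assuming the CNN feature vectors have already been extracted (random features stand in for them here): PCA reduces the dimensionality, then a Gaussian naive Bayes classifier makes the sentence-level prediction.

    ```python
    # Sketch only: PCA + naive Bayes on stand-in features; in the paper these
    # features would come from the CNN layer.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB
    from sklearn.pipeline import make_pipeline

    rng = np.random.default_rng(0)
    feats = rng.normal(size=(1000, 300))            # stand-in for CNN features
    labels = (feats[:, :10].sum(1) > 0).astype(int) # synthetic binary labels

    X_tr, X_te, y_tr, y_te = train_test_split(feats, labels, random_state=0)
    clf = make_pipeline(PCA(n_components=50), GaussianNB())
    clf.fit(X_tr, y_tr)
    print("accuracy:", clf.score(X_te, y_te))
    ```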

  • An Optimized Automatic Summarization Algorithm Based on TextRank

    Subjects: Computer Science >> Integration Theory of Computer Science · Submitted: 2018-04-19 · Cooperative journal: 《计算机应用研究》

    Abstract: When summarizing Chinese texts, the traditional TextRank algorithm considers only the similarity between nodes and neglects other important information in the text. Targeting Chinese single documents and building on existing research, this paper extends the TextRank algorithm: in addition to the similarity between sentences, it incorporates the overall structural information of the text and the contextual information of each sentence, such as the physical position of a sentence in the document or paragraph, feature sentences, core sentences, and other cues that may increase a sentence's weight, to generate the candidate summary sentence group of the text. Redundancy processing then removes highly similar sentences from the candidate group. Experimental verification shows that the algorithm improves the accuracy of the generated summaries, indicating its effectiveness.
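
    A condensed sketch of position-weighted TextRank under stated assumptions: TF-IDF cosine similarity between sentences, a PageRank-style iteration, and a simple positional boost on the first and last sentences standing in for the structural features the abstract describes.

    ```python
    # Sketch only: sentence graph from TF-IDF cosine similarity, scores from a
    # personalized-PageRank iteration with an assumed positional prior.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def rank_sentences(sentences, d=0.85, iters=50):
        sim = cosine_similarity(TfidfVectorizer().fit_transform(sentences))
        np.fill_diagonal(sim, 0)
        sim /= sim.sum(1, keepdims=True) + 1e-9    # row-normalize the graph
        n = len(sentences)
        pos = np.ones(n)
        pos[0] = pos[-1] = 1.5                     # assumed positional boost
        score = np.ones(n) / n
        for _ in range(iters):
            score = (1 - d) * pos / pos.sum() + d * sim.T @ score
        return np.argsort(-score)                  # best sentences first

    docs = ["the cat sat on the mat", "dogs chase cats",
            "the mat was red", "cats like mats"]
    print(rank_sentences(docs))
    ```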

  • A New Word Discovery Algorithm Based on Mutual Information and Branch Entropy

    Subjects: Computer Science >> Integration Theory of Computer Science · Submitted: 2018-04-19 · Cooperative journal: 《计算机应用研究》

    Abstract: Identifying new words quickly and efficiently is a very important task in natural language processing. Aiming at the problems in new word discovery, this paper presents an algorithm that discovers new words character by character, from left to right, in an unsegmented Weibo corpus. Candidate new words are obtained by computing the mutual information between a candidate word and its right-adjacent word and expanding the candidate step by step; the candidates are then filtered to obtain the final new word set by computing branch entropy, deleting candidates whose first or last character is a stop word, and deleting existing words contained in the candidate set. The algorithm solves the problem that some new words cannot be recognized because of word segmentation errors, and also avoids the large number of repeated strings and garbage strings that n-gram methods identify as new words. Experiments verify the effectiveness of the algorithm.
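
    A toy sketch of the two statistics on an unsegmented corpus: pointwise mutual information over a candidate's internal splits measures how strongly a string coheres as a word, and branch entropy of the characters adjacent to it filters the candidates. The thresholds and the full character-by-character expansion loop from the paper are omitted.

    ```python
    # Sketch only: PMI and branch entropy computed on a tiny toy corpus.
    import math
    from collections import Counter

    corpus = "新冠疫苗接种新冠疫苗接种点新冠疫苗预约"

    def ngram_counts(text, n):
        return Counter(text[i:i + n] for i in range(len(text) - n + 1))

    def pmi(word, text):
        """Worst-case PMI over all two-part splits of `word`."""
        total = len(text)
        p_w = ngram_counts(text, len(word))[word] / total
        worst = max(ngram_counts(text, k)[word[:k]] / total *
                    ngram_counts(text, len(word) - k)[word[k:]] / total
                    for k in range(1, len(word)))
        return math.log(p_w / worst)

    def branch_entropy(word, text, right=True):
        """Entropy of the characters adjacent to `word` in the corpus."""
        neigh = Counter()
        start = text.find(word)
        while start != -1:
            i = start + len(word) if right else start - 1
            if 0 <= i < len(text):
                neigh[text[i]] += 1
            start = text.find(word, start + 1)
        total = sum(neigh.values())
        return -sum(c / total * math.log(c / total) for c in neigh.values())

    word = "新冠疫苗"
    print(pmi(word, corpus), branch_entropy(word, corpus))
    ```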