• A Basic Probability Assignment Generation Method Based on Kernel Density Estimation

    Subjects: Computer Science >> Integration Theory of Computer Science · Submitted: 2019-05-10 · Cooperative journal: 《计算机应用研究》

    Abstract: D-S evidence theory is an effective method for processing uncertain information and is widely used in information fusion. However, determining the BPA (Basic Probability Assignment), the object on which D-S fusion operates, remains an open problem in applying D-S theory. This paper proposes a BPA determination method based on kernel density estimation (KDE). The method uses training data to construct a data attribute model with an optimized bandwidth based on kernel density estimation; it then computes the density-distance-distribution (Tri-D) value of each test sample using the training data's kernel density model. Next, it obtains the BPA of the test sample by assigning the Tri-D value with a nested method. Finally, the BPAs are fused by the D-S rule to obtain the final result, and the validity of the BPA generation method is judged by the classification accuracy. A comparison of classification accuracy with other methods on UCI data sets shows the effectiveness of the method.
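
    A minimal sketch of the bandwidth-optimized KDE step in Python, assuming scikit-learn, Gaussian kernels, and the Iris data as stand-ins. The paper's Tri-D nesting is simplified here to normalizing per-class densities into singleton masses, so this illustrates the idea rather than the authors' exact assignment scheme.

    ```python
    # Sketch only: per-class KDE with cross-validated bandwidth, then a
    # simplified BPA that normalizes class densities into singleton masses.
    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.model_selection import GridSearchCV
    from sklearn.neighbors import KernelDensity

    X, y = load_iris(return_X_y=True)

    def fit_class_kdes(X, y):
        """Fit one KDE per class, selecting the bandwidth by cross-validation."""
        kdes = {}
        for label in np.unique(y):
            grid = GridSearchCV(KernelDensity(kernel="gaussian"),
                                {"bandwidth": np.logspace(-1, 1, 20)}, cv=5)
            grid.fit(X[y == label])
            kdes[label] = grid.best_estimator_
        return kdes

    def singleton_bpa(kdes, x):
        """Turn per-class densities at x into a normalized BPA over singletons."""
        dens = np.array([np.exp(kde.score_samples(x[None]))[0]
                         for kde in kdes.values()])
        return dens / dens.sum()   # masses m({class_i})

    kdes = fit_class_kdes(X, y)
    print(singleton_bpa(kdes, X[0]))
    ```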

  • NLOF: A Two-Stage Outlier Detection Algorithm Based on Grid Filtering

    Subjects: Computer Science >> Integration Theory of Computer Science · Submitted: 2019-01-28 · Cooperative journal: 《计算机应用研究》

    Abstract: The purpose of outlier detection is to identify anomalous data effectively and to mine meaningful latent information in a data set. Existing outlier detection algorithms do not preprocess the original data, which leads to high time complexity and unsatisfactory detection results. This paper proposes NLOF, a two-stage outlier detection algorithm based on grid filtering. First, grid filtering screens the original data: points whose cell density is below a threshold are placed into a candidate outlier subset. Then, to further optimize the density-based stage, the density of each data point is computed from its k-neighborhood as the ratio of the number of points in the neighborhood to the area of the circle the neighborhood forms, and outlier detection on this density yields a more accurate outlier set. Experiments on a variety of public data sets show that the method achieves good anomaly-detection performance while reducing the time complexity of the algorithm.
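
    A rough sketch of the two stages under stated assumptions: 2-D points, a uniform grid whose cells are filtered by a count threshold, and a k-NN density d(p) = k / (pi * r_k^2) whose smallest values are flagged as outliers. The exact thresholds and scoring in the paper may differ.

    ```python
    # Sketch only: stage 1 keeps points from sparse grid cells as candidates,
    # stage 2 ranks candidates by a circle-area k-NN density estimate.
    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (200, 2)), rng.uniform(-6, 6, (10, 2))])

    def grid_filter(X, cells=10, min_count=3):
        """Stage 1: keep only points in sparsely populated grid cells."""
        mins, maxs = X.min(0), X.max(0)
        idx = np.floor((X - mins) / (maxs - mins + 1e-9) * cells).astype(int)
        keys, counts = np.unique(idx, axis=0, return_counts=True)
        sparse = {tuple(k) for k, c in zip(keys, counts) if c < min_count}
        mask = np.array([tuple(i) in sparse for i in idx])
        return np.where(mask)[0]   # indices of candidate outliers

    def knn_density_outliers(X, candidates, k=5, n_out=10):
        """Stage 2: rank candidates by k-NN circle density; lowest = outliers."""
        nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
        dist, _ = nn.kneighbors(X[candidates])
        r_k = dist[:, -1]                    # distance to the k-th neighbor
        density = k / (np.pi * r_k ** 2)     # points per unit area
        return candidates[np.argsort(density)[:n_out]]

    cand = grid_filter(X)
    print(knn_density_outliers(X, cand))
    ```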

  • Target-Specific Sentiment Analysis with a Hybrid Neural Network Based on the CRT Mechanism

    Subjects: Computer Science >> Integration Theory of Computer Science · Submitted: 2018-12-13 · Cooperative journal: 《计算机应用研究》

    Abstract: The purpose of target-specific sentiment analysis is to predict the sentiment of a text from the perspective of different target words; the key is to associate the appropriate sentiment words with a given target. When a sentence contains several sentiment words describing the sentiments of multiple targets, a sentiment word may be matched to the wrong target. This paper proposes a hybrid neural network based on a CRT mechanism for target-specific sentiment analysis. The model uses a CNN layer to extract features from word representations transformed by a BiLSTM; the CRT component generates a target-specific representation of each word while preserving the original context information from the BiLSTM layer. Experiments on three open data sets show that, compared with previous models, the proposed model significantly improves the accuracy and stability of target-specific sentiment analysis, demonstrating that the CRT mechanism integrates the advantages of CNN and LSTM well.
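
    Since the abstract does not specify the CRT component's internals, the PyTorch skeleton below is only illustrative: a BiLSTM context encoder, a CNN feature extractor on top, and a simple sigmoid gate standing in for the CRT mixing of target-aware and original BiLSTM states. All dimensions and the gating form are assumptions.

    ```python
    # Illustrative skeleton, not the authors' exact CRT architecture.
    import torch
    import torch.nn as nn

    class CRTHybrid(nn.Module):
        def __init__(self, vocab=10000, emb=100, hid=64, n_cls=3):
            super().__init__()
            self.emb = nn.Embedding(vocab, emb)
            self.bilstm = nn.LSTM(emb, hid, batch_first=True, bidirectional=True)
            self.gate = nn.Linear(4 * hid, 2 * hid)   # stand-in for CRT mixing
            self.conv = nn.Conv1d(2 * hid, 128, kernel_size=3, padding=1)
            self.fc = nn.Linear(128, n_cls)

        def forward(self, tokens, target_vec):
            h, _ = self.bilstm(self.emb(tokens))          # (B, T, 2*hid)
            tgt = target_vec.unsqueeze(1).expand_as(h)    # broadcast target state
            g = torch.sigmoid(self.gate(torch.cat([h, tgt], -1)))
            mixed = g * h + (1 - g) * tgt                 # keep context, add target
            feats = torch.relu(self.conv(mixed.transpose(1, 2))).max(-1).values
            return self.fc(feats)

    model = CRTHybrid()
    tokens = torch.randint(0, 10000, (2, 20))
    target = torch.zeros(2, 128)               # assumed target state, 2*hid dims
    print(model(tokens, target).shape)         # torch.Size([2, 3])
    ```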

  • A Sentence Classification Model Based on Convolutional Neural Networks and a Bayesian Classifier

    Subjects: Computer Science >> Integration Theory of Computer Science · Submitted: 2018-12-13 · Cooperative journal: 《计算机应用研究》

    Abstract: Traditional sentence classification models suffer from a complex feature extraction process and low classification accuracy. This paper exploits the feature extraction strength of the convolutional neural network, a popular deep learning model, and combines it with a traditional sentence classification method, proposing a sentence classification model based on a convolutional neural network and a Bayesian classifier. The model first uses a convolutional neural network to extract text features, then applies principal component analysis to reduce their dimensionality, and finally uses a Bayesian classifier to classify the sentences. Experimental results show that on Cornell University's public movie review data set and the Stanford Sentiment Treebank, the proposed method outperforms both models that use deep learning alone and traditional sentence classification models.
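
    A minimal sketch of the second half of the pipeline, assuming the CNN feature vectors have already been extracted (random features stand in for them here): PCA reduces the dimensionality, then a Gaussian naive Bayes classifier makes the sentence-level prediction.

    ```python
    # Sketch only: PCA + naive Bayes on stand-in features; in the paper these
    # features would come from the CNN layer.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB
    from sklearn.pipeline import make_pipeline

    rng = np.random.default_rng(0)
    feats = rng.normal(size=(1000, 300))            # stand-in for CNN features
    labels = (feats[:, :10].sum(1) > 0).astype(int) # synthetic binary labels

    X_tr, X_te, y_tr, y_te = train_test_split(feats, labels, random_state=0)
    clf = make_pipeline(PCA(n_components=50), GaussianNB())
    clf.fit(X_tr, y_tr)
    print("accuracy:", clf.score(X_te, y_te))
    ```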

  • An Optimized Automatic Summarization Algorithm Based on TextRank

    Subjects: Computer Science >> Integration Theory of Computer Science · Submitted: 2018-04-19 · Cooperative journal: 《计算机应用研究》

    Abstract: When summarizing Chinese texts, the traditional TextRank algorithm considers only the similarity between nodes and neglects other important information in the text. Targeting Chinese single documents and building on existing research, this paper extends the TextRank algorithm: in addition to the similarity between sentences, it incorporates the overall structural information of the text and the contextual information of each sentence, such as the physical position of a sentence in the document or paragraph, feature sentences, core sentences, and other cues that may increase a sentence's weight, to generate the candidate summary sentence group of the text. Redundancy processing then removes highly similar sentences from the candidate group. Experimental verification shows that the algorithm improves the accuracy of the generated summaries, indicating its effectiveness.
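
    A condensed sketch of position-weighted TextRank under stated assumptions: TF-IDF cosine similarity between sentences, a PageRank-style iteration, and a simple positional boost on the first and last sentences standing in for the structural features the abstract describes.

    ```python
    # Sketch only: sentence graph from TF-IDF cosine similarity, scores from a
    # personalized-PageRank iteration with an assumed positional prior.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def rank_sentences(sentences, d=0.85, iters=50):
        sim = cosine_similarity(TfidfVectorizer().fit_transform(sentences))
        np.fill_diagonal(sim, 0)
        sim /= sim.sum(1, keepdims=True) + 1e-9    # row-normalize the graph
        n = len(sentences)
        pos = np.ones(n)
        pos[0] = pos[-1] = 1.5                     # assumed positional boost
        score = np.ones(n) / n
        for _ in range(iters):
            score = (1 - d) * pos / pos.sum() + d * sim.T @ score
        return np.argsort(-score)                  # best sentences first

    docs = ["the cat sat on the mat", "dogs chase cats",
            "the mat was red", "cats like mats"]
    print(rank_sentences(docs))
    ```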

  • A New Word Discovery Algorithm Based on Mutual Information and Branch Entropy

    Subjects: Computer Science >> Integration Theory of Computer Science · Submitted: 2018-04-19 · Cooperative journal: 《计算机应用研究》

    Abstract: Identifying new words quickly and efficiently is a very important task in natural language processing. Aiming at the problems in new word discovery, this paper presents an algorithm that discovers new words character by character, from left to right, in an unsegmented Weibo corpus. Candidate new words are obtained by computing the mutual information between a candidate word and its right-adjacent word and expanding the candidate step by step; the candidates are then filtered to obtain the final new word set by computing branch entropy, deleting candidates whose first or last character is a stop word, and deleting existing words contained in the candidate set. The algorithm solves the problem that some new words cannot be recognized because of word segmentation errors, and also avoids the large number of repeated strings and garbage strings that n-gram methods identify as new words. Experiments verify the effectiveness of the algorithm.
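
    A toy sketch of the two statistics on an unsegmented corpus: pointwise mutual information over a candidate's internal splits measures how strongly a string coheres as a word, and branch entropy of the characters adjacent to it filters the candidates. The thresholds and the full character-by-character expansion loop from the paper are omitted.

    ```python
    # Sketch only: PMI and branch entropy computed on a tiny toy corpus.
    import math
    from collections import Counter

    corpus = "新冠疫苗接种新冠疫苗接种点新冠疫苗预约"

    def ngram_counts(text, n):
        return Counter(text[i:i + n] for i in range(len(text) - n + 1))

    def pmi(word, text):
        """Worst-case PMI over all two-part splits of `word`."""
        total = len(text)
        p_w = ngram_counts(text, len(word))[word] / total
        worst = max(ngram_counts(text, k)[word[:k]] / total *
                    ngram_counts(text, len(word) - k)[word[k:]] / total
                    for k in range(1, len(word)))
        return math.log(p_w / worst)

    def branch_entropy(word, text, right=True):
        """Entropy of the characters adjacent to `word` in the corpus."""
        neigh = Counter()
        start = text.find(word)
        while start != -1:
            i = start + len(word) if right else start - 1
            if 0 <= i < len(text):
                neigh[text[i]] += 1
            start = text.find(word, start + 1)
        total = sum(neigh.values())
        return -sum(c / total * math.log(c / total) for c in neigh.values())

    word = "新冠疫苗"
    print(pmi(word, corpus), branch_entropy(word, corpus))
    ```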