ChinaXiv.org: Chinese Academy of Sciences Preprint Platform for Scientific Papers

Total results: 5

Your conditions: Department of Information Security Engineering, Xinjiang Police College

1. ChinaXiv:201904.00018

Sensitive image detection method using sparse semantics combined with a double-layer deep convolutional neural network

Subjects: Computer Science >> Integration Theory of Computer Science; Submitted: 2019-04-01; Cooperative journal: Application Research of Computers (《计算机应用研究》)

如先姑力·阿布都热西提; 亚森·艾则孜; 孙国梓

Abstract: With the rapid development of Internet technology, the circulation of sensitive-content images has shifted from small-scale, concealed exchange to mass data sharing, and traditional sensitive-content detection based on image feature extraction is no longer adequate. To address this, this paper proposes a sensitive-content detection method based on sparse semantics and a double-layer deep convolutional neural network. In this method, the upper network preprocesses the training samples and constructs a sparse semantic representation of each image as the input to the neural network, while the lower network further takes into account a third-party control mechanism (such as government agents) and detects sensitive-content images for specific groups. Compared with existing sensitive-content image detection methods, the proposed method effectively reduces the number of training samples required, and its detection accuracy is more than 7% higher than that of traditional image detection methods (such as the bag-of-visual-words method).

Hits 2209 Downloads 1152 Comment 0
2. ChinaXiv:201810.00040

Objectionable text filtering method for Uyghur web pages combining an n-gram model with a class-imbalanced SVM

Subjects: Computer Science >> Integration Theory of Computer Science; Submitted: 2018-10-11; Cooperative journal: Application Research of Computers (《计算机应用研究》)

如先姑力·阿布都热西提; 亚森·艾则孜; 郭文强

Abstract: With the construction and development of network infrastructure in Xinjiang, a large number of Uyghur web pages have been produced. To help build a healthy network environment, this paper proposes a Uyghur text filtering method that combines an n-gram statistical model with a class-imbalanced support vector machine (SVM) classifier. First, the web page text is preprocessed and stems are initially extracted with the n-gram statistical model. Then, the stems are analyzed semantically, and stems with similar meanings are aggregated into one class, which reduces the stem dimension. Finally, a parameter that controls the distance between hyperplanes is introduced into the traditional SVM, yielding a class-imbalanced SVM that classifies Uyghur texts that are not linearly separable and whose classes are imbalanced. The experimental results show that the method classifies objectionable texts accurately and has a shorter classification time.
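The abstract does not specify how the hyperplane-distance parameter is defined. As a rough illustration of the general idea of class-imbalanced training only, the sketch below uses a different and widely used stand-in: a class-weighted hinge loss for a linear classifier, where errors on the rare class cost more. The function names, the inverse-frequency weighting rule, and the toy data are all assumptions for illustration, not the paper's formulation.

```python
def balanced_weights(samples):
    """Weight each class inversely to its frequency: n / (n_classes * n_in_class).

    samples: list of (features, label) pairs with label in {-1, +1}.
    """
    n = len(samples)
    counts = {}
    for _, label in samples:
        counts[label] = counts.get(label, 0) + 1
    return {label: n / (len(counts) * c) for label, c in counts.items()}


def weighted_hinge_loss(w, b, samples, class_weight):
    """Mean hinge loss of the linear model (w, b), scaled per class.

    A misclassified or low-margin sample from a heavily weighted (rare)
    class contributes more to the loss than one from the common class.
    """
    total = 0.0
    for x, label in samples:
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        total += class_weight[label] * max(0.0, 1.0 - label * score)
    return total / len(samples)


# Hypothetical toy data: one positive (rare) sample, three negative samples.
toy = [([1.0], 1), ([-1.0], -1), ([-2.0], -1), ([-3.0], -1)]
weights = balanced_weights(toy)
loss_at_zero = weighted_hinge_loss([0.0], 0.0, toy, weights)
```

With these weights the single rare-class sample carries as much total weight as the three common-class samples together, which is the effect a class-imbalanced classifier is after.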

Hits 1810 Downloads 1044 Comment 0
3. ChinaXiv:201805.00467

Uyghur text similarity detection method using N-gram and semantic analysis

Subjects: Computer Science >> Integration Theory of Computer Science; Submitted: 2018-05-24; Cooperative journal: Application Research of Computers (《计算机应用研究》)

张莹; 亚森·艾则孜; 吴顺祥

Abstract: At present, most research on the similarity of natural language texts targets major languages such as English. To detect similarities between Uyghur texts, this paper proposes a similarity detection method based on N-gram and semantic analysis. First, an N-gram statistical model is used to obtain words based on Uyghur word features, and a word-text relation matrix is constructed from the frequency with which the words appear in each text. Then, latent semantic analysis (LSA) is applied to obtain the hidden associations between words and texts, which addresses semantic ambiguity in Uyghur and yields a more exact similarity. Experiments on plagiarized text sets containing reorganization and synonym replacement show that the method detects similarity accurately and effectively.
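The word-text relation matrix described in the abstract can be sketched as a plain term-document count matrix. The example below is a minimal sketch under stated assumptions: words are obtained by whitespace splitting rather than by the paper's N-gram model, the LSA step (a truncated singular value decomposition of the matrix) is omitted, and similarity is measured by cosine on the raw count columns. All function names and the placeholder documents are hypothetical.

```python
import math
from collections import Counter


def term_document_matrix(docs):
    """Build a count matrix: one row per term (sorted), one column per document."""
    counts = [Counter(doc.split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for c in counts] for term in vocab]
    return vocab, matrix


def column(matrix, j):
    """Return document j as a vector of term counts."""
    return [row[j] for row in matrix]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Placeholder tokens stand in for Uyghur words.
docs = ["a b c", "a b d", "x y z"]
vocab, matrix = term_document_matrix(docs)
sim_01 = cosine(column(matrix, 0), column(matrix, 1))  # two shared terms
sim_02 = cosine(column(matrix, 0), column(matrix, 2))  # no shared terms
```

Raw counts only reward exact word overlap; the point of adding LSA on top of such a matrix is to let texts that use different but related words still come out as similar.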

Hits 2193 Downloads 1337 Comment
4. ChinaXiv:201805.00368

Text filtering method based on term selection and the Rocchio classifier for Uyghur forums

Subjects: Computer Science >> Integration Theory of Computer Science; Submitted: 2018-05-18; Cooperative journal: Application Research of Computers (《计算机应用研究》)

如先姑力·阿布都热西提; 亚森·艾则孜; 艾山·吾买尔; 阿力木江·艾沙

Abstract: To address text filtering in Uyghur web forums, this paper proposes a text filtering method based on term selection and the Rocchio classifier. First, the forum text is preprocessed to remove useless words, and stems (terms) are extracted based on the N-gram statistical model. Then, a balanced mutual information term selection method (BMITS), which balances term correlation and redundancy, is used to reduce the dimension of the initial term set and obtain a reduced term set. Finally, the text feature terms are used as input, and the Rocchio classifier filters out objectionable text. The experimental results show that the proposed method accurately identifies objectionable text and is effective.
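For readers unfamiliar with the final step, a Rocchio classifier in its basic form is a nearest-centroid classifier: each class is represented by the mean of its training vectors, and a new text is assigned to the class whose centroid it is most similar to. The sketch below assumes sparse term-weight dictionaries as features and cosine similarity as the measure; the term names, labels, and weights are hypothetical, and the BMITS selection step is not reproduced.

```python
import math
from collections import defaultdict


def train_rocchio(samples):
    """Compute one centroid per class.

    samples: list of (features, label), where features maps term -> weight.
    Returns a dict mapping label -> centroid (term -> mean weight).
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for feats, label in samples:
        counts[label] += 1
        for term, weight in feats.items():
            sums[label][term] += weight
    return {
        label: {term: s / n for term, s in sums[label].items()}
        for label, n in counts.items()
    }


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def classify(centroids, feats):
    """Return the label whose centroid is most similar to feats."""
    return max(centroids, key=lambda label: cosine(centroids[label], feats))


# Hypothetical training data: t1..t4 stand in for selected Uyghur terms.
training = [
    ({"t1": 2.0, "t2": 1.0}, "bad"),
    ({"t1": 1.0, "t2": 1.0}, "bad"),
    ({"t3": 1.0, "t4": 2.0}, "ok"),
]
centroids = train_rocchio(training)
```

Because training only averages vectors and classification only compares against one centroid per class, the cost per text is small, which is one reason this classifier pairs naturally with an aggressive term-selection step.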

Hits 2038 Downloads 1176 Comment 0
5. ChinaXiv:201804.02180

Uyghur document similarity calculation and plagiarism detection method based on hierarchical matching

Subjects: Computer Science >> Integration Theory of Computer Science; Submitted: 2018-04-17; Cooperative journal: Application Research of Computers (《计算机应用研究》)

亚森·艾则孜; 艾山·吾买尔; 阿力木江·艾沙

Abstract: To address similarity calculation and plagiarism detection for documents written in Uyghur, a content-based Uyghur plagiarism detection (U-PD) method is proposed. First, in the preprocessing stage, the Uyghur texts are segmented, stop words are deleted, stems are extracted, and synonyms are replaced; stem extraction is based on N-gram statistical models. Then, the hash value of each text block is calculated with the BKDRhash algorithm, and the hash fingerprint information of the entire document is constructed. Finally, using the hash fingerprint information, the document is matched against the document library at the document level, the paragraph level, and the sentence level with the RKR-GST matching algorithm, and the similarity of the document is obtained, thereby realizing plagiarism detection. Experimental evaluation on Uyghur documents shows that the proposed method detects plagiarized documents accurately and is feasible and effective.
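The fingerprinting step can be illustrated with the commonly cited form of BKDRHash, which folds each byte into a running value with a multiplier such as 131 and keeps the low 31 bits. The sketch below makes several assumptions that the abstract does not confirm: text is hashed over its UTF-8 bytes, a text block is a sentence delimited by a period, and fingerprints are compared by Jaccard overlap. The Jaccard comparison is a deliberately simple stand-in and is not the RKR-GST algorithm used in the paper.

```python
def bkdr_hash(text, seed=131):
    """BKDR string hash over UTF-8 bytes, truncated to 31 bits."""
    h = 0
    for byte in text.encode("utf-8"):
        h = (h * seed + byte) & 0x7FFFFFFF
    return h


def fingerprint(document, sep="."):
    """Hash every non-empty block of the document; the set is its fingerprint."""
    blocks = (block.strip() for block in document.split(sep))
    return {bkdr_hash(block) for block in blocks if block}


def jaccard(a, b):
    """Share of fingerprint hashes that two documents have in common."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Placeholder documents: each shares one of its two blocks with the other.
doc_a = "x y. z w."
doc_b = "x y. q r."
similarity = jaccard(fingerprint(doc_a), fingerprint(doc_b))
```

Exact-match hashing only detects blocks that are identical after preprocessing, which is why the method normalizes stems and synonyms before hashing and then matches at several granularities.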

Hits 1936 Downloads 1112 Comment 0
