ChinaXiv.org: Chinese Academy of Sciences Scientific Paper Preprint Platform


1. ChinaXiv:201810.00023

A Big Data Text Classification Method Using a Distributed NBC on the Spark Framework

Subjects: Computer Science >> Integration Theory of Computer Science
Submitted: 2018-10-11
Cooperative journal: 《计算机应用研究》 (Application Research of Computers)

臧艳辉, 赵雪章, 席运江

Abstract: To address the challenges that existing big-data computing frameworks face in scalable machine learning, this paper proposes a distributed naive Bayes text classification method built on the MapReduce and Apache Spark frameworks. The method examines how well MapReduce and Apache Spark suit a Bayesian network classifier. First, the training data set is divided into m classes according to the naive Bayes text classification model. In the training phase, four chained MapReduce jobs derive the model, with the output of each job serving as the input of the next; this design makes full use of MapReduce's parallelism. Finally, at test time, the classifier returns the class label that attains the maximum value. In experiments on a newsgroups data set, the method achieves results above 99% on all five categories of news data, higher than the comparison algorithms, which supports the accuracy of the proposed method.
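
The train-then-take-the-maximum flow described in the abstract can be illustrated with a minimal single-process sketch. It is not the paper's four-job MapReduce pipeline: the function names `train` and `predict`, the multinomial model, and the add-one (Laplace) smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def train(docs):
    """docs: list of (label, tokens) pairs. Returns the counts the model needs."""
    class_docs = Counter()               # number of training documents per class
    word_counts = defaultdict(Counter)   # word frequencies per class
    vocab = set()
    for label, tokens in docs:
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab

def predict(model, tokens):
    """Return the class label with the maximum log-posterior score."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        # log prior of the class
        score = math.log(class_docs[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            # add-one smoothing (an assumption, not stated in the abstract)
            score += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

In a distributed setting, the counting in `train` is the part that would naturally map onto key/value aggregation, while `predict` only needs the aggregated counts.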

Hits: 2001, Downloads: 1149
2. ChinaXiv:201808.00117

An IoT Health-Monitoring Big Data Analysis System Based on MF-R and the AWS Key Management Mechanism

Subjects: Computer Science >> Integration Theory of Computer Science
Submitted: 2018-07-23
Cooperative journal: 《计算机应用研究》 (Application Research of Computers)

臧艳辉, 赵雪章, 席运江

Abstract: Wearable medical devices with sensors continuously generate enormous amounts of data. Because the data are complex, it is difficult to process and analyze them to find information useful for decision-making. To overcome this, the paper proposes a new IoT architecture for storing and processing scalable sensor data (big data) in health care applications. The architecture consists of two main sub-architectures: Meta Fog-Redirection (MF-R) and an AWS key management mechanism. MF-R uses big data technologies such as Apache Pig and Apache HBase to collect and store the data generated by different sensor devices, and applies a Kalman filter to remove noise. The AWS key management mechanism uses a key management scheme to protect data in the cloud and prevent unauthorized access. Once the data are stored in the cloud, the system uses stochastic gradient descent and logistic regression to build a predictive model of heart disease. Simulation experiments show that, compared with other algorithms, the proposed algorithm has a smaller error and offers advantages in throughput and accuracy.
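
The predictive-modelling step mentioned in the abstract (logistic regression fitted with stochastic gradient descent) can be sketched as follows. This is a generic illustration under stated assumptions, not the paper's system: the function names, learning rate, epoch count, and the feature layout are hypothetical, and no real health data is involved.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sgd_logistic(samples, lr=0.1, epochs=200, seed=0):
    """Fit logistic regression with per-sample SGD.

    samples: list of (features, label) pairs, label in {0, 1}.
    lr, epochs, and seed are illustrative defaults (assumptions).
    """
    rng = random.Random(seed)
    w = [0.0] * len(samples[0][0])
    b = 0.0
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)  # visit samples in a fresh random order each epoch
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of the log-loss with respect to the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict_proba(w, b, x):
    """Probability of the positive class for feature vector x."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
```

Each update uses a single sample, which is what makes the method attractive for large, continuously arriving sensor data.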

Hits: 1868, Downloads: 1051
3. ChinaXiv:201804.02364

A Classification Decision Method for Action Recognition Based on a Deep Neural Network and GA Merging Algorithm

Subjects: Computer Science >> Integration Theory of Computer Science
Submitted: 2018-04-24
Cooperative journal: 《计算机应用研究》 (Application Research of Computers)

赵雪章, 席运江, 黄雄波

Abstract: To address the shortcomings of traditional classification decision methods in human motion recognition, this paper proposes a novel nonlinear classification decision method that merges networks using a deep neural network (DNN) and a genetic algorithm (GA). First, the merging algorithm combines the feature extractors over the entire training set into two different, independent networks. DNN is then used to initialize the two networks, and GA is used to merge them. The biases and weights of the network are expressed as a matrix between each pair of adjacent layers. Finally, DNN training is applied to the biases and weights, and each row of the matrix is treated as a chromosome during the merge. The proposed algorithm is evaluated on the standard MNIST data set. The results show that the crossover and mutation operations increased the neuron nodes, improved recognition performance, and weakened the irrelevant and related neuron nodes. The proposed algorithm therefore has a lower error rate and better network performance.
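
The idea of treating each row of a layer's weight matrix as a chromosome can be sketched as below. The abstract does not specify the genetic operators, so single-point crossover and Gaussian mutation are assumptions here, and the names `crossover_rows`, `mutate`, and `merge_layers` are hypothetical.

```python
import random

def crossover_rows(row_a, row_b, rng):
    """Single-point crossover between two chromosomes (matrix rows of length >= 2)."""
    point = rng.randrange(1, len(row_a))
    return row_a[:point] + row_b[point:]

def mutate(row, rng, rate, scale):
    """Perturb each gene with probability `rate` by Gaussian noise of std `scale`."""
    return [w + rng.gauss(0.0, scale) if rng.random() < rate else w for w in row]

def merge_layers(weights_a, weights_b, mutation_rate=0.1, scale=0.05, seed=0):
    """Merge the same layer of two networks row by row.

    weights_a, weights_b: equally shaped matrices given as lists of rows.
    """
    rng = random.Random(seed)
    return [
        mutate(crossover_rows(ra, rb, rng), rng, mutation_rate, scale)
        for ra, rb in zip(weights_a, weights_b)
    ]
```

A full merge procedure would also need a fitness measure (for example, validation error) to select among candidate offspring; that part is omitted here.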

Hits: 2192, Downloads: 1117
