• Big Data Text Classification Method Using a Distributed NBC under the Spark Framework

    Subjects: Computer Science >> Integration Theory of Computer Science; Submitted: 2018-10-11; Cooperative journal: 《计算机应用研究》

    Abstract: To address the challenges that existing big-data computing frameworks face in scalable machine learning, this paper proposes a distributed naive Bayes text classification method based on the MapReduce and Apache Spark frameworks. The method explores the Bayesian network classifier by studying how well it adapts to MapReduce and Apache Spark, and in doing so examines existing computing frameworks for big data. First, the training sample data set is divided into m classes based on the naive Bayes text classification model. In the training phase, four MapReduce jobs are chained to derive the model, with the output of each job used as the input of the next. This design makes full use of the parallelism of MapReduce. Finally, when the classifier is tested, the class label with the maximum value is returned. Experiments on a newsgroup dataset yielded results above 99% on all five types of news data, higher than those of the comparison algorithms, which demonstrates the accuracy of the proposed method.
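
    As a rough illustration of the train-then-take-the-maximum idea in this abstract, here is a minimal single-machine sketch of multinomial naive Bayes. It is not the paper's four-job MapReduce pipeline: the function names `train` and `classify`, the Laplace (+1) smoothing, and the toy data are all assumptions made for the example, and the counting loop only mimics what map and reduce steps would compute in parallel.

```python
import math
from collections import Counter, defaultdict

def train(docs):
    """docs: list of (label, tokens). Returns the counts that act as the model."""
    class_docs = Counter()               # number of documents per class
    word_counts = defaultdict(Counter)   # token counts per class
    vocab = set()
    for label, tokens in docs:
        # In a MapReduce setting, a mapper would emit (label, token) pairs
        # and a reducer would sum them; here the sums are taken directly.
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab

def classify(model, tokens):
    """Return the class label with the largest log-posterior score."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        score = math.log(class_docs[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for t in tokens:
            # Laplace smoothing (an assumption of this sketch) avoids log(0).
            score += math.log((word_counts[label][t] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy corpus, for illustration only.
toy_docs = [
    ("sports", ["ball", "goal", "team"]),
    ("tech", ["cpu", "code", "chip"]),
]
toy_model = train(toy_docs)
predicted = classify(toy_model, ["goal", "team"])
```

    Working in log space keeps the product of many small probabilities from underflowing, which matters once documents contain more than a handful of tokens.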

  • IoT Health Monitoring Big Data Analysis System Based on MF-R and an AWS Key Management Mechanism

    Subjects: Computer Science >> Integration Theory of Computer Science; Submitted: 2018-07-23; Cooperative journal: 《计算机应用研究》

    Abstract: Wearable medical devices with sensors continuously generate enormous amounts of data. Because of the complexity of this data, it is difficult to process and analyze it to find valuable information that can support decision-making. To overcome this issue, this paper proposes a new architecture for implementing IoT to store and process scalable sensor data (big data) for health care applications. The proposed architecture consists of two main sub-architectures: Meta Fog-Redirection (MF-R) and an AWS key management mechanism. The MF-R architecture uses big data technologies such as Apache Pig and Apache HBase to collect and store the sensor data generated by different sensor devices, and uses a Kalman filter to remove noise. The AWS key management mechanism uses a key management scheme to protect data in the cloud and prevent unauthorized access. Once the data is stored in the cloud, the proposed system can use stochastic gradient descent and logistic regression to develop a predictive model of heart disease. Simulation experiments show that, compared with other algorithms, the proposed algorithm has a smaller error and certain advantages in throughput and accuracy.
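
    The noise-removal step mentioned in this abstract can be pictured with a minimal scalar Kalman filter. This is only a sketch under assumptions that the abstract does not state: a single sensor channel, a state that is roughly constant between readings, and hypothetical noise parameters `process_var` and `meas_var`. The function name `kalman_1d` is likewise made up for the example.

```python
def kalman_1d(measurements, process_var=1e-3, meas_var=1.0,
              init_estimate=0.0, init_var=1.0):
    """Smooth a sequence of noisy scalar readings with a 1-D Kalman filter.

    Assumes a constant-state model: the true value is expected to stay
    the same between readings, up to process noise of variance process_var.
    """
    estimate, variance = init_estimate, init_var
    smoothed = []
    for z in measurements:
        # Predict: the state is unchanged, but uncertainty grows.
        variance += process_var
        # Update: blend the prediction and the reading using the Kalman gain.
        gain = variance / (variance + meas_var)
        estimate += gain * (z - estimate)
        variance *= (1.0 - gain)
        smoothed.append(estimate)
    return smoothed

# Hypothetical readings: a constant signal observed 50 times.
readings = [5.0] * 50
estimates = kalman_1d(readings)
```

    A larger `meas_var` makes the filter trust each reading less and smooth more heavily; a larger `process_var` lets the estimate follow real changes in the signal more quickly.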

  • Classification Decision Method Based on a Deep Neural Network and GA Merging Algorithm for Action Recognition

    Subjects: Computer Science >> Integration Theory of Computer Science; Submitted: 2018-04-24; Cooperative journal: 《计算机应用研究》

    Abstract: To address the shortcomings of traditional classification decision methods in human motion recognition, this paper proposes a novel nonlinear classification decision method based on a merging algorithm that combines a deep neural network (DNN) with a genetic algorithm (GA). First, the proposed merging algorithm combines the feature extractors over the entire training set into two independent networks. The DNN is then used to initialize the two independent networks, and the GA is used to merge them. Next, the biases and weights of the network are expressed as a matrix between each pair of layers. Finally, the DNN is used to train the biases and weights of the network, and each row of the matrix is treated as a chromosome during the merge process. The experiments use the standard MNIST data set to evaluate the performance of the proposed algorithm. The evaluation results show that the crossover and mutation operations increased the number of neuron nodes, improved recognition performance, and weakened the irrelevant and related neuron nodes. The proposed algorithm therefore has a lower error rate and better network performance.
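
    To make the "each row is a chromosome" idea concrete, the following is a minimal sketch of merging two same-shaped weight matrices with single-point crossover and Gaussian mutation. The abstract does not specify the operators, so the single-point crossover, the mutation parameters `rate` and `scale`, and the function names are all assumptions, and the sketch does not reproduce the paper's training or its growth in neuron nodes.

```python
import random

def crossover_rows(row_a, row_b, rng):
    """Single-point crossover: head of row_a joined to the tail of row_b.

    Rows must have at least two entries so a cut point exists.
    """
    point = rng.randrange(1, len(row_a))
    return row_a[:point] + row_b[point:]

def mutate(row, rate, scale, rng):
    """Perturb each weight with probability `rate` by Gaussian noise."""
    return [w + rng.gauss(0.0, scale) if rng.random() < rate else w
            for w in row]

def merge_layers(weights_a, weights_b, rate=0.05, scale=0.1, seed=0):
    """Merge two weight matrices row by row, treating rows as chromosomes."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    return [mutate(crossover_rows(ra, rb, rng), rate, scale, rng)
            for ra, rb in zip(weights_a, weights_b)]

# Hypothetical 3x4 layers from two networks, for illustration only.
layer_a = [[1.0] * 4 for _ in range(3)]
layer_b = [[2.0] * 4 for _ in range(3)]
child = merge_layers(layer_a, layer_b, rate=0.0)
```

    With `rate=0.0` the child contains only values inherited from the two parents, which makes the crossover step easy to inspect before mutation is switched on.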