Your conditions: 毛伊敏
  • Parallel Deep Convolutional Neural Network Optimization Algorithm Based on Im2col

    Subjects: Computer Science >> Integration Theory of Computer Science submitted time 2022-06-06 Cooperative journals: 《计算机应用研究》

    Abstract: In big data environments, parallel deep convolutional neural network (DCNN) algorithms suffer from excessive data redundancy, slow convolution layer operations, and poor convergence of the loss function. This paper proposes a parallel DCNN optimization algorithm based on the Im2col method. First, the algorithm uses a parallel feature extraction strategy based on the Marr-Hildreth operator to extract target features from the data as input to the convolutional neural network, which effectively avoids excessive data redundancy. Second, it designs a parallel model training strategy based on the Im2col method: redundant convolution kernels are removed by means of a Mahalanobis distance center value, and convolution layer operations are accelerated by combining MapReduce with Im2col. Finally, it proposes an improved mini-batch gradient descent strategy, which eliminates the effect of abnormal data on the batch gradient and addresses the poor convergence of the loss function. Experimental results show that the IA-PDCNNOA algorithm performs well for DCNN computation in big data environments and is suitable for parallel DCNN model training on large datasets.
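
    The Im2col idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's IA-PDCNNOA implementation: it assumes a single-channel image, stride 1, no padding, and plain Python lists, and the function names `im2col` and `conv2d_im2col` are hypothetical. Each kernel-sized patch is unrolled into one row, so the convolution becomes a matrix-vector product whose rows are independent of one another.

```python
def im2col(image, kh, kw):
    """Unroll every kh x kw patch of a 2-D image (stride 1, no padding) into one row."""
    h, w = len(image), len(image[0])
    rows = []
    for i in range(h - kh + 1):
        for j in range(w - kw + 1):
            rows.append([image[i + di][j + dj] for di in range(kh) for dj in range(kw)])
    return rows


def conv2d_im2col(image, kernel):
    """Cross-correlation (the 'convolution' used in CNNs) as a patch-matrix times kernel-vector product."""
    kh, kw = len(kernel), len(kernel[0])
    flat_kernel = [v for row in kernel for v in row]
    patches = im2col(image, kh, kw)
    out_w = len(image[0]) - kw + 1
    # One dot product per patch row; rows do not depend on each other.
    flat_out = [sum(p * k for p, k in zip(patch, flat_kernel)) for patch in patches]
    return [flat_out[r:r + out_w] for r in range(0, len(flat_out), out_w)]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
result = conv2d_im2col(image, kernel)
```

    Because every row of the patch matrix is processed independently, the rows could in principle be partitioned across workers; how the paper maps this onto MapReduce is not described in the abstract.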

  • Research on a Parallel Frequent Itemset Mining Algorithm Based on MapReduce

    Subjects: Computer Science >> Integration Theory of Computer Science submitted time 2020-09-28 Cooperative journals: 《计算机应用研究》

    Abstract: To address the excessive time and space complexity and the unbalanced node load of the parallel frequent itemset mining algorithm MRPrePost, this paper proposes an optimized parallel frequent itemset mining algorithm based on MapReduce, named PFIMD. First, the algorithm adopts a data structure called DiffNodeset, which effectively avoids the defect in MRPrePost that the N-list cardinality grows very large. Second, to reduce time complexity, it designs T-wcs (2-way Comparison Strategy) to avoid invalid calculations when connecting two DiffNodesets. Finally, considering the impact of cluster load on the efficiency of a parallel algorithm, it proposes LBSBDG (Load Balancing Strategy Based on Dynamic Grouping), which evenly groups the items in the F-list, thereby decreasing the size of the PPC-Tree on each computing node and reducing the time required to traverse it. Experimental results show that the modified algorithm performs better when mining frequent itemsets in a big data environment.
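
    The abstract does not spell out how LBSBDG groups the F-list. As a rough illustration of what "evenly grouping" items across nodes can mean, the sketch below uses a generic greedy heuristic (largest load first, always into the currently lightest group). The function name `balanced_groups` and the per-item load estimates are assumptions made for this example, not part of PFIMD.

```python
import heapq


def balanced_groups(item_loads, n_groups):
    """Greedily assign items to groups so that total estimated load stays even.

    item_loads: dict mapping item -> estimated load (hypothetical estimate).
    Returns a list of n_groups lists of items.
    """
    # Min-heap of (current total load, group index).
    heap = [(0, g) for g in range(n_groups)]
    heapq.heapify(heap)
    groups = [[] for _ in range(n_groups)]
    # Heaviest items first; ties broken by item name for determinism.
    for item, load in sorted(item_loads.items(), key=lambda kv: (-kv[1], kv[0])):
        total, g = heapq.heappop(heap)  # lightest group so far
        groups[g].append(item)
        heapq.heappush(heap, (total + load, g))
    return groups


loads = {"a": 9, "b": 7, "c": 6, "d": 5, "e": 4, "f": 2}
groups = balanced_groups(loads, 2)
group_totals = [sum(loads[i] for i in g) for g in groups]
```

    With these made-up loads the two groups end up with totals of 16 and 17, which is the kind of balance a grouping strategy aims for before each node builds its own tree.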

  • Weighted Protein Complex Identification Algorithm Based on Fuzzy Ant Colony

    Subjects: Computer Science >> Integration Theory of Computer Science submitted time 2019-04-01 Cooperative journals: 《计算机应用研究》

    Abstract: Protein complex identification algorithms based on ant colony and fuzzy C-means (FCM) clustering have low accuracy and recall and run inefficiently. To address this, this paper proposes a novel protein complex identification algorithm named FAC-PC (algorithm for identifying weighted protein complexes based on fuzzy ant colony clustering). First, it constructs a weighted protein network by combining the Pearson correlation coefficient with the edge aggregation coefficient. Second, to overcome the large number of merge, filter, and repeated pick-up and drop operations in the ant colony clustering algorithm, it designs the EPS (essential protein selection) metric to select essential proteins and the PFC (protein fitness calculation) metric to traverse the neighbors of essential proteins and obtain essential group proteins; the essential group proteins then replace the seed nodes in ant colony clustering, which improves accuracy and time performance. Furthermore, it proposes the SI (similarity improvement) metric to optimize the probabilities of the ant colony's pick-up and drop operations and so obtain the number of clusters. Finally, it initializes the FCM algorithm with the essential proteins and the number of clusters obtained from the improved ant colony algorithm, designs a membership update strategy to optimize the membership update, and proposes a new FCM objective function that balances intra-cluster and inter-cluster variation; the improved FCM algorithm then identifies the protein complexes. FAC-PC is used to identify protein complexes on DIP data. Experimental results show that FAC-PC achieves better accuracy and recall and identifies protein complexes more reasonably.
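
    For readers unfamiliar with FCM, the sketch below shows the textbook membership update that the paper's strategy starts from. It is the standard rule, not the modified update or objective function proposed in FAC-PC. The one-dimensional points, the function name `fcm_memberships`, and the fuzzifier value m = 2 are assumptions chosen to keep the example small.

```python
def fcm_memberships(points, centers, m=2.0):
    """Textbook FCM update: u[i][j] is the membership of points[i] in cluster j.

    u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)), with d the point-to-center distance.
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # The point coincides with a center: full membership there
            # (shared equally if several centers coincide with it).
            zeros = [d == 0.0 for d in dists]
            row = [1.0 / sum(zeros) if z else 0.0 for z in zeros]
        else:
            row = [1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists]
        memberships.append(row)
    return memberships


u = fcm_memberships([1.0, 2.0, 4.0], centers=[1.0, 3.0])
```

    Each row of `u` sums to 1, and a point closer to a center receives a larger membership in that cluster; the initial centers and cluster count are exactly what the abstract says the improved ant colony stage supplies.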

  • Complex Mining in Dynamic Weighted PPI Networks Based on Ant Colony Clustering

    Subjects: Computer Science >> Integration Theory of Computer Science submitted time 2018-12-13 Cooperative journals: 《计算机应用研究》

    Abstract: Static PPI networks struggle to reflect the dynamic character of cells, and mining protein complexes with ant colony clustering suffers from slow convergence and low precision and recall. This paper therefore proposes an ant colony clustering algorithm based on fuzzy granularity and closeness degree for mining protein complexes in dynamic weighted PPI networks, named FGCDACC-DPC. First, based on the topological and biological characteristics of the PPI network, a comprehensive weight metric (CWM) is designed to describe the interaction between proteins accurately. Second, the method constructs a series of dense, highly co-expressed complex cores based on the basic characteristics of complexes, then applies picking and dropping operations based on fuzzy granularity and closeness degree to cluster the nodes of the PPI network, which reduces computational complexity and randomness and speeds up clustering. Finally, the algorithm designs a local and global weight update strategy founded on function transmission and timing functional relevance theory, which achieves function transmission between different generations of ant colonies and between networks at different times and thereby improves clustering accuracy. FGCDACC-DPC is used to mine protein complexes on DIP data. Experimental results demonstrate that the algorithm achieves better precision and recall and identifies protein complexes more reasonably.

  • Research and Application of the Uncertain NNSB-OPTICS Clustering Algorithm in Landslide Hazard Prediction

    Subjects: Computer Science >> Integration Theory of Computer Science submitted time 2018-05-20 Cooperative journals: 《计算机应用研究》

    Abstract: In landslide hazard prediction, rainfall and other uncertain factors are difficult to obtain and handle effectively, and the OPTICS-PLUS algorithm requires a manually set density threshold and has high time complexity. To improve prediction accuracy, this paper proposes an uncertain NNSB-OPTICS clustering algorithm and applies it to landslide prediction. First, the expansion strategy of the OPTICS-PLUS algorithm is optimized, which avoids manual setting of the density threshold and improves the algorithm's efficiency. Then, according to the distribution characteristics of rainfall data, the paper combines the EW distance formula with cloud model theory to derive the EC distance formula, which handles uncertain rainfall data effectively. Finally, the uncertain NNSB-OPTICS clustering algorithm is applied to predict landslide hazard in the Baota district of Yan’an city, where the prediction accuracy reaches 87.9%. Experimental results show that the method effectively improves the accuracy of landslide prediction and is highly feasible.

  • Application of the Uncertain PAHT Clustering Algorithm to Landslide Hazard Prediction

    Subjects: Computer Science >> Integration Theory of Computer Science submitted time 2018-04-19 Cooperative journals: 《计算机应用研究》

    Abstract: In clustering studies of landslide prediction, two difficulties lead to poor prediction results: traditional clustering algorithms need the number of clusters to be set in advance, and rainfall, an important landslide-inducing factor, is hard to measure accurately. This paper therefore proposes a new clustering algorithm, the uncertain PAHT algorithm. The algorithm introduces an uncertain data model called the M-D distance, which effectively measures uncertain rainfall, and, following the idea of hierarchical clustering, determines the value of k by finding the best threshold p*. In a comparative experiment using the Baota district of Yenan as an example, the results verify the effectiveness of the uncertain M-D distance and the PAHT algorithm, and the feasibility of the uncertain PAHT algorithm for landslide hazard prediction.
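
    How a distance threshold can determine k, rather than k being fixed in advance, can be sketched as follows. This is not the PAHT algorithm: the abstract does not say how p* is found or how the M-D distance is defined, so the example assumes a given threshold `p`, uses plain absolute difference on one-dimensional values as a stand-in distance, and merges points in single-linkage fashion. The function name `clusters_at_threshold` is hypothetical.

```python
def clusters_at_threshold(points, p):
    """Count the clusters obtained by linking every pair of points within distance p."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            # Stand-in distance; the paper would use its M-D distance here.
            if abs(points[i] - points[j]) <= p:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(points))})


values = [1.0, 1.5, 2.0, 8.0, 8.4, 20.0]
k_tight = clusters_at_threshold(values, 0.1)
k_mid = clusters_at_threshold(values, 1.0)
k_loose = clusters_at_threshold(values, 100.0)
```

    Raising the threshold merges more points and lowers k, so choosing the threshold implicitly chooses the number of clusters.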