Subjects: Library Science,Information Science >> Library Science submitted time 2023-07-09 Cooperative journals: 《中国科学院院刊》
Abstract: To promote economic development, social progress, and scientific and technological innovation, it is necessary to strengthen scientific cooperation and information sharing. Open data has emerged in response and become a seemingly inevitable development in the evolution of digital technology. Open data, however, must be supported by infrastructure composed of physical entities and virtual systems that meet the needs of data applications in many fields. Constructing and strengthening open data infrastructure should therefore be considered important objectives of information technology development. This study analyzes the elements of open data infrastructure and expounds its significant positive role in implementing open science. Based on an analysis of the current state and substantial development of China’s open data infrastructure, this study puts forward relevant measures and suggestions in view of the shortcomings and challenges China has faced with its open data infrastructure.
Subjects: Other Disciplines >> Synthetic discipline submitted time 2023-03-28 Cooperative journals: 《中国科学院院刊》
Abstract: Research infrastructure is the basic, strategic platform for scientific and technological innovation. Over the last decade, China’s research infrastructure has achieved leapfrog development in observation, manufacturing, management, data acquisition, and data sharing and utilization, supporting China’s scientific and technological innovation activities at a higher level. Looking ahead, the scientific research paradigm is transforming. Network, data, and computing platforms will not only support the development of major science and technology infrastructure and field stations toward larger scale, higher accuracy, and more advanced capability, but also contribute to the transformation of the research paradigm. They will become an “accelerator” and “multiplier” for major scientific and technological breakthroughs, and key support for China to join the front ranks of innovative countries and become a world power in science and technology.
Subjects: Other Disciplines >> Synthetic discipline submitted time 2023-03-19 Cooperative journals: 《中国科学院院刊》
Abstract: Because modern scientific discovery depends heavily on big data management, studying how to manage scientific big data efficiently is an urgent task. In this paper, we first introduce the application scenarios and requirements of scientific big data. We then summarize four challenges in the management of scientific big data (SPUS): Scale dynamic, Pipeline management, Unified access, and Sharing management. Next, we present the proposed scientific big data management system, which consists of four components: computing & storage management, data processing management, data fusion management, and data sharing management. We also describe the key techniques in the proposed system. Finally, we introduce the ongoing Big Scientific Data Management System (BigSDMS) program, a national key research and development program.
Subjects: Other Disciplines >> Synthetic discipline submitted time 2023-03-19 Cooperative journals: 《中国科学院院刊》
Abstract: As the soul of research activities, scientific data are not only an input that stimulates research innovation but also an indispensable output of research. By tracing policies and practices in scientific data management and sharing, we identify two main development trends: “positive and gentle research data policy” and “full and delicate scientific data management”. Comparing domestic and international policies and practices, we find that national scientific data policy is still at an early stage of development and that multidisciplinary practices in research stewardship are still needed. The open data trend could foster better scientific data management and help establish a culture of research data sharing. Greater involvement of different stakeholders is urgently needed. Further, the gentle and positive trend in sharing will continue, and public data rights will continue to be weighed sharply against private data rights. IT innovation and a redefinition of scientific data management could also benefit data sharing.
Subjects: Computer Science >> Integration Theory of Computer Science submitted time 2018-05-20 Cooperative journals: 《计算机应用研究》
Abstract: Accurately estimating the write latency of NWR databases under various consistency levels can guide the building and operation of database clusters by identifying the combination of cluster size and replication factor that minimizes building and operating cost. Existing approaches based on benchmarking or queue simulation give only incomplete results because they are limited to specific configurations and testbeds. This paper presents the first closed-form analysis of the (n, r, k) fork-join queueing process of write operations in Cassandra (a typical NWR database) and, based on it, proposes the first theoretical write latency model for NWR databases. The model can give more comprehensive latency results. Experiments validate the closed-form analysis of (n, r, k) fork-join queues on simulated queues and the write latency model on a Cassandra cluster.
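The abstract does not define the (n, r, k) parameters or give the closed-form result, so the following is only a simulation sketch of one plausible reading, not the paper's model. It assumes n nodes that each act as a FIFO single-server queue, Poisson write arrivals, exponential service times, r distinct replicas chosen at random per write, and a write that is acknowledged once k of its r sub-tasks finish. The function name and all parameters are hypothetical.

```python
import random


def simulate_write_latency(n, r, k, arrival_rate, service_rate,
                           num_writes=20000, seed=0):
    """Estimate mean write latency of an assumed (n, r, k) fork-join queue.

    Assumptions (not taken from the paper): Poisson arrivals, exponential
    service, FIFO single-server nodes, replicas chosen uniformly at random,
    and sub-tasks that run to completion even after the k-th acknowledgement.
    """
    if not (1 <= k <= r <= n):
        raise ValueError("require 1 <= k <= r <= n")
    rng = random.Random(seed)
    free_at = [0.0] * n  # time at which each node finishes its queued work
    now = 0.0
    total_latency = 0.0
    for _ in range(num_writes):
        now += rng.expovariate(arrival_rate)  # next write arrives
        finish_times = []
        for node in rng.sample(range(n), r):  # fork to r distinct nodes
            start = max(now, free_at[node])   # wait behind earlier sub-tasks
            free_at[node] = start + rng.expovariate(service_rate)
            finish_times.append(free_at[node])
        finish_times.sort()
        # Join: the write completes when the k-th replica has responded.
        total_latency += finish_times[k - 1] - now
    return total_latency / num_writes


if __name__ == "__main__":
    for k in (1, 2, 3):
        mean = simulate_write_latency(n=5, r=3, k=k,
                                      arrival_rate=1.0, service_rate=2.0)
        print(f"k={k}: mean write latency {mean:.3f}")
```

Because the seed fixes the arrival and service draws independently of k, runs that differ only in k see the same sample path, which makes the effect of a stricter acknowledgement threshold directly comparable. A simulation like this covers only the configurations it is run on, which is the limitation the abstract attributes to simulation-based approaches.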
Subjects: Library Science,Information Science >> Information Science submitted time 2017-10-11 Cooperative journals: 《数据分析与知识发现》
Abstract: [Objective] This study aims to help research teams effectively store, manage, and reuse scientific data with a specialized data repository management system, TeamDR. [Context] TeamDR is a Web tool that helps members of scientific research teams organize, store, manage, and share data. It was developed in Java and offers cloud-based and standalone services. [Methods] We first designed a dynamic metadata template to organize and manage scientific research data. MongoDB was then adopted to improve data storage capacity and query performance. [Results] TeamDR stores and manages scientific research data effectively with the support of dynamic metadata templates, categorized sharing control, and full-text search of metadata. User feedback shows that TeamDR meets the demands of scientific data storage and management. [Conclusions] TeamDR effectively addresses the issues of scientific data storage and management, data sharing and collaboration, and data discovery and linking. However, the system’s usability, completeness, and extensibility could be further improved.
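The abstract does not describe how TeamDR's dynamic metadata template is structured, so the sketch below only illustrates the general idea under stated assumptions: a template is a list of field definitions that a team can extend at run time, records are schema-free dictionaries of the kind a document store such as MongoDB holds, and full-text search is approximated by a substring match over metadata values. All class, function, and field names are hypothetical, and plain Python lists stand in for the database to keep the example self-contained.

```python
from dataclasses import dataclass, field


@dataclass
class FieldSpec:
    """One field of a metadata template (hypothetical structure)."""
    name: str
    type: type
    required: bool = False


@dataclass
class MetadataTemplate:
    """A template whose fields can be added at run time."""
    name: str
    fields: list = field(default_factory=list)

    def add_field(self, name, type_, required=False):
        self.fields.append(FieldSpec(name, type_, required))

    def validate(self, record):
        """Return a list of problems; an empty list means the record fits."""
        errors = []
        known = {spec.name for spec in self.fields}
        for spec in self.fields:
            if spec.name not in record:
                if spec.required:
                    errors.append(f"missing required field: {spec.name}")
            elif not isinstance(record[spec.name], spec.type):
                errors.append(f"wrong type for field: {spec.name}")
        for key in record:
            if key not in known:
                errors.append(f"unknown field: {key}")
        return errors


def search_metadata(documents, term):
    """Naive stand-in for full-text search: case-insensitive substring match."""
    term = term.lower()
    return [doc for doc in documents
            if any(term in str(value).lower() for value in doc.values())]


if __name__ == "__main__":
    template = MetadataTemplate("experiment")
    template.add_field("title", str, required=True)
    template.add_field("year", int)
    record = {"title": "Soil moisture survey", "year": 2017}
    print(template.validate(record))                     # no problems found
    print(search_metadata([record], "soil"))             # matches the record
```

Keeping the template as data rather than as a fixed class is what lets each team define its own fields without a schema migration, which is one reason a schema-free store is a natural fit for this kind of design.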