Your conditions: 安涛
  • Science Applications and Challenges of SKA Big Data

    Subjects: Other Disciplines >> Synthetic discipline submitted time 2023-03-19 Cooperative journals: 《中国科学院院刊》

    Abstract: The Square Kilometre Array (SKA) radio telescope, which is to be built soon, will be the largest astronomical observing facility. It is expected to make revolutionary breakthroughs at major frontiers of the natural sciences and to address fundamental questions of origins, such as the origin of the Universe, the origin of life, the origin of the cosmic magnetic field, and the nature of gravity, as well as to search for extraterrestrial civilizations. The unprecedented power of the SKA, characterized by extremely high sensitivity, a wide field of view, ultra-fast survey speed, and very high time, spatial, and frequency resolution, ensures its leading position in radio astronomy in the coming decades. It will also produce a vast amount of observational data at the exabyte (EB) level. The transport, storage, reading, writing, computing, management, and archiving of SKA-scale data, and the release of SKA science products, pose serious challenges to information and computer technologies. The China SKA science team will work together with the information, communication, and computer industries to tackle the challenges of SKA big data, which will not only promote major original scientific discoveries but also apply the derived technological achievements to stimulate the national economy.

  • Software Platform on China SKA Regional Center Prototype System

    Subjects: Astronomy >> Astrophysics submitted time 2023-01-06

    Abstract:

  • Optimization of parallel processing of Square Kilometre Array low frequency imaging pipeline

    Subjects: Computer Science >> Other Disciplines of Computer Science submitted time 2022-06-28

    Abstract:

    Data processing for the Square Kilometre Array (SKA) is carried out in pipeline mode, and pipeline execution efficiency is an important factor in SKA data processing. Continuum imaging is one of the main observation modes of the SKA and a prerequisite for many other scientific studies. In this paper, we take the imaging pipeline of the SKA low-frequency precursor, the Murchison Widefield Array (MWA), as an example and optimize its parallel processing on the China SKA Regional Centre prototype (CSRC-P). Previous optimization schemes focused on a few performance hotspots and lacked systematic optimization of the overall pipeline, resulting in a relatively poor overall speedup. We propose a global optimization scheme that combines C++ multi-threading, Python multi-processing, and Shell multi-tasking parallelism, aimed at pipelines that use multiple programming languages and process image datasets that are independent of one another, and we verify the accuracy of the optimized results. Experiments show that the optimized pipeline achieves overall speedups of 2.7 and 2.4 times on the x86 and ARM nodes of CSRC-P, respectively, and that the ARM compute nodes adapt well to SKA applications. The optimization strategies and methods in this paper are also applicable to other SKA applications and will be useful for the current and future scientific operation of the SKA precursor telescope.
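
    The reasoning about hotspot-only optimization can be illustrated with Amdahl's law. The sketch below is not taken from the paper: the function name `overall_speedup`, the stage fractions, and the per-stage speedups are hypothetical, chosen only to show why accelerating a single hotspot caps the overall speedup while moderate gains across every stage lift it.

```python
def overall_speedup(stages):
    """Overall pipeline speedup from per-stage speedups (Amdahl's law).

    stages: list of (fraction_of_original_runtime, stage_speedup) pairs.
    The fractions must sum to 1.
    """
    total_fraction = sum(fraction for fraction, _ in stages)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("stage fractions must sum to 1")
    # New runtime, as a fraction of the original runtime.
    new_time = sum(fraction / speedup for fraction, speedup in stages)
    return 1.0 / new_time


# Hypothetical three-stage pipeline (values are illustrative, not measured).
# Case 1: only the dominant hotspot is accelerated.
hotspot_only = [(0.5, 8.0), (0.3, 1.0), (0.2, 1.0)]
# Case 2: every stage is accelerated, the smaller ones only moderately.
whole_pipeline = [(0.5, 8.0), (0.3, 3.0), (0.2, 2.0)]

if __name__ == "__main__":
    print(f"hotspot only:   {overall_speedup(hotspot_only):.2f}x")
    print(f"whole pipeline: {overall_speedup(whole_pipeline):.2f}x")
```

    With these illustrative numbers, the hotspot-only case stays below 2x even though the hotspot itself runs 8x faster, which is the kind of limit a whole-pipeline scheme is meant to avoid.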

  • Parallel optimization of the pulsar search pipeline on China SKA Regional Centre Prototype

    Subjects: Computer Science >> Other Disciplines of Computer Science submitted time 2022-06-28

    Abstract:

    The connection between astronomy and high performance computing is becoming stronger with the development of cutting-edge observing facilities such as the Square Kilometre Array (SKA) and the proposed innovative platform for big data and high performance computing. Astronomical computation is characterized by huge data volumes and massive parallelism, especially for pulsar searches, which are one of the leading scientific directions of the SKA. In this paper, we present an approach to accelerating the pulsar search pipeline based on OpenMP and multiprocessing techniques, propose a method to solve the load imbalance problem, and successfully install the pipeline on both x86 and ARM compute nodes of the China SKA Regional Centre prototype (CSRC-P). Performance tests on Murchison Widefield Array (MWA) VCS observations show that our optimization method works well on both x86 and ARM nodes, achieving relative speedups of 10.4–12.2 and 24.5–27.6 times, respectively, compared to the original single-thread approach. The ARM platform was found to be 1.1–1.3 times faster than the x86 platform in the tested cases, showing its great potential for SKA data processing. The optimized pulsar search pipeline deployed on CSRC-P will be used in particular for the low-frequency pulsar survey of the Southern-sky MWA Rapid Two-metre (SMART) program, in support of various scientific goals including pulsar timing arrays for gravitational wave detection.
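
    The abstract does not describe how the load imbalance is resolved, so the following is only a generic illustration of the problem, not the authors' method. It compares a naive contiguous split of work units with greedy longest-processing-time-first (LPT) assignment; the function names and the cost values are hypothetical.

```python
import heapq


def static_split(costs, workers):
    """Per-worker load when work units are cut into contiguous equal-count chunks."""
    chunk = -(-len(costs) // workers)  # ceiling division
    loads = [sum(costs[i:i + chunk]) for i in range(0, len(costs), chunk)]
    # Pad with idle workers if there are fewer chunks than workers.
    loads += [0] * (workers - len(loads))
    return loads


def lpt_split(costs, workers):
    """Per-worker load under greedy LPT: the largest remaining unit goes to
    the currently least-loaded worker."""
    heap = [(0, w) for w in range(workers)]  # (current load, worker id)
    heapq.heapify(heap)
    for cost in sorted(costs, reverse=True):
        load, w = heapq.heappop(heap)
        heapq.heappush(heap, (load + cost, w))
    return sorted(load for load, _ in heap)


# Hypothetical, uneven costs of eight independent work units.
costs = [9, 8, 7, 6, 2, 2, 1, 1]

if __name__ == "__main__":
    # The slowest worker sets the wall-clock time of the parallel stage.
    print("static split, slowest worker:", max(static_split(costs, 4)))
    print("LPT split, slowest worker:   ", max(lpt_split(costs, 4)))
```

    The slowest worker determines the wall-clock time, so the more even the per-worker loads, the closer the parallel stage gets to its ideal speedup.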

  • Scientific data flow and array simulation analysis for the SKA1 era

    Subjects: Astronomy >> Astrophysical processes submitted time 2022-06-28

    Abstract:

    After years of planning for the next-generation radio telescope, the Square Kilometre Array (SKA), construction of SKA Phase One (SKA1) started in July 2021. Once SKA1 enters formal operation, it is expected to generate 750 petabytes of scientifically processed data every year. The data will be stored at SKA regional centers around the world for further analysis by researchers. In this paper, models of the SKA observation stations, central signal processor, scientific data processing, and regional centers are quantitatively analyzed. Based on the high-priority scientific observations of SKA1, the data flow at each stage and the computing power required for scientific data processing are evaluated. Taking the current SKA1-Low and SKA1-Mid arrays as examples, the key factors affecting the layout of interferometric arrays, including resolution, sensitivity, and UV coverage, are summarized. Finally, OSKAR is used to simulate interferometric array data. The simulation of SKA1-Mid is used to evaluate the scalability and stability of the system. The simulation of SKA1-Low on CSRC-P shows that the design of the prototype SKA regional center in China has been fully optimized, and it yields detailed computing power requirements and data volumes. The SKA's demands for data processing, computing, and storage also call for a combination of technologies and interdisciplinary efforts from areas such as electronics, communication, information technology, and computer science.

  • Progress and Prospect of transcontinental high-speed data transmission at SKA Regional Center in China

    Subjects: Astronomy >> Astrophysical processes submitted time 2022-06-28

    Abstract:

    The Square Kilometre Array (SKA) is the largest radio telescope. The data generated by its observations will first be transmitted from Australia and South Africa to the scientific data processing center about one hundred kilometers away, and then distributed over high-speed networks to the various SKA Regional Centres (SRCs) tens of thousands of kilometers away. In the SKA Phase One (SKA1) stage, which is about 10% of the full SKA in scale, it is estimated that about 750 PB of data will need to be distributed to each SRC every year through a network of at least 100 Gbps. Such high network bandwidth and data volume pose great challenges for data transmission and distribution. This paper analyzes different network protocols such as TCP, UDP, and HTTP, tests different software used in radio astronomy, and obtains the optimal transmission parameters under the current 10 Gbps network infrastructure. The factors affecting high-speed transmission are discussed, and corresponding performance optimization strategies are given. Ahead of the first real SKA1 observation data, this work provides a technical foundation for the network construction and layout of China's SKA regional center. The technical details and methods described can be referenced and used in relevant scientific applications. Finally, the challenges of future SKA network requirements are discussed, along with future prospects.
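
    As a rough sanity check on the bandwidth figures, the sketch below converts a yearly data volume into an average sustained rate and estimates transfer time over a given link. It assumes decimal petabytes (10^15 bytes), a 365.25-day year, a single continuously used link, and no protocol overhead unless an `efficiency` factor is supplied; the function names are illustrative and not from the paper.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
BITS_PER_PETABYTE = 1e15 * 8  # decimal petabyte (assumption)


def average_rate_gbps(petabytes_per_year):
    """Average sustained rate (Gbps) needed to move a yearly volume."""
    bits = petabytes_per_year * BITS_PER_PETABYTE
    return bits / SECONDS_PER_YEAR / 1e9


def transfer_days(petabytes, link_gbps, efficiency=1.0):
    """Days needed to move a volume over a link.

    efficiency is the fraction of nominal bandwidth actually achieved
    (a hypothetical knob; 1.0 means a perfectly utilized link).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = petabytes * BITS_PER_PETABYTE
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 86400


if __name__ == "__main__":
    print(f"750 PB/year average rate: {average_rate_gbps(750):.0f} Gbps")
    print(f"1 PB over 10 Gbps:  {transfer_days(1, 10):.1f} days")
    print(f"1 PB over 100 Gbps: {transfer_days(1, 100):.1f} days")
```

    Under these assumptions, 750 PB per year corresponds to an average of roughly 190 Gbps, and one petabyte takes a little over nine days on a fully utilized 10 Gbps link.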

  • A machine learning dataset for FRB detection in raw data

    Subjects: Astronomy >> Astronomical Instruments and Techniques submitted time 2022-06-28

    Abstract:

    We introduce a machine learning dataset for fast radio bursts (FRBs) that can be used to train ML algorithms to detect FRBs in raw data. It contains 8020 simulated FRB images, together with 4010 non-FRB and 4010 radio frequency interference (RFI) simulated images, built from public FRB observations, and it can be expanded to any size as needed. This work provides an open-source dataset for comparing state-of-the-art AI algorithms for FRB event recognition. The dataset provides files in both image and NumPy formats, for use with convolutional neural networks (CNNs) and classic machine learning algorithms. It supports either FRB/non-FRB classification or FRB/RFI/Blank classification. As an example, we used 31 pre-trained classic CNNs. In FRB/non-FRB classification, the accuracy reaches 90-92% in the first training epoch, with a maximum accuracy of 99.8% when tested on a real FRB dataset.