-
Intensity-sensitive quality assessment of extended sources in astronomical images
Authors:
X. Li,
K. Adamek,
W. Armour
Abstract:
Radio astronomy studies the Universe by observing the radio emissions of celestial bodies. Different methods can be used to recover the sky brightness distribution (SBD), which describes the distribution of celestial sources from recorded data, with the output dependent on the method used. Image quality assessment (IQA) indexes can be used to compare the differences between restored SBDs produced by different image reconstruction techniques to evaluate the effectiveness of different techniques. However, reconstructed images (for the same SBD) can appear to be very similar, especially when observed by the human visual system (HVS). Hence current structural similarity methods, inspired by the HVS, are not effective. In the past, we have proposed two methods to assess point source images, where low amounts of concentrated information are present in larger regions of noise-like data. But for images that include extended source(s), the increase in complexity of the structure makes the IQA methods for point sources over-sensitive since the important objects cannot be described by isolated point sources. Therefore, in this article we propose augmented Low-Information Similarity Index (augLISI), an improved version of LISI, to assess images including extended source(s). Experiments have been carried out to illustrate how this new IQA method can help with the development and study of astronomical imaging techniques. Note that although we focus on radio astronomical images herein, these IQA methods are also applicable to other astronomical images, and imaging techniques.
Submitted 10 July, 2024;
originally announced July 2024.
-
Pulscan: Binary pulsar detection using unmatched filters on NVIDIA GPUs
Authors:
Jack White,
Karel Adámek,
Jayanta Roy,
Scott Ransom,
Wesley Armour
Abstract:
The Fourier Domain Acceleration Search (FDAS) and Fourier Domain Jerk Search (FDJS) are proven matched filtering techniques for detecting binary pulsar signatures in time-domain radio astronomy datasets. Next generation radio telescopes such as the SPOTLIGHT project at the GMRT produce data at rates that mandate real-time processing, as storage of the entire captured dataset for subsequent offline processing is infeasible. The computational demands of FDAS and FDJS make them challenging to implement in real-time detection pipelines, requiring costly high performance computing facilities. To address this we propose Pulscan, an unmatched filtering approach which achieves order-of-magnitude improvements in runtime performance compared to FDAS whilst being able to detect both accelerated and some jerked binary pulsars. We profile the sensitivity of Pulscan using a distribution (N = 10,955) of synthetic binary pulsars and compare its performance with FDAS and FDJS. Our implementation of Pulscan includes an OpenMP version for multicore CPU acceleration, a version for heterogeneous CPU/GPU environments such as NVIDIA Grace Hopper, and a fully optimized NVIDIA GPU implementation for integration into an AstroAccelerate pipeline, which will be deployed in the SPOTLIGHT project at the GMRT. Our results demonstrate that unmatched filtering in Pulscan can serve as an efficient data reduction step, prioritizing datasets for further analysis and focusing human and subsequent computational resources on likely binary pulsar signatures.
Submitted 21 June, 2024;
originally announced June 2024.
-
CLEAN algorithm implementation comparisons between popular software packages
Authors:
Daniel Wright,
Karel Adámek,
Wesley Armour
Abstract:
The CLEAN algorithm, first published by Högbom, and its later variants such as Multiscale CLEAN (msCLEAN) by Cornwell, have been the most popular tools for deconvolution in radio astronomy. Interferometric imaging used in aperture synthesis radio telescopes requires deconvolution to remove the telescope's point spread function from the observed images. We have compared source fluxes produced by different implementations of Högbom and msCLEAN (WSCLEAN, CASA) with a prototype implementation of Högbom and msCLEAN for the Square Kilometre Array (SKA) on two datasets. The first is a simulation of multiple point sources of known intensity using Högbom, where none of the software packages detected all the simulated point sources to within 1.0% of the simulated values. The second is an observation of supernova remnant G055.7+3.4 taken by the Karl G. Jansky Very Large Array (VLA) using msCLEAN, where each of the software packages produced different images for the same settings.
Submitted 5 March, 2024;
originally announced March 2024.
-
Toward using GANs in astrophysical Monte-Carlo simulations
Authors:
Ahab Isaac,
Wesley Armour,
Karel Adámek
Abstract:
Accurate modelling of spectra produced by X-ray sources requires the use of Monte-Carlo simulations. These simulations need to evaluate physical processes, such as those occurring in accretion processes around compact objects, by sampling a number of different probability distributions. This is computationally time-consuming and could be sped up if replaced by neural networks. We demonstrate, using the example of the Maxwell-Jüttner distribution that describes the speed of relativistic electrons, that the generative adversarial network (GAN) is capable of statistically replicating the distribution. The average value of the Kolmogorov-Smirnov test is 0.5 for samples generated by the neural network, showing that the generated distribution cannot be distinguished from the true distribution.
Submitted 16 February, 2024;
originally announced February 2024.
-
Accelerating Dedispersion using Many-Core Architectures
Authors:
Jan Novotný,
Karel Adámek,
M. A. Clark,
Mike Giles,
Wesley Armour
Abstract:
Astrophysical radio signals are excellent probes of the extreme physical processes that emit them. However, to reach Earth, electromagnetic radiation passes through the ionised interstellar medium (ISM), introducing a frequency-dependent time delay (dispersion) to the emitted signal. Removing dispersion enables searches for transient signals like Fast Radio Bursts (FRB) or repeating signals from isolated pulsars or those in orbit around other compact objects. The sheer volume and high resolution of data that next generation radio telescopes will produce require High-Performance Computing (HPC) solutions and algorithms to be used in time-domain data processing pipelines to extract scientifically valuable results in real-time. This paper presents a state-of-the-art implementation of brute force incoherent dedispersion on NVIDIA GPUs, and on Intel and AMD CPUs. We show that our implementation is 4x faster (8-bit 8192 channels input) than other available solutions and demonstrate, using 11 existing telescopes, that our implementation is at least 20x faster than real-time. This work is part of the AstroAccelerate package.
Submitted 9 November, 2023;
originally announced November 2023.
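The brute force incoherent dedispersion named in the abstract above can be illustrated with a minimal pure-Python sketch. This is not the AstroAccelerate implementation: the function names `delay_samples` and `dedisperse`, the data layout, and the choice of the highest frequency as reference are assumptions made for illustration, and the dispersion constant is the commonly quoted value rather than one taken from the abstract.

```python
# Dispersion constant in s MHz^2 pc^-1 cm^3 (commonly quoted value; an assumption here).
K_DM = 4.148808e3


def delay_samples(dm, freq_mhz, ref_mhz, tsamp_s):
    """Dispersive delay of freq_mhz relative to ref_mhz, rounded to whole samples."""
    delay_s = K_DM * dm * (freq_mhz ** -2 - ref_mhz ** -2)
    return int(round(delay_s / tsamp_s))


def dedisperse(data, freqs_mhz, dm, tsamp_s):
    """Brute-force incoherent dedispersion for a single trial dispersion measure.

    data[c][t] is the detected power in frequency channel c at time sample t.
    Each channel is shifted by its delay relative to the highest frequency,
    then all channels are summed sample by sample.
    """
    ref_mhz = max(freqs_mhz)
    shifts = [delay_samples(dm, f, ref_mhz, tsamp_s) for f in freqs_mhz]
    n_out = len(data[0]) - max(shifts)  # only samples where every channel is available
    return [
        sum(chan[t + s] for chan, s in zip(data, shifts))
        for t in range(n_out)
    ]
```

A full search repeats `dedisperse` over many trial dispersion measures, which is what makes the brute force approach computationally demanding.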
-
Cutting the cost of pulsar astronomy: Saving time and energy when searching for binary pulsars using NVIDIA GPUs
Authors:
Jack White,
Karel Adamek,
Wes Armour
Abstract:
Using the Fourier Domain Acceleration Search (FDAS) method to search for binary pulsars is a computationally costly process. Next generation radio telescopes will have to perform FDAS in real time, as data volumes are too large to store. FDAS is a matched filtering approach for searching time-domain radio astronomy datasets for the signatures of binary pulsars with approximately linear acceleration. In this paper we will explore how we have reduced the energy cost of an SKA-like implementation of FDAS in AstroAccelerate, utilising a combination of mixed-precision computing and dynamic frequency scaling on NVIDIA GPUs. Combining the two approaches, we have managed to save 58% of the overall energy cost of FDAS with a (<3%) sacrifice in numerical sensitivity.
Submitted 24 November, 2022;
originally announced November 2022.
-
Bits missing: Finding exotic pulsars using bfloat16 on NVIDIA GPUs
Authors:
Jack White,
Karel Adamek,
Jayanta Roy,
Sofia Dimoudi,
Scott M. Ransom,
Wesley Armour
Abstract:
The Fourier Domain Acceleration Search (FDAS) is an effective technique for detecting faint binary pulsars in large radio astronomy datasets. This paper quantifies the sensitivity impact of reducing numerical precision in the GPU accelerated FDAS pipeline of the AstroAccelerate software package. The prior implementation used IEEE-754 single-precision in the entire binary pulsar detection pipeline, spending a large fraction of the runtime computing GPU accelerated FFTs. AstroAccelerate has been modified to use bfloat16 (and IEEE-754 double-precision to provide a "gold standard" comparison) within the Fourier domain convolution section of the FDAS routine. Approximately 20,000 synthetic pulsar filterbank files representing binary pulsars were generated using SIGPROC with a range of physical parameters. They have been processed using bfloat16, single and double-precision convolutions. All bfloat16 peaks are within 3% of the predicted signal-to-noise ratio of their corresponding single-precision peaks. Of 14,971 "bright" single-precision fundamental peaks above a power of 44.982 (our experimentally measured highest noise value), 14,602 (97.53%) have a peak in the same acceleration and frequency bin in the bfloat16 output plane, whilst in the remaining 369 the nearest peak is located in the adjacent acceleration bin. There is no bin drift measured between the single and double-precision results. The bfloat16 version of FDAS achieves a speedup of approximately 1.6x compared to single-precision. A comparison between AstroAccelerate and the PRESTO software package is presented using observations collected with the GMRT of PSR J1544+4937, a 2.16 ms black widow pulsar in a 2.8-hour compact orbit.
Submitted 24 June, 2022;
originally announced June 2022.
-
A Novel Greedy Approach To Harmonic Summing Using GPUs
Authors:
Karel Adamek,
Jayanta Roy,
Wesley Armour
Abstract:
Incoherent harmonic summing is a technique which is used to improve the sensitivity of Fourier domain search methods. A one dimensional harmonic sum is used in time-domain radio astronomy as part of the Fourier domain periodicity search, a type of search used to detect isolated single pulsars. The main problem faced when implementing the harmonic sum on many-core architectures, like GPUs, is the very unfavourable memory access pattern of the harmonic sum algorithm. The memory access pattern gets worse as the dimensionality of the harmonic sum increases. Here we present a set of algorithms for calculating the harmonic sum that are suited to many-core architectures such as GPUs. We present an evaluation of the sensitivity of these different approaches, and their performance. This work forms part of the AstroAccelerate project which is a GPU accelerated software package for processing time-domain radio astronomy data.
Submitted 25 February, 2022;
originally announced February 2022.
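The unfavourable memory access pattern mentioned in the abstract above is easiest to see in a naive one dimensional harmonic sum. The following is a minimal sketch of the textbook definition, not one of the algorithms presented in the paper; the function name `harmonic_sum` and the convention of summing bins `k, 2k, ..., Hk` without normalisation are assumptions for illustration.

```python
def harmonic_sum(power, max_harmonics):
    """Naive incoherent harmonic sum of a power spectrum.

    For every fundamental bin k, add the power found at k, 2k, ..., H*k,
    where H is max_harmonics. Harmonics that fall beyond the end of the
    spectrum are skipped.
    """
    n = len(power)
    sums = []
    for k in range(n):
        total = 0.0
        for h in range(1, max_harmonics + 1):
            idx = h * k  # stride grows with k, so reads are scattered across memory
            if idx >= n:
                break
            total += power[idx]
        sums.append(total)
    return sums
```

Neighbouring fundamentals `k` and `k + 1` read their `h`-th harmonics `h` bins apart, so adjacent threads on a GPU would not access adjacent memory locations. This is the difficulty the abstract describes.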
-
Implementation of 3D degridding algorithm on the NVIDIA GPUs using CUDA
Authors:
Karel Adámek,
Peter Wortmann,
Bojan Nikolic,
Ben Mort,
Wesley Armour
Abstract:
Practical aperture synthesis imaging algorithms work by iterating between estimating the sky brightness distribution and a comparison of a prediction based on this estimate with the measured data ("visibilities"). Accuracy in the latter step is crucial but is made difficult by irregular and non-planar sampling of data by the telescope. In this work we present a GPU implementation of 3D degridding which accurately deals with these two difficulties and is designed for distributed operation. We address the load balancing issues caused by large variation in the number of visibilities that need to be computed. Using CUDA and NVIDIA GPUs we measure performance of up to 1.2 billion visibilities per second.
Submitted 25 February, 2021;
originally announced February 2021.
-
Implementing CUDA Streams into AstroAccelerate -- A Case Study
Authors:
Jan Novotný,
Karel Adámek,
Wes Armour
Abstract:
To be able to run tasks asynchronously on NVIDIA GPUs, a programmer must explicitly implement asynchronous execution in their code using the syntax of CUDA streams. Streams allow a programmer to launch independent concurrent execution tasks, providing the ability to utilise different functional units on the GPU asynchronously. For example, it is possible to transfer the results from a previous computation performed on input data n-1 over the PCIe bus whilst computing the result for input data n, by placing different tasks in different CUDA streams. The benefit of such an approach is that the time taken for the data transfer between the host and device can be hidden with computation. This case study deals with the implementation of CUDA streams into AstroAccelerate. AstroAccelerate is a GPU accelerated real-time signal processing pipeline for time-domain radio astronomy.
Submitted 6 May, 2021; v1 submitted 4 January, 2021;
originally announced January 2021.
-
Development of production-ready GPU data processing pipeline software for AstroAccelerate
Authors:
Cees Carels,
Karel Adámek,
Jan Novotný,
Wesley Armour
Abstract:
Upcoming large scale telescope projects such as the Square Kilometre Array (SKA) will see high data rates and large data volumes, requiring tools that can analyse telescope event data quickly and accurately. In modern radio telescopes, analysis software forms a core part of the data read out, and long-term software stability and maintainability are essential. AstroAccelerate is a many core accelerated software package that uses NVIDIA(R) GPUs to perform realtime analysis of radio telescope data, and it has been shown to be substantially faster than realtime at processing simulated SKA-like data. AstroAccelerate contains optimised GPU implementations of signal processing tools used in radio astronomy including dedispersion, Fourier domain acceleration search, single pulse detection, and others. This article describes the transformation of AstroAccelerate from a C-like prototype code to a production-ready software library with a C++ API and a Python interface, while preserving compatibility with legacy software that is implemented in C. The design of the software library interfaces, refactoring aspects, and coding techniques are discussed.
Submitted 16 January, 2020; v1 submitted 16 December, 2019;
originally announced December 2019.
-
Searching for pulsars in extreme orbits -- GPU acceleration of the Fourier domain 'jerk' search
Authors:
Karel Adámek,
Jan Novotný,
Sofia Dimoudi,
Wesley Armour
Abstract:
Binary pulsars are an important target for radio surveys because they present a natural laboratory for a wide range of astrophysics, for example testing general relativity, including detection of gravitational waves. The orbital motion of a pulsar which is locked in a binary system causes a frequency shift (a Doppler shift) in its normally very periodic pulse emissions. These shifts cause a reduction in the sensitivity of traditional periodicity searches. To correct this smearing, Ransom [2001] and Ransom et al. [2002] developed the Fourier domain acceleration search (FDAS), which uses a matched filtering technique. This method is, however, limited to a constant pulsar acceleration. Therefore, Andersen and Ransom [2018] broadened the Fourier domain acceleration search to account also for a linear change in the acceleration by implementing the Fourier domain "jerk" search into the PRESTO software package. This extension significantly increases the number of matched filters used. We have implemented the Fourier domain "jerk" search (JERK) on GPUs using CUDA. We have achieved a 90x performance increase when compared to the parallel implementation of JERK in PRESTO. This work is part of the AstroAccelerate project (Armour et al. [2019]), a many-core accelerated time-domain signal processing library for radio astronomy.
Submitted 4 November, 2019;
originally announced November 2019.
-
Single Pulse Detection Algorithms for Real-time Fast Radio Burst Searches using GPUs
Authors:
Karel Adamek,
Wesley Armour
Abstract:
The detection of non-repeating or irregular events in time-domain radio astronomy has gained importance over the last decade due to the discovery of fast radio bursts. Existing or upcoming radio telescopes are gathering more and more data and consequently the software, which is an important part of these telescopes, must process large data volumes at high data rates. Data has to be searched through to detect new and interesting events, often in real-time. These requirements necessitate new and fast algorithms which must process data quickly and accurately. In this work we present new algorithms for single pulse detection using boxcar filters. We have quantified the signal loss introduced by single pulse detection algorithms which use boxcar filters and based on these results, we have designed two distinct "lossy" algorithms. Our lossy algorithms use an incomplete set of boxcar filters to accelerate detection at the expense of a small reduction in detected signal power. We present formulae for signal loss, descriptions of our algorithms and their parallel implementation on NVIDIA GPUs using CUDA. We also present tests of correctness, tests on artificial data and the performance achieved. Our implementation can process SKA-MID-like data 266x faster than real-time on an NVIDIA P100 GPU and 500x faster than real-time on an NVIDIA Titan V GPU with a mean signal power loss of 7%. We conclude with prospects for single pulse detection for beyond SKA era, nanosecond time resolution radio astronomy.
Submitted 27 April, 2020; v1 submitted 18 October, 2019;
originally announced October 2019.
-
A GPU implementation of the harmonic sum algorithm
Authors:
Karel Adámek,
Wesley Armour
Abstract:
Time-domain radio astronomy utilizes a harmonic sum algorithm as part of the Fourier domain periodicity search; this type of search is used to discover single pulsars. The harmonic sum algorithm is also used as part of the Fourier domain acceleration search, which aims to discover pulsars that are locked in orbit around another pulsar or compact object. However, porting the harmonic sum to many-core architectures like GPUs is not a straightforward task. The main problem that must be overcome is the very unfavourable memory access pattern, which gets worse as the dimensionality of the harmonic sum increases. We present a set of algorithms for calculating the harmonic sum that are more suited to many-core architectures such as GPUs. We present an evaluation of the sensitivity of these different approaches, and their performance. This work forms part of the AstroAccelerate project which is a GPU accelerated software package for processing time-domain radio astronomy data.
Submitted 6 December, 2018;
originally announced December 2018.
-
A GPU implementation of the Correlation Technique for Real-time Fourier Domain Pulsar Acceleration Searches
Authors:
Sofia Dimoudi,
Karel Adamek,
Prabu Thiagaraj,
Scott M. Ransom,
Aris Karastergiou,
Wesley Armour
Abstract:
The study of binary pulsars enables tests of general relativity. Orbital motion in binary systems causes the apparent pulsar spin frequency to drift, reducing the sensitivity of periodicity searches. Acceleration searches are methods that account for the effect of orbital acceleration. Existing methods are currently computationally expensive, and the vast amount of data that will be produced by next generation instruments such as the Square Kilometre Array (SKA) necessitates real-time acceleration searches, which in turn requires the use of High Performance Computing (HPC) platforms. We present our implementation of the Correlation Technique for the Fourier Domain Acceleration Search (FDAS) algorithm on Graphics Processor Units (GPUs). The correlation technique is applied as a convolution with multiple Finite Impulse Response filters in the Fourier domain. Two approaches are compared: the first uses the NVIDIA cuFFT library for applying Fast Fourier Transforms (FFTs) on the GPU, and the second contains a custom FFT implementation in GPU shared memory. We find that the FFT shared memory implementation performs between 1.5 and 3.2 times faster than our cuFFT-based application for smaller but sufficient filter sizes. It is also 4 to 6 times faster than the existing GPU and OpenMP implementations of FDAS. This work is part of the AstroAccelerate project, a many-core accelerated time-domain signal processing library for radio astronomy.
Submitted 15 April, 2018;
originally announced April 2018.
-
Improved Acceleration of the GPU Fourier Domain Acceleration Search Algorithm
Authors:
Karel Adámek,
Sofia Dimoudi,
Mike Giles,
Wesley Armour
Abstract:
We present an improvement of our implementation of the Correlation Technique for the Fourier Domain Acceleration Search (FDAS) algorithm on Graphics Processor Units (GPUs) (Dimoudi & Armour 2015; Dimoudi et al. 2017). Our new improved convolution code, which uses our custom GPU FFT code, is between 2.5 and 3.9 times faster than our cuFFT-based implementation (on an NVIDIA P100) and allows for a wider range of filter sizes than our previous version. By using this new version of our convolution code in FDAS we have achieved a 44% performance increase over our previous best implementation. It is also approximately 8 times faster than the existing PRESTO GPU implementation of FDAS (Luo 2013). This work is part of the AstroAccelerate project (Armour et al. 2002), a many-core accelerated time-domain signal processing library for radio astronomy.
Submitted 29 November, 2017;
originally announced November 2017.
-
A Real-time Single Pulse Detection Algorithm for GPUs
Authors:
Karel Adámek,
Wesley Armour
Abstract:
The detection of non-repeating events in the radio spectrum has become an important area of study in radio astronomy over the last decade due to the discovery of fast radio bursts (FRBs). We have implemented a single pulse detection algorithm for NVIDIA GPUs which uses boxcar filters of varying widths. Our code performs the calculation of standard deviation, matched filtering by using boxcar filters, and thresholding based on the signal-to-noise ratio. We present our parallel implementation of our single pulse detection algorithm. Our GPU algorithm is approximately 17x faster than our current CPU OpenMP code (NVIDIA Titan XP vs Intel E5-2650v3). This code is part of the AstroAccelerate project which is a many-core accelerated time-domain signal processing code for radio astronomy. This work allows our AstroAccelerate code to perform a single pulse search on SKA-like data 4.3x faster than real-time.
Submitted 29 November, 2016;
originally announced November 2016.
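The three steps listed in the abstract above (standard deviation, boxcar filtering, signal-to-noise thresholding) can be sketched serially in pure Python. This is an illustrative sketch only, not the authors' GPU algorithm: the function names `boxcar_snr` and `detect_pulses`, the default widths and threshold, and the assumption of uncorrelated noise (so that the noise in a boxcar of width w scales as sqrt(w)) are all hypothetical choices.

```python
import math


def boxcar_snr(series, width):
    """Signal-to-noise ratio of every boxcar sum of `width` consecutive samples.

    Assumes uncorrelated noise. The mean and standard deviation are estimated
    from the whole series, pulse included, which is a simplification.
    """
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    if std == 0.0:
        return [0.0] * (n - width + 1)
    window = sum(series[:width])
    snrs = []
    for start in range(n - width + 1):
        if start > 0:
            # Slide the window by one sample instead of re-summing it.
            window += series[start + width - 1] - series[start - 1]
        snrs.append((window - width * mean) / (std * math.sqrt(width)))
    return snrs


def detect_pulses(series, widths=(1, 2, 4, 8), threshold=6.0):
    """Return (start, width, snr) of the strongest candidate at or above threshold, or None."""
    best = None
    for width in widths:
        if width > len(series):
            continue
        for start, snr in enumerate(boxcar_snr(series, width)):
            if snr >= threshold and (best is None or snr > best[2]):
                best = (start, width, snr)
    return best
```

The boxcar whose width matches the pulse gives the highest signal-to-noise ratio, which is why several widths are tried.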
-
Constraining models of twin peak quasi-periodic oscillations with realistic neutron star equations of state
Authors:
Gabriel Török,
Kateřina Goluchová,
Martin Urbanec,
Eva Šrámková,
Karel Adámek,
Gabriela Urbancová,
Tomáš Pecháček,
Pavel Bakala,
Zdeněk Stuchlík,
Jiří Horák,
Jakub Juryšek
Abstract:
Twin-peak quasi-periodic oscillations (QPOs) are observed in the X-ray power-density spectra of several accreting low-mass neutron star (NS) binaries. In our previous work we have considered several QPO models. We have identified and explored mass-angular-momentum relations implied by individual QPO models for the atoll source 4U 1636-53. In this paper we extend our study and confront QPO models with various NS equations of state (EoS). We start with simplified calculations assuming Kerr background geometry and then present results of detailed calculations considering the influence of NS quadrupole moment (related to rotationally induced NS oblateness) assuming Hartle-Thorne spacetimes. We show that the application of a concrete EoS together with a particular QPO model yields a specific mass-angular-momentum relation. However, we demonstrate that the degeneracy in mass and angular momentum can be removed when the NS spin frequency inferred from the X-ray burst observations is considered. We inspect a large set of EoS and discuss their compatibility with the considered QPO models. We conclude that when the NS spin frequency in 4U 1636-53 is close to 580 Hz we can exclude 51 of the 90 considered combinations of EoS and QPO models. We also discuss additional restrictions that may exclude even more combinations. Namely, there are 13 EoS compatible with the observed twin peak QPOs and the relativistic precession model. However, when considering the low frequency QPOs and Lense-Thirring precession, only 5 EoS are compatible with the model.
Submitted 18 November, 2016;
originally announced November 2016.
-
A polyphase filter for many-core architectures
Authors:
Karel Adámek,
Jan Novotný,
Wes Armour
Abstract:
In this article we discuss our implementation of a polyphase filter for real-time data processing in radio astronomy. We describe in detail our implementation of the polyphase filter algorithm and its behaviour on three generations of NVIDIA GPU cards, on dual Intel Xeon CPUs and on the Intel Xeon Phi (Knights Corner) platform. All of our implementations aim to exploit the potential for data reuse that the algorithm offers. Our GPU implementations explore two different methods for achieving this: the first makes use of the L1/Texture cache, the second uses shared memory. We discuss the usability of each of our implementations along with their behaviours. We measure performance in execution time, which is a critical factor for real-time systems. We also present results in terms of bandwidth (GB/s), compute (GFlop/s) and type conversions (GTc/s). We include a presentation of our results in terms of the sample rate which can be processed in real-time by a chosen platform, which more intuitively describes the expected performance in a signal processing setting. Our findings show that, for the GPUs considered, the performance of our polyphase filter when using lower precision input data is limited by type conversions rather than device bandwidth. We compare these results to an implementation on the Xeon Phi. We show that our Xeon Phi implementation has a performance that is 1.47x to 1.95x greater than our CPU implementation, but that this is not sufficient to compete with the performance of GPUs. We conclude with a comparison of our best performing code to two other implementations of the polyphase filter, showing that our implementation is faster in nearly all cases. This work forms part of the Astro-Accelerate project, a many-core accelerated real-time data processing library for digital signal processing of time-domain radio astronomy data.
Submitted 21 April, 2016; v1 submitted 11 November, 2015;
originally announced November 2015.
-
Appearance of innermost stable circular orbits of accretion discs around rotating neutron stars
Authors:
G. Torok,
M. Urbanec,
K. Adamek,
G. Urbancova
Abstract:
The innermost stable circular orbit (ISCO) of an accretion disc orbiting a neutron star (NS) is often assumed to be a unique prediction of general relativity. However, it has been argued that an ISCO also appears around highly elliptic bodies described by Newtonian theory. In this sense, the behaviour of an ISCO around a rotating oblate neutron star is formed by the interplay between relativistic and Newtonian effects. Here we briefly explore the consequences of this interplay using a straightforward analytic approach as well as numerical models that involve modern NS equations of state. We examine the ratio K between the ISCO radius and the radius of the neutron star. We find that, with growing NS spin, the ratio K first decreases, but then starts to increase. This non-monotonic behaviour of K can give rise to a neutron star spin interval in which an ISCO appears for two very different ranges of NS mass. This may strongly affect the distribution of neutron stars that have an ISCO (ISCO-NS). When (all) neutron stars are distributed around a high mass M0, the ISCO-NS spin distribution is roughly the same as the spin distribution corresponding to all neutron stars. In contrast, if M0 is low, the ISCO-NS distribution can only have a peak around a high value of spin. Finally, an intermediate value of M0 can imply an ISCO-NS distribution divided into two distinct groups of slow and fast rotators. Our findings have immediate astrophysical applications. They can be used, for example, to distinguish between different models of high-frequency quasiperiodic oscillations observed in low-mass NS X-ray binaries.
Submitted 14 March, 2014;
originally announced March 2014.