
arXiv:2308.12026 [pdf, other]

Prospects of detecting gamma-ray signal of dark matter interaction with the MACE telescope

Authors: M. Khurana, A. Pathania, K. K. Singh, C. Borwankar, P. K. Netrakanti, K. K. Yadav

Abstract: The MACE (Major Atmospheric Cherenkov Experiment) telescope has started its regular gamma-ray observations at Hanle in India. Located at an altitude of $\sim$ 4.3 km above sea level and equipped with a 21 m diameter large quasi-parabolic reflector, it has the capability to explore the gamma-ray sky in the energy range above 20 GeV with very high sensitivity. In this work, we present the results from the feasibility studies for searching high-energy gamma-ray signals from dark matter interaction in potential astrophysical environments. We study the impact of MACE response function and other instrumental characteristics to probe the velocity average interaction cross-section ($<σv>$) of the weakly interacting massive particles (WIMPs), expected from the thermal dark matter freeze-out during the decoupling era. We consider the presence of dark matter in the form of pure WIMPs in the mass range 200 GeV - 10 TeV to produce distinctive gamma-ray spectra through its self-annihilation into standard model particles using the Pythia simulation package. The convolution of gamma-ray spectra corresponding to different standard model channels with the MACE response function is used to estimate the upper limit on $<σv>$ for 100 hours of expected MACE observation of Segue1 (a dwarf spheroidal galaxy) which is a potential site of dark matter.

Submitted 23 August, 2023; originally announced August 2023.

Comments: 8 Pages, 4 Figures, To appear in Proceedings of Science (ICRC 2023)

arXiv:2109.12405 [pdf, other]

CoMeT: An Integrated Interval Thermal Simulation Toolchain for 2D, 2.5D, and 3D Processor-Memory Systems

Authors: Lokesh Siddhu, Rajesh Kedia, Shailja Pandey, Martin Rapp, Anuj Pathania, Jörg Henkel, Preeti Ranjan Panda

Abstract: Processing cores and the accompanying main memory working in tandem enable the modern processors. Dissipating heat produced from computation, memory access remains a significant problem for processors. Therefore, processor thermal management continues to be an active research topic. Most thermal management research takes place using simulations, given the challenges of measuring temperature in real processors. Since core and memory are fabricated on separate packages in most existing processors, with the memory having lower power densities, thermal management research in processors has primarily focused on the cores. Memory bandwidth limitations associated with 2D processors lead to high-density 2.5D and 3D packaging technology. 2.5D packaging places cores and memory on the same package. 3D packaging technology takes it further by stacking layers of memory on the top of cores themselves. Such packagings significantly increase the power density, making processors prone to heating. Therefore, mitigating thermal issues in high-density processors (packaged with stacked memory) becomes an even more pressing problem. However, given the lack of thermal modeling for memories in existing interval thermal simulation toolchains, they are unsuitable for studying thermal management for high-density processors. To address this issue, we present CoMeT, the first integrated Core and Memory interval Thermal simulation toolchain. CoMeT comprehensively supports thermal simulation of high- and low-density processors corresponding to four different core-memory configurations - off-chip DDR memory, off-chip 3D memory, 2.5D, and 3D. CoMeT supports several novel features that facilitate overlying system research. Compared to an equivalent state-of-the-art core-only toolchain, CoMeT adds only a ~5% simulation-time overhead. The source code of CoMeT has been made open for public use under the MIT license.

Submitted 16 March, 2022; v1 submitted 25 September, 2021; originally announced September 2021.

Comments: https://github.com/marg-tools/CoMeT

arXiv:1908.11450 [pdf, other]

doi 10.1109/MDAT.2020.2968258

Neural Network Inference on Mobile SoCs

Authors: Siqi Wang, Anuj Pathania, Tulika Mitra

Abstract: The ever-increasing demand from mobile Machine Learning (ML) applications calls for evermore powerful on-chip computing resources. Mobile devices are empowered with heterogeneous multi-processor Systems-on-Chips (SoCs) to process ML workloads such as Convolutional Neural Network (CNN) inference. Mobile SoCs house several different types of ML capable components on-die, such as CPU, GPU, and accelerators. These different components are capable of independently performing inference but with very different power-performance characteristics. In this article, we provide a quantitative evaluation of the inference capabilities of the different components on mobile SoCs. We also present insights behind their respective power-performance behavior. Finally, we explore the performance limit of the mobile SoCs by synergistically engaging all the components concurrently. We observe that a mobile SoC provides up to 2x improvement with parallel inference when all its components are engaged, as opposed to engaging only one component.

Submitted 22 January, 2020; v1 submitted 24 August, 2019; originally announced August 2019.

Comments: Accepted to IEEE Design & Test

Journal ref: in IEEE Design & Test, vol. 37, no. 5, pp. 50-57, Oct. 2020

arXiv:1903.05898 [pdf, other]

doi 10.1109/TCAD.2019.2944584

High-Throughput CNN Inference on Embedded ARM big.LITTLE Multi-Core Processors

Authors: Siqi Wang, Gayathri Ananthanarayanan, Yifan Zeng, Neeraj Goel, Anuj Pathania, Tulika Mitra

Abstract: IoT Edge intelligence requires Convolutional Neural Network (CNN) inference to take place in the edge devices itself. ARM big.LITTLE architecture is at the heart of prevalent commercial edge devices. It comprises of single-ISA heterogeneous cores grouped into multiple homogeneous clusters that enable power and performance trade-offs. All cores are expected to be simultaneously employed in inference to attain maximal throughput. However, high communication overhead involved in parallelization of computations from convolution kernels across clusters is detrimental to throughput. We present an alternative framework called Pipe-it that employs pipelined design to split convolutional layers across clusters while limiting parallelization of their respective kernels to the assigned cluster. We develop a performance-prediction model that utilizes only the convolutional layer descriptors to predict the execution time of each layer individually on all permitted core configurations (type and count). Pipe-it then exploits the predictions to create a balanced pipeline using an efficient design space exploration algorithm. Pipe-it on average results in a 39% higher throughput than the highest antecedent throughput.

Submitted 22 January, 2020; v1 submitted 14 March, 2019; originally announced March 2019.

Comments: Accepted to IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Journal ref: in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 39, no. 10, pp. 2254-2267, Oct. 2020

Showing 1–4 of 4 results for author: Pathania, A