-
Neuro-Symbolic Fusion of Wi-Fi Sensing Data for Passive Radar with Inter-Modal Knowledge Transfer
Authors:
Marco Cominelli,
Francesco Gringoli,
Lance M. Kaplan,
Mani B. Srivastava,
Trevor Bihl,
Erik P. Blasch,
Nandini Iyer,
Federico Cerutti
Abstract:
Wi-Fi devices, akin to passive radars, can discern human activities within indoor settings due to the human body's interaction with electromagnetic signals. Current Wi-Fi sensing applications predominantly employ data-driven learning techniques to associate the fluctuations in the physical properties of the communication channel with the human activity causing them. However, these techniques often lack the desired flexibility and transparency. This paper introduces DeepProbHAR, a neuro-symbolic architecture for Wi-Fi sensing, providing initial evidence that Wi-Fi signals can differentiate between simple movements, such as leg or arm movements, which are integral to human activities like running or walking. The neuro-symbolic approach affords gathering such evidence without needing additional specialised data collection or labelling. The training of DeepProbHAR is facilitated by declarative domain knowledge obtained from a camera feed and by fusing signals from various antennas of the Wi-Fi receivers. DeepProbHAR achieves results comparable to the state-of-the-art in human activity recognition. Moreover, as a by-product of the learning process, DeepProbHAR generates specialised classifiers for simple movements that match the accuracy of models trained on finely labelled datasets, which would be particularly costly.
Submitted 1 July, 2024;
originally announced July 2024.
-
Systematic effects on a Compton polarimeter at the focus of an X-ray mirror
Authors:
M. Aoyagi,
R. G. Bose,
S. Chun,
E. Gau,
K. Hu,
K. Ishiwata,
N. K. Iyer,
F. Kislat,
M. Kiss,
K. Klepper,
H. Krawczynski,
L. Lisalda,
Y. Maeda,
F. af Malmborg,
H. Matsumoto,
A. Miyamoto,
T. Miyazawa,
M. Pearce,
B. F. Rauch,
N. Rodriguez Cavero,
S. Spooner,
H. Takahashi,
Y. Uchida,
A. T. West,
K. Wimalasena
, et al. (1 additional author not shown)
Abstract:
XL-Calibur is a balloon-borne Compton polarimeter for X-rays in the $\sim$15-80 keV range. Using an X-ray mirror with a 12 m focal length for collecting photons onto a beryllium scattering rod surrounded by CZT detectors, a minimum-detectable polarization as low as $\sim$3% is expected during a 24-hour on-target observation of a 1 Crab source at 45$^{\circ}$ elevation. Systematic effects alter the reconstructed polarization as the mirror focal spot moves across the beryllium scatterer, due to pointing offsets, mechanical misalignment or deformation of the carbon-fiber truss supporting the mirror and the polarimeter. Unaddressed, this can give rise to a spurious polarization signal for an unpolarized flux, or a change in reconstructed polarization fraction and angle for a polarized flux. Using bench-marked Monte-Carlo simulations and an accurate mirror point-spread function characterized at synchrotron beam-lines, systematic effects are quantified, and mitigation strategies discussed. By recalculating the scattering site for a shifted beam, systematic errors can be reduced from several tens of percent to the few-percent level for any shift within the scattering element. The treatment of these systematic effects will be important for any polarimetric instrument where a focused X-ray beam is impinging on a scattering element surrounded by counting detectors.
Submitted 23 February, 2024;
originally announced February 2024.
-
Emergence and dynamics of delusions and hallucinations across stages in early psychosis
Authors:
Catalina Mourgues-Codern,
David Benrimoh,
Jay Gandhi,
Emily A. Farina,
Raina Vin,
Tihare Zamorano,
Deven Parekh,
Ashok Malla,
Ridha Joober,
Martin Lepage,
Srividya N. Iyer,
Jean Addington,
Carrie E. Bearden,
Kristin S. Cadenhead,
Barbara Cornblatt,
Matcheri Keshavan,
William S. Stone,
Daniel H. Mathalon,
Diana O. Perkins,
Elaine F. Walker,
Tyrone D. Cannon,
Scott W. Woods,
Jai L. Shah,
Albert R. Powers
Abstract:
Hallucinations and delusions are often grouped together within the positive symptoms of psychosis. However, recent evidence suggests they may be driven by distinct computational and neural mechanisms. Examining the time course of their emergence may provide insights into the relationship between these underlying mechanisms. Participants from the second (N = 719) and third (N = 699) iterations of the North American Prodrome Longitudinal Study (NAPLS 2 and 3) were assessed for timing of CHR-P-level delusion and hallucination onset. Pre-onset symptom patterns in first-episode psychosis patients (FEP) from the Prevention and Early Intervention Program for Psychosis (PEPP-Montreal; N = 694) were also assessed. Symptom onset was determined at baseline assessment and the evolution of symptom patterns examined over 24 months. In all three samples, participants were more likely to report the onset of delusion-spectrum symptoms prior to hallucination-spectrum symptoms (odds ratios (OR): NAPLS 2 = 4.09; NAPLS 3 = 4.14; PEPP, Z = 7.01, P < 0.001) and to present with only delusions compared to only hallucinations (OR: NAPLS 2 = 5.6; NAPLS 3 = 11.11; PEPP = 42.75). Re-emergence of delusions after remission was also more common than re-emergence of hallucinations (Ps < 0.05), and hallucinations more often resolved first (Ps < 0.001). In both CHR-P samples, ratings of delusional ideation fell with the onset of hallucinations (P = 0.007). Delusions tend to emerge before hallucinations and may play a role in their development. Further work should examine the relationship between the mechanisms driving these symptoms and its utility for diagnosis and treatment.
Submitted 20 February, 2024;
originally announced February 2024.
-
An Empirical Study on Usage and Perceptions of LLMs in a Software Engineering Project
Authors:
Sanka Rasnayaka,
Guanlin Wang,
Ridwan Shariffdeen,
Ganesh Neelakanta Iyer
Abstract:
Large Language Models (LLMs) represent a leap in artificial intelligence, excelling in tasks using human language(s). Although the main focus of general-purpose LLMs is not code generation, they have shown promising results in the domain. However, the usefulness of LLMs in an academic software engineering project has not been fully explored yet. In this study, we explore the usefulness of LLMs for 214 students working in teams consisting of up to six members. Notably, in the academic course through which this study is conducted, students were encouraged to integrate LLMs into their development tool-chain, in contrast to most other academic courses that explicitly prohibit the use of LLMs.
In this paper, we analyze the AI-generated code, prompts used for code generation, and the human intervention levels to integrate the code into the code base. We also conduct a perception study to gain insights into the perceived usefulness, influencing factors, and future outlook of LLM from a computer science student's perspective. Our findings suggest that LLMs can play a crucial role in the early stages of software development, especially in generating foundational code structures, and helping with syntax and error debugging. These insights provide us with a framework on how to effectively utilize LLMs as a tool to enhance the productivity of software engineering students, and highlight the necessity of shifting the educational focus toward preparing students for successful human-AI collaboration.
Submitted 29 January, 2024;
originally announced January 2024.
-
A review on different techniques used to combat the non-IID and heterogeneous nature of data in FL
Authors:
Venkataraman Natarajan Iyer
Abstract:
Federated Learning (FL) is a machine-learning approach enabling collaborative model training across multiple decentralized edge devices that hold local data samples, all without exchanging these samples. This collaborative process occurs under the supervision of a central server orchestrating the training or via a peer-to-peer network. The significance of FL is particularly pronounced in industries such as healthcare and finance, where data privacy holds paramount importance. However, training a model under the federated learning setting brings forth several challenges, with one of the most prominent being the heterogeneity of data distribution among the edge devices. The data is typically not independent and identically distributed (non-IID), thereby presenting challenges to model convergence. This report delves into the issues arising from non-IID and heterogeneous data and explores current algorithms designed to address these challenges.
Submitted 1 January, 2024;
originally announced January 2024.
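As a rough illustration of the server-side aggregation described in the abstract above, the sketch below averages client parameters weighted by local sample count, a scheme widely known as FedAvg. The abstract does not name a specific algorithm; the function name `federated_average` and the flat-list parameter layout are assumptions made purely for illustration.

```python
from typing import List, Sequence


def federated_average(client_params: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighted by local sample count.

    Hypothetical helper: each client is represented by a flat list of
    parameters and the number of local samples it trained on.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_params[0])
    averaged = [0.0] * dim
    for params, n_samples in zip(client_params, client_sizes):
        if len(params) != dim:
            raise ValueError("all clients must share one parameter shape")
        weight = n_samples / total  # larger local datasets count for more
        for i, value in enumerate(params):
            averaged[i] += weight * value
    return averaged
```

When local data are non-IID, the individual client vectors can point in quite different directions, so this average may drift away from any single client's optimum; that is the kind of convergence issue the review is concerned with.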
-
Unsupervised Pre-Training Using Masked Autoencoders for ECG Analysis
Authors:
Guoxin Wang,
Qingyuan Wang,
Ganesh Neelakanta Iyer,
Avishek Nag,
Deepu John
Abstract:
Unsupervised learning methods have become increasingly important in deep learning due to their demonstrated ability to utilize large datasets and their higher accuracy in computer vision and natural language processing tasks. There is a growing trend to extend unsupervised learning methods to other domains, which helps to utilize a large amount of unlabelled data. This paper proposes an unsupervised pre-training technique based on a masked autoencoder (MAE) for electrocardiogram (ECG) signals. In addition, we propose a task-specific fine-tuning to form a complete framework for ECG analysis. The framework is high-level, universal, and not individually adapted to specific model architectures or tasks. Experiments are conducted using various model architectures and large-scale datasets, resulting in an accuracy of 94.39% on the MITDB dataset for the ECG arrhythmia classification task. The results show that the proposed approach classifies previously unseen data better than fully supervised methods.
Submitted 17 October, 2023;
originally announced October 2023.
-
Efficient Q-Learning over Visit Frequency Maps for Multi-agent Exploration of Unknown Environments
Authors:
Xuyang Chen,
Ashvin N. Iyer,
Zixing Wang,
Ahmed H. Qureshi
Abstract:
The robot exploration task has been widely studied with applications spanning from novel environment mapping to item delivery. For some time-critical tasks, such as rescue catastrophes, the agent is required to explore as efficiently as possible. Recently, Visit Frequency-based map representation achieved great success in such scenarios by discouraging repetitive visits with a frequency-based penalty. However, its relatively large size and single-agent settings hinder its further development. In this context, we propose the Integrated Visit Frequency Map, which encodes the same information as the Visit Frequency Map (VFM) in a more compact form, and a visit frequency-based multi-agent information exchange and control scheme that is able to accommodate both representations. Through tests in diverse settings, the results indicate our proposed methods can achieve a level of performance comparable to VFM with lower bandwidth requirements and generalize well to different multi-agent setups including real-world environments.
Submitted 30 July, 2023;
originally announced July 2023.
-
Does Transport Inequality Perpetuate Housing Insecurity?
Authors:
Nandini Iyer,
Ronaldo Menezes,
Hugo Barbosa
Abstract:
With trends of urbanisation on the rise, providing adequate housing to individuals remains a complex issue to be addressed. Often, the slow output of relevant housing policies, coupled with quickly increasing housing costs, leaves individuals with the burden of finding housing that is affordable and safe. In this paper, we unveil how urban planning, not just housing policies, can prevent individuals from accessing better housing conditions. We begin by proposing a clustering approach to characterising levels of housing insecurity in a city, by considering multiple dimensions of housing. Then we define levels of transit efficiency in 20 US cities by comparing public transit journeys to car-based journeys. Finally, we use geospatial autocorrelation to highlight how commuting to areas associated with better housing conditions results in transit commute times of over 30 minutes in most cities, and commute times of over an hour in some cases. Ultimately, we show the role that public transportation plays in locking vulnerable demographics into a cycle of poverty, thus motivating a more holistic approach to addressing housing insecurity that extends beyond changing housing policies.
Submitted 5 July, 2023;
originally announced July 2023.
-
Network Entropy as a Measure of Socioeconomic Segregation in Residential and Employment Landscapes
Authors:
Nandini Iyer,
Ronaldo Menezes,
Hugo Barbosa
Abstract:
Cities create potential for individuals from different backgrounds to interact with one another. It is often the case, however, that urban infrastructure obfuscates this potential, creating dense pockets of affluence and poverty throughout a region. The spatial distribution of job opportunities, and how it intersects with the residential landscape, is one of many such obstacles. In this paper, we apply global and local measures of entropy to the commuting networks of 25 US cities to capture structural diversity in residential and work patterns. We identify significant relationships between the heterogeneity of commuting origins and destinations with levels of employment and residential segregation, respectively. Finally, by comparing the local entropy values of low and high-income networks, we highlight how disparities in entropy are indicative of both employment segregation and residential inhomogeneities. Ultimately, this work motivates the application of network entropy to understand segregation not just from a residential perspective, but an experiential one as well.
Submitted 20 April, 2023;
originally announced April 2023.
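To make the idea of a local entropy measure on a commuting network concrete, the sketch below computes the Shannon entropy of the flow distribution leaving one node. The abstract does not state the exact definition the authors use; the function name `local_entropy`, the dictionary layout and the choice of natural logarithm are assumptions made for illustration.

```python
import math
from typing import Mapping


def local_entropy(flows: Mapping[str, float]) -> float:
    """Shannon entropy (in nats) of one node's commuting flows.

    `flows` maps a destination identifier to the number of commuters
    travelling there from the node under study (hypothetical layout).
    """
    total = sum(flows.values())
    if total <= 0:
        return 0.0  # no commuters: treat the distribution as degenerate

    entropy = 0.0
    for flow in flows.values():
        if flow > 0:  # zero flows contribute nothing to the sum
            p = flow / total
            entropy -= p * math.log(p)
    return entropy
```

Under this definition, a neighbourhood whose commuters all travel to a single destination has entropy 0, while one whose commuters are spread evenly over many destinations has high entropy.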
-
Mobility and Transit Segregation in Urban Spaces
Authors:
Nandini Iyer,
Ronaldo Menezes,
Hugo Barbosa
Abstract:
Segregation is a highly nuanced concept that researchers have worked to define and measure over the past several decades. Conventional approaches tend to estimate segregation based on residential patterns in a static manner. In this work, we analyse socioeconomic inequalities, assessing segregation in various dimensions of the urban experience. Moreover, we consider the pivotal role that transport plays in democratising access to opportunities. Using transport networks, amenity visitations, and census data, we develop a framework to approximate segregation, within the United States, for various dimensions of urban life. We find that neighbourhoods that are segregated in the residential domain tend to exhibit similar levels of segregation in amenity visitation patterns and transit usage, albeit to a lesser extent. We identify inequalities embedded into transit service, which impose constraints on residents from segregated areas, limiting the neighbourhoods that they can access within an hour to areas that are similarly disadvantaged. By exploring socioeconomic segregation from a transit perspective, we underscore the importance of conceptualising segregation as a dynamic measure, while also highlighting how transport systems can contribute to a cycle of disadvantage.
Submitted 14 April, 2023;
originally announced April 2023.
-
Enhancing Classification with Hierarchical Scalable Query on Fusion Transformer
Authors:
Sudeep Kumar Sahoo,
Sathish Chalasani,
Abhishek Joshi,
Kiran Nanjunda Iyer
Abstract:
Real-world vision-based applications require fine-grained classification in various areas of interest, such as e-commerce, mobile applications, and warehouse management, where reducing the severity of mistakes and improving classification accuracy are of utmost importance. This paper proposes a method to boost fine-grained classification through a hierarchical approach via learnable independent query embeddings. This is achieved through a classification network that uses coarse class predictions to improve the fine class accuracy in a stage-wise sequential manner. We exploit the idea of hierarchy to learn query embeddings that are scalable across all levels, thus making this a relevant approach even for extreme classification where we have a large number of classes. The query is initialized with a weighted Eigen image calculated from training samples to best represent and capture the variance of the object. We introduce transformer blocks to fuse intermediate layers at which query attention happens to enhance the spatial representation of feature maps at different scales. This multi-scale fusion helps improve the accuracy of small-size objects. We propose a two-fold approach for the unique representation of learnable queries. First, at each hierarchical level, we leverage a cluster-based loss that ensures maximum separation between inter-class query embeddings and helps learn a better (query) representation in higher dimensional spaces. Second, we fuse coarse-level queries with finer-level queries weighted by a learned scale factor. We additionally introduce a novel block called Cross Attention on Multi-level queries with Prior (CAMP) Block that helps reduce error propagation from the coarse level to the finer level, which is a common problem in all hierarchical classifiers. Our method is able to outperform the existing methods with an improvement of ~11% in fine-grained classification.
Submitted 28 February, 2023;
originally announced February 2023.
-
The design and performance of the XL-Calibur anticoincidence shield
Authors:
N. K. Iyer,
M. Kiss,
M. Pearce,
T. -A. Stana,
H. Awaki,
R. G. Bose,
A. Dasgupta,
G. De Geronimo,
E. Gau,
T. Hakamata,
M. Ishida,
K. Ishiwata,
W. Kamogawa,
F. Kislat,
T. Kitaguchi,
H. Krawczynski,
L. Lisalda,
Y. Maeda,
H. Matsumoto,
A. Miyamoto,
T. Miyazawa,
T. Mizuno,
B. F. Rauch,
N. Rodriguez Cavero,
N. Sakamoto
, et al. (9 additional authors not shown)
Abstract:
The XL-Calibur balloon-borne hard X-ray polarimetry mission comprises a Compton-scattering polarimeter placed at the focal point of an X-ray mirror. The polarimeter is housed within a BGO anticoincidence shield, which is needed to mitigate the considerable background radiation present at the observation altitude of ~40 km. This paper details the design, construction and testing of the anticoincidence shield, as well as the performance measured during the week-long maiden flight from Esrange Space Centre to the Canadian Northwest Territories in July 2022. The in-flight performance of the shield followed design expectations, with a veto threshold <100 keV and a measured background rate of ~0.5 Hz (20-40 keV). This is compatible with the scientific goals of the mission, where %-level minimum detectable polarisation is sought for a Hz-level source rate.
Submitted 8 December, 2022;
originally announced December 2022.
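For orientation, the relation commonly used in X-ray polarimetry to connect source rate, background rate and observing time to the minimum detectable polarisation (MDP) at 99% confidence is given below. The symbols are assumptions introduced here rather than taken from the abstract: $\mu$ is the modulation factor for a fully polarised beam, $R_S$ and $R_B$ are the source and background count rates, and $T$ is the observation time.

```latex
\mathrm{MDP}_{99\%} = \frac{4.29}{\mu R_S} \sqrt{\frac{R_S + R_B}{T}}
```

Under this relation, lowering the background rate $R_B$, which is the purpose of an anticoincidence shield, reduces the MDP for a fixed source rate and observation time.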
-
Peculiar temporal and spectral features in highly obscured HMXB pulsar IGR J16320-4751 using XMM-Newton
Authors:
Varun,
Nirmal Iyer,
Biswajit Paul
Abstract:
IGR J16320-4751 is a highly obscured HMXB source containing a very slow neutron star ($P_{spin}\sim1300$ sec) orbiting its supergiant companion star with a period of $\sim$9 days. It shows high column density ($N_{H}\sim2-5\times10^{23}$ $cm^{-2}$) in the spectrum, and a large variation in flux along the orbit despite not being an eclipsing source. We report on some peculiar timing and spectral features from archival XMM-Newton observations of this source, including 8 observations taken during a single orbit. The pulsar shows large timing variability in terms of average count rate from different observations, flaring activity, sudden changes in count rate, cessation of pulsation, and variable pulse profile even from observations taken a few days apart. We note that IGR J16320-4751 is among a small number of sources for which this temporary cessation of pulsation in the light curve has been observed. A time-resolved spectral analysis around the segment of missing pulse shows that variable absorption is driving such behavior in this source. Energy-resolved pulse profiles in the 6.2-6.6 keV band, which has a partial contribution from Fe K$_\alpha$ photons, show strong pulsation. However, a more systematic analysis reveals a flat pulse profile from the contribution of Fe K$_\alpha$ photons in this band, implying a symmetric distribution for the material responsible for this emission. Soft excess emission below 3 keV is seen in 6 out of 11 spectra of XMM-Newton observations.
Submitted 28 September, 2022;
originally announced September 2022.
-
Semantic Driven Energy based Out-of-Distribution Detection
Authors:
Abhishek Joshi,
Sathish Chalasani,
Kiran Nanjunda Iyer
Abstract:
Detecting Out-of-Distribution (OOD) samples in real-world visual applications like classification or object detection has become a necessary precondition in today's deployment of Deep Learning systems. Many techniques have been proposed, of which energy-based OOD methods have proved to be promising and achieved impressive performance. We propose a semantic-driven energy-based method, which is an end-to-end trainable system and easy to optimize. We distinguish in-distribution samples from out-distribution samples with an energy score coupled with a representation score. We achieve this by minimizing the energy for in-distribution samples while pulling them closer to their respective class representations, and by maximizing the energy for out-distribution samples while pushing their representations further from the known class representations. Moreover, we propose a novel loss function, which we call Cluster Focal Loss (CFL), that proved to be simple yet very effective in learning better class-wise cluster center representations. We find that our novel approach enhances outlier detection and achieves state-of-the-art results as an energy-based model on common benchmarks. On CIFAR-10 and CIFAR-100 trained WideResNet, our model significantly reduces the relative average False Positive Rate (at a True Positive Rate of 95%) by 67.2% and 57.4% respectively, compared to the existing energy-based approaches. Further, we extend our framework for object detection and achieve improved performance.
Submitted 23 August, 2022;
originally announced August 2022.
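For readers unfamiliar with energy-based OOD detection, the sketch below computes the standard free-energy score from a classifier's logits, which is the usual starting point for such methods. It is not the authors' semantic-driven method (the representation score and Cluster Focal Loss are not reproduced here); the function name `energy_score` and the `temperature` parameter are illustrative assumptions.

```python
import math
from typing import Sequence


def energy_score(logits: Sequence[float], temperature: float = 1.0) -> float:
    """Free energy E(x) = -T * log(sum_k exp(logit_k / T)).

    Lower energy suggests an in-distribution input; higher energy
    suggests an out-of-distribution input.
    """
    if not logits:
        raise ValueError("logits must be non-empty")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return -temperature * log_sum_exp
```

A sample would then be flagged as OOD when its energy exceeds a threshold, typically chosen so that 95% of in-distribution validation samples fall below it, which matches the operating point at which the abstract reports its false positive rate.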
-
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Authors:
Aarohi Srivastava,
Abhinav Rastogi,
Abhishek Rao,
Abu Awal Md Shoeb,
Abubakar Abid,
Adam Fisch,
Adam R. Brown,
Adam Santoro,
Aditya Gupta,
Adrià Garriga-Alonso,
Agnieszka Kluska,
Aitor Lewkowycz,
Akshat Agarwal,
Alethea Power,
Alex Ray,
Alex Warstadt,
Alexander W. Kocurek,
Ali Safaya,
Ali Tazarv,
Alice Xiang,
Alicia Parrish,
Allen Nie,
Aman Hussain,
Amanda Askell,
Amanda Dsouza
, et al. (426 additional authors not shown)
Abstract:
Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 450 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.
Submitted 12 June, 2023; v1 submitted 9 June, 2022;
originally announced June 2022.
-
Performance of the X-Calibur Hard X-Ray Polarimetry Mission during its 2018/19 Long-Duration Balloon Flight
Authors:
Quincy Abarr,
Banafsheh Beheshtipour,
Matthias Beilicke,
Richard Bose,
Dana Braun,
Gianluigi de Geronimo,
Paul Dowkontt,
Manel Errando,
Thomas Gadson,
Victor Guarino,
Scott Heatwole,
Md. Arman Hossen,
Nirmal K. Iyer,
Fabian Kislat,
Mózsi Kiss,
Takao Kitaguchi,
Henric Krawczynski,
R. James Lanzi,
Shaorui Li,
Lindsey Lisalda,
Takashi Okajima,
Mark Pearce,
Zachary Peterson,
Logan Press,
Brian Rauch
, et al. (6 additional authors not shown)
Abstract:
X-Calibur is a balloon-borne telescope that measures the polarization of high-energy X-rays in the 15-50 keV energy range. The instrument makes use of the fact that X-rays scatter preferentially perpendicular to the polarization direction. A beryllium scattering element surrounded by pixellated CZT detectors is located at the focal point of the InFOCμS hard X-ray mirror. The instrument was launched for a long-duration balloon (LDB) flight from McMurdo (Antarctica) on December 29, 2018, and obtained the first constraints of the hard X-ray polarization of an accretion-powered pulsar. Here, we describe the characterization and calibration of the instrument on the ground and its performance during the flight, as well as simulations of particle backgrounds and a comparison to measured rates. The pointing system and polarimeter achieved the excellent projected performance. The energy detection threshold for the anticoincidence system was found to be higher than expected and it exhibited unanticipated dead time. Both issues will be remedied for future flights. Overall, the mission performance was nominal, and results will inform the design of the follow-up mission XL-Calibur, which is scheduled to be launched in summer 2022.
Submitted 20 April, 2022;
originally announced April 2022.
-
Period-Index problem for hyperelliptic curves
Authors:
J. N. Iyer,
R. Parimala
Abstract:
Let $C$ be a smooth projective curve of genus 2 over a number field $k$ with a rational point. We prove that the index and exponent coincide for elements in the 2-torsion of $\Sha(Br(C))$.
In the appendix, an isomorphism of the moduli space of rank 2 stable vector bundles with odd determinant on a smooth projective hyperelliptic curve $C$ of genus $g$ with a rational point over any field of characteristic not two with the Grassmannian of $(g-1)$-dimensional linear subspaces in the base locus of a certain pencil of quadrics is established, making a result of (\cite{De-Ra}) rational. We establish a twisted version of this isomorphism and we derive as a consequence a weak Hasse principle for the smooth intersection $X$ of two quadrics in ${\mathbb P}^5$ over a number field: if $X$ contains a line locally, then $X$ has a $k$-rational point.
Submitted 3 March, 2022; v1 submitted 30 January, 2022;
originally announced January 2022.
-
PATO: Producibility-Aware Topology Optimization using Deep Learning for Metal Additive Manufacturing
Authors:
Naresh S. Iyer,
Amir M. Mirzendehdel,
Sathyanarayanan Raghavan,
Yang Jiao,
Erva Ulu,
Morad Behandish,
Saigopal Nelaturi,
Dean M. Robinson
Abstract:
In this paper, we propose PATO, a producibility-aware topology optimization (TO) framework to help efficiently explore the design space of components fabricated using metal additive manufacturing (AM), while ensuring manufacturability with respect to cracking. Specifically, parts fabricated through Laser Powder Bed Fusion are prone to defects such as warpage or cracking due to high residual stress values generated from the steep thermal gradients produced during the build process. Maturing the design for such parts and planning their fabrication can span months to years, often involving multiple handoffs between design and manufacturing engineers. PATO is based on the a priori discovery of crack-free designs, so that the optimized part can be built defect-free at the outset. To ensure that the design is crack-free during optimization, producibility is explicitly encoded within the standard formulation of TO, using a crack index. Multiple crack indices are explored and, using experimental validation, the maximum shear strain index (MSSI) is shown to be an accurate crack index. Simulating the build process is a coupled, multi-physics computation, and incorporating it in the TO loop can be computationally prohibitive. We leverage the current advances in deep convolutional neural networks and present a high-fidelity surrogate model based on an Attention-based U-Net architecture to predict the MSSI values as a spatially varying field over the part's domain. Further, we employ automatic differentiation to directly compute the gradient of maximum MSSI with respect to the input design variables and augment it with the performance-based sensitivity field to optimize the design while considering the trade-off between weight, manufacturability, and functionality. We demonstrate the effectiveness of the proposed method through benchmark studies in 3D as well as experimental validation.
Submitted 8 December, 2021;
originally announced December 2021.
-
LRTuner: A Learning Rate Tuner for Deep Neural Networks
Authors:
Nikhil Iyer,
V Thejas,
Nipun Kwatra,
Ramachandran Ramjee,
Muthian Sivathanu
Abstract:
One very important hyperparameter for training deep neural networks is the learning rate schedule of the optimizer. The choice of learning rate schedule determines the computational cost of getting close to a minimum, how close you actually get to the minimum, and most importantly the kind of local minimum (wide/narrow) attained. The kind of minimum attained has a significant impact on the generalization accuracy of the network. Current systems employ hand-tuned learning rate schedules, which are painstakingly tuned for each network and dataset. Given that the state space of schedules is huge, finding a satisfactory learning rate schedule can be very time consuming. In this paper, we present LRTuner, a method for tuning the learning rate as training proceeds. Our method works with any optimizer, and we demonstrate results on the SGD with Momentum and Adam optimizers.
We extensively evaluate LRTuner on multiple datasets, models, and across optimizers. We compare favorably against standard learning rate schedules for the given dataset and models, including ImageNet on Resnet-50, Cifar-10 on Resnet-18, and SQuAD fine-tuning on BERT. For example, on ImageNet with Resnet-50, LRTuner shows up to 0.2% absolute gains in test accuracy compared to the hand-tuned baseline schedule. Moreover, LRTuner can achieve the same accuracy as the baseline schedule in 29% fewer optimization steps.
Submitted 30 May, 2021;
originally announced May 2021.
-
XL-Calibur -- a second-generation balloon-borne hard X-ray polarimetry mission
Authors:
Q. Abarr,
H. Awaki,
M. G. Baring,
R. Bose,
G. De Geronimo,
P. Dowkontt,
M. Errando,
V. Guarino,
K. Hattori,
K. Hayashida,
F. Imazato,
M. Ishida,
N. K. Iyer,
F. Kislat,
M. Kiss,
T. Kitaguchi,
H. Krawczynski,
L. Lisalda,
H. Matake,
Y. Maeda,
H. Matsumoto,
T. Mineta,
T. Miyazawa,
T. Mizuno,
T. Okajima
, et al. (16 additional authors not shown)
Abstract:
XL-Calibur is a hard X-ray (15-80 keV) polarimetry mission operating from a stabilised balloon-borne platform in the stratosphere. It builds on heritage from the X-Calibur mission, which observed the accreting neutron star GX 301-2 from Antarctica, between December 29th 2018 and January 1st 2019. The XL-Calibur design incorporates an X-ray mirror, which focusses X-rays onto a polarimeter comprising a beryllium rod surrounded by Cadmium Zinc Telluride (CZT) detectors. The polarimeter is housed in an anticoincidence shield to mitigate background from particles present in the stratosphere. The mirror and polarimeter-shield assembly are mounted at opposite ends of a 12 m long lightweight truss, which is pointed with arcsecond precision by WASP - the Wallops Arc Second Pointer. The XL-Calibur mission will achieve a substantially improved sensitivity over X-Calibur by using a larger effective area X-ray mirror, reducing background through thinner CZT detectors, and improved anticoincidence shielding. When observing a 1 Crab source for $t_{\rm day}$ days, the Minimum Detectable Polarisation (at 99% confidence level) is $\sim$2$\%\cdot t_{\rm day}^{-1/2}$. The energy resolution at 40 keV is $\sim$5.9 keV. The aim of this paper is to describe the design and performance of the XL-Calibur mission, as well as the foreseen science programme.
Submitted 20 October, 2020;
originally announced October 2020.
-
Evolution of timing and spectral characteristics of 4U 1901+03 during its 2019 outburst using the Swift and NuSTAR observatories
Authors:
Aru Beri,
Tinku,
Nirmal K. Iyer,
Chandreyee Maitra
Abstract:
We report the results from a detailed timing and spectral study of the transient X-ray pulsar 4U 1901+03 during its 2019 outburst. We performed broadband spectroscopy in the 1-70 keV energy band using four observations made with Swift and NuSTAR at different intensity levels. Our timing results reveal the presence of highly variable pulse profiles dependent on both luminosity and energy. Our spectroscopy results showed the presence of a cyclotron resonance scattering feature (CRSF) at ~30 keV. This feature at 30 keV is highly luminosity and pulse-phase dependent. Phase-averaged spectra during the last two observations, made close to the declining phase of the outburst, showed the presence of this feature at around 30 keV. The existence of the CRSF at 30 keV during these observations is well supported by an abrupt change in the shape of pulse profiles found close to this energy. We also found that the 30 keV feature was significantly detected in the pulse-phase resolved spectra of observations made at relatively high luminosities. Moreover, all spectral fit parameters showed a strong pulse phase dependence. In line with the previous findings, an absorption feature at around 10 keV is significantly observed in the phase-averaged X-ray spectra of all observations and also showed a strong pulse phase dependence.
Submitted 26 October, 2020; v1 submitted 27 September, 2020;
originally announced September 2020.
-
Comparison of Source Coding Techniques for the Vehicle to Vehicle Communication
Authors:
Varad Vinod Prabhu,
Subrahmanya Gunaga,
Rahul M. S.,
Akash Kulkarni,
Nalini C. Iyer
Abstract:
Autonomous driving is gaining importance due to advancements in technology. With the intention of improving safety during human driving, and with the longer-term aim of acting as a communication enabler for autonomous driving, vehicle-to-vehicle communication is also gaining importance. In this paper, we discuss and compare various source coding techniques that can be used for vehicle-to-vehicle communication. We propose abbreviation-based and probability-based source coding methods for vehicle-to-vehicle communication. We compare the proposed application-specific source coding methods with other techniques such as Huffman, arithmetic, and Lempel-Ziv-Welch coding. Experimental results show that the proposed probability-based source coding has better values of the compression ratio to the time required for all the messages considered.
Submitted 4 August, 2020;
originally announced August 2020.
-
Selection of Robust Digital Communication Techniques for the Vehicle to Vehicle Communication
Authors:
Subrahmanya Gunaga,
Varad Vinod Prabhu,
Rahul M. S.,
Akash Kulkarni,
Nalini C. Iyer
Abstract:
Vehicle-to-vehicle (V2V) communication has become one of the key features in achieving complete autonomy for self-driving vehicles. We use digital communication as a backbone to deliver vehicle-to-vehicle communication. The primary building blocks of a digital communication system are source coding, error correction and detection, and channel coding. Choosing optimal techniques for each block plays a significant role in the performance of the entire communication system. Five source coding techniques, three error control, and channel coding techniques, respectively, have been considered for the implementation. The methods were evaluated based on different comparison parameters for each of the building blocks. Based on the obtained results, the robust techniques were chosen for our application.
Submitted 2 August, 2020;
originally announced August 2020.
-
Wide-minima Density Hypothesis and the Explore-Exploit Learning Rate Schedule
Authors:
Nikhil Iyer,
V Thejas,
Nipun Kwatra,
Ramachandran Ramjee,
Muthian Sivathanu
Abstract:
Several papers argue that wide minima generalize better than narrow minima. In this paper, through detailed experiments that not only corroborate the generalization properties of wide minima, we also provide empirical evidence for a new hypothesis that the density of wide minima is likely lower than the density of narrow minima. Further, motivated by this hypothesis, we design a novel explore-exploit learning rate schedule. On a variety of image and natural language datasets, compared to their original hand-tuned learning rate baselines, we show that our explore-exploit schedule can result in either up to 0.84% higher absolute accuracy using the original training budget or up to 57% reduced training time while achieving the original reported accuracy. For example, we achieve state-of-the-art (SOTA) accuracy for IWSLT'14 (DE-EN) dataset by just modifying the learning rate schedule of a high performing model.
Submitted 1 June, 2021; v1 submitted 9 March, 2020;
originally announced March 2020.
-
Variational Encoder-based Reliable Classification
Authors:
Chitresh Bhushan,
Zhaoyuan Yang,
Nurali Virani,
Naresh Iyer
Abstract:
Machine learning models provide statistically impressive results which might be individually unreliable. To provide reliability, we propose an Epistemic Classifier (EC) that can provide justification of its belief using support from the training dataset as well as quality of reconstruction. Our approach is based on modified variational auto-encoders that can identify a semantically meaningful low-dimensional space where perceptually similar instances are close in $\ell_2$-distance too. Our results demonstrate improved reliability of predictions and robust identification of samples with adversarial attacks as compared to baseline of softmax-based thresholding.
Submitted 17 October, 2020; v1 submitted 19 February, 2020;
originally announced February 2020.
-
Tautological algebra of the moduli stack of semi stable bundles of rank two on a general curve
Authors:
Chandranandan Gangopadhyay,
Jaya NN Iyer,
Arijit Mukherjee
Abstract:
Our aim in this paper is to determine the tautological algebra generated by the cohomology classes of the Brill-Noether loci in the rational cohomology of the moduli stack $\mathcal{U}_C(n,d)$ of semistable bundles of rank $n$ and degree $d$. When $C$ is a general smooth projective curve of genus $g\geq 2$, $n=2$, $d=2g-2$, the tautological algebra of $\mathcal{U}_C(2,2g-2)$ (resp. $\mathcal{SU}_C(2,L)$, $\deg(L)=2g-2$) is generated by the divisor classes (resp. the Theta divisor $\Theta$).
Submitted 27 May, 2021; v1 submitted 3 February, 2020;
originally announced February 2020.
-
Observations of a GX 301-2 Apastron Flare with the X-Calibur Hard X-Ray Polarimeter Supported by NICER, the Swift XRT and BAT, and Fermi GBM
Authors:
Q. Abarr,
M. Baring,
B. Beheshtipour,
M. Beilicke,
G. deGeronimo,
P. Dowkontt,
M. Errando,
V. Guarino,
N. Iyer,
F. Kislat,
M. Kiss,
T. Kitaguchi,
H. Krawczynski,
J. Lanzi,
S. Li,
L. Lisalda,
T. Okajima,
M. Pearce,
L. Press,
B. Rauch,
D. Stuchlik,
H. Takahashi,
J. Tang,
N. Uchida,
A. West
, et al. (6 additional authors not shown)
Abstract:
The accretion-powered X-ray pulsar GX 301-2 was observed with the balloon-borne X-Calibur hard X-ray polarimeter during late December 2018, with contiguous observations by the NICER X-ray telescope, the Swift X-ray Telescope and Burst Alert Telescope, and the Fermi Gamma-ray Burst Monitor spanning several months. The observations detected the pulsar in a rare apastron flaring state coinciding with a significant spin-up of the pulsar discovered with the Fermi GBM. The X-Calibur, NICER, and Swift observations reveal a pulse profile strongly dominated by one main peak, and the NICER and Swift data show strong variation of the profile from pulse to pulse. The X-Calibur observations constrain for the first time the linear polarization of the 15-35 keV emission from a highly magnetized accreting neutron star, indicating a polarization degree of $27^{+38}_{-27}$% (90% confidence limit) averaged over all pulse phases. We discuss the spin-up and the X-ray spectral and polarimetric results in the context of theoretical predictions. We conclude with a discussion of the scientific potential of future observations of highly magnetized neutron stars with the more sensitive follow-up mission XL-Calibur.
Submitted 10 January, 2020;
originally announced January 2020.
-
Justification-Based Reliability in Machine Learning
Authors:
Nurali Virani,
Naresh Iyer,
Zhaoyuan Yang
Abstract:
With the advent of Deep Learning, the field of machine learning (ML) has surpassed human-level performance on diverse classification tasks. At the same time, there is a stark need to characterize and quantify reliability of a model's prediction on individual samples. This is especially true in application of such models in safety-critical domains of industrial control and healthcare. To address this need, we link the question of reliability of a model's individual prediction to the epistemic uncertainty of the model's prediction. More specifically, we extend the theory of Justified True Belief (JTB) in epistemology, created to study the validity and limits of human-acquired knowledge, towards characterizing the validity and limits of knowledge in supervised classifiers. We present an analysis of neural network classifiers linking the reliability of a classifier's prediction on an input to characteristics of the support gathered from the input and latent spaces of the network. We hypothesize that the JTB analysis exposes the epistemic uncertainty (or ignorance) of a model with respect to its inference, thereby allowing for the inference to be only as strong as the justification permits. We explore various forms of support (e.g., k-nearest neighbors (k-NN) and l_p-norm based) generated for an input, using the training data to construct a justification for the prediction with that input. Through experiments conducted on simulated and real datasets, we demonstrate that our approach can provide reliability for individual predictions and characterize regions where such reliability cannot be ascertained.
Submitted 14 November, 2021; v1 submitted 17 November, 2019;
originally announced November 2019.
-
A Compton polarimeter using scintillators read out with MPPCs through Citiroc ASIC
Authors:
Rakhee Kushwah,
Nirmal K. Iyer,
Mózsi Kiss,
Theodor A. Stana,
Mark Pearce
Abstract:
In recent years, a number of purpose-built scintillator-based polarimeters have studied bright astronomical sources for the first time in the hard X-ray band (tens to hundreds of keV). The addition of polarimetry can help data interpretation by resolving model-dependent degeneracies. The typical instrument approach is that incident X-rays scatter off a plastic scintillator into an adjacent scintillator cell. In all missions to date, the scintillators are read out using traditional vacuum tube photo-multipliers (PMTs). The advent of solid-state PMTs ("silicon PM" or "MPPC") is attractive for space-based instruments since the devices are compact, robust and require a low bias voltage. We have characterised the plastic scintillator, EJ-248M, optically coupled to a multi-pixel photon counter (MPPC) and read out with the Citiroc ASIC. A light-yield of 1.6 photoelectrons/keV has been obtained, with a low energy detection threshold of $\lesssim$5 keV at room temperature. We have also constructed an MPPC-based polarimeter-demonstrator in order to investigate the feasibility of such an approach for future instruments. Incident X-rays scatter from a plastic-scintillator bar to surrounding cerium-doped GAGG (Gadolinium Aluminium Gallium Garnet) scintillators yielding time-coincident signals in the scintillators. We have determined the polarimetric response of this set-up using both unpolarised and polarised $\sim$50 keV X-rays. We observe a clear asymmetry in the GAGG counting rates for the polarised beam. The low-energy detection threshold in the plastic scintillator can be further reduced using a coincidence technique. The demonstrated polarimeter design shows promise as a space-based Compton polarimeter and we discuss ways in which our polarimeter can be adapted for such a mission.
Submitted 21 July, 2019;
originally announced July 2019.
-
AstroSat view of MAXI J1535-571: broadband spectro-temporal features
Authors:
Sreehari H.,
Ravishankar B. T.,
Nirmal Iyer,
V. K. Agrawal,
Tilak B. Katoch,
Samir Mandal,
Anuj Nandi
Abstract:
We present the results of Target of Opportunity (ToO) observations made with AstroSat of the newly discovered black hole binary MAXI J1535-571. We detect prominent C-type Quasi-periodic Oscillations (QPOs) of frequencies varying from 1.85 Hz to 2.88 Hz, along with distinct harmonics in all the AstroSat observations. We note that while the fundamental QPO is seen in the 3 - 50 keV energy band, the harmonic is not significant above ~ 35 keV. The AstroSat observations were made in the hard intermediate state, as seen from state transitions observed by MAXI and Swift. We attempt spectral modelling of the broadband data (0.7-80 keV) provided by AstroSat using phenomenological and physical models. The spectral modelling using nthComp gives a photon index in the range between 2.18-2.37 and electron temperature ranging from 21 to 63 keV. The seed photon temperature is within 0.19 to 0.29 keV. The high flux in 0.3 - 80 keV band corresponds to a luminosity varying from 0.7 to 1.07 L_Edd assuming the source to be at a distance of 8 kpc and hosting a black hole with a mass of 6 M$_{\odot}$. The physical model based on the two-component accretion flow gives disc accretion rates as high as ~ 1 $\dot{m}_{Edd}$ and halo rate ~ 0.2 $\dot{m}_{Edd}$ respectively. The near Eddington accretion rate seems to be the main reason for the unprecedented high flux observed from this source. The two-component spectral fitting of AstroSat data also provides an estimate of a black hole mass between 5.14 to 7.83 M$_{\odot}$.
Submitted 12 May, 2019;
originally announced May 2019.
-
Design of intentional backdoors in sequential models
Authors:
Zhaoyuan Yang,
Naresh Iyer,
Johan Reimann,
Nurali Virani
Abstract:
Recent work has demonstrated robust mechanisms by which attacks can be orchestrated on machine learning models. In contrast to adversarial examples, backdoor or trojan attacks embed surgically modified samples with targeted labels in the model training process to cause the targeted model to learn to misclassify chosen samples in the presence of specific triggers, while keeping the model performance stable across other nominal samples. However, current published research on trojan attacks mainly focuses on classification problems, which ignores sequential dependency between inputs. In this paper, we propose methods to discreetly introduce and exploit novel backdoor attacks within a sequential decision-making agent, such as a reinforcement learning agent, by training multiple benign and malicious policies within a single long short-term memory (LSTM) network. We demonstrate the effectiveness as well as the damaging impact of such attacks through initial outcomes generated from our approach, employed on grid-world environments. We also provide evidence as well as intuition on how the trojan trigger and malicious policy is activated. Challenges with network size and unintentional triggers are identified and analogies with adversarial examples are also discussed. In the end, we propose potential approaches to defend against or serve as early detection for such attacks. Results of our work can also be extended to many applications of LSTM and recurrent networks.
Submitted 26 February, 2019;
originally announced February 2019.
-
LAXPC / AstroSat Study of ~ 1 and ~ 2 mHz Quasi-periodic Oscillations in the Be/X-ray Binary 4U 0115+63 During its 2015 Outburst
Authors:
Jayashree Roy,
P. C. Agrawal,
N. K. Iyer,
D. Bhattacharya,
J. S. Yadav,
H. M. Antia,
J. V. Chauhan,
M. Choudhury,
D. K. Dedhia,
T. Katoch,
P. Madhavani,
R. K. Manchanda,
R. Misra,
M. Pahari,
B. Paul,
P. Shah
Abstract:
The Be X-ray Binary 4U 0115+63 was observed by Large Area X-ray Proportional Counter (LAXPC) instrument on AstroSat on 2015 October 24 during the peak of a giant Type II outburst. Prominent intensity oscillations at ~ 1 and ~ 2 mHz frequency were detected during the outburst. Nuclear Spectroscopic Telescope Array (NuSTAR) observations made during the same outburst also show mHz quasi periodic oscillations (QPOs). Details of the oscillations and their characteristics deduced from LAXPC/AstroSat and NuSTAR observations are reported in this paper. Analysis of the archival Rossi X-ray Timing Explorer (RXTE)/Proportional Counter Array (PCA) data during 2001-11 also show presence of mHz QPOs during some of the outbursts and details of these QPOs are also reported. Possible models to explain the origin of the mHz oscillations are examined. Similar QPOs, albeit at higher frequencies, have been reported from other neutron star and black hole sources and both may have a common origin. Current models to explain the instability in the inner accretion disk causing the intense oscillations are discussed.
Submitted 27 January, 2019;
originally announced January 2019.
-
Gamma-ray burst localisation strategies for the SPHiNX hard X-ray polarimeter
Authors:
L. Heckmann,
N. K. Iyer,
M. Kiss,
M. Pearce,
F. Xie
Abstract:
SPHiNX is a proposed gamma-ray burst (GRB) polarimeter mission operating in the energy range 50-600 keV with the aim of studying the prompt emission phase. The polarisation sensitivity of SPHiNX reduces as the uncertainty on the GRB sky position increases. The stand-alone ability of the SPHiNX design to localise GRB positions is explored via Geant4 simulations. Localisation at the level of a few degrees is possible using three different routines. This results in a large fraction (> 80%) of observed GRBs having a negligible (< 5%) reduction in polarisation sensitivity due to the uncertainty in localisation.
Submitted 25 January, 2019;
originally announced January 2019.
-
Constraining the mass of the black hole GX 339-4 using spectro-temporal analysis of multiple outbursts
Authors:
Sreehari H.,
Nirmal Iyer,
Radhika D.,
Anuj Nandi,
Samir Mandal
Abstract:
We carried out spectro-temporal analysis of the archived data from multiple outbursts spanning over the last two decades from the black hole X-ray binary GX 339-4. In this paper, the mass of the compact object in the X-ray binary system GX 339-4 is constrained based on three indirect methods. The first method uses broadband spectral modelling with a two component flow structure of the accretion around the black hole. The broadband data are obtained from RXTE (Rossi X-ray Timing Explorer) in the range 3.0 to 150.0 keV and from Swift and NuSTAR (Nuclear Spectroscopic Telescope Array) simultaneously in the range 0.5 to 79.0 keV. In the second method, we model the time evolution of Quasi-periodic Oscillation (QPO) frequencies, considering it to be the result of an oscillating shock that radially propagates towards or away from the compact object. The third method is based on scaling a mass dependent parameter from an empirical model of the photon index ($Γ$) - QPO ($ν$) correlation. We compare the results at 90 percent confidence from the three methods and summarize the mass estimate of the central object to be in the range $8.28 - 11.89~ M_{\odot}$.
Submitted 10 November, 2018;
originally announced November 2018.
-
Science prospects for SPHiNX - a small satellite GRB polarimetry mission
Authors:
M. Pearce,
L. Eliasson,
N. Kumar Iyer,
M. Kiss,
R. Kushwah,
J. Larsson,
C. Lundman,
V. Mikhalev,
F. Ryde,
T. -A. Stana,
H. Takahashi,
F. Xie
Abstract:
Gamma-ray bursts (GRBs) are exceptionally bright electromagnetic events occurring daily on the sky. The prompt emission is dominated by X-/$γ$-rays. Since their discovery over 50 years ago, GRBs are primarily studied through spectral and temporal measurements. The properties of the emission jets and underlying processes are not well understood. A promising way forward is the development of missions capable of characterising the linear polarisation of the high-energy emission. For this reason, the SPHiNX mission has been developed for a small-satellite platform. The polarisation properties of incident high-energy radiation (50-600 keV) are determined by reconstructing Compton scattering interactions in a segmented array of plastic and Gd$_3$Al$_2$Ga$_3$O$_{12}$(Ce) (GAGG(Ce)) scintillators. During a two-year mission, $\sim$200 GRBs will be observed, with $\sim$50 yielding measurements where the polarisation fraction is determined with a relative error $\leq$10%. This is a significant improvement compared to contemporary missions. This performance, combined with the ability to reconstruct GRB localisation and spectral properties, will allow discrimination between leading classes of emission models.
Submitted 16 August, 2018;
originally announced August 2018.
-
Learning Contextual Bandits in a Non-stationary Environment
Authors:
Qingyun Wu,
Naveen Iyer,
Hongning Wang
Abstract:
Multi-armed bandit algorithms have become a reference solution for handling the explore/exploit dilemma in recommender systems, and many other important real-world problems, such as display advertisement. However, such algorithms usually assume a stationary reward distribution, which hardly holds in practice as users' preferences are dynamic. This inevitably leaves a recommender system with consistently suboptimal performance. In this paper, we consider the situation where the underlying distribution of reward remains unchanged over (possibly short) epochs and shifts at unknown time instants. Accordingly, we propose a contextual bandit algorithm that detects possible changes of environment based on its reward estimation confidence and updates its arm selection strategy in response. Rigorous upper regret bound analysis of the proposed algorithm demonstrates its learning effectiveness in such a non-trivial environment. Extensive empirical evaluations on both synthetic and real-world datasets for recommendation confirm its practical utility in a changing environment.
Submitted 23 May, 2018;
originally announced May 2018.
-
Push-forwards of Chow groups of smooth ample divisors
Authors:
Kalyan Banerjee,
Jaya NN Iyer,
James D. Lewis
Abstract:
We introduce a homological Lefschetz conjecture on (rational) Chow groups, which can be deduced from some well known conjectures, and illustrate it by a series of key examples. We then prove the injectivity of the push-forward morphism on Chow groups, induced by the closed embedding of the Theta divisor in its Jacobian $J(C)$. Here $C$ is a smooth irreducible complex projective curve.
Submitted 30 March, 2021; v1 submitted 9 May, 2018;
originally announced May 2018.
-
Accretion Flow Dynamics During 1999 Outburst of XTE J1859+226 - Modeling of Broadband Spectra and Constraining the Source Mass
Authors:
A. Nandi,
S. Mandal,
H. Sreehari,
D. Radhika,
S. Das,
I. Chattopadhyay,
N. Iyer,
V. K. Agrawal,
R. Aktar
Abstract:
We examine the dynamical behavior of accretion flow around XTE J1859+226 during the 1999 outburst by analyzing the entire outburst data ($\sim$ 166 days) from RXTE Satellite. Towards this, we study the hysteresis behavior in the hardness intensity diagram (HID) based on the broadband ($3 - 150$ keV) spectral modeling, spectral signature of jet ejection and the evolution of Quasi-periodic Oscillation (QPO) frequencies using the two-component advective flow model around a black hole. We compute the flow parameters, namely Keplerian accretion rate (${\dot m}_d$), sub-Keplerian accretion rate (${\dot m}_h$), shock location ($r_s$) and black hole mass ($M_{bh}$) from the spectral modeling and study their evolution along the q-diagram. Subsequently, the kinetic jet power is computed as $L^{\rm obs}_{\rm jet}\sim 3 - 6 \times 10^{37}$ erg s$^{-1}$ during one of the observed radio flares which indicates that jet power corresponds to $8-16\%$ mass outflow rate from the disc. This estimate of mass outflow rate is in close agreement with the change in total accretion rate ($\sim 14\%$) required for spectral modeling before and during the flare. Finally, we provide a mass estimate of the source XTE J1859+226 based on the spectral modeling that lies in the range of $5.2 - 7.9 M_{\odot}$ with 90% confidence.
Submitted 22 March, 2018;
originally announced March 2018.
-
Observational aspects of Outbursting Black Hole Sources - Evolution of Spectro-Temporal features and X-ray Variability
Authors:
H. Sreehari,
Anuj Nandi,
D. Radhika,
Nirmal Iyer,
Samir Mandal
Abstract:
We report on our attempt to understand the outbursting profile of Galactic Black Hole (GBH) sources, keeping in mind the evolution of temporal and spectral features during the outburst. We present results of evolution of Quasi-periodic Oscillations (QPOs), spectral states and possible connection with Jet ejections during the outburst phase. Further, we attempt to connect the observed X-ray variabilities (i.e., 'class' / 'structured' variabilities, similar to GRS 1915+105) with spectral states of BH sources. Towards these studies, we consider three Black Hole sources that have undergone single (XTE J1859+226), a few (IGR J17091-3624) and many (GX 339-4) outbursts since the start of the RXTE era. Finally, we model the broadband energy spectra (3 - 150 keV) of different spectral states using RXTE and NuSTAR observations. Results are discussed in the context of the two component advective flow model, while constraining the mass of the three BH sources.
Submitted 14 February, 2018;
originally announced February 2018.
-
Brauer groups of schemes associated to symmetric powers of smooth projective curves in arbitrary characteristics
Authors:
Jaya NN Iyer,
Roy Joshua
Abstract:
In this paper we show that the l^n-torsion part of the cohomological Brauer groups of certain schemes associated to symmetric powers of a projective smooth curve over a separably closed field k are isomorphic, when l is invertible in k. The schemes considered are the symmetric powers themselves, then the corresponding Picard schemes and also certain Quot-schemes. We also obtain similar results for Prym varieties associated to certain finite covers of such curves: we prove such results only for curves defined over the field of complex numbers.
Submitted 16 June, 2019; v1 submitted 29 July, 2017;
originally announced July 2017.
-
Orbital variations in intensity and spectral properties of the highly obscured sgHMXB IGR J16318-4848
Authors:
Nirmal Iyer,
Biswajit Paul
Abstract:
IGR J16318-4848 is an X-ray binary with the highest known line of sight absorption column density among all known X-ray binary systems in our galaxy. In order to investigate the reason behind such a large absorption column, we looked at the variations in the X-ray intensity and spectral parameters as a function of the tentatively discovered $\sim$ 80 day orbit of this source. The orbital period is firmly confirmed in the long term ($\sim$ 12 year) Swift BAT lightcurve. Two peaks about half an orbit apart, one narrow and small, and the other broad and large are seen in the orbital intensity profile. We find that while most orbits show enhanced emissions at these two peaks, the larger peak in the folded longterm lightcurve is more a result of randomly occurring large flares spread over $\sim$ 0.2 orbital phase. As opposed to this, the smaller peak is seen in every orbit as a regular increase in intensity. Using archival data spread over different phases of the orbit and the geometry of the system as known from previously published infrared observations, we present a possible scenario which explains the orbital intensity profile, bursting characteristics and large column density of this X-ray binary.
Submitted 22 June, 2017;
originally announced June 2017.
-
Neural Network Based Speaker Classification and Verification Systems with Enhanced Features
Authors:
Zhenhao Ge,
Ananth N. Iyer,
Srinath Cheluvaraja,
Ram Sundaram,
Aravind Ganapathiraju
Abstract:
This work presents a novel framework based on a feed-forward neural network for text-independent speaker classification and verification, two related systems of speaker recognition. With optimized features and model training, it achieves a 100% classification rate in classification and less than 6% Equal Error Rate (ERR) in verification, using merely about 1 second and 5 seconds of data respectively. Features with stricter Voice Active Detection (VAD) than the regular one for speech recognition ensure extraction of a stronger voiced portion for speaker recognition, and speaker-level mean and variance normalization helps to eliminate the discrepancy between samples from the same speaker. Both are proven to improve the system performance. In building the neural network speaker classifier, the network structure parameters are optimized with grid search, and dynamically reduced regularization parameters are used to avoid training terminating in a local minimum. This enables the training to go further with lower cost. In speaker verification, performance is improved with prediction score normalization, which rewards the speaker identity indices with distinct peaks and penalizes the weak ones with high scores but more competitors, and speaker-specific thresholding, which significantly reduces ERR in the ROC curve. The TIMIT corpus with 8K sampling rate is used here. The first 200 male speakers are used to train and test the classification performance. Their testing files are used as in-domain registered speakers, while data from the remaining 126 male speakers are used as out-of-domain speakers, i.e. imposters in speaker verification.
Submitted 7 February, 2017;
originally announced February 2017.
-
Speaker Change Detection Using Features through A Neural Network Speaker Classifier
Authors:
Zhenhao Ge,
Ananth N. Iyer,
Srinath Cheluvaraja,
Aravind Ganapathiraju
Abstract:
The mechanism proposed here is for real-time speaker change detection in conversations, which first trains a neural network text-independent speaker classifier using in-domain speaker data. Through the network, features of conversational speech from out-of-domain speakers are then converted into likelihood vectors, i.e. similarity scores compared to the in-domain speakers. These transformed features demonstrate very distinctive patterns, which facilitate differentiating speakers and enable speaker change detection with some straightforward distance metrics. The speaker classifier and the speaker change detector are trained/tested using speech of the first 200 (in-domain) and the remaining 126 (out-of-domain) male speakers in TIMIT respectively. For the speaker classification, 100% accuracy at a 200 speaker size is achieved on any testing file, given the speech duration is at least 0.97 seconds. For the speaker change detection using speaker classification outputs, performance based on 0.5, 1, and 2 seconds of inspection intervals was evaluated in terms of error rate and F1 score, using data synthesized by concatenating speech from various speakers. It captures close to 97% of the changes by comparing the current second of speech with the previous second, which is very competitive with results in the literature obtained using other methods.
Submitted 7 February, 2017;
originally announced February 2017.
-
Noetherian algebras of quantum differential operators
Authors:
Uma N Iyer,
David A Jordan
Abstract:
We consider algebras of quantum differential operators, for appropriate bicharacters on a polynomial algebra in one indeterminate and for the coordinate algebra of quantum $n$-space for $n\geq 3$. In the former case a set of generators for the quantum differential operators was identified in work by the first author and T. C. McCune but it was not known whether the algebra is Noetherian. We answer this question affirmatively, setting it in a more general context involving the behaviour of the Noetherian condition under localization at the powers of a single element. In the latter case we determine the algebra of quantum differential operators as a skew group algebra of the group $\mathbb{Z}^n$ over a quantized Weyl algebra. It follows from this description that this algebra is a simple right and left Noetherian domain.
Submitted 2 December, 2016;
originally announced December 2016.
-
Generation and Pruning of Pronunciation Variants to Improve ASR Accuracy
Authors:
Zhenhao Ge,
Aravind Ganapathiraju,
Ananth N. Iyer,
Scott A. Randal,
Felix I. Wyss
Abstract:
Speech recognition, especially name recognition, is widely used in phone services such as company directory dialers, stock quote providers or location finders. It is usually challenging due to pronunciation variations. This paper proposes an efficient and robust data-driven technique which automatically learns acceptable word pronunciations and updates the pronunciation dictionary to build a better lexicon without affecting recognition of other words similar to the target word. It generalizes well on datasets with various sizes, and reduces the error rate on a database with 13000+ human names by 42%, compared to a baseline with regular dictionaries already covering canonical pronunciations of 97%+ words in names, plus a well-trained spelling-to-pronunciation (STP) engine.
Submitted 28 June, 2016;
originally announced June 2016.
-
A Factorized Recurrent Neural Network based architecture for medium to large vocabulary Language Modelling
Authors:
Anantharaman Palacode Narayana Iyer
Abstract:
Statistical language models are central to many applications that use semantics. Recurrent Neural Networks (RNN) are known to produce state of the art results for language modelling, outperforming their traditional n-gram counterparts in many cases. To generate a probability distribution across a vocabulary, these models require a softmax output layer that linearly increases in size with the size of the vocabulary. Large vocabularies need a commensurately large softmax layer and training them on typical laptops/PCs requires significant time and machine resources. In this paper we present a new technique for implementing RNN based large vocabulary language models that substantially speeds up computation while optimally using the limited memory resources. Our technique, while building on the notion of factorizing the output layer by having multiple output layers, improves on the earlier work by substantially optimizing on the individual output layer size and also eliminating the need for a multistep prediction process.
Submitted 4 February, 2016;
originally announced February 2016.
-
Variations in the Cyclotron Resonant Scattering Features during 2011 outburst of 4U 0115+63
Authors:
N. Iyer,
D. Mukherjee,
G. C. Dewangan,
D. Bhattacharya,
S. Seetha
Abstract:
We study the variations in the Cyclotron Resonant Scattering Feature (CRSF) during 2011 outburst of the high mass X-ray binary 4U 0115+63 using observations performed with Suzaku, RXTE, Swift and INTEGRAL satellites. The wide-band spectral data with low energy coverage allowed us to characterize the broadband continuum and detect the CRSFs. We find that the broadband continuum is adequately described by a combination of a low temperature (kT ~ 0.8 keV) blackbody and a power-law with high energy cutoff (Ecut ~ 5.4 keV) without the need for a broad Gaussian at ~ 10 keV as used in some earlier studies. Though winds from the companion can affect the emission from the neutron star at low energies (< 3 keV), the blackbody component shows a significant presence in our continuum model. We report evidence for the possible presence of two independent sets of CRSFs with fundamentals at ~ 11 keV and ~ 15 keV. These two sets of CRSFs could arise from spatially distinct emitting regions. We also find evidence for variations in the line equivalent widths, with the 11 keV CRSF weakening and the 15 keV line strengthening with decreasing luminosity. Finally, we propose that the reason for the earlier observed anti-correlation of line energy with luminosity could be due to modelling of these two independent line sets (~ 11 keV and ~ 15 keV) as a single CRSF.
Submitted 20 August, 2015; v1 submitted 10 June, 2015;
originally announced June 2015.
-
Determination of mass of IGR J17091-3624 from "Spectro-Temporal" variations during onset-phase of the 2011 outburst
Authors:
N. Iyer,
A. Nandi,
S. Mandal
Abstract:
The 2011 outburst of the black hole candidate IGR J17091-3624 followed the canonical track of state transitions along with the evolution of Quasi-Periodic Oscillation (QPO) frequencies before it began exhibiting various variability classes similar to GRS 1915+105. We use this canonical evolution of spectral and temporal properties to determine the mass of IGR J17091-3624, using three different methods, viz : Photon Index ($Γ$) - QPO frequency ($ν$) correlation, QPO frequency ($ν$) - Time (day) evolution and broadband spectral modelling based on Two Component Advective Flow. We provide a combined mass estimate for the source using a Naive Bayes based joint likelihood approach. This gives a probable mass range of 11.8 M$_{\odot}$ - 13.7 M$_{\odot}$. Considering each individual estimate and taking the lowermost and uppermost bounds among all three methods, we get a mass range of 8.7 M$_{\odot}$ - 15.6 M$_{\odot}$ with 90% confidence. We discuss the probable implications of our findings in the context of two component accretion flow.
Submitted 11 May, 2015;
originally announced May 2015.
-
On the kernel of the push-forward homomorphism between Chow groups
Authors:
Kalyan Banerjee,
Jaya NN Iyer
Abstract:
In this note we prove that the kernel of the push-forward homomorphism on $d$-cycles modulo rational equivalence, induced by the closed embedding of an ample divisor linearly equivalent to some multiple of the theta divisor inside the Jacobian variety $J(C)$ is trivial. Here $C$ is a smooth projective curve of genus $g$.
Submitted 20 June, 2016; v1 submitted 29 April, 2015;
originally announced April 2015.
-
Cohomological invariants of a variation of flat connection
Authors:
Jaya N. N. Iyer
Abstract:
In this paper, we apply the theory of Chern-Cheeger-Simons to construct canonical invariants associated to an $r$-simplex whose points parametrize flat connections on a smooth manifold $X$. These invariants lie in degree $(2p-r-1)$ cohomology with $C/Z$-coefficients, for $p> r\geq 1$. In turn, this corresponds to a homomorphism on the higher homology groups of the moduli space of flat connections, taking values in the $C/Z$-cohomology of the underlying smooth manifold $X$.
Submitted 22 September, 2015; v1 submitted 20 April, 2015;
originally announced April 2015.