Search | arXiv e-print repository

arXiv:2406.03699 [pdf, other]

M-QALM: A Benchmark to Assess Clinical Reading Comprehension and Knowledge Recall in Large Language Models via Question Answering

Authors: Anand Subramanian, Viktor Schlegel, Abhinav Ramesh Kashyap, Thanh-Tung Nguyen, Vijay Prakash Dwivedi, Stefan Winkler

Abstract: There is vivid research on adapting Large Language Models (LLMs) to perform a variety of tasks in high-stakes domains such as healthcare. Despite their popularity, there is a lack of understanding of the extent and contributing factors that allow LLMs to recall relevant knowledge and combine it with presented information in the clinical and biomedical domain: a fundamental pre-requisite for success on down-stream tasks. Addressing this gap, we use Multiple Choice and Abstractive Question Answering to conduct a large-scale empirical study on 22 datasets in three generalist and three specialist biomedical sub-domains. Our multifaceted analysis of the performance of 15 LLMs, further broken down by sub-domain, source of knowledge and model architecture, uncovers success factors such as instruction tuning that lead to improved recall and comprehension. We further show that while recently proposed domain-adapted models may lack adequate knowledge, directly fine-tuning on our collected medical knowledge datasets shows encouraging results, even generalising to unseen specialist sub-domains. We complement the quantitative results with a skill-oriented manual error analysis, which reveals a significant gap between the models' capabilities to simply recall necessary knowledge and to integrate it with the presented context.
To foster research and collaboration in this field we share M-QALM, our resources, standardised methodology, and evaluation results, with the research community to facilitate further advancements in clinical knowledge representation learning within language models.

Submitted 5 June, 2024; originally announced June 2024.

Comments: Accepted at ACL 2024 (Findings)

arXiv:2403.04134 [pdf, other]

An Adaptable, Safe, and Portable Robot-Assisted Feeding System

Authors: Ethan Kroll Gordon, Rajat Kumar Jenamani, Amal Nanavati, Ziang Liu, Haya Bolotski, Raida Karim, Daniel Stabile, Atharva Kashyap, Bernie Hao Zhu, Xilai Dai, Tyler Schrenk, Jonathan Ko, Taylor Kessler Faulkner, Tapomayukh Bhattacharjee, Siddhartha Srinivasa

Abstract: We demonstrate a robot-assisted feeding system that enables people with mobility impairments to feed themselves. Our system design embodies Safety, Portability, and User Control, with comprehensive full-stack safety checks, the ability to be mounted on and powered by any powered wheelchair, and a custom web-app allowing care-recipients to leverage their own assistive devices for robot control. For bite acquisition, we leverage multi-modal online learning to tractably adapt to unseen food types. For bite transfer, we leverage real-time mouth perception and interaction-aware control. Co-designed with community researchers, our system has been validated through multiple end-user studies.

Submitted 6 March, 2024; originally announced March 2024.

Comments: HRI 2024 Demo; Corrected inaccurate author ordering in ACM DL which occurred due to formatting issues

arXiv:2312.13533 [pdf, other]

Automated Clinical Coding for Outpatient Departments

Authors: Viktor Schlegel, Abhinav Ramesh Kashyap, Thanh-Tung Nguyen, Tsung-Han Yang, Vijay Prakash Dwivedi, Wei-Hsian Yin, Jeng Wei, Stefan Winkler

Abstract: Computerised clinical coding approaches aim to automate the process of assigning a set of codes to medical records. While there is active research pushing the state of the art on clinical coding for hospitalized patients, the outpatient setting -- where doctors tend to non-hospitalised patients -- is overlooked. Although both settings can be formalised as a multi-label classification task, they present unique and distinct challenges, which raises the question of whether the success of inpatient clinical coding approaches translates to the outpatient setting. This paper is the first to investigate how well state-of-the-art deep learning-based clinical coding approaches work in the outpatient setting at hospital scale. To this end, we collect a large outpatient dataset comprising over 7 million notes documenting over half a million patients. We adapt four state-of-the-art clinical coding approaches to this setting and evaluate their potential to assist coders. We find evidence that clinical coding in outpatient settings can benefit from more innovations in popular inpatient coding benchmarks. A deeper analysis of the factors contributing to the success -- amount and form of data and choice of document representation -- reveals the presence of easy-to-solve examples, the coding of which can be completely automated with a low error rate.

Submitted 24 December, 2023; v1 submitted 20 December, 2023; originally announced December 2023.

Comments: 9 pages, preprint under review

arXiv:2312.12810 [pdf, other]

Unconstrained Dysfluency Modeling for Dysfluent Speech Transcription and Detection

Authors: Jiachen Lian, Carly Feng, Naasir Farooqi, Steve Li, Anshul Kashyap, Cheol Jun Cho, Peter Wu, Robbie Netzorg, Tingle Li, Gopala Krishna Anumanchipalli

Abstract: Dysfluent speech modeling requires time-accurate and silence-aware transcription at both the word-level and phonetic-level. However, current research in dysfluency modeling primarily focuses on either transcription or detection, and the performance of each aspect remains limited. In this work, we present an unconstrained dysfluency modeling (UDM) approach that addresses both transcription and detection in an automatic and hierarchical manner. UDM eliminates the need for extensive manual annotation by providing a comprehensive solution. Furthermore, we introduce a simulated dysfluent dataset called VCTK++ to enhance the capabilities of UDM in phonetic transcription. Our experimental results demonstrate the effectiveness and robustness of our proposed methods in both transcription and detection tasks.

Submitted 20 December, 2023; originally announced December 2023.

Comments: 2023 ASRU

arXiv:2310.05800 [pdf]

A review on multiscale computational studies for enhanced oil recovery using nanoparticles

Authors: Rajneesh Kashyap, Mohit Kalra, Arti Kashyap

Abstract: Oil reservoirs around the globe are at their declining phase and in spite of enormous effectiveness of Enhanced Oil Recovery(EOR) in the Tertiary Stage. This process still bypasses some oil reason being surface forces responsible for holding oil inside the rock surface which are not being altered by the application of existing technologies. The processes coming under Tertiary Section Supplements primary and secondary sections. However, the mechanism of operating is different in both. Nanoparticles are showing a significant role in EOR techniques and is a promising approach to increase crude oil extraction. This is due to the fact that size of nanoparticles used for EOR lies in the range of 1-100 nm. It is also an interesting fact that in different operational conditions and parameters, the performance of nanoparticles also vary and some are more effective than others, which leads to various levels of recovery in the EOR process. In the present study, we intend to summarize a report having an up to date status on nanotechnology assisted EOR mechanisms where nanoparticles are used as nano-catalysts, nano-emulsions and nanoparticles assisted EOR mechanisms to destabilize the oil layer on carbonate surface. This review also highlights the various mechanisms such Gibb's free energy, wettability alteration, and Interfacial Tension Reduction (ITR) including interaction of available nanoparticles with reservoirs.
Experimental measurements for a wide range of nanoparticles are not only expensive but are challenging because of the relatively small size, especially for the measurements of thinner capillaries of a nanoscale diameter. Therefore, we considered computational simulations as a more adequate approach to gain more microscopic insights into the oil displacement process to classify the suitability of nanomaterials.

Submitted 9 October, 2023; originally announced October 2023.

arXiv:2310.00062 [pdf, other]

Tradeoffs in concentration sensing in dynamic environments

Authors: Aparajita Kashyap, Wei Wang, Brian A. Camley

Abstract: When cells measure concentrations of chemical signals, they may average multiple measurements over time in order to reduce noise in their measurements. However, when cells are in a environment that changes over time, past measurements may not reflect current conditions - creating a new source of error that trades off against noise in chemical sensing. What statistics in the cell's environment control this tradeoff? What properties of the environment make it variable enough that this tradeoff is relevant? We model a single eukaryotic cell sensing a chemical secreted from bacteria (e.g. folic acid). In this case, the environment changes because the bacteria swim - leading to changes in the true concentration at the cell. We develop analytical calculations and stochastic simulations of sensing in this environment. We find that cells can have a huge variety of optimal sensing strategies, ranging from not time averaging at all, to averaging over an arbitrarily long time, or having a finite optimal averaging time. The factors that primarily control the ideal averaging are the ratio of sensing noise to environmental variation, and the ratio of timescales of sensing to the timescale of environmental variation. Sensing noise depends on the receptor-ligand kinetics, while the environmental variation depends on the density of bacteria and the degradation and diffusion properties of the secreted chemoattractant. Our results suggest that fluctuating environmental concentrations may be a relevant source of noise even in a relatively static environment.

Submitted 29 September, 2023; originally announced October 2023.

arXiv:2307.02006 [pdf, other]

PULSAR at MEDIQA-Sum 2023: Large Language Models Augmented by Synthetic Dialogue Convert Patient Dialogues to Medical Records

Authors: Viktor Schlegel, Hao Li, Yuping Wu, Anand Subramanian, Thanh-Tung Nguyen, Abhinav Ramesh Kashyap, Daniel Beck, Xiaojun Zeng, Riza Theresa Batista-Navarro, Stefan Winkler, Goran Nenadic

Abstract: This paper describes PULSAR, our system submission at the ImageClef 2023 MediQA-Sum task on summarising patient-doctor dialogues into clinical records. The proposed framework relies on domain-specific pre-training, to produce a specialised language model which is trained on task-specific natural data augmented by synthetic data generated by a black-box LLM. We find limited evidence towards the efficacy of domain-specific pre-training and data augmentation, while scaling up the language model yields the best performance gains. Our approach was ranked second and third among 13 submissions on task B of the challenge. Our code is available at https://github.com/yuping-wu/PULSAR.

Submitted 4 July, 2023; originally announced July 2023.

Comments: 8 pages. ImageClef 2023 MediQA-Sum

arXiv:2306.02754 [pdf, other]

PULSAR: Pre-training with Extracted Healthcare Terms for Summarising Patients' Problems and Data Augmentation with Black-box Large Language Models

Authors: Hao Li, Yuping Wu, Viktor Schlegel, Riza Batista-Navarro, Thanh-Tung Nguyen, Abhinav Ramesh Kashyap, Xiaojun Zeng, Daniel Beck, Stefan Winkler, Goran Nenadic

Abstract: Medical progress notes play a crucial role in documenting a patient's hospital journey, including his or her condition, treatment plan, and any updates for healthcare providers. Automatic summarisation of a patient's problems in the form of a problem list can aid stakeholders in understanding a patient's condition, reducing workload and cognitive bias. BioNLP 2023 Shared Task 1A focuses on generating a list of diagnoses and problems from the provider's progress notes during hospitalisation. In this paper, we introduce our proposed approach to this task, which integrates two complementary components. One component employs large language models (LLMs) for data augmentation; the other is an abstractive summarisation LLM with a novel pre-training objective for generating the patients' problems summarised as a list. Our approach was ranked second among all submissions to the shared task. The performance of our model on the development and test datasets shows that our approach is more robust on unknown data, with an improvement of up to 3.1 points over the same size of the larger model.

Submitted 5 June, 2023; originally announced June 2023.

Comments: Accepted by ACL 2023's workshop BioNLP 2023

arXiv:2306.00005 [pdf, other]

A Two-Stage Decoder for Efficient ICD Coding

Authors: Thanh-Tung Nguyen, Viktor Schlegel, Abhinav Kashyap, Stefan Winkler

Abstract: Clinical notes in healthcare facilities are tagged with the International Classification of Diseases (ICD) code; a list of classification codes for medical diagnoses and procedures. ICD coding is a challenging multilabel text classification problem due to noisy clinical document inputs and long-tailed label distribution. Recent automated ICD coding efforts improve performance by encoding medical notes and codes with additional data and knowledge bases. However, most of them do not reflect how human coders generate the code: first, the coders select general code categories and then look for specific subcategories that are relevant to a patient's condition. Inspired by this, we propose a two-stage decoding mechanism to predict ICD codes. Our model uses the hierarchical properties of the codes to split the prediction into two steps: At first, we predict the parent code and then predict the child code based on the previous prediction. Experiments on the public MIMIC-III data set show that our model performs well in single-model settings without external data or knowledge.

Submitted 27 May, 2023; originally announced June 2023.

Comments: Accepted to ACL'23

arXiv:2305.18028 [pdf, other]

ADAPTERMIX: Exploring the Efficacy of Mixture of Adapters for Low-Resource TTS Adaptation

Authors: Ambuj Mehrish, Abhinav Ramesh Kashyap, Li Yingting, Navonil Majumder, Soujanya Poria

Abstract: There are significant challenges for speaker adaptation in text-to-speech for languages that are not widely spoken or for speakers with accents or dialects that are not well-represented in the training data. To address this issue, we propose the use of the "mixture of adapters" method. This approach involves adding multiple adapters within a backbone-model layer to learn the unique characteristics of different speakers. Our approach outperforms the baseline, with a noticeable improvement of 5% observed in speaker preference tests when using only one minute of data for each new speaker. Moreover, following the adapter paradigm, we fine-tune only the adapter parameters (11% of the total model parameters). This is a significant achievement in parameter-efficient speaker adaptation, and one of the first models of its kind. Overall, our proposed approach offers a promising solution to the speech synthesis techniques, particularly for adapting to speakers from diverse backgrounds.

Submitted 29 May, 2023; originally announced May 2023.

Comments: Interspeech 2023

arXiv:2305.14369 [pdf, other]

Learning low-dimensional dynamics from whole-brain data improves task capture

Authors: Eloy Geenjaar, Donghyun Kim, Riyasat Ohib, Marlena Duda, Amrit Kashyap, Sergey Plis, Vince Calhoun

Abstract: The neural dynamics underlying brain activity are critical to understanding cognitive processes and mental disorders. However, current voxel-based whole-brain dimensionality reduction techniques fall short of capturing these dynamics, producing latent timeseries that inadequately relate to behavioral tasks. To address this issue, we introduce a novel approach to learning low-dimensional approximations of neural dynamics by using a sequential variational autoencoder (SVAE) that represents the latent dynamical system via a neural ordinary differential equation (NODE). Importantly, our method finds smooth dynamics that can predict cognitive processes with accuracy higher than classical methods. Our method also shows improved spatial localization to task-relevant brain regions and identifies well-known structures such as the motor homunculus from fMRI motor task recordings. We also find that non-linear projections to the latent space enhance performance for specific tasks, offering a promising direction for future research. We evaluate our approach on various task-fMRI datasets, including motor, working memory, and relational processing tasks, and demonstrate that it outperforms widely used dimensionality reduction techniques in how well the latent timeseries relates to behavioral sub-tasks, such as left-hand or right-hand tapping. Additionally, we replace the NODE with a recurrent neural network (RNN) and compare the two approaches to understand the importance of explicitly learning a dynamical system.
Lastly, we analyze the robustness of the learned dynamical systems themselves and find that their fixed points are robust across seeds, highlighting our method's potential for the analysis of cognitive processes as dynamical systems.

Submitted 18 May, 2023; originally announced May 2023.

Comments: 9 pages, 4 figures

arXiv:2305.12641 [pdf, other]

A Comprehensive Survey of Sentence Representations: From the BERT Epoch to the ChatGPT Era and Beyond

Authors: Abhinav Ramesh Kashyap, Thanh-Tung Nguyen, Viktor Schlegel, Stefan Winkler, See-Kiong Ng, Soujanya Poria

Abstract: Sentence representations are a critical component in NLP applications such as retrieval, question answering, and text classification. They capture the meaning of a sentence, enabling machines to understand and reason over human language. In recent years, significant progress has been made in developing methods for learning sentence representations, including unsupervised, supervised, and transfer learning approaches. However there is no literature review on sentence representations till now. In this paper, we provide an overview of the different methods for sentence representation learning, focusing mostly on deep learning models. We provide a systematic organization of the literature, highlighting the key contributions and challenges in this area. Overall, our review highlights the importance of this area in natural language processing, the progress made in sentence representation learning, and the challenges that remain. We conclude with directions for future research, suggesting potential avenues for improving the quality and efficiency of sentence representations.

Submitted 2 February, 2024; v1 submitted 21 May, 2023; originally announced May 2023.

Comments: Accepted to EACL'24

arXiv:2304.13998 [pdf, other]

Mimic-IV-ICD: A new benchmark for eXtreme MultiLabel Classification

Authors: Thanh-Tung Nguyen, Viktor Schlegel, Abhinav Kashyap, Stefan Winkler, Shao-Syuan Huang, Jie-Jyun Liu, Chih-Jen Lin

Abstract: Clinical notes are assigned ICD codes - sets of codes for diagnoses and procedures. In the recent years, predictive machine learning models have been built for automatic ICD coding. However, there is a lack of widely accepted benchmarks for automated ICD coding models based on large-scale public EHR data. This paper proposes a public benchmark suite for ICD-10 coding using a large EHR dataset derived from MIMIC-IV, the most recent public EHR dataset. We implement and compare several popular methods for ICD coding prediction tasks to standardize data preprocessing and establish a comprehensive ICD coding benchmark dataset. This approach fosters reproducibility and model comparison, accelerating progress toward employing automated ICD coding in future studies. Furthermore, we create a new ICD-9 benchmark using MIMIC-IV data, providing more data points and a higher number of ICD codes than MIMIC-III. Our open-source code offers easy access to data processing steps, benchmark creation, and experiment replication for those with MIMIC-IV access, providing insights, guidance, and protocols to efficiently develop ICD coding models.

Submitted 27 April, 2023; originally announced April 2023.

Comments: Benchmark, Multilabel, Classification

arXiv:2302.12301 [pdf, other]

An Aligned Multi-Temporal Multi-Resolution Satellite Image Dataset for Change Detection Research

Authors: Rahul Deshmukh, Constantine J. Roros, Amith Kashyap, Avinash C. Kak

Abstract: This paper presents an aligned multi-temporal and multi-resolution satellite image dataset for research in change detection. We expect our dataset to be useful to researchers who want to fuse information from multiple satellites for detecting changes on the surface of the earth that may not be fully visible in any single satellite. The dataset we present was created by augmenting the SpaceNet-7 dataset with temporally parallel stacks of Landsat and Sentinel images. The SpaceNet-7 dataset consists of time-sequenced Planet images recorded over 101 AOIs (Areas-of-Interest). In our dataset, for each of the 60 AOIs that are meant for training, we augment the Planet datacube with temporally parallel datacubes of Landsat and Sentinel images. The temporal alignments between the high-res Planet images, on the one hand, and the Landsat and Sentinel images, on the other, are approximate since the temporal resolution for the Planet images is one month -- each image being a mosaic of the best data collected over a month. Whenever we have a choice regarding which Landsat and Sentinel images to pair up with the Planet images, we have chosen those that had the least cloud cover. A particularly important feature of our dataset is that the high-res and the low-res images are spatially aligned together with our MuRA framework presented in this paper. Foundational to the alignment calculation is the modeling of inter-satellite misalignment errors with polynomials as in NASA's AROP algorithm.
We have named our dataset MuRA-T for the MuRA framework that is used for aligning the cross-satellite images and "T" for the temporal dimension in the dataset.

Submitted 27 February, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

Comments: 8 pages, 4 figures, 3 tables, satellite image dataset

arXiv:2302.03194 [pdf, other]

UDApter -- Efficient Domain Adaptation Using Adapters

Authors: Bhavitvya Malik, Abhinav Ramesh Kashyap, Min-Yen Kan, Soujanya Poria

Abstract: We propose two methods to make unsupervised domain adaptation (UDA) more parameter efficient using adapters, small bottleneck layers interspersed with every layer of the large-scale pre-trained language model (PLM). The first method deconstructs UDA into a two-step process: first by adding a domain adapter to learn domain-invariant information and then by adding a task adapter that uses domain-invariant information to learn task representations in the source domain. The second method jointly learns a supervised classifier while reducing the divergence measure. Compared to strong baselines, our simple methods perform well in natural language inference (MNLI) and the cross-domain sentiment classification task. We even outperform unsupervised domain adaptation methods such as DANN and DSN in sentiment classification, and we are within 0.85% F1 for natural language inference task, by fine-tuning only a fraction of the full model parameters. We release our code at https://github.com/declare-lab/domadapter

Submitted 16 February, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

Comments: Accepted to EACL 2023

arXiv:2211.05100 [pdf, other]

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Authors: BigScience Workshop: Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major, et al. (369 additional authors not shown)

Abstract: Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

Submitted 27 June, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

arXiv:2210.03667 [pdf, other]

CommsVAE: Learning the brain's macroscale communication dynamics using coupled sequential VAEs

Authors: Eloy Geenjaar, Noah Lewis, Amrit Kashyap, Robyn Miller, Vince Calhoun

Abstract: Communication within or between complex systems is commonplace in the natural sciences and fields such as graph neural networks. The brain is a perfect example of such a complex system, where communication between brain regions is constantly being orchestrated. To analyze communication, the brain is often split up into anatomical regions that each perform certain computations. These regions must interact and communicate with each other to perform tasks and support higher-level cognition. On a macroscale, these regions communicate through signal propagation along the cortex and along white matter tracts over longer distances. When and what types of signals are communicated over time is an unsolved problem and is often studied using either functional or structural data. In this paper, we propose a non-linear generative approach to communication from functional data. We address three issues with common connectivity approaches by explicitly modeling the directionality of communication, finding communication at each timestep, and encouraging sparsity. To evaluate our model, we simulate temporal data that has sparse communication between nodes embedded in it and show that our model can uncover the expected communication dynamics. Subsequently, we apply our model to temporal neural data from multiple tasks and show that our approach models communication that is more specific to each task. The specificity of our method means it can have an impact on the understanding of psychiatric disorders, which are believed to be related to highly specific communication between brain regions compared to controls. In sum, we propose a general model for dynamic communication learning on graphs, and show its applicability to a subfield of the natural sciences, with potential widespread scientific impact.

Submitted 7 October, 2022; originally announced October 2022.

Comments: 14 pages, 8 figures

arXiv:2209.15428 [pdf, other]

PyPose: A Library for Robot Learning with Physics-based Optimization

Authors: Chen Wang, Dasong Gao, Kuan Xu, Junyi Geng, Yaoyu Hu, Yuheng Qiu, Bowen Li, Fan Yang, Brady Moon, Abhinav Pandey, Aryan, Jiahe Xu, Tianhao Wu, Haonan He, Daning Huang, Zhongqiang Ren, Shibo Zhao, Taimeng Fu, Pranay Reddy, Xiao Lin, Wenshan Wang, Jingnan Shi, Rajat Talak, Kun Cao, Yi Du, et al. (12 additional authors not shown)

Abstract: Deep learning has had remarkable success in robotic perception, but its data-centric nature suffers when it comes to generalizing to ever-changing environments. By contrast, physics-based optimization generalizes better, but it does not perform as well in complicated tasks due to the lack of high-level semantic information and reliance on manual parametric tuning. To take advantage of these two complementary worlds, we present PyPose: a robotics-oriented, PyTorch-based library that combines deep perceptual models with physics-based optimization. PyPose's architecture is tidy and well-organized, it has an imperative style interface and is efficient and user-friendly, making it easy to integrate into real-world robotic applications. Besides, it supports parallel computing of any order gradients of Lie groups and Lie algebras and $2^{\text{nd}}$-order optimizers, such as trust region methods. Experiments show that PyPose achieves more than $10\times$ speedup in computation compared to the state-of-the-art libraries. To boost future research, we provide concrete examples for several fields of robot learning, including SLAM, planning, control, and inertial navigation.

Submitted 24 March, 2023; v1 submitted 30 September, 2022; originally announced September 2022.

Comments: Project Website: https://pypose.org Documentation: https://pypose.org/docs/ Tutorial: https://pypose.org/tutorials/ Source code: https://github.com/pypose/pypose

Journal ref: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023

arXiv:2205.13640 [pdf, other]

Spatio-temporally separable non-linear latent factor learning: an application to somatomotor cortex fMRI data

Authors: Eloy Geenjaar, Amrit Kashyap, Noah Lewis, Robyn Miller, Vince Calhoun

Abstract: Functional magnetic resonance imaging (fMRI) data contain complex spatiotemporal dynamics, thus researchers have developed approaches that reduce the dimensionality of the signal while extracting relevant and interpretable dynamics. Models of fMRI data that can perform whole-brain discovery of dynamical latent factors are understudied. The benefits of approaches such as linear independent component analysis models have been widely appreciated, however, nonlinear extensions of these models present challenges in terms of identification. Deep learning methods provide a way forward, but new methods for efficient spatial weight-sharing are critical to deal with the high dimensionality of the data and the presence of noise. Our approach generalizes weight sharing to non-Euclidean neuroimaging data by first performing spectral clustering based on the structural and functional similarity between voxels. The spectral clusters and their assignments can then be used as patches in an adapted multi-layer perceptron (MLP)-mixer model to share parameters among input points. To encourage temporally independent latent factors, we use an additional total correlation term in the loss. Our approach is evaluated on data with multiple motor sub-tasks to assess whether the model captures disentangled latent factors that correspond to each sub-task. Then, to assess the latent factors we find further, we compare the spatial location of each latent factor to the motor homunculus. Finally, we show that our approach captures task effects better than the current gold standard of source signal separation, independent component analysis (ICA).

Submitted 26 May, 2022; originally announced May 2022.

Comments: 12 pages, 3 figures

arXiv:2205.05977 [pdf, other]

doi 10.1021/acs.jpcc.2c04767

Pivotal Role of Intersite Hubbard Interactions in Fe-doped $α$-MnO$_2$

Authors: Ruchika Mahajan, Arti Kashyap, Iurii Timrov

Abstract: We present a first-principles investigation of the structural, electronic, and magnetic properties of the pristine and Fe-doped $α$-MnO$_2$ using density-functional theory with extended Hubbard functionals. The onsite $U$ and intersite $V$ Hubbard parameters are determined from first principles and self-consistently using density-functional perturbation theory in the basis of Löwdin-orthogonalized atomic orbitals. For the pristine $α$-MnO$_2$ we find that the so-called C2-AFM spin configuration is the most energetically favorable, in agreement with the experimentally observed antiferromagnetic ground state. For the Fe-doped $α$-MnO$_2$ two types of doping are considered: Fe insertion in the $2 \times 2$ tunnels and partial substitution of Fe for Mn. We find that the interstitial doping preserves the C2-AFM spin configuration of the host lattice only when both onsite $U$ and intersite $V$ Hubbard corrections are included, while for the substitutional doping the onsite Hubbard $U$ correction alone is able to preserve the C2-AFM spin configuration of the host lattice. The oxidation state of Fe is found to be $+2$ and $+4$ in the case of the interstitial and substitutional doping, respectively, while the oxidation state of Mn is $+4$ in both cases. This work paves the way for accurate studies of other MnO$_2$ polymorphs and complex transition-metal compounds when the localization of $3d$ electrons occurs in the presence of strong covalent interactions with ligands.

Submitted 26 August, 2022; v1 submitted 12 May, 2022; originally announced May 2022.

Journal ref: J. Phys. Chem. C 126, 14353 (2022)

arXiv:2205.04093 [pdf, other]

So Different Yet So Alike! Constrained Unsupervised Text Style Transfer

Authors: Abhinav Ramesh Kashyap, Devamanyu Hazarika, Min-Yen Kan, Roger Zimmermann, Soujanya Poria

Abstract: Automatic transfer of text between domains has become popular in recent times. One of its aims is to preserve the semantic content of text being translated from source to target domain. However, it does not explicitly maintain other attributes between the source and translated text, e.g., text length and descriptiveness. Maintaining constraints in transfer has several downstream applications, including data augmentation and de-biasing. We introduce a method for such constrained unsupervised text style transfer by introducing two complementary losses to the generative adversarial network (GAN) family of models. Unlike the competing losses used in GANs, we introduce cooperative losses where the discriminator and the generator cooperate and reduce the same loss. The first is a contrastive loss and the second is a classification loss, aiming to regularize the latent space further and bring similar sentences across domains closer together. We demonstrate that such training retains lexical, syntactic, and domain-specific constraints between domains for multiple benchmark datasets, including ones where more than one attribute changes. We show that the complementary cooperative losses improve text quality, according to both automated and human evaluation measures.

Submitted 9 May, 2022; originally announced May 2022.

Comments: Accepted to ACL 2022

arXiv:2203.09037 [pdf, other]

Collision Avoidance of 3-Dimensional Objects in Dynamic Environments

Authors: Kashish Dhal, Abhishek Kashyap, Animesh Chakravarthy

Abstract: Achieving collision avoidance between moving objects is an important objective while determining robot trajectories. In performing collision avoidance maneuvers, the relative shapes of the objects play an important role. The literature largely models the shapes of the objects as spheres, and this can make the avoidance maneuvers very conservative, especially when the objects are of elongated shape and/or non-convex. In this paper, we model the shapes of the objects using suitable combinations of ellipsoids and one-sheeted/two-sheeted hyperboloids, and employ a collision cone approach to achieve collision avoidance. We present a method to construct the 3-D collision cone, and present simulation results demonstrating the working of the collision avoidance laws.

Submitted 16 March, 2022; originally announced March 2022.

arXiv:2111.02859 [pdf]

Large Scale Diverse Combinatorial Optimization: ESPN Fantasy Football Player Trades

Authors: Aaron Baughman, Daniel Bohm, Micah Forster, Eduardo Morales, Jeff Powell, Shaun McPartlin, Raja Hebbar, Kavitha Yogaraj, Yoshika Chhabra, Sudeep Ghosh, Rukhsan Ul Haq, Arjun Kashyap

Abstract: Even skilled fantasy football managers can be disappointed by their mid-season rosters as some players inevitably fall short of draft day expectations. Team managers can quickly discover that their team has a low score ceiling even if they start their best active players. A novel and diverse combinatorial optimization system proposes high volume and unique player trades between complementary teams to balance trade fairness. Several algorithms create the valuation of each fantasy football player with an ensemble of computing models: Quantum Support Vector Classifier with Permutation Importance (QSVC-PI), Quantum Support Vector Classifier with Accumulated Local Effects (QSVC-ALE), Variational Quantum Circuit with Permutation Importance (VQC-PI), Hybrid Quantum Neural Network with Permutation Importance (HQNN-PI), eXtreme Gradient Boosting Classifier (XGB), and Subject Matter Expert (SME) rules. The valuation of each player is personalized based on league rules, roster, and selections. The cost of trading away a player is related to a team's roster, such as the depth at a position, slot count, and position importance. Teams are paired together for trading based on a cosine dissimilarity score so that teams can offset their strengths and weaknesses. A knapsack 0-1 algorithm computes outgoing players for each team. Postprocessors apply analytics and deep learning models to measure 6 different objective measures about each trade. Over the 2020 and 2021 National Football League (NFL) seasons, a group of 24 experts from IBM and ESPN evaluated trade quality through 10 Football Error Analysis Tool (FEAT) sessions. Our system started with 76.9% of high-quality trades and was deployed for the 2021 season with 97.3% of high-quality trades. To increase trade quantity, our quantum, classical, and rules-based computing have 100% trade uniqueness. We use Qiskit's quantum simulators throughout our work.

Submitted 18 April, 2022; v1 submitted 4 November, 2021; originally announced November 2021.

Comments: 16 pages, 6 figures, 30 equations

arXiv:2110.14903 [pdf]

doi 10.1063/5.0076785

Ultralow Thermal Conductivity and Thermoelectric Properties of Bi4GeTe7 with an Intrinsic van der Waal Heterostructure

Authors: Niraj Kumar Singh, Ankit Kashyap, Ajay Soni

Abstract: Ternary chalcogenides, having a large crystalline unit cell and van der Waals stacking of layers, are expected to be poor thermal conductors and good thermoelectric (TE) materials. We report that layered Bi$_4$GeTe$_7$, with alternating quintuplet-septuplet layers of Bi$_2$Te$_3$ and Bi$_2$GeTe$_4$, has an ultralow thermal conductivity, $\kappa_{\text{total}}$, of 0.42 W m$^{-1}$ K$^{-1}$ because of a high degree of anharmonicity, as estimated from the large Grüneisen parameter ($γ$ of 4.07) and low Debye temperature ($θ_D$ of 135 K). The electron-dominated charge transport has been realized from the Seebeck coefficient, $S$, of $-82$ µV/K at 380 K, and Hall carrier concentration of $n_e \sim 9.8 \times 10^{19}$ cm$^{-3}$ at 300 K. Observation of weak antilocalization (WAL), due to spin-orbit coupling (SOC) of heavy Bi and Te, suggests that Bi$_4$GeTe$_7$ is also a topological quantum material. The cross-sectional transmission electron microscopy images show the inherent stacking of hetero-layers, which leads to a large anharmonicity and poor phonon propagation. Thus, being a poor thermal conductor with a TE figure of merit, $ZT \sim 0.24$, at 380 K, Bi$_4$GeTe$_7$ is a good material for TE applications.

Submitted 28 October, 2021; originally announced October 2021.

Report number: 119, 223903

Journal ref: Applied Physics Letters 2021

arXiv:2109.06790 [pdf, other]

A Deep Learning Approach for Masking Fetal Gender in Ultrasound Images

Authors: Amit Borundiya, Arshak Navruzyan, Dennis Igoschev, Feras C. Oughali, Hemanth Pasupuleti, Mike Fuller, Vinay Kanigicherla, T S Aniruddha Kashyap, Rishabh Chaurasia, Sonali Vinod Jain

Abstract: Ultrasound (US) imaging is highly effective with regards to both cost and versatility in real-time diagnosis; however, determination of fetal gender by US scan in the early stages of pregnancy is also a cause of sex-selective abortion. This work proposes a deep learning object detection approach to accurately mask fetal gender in US images in order to increase the accessibility of the technology. We demonstrate how the YOLOv5L architecture exhibits superior performance relative to other object detection models on this task. Our model achieves 45.8% AP[0.5:0.95], 92% F1-score and 0.006 False Positive Per Image rate on our test set. Furthermore, we introduce a bounding box delay rule based on frame-to-frame structural similarity to reduce the false negative rate by 85%, further improving masking reliability.

Submitted 14 September, 2021; originally announced September 2021.

arXiv:2106.00520 [pdf, other]

doi 10.1103/PhysRevMaterials.5.104402

Importance of intersite Hubbard interactions in $β$-MnO$_2$: A first-principles DFT+$U$+$V$ study

Authors: Ruchika Mahajan, Iurii Timrov, Nicola Marzari, Arti Kashyap

Abstract: We present a first-principles investigation of the structural, electronic, and magnetic properties of pyrolusite ($β$-MnO$_2$) using conventional and extended Hubbard-corrected density-functional theory (DFT+$U$ and DFT+$U$+$V$). The onsite $U$ and intersite $V$ Hubbard parameters are computed using linear-response theory in the framework of density-functional perturbation theory. We show that while the inclusion of the onsite $U$ is crucial to describe the localized nature of the Mn($3d$) states, the intersite $V$ is key to capture accurately the strong hybridization between neighboring Mn($3d$) and O($2p$) states. In this framework, we stabilize the simplified collinear antiferromagnetic (AFM) ordering (suggested by the Goodenough-Kanamori rule) that is commonly used as an approximation to the experimentally-observed noncollinear screw-type spiral magnetic ordering. A detailed investigation of the ferromagnetic and of other three collinear AFM spin configurations is also presented. The findings from Hubbard-corrected DFT are discussed using two kinds of Hubbard manifolds - nonorthogonalized and orthogonalized atomic orbitals - showing that special attention must be given to the choice of the Hubbard projectors, with orthogonalized manifolds providing more accurate results than nonorthogonalized ones within DFT+$U$+$V$. This work paves the way for future studies of complex transition-metal compounds containing strongly localized electrons in the presence of pronounced covalent interactions.

Submitted 5 October, 2021; v1 submitted 1 June, 2021; originally announced June 2021.

Comments: 17 pages, 9 figures

Journal ref: Phys. Rev. Materials 5, 104402 (2021)

arXiv:2105.10393 [pdf, other]

ReLUSyn: Synthesizing Stealthy Attacks for Deep Neural Network Based Cyber-Physical Systems

Authors: Aarti Kashyap, Syed Mubashir Iqbal, Karthik Pattabiraman, Margo Seltzer

Abstract: Cyber-Physical Systems (CPS) are deployed in many mission-critical settings, such as medical devices, autonomous vehicular systems and aircraft control management systems. As more and more CPS adopt Deep Neural Networks (DNNs), these systems can be vulnerable to attacks. Prior work has demonstrated the susceptibility of CPS to False Data Injection Attacks (FDIAs), which can cause significant damage. We identify a new category of attacks on these systems. In this paper, we demonstrate that DNN-based CPS are also subject to these attacks. These attacks, which we call Ripple False Data Injection Attacks (RFDIAs), use minimal input perturbations to stealthily change the DNN output. The input perturbations propagate as ripples through multiple DNN layers to affect the output in a targeted manner. We develop an automated technique to synthesize RFDIAs against DNN-based CPS. Our technique models the attack as an optimization problem using Mixed Integer Linear Programming (MILP). We define an abstraction for DNN-based CPS that allows us to automatically: 1) identify the critical inputs, and 2) find the smallest perturbations that produce output changes. We demonstrate our technique on three practical CPS with two mission-critical applications: an Artificial Pancreas System (APS) and two aircraft control management systems (Horizontal Collision Avoidance System (HCAS) and Collision Avoidance System-Xu (ACAS-Xu)).

Submitted 21 May, 2021; originally announced May 2021.

arXiv:2104.11272 [pdf, other]

Towards Offloadable and Migratable Microservices on Disaggregated Architectures: Vision, Challenges, and Research Roadmap

Authors: Xiaoyi Lu, Arjun Kashyap

Abstract: Microservice and serverless computing systems open up massive versatility and opportunity to distributed and datacenter-scale computing. In the meantime, the deployments of modern datacenter resources are moving to disaggregated architectures. With the flourishing growths from both sides, we believe this is high time to write this vision paper to propose a potential research agenda to achieve efficient deployments, management, and executions of next-generation microservices on top of the emerging disaggregated datacenter architectures. In particular, we envision a critical systems research direction of designing and developing offloadable and migratable microservices on disaggregated architectures. With this vision, we have surveyed the recent related work to demonstrate the importance and necessity of researching it. We also outline the fundamental challenges that distributed systems and datacenter-scale computing research may encounter. We further propose a research roadmap to achieve our envisioned objectives in a promising way. Within the roadmap, we identify potential techniques and methods that can be leveraged.

Submitted 22 April, 2021; originally announced April 2021.

Comments: 7 pages, 2 figures, WORDS'21, co-located with ASPLOS'21

Journal ref: WORDS 2021: The Second Workshop On Resource Disaggregation and Serverless

arXiv:2012.02890 [pdf, other]

Towards a Domain Specific Solution for a New Generation of Wireless Modems

Authors: Alan Gatherer, Ashish Shrivastava, Hao Luan, Asheesh Kashyap, Zhenguo Gu, Miguel Dajer

Abstract: Wireless cellular Systems on Chip (SoCs) are experiencing unprecedented demands on data rate, latency, and use case variety. 5G wireless technologies require a massive number of antennas and complex signal processing to improve bandwidth and spectral efficiency. The Internet of Things is causing a proliferation in the number of connected devices, and service categories, such as ultra-reliable low latency, which will produce new use cases, such as self-driving cars, robotic factories, and remote surgery. In addressing these challenges, we can no longer rely on faster cores, or even more silicon. Modem software development is becoming increasingly error prone and difficult as the complexity of the applications and the architectures increase. In this report we propose a Wireless Domain Specific Solution that takes a Dataflow acceleration approach and addresses the need of the SoC to support dataflows that change with use case and user activity, while maintaining the Firm Real Time High Availability with low probability of Heisenbugs that is required in cellular modems. We do this by developing a Domain Specific Architecture that describes the requirements in a suitably abstracted dataflow Domain Specific language. A toolchain is described that automates translation of those requirements in an efficient and robust manner and provides formal guarantees against Heisenbugs. The dataflow native DSA supports the toolchain output with specialized processing, data management and control features with high performance and low power, and recovers rapidly from dropped dataflows while continuing to achieve the real time requirements. This report focuses on the dataflow acceleration in the DSA and the part of the automated toolchain that formally checks the performance and correctness of software running on this dataflow hardware. Results are presented and a summary of future work is given.

Submitted 4 December, 2020; originally announced December 2020.

Comments: 49 pages

arXiv:2010.12198 [pdf, other]

Domain Divergences: a Survey and Empirical Analysis

Authors: Abhinav Ramesh Kashyap, Devamanyu Hazarika, Min-Yen Kan, Roger Zimmermann

Abstract: Domain divergence plays a significant role in estimating the performance of a model in new domains. While there is a significant literature on divergence measures, researchers find it hard to choose an appropriate divergence for a given NLP application. We address this shortcoming by both surveying the literature and through an empirical study. We develop a taxonomy of divergence measures consisting of three classes -- Information-theoretic, Geometric, and Higher-order measures and identify the relationships between them. Further, to understand the common use-cases of these measures, we recognise three novel applications -- 1) Data Selection, 2) Learning Representation, and 3) Decisions in the Wild -- and use it to organise our literature. From this, we identify that Information-theoretic measures are prevalent for 1) and 3), and Higher-order measures are more common for 2). To further help researchers choose appropriate measures to predict drop in performance -- an important aspect of Decisions in the Wild, we perform correlation analysis spanning 130 domain adaptation scenarios, 3 varied NLP tasks and 12 divergence measures identified from our survey. To calculate these divergences, we consider the current contextual word representations (CWR) and contrast with the older distributed representations. We find that traditional measures over word distributions still serve as strong baselines, while higher-order measures with CWR are effective.

Submitted 19 April, 2021; v1 submitted 23 October, 2020; originally announced October 2020.

Comments: Accepted for publication in 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)

arXiv:2010.01588 [pdf, other]

Collaborative Tracking and Capture of Aerial Object using UAVs

Authors: Lima Agnel Tony, Shuvrangshu Jana, Varun V P, Vidyadhara B V, Mohitvishnu S Gadde, Abhishek Kashyap, Rahul Ravichandran, Debasish Ghose

Abstract: This work details the problem of aerial target capture using multiple UAVs. This problem is motivated from the challenge 1 of Mohammed Bin Zayed International Robotic Challenge 2020. The UAVs utilise visual feedback to autonomously detect target, approach it and capture without disturbing the vehicle which carries the target. Multi-UAV collaboration improves the efficiency of the system and increases the chance of capturing the ball robustly in short span of time. In this paper, the proposed architecture is validated through simulation in ROS-Gazebo environment and is further implemented on hardware.

Submitted 4 October, 2020; originally announced October 2020.

Journal ref: MBZIRC Symposium 2020, ADNEC, Abu Dhabi

arXiv:2006.11379 [pdf]

Improving Train Track Safety using Drones, Computer Vision and Machine Learning

Authors: Kirthi Kumar, Anuraag Kaashyap

Abstract: Millions of human casualties resulting from train accidents globally are caused by inefficient, manual track inspections. Government agencies are seriously concerned about the safe operations of the rail industry after a series of accidents reported across the USA and around the globe, mainly attributed to track defects. Casualties resulting from track defects result in billions of dollars of loss in public and private investments and loss of revenue due to downtime, ultimately resulting in loss of the public's confidence. The manual, mundane, and expensive monitoring of rail track safety can be transformed through the use of drones, computer vision, and machine learning. The primary goal of this study is to develop multiple algorithms that implement supervised and semi-supervised learning that accurately analyze whether a track is safe or unsafe based on simulated training data of train tracks. This includes being able to develop a Convolutional Neural Network that can identify track defects using supervised learning without having to specify a particular algorithm for detecting those defects, and that the new model would both speed up and improve the quality of the track defect detection process, accompanied with a computer vision image-processing algorithm. Our other goals included designing and building a prototype representation of train tracks to simulate track defects, to precisely and consistently conduct the visual inspection using drones. Ultimately, the goal demonstrates that the state of good repairs in railway tracks can be attained through the use of drones, computer vision and machine learning.

Submitted 4 June, 2020; originally announced June 2020.

Comments: 27 pages and 19 figures

arXiv:2004.03807 [pdf, other]

SciWING -- A Software Toolkit for Scientific Document Processing

Authors: Abhinav Ramesh Kashyap, Min-Yen Kan

Abstract: We introduce SciWING, an open-source software toolkit which provides access to pre-trained models for scientific document processing tasks, inclusive of citation string parsing and logical structure recovery. SciWING enables researchers to rapidly experiment with different models by swapping and stacking different modules. It also enables them to declare and run models from a configuration file. It enables researchers to perform production-ready transfer learning from general, pre-trained transformers (e.g., BERT, SciBERT), and aids development of end-user applications. It includes ready-to-use web and terminal-based applications and demonstrations (Available from http://sciwing.io).

Submitted 23 October, 2020; v1 submitted 8 April, 2020; originally announced April 2020.

Comments: 6 pages, 3 figures, First Workshop on Scholarly Document Processing - SDP@EMNLP 2020

arXiv:1911.12187 [pdf, other]

doi 10.1103/PhysRevB.102.144435

Asymmetric modification of the magnetic proximity effect in Pt/Co/Pt trilayers by the insertion of a Ta buffer layer

Authors: Ankan Mukhopadhyay, Sarathlal Koyiloth Vayalil, Dominik Graulich, Imran Ahamed, Sonia Francoual, Arti Kashyap, Timo Kuschel, P S Anil Kumar

Abstract: The magnetic proximity effect in top and bottom Pt layers induced by Co in Ta/Pt/Co/Pt multilayers has been studied by interface sensitive, element specific x-ray resonant magnetic reflectivity. The asymmetry ratio for circularly polarized x-rays of left and right helicity has been measured at the Pt $L_3$ absorption edge (11567 eV) with an in-plane magnetic field ($\pm158$ mT) to verify its magnetic origin. The proximity-induced magnetic moment in the bottom Pt layer decreases with the thickness of the Ta buffer layer. Grazing incidence x-ray diffraction has been carried out to show that the Ta buffer layer induces the growth of Pt(011) rather than Pt(111) which in turn reduces the induced moment. A detailed density functional theory study shows that an adjacent Co layer induces more magnetic moment in Pt(111) than in Pt(011). The manipulation of the magnetism in Pt by the insertion of a Ta buffer layer provides a new way of controlling the magnetic proximity effect which is of huge importance in spin-transport experiments across similar kind of interfaces.

Submitted 27 November, 2019; originally announced November 2019.

Comments: 7 pages, 9 figures

Journal ref: Phys. Rev. B 102, 144435 (2020)

arXiv:1807.08113 [pdf, other]

Estimating Stellar Atmospheric Parameters by Automated Methods Using SSLs

Authors: Kaushal Sharma, H. P. Singh, A. Kashyap, P. Prugniel

Abstract: Libraries of stellar spectra, such as ELODIE (Prugniel & Soubiran 2001), CFLIB (Valdes et al. 2004), or MILES (Sánchez-Blázquez et al. 2006), are used for a variety of applications, and especially in modelling stellar populations (e.g. Le Borgne et al. 2004). In that context, apart from the completeness and quality of these spectral databases (Singh et al. 2006), the accurate calibration of stellar atmospheric parameters, temperature (Teff), surface gravity (log g), and metallicity ([Fe/H]), is known to be critical (Prugniel et al. 2007; Percival & Salaris 2009). We discuss the technique of determining stellar atmospheric parameters accurately by 'full spectrum fitting'.

Submitted 21 July, 2018; originally announced July 2018.

Comments: 4 pages, 2 figures, Proceeding for the "International Workshop on Spectral Stellar Libraries, 2017 (IWSSL2017)" held in Sao Paulo, Brazil, February 6-10, 2017. ASI Conference Series, 2017, Vol. 14. Edited by P. Coelho, L. Martins & E. Griffin, pp. 69-72, published

arXiv:1805.03175 [pdf, other]

Voltron: Understanding and Exploiting the Voltage-Latency-Reliability Trade-Offs in Modern DRAM Chips to Improve Energy Efficiency

Authors: Kevin K. Chang, Abdullah Giray Yaglıkçı, Saugata Ghose, Aditya Agrawal, Niladrish Chatterjee, Abhijith Kashyap, Donghyuk Lee, Mike O'Connor, Hasan Hassan, Onur Mutlu

Abstract: This paper summarizes our work on experimental characterization and analysis of reduced-voltage operation in modern DRAM chips, which was published in SIGMETRICS 2017, and examines the work's significance and future potential. We take a comprehensive approach to understanding and exploiting the latency and reliability characteristics of modern DRAM when the DRAM supply voltage is lowered below the nominal voltage level specified by DRAM standards. We perform an experimental study of 124 real DDR3L (low-voltage) DRAM chips manufactured recently by three major DRAM vendors. We find that reducing the supply voltage below a certain point introduces bit errors in the data, and we comprehensively characterize the behavior of these errors. We discover that these errors can be avoided by increasing the latency of three major DRAM operations (activation, restoration, and precharge). We perform detailed DRAM circuit simulations to validate and explain our experimental findings. We also characterize the various relationships between reduced supply voltage and error locations, stored data patterns, DRAM temperature, and data retention. Based on our observations, we propose a new DRAM energy reduction mechanism, called Voltron. The key idea of Voltron is to use a performance model to determine by how much we can reduce the supply voltage without introducing errors and without exceeding a user-specified threshold for performance loss. Our evaluations show that Voltron reduces the average DRAM and system energy consumption by 10.5% and 7.3%, respectively, while limiting the average system performance loss to only 1.8%, for a variety of memory-intensive quad-core workloads. We also show that Voltron significantly outperforms prior dynamic voltage and frequency scaling mechanisms for DRAM.

Submitted 8 May, 2018; originally announced May 2018.

arXiv:1805.03154 [pdf, other]

Flexible-Latency DRAM: Understanding and Exploiting Latency Variation in Modern DRAM Chips

Authors: Kevin K. Chang, Abhijith Kashyap, Hasan Hassan, Saugata Ghose, Kevin Hsieh, Donghyuk Lee, Tianshi Li, Gennady Pekhimenko, Samira Khan, Onur Mutlu

Abstract: This article summarizes key results of our work on experimental characterization and analysis of latency variation and latency-reliability trade-offs in modern DRAM chips, which was published in SIGMETRICS 2016, and examines the work's significance and future potential. The goal of this work is to (i) experimentally characterize and understand the latency variation across cells within a DRAM chip for three fundamental DRAM operations (activation, precharge, and restoration), and (ii) develop new mechanisms that exploit our understanding of the latency variation to reliably improve performance. To this end, we comprehensively characterize 240 DRAM chips from three major vendors, and make six major new observations about latency variation within DRAM. Notably, we find that (i) there is large latency variation across the cells for each of the three operations; (ii) variation characteristics exhibit significant spatial locality: slower cells are clustered in certain regions of a DRAM chip; and (iii) the three fundamental operations exhibit different reliability characteristics when the latency of each operation is reduced. Based on our observations, we propose Flexible-LatencY DRAM (FLY-DRAM), a mechanism that exploits latency variation across DRAM cells within a DRAM chip to improve system performance. The key idea of FLY-DRAM is to exploit the spatial locality of slower cells within DRAM, and access the faster DRAM regions with reduced latencies for the fundamental operations. Our evaluations show that FLY-DRAM improves the performance of a wide range of applications by 13.3%, 17.6%, and 19.5%, on average, for each of the three different vendors' real DRAM chips, in a simulated 8-core system.

Submitted 8 May, 2018; originally announced May 2018.

arXiv:1708.01092 [pdf, other]

Chaotic Properties of Single Element Nonlinear Chimney Model: Effect of Directionality

Authors: Anisha R. V. Kashyap, Kiran M. Kolwankar

Abstract: We generalize the chimney model by introducing nonlinear restoring and gravitational forces for the purpose of modeling swaying of trees at high wind speeds. We have derived general equations governing the system using Lagrangian formulation. We have studied the simplest case of a single element in more detail. The governing equation we arrive at for this case has not been studied so far. We study the chaotic properties of this simple building block and also the effect of directionality in the wind on the chaotic properties. We also consider the special case of two elements.

Submitted 21 February, 2019; v1 submitted 3 August, 2017; originally announced August 2017.

Comments: To appear in IJBC

arXiv:1705.10292 [pdf, other]

doi 10.1145/3084447

Understanding Reduced-Voltage Operation in Modern DRAM Chips: Characterization, Analysis, and Mechanisms

Authors: Kevin K. Chang, Abdullah Giray Yağlıkçı, Saugata Ghose, Aditya Agrawal, Niladrish Chatterjee, Abhijith Kashyap, Donghyuk Lee, Mike O'Connor, Hasan Hassan, Onur Mutlu

Abstract: The energy consumption of DRAM is a critical concern in modern computing systems. Improvements in manufacturing process technology have allowed DRAM vendors to lower the DRAM supply voltage conservatively, which reduces some of the DRAM energy consumption. We would like to reduce the DRAM supply voltage more aggressively, to further reduce energy. Aggressive supply voltage reduction requires a thorough understanding of the effect voltage scaling has on DRAM access latency and DRAM reliability. In this paper, we take a comprehensive approach to understanding and exploiting the latency and reliability characteristics of modern DRAM when the supply voltage is lowered below the nominal voltage level specified by DRAM standards. Using an FPGA-based testing platform, we perform an experimental study of 124 real DDR3L (low-voltage) DRAM chips manufactured recently by three major DRAM vendors. We find that reducing the supply voltage below a certain point introduces bit errors in the data, and we comprehensively characterize the behavior of these errors. We discover that these errors can be avoided by increasing the latency of three major DRAM operations (activation, restoration, and precharge). We perform detailed DRAM circuit simulations to validate and explain our experimental findings. We also characterize the various relationships between reduced supply voltage and error locations, stored data patterns, DRAM temperature, and data retention. Based on our observations, we propose a new DRAM energy reduction mechanism, called Voltron. The key idea of Voltron is to use a performance model to determine by how much we can reduce the supply voltage without introducing errors and without exceeding a user-specified threshold for performance loss. Voltron reduces the average system energy by 7.3% while limiting the average system performance loss to only 1.8%, for a variety of workloads.

Submitted 29 May, 2017; originally announced May 2017.

Comments: 25 pages, 25 figures, 7 tables, Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS)

arXiv:1704.00631 [pdf, other]

Detection of Copy-move Image forgery using SVD and Cuckoo Search Algorithm

Authors: Abhishek Kashyap, Megha Agarwal, Hariom Gupta

Abstract: Copy-move forgery is one of the simple and effective operations to create forged images. Recently, techniques based on singular value decomposition (SVD) have been widely used to detect copy-move forgery (CMF). Some SVD-based approaches detect copy-move forgery acceptably, but others cannot produce satisfactory detection results. Sometimes these approaches may even produce erroneous results. According to our observation, detection results produced using SVD depend highly on parameters whose values are often determined from experience. These values are only applicable to a few images, which limits their applicability. To solve this problem, a novel approach named copy-move forgery detection using Cuckoo search algorithm (CMFD-CS) is proposed in this paper. CMFD-CS integrates the CS algorithm into SVD. It utilizes the CS algorithm to generate customized parameter values for images, which are used for CMFD under a block-based framework.

Submitted 3 April, 2017; originally announced April 2017.

Comments: 8 pages, 10 figures

arXiv:1703.09968 [pdf, other]

An Evaluation of Digital Image Forgery Detection Approaches

Authors: Abhishek Kashyap, Rajesh Singh Parmar, Megha Agrawal, Hariom Gupta

Abstract: With the headway of the advanced image handling software and altering tools, a computerized picture can be effectively controlled. The identification of image manipulation is vital in light of the fact that an image can be utilized as legitimate confirmation, in crime scene investigation, and in numerous different fields. The image forgery detection techniques intend to confirm the credibility of computerized pictures with no prior information about the original image. There are numerous routes for altering a picture, for example, resampling, splicing, and copy-move. In this paper, we have examined different types of image forgery and their detection techniques; mainly we focused on pixel-based image forgery detection techniques.

Submitted 30 March, 2017; v1 submitted 29 March, 2017; originally announced March 2017.

arXiv:1608.07381 [pdf]

Measurement of Wave Electric Fields in Plasmas by Electro-Optic Probe

Authors: M. Nishiura, Z. Yoshida, T. Mushiake, Y. Kawazura, R. Osawa, K. Fujinami, Y. Yano, H. Saitoh, M. Yamasaki, A. Kashyap, N. Takahashi, M. Nakatsuka, A. Fukuyama

Abstract: Electric field measurement in plasmas permits quantitative comparison between the experiment and the simulation in this study. An electro-optic (EO) sensor based on the Pockels effect is demonstrated to measure wave electric fields in the laboratory magnetosphere of the RT-1 device with high frequency heating sources. This system has the advantages that electric field measurements can detect electrostatic waves separated clearly from wave magnetic fields, and that the sensor head is separated electrically from strong stray fields in its surroundings. The electromagnetic waves are excited at the double loop antenna for ion heating in electron cyclotron heated plasmas. In the air, the measured wave electric fields are in good absolute agreement with those predicted by the TASK/WF2 code. In inhomogeneous plasmas, the wave electric fields in the peripheral region are enhanced compared with the simulated electric fields. The potential oscillation of the antenna is one possible reason that qualitatively explains the experimental results.

Submitted 26 August, 2016; originally announced August 2016.

Comments: submitted to Review of Scientific Instruments

arXiv:1602.01165 [pdf, other]

doi 10.1585/pfr.11.2402024

Anisotropy in broad component of H$\alpha$ line in the magnetospheric device RT-1

Authors: Yohei Kawazura, Noriki Takahashi, Zensho Yoshida, Masaki Nishiura, Tomoaki Nogami, Ankur Kashyap, Yoshihisa Yano, Haruhiko Saitoh, Miyuri Yamasaki, Toshiki Mushiake, Masataka Nakatsuka

Abstract: Temperature anisotropy in the broad component of the H$\alpha$ line was found in the Ring Trap 1 (RT-1) device by Doppler spectroscopy. Since hot hydrogen neutrals emitting a broad component are mainly produced by charge exchange between neutrals and protons, the anisotropy in the broad component is evidence of proton temperature anisotropy generated by betatron acceleration.

Submitted 2 February, 2016; originally announced February 2016.

Journal ref: Plasma and Fusion Research 11, 2402024 (2016)

arXiv:1509.03556 [pdf, other]

Teaching Python programming with automatic assessment and feedback provision

Authors: Hans Fangohr, Neil O'Brien, Anil Prabhakar, Arti Kashyap

Abstract: We describe a method of automatic feedback provision for students learning programming and computational methods in Python. We have implemented, used and refined this system since 2009 for growing student numbers, and summarise the design and experience of using it. The core idea is to use a unit testing framework: the teacher creates a set of unit tests, and the student code is tested by running these tests. With our implementation, students typically submit work for assessment, and receive feedback by email within a few minutes after submission. The choice of tests and the reporting back to the student is chosen to optimise the educational value for the students. The system very significantly reduces the staff time required to establish whether a student's solution is correct, and shifts the emphasis of computing laboratory student contact time from assessing correctness to providing guidance. The self-paced nature of the automatic feedback provision supports a student-centred learning approach. Students can re-submit their work repeatedly and iteratively improve their solution, and enjoy using the system. We include an evaluation of the system and data from using it in a class of 425 students.

Submitted 11 September, 2015; originally announced September 2015.

Comments: 26 pages

arXiv:1507.03692 [pdf, other]

doi 10.1063/1.4935894

Observation of particle acceleration in laboratory magnetosphere

Authors: Yohei Kawazura, Zensho Yoshida, Masaki Nishiura, Haruhiko Saitoh, Yoshihisa Yano, Tomoaki Nogami, Naoki Sato, Miyuri Yamasaki, Ankur Kashyap, Toshiki Mushiake

Abstract: The self-organization of magnetospheric plasma is brought about by inward diffusion of magnetized particles. Not only creating a density gradient toward the center of a dipole magnetic field, the inward diffusion also accelerates particles and provides a planetary radiation belt with high energy particles. Here, we report the first experimental observation of a 'laboratory radiation belt' created in the Ring Trap 1 (RT-1) device. By spectroscopic measurement, we found an appreciable anisotropy in the ion temperature, proving the betatron acceleration mechanism which heats particles in the perpendicular direction with respect to the magnetic field when particles move inward. The energy balance model including the heating mechanism explains the observed ion temperature profile.

Submitted 18 November, 2015; v1 submitted 13 July, 2015; originally announced July 2015.

Journal ref: Physics of Plasmas 22, 112503 (2015)

arXiv:1406.3692 [pdf, other]

Analyzing Social and Stylometric Features to Identify Spear phishing Emails

Authors: Prateek Dewan, Anand Kashyap, Ponnurangam Kumaraguru

Abstract: Spear phishing is a complex targeted attack in which an attacker harvests information about the victim prior to the attack. This information is then used to create sophisticated, genuine-looking attack vectors, drawing the victim to compromise confidential information. What makes spear phishing different, and more powerful than normal phishing, is this contextual information about the victim. Online social media services can be one such source for gathering vital information about an individual. In this paper, we characterize and examine a true positive dataset of spear phishing, spam, and normal phishing emails from Symantec's enterprise email scanning service. We then present a model to detect spear phishing emails sent to employees of 14 international organizations, by using social features extracted from LinkedIn. Our dataset consists of 4,742 targeted attack emails sent to 2,434 victims, and 9,353 non targeted attack emails sent to 5,912 non victims; and publicly available information from their LinkedIn profiles. We applied various machine learning algorithms to this labeled data, and achieved an overall maximum accuracy of 97.76% in identifying spear phishing emails. We used a combination of social features from LinkedIn profiles, and stylometric features extracted from email subjects, bodies, and attachments. However, we achieved a slightly better accuracy of 98.28% without the social features. Our analysis revealed that social features extracted from LinkedIn do not help in identifying spear phishing emails. To the best of our knowledge, this is one of the first attempts to make use of a combination of stylometric features extracted from emails, and social features extracted from an online social network to detect targeted spear phishing emails.

Submitted 14 June, 2014; originally announced June 2014.

Comments: Detection of spear phishing using social media features

arXiv:0710.3974 [pdf, ps, other]

Distributed source coding in dense sensor networks

Authors: Akshay Kashyap, Luis Alfonso Lastras-Montaño, Cathy Xia, Zhen Liu

Abstract: We study the problem of the reconstruction of a Gaussian field defined in [0,1] using N sensors deployed at regular intervals. The goal is to quantify the total data rate required for the reconstruction of the field with a given mean square distortion. We consider a class of two-stage mechanisms which a) send information to allow the reconstruction of the sensor's samples within sufficient accuracy, and then b) use these reconstructions to estimate the entire field. To implement the first stage, the heavy correlation between the sensor samples suggests the use of distributed coding schemes to reduce the total rate. We demonstrate the existence of a distributed block coding scheme that achieves, for a given fidelity criterion for the reconstruction of the field, a total information rate that is bounded by a constant, independent of the number $N$ of sensors. The constant in general depends on the autocorrelation function of the field and the desired distortion criterion for the sensor samples. We then describe a scheme which can be implemented using only scalar quantizers at the sensors, without any use of distributed source coding, and which also achieves a total information rate that is a constant, independent of the number of sensors. While this scheme operates at a rate that is greater than the rate achievable through distributed coding and entails greater delay in reconstruction, its simplicity makes it attractive for implementation in sensor networks.

Submitted 22 October, 2007; originally announced October 2007.

Comments: This is an extended version of the paper which appeared in the proceedings of, and was presented at, DCC 2005

Showing 1–47 of 47 results for author: Kaashyap, A