-
High-Throughput Phenotyping using Computer Vision and Machine Learning
Authors:
Vivaan Singhvi,
Langalibalele Lunga,
Pragya Nidhi,
Chris Keum,
Varrun Prakash
Abstract:
High-throughput phenotyping refers to the non-destructive and efficient evaluation of plant phenotypes. In recent years, it has been coupled with machine learning to improve the process of phenotyping plants by increasing efficiency in handling large datasets and developing methods for the extraction of specific traits. Previous studies have addressed these challenges through the application of deep neural networks in tandem with automated cameras; however, the datasets being studied often excluded physical labels. In this study, we used a dataset provided by Oak Ridge National Laboratory with 1,672 images of Populus trichocarpa with white labels displaying treatment (control or drought), block, row, position, and genotype. Optical character recognition (OCR) was used to read these labels on the plants, image segmentation techniques in conjunction with machine learning algorithms were used for morphological classifications, machine learning models were used to predict treatment based on those classifications, and encoded EXIF tags were analyzed to find leaf size and correlations between phenotypes. We found that our OCR model had an accuracy of 94.31% for non-null text extractions, allowing the information to be accurately placed in a spreadsheet. Our classification models identified leaf shape, color, and level of brown splotches with an average accuracy of 62.82%, and plant treatment with an accuracy of 60.08%. Finally, we identified a few crucial pieces of information absent from the EXIF tags that prevented the assessment of leaf size. Missing information also prevented the assessment of correlations between phenotypes and conditions. However, future studies could improve upon this to allow for the assessment of these features.
Submitted 9 July, 2024; v1 submitted 8 July, 2024;
originally announced July 2024.
-
Welfare-Optimal Serial Dictatorships have Polynomial Query Complexity
Authors:
Ioannis Caragiannis,
Kurt Mehlhorn,
Nidhi Rathi
Abstract:
Serial dictatorship is a simple mechanism for coordinating agents in solving combinatorial optimization problems according to their preferences. The most representative such problem is one-sided matching, in which a set of n agents have values for a set of n items, and the objective is to compute a matching of the agents to the items of maximum total value (a.k.a., social welfare). Following the recent framework of Caragiannis and Rathi [10], we consider a model in which the agent-item values are not available upfront but become known by querying agent sequences. In particular, when the agents are asked to act in a sequence, they respond by picking their favorite item that has not been picked by agents who acted before and reveal their value for it. Can we compute an agent sequence that induces a social welfare-optimal matching?
We answer this question affirmatively and present an algorithm that uses a polynomial number (n^5) of queries. This solves the main open problem stated by Caragiannis and Rathi [CR23]. Our analysis uses a potential function argument that measures progress towards learning the underlying edge-weight information. Furthermore, the algorithm has a truthful implementation by adapting the paradigm of VCG payments.
Submitted 5 July, 2024;
originally announced July 2024.
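The query model described in the abstract above can be illustrated with a minimal Python sketch. It simulates a single query only: the agents act in a given sequence, each picks a favorite remaining item and reveals its value for it. The function name `query_sequence`, the tie-breaking rule (lowest item index), and the example values are assumptions made for illustration; this is not the paper's n^5-query algorithm.

```python
from typing import Dict, Sequence, Tuple


def query_sequence(values: Sequence[Sequence[float]],
                   order: Sequence[int]) -> Tuple[Dict[int, int], float]:
    """Simulate one serial-dictatorship query for the agent sequence `order`.

    values[a][i] is agent a's value for item i. In the query model these
    values are hidden; they are used here only to simulate the responses.
    Returns the induced matching {agent: item} and its social welfare.
    """
    remaining = set(range(len(values[0])))
    matching: Dict[int, int] = {}
    welfare = 0.0
    for agent in order:
        # The agent picks a favorite item among those not yet taken
        # (assumed tie-break: lowest item index) and reveals its value.
        item = max(sorted(remaining), key=lambda i: values[agent][i])
        matching[agent] = item
        welfare += values[agent][item]
        remaining.remove(item)
    return matching, welfare


# Hypothetical 2-agent, 2-item instance.
values = [
    [3.0, 2.0],  # agent 0
    [4.0, 0.0],  # agent 1
]
first = query_sequence(values, [0, 1])   # agent 0 acts first
second = query_sequence(values, [1, 0])  # agent 1 acts first
print(first, second)
```

In this toy instance the two sequences induce different matchings with different welfare, which is why the choice of agent sequence matters.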
-
Actionable Cyber Threat Intelligence using Knowledge Graphs and Large Language Models
Authors:
Romy Fieblinger,
Md Tanvirul Alam,
Nidhi Rastogi
Abstract:
Cyber threats are constantly evolving. Extracting actionable insights from unstructured Cyber Threat Intelligence (CTI) data is essential to guide cybersecurity decisions. Increasingly, organizations like Microsoft, Trend Micro, and CrowdStrike are using generative AI to facilitate CTI extraction. This paper addresses the challenge of automating the extraction of actionable CTI using advancements in Large Language Models (LLMs) and Knowledge Graphs (KGs). We explore the application of state-of-the-art open-source LLMs, including the Llama 2 series, Mistral 7B Instruct, and Zephyr for extracting meaningful triples from CTI texts. Our methodology evaluates techniques such as prompt engineering, the guidance framework, and fine-tuning to optimize information extraction and structuring. The extracted data is then utilized to construct a KG, offering a structured and queryable representation of threat intelligence. Experimental results demonstrate the effectiveness of our approach in extracting relevant information, with guidance and fine-tuning showing superior performance over prompt engineering. However, while our methods prove effective in small-scale tests, applying LLMs to large-scale data for KG construction and Link Prediction presents ongoing challenges.
Submitted 30 June, 2024;
originally announced July 2024.
-
CTIBench: A Benchmark for Evaluating LLMs in Cyber Threat Intelligence
Authors:
Md Tanvirul Alam,
Dipkamal Bhusal,
Le Nguyen,
Nidhi Rastogi
Abstract:
Cyber threat intelligence (CTI) is crucial in today's cybersecurity landscape, providing essential insights to understand and mitigate the ever-evolving cyber threats. The recent rise of Large Language Models (LLMs) has shown potential in this domain, but concerns about their reliability, accuracy, and hallucinations persist. While existing benchmarks provide general evaluations of LLMs, there are no benchmarks that address the practical and applied aspects of CTI-specific tasks. To bridge this gap, we introduce CTIBench, a benchmark designed to assess LLMs' performance in CTI applications. CTIBench includes multiple datasets focused on evaluating knowledge acquired by LLMs in the cyber-threat landscape. Our evaluation of several state-of-the-art models on these tasks provides insights into their strengths and weaknesses in CTI contexts, contributing to a better understanding of LLM capabilities in CTI.
Submitted 24 June, 2024; v1 submitted 11 June, 2024;
originally announced June 2024.
-
SECURE: Benchmarking Generative Large Language Models for Cybersecurity Advisory
Authors:
Dipkamal Bhusal,
Md Tanvirul Alam,
Le Nguyen,
Ashim Mahara,
Zachary Lightcap,
Rodney Frazier,
Romy Fieblinger,
Grace Long Torales,
Nidhi Rastogi
Abstract:
Large Language Models (LLMs) have demonstrated potential in cybersecurity applications but have also lowered confidence due to problems like hallucinations and a lack of truthfulness. Existing benchmarks provide general evaluations but do not sufficiently address the practical and applied aspects of LLM performance in cybersecurity-specific tasks. To address this gap, we introduce SECURE (Security Extraction, Understanding & Reasoning Evaluation), a benchmark designed to assess LLM performance in realistic cybersecurity scenarios. SECURE includes six datasets focussed on the Industrial Control System sector to evaluate knowledge extraction, understanding, and reasoning based on industry-standard sources. Our study evaluates seven state-of-the-art models on these tasks, providing insights into their strengths and weaknesses in cybersecurity contexts, and offers recommendations for improving LLM reliability as cyber advisory tools.
Submitted 30 May, 2024;
originally announced May 2024.
-
Network reduction and absence of Hopf Bifurcations in dual phosphorylation networks with three Intermediates
Authors:
Elisenda Feliu,
Nidhi Kaihnsa
Abstract:
Phosphorylation networks, representing the mechanisms by which proteins are phosphorylated at one or multiple sites, are ubiquitous in cell signalling and display rich dynamics such as unlimited multistability. Dual-site phosphorylation networks are known to exhibit oscillations in the form of periodic trajectories when phosphorylation and dephosphorylation occur as a mixed mechanism: phosphorylation of the two sites requires one encounter of the kinase, while dephosphorylation of the two sites requires two encounters with the phosphatase. A still open question is whether a mechanism requiring two encounters for both phosphorylation and dephosphorylation also admits oscillations. In this work we provide evidence in favor of the absence of oscillations of this network by precluding Hopf bifurcations in any reduced network comprising three out of its four intermediate protein complexes. Our argument relies on a novel network reduction step that preserves the absence of Hopf bifurcations, and on a detailed analysis of the semi-algebraic conditions precluding Hopf bifurcations obtained from Hurwitz determinants of the characteristic polynomial of the Jacobian of the system. We conjecture that the removal of certain reverse reactions appearing in Michaelis-Menten-type mechanisms does not have an impact on the presence or absence of Hopf bifurcations. We prove an implication of the conjecture under certain favorable scenarios and support the conjecture with additional example-based evidence.
Submitted 25 May, 2024;
originally announced May 2024.
-
Epistemic EFX Allocations Exist for Monotone Valuations
Authors:
Hannaneh Akrami,
Nidhi Rathi
Abstract:
We study the fundamental problem of fairly dividing a set of indivisible items among agents with (general) monotone valuations. The notion of envy-freeness up to any item (EFX) is considered to be one of the most fascinating fairness concepts in this line of work. Unfortunately, despite significant efforts, existence of EFX allocations is a major open problem in fair division, thereby making the study of approximations and relaxations of EFX a natural line of research. Recently, Caragiannis et al. introduced a promising relaxation of EFX, called epistemic EFX (EEFX). We say an allocation is EEFX if, for every agent, it is possible to shuffle the items in the remaining bundles so that she becomes "EFX-satisfied". Caragiannis et al. prove existence and polynomial-time computability of EEFX allocations for additive valuations. A natural question is what happens when we consider valuations more general than additive.
We address this important open question and answer it affirmatively by establishing the existence of EEFX allocations for an arbitrary number of agents with general monotone valuations. To the best of our knowledge, EEFX is the only known relaxation of EFX (besides EF1) to have such strong existential guarantees. Furthermore, we complement our existential result by proving computational and information-theoretic lower bounds. We prove that even for an arbitrary number of (more than one) agents with identical submodular valuations, it is PLS-hard to compute EEFX allocations and it requires exponentially many value queries to do so.
Submitted 12 June, 2024; v1 submitted 23 May, 2024;
originally announced May 2024.
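The EFX and EEFX definitions in the abstract above can be made concrete with a brute-force checker for tiny instances. This is a sketch for illustration only: the function names, the representation of valuations as Python callables on frozensets, and the example instance are assumptions, and the exhaustive search over reshuffles is exponential, so it says nothing about the paper's existence proof or hardness results.

```python
from itertools import product
from typing import Callable, FrozenSet, Sequence

Valuation = Callable[[FrozenSet[int]], float]


def efx_satisfied(v: Valuation, own: FrozenSet[int],
                  others: Sequence[FrozenSet[int]]) -> bool:
    """True if the agent values `own` at least as much as every other
    bundle with any single item removed from it."""
    return all(v(own) >= v(b - {g}) for b in others for g in b)


def is_efx(vals: Sequence[Valuation], alloc: Sequence[FrozenSet[int]]) -> bool:
    """EFX: every agent is EFX-satisfied in the allocation as given."""
    return all(
        efx_satisfied(vals[i], alloc[i],
                      [b for j, b in enumerate(alloc) if j != i])
        for i in range(len(alloc))
    )


def is_eefx(vals: Sequence[Valuation], alloc: Sequence[FrozenSet[int]]) -> bool:
    """EEFX: for every agent, SOME redistribution of the remaining items
    among the other n-1 bundles leaves that agent EFX-satisfied."""
    n = len(alloc)
    all_items = set().union(*alloc)
    for i in range(n):
        rest = sorted(all_items - alloc[i])
        # Try every assignment of the remaining items to the other bundles.
        ok = any(
            efx_satisfied(
                vals[i], alloc[i],
                [frozenset(g for g, k in zip(rest, assign) if k == b)
                 for b in range(n - 1)],
            )
            for assign in product(range(n - 1), repeat=len(rest))
        )
        if not ok:
            return False
    return True


# Hypothetical instance: three agents who each value a bundle by its size.
unit: Valuation = lambda s: float(len(s))
vals = [unit, unit, unit]
alloc = [frozenset({0}), frozenset({1, 2, 3}), frozenset({4})]
print(is_efx(vals, alloc), is_eefx(vals, alloc))
```

In this instance the agents holding one item envy the three-item bundle even after one item is removed, so the allocation is not EFX; splitting the other four items two and two satisfies each of them, so it is EEFX.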
-
Akal Badi ya Bias: An Exploratory Study of Gender Bias in Hindi Language Technology
Authors:
Rishav Hada,
Safiya Husain,
Varun Gumma,
Harshita Diddee,
Aditya Yadavalli,
Agrima Seth,
Nidhi Kulkarni,
Ujwal Gadiraju,
Aditya Vashistha,
Vivek Seshadri,
Kalika Bali
Abstract:
Existing research in measuring and mitigating gender bias predominantly centers on English, overlooking the intricate challenges posed by non-English languages and the Global South. This paper presents the first comprehensive study delving into the nuanced landscape of gender bias in Hindi, the third most spoken language globally. Our study employs diverse mining techniques, computational models, and field studies, and sheds light on the limitations of current methodologies. Given the challenges faced with mining gender-biased statements in Hindi using existing methods, we conducted field studies to bootstrap the collection of such sentences. Through field studies involving rural and low-income community women, we uncover diverse perceptions of gender bias, underscoring the necessity for context-specific approaches. This paper advocates for a community-centric research design, amplifying voices often marginalized in previous studies. Our findings not only contribute to the understanding of gender bias in Hindi but also establish a foundation for further exploration of Indic languages. By exploring the intricacies of this understudied context, we call for thoughtful engagement with gender bias, promoting inclusivity and equity in linguistic and cultural contexts beyond the Global North.
Submitted 10 May, 2024;
originally announced May 2024.
-
Sheet model description of spatio-temporal evolution of upper-hybrid oscillations in an inhomogeneous magnetic field
Authors:
Nidhi Rathee,
Someswar Dutta,
R. Srinivasan,
Sudip Sengupta
Abstract:
Spatio-temporal evolution of large amplitude upper hybrid oscillations in a cold homogeneous plasma in the presence of an inhomogeneous magnetic field is studied analytically and numerically using the Dawson sheet model. It is observed that the inhomogeneity in the magnetic field, which causes the upper hybrid frequency to acquire a spatial dependence, results in phase mixing and subsequent breaking of the upper hybrid oscillations at arbitrarily low amplitudes. This result is in sharp contrast to the usual upper hybrid oscillations in a homogeneous magnetic field, where the oscillations break within a fraction of a period when the amplitude exceeds a certain critical value. Our perturbative calculations show that the phase mixing (wave breaking) time scales inversely with the amplitude of magnetic field inhomogeneity ($Δ$) and the amplitude of the imposed density perturbation ($δ$), and scales directly with the ratio of magnetic field inhomogeneity scale length to imposed density perturbation scale length ($(α/k_L)^{-1}$) as $ω_{pe}τ_{mix} \sim \left( 1+β^2 \right) ^{3/2}k_L/(β^2δΔα)$, where $β$ is the ratio of electron cyclotron frequency to electron plasma frequency. Furthermore, the phase mixing time measured in simulations, performed using a 1-1/2 D code based on the Dawson sheet model, shows good agreement with the above-mentioned scaling. This result may be of relevance to plasma-based particle acceleration experiments in the presence of a transverse inhomogeneous magnetic field.
Submitted 9 May, 2024;
originally announced May 2024.
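The scaling relation quoted in the abstract above can be evaluated numerically to see how the phase mixing time responds to each parameter. The sketch below assumes an order-one prefactor of exactly 1, which the abstract does not state (it gives only a proportionality), so the returned number is an order-of-magnitude estimate of $ω_{pe}τ_{mix}$. The function and argument names are hypothetical.

```python
import math


def phase_mixing_time(beta: float, delta_n: float, delta_b: float,
                      alpha: float, k_l: float) -> float:
    """Estimate the dimensionless phase mixing time omega_pe * tau_mix.

    beta    : ratio of electron cyclotron to electron plasma frequency
    delta_n : amplitude of the imposed density perturbation (delta)
    delta_b : amplitude of the magnetic field inhomogeneity (Delta)
    alpha   : wavenumber of the magnetic field inhomogeneity
    k_l     : wavenumber of the imposed density perturbation
    The unit prefactor is an assumption; the source gives a scaling only.
    """
    return (1.0 + beta**2) ** 1.5 * k_l / (beta**2 * delta_n * delta_b * alpha)


# Hypothetical parameter values, chosen only to exercise the formula.
t_ref = phase_mixing_time(beta=1.0, delta_n=0.1, delta_b=0.1, alpha=1.0, k_l=1.0)
t_double = phase_mixing_time(beta=1.0, delta_n=0.2, delta_b=0.1, alpha=1.0, k_l=1.0)
print(t_ref, t_double)
```

Doubling the density perturbation amplitude halves the estimate, which matches the inverse scaling with $δ$ stated in the abstract.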
-
Improve Academic Query Resolution through BERT-based Question Extraction from Images
Authors:
Nidhi Kamal,
Saurabh Yadav,
Jorawar Singh,
Aditi Avasthi
Abstract:
Providing fast and accurate resolution of student queries is an essential service offered by Edtech organizations. This is generally provided through a chatbot-like interface that enables students to ask their doubts easily. One preferred format for student queries is images, as it allows students to capture and post questions without typing complex equations and information. However, this format also presents difficulties, as images may contain multiple questions or textual noise that lowers the accuracy of existing single-query answering solutions. In this paper, we propose a method for extracting questions from text or images using a BERT-based deep learning model and compare it to other rule-based and layout-based methods. Our method aims to improve the accuracy and efficiency of student query resolution in Edtech organizations.
Submitted 28 April, 2024;
originally announced May 2024.
-
Efficient and Adaptive Posterior Sampling Algorithms for Bandits
Authors:
Bingshan Hu,
Zhiming Huang,
Tianyue H. Zhang,
Mathias Lécuyer,
Nidhi Hegde
Abstract:
We study Thompson Sampling-based algorithms for stochastic bandits with bounded rewards. As the existing problem-dependent regret bound for Thompson Sampling with Gaussian priors [Agrawal and Goyal, 2017] is vacuous when $T \le 288 e^{64}$, we derive a more practical bound that tightens the coefficient of the leading term from $288 e^{64}$ to $1270$. Additionally, motivated by large-scale real-world applications that require scalability, adaptive computational resource allocation, and a balance in utility and computation, we propose two parameterized Thompson Sampling-based algorithms: Thompson Sampling with Model Aggregation (TS-MA-$α$) and Thompson Sampling with Timestamp Duelling (TS-TD-$α$), where $α\in [0,1]$ controls the trade-off between utility and computation. Both algorithms achieve $O \left(K\ln^{α+1}(T)/Δ\right)$ regret bound, where $K$ is the number of arms, $T$ is the finite learning horizon, and $Δ$ denotes the single round performance loss when pulling a sub-optimal arm.
Submitted 2 May, 2024;
originally announced May 2024.
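For readers unfamiliar with the baseline that the abstract above builds on, here is a minimal sketch of vanilla Thompson Sampling with Gaussian priors for rewards in [0, 1]. It uses one common form of the algorithm, sampling each arm's index from a normal distribution with mean S_i/(n_i+1) and variance 1/(n_i+1), where S_i is the arm's reward sum and n_i its pull count. It does not implement TS-MA-$α$ or TS-TD-$α$. The function name, the Bernoulli reward simulation, and the arm means are assumptions for illustration.

```python
import random
from typing import List, Sequence


def gaussian_thompson_sampling(means: Sequence[float], horizon: int,
                               seed: int = 0) -> List[int]:
    """Run Thompson Sampling with Gaussian priors on simulated Bernoulli arms.

    means   : true success probability of each arm (unknown to the learner)
    horizon : number of rounds T
    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    k = len(means)
    pulls = [0] * k
    reward_sums = [0.0] * k
    for _ in range(horizon):
        # One posterior sample per arm: N(S_i / (n_i + 1), 1 / (n_i + 1)).
        samples = [
            rng.gauss(reward_sums[i] / (pulls[i] + 1),
                      (1.0 / (pulls[i] + 1)) ** 0.5)
            for i in range(k)
        ]
        arm = max(range(k), key=lambda i: samples[i])
        # Simulated bounded reward in {0, 1}.
        reward = 1.0 if rng.random() < means[arm] else 0.0
        pulls[arm] += 1
        reward_sums[arm] += reward
    return pulls


# Hypothetical two-armed instance with a large gap between the arms.
pulls = gaussian_thompson_sampling([0.1, 0.9], horizon=2000, seed=1)
print(pulls)
```

The sampling step draws one value per arm in every round, which is the per-round computation that the abstract's $α$ parameter is described as trading off against utility.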
-
PASA: Attack Agnostic Unsupervised Adversarial Detection using Prediction & Attribution Sensitivity Analysis
Authors:
Dipkamal Bhusal,
Md Tanvirul Alam,
Monish K. Veerabhadran,
Michael Clifford,
Sara Rampazzi,
Nidhi Rastogi
Abstract:
Deep neural networks for classification are vulnerable to adversarial attacks, where small perturbations to input samples lead to incorrect predictions. This susceptibility, combined with the black-box nature of such networks, limits their adoption in critical applications like autonomous driving. Feature-attribution-based explanation methods provide the relevance of input features for model predictions on input samples, thus explaining model decisions. However, we observe that both model predictions and feature attributions for input samples are sensitive to noise. We develop a practical method that uses this characteristic of model prediction and feature attribution to detect adversarial samples. Our method, PASA, requires the computation of two test statistics using model prediction and feature attribution and can reliably detect adversarial samples using thresholds learned from benign samples. We validate our lightweight approach by evaluating the performance of PASA on varying strengths of FGSM, PGD, BIM, and CW attacks on multiple image and non-image datasets. On average, we outperform state-of-the-art statistical unsupervised adversarial detectors on CIFAR-10 and ImageNet by 14% and 35% ROC-AUC scores, respectively. Moreover, our approach demonstrates competitive performance even when an adversary is aware of the defense mechanism.
Submitted 12 April, 2024;
originally announced April 2024.
-
Trials Factor for Semi-Supervised NN Classifiers in Searches for Narrow Resonances at the LHC
Authors:
Benjamin Lieberman,
Andreas Crivellin,
Salah-Eddine Dahbi,
Finn Stevenson,
Nidhi Tripathi,
Mukesh Kumar,
Bruce Mellado
Abstract:
To mitigate the model dependencies of searches for new narrow resonances at the Large Hadron Collider (LHC), semi-supervised Neural Networks (NNs) can be used. Unlike fully supervised classifiers, these models introduce an additional look-elsewhere effect in the process of optimising thresholds on the response distribution. We perform a frequentist study to quantify this effect, in the form of a trials factor. As an example, we consider simulated $Zγ$ data to perform narrow resonance searches using semi-supervised NN classifiers. The results of this analysis substantiate that the look-elsewhere effect induced by the semi-supervised NN is under control.
Submitted 27 June, 2024; v1 submitted 11 April, 2024;
originally announced April 2024.
-
ReALM: Reference Resolution As Language Modeling
Authors:
Joel Ruben Antony Moniz,
Soundarya Krishnan,
Melis Ozyildirim,
Prathamesh Saraf,
Halim Cagri Ates,
Yuan Zhang,
Hong Yu,
Nidhi Rajshree
Abstract:
Reference resolution is an important problem, one that is essential to understand and successfully handle context of different kinds. This context includes both previous turns and context that pertains to non-conversational entities, such as entities on the user's screen or those running in the background. While LLMs have been shown to be extremely powerful for a variety of tasks, they remain underutilized for reference resolution, particularly for non-conversational entities. This paper demonstrates how LLMs can be used to create an extremely effective system to resolve references of various types, by showing how reference resolution can be converted into a language modeling problem, despite involving forms of entities like those on screen that are not traditionally conducive to being reduced to a text-only modality. We demonstrate large improvements over an existing system with similar functionality across different types of references, with our smallest model obtaining absolute gains of over 5% for on-screen references. We also benchmark against GPT-3.5 and GPT-4, with our smallest model achieving performance comparable to that of GPT-4, and our larger models substantially outperforming it.
Submitted 29 March, 2024;
originally announced March 2024.
-
Connectivity of Parameter Regions of Multistationarity for Multisite Phosphorylation Networks
Authors:
Nidhi Kaihnsa,
Máté L. Telek
Abstract:
The parameter region of multistationarity of a reaction network contains all the parameters for which the associated dynamical system exhibits multiple steady states. Describing this region is challenging and remains an active area of research. In this paper, we concentrate on two biologically relevant families of reaction networks that model multisite phosphorylation and dephosphorylation of a substrate at $n$ sites. For small values of $n$, it had previously been shown that the parameter region of multistationarity is connected. Here, we extend these results and provide a proof that applies to all values of $n$. Our techniques are based on the study of the critical polynomial associated with these reaction networks together with polyhedral geometric conditions of the signed support of this polynomial.
Submitted 25 March, 2024;
originally announced March 2024.
-
Iterations of the functor of naive $\mathbb A^1$-connected components of varieties
Authors:
Nidhi Gupta
Abstract:
For any sheaf of sets $\mathcal F$ on $Sm/k$, it is well known that the universal $\mathbb A^1$-invariant quotient of $\mathcal F$ is given as the colimit of sheaves $\mathcal S^n(\mathcal F)$, where $\mathcal S(\mathcal F)$ is the sheaf of naive $\mathbb A^1$-connected components of $\mathcal F$. We show that these infinite iterations of naive $\mathbb A^1$-connected components in the construction of the universal $\mathbb A^1$-invariant quotient for a scheme are indeed required. For every $n$, we construct an $\mathbb A^1$-connected variety $X_n$ such that $\mathcal S^n(X_n)\neq \mathcal S^{n+1}(X_n)$ and $\mathcal S^{n+2}(X_n)=*$.
Submitted 14 March, 2024; v1 submitted 23 February, 2024;
originally announced February 2024.
-
MORPH: Towards Automated Concept Drift Adaptation for Malware Detection
Authors:
Md Tanvirul Alam,
Romy Fieblinger,
Ashim Mahara,
Nidhi Rastogi
Abstract:
Concept drift is a significant challenge for malware detection, as the performance of trained machine learning models degrades over time, rendering them impractical. While prior research in malware concept drift adaptation has primarily focused on active learning, which involves selecting representative samples to update the model, self-training has emerged as a promising approach to mitigate concept drift. Self-training involves retraining the model using pseudo labels to adapt to shifting data distributions. In this research, we propose MORPH -- an effective pseudo-label-based concept drift adaptation method specifically designed for neural networks. Through extensive experimental analysis of Android and Windows malware datasets, we demonstrate the efficacy of our approach in mitigating the impact of concept drift. Our method offers the advantage of reducing annotation efforts when combined with active learning. Furthermore, our method significantly improves over existing works in automated concept drift adaptation for malware detection.
Submitted 23 January, 2024;
originally announced January 2024.
-
ANFIS and metaheuristics for green supply chain with inspection and rework
Authors:
Nidhi Sharma,
Madhu Jain,
Dinesh Sharma
Abstract:
The present article investigates a supply chain inventory model for deteriorating items with inspection and stock-dependent demand, using green technology to reduce carbon emissions. Deteriorating products are highly sensitive to environmental factors such as temperature, carbon emission, humidity, and waste disposal. This study develops a profit maximization model in the presence of deterioration, preservation, imperfect production, inspection error, rework, and stock- and price-dependent demand. Three carbon emission strategies are proposed to reduce expenses in different carbon emission scenarios. The suggested approach may be used to determine the optimal production period, preservation investment, and level of green investment. The proposed non-linear constrained optimization problem is solved by using a penalty method in metaheuristic approaches. A numerical example is presented in order to conduct a sensitivity analysis for the essential model parameters. The results produced by differential evolution (DE) and particle swarm optimization (PSO) are compared with the results obtained by the Adaptive Neuro-Fuzzy Inference System (ANFIS) technique.
Submitted 12 July, 2024; v1 submitted 17 January, 2024;
originally announced January 2024.
-
Ultralow Lattice Thermal Conductivity in Complex Structure Cu26V2Sn6Se32 due to Interaction of Low-Frequency Acoustic-Optical Phonons
Authors:
Kewal Singh Rana,
Debattam Sarkar,
Nidhi,
Aditya Singh,
Chandan Bera,
Kanishka Biswas,
Ajay Soni
Abstract:
Damping of phonon momentum suppresses the lattice thermal conductivity (kl) through low energy acoustic-optical phonon interactions. We studied the thermal transport properties and underlying mechanism of phonon interactions in the large unit cell Cu26V2Sn6Se32. The large number of atoms in the unit cell results in low acoustic phonon cutoff frequency, flat phonon branches, low frequency Raman active modes, localized rattler-like vibrations and strong crystalline anharmonicity. The crystal structure complexity disrupts the phonon propagation through weakly bonded Cu atoms, boson peak and poor phonon velocity. The sulfur at selenium sites (Cu26V2Sn6Se30S2) distorts the crystal lattice by offering an additional scattering mechanism at the anionic sites, thereby increasing the power factor and decreasing the kl. This strategic manipulation of phonon scattering towards ultra-low kl not only results in improved thermoelectric performance but also offers insights into the fundamental understanding of heat transport in complex structured, large unit cell compounds.
Submitted 7 January, 2024;
originally announced January 2024.
-
Gemini: A Family of Highly Capable Multimodal Models
Authors:
Gemini Team,
Rohan Anil,
Sebastian Borgeaud,
Jean-Baptiste Alayrac,
Jiahui Yu,
Radu Soricut,
Johan Schalkwyk,
Andrew M. Dai,
Anja Hauth,
Katie Millican,
David Silver,
Melvin Johnson,
Ioannis Antonoglou,
Julian Schrittwieser,
Amelia Glaese,
Jilin Chen,
Emily Pitler,
Timothy Lillicrap,
Angeliki Lazaridou,
Orhan Firat,
James Molloy,
Michael Isard,
Paul R. Barham,
Tom Hennigan,
Benjamin Lee
, et al. (1325 additional authors not shown)
Abstract:
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.
Submitted 17 June, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
Predicting Neural Stem Cell Differentiation Using Deep Learning Models
Authors:
Chandra Suda,
Nidhi Parthasarathy,
Anika Mittal,
Ian Young Chen,
Ananya Jalihal
Abstract:
Neural stem cells have immense therapeutic potential for treating various neurological disorders. However, lengthy differentiation protocols hinder the translation of neural stem cells into clinical applications. In this study, we present a deep learning approach using convolutional neural networks (CNNs) to predict the fate of neural stem cell differentiation at an early stage. We trained a CNN model on a dataset of cellular images from neural stem cell cultures, and the model achieved impressive results with a 99.7% testing accuracy for the binary Resnet50 model and a 93.3% testing accuracy for the multiclass Resnet50 model in predicting neuron and glial cell differentiation. This demonstrates the feasibility of using CNNs for rapid, early differentiation outcome prediction from simple microscopy images, which could greatly accelerate neural stem cell research and therapies. Additionally, the model provides biological insights into morphological features associated with specific neural cell lineages.
Submitted 19 November, 2023;
originally announced December 2023.
-
Converting Epics/Stories into Pseudocode using Transformers
Authors:
Gaurav Kolhatkar,
Akshit Madan,
Nidhi Kowtal,
Satyajit Roy,
Sheetal Sonawane
Abstract:
The conversion of user epics or stories into their appropriate representation in pseudocode or code is a time-consuming task, which can take up a large portion of the time in an industrial project. With this research paper, we aim to present a methodology to generate pseudocode from a given agile user story of small functionalities so as to reduce the overall time spent on the industrial project. Pseudocode is a programming language agnostic representation of the steps involved in a computer program, which can be easily converted into any programming language. Leveraging the potential of Natural Language Processing, we want to simplify the development process in organizations that use the Agile Model of Software Development. We present a methodology to convert a problem described in the English language into pseudocode. This methodology divides the Text to Pseudocode conversion task into two stages or subtasks, each of which is treated like an individual machine translation task. Stage 1 is Text to Code Conversion and Stage 2 is Code to Pseudocode Conversion. We find that the CodeT5 model gives the best results in terms of BLEU score when trained separately on the two subtasks mentioned above. BLEU score is a metric that is used to measure the similarity between a machine-translated text and a set of reference translations.
Submitted 8 December, 2023;
originally announced December 2023.
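The BLEU metric mentioned in the abstract above can be illustrated with a minimal sketch. This is a simplified sentence-level version (clipped n-gram precision up to bigrams, geometric mean, brevity penalty) written only for illustration; it is not the evaluation code used by the authors, and the function names `ngrams`, `modified_precision`, and `bleu` as well as the `max_n=2` default are assumptions.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as often as it appears in any single reference."""
    cand = ngrams(candidate, n)
    if not cand:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand.items())
    return clipped / sum(cand.values())


def bleu(candidate, references, max_n=2):
    """Simplified BLEU: geometric mean of clipped precisions times a brevity penalty."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    # Reference whose length is closest to the candidate (ties go to the shorter one).
    closest = min(references, key=lambda r: (abs(len(r) - len(candidate)), len(r)))
    if len(candidate) > len(closest):
        penalty = 1.0
    else:
        penalty = math.exp(1 - len(closest) / len(candidate))
    return penalty * math.exp(sum(math.log(p) for p in precisions) / max_n)


reference = "read the input and print the sum".split()
identical_score = bleu(reference, [reference])
disjoint_score = bleu("open a file".split(), [reference])
```

Here `identical_score` is 1.0 because every n-gram matches and the lengths agree, while `disjoint_score` is 0.0 because no unigram overlaps. Production evaluations usually rely on an established implementation with smoothing rather than a hand-rolled one.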
-
Complexity-Guided Curriculum Learning for Text Graphs
Authors:
Nidhi Vakil,
Hadi Amiri
Abstract:
Curriculum learning provides a systematic approach to training. It refines training progressively, tailors training to task requirements, and improves generalization through exposure to diverse examples. We present a curriculum learning approach that builds on existing knowledge about text and graph complexity formalisms for training with text graph data. The core part of our approach is a novel data scheduler, which employs "spaced repetition" and complexity formalisms to guide the training process. We demonstrate the effectiveness of the proposed approach on several text graph tasks and graph neural network architectures. The proposed model gains more and uses less data; consistently prefers text over graph complexity indices throughout training, while the best curricula derived from text and graph complexity indices are equally effective; and it learns transferable curricula across GNN models and datasets. In addition, we find that both node-level (local) and graph-level (global) graph complexity indices, as well as shallow and traditional text complexity indices play a crucial role in effective curriculum learning.
Submitted 22 November, 2023;
originally announced November 2023.
-
Fully time-dependent cloud formation from a non-equilibrium gas-phase in exoplanetary atmospheres
Authors:
Sven Kiefer,
Helena Lecoq-Molinos,
Christiane Helling,
Nidhi Bangera,
Leen Decin
Abstract:
Recent observations suggest the presence of clouds in exoplanet atmospheres but have also shown that certain chemical species in the upper atmosphere might not be in chemical equilibrium. The goal of this work is to calculate the two main cloud formation processes, nucleation and bulk growth, consistently from a non-equilibrium gas-phase. The aim is further to explore the interaction between a kinetic gas-phase and cloud micro-physics. The cloud formation is modeled using the moment method and kinetic nucleation which are coupled to a gas-phase kinetic rate network. Specifically, the formation of cloud condensation nuclei is derived from cluster rates that include the thermochemical data of (TiO$_2$)$_N$ from N = 1 to 15. The surface growth of 9 bulk Al/Fe/Mg/O/Si/S/Ti binding materials considers the respective gas-phase species through condensation and surface reactions as derived from kinetic disequilibrium. The effect of completeness of rate networks and the time evolution of the cloud particle formation is studied for an example exoplanet HD 209458 b. A consistent, fully time-dependent cloud formation model in chemical disequilibrium with respect to nucleation, bulk growth and the gas-phase is presented and first test cases are studied. This model shows that cloud formation in exoplanet atmospheres is a fast process. This confirms previous findings that the formation of cloud particles is a local process. Tests on selected locations within the atmosphere of the gas-giant HD 209458 b show that the cloud particle number density and volume reach constant values within 1s. The complex kinetic polymer nucleation of TiO$_2$ confirms results from classical nucleation models. The surface reactions of SiO[s] and SiO$_2$[s] can create a catalytic cycle that dissociates H$_2$ to 2 H, resulting in a reduction of the CH$_4$ number densities.
Submitted 6 November, 2023;
originally announced November 2023.
-
Emergent (In)Security of Multi-Cloud Environments
Authors:
Morgan Reece,
Theodore Lander Jr.,
Sudip Mittal,
Nidhi Rastogi,
Josiah Dykstra,
Andy Sampson
Abstract:
As organizations increasingly use cloud services to host their IT infrastructure, there is a need to share data among these cloud hosted services and systems. A majority of IT organizations have workloads spread across different cloud service providers, growing their multi-cloud environments. When an organization grows their multi-cloud environment, the threat vectors and vulnerabilities for their cloud systems and services grow as well. The increase in the number of attack vectors creates a challenge of how to prioritize mitigations and countermeasures to best defend a multi-cloud environment against attacks. Utilizing multiple industry standard risk analysis tools, we conducted an analysis of multi-cloud threat vectors enabling calculation and prioritization for the identified mitigations and countermeasures. The prioritizations from the analysis showed that authentication and architecture are the highest risk areas of threat vectors. Armed with this data, IT managers are able to more appropriately budget cybersecurity expenditure to implement the most impactful mitigations and countermeasures.
Submitted 2 November, 2023;
originally announced November 2023.
-
Terahertz Saturable Absorption from Relativistic High-Temperature Thermodynamics in Black Phosphorus
Authors:
Nidhi Adhlakha,
Zeinab Ebrahimpour,
Paola Di Pietro,
Johannes Schmidt,
Federica Piccirilli,
Daniele Fausti,
Angela Montanaro,
Emmanuele Cappelluti,
Stefano Lupi,
Andrea Perucchi
Abstract:
Thanks to its tunable infrared band-gap and to its anisotropic conduction properties, black phosphorus represents a unique 2D material, whose potential in the engineering of new devices still needs to be fully explored. We investigate here the nonlinear terahertz (THz) electrodynamics of black phosphorus along the more conducting armchair direction. Similarly to the case of other 2D systems like graphene and topological insulators, the THz saturable absorption properties of black phosphorus can be understood within a thermodynamic model by assuming a fast thermalization of the electron bath. While black phosphorus does not display the presence of massless fermions at ambient pressure and temperature, our analysis shows that its anomalous THz nonlinear properties can be accounted for by a relativistic massive Dirac dispersion, provided the Fermi temperature is low enough. An optimal tuning of the Fermi level therefore represents a strategy to engineer strong THz nonlinear response in other massive Dirac materials such as transition metal dichalcogenides or high-temperature superconductors.
Submitted 20 October, 2023;
originally announced October 2023.
-
Observed Trends in FRB Population and Bi-modality in the Luminosity Density Distribution
Authors:
Nidhi Saini,
Patrick Das Gupta
Abstract:
Fast radio bursts (FRBs) are radio transients of extragalactic origin lasting for about a few to several milli-seconds. We have analyzed both non-CHIME and CHIME FRB data. To circumvent the absence of measured fluence and flux density of FRBs belonging to the CHIME catalog, we have devised a novel approach that utilizes the ratio of the lower limits of the flux density $S_{ν_O}$ to the fluence $F_{ν_O}$ of individual FRB events to construct several parameters to investigate the presence of underlying trends in the FRB population drawn from both CHIME and non-CHIME data sets. One of these parameters involves true brightness temperature as well as energy density, despite not knowing the actual size of the FRB emission region. Our first robust conclusion is that the non-CHIME FRBs fall under two broad categories - those with luminosity density less than about $4\times 10^{33} $ erg/s/Hz at the frequency 300 MHz and those having larger luminosity density values than this. Assuming that FRBs are caused by magnetar glitches, we have discussed in this paper a simple physical model, incorporating an abrupt change in the light cylinder radius of an oblique rotator, to address the existence of these two categories.
Submitted 21 April, 2024; v1 submitted 18 October, 2023;
originally announced October 2023.
-
COVID-19 South African Vaccine Hesitancy Models Show Boost in Performance Upon Fine-Tuning on M-pox Tweets
Authors:
Nicholas Perikli,
Srimoy Bhattacharya,
Blessing Ogbuokiri,
Zahra Movahedi Nia,
Benjamin Lieberman,
Nidhi Tripathi,
Salah-Eddine Dahbi,
Finn Stevenson,
Nicola Bragazzi,
Jude Kong,
Bruce Mellado
Abstract:
Very large numbers of M-pox cases have, since the start of May 2022, been reported in non-endemic countries, leading many to fear that the M-pox outbreak would rapidly transition into another pandemic while the COVID-19 pandemic ravages on. Given the similarities of M-pox with COVID-19, we chose to test the performance of COVID-19 models trained on South African Twitter data on a hand-labelled M-pox dataset before and after fine-tuning. More than 20k M-pox-related tweets from South Africa were hand-labelled as being either positive, negative or neutral. After fine-tuning these COVID-19 models on the M-pox dataset, the F1-scores increased by more than 8%, falling just short of 70%, but still outperforming state-of-the-art models and well-known classification algorithms. An LDA-based topic modelling procedure was used to compare the misclassified M-pox tweets of the original COVID-19 RoBERTa model with its fine-tuned version, and from this analysis, we were able to draw conclusions on how to build more sophisticated models.
Submitted 4 October, 2023;
originally announced October 2023.
-
Tight Sampling in Unbounded Networks
Authors:
Kshitijaa Jaglan,
Meher Chaitanya,
Triansh Sharma,
Abhijeeth Singam,
Nidhi Goyal,
Ponnurangam Kumaraguru,
Ulrik Brandes
Abstract:
The default approach to deal with the enormous size and limited accessibility of many Web and social media networks is to sample one or more subnetworks from a conceptually unbounded unknown network. Clearly, the extracted subnetworks will crucially depend on the sampling scheme. Motivated by studies of homophily and opinion formation, we propose a variant of snowball sampling designed to prioritize inclusion of entire cohesive communities rather than any kind of representativeness, breadth, or depth of coverage. The method is illustrated on a concrete example, and experiments on synthetic networks suggest that it behaves as desired.
Submitted 5 October, 2023; v1 submitted 4 October, 2023;
originally announced October 2023.
-
Analyzing the Efficacy of an LLM-Only Approach for Image-based Document Question Answering
Authors:
Nidhi Hegde,
Sujoy Paul,
Gagan Madan,
Gaurav Aggarwal
Abstract:
Recent document question answering models consist of two key components: the vision encoder, which captures layout and visual elements in images, and a Large Language Model (LLM) that helps contextualize questions to the image and supplements them with external world knowledge to generate accurate answers. However, the relative contributions of the vision encoder and the language model in these tasks remain unclear. This is especially interesting given the effectiveness of instruction-tuned LLMs, which exhibit remarkable adaptability to new tasks. To this end, we explore the following aspects in this work: (1) The efficacy of an LLM-only approach on document question answering tasks (2) strategies for serializing textual information within document images and feeding it directly to an instruction-tuned LLM, thus bypassing the need for an explicit vision encoder (3) thorough quantitative analysis on the feasibility of such an approach. Our comprehensive analysis encompasses six diverse benchmark datasets, utilizing LLMs of varying scales. Our findings reveal that a strategy exclusively reliant on the LLM yields results that are on par with or closely approach state-of-the-art performance across a range of datasets. We posit that this evaluation framework will serve as a guiding resource for selecting appropriate datasets for future research endeavors that emphasize the fundamental importance of layout and image content information.
Submitted 25 September, 2023;
originally announced September 2023.
-
The creation of a massive UCD by tidal threshing from NGC 936
Authors:
Sanjaya Paudel,
Pierre-Alain Duc,
Sungsoon Lim,
Mélina Poulain,
Francine R. Marleau,
Oliver Müller,
Rubén Sánchez-Janssen,
Rebecca Habas,
Patrick R. Durrell,
Nick Heesters,
Daya Nidhi Chhatkuli,
Suk-Jin Yoon
Abstract:
We study a compact nucleus embedded in an early-type dwarf galaxy, MATLAS-167, which is in the process of disruption by the tidal force of the neighboring giant S0 galaxy, NGC 936, in a group environment. Using the imaging data of the MATLAS survey, we analyze the stellar tidal tail of MATLAS-167 and its central compact nucleus, designated as NGC 936_UCD. We find that NGC 936_UCD has a luminosity of M$_{g}$ = $-$11.43$\pm$0.01 mag and a size of 66.5$\pm$17 pc, sharing the global properties of Ultra Compact Dwarf galaxies (UCDs) but significantly larger and brighter compared to the typical UCD populations observed in the Virgo cluster. By integrating the total luminosity of both the tidal stream and MATLAS-167, we estimate that the disrupted dwarf progenitor possesses a luminosity of M$_{g}$ = $-$15.92$\pm$0.06 mag, a typical bright dE luminosity. With the help of the optical spectrum observed by the SDSS survey, we derive the simple stellar population properties of NGC 936_UCD: a light-weighted age of 5.6$\pm$0.7 Gyr and metallicity of [Z/H] = $-$0.83$\pm$0.3 dex. Our findings suggest that tidal threshing is a possible formation mechanism of bright UCD populations in close proximity to giant galaxies.
Submitted 14 September, 2023;
originally announced September 2023.
-
Adapted Large Language Models Can Outperform Medical Experts in Clinical Text Summarization
Authors:
Dave Van Veen,
Cara Van Uden,
Louis Blankemeier,
Jean-Benoit Delbrouck,
Asad Aali,
Christian Bluethgen,
Anuj Pareek,
Malgorzata Polacin,
Eduardo Pontes Reis,
Anna Seehofnerova,
Nidhi Rohatgi,
Poonam Hosamani,
William Collins,
Neera Ahuja,
Curtis P. Langlotz,
Jason Hom,
Sergios Gatidis,
John Pauly,
Akshay S. Chaudhari
Abstract:
Analyzing vast textual data and summarizing key information from electronic health records imposes a substantial burden on how clinicians allocate their time. Although large language models (LLMs) have shown promise in natural language processing (NLP), their effectiveness on a diverse range of clinical summarization tasks remains unproven. In this study, we apply adaptation methods to eight LLMs, spanning four distinct clinical summarization tasks: radiology reports, patient questions, progress notes, and doctor-patient dialogue. Quantitative assessments with syntactic, semantic, and conceptual NLP metrics reveal trade-offs between models and adaptation methods. A clinical reader study with ten physicians evaluates summary completeness, correctness, and conciseness; in a majority of cases, summaries from our best adapted LLMs are either equivalent (45%) or superior (36%) compared to summaries from medical experts. The ensuing safety analysis highlights challenges faced by both LLMs and medical experts, as we connect errors to potential medical harm and categorize types of fabricated information. Our research provides evidence of LLMs outperforming medical experts in clinical text summarization across multiple tasks. This suggests that integrating LLMs into clinical workflows could alleviate documentation burden, allowing clinicians to focus more on patient care.
Submitted 11 April, 2024; v1 submitted 14 September, 2023;
originally announced September 2023.
-
Detecting the Presence of COVID-19 Vaccination Hesitancy from South African Twitter Data Using Machine Learning
Authors:
Nicholas Perikli,
Srimoy Bhattacharya,
Blessing Ogbuokiri,
Zahra Movahedi Nia,
Benjamin Lieberman,
Nidhi Tripathi,
Salah-Eddine Dahbi,
Finn Stevenson,
Nicola Bragazzi,
Jude Kong,
Bruce Mellado
Abstract:
Very few social media studies have been done on South African user-generated content during the COVID-19 pandemic and even fewer using hand-labelling over automated methods. Vaccination is a major tool in the fight against the pandemic, but vaccine hesitancy jeopardizes any public health effort. In this study, sentiment analysis on South African tweets related to vaccine hesitancy was performed, with the aim of training AI-mediated classification models and assessing their reliability in categorizing user-generated content (UGC). A dataset of 30000 tweets from South Africa was extracted and hand-labelled into one of three sentiment classes: positive, negative, neutral. The machine learning models used were LSTM, bi-LSTM, SVM, BERT-base-cased and RoBERTa-base, whose hyperparameters were carefully chosen and tuned using the WandB platform. We used two different approaches to pre-process our data for comparison: one was semantics-based, while the other was corpus-based. The tweets in our dataset were pre-processed using each method in turn. All models were found to have low F1-scores within a range of 45$\%$-55$\%$, except for BERT and RoBERTa, which both achieved significantly better measures with overall F1-scores of 60$\%$ and 61$\%$, respectively. Topic modelling using LDA was performed on the misclassified tweets of the RoBERTa model to gain insight on how to further improve model accuracy.
Submitted 12 July, 2023;
originally announced July 2023.
-
On the complexity of Pareto-optimal and envy-free lotteries
Authors:
Ioannis Caragiannis,
Kristoffer Arnsfelt Hansen,
Nidhi Rathi
Abstract:
We study the classic problem of dividing a collection of indivisible resources in a fair and efficient manner among a set of agents having varied preferences. Pareto optimality is a standard notion of economic efficiency, which states that it should be impossible to find an allocation that improves some agent's utility without reducing any other's. On the other hand, a fundamental notion of fairness in resource allocation settings is that of envy-freeness, which renders an allocation to be fair if every agent (weakly) prefers her own bundle over that of any other agent's bundle. Unfortunately, an envy-free allocation may not exist if we wish to divide a collection of indivisible items. Introducing randomness is a typical way of circumventing the non-existence of solutions, and therefore, allocation lotteries, i.e., distributions over allocations have been explored while relaxing the notion of fairness to ex-ante envy freeness.
We consider a general fair division setting with $n$ agents and a family of admissible $n$-partitions of an underlying set of items. Every agent is endowed with partition-based utilities, which specify her cardinal utility for each bundle of items in every admissible partition. In such fair division instances, Cole and Tao (2021) have proved that an ex-ante envy-free and Pareto-optimal allocation lottery is always guaranteed to exist. We strengthen their result while examining the computational complexity of the above total problem and establish its membership in the complexity class PPAD. Furthermore, for instances with a constant number of agents, we develop a polynomial-time algorithm to find an ex-ante envy-free and Pareto-optimal allocation lottery. On the negative side, we prove that maximizing social welfare over ex-ante envy-free and Pareto-optimal allocation lotteries is NP-hard.
Submitted 24 July, 2023;
originally announced July 2023.
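The gap between deterministic and ex-ante envy-freeness described in the abstract above can be sketched on a toy instance. The sketch assumes additive utilities for simplicity, which is narrower than the partition-based utilities the paper studies; the function names and the single-item instance are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def bundle_value(utilities, agent, bundle):
    """Additive utility of `agent` for a bundle of items (a simplifying assumption)."""
    return sum(utilities[agent][item] for item in bundle)


def is_envy_free(utilities, allocation):
    """True if every agent weakly prefers her own bundle to every other bundle."""
    n = len(allocation)
    return all(
        bundle_value(utilities, i, allocation[i])
        >= bundle_value(utilities, i, allocation[j])
        for i in range(n)
        for j in range(n)
    )


def is_ex_ante_envy_free(utilities, lottery):
    """Same test in expectation; `lottery` is a list of (probability, allocation) pairs."""
    n = len(lottery[0][1])

    def expected(i, j):
        # Expected value agent i assigns to the bundle that agent j receives.
        return sum(p * bundle_value(utilities, i, alloc[j]) for p, alloc in lottery)

    return all(expected(i, i) >= expected(i, j) for i in range(n) for j in range(n))


# Two agents who both value a single indivisible item "a" at 1.
utilities = [{"a": 1}, {"a": 1}]
give_first = [{"a"}, set()]
give_second = [set(), {"a"}]
half = Fraction(1, 2)
lottery = [(half, give_first), (half, give_second)]
```

Neither deterministic allocation is envy-free, since the agent left empty-handed envies the other, but the fair coin flip over the two allocations is ex-ante envy-free because each agent expects value 1/2 from either bundle. Exact `Fraction` probabilities avoid floating-point ties in the comparison.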
-
Curriculum Learning for Graph Neural Networks: A Multiview Competence-based Approach
Authors:
Nidhi Vakil,
Hadi Amiri
Abstract:
A curriculum is a planned sequence of learning materials and an effective one can make learning efficient and effective for both humans and machines. Recent studies developed effective data-driven curriculum learning approaches for training graph neural networks in language applications. However, existing curriculum learning approaches often employ a single criterion of difficulty in their training paradigms. In this paper, we propose a new perspective on curriculum learning by introducing a novel approach that builds on graph complexity formalisms (as difficulty criteria) and model competence during training. The model consists of a scheduling scheme which derives effective curricula by accounting for different views of sample difficulty and model competence during training. The proposed solution advances existing research in curriculum learning for graph neural networks with the ability to incorporate a fine-grained spectrum of graph difficulty criteria in their training paradigms. Experimental results on real-world link prediction and node classification tasks illustrate the effectiveness of the proposed approach.
Submitted 17 July, 2023;
originally announced July 2023.
-
Fixed-Parameter Algorithms for Fair Hitting Set Problems
Authors:
Tanmay Inamdar,
Lawqueen Kanesh,
Madhumita Kundu,
Nidhi Purohit,
Saket Saurabh
Abstract:
Selection of a group of representatives satisfying certain fairness constraints is a commonly occurring scenario. Motivated by this, we initiate a systematic algorithmic study of a \emph{fair} version of \textsc{Hitting Set}. In the classical \textsc{Hitting Set} problem, the input is a universe $\mathcal{U}$, a family $\mathcal{F}$ of subsets of $\mathcal{U}$, and a non-negative integer $k$. The goal is to determine whether there exists a subset $S \subseteq \mathcal{U}$ of size $k$ that \emph{hits} (i.e., intersects) every set in $\mathcal{F}$. Inspired by several recent works, we formulate a fair version of this problem, as follows. The input additionally contains a family $\mathcal{B}$ of subsets of $\mathcal{U}$, where each subset in $\mathcal{B}$ can be thought of as the group of elements of the same \emph{type}. We want to find a set $S \subseteq \mathcal{U}$ of size $k$ that (i) hits all sets of $\mathcal{F}$, and (ii) does not contain \emph{too many} elements of each type. We call this problem \textsc{Fair Hitting Set}, and chart out its tractability boundary from both the classical and the multivariate perspectives. Our results use a multitude of techniques from parameterized complexity, ranging from classical to advanced tools, such as methods of representative sets for matroids, FO model checking, and a generalization of the best known kernels for \textsc{Hitting Set}.
Submitted 17 July, 2023;
originally announced July 2023.
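The problem statement in the abstract above can be made concrete with a brute-force sketch. This is not one of the paper's fixed-parameter algorithms; it simply enumerates all size-$k$ subsets. The abstract leaves "too many" unspecified, so the single uniform bound `cap` per group is an assumption of this example, as are the function and variable names.

```python
from itertools import combinations


def fair_hitting_set(universe, family, groups, k, cap):
    """Return a size-k set that intersects every set in `family` and contains
    at most `cap` elements of each group in `groups`, or None if none exists.

    Exhaustive search over all k-subsets: exponential time, for illustration only.
    """
    for candidate in combinations(sorted(universe), k):
        chosen = set(candidate)
        hits_all = all(chosen & s for s in family)             # condition (i)
        is_fair = all(len(chosen & b) <= cap for b in groups)  # condition (ii)
        if hits_all and is_fair:
            return chosen
    return None


universe = {1, 2, 3, 4}
family = [{1, 2}, {2, 3}, {3, 4}]
groups = [{2, 3}, {1, 4}]
```

With `k = 2` and `cap = 1` the search returns `{1, 3}`; forbidding the group `{2, 3}` entirely (`cap = 0`) makes the instance infeasible, since the set `{2, 3}` in the family can then no longer be hit.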
-
Absolute Concentration Robustness and Multistationarity in Reaction Networks: Conditions for Coexistence
Authors:
Nidhi Kaihnsa,
Tung Nguyen,
Anne Shiu
Abstract:
Many reaction networks arising in applications are multistationary, that is, they have the capacity for more than one steady state; while some networks exhibit absolute concentration robustness (ACR), which means that some species concentration is the same at all steady states. Both multistationarity and ACR are significant in biological settings, but only recently has attention focused on the possibility for these properties to coexist. Our main result states that such coexistence in at-most-bimolecular networks (which encompass most networks arising in biology) requires at least $3$ species, $5$ complexes, and $3$ reactions. We prove additional bounds on the number of reactions for general networks based on the number of linear conservation laws. Finally, we prove that, outside of a few exceptional cases, ACR is equivalent to non-multistationarity for bimolecular networks that are small (more precisely, one-dimensional or up to two species). Our proofs involve analyses of systems of sparse polynomials, and we also use classical results from chemical reaction network theory.
Submitted 9 July, 2023;
originally announced July 2023.
-
Spectroscopic study of Herbig Ae/Be stars in the Galactic Anti-center region from LAMOST DR5
Authors:
S. Nidhi,
Blesson Mathew,
B. Shridharan,
R. Arun,
R. Anusha,
Sreeja S. Kartha
Abstract:
We study a sample of 119 Herbig Ae/Be stars in the Galactic anti-center direction using the spectroscopic data from Large sky Area Multi-Object fiber Spectroscopic Telescope (LAMOST) survey program. Emission lines of hydrogen belonging to the Balmer and Paschen series, and metallic lines of species such as FeII, OI, CaII triplet are identified. A moderate correlation is observed between the emission strengths of H$α$ and FeII 5169 Å, suggesting a possible common emission region for FeII lines and one of the components of H$α$. We explored a technique for the extinction correction of the HAeBe stars using diffuse interstellar bands present in the spectrum. We estimated the stellar parameters such as age and mass of these HAeBe stars, which are found to be in the range 0.1 -- 10 Myr and 1.5 -- 10 $M_{\odot}$, respectively. We found that the mass accretion rate of the HAeBe stars in the Galactic anti-center direction follows the relation $\dot{M}_{acc}$ $\propto$ $M_{*}^{3.12^{+0.21}_{-0.34}}$, which is similar to the relation derived for HAeBe stars in other regions of the Galaxy. The mass accretion rate of HAeBe stars is found to have a functional form of $\dot{M}_{acc} \propto t^{-1.1 \pm 0.2}$ with age, in agreement with previous studies.
Submitted 6 July, 2023;
originally announced July 2023.
-
Oscillation results of higher order linear differential equation
Authors:
Nidhi Gahlian
Abstract:
We study the higher-order linear differential equation $y^{(k)}+A_1(z)y=0$ with $k\geq2$, where $A_1=A+h$, $A$ is a transcendental entire function of finite order with $\frac{1}{2}\leq μ(A)<1$, and $h\neq0$ is an entire function with $ρ(h)<μ(A)$. We show that if $f^{(k)}+A(z)f=0$ has a solution $f$ with $λ(f)<μ(A)$, then the exponent of convergence of the zeros of any nontrivial solution of $y^{(k)}+A_1(z)y=0$ is infinite.
Submitted 4 July, 2023;
originally announced July 2023.
-
Rejuvenating Star Formation Activity in an Early-type Dwarf Galaxy, LEDA 1915372, with Accreted HI Gas
Authors:
Sanjaya Paudel,
Suk-Jin Yoon,
Omkar Bait,
Chandreyee Sengupta,
Woong-Bae G. Zee,
Daya Nidhi Chhatkuli,
Binod Adhikari,
Binil Aryal
Abstract:
We report a rare astrophysical phenomenon, in which an early-type dwarf galaxy (dE), LEDA 1915372, is accreting gas from a nearby star-forming dwarf galaxy, MRK 0689, and is rejuvenating star-formation activity at the center. LEDA 1915372 and MRK 0689 have similar brightness of $M_{r}$ = $-$16.99 and $-$16.78 mag, respectively. They are located in a small group environment, separated by a sky-projected distance of 20.27 kpc (up to 70 kpc in three dimensions), and have a relative line-of-sight radial velocity of 6 km/s. The observation of 21 cm emission with the Giant Metrewave Radio Telescope provides strong evidence of interaction between the pair of dwarf galaxies in terms of neutral hydrogen (HI) morphology and kinematics. In particular, the HI map reveals that the two galaxies are clearly connected by a gas bridge, and the gas components of both LEDA 1915372 and MRK 0689 share a common direction of rotation. We also find that the HI emission peak deviates from LEDA 1915372 toward its optical blue plume, suggesting a tidal origin of ongoing central star formation. Our findings provide a new path to the formation of blue-cored dEs.
Submitted 26 June, 2023;
originally announced June 2023.
-
Reporting existing datasets for automatic epilepsy diagnosis and seizure detection
Authors:
Palak Handa,
Sakshi Tiwari,
Nidhi Goel
Abstract:
More than 50 million individuals are affected by epilepsy, a chronic neurological disorder characterized by unprovoked, recurring seizures and psychological symptoms. Researchers are working to automatically detect or predict epileptic episodes through Electroencephalography (EEG) signal analysis and machine and deep learning methods. Good-quality, open-source, and free EEG data act as a catalyst in this ongoing effort to manage the disease. This article presents 40+ publicly available EEG datasets for adult and pediatric human populations from 2001-2023. A comparative analysis and discussion of open and private EEG datasets is provided, based on objective parameters in this domain. Bonn and CHB-MIT remain the benchmark datasets used for the automatic detection of epileptic and seizure EEG signals. Meta-data has also been released for large EEG datasets like CHB-MIT. This article will be updated every year to report the progress and changing trends in the development of EEG datasets in this field.
Submitted 21 June, 2023;
originally announced June 2023.
-
Fifty Years of ISCA: A data-driven retrospective on key trends
Authors:
Gaurang Upasani,
Matthew D. Sinclair,
Adrian Sampson,
Parthasarathy Ranganathan,
David Patterson,
Shaan Shah,
Nidhi Parthasarathy,
Rutwik Jain
Abstract:
Computer Architecture, broadly, involves optimizing hardware and software for current and future processing systems. Although there are several other top venues to publish Computer Architecture research, including ASPLOS, HPCA, and MICRO, ISCA (the International Symposium on Computer Architecture) is one of the oldest, longest running, and most prestigious venues for publishing Computer Architecture research. Since 1973, except for 1975, ISCA has been organized annually. Accordingly, this year will be the 50th year of ISCA. Thus, we set out to analyze the past 50 years of ISCA to understand who and what has been driving and innovating computing systems thus far. Our analysis identifies several interesting trends that reflect how ISCA, and Computer Architecture in general, has grown and evolved in the past 50 years, including minicomputers, general-purpose uniprocessor CPUs, multiprocessor and multi-core CPUs, general-purpose GPUs, and accelerators.
Submitted 18 November, 2023; v1 submitted 6 June, 2023;
originally announced June 2023.
-
Systemic Risk and Vulnerability Analysis of Multi-cloud Environments
Authors:
Morgan Reece,
Theodore Edward Lander Jr.,
Matthew Stoffolano,
Andy Sampson,
Josiah Dykstra,
Sudip Mittal,
Nidhi Rastogi
Abstract:
With the increasing use of multi-cloud environments, security professionals face challenges in configuration, management, and integration due to uneven security capabilities and features among providers. As a result, a fragmented approach toward security has been observed, leading to new attack vectors and potential vulnerabilities. Other research has focused on single-cloud platforms or specific applications of multi-cloud environments. Therefore, there is a need for a holistic security and vulnerability assessment and defense strategy that applies to multi-cloud platforms. We perform a risk and vulnerability analysis to identify attack vectors from software, hardware, and the network, as well as interoperability security issues in multi-cloud environments. Applying the STRIDE and DREAD threat modeling methods, we present an analysis of the ecosystem across six attack vectors: cloud architecture, APIs, authentication, automation, management differences, and cybersecurity legislation. We quantitatively determine and rank the threats in multi-cloud environments and suggest mitigation strategies.
Submitted 7 July, 2023; v1 submitted 2 June, 2023;
originally announced June 2023.
-
Universal Mechanical Polycomputation in Granular Matter
Authors:
Atoosa Parsa,
Sven Witthaus,
Nidhi Pashine,
Corey S. O'Hern,
Rebecca Kramer-Bottiglio,
Josh Bongard
Abstract:
Unconventional computing devices are increasingly of interest as they can operate in environments hostile to silicon-based electronics, or compute in ways that traditional electronics cannot. Mechanical computers, wherein information processing is a material property emerging from the interaction of components with the environment, are one such class of devices. This information processing can be manifested in various physical substrates, one of which is granular matter. In a granular assembly, vibration can be treated as the information-bearing mode. This can be exploited to realize "polycomputing": materials can be evolved such that a single grain within them can report the result of multiple logical operations simultaneously at different frequencies, without recourse to quantum effects. Here, we demonstrate the evolution of a material in which one grain acts simultaneously as two different NAND gates at two different frequencies. NAND gates are of interest as any logical operations can be built from them. Moreover, they are nonlinear thus demonstrating a step toward general-purpose, computationally dense mechanical computers. Polycomputation was found to be distributed across each evolved material, suggesting the material's robustness. With recent advances in material sciences, hardware realization of these materials may eventually provide devices that challenge the computational density of traditional computers.
Submitted 28 May, 2023;
originally announced May 2023.
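The abstract above notes that NAND gates are of interest because any logical operation can be built from them. A minimal sketch of that standard construction follows; it models only Boolean logic, not the granular material or its vibrational frequencies, and the helper names are chosen for this example.

```python
def nand(a: bool, b: bool) -> bool:
    """The single primitive gate."""
    return not (a and b)


def not_(a: bool) -> bool:
    # NOT a == a NAND a
    return nand(a, a)


def and_(a: bool, b: bool) -> bool:
    # AND is the negation of NAND
    return not_(nand(a, b))


def or_(a: bool, b: bool) -> bool:
    # De Morgan: a OR b == (NOT a) NAND (NOT b)
    return nand(not_(a), not_(b))


def xor_(a: bool, b: bool) -> bool:
    # Classic four-NAND XOR construction
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))
```

Every gate here is a composition of `nand` alone, which is why a material that realizes NAND is a step toward general-purpose computation.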
-
Outage Performance and Novel Loss Function for an ML-Assisted Resource Allocation: An Exact Analytical Framework
Authors:
Nidhi Simmons,
David E Simmons,
Michel Daoud Yacoub
Abstract:
We introduce a novel loss function to minimize the outage probability of an ML-based resource allocation system. A single-user multi-resource greedy allocation strategy constitutes our application scenario, for which an ML binary classification predictor assists in selecting a resource satisfying the established outage criterion. While other resource allocation policies may be suitable, they are not the focus of our study. Instead, our primary emphasis is on theoretically developing this loss function and leveraging it to train an ML model to address the outage probability challenge. With no access to future channel state information, this predictor foresees each resource's likely future outage status. When the predictor encounters a resource it believes will be satisfactory, it allocates it to the user. Our main result establishes exact and asymptotic expressions for this system's outage probability. These expressions reveal that focusing solely on the optimization of the per-resource outage probability conditioned on the ML predictor recommending resource allocation (a strategy that appears to be most appropriate) may produce inadequate predictors that reject every resource. They also reveal that focusing on standard metrics, like precision, false-positive rate, or recall, may not produce optimal predictors. With our result, we formulate a theoretically optimal, differentiable loss function to train our predictor. We then compare predictors trained using this and traditional loss functions, namely binary cross-entropy (BCE), mean squared error (MSE), and mean absolute error (MAE). In all scenarios, predictors trained using our novel loss function provide superior outage probability performance. Moreover, in some cases, predictors trained with our loss function outperform those trained with BCE, MAE, and MSE by multiple orders of magnitude.
Submitted 18 November, 2023; v1 submitted 16 May, 2023;
originally announced May 2023.
-
Estimation of Stellar Parameters and Mass Accretion Rate of Classical T Tauri Stars from LAMOST DR6
Authors:
S. Nidhi,
Blesson Mathew,
B. Shridharan,
Suman Bhattacharyya,
Edwin Das,
Sreeja S Kartha
Abstract:
Classical T Tauri stars are low-mass pre-main sequence stars with an active circumstellar environment. In this work we present the identification and study of 260 Classical T Tauri stars using LAMOST Data Release 6, among which 104 stars are newly identified. We distinguish Classical T Tauri stars from giants and main-sequence dwarfs based on the log g values and the presence of the H(alpha) emission line and infrared excess that arises from the circumstellar accretion disk. We estimated the mass and age of 210 stars using the Gaia color-magnitude diagram. The ages range from 0.1 to 20 Myr, with 90% of the stars younger than 10 Myr, and the masses range from 0.11 to 1.9 M(solar). From the measured H(alpha) equivalent widths, we homogeneously estimated the mass accretion rates for 172 stars, with most values ranging from 10^-7 to 10^-10 M(solar) yr^-1. The mass accretion rates are found to follow a power law distribution with the mass of the star, having a relation of the form Macc proportional to M(star)^1.43 +/- 0.26, in agreement with previous studies.
Submitted 18 May, 2023; v1 submitted 16 May, 2023;
originally announced May 2023.
-
Dish detection in food platters: A framework for automated diet logging and nutrition management
Authors:
Mansi Goel,
Shashank Dargar,
Shounak Ghatak,
Nidhi Verma,
Pratik Chauhan,
Anushka Gupta,
Nikhila Vishnumolakala,
Hareesh Amuru,
Ekta Gambhir,
Ronak Chhajed,
Meenal Jain,
Astha Jain,
Samiksha Garg,
Nitesh Narwade,
Nikhilesh Verhwani,
Abhuday Tiwari,
Kirti Vashishtha,
Ganesh Bagler
Abstract:
Diet is central to the epidemic of lifestyle disorders. Accurate and effortless diet logging is one of the significant bottlenecks for effective diet management and calorie restriction. Dish detection from food platters is a challenging problem due to a visually complex food layout. We present an end-to-end computational framework for diet management, from data compilation, annotation, and state-of-the-art model identification to its mobile app implementation. As a case study, we implement the framework in the context of Indian food platters known for their complex presentation that poses a challenge for the automated detection of dishes. Starting with the 61 most popular Indian dishes, we identify the state-of-the-art model through a comparative analysis of deep-learning-based object detection architectures. Rooted in a meticulous compilation of 68,005 platter images with 134,814 manual dish annotations, we first compare ten architectures for multi-label classification to identify ResNet152 (mAP=84.51%) as the best model. YOLOv8x (mAP=87.70%) emerged as the best model architecture for dish detection among the eight deep-learning models implemented after a thorough performance evaluation. By comparing with the state-of-the-art model for the IndianFood10 dataset, we demonstrate the superior object detection performance of YOLOv8x for this subset and establish Resnet152 as the best architecture for multi-label classification. The models thus trained on richly annotated data can be extended to include dishes from across global cuisines. The proposed framework is demonstrated through a proof-of-concept mobile application with diverse applications for diet logging, food recommendation systems, nutritional interventions, and mitigation of lifestyle disorders.
Submitted 12 May, 2023;
originally announced May 2023.
-
Coronal Heating as Determined by the Solar Flare Frequency Distribution Obtained by Aggregating Case Studies
Authors:
James Paul Mason,
Alexandra Werth,
Colin G. West,
Allison A. Youngblood,
Donald L. Woodraska,
Courtney Peck,
Kevin Lacjak,
Florian G. Frick,
Moutamen Gabir,
Reema A. Alsinan,
Thomas Jacobsen,
Mohammad Alrubaie,
Kayla M. Chizmar,
Benjamin P. Lau,
Lizbeth Montoya Dominguez,
David Price,
Dylan R. Butler,
Connor J. Biron,
Nikita Feoktistov,
Kai Dewey,
N. E. Loomis,
Michal Bodzianowski,
Connor Kuybus,
Henry Dietrick,
Aubrey M. Wolfe
, et al. (977 additional authors not shown)
Abstract:
Flare frequency distributions represent a key approach to addressing one of the largest problems in solar and stellar physics: determining the mechanism that counter-intuitively heats coronae to temperatures that are orders of magnitude hotter than the corresponding photospheres. It is widely accepted that the magnetic field is responsible for the heating, but there are two competing mechanisms that could explain it: nanoflares or Alfvén waves. To date, neither can be directly observed. Nanoflares are, by definition, extremely small, but their aggregate energy release could represent a substantial heating mechanism, presuming they are sufficiently abundant. One way to test this presumption is via the flare frequency distribution, which describes how often flares of various energies occur. If the slope of the power law fitting the flare frequency distribution is above a critical threshold, $α=2$ as established in prior literature, then there should be a sufficient abundance of nanoflares to explain coronal heating. We performed $>$600 case studies of solar flares, made possible by an unprecedented number of data analysts via three semesters of an undergraduate physics laboratory course. This allowed us to include two crucial, but nontrivial, analysis methods: pre-flare baseline subtraction and computation of the flare energy, which requires determining flare start and stop times. We aggregated the results of these analyses into a statistical study to determine that $α= 1.63 \pm 0.03$. This is below the critical threshold, suggesting that Alfvén waves are an important driver of coronal heating.
Submitted 9 May, 2023;
originally announced May 2023.
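The role of the critical slope $α=2$ mentioned in the abstract above can be seen from a short worked calculation. This is the standard textbook argument, sketched under the assumption that the flare frequency distribution is a pure power law $dN/dE = A E^{-α}$ between a smallest and a largest flare energy; the symbols $A$, $E_{\min}$, and $E_{\max}$ are introduced here for illustration and do not appear in the abstract.

```latex
\begin{align}
  \frac{dN}{dE} &= A\,E^{-\alpha}, \\
  P_{\mathrm{total}} &= \int_{E_{\min}}^{E_{\max}} E\,\frac{dN}{dE}\,dE
     = \frac{A}{2-\alpha}\left(E_{\max}^{\,2-\alpha} - E_{\min}^{\,2-\alpha}\right),
     \qquad \alpha \neq 2 .
\end{align}
% For \alpha > 2 the exponent 2-\alpha is negative, so the E_min term dominates:
% the smallest events (nanoflares) carry most of the energy.
% For \alpha < 2 the E_max term dominates: the largest flares carry most of it.
```

With the reported $α = 1.63$, the exponent $2-α = 0.37$ is positive, so in this idealized picture the energy budget is dominated by the largest flares rather than by nanoflares, consistent with the conclusion drawn in the abstract.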
-
Synthetic DOmain-Targeted Augmentation (S-DOTA) Improves Model Generalization in Digital Pathology
Authors:
Sai Chowdary Gullapally,
Yibo Zhang,
Nitin Kumar Mittal,
Deeksha Kartik,
Sandhya Srinivasan,
Kevin Rose,
Daniel Shenker,
Dinkar Juyal,
Harshith Padigela,
Raymond Biju,
Victor Minden,
Chirag Maheshwari,
Marc Thibault,
Zvi Goldstein,
Luke Novak,
Nidhi Chandra,
Justin Lee,
Aaditya Prakash,
Chintan Shah,
John Abel,
Darren Fahy,
Amaro Taylor-Weiner,
Anand Sampat
Abstract:
Machine learning algorithms have the potential to improve patient outcomes in digital pathology. However, generalization of these tools is currently limited by sensitivity to variations in tissue preparation, staining procedures and scanning equipment that lead to domain shift in digitized slides. To overcome this limitation and improve model generalization, we studied the effectiveness of two Synthetic DOmain-Targeted Augmentation (S-DOTA) methods, namely CycleGAN-enabled Scanner Transform (ST) and targeted Stain Vector Augmentation (SVA), and compared them against the International Color Consortium (ICC) profile-based color calibration (ICC Cal) method and a baseline method using traditional brightness, color and noise augmentations. We evaluated the ability of these techniques to improve model generalization to various tasks and settings: four models, two model types (tissue segmentation and cell classification), two loss functions, six labs, six scanners, and three indications (hepatocellular carcinoma (HCC), nonalcoholic steatohepatitis (NASH), prostate adenocarcinoma). We compared these methods based on the macro-averaged F1 scores on in-distribution (ID) and out-of-distribution (OOD) test sets across multiple domains, and found that S-DOTA methods (i.e., ST and SVA) led to significant improvements over ICC Cal and baseline on OOD data while maintaining comparable performance on ID data. Thus, we demonstrate that S-DOTA may help address generalization due to domain shift in real world applications.
Submitted 3 May, 2023;
originally announced May 2023.
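The comparison in the abstract above relies on macro-averaged F1 scores. As a minimal sketch of that metric (not the authors' evaluation code), the function below computes per-class F1 and averages the classes with equal weight; the function name and the toy labels are assumptions of this example.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all labels that occur."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels, hypothetical and unrelated to the paper's datasets.
y_true = ["tumor", "tumor", "stroma", "stroma"]
y_pred = ["tumor", "stroma", "stroma", "stroma"]
```

Because each class contributes equally regardless of its frequency, macro averaging keeps rare tissue or cell classes from being swamped by common ones when in-distribution and out-of-distribution test sets are compared.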
-
Effect of trap states, ion migration and interfaces on carrier transport in single crystal, polycrystalline and thick film devices of halide perovskites CH$_3$NH$_3$PbX$_3$ (X= I, Br, Cl)
Authors:
Mohd Warish,
Gaurav Jamwal,
Zara Aftab,
Nidhi Bhatt,
Asad Niazi
Abstract:
The understanding of the mixed ionic-electronic nature of charge transport in Metal Halide Perovskites (MHPs) and the role of morphological and interface defects is crucial for improving the performance of MHP based photovoltaic devices. We present results of a parallel study on MAPbX$_3$ (X = I, Br and Cl), synthesized as solution processed polycrystalline powders, and as single crystals grown by a facile low temperature assisted technique. We have studied ionic-electronic charge transport in single-crystal and polycrystalline (pressed pellet and thick film) samples in order to compare the effect of defects and trap states associated with halide ion migration, device morphology and interfaces at grain boundaries as well as at electrodes. The mobility of halide ions and associated Coulomb capture of electrons or holes was determined by dielectric and space charge limited current (SCLC) dark I-V measurements, and also simulated using an ionic-electronic model. The defect capture cross section of electronic charge was found to be proportional to the simulated halide ion density Nion, which varied in the range $10^{16}$ cm$^{-3}$ - $10^{22}$ cm$^{-3}$ depending on the halide ion. The trap state density from I-V measurements, Ntrap ~ $10^9$ - $10^{10}$ cm$^{-3}$, was found to be lower than previous reports. Single crystal MAPbI3 devices exhibited a low capture cross section ($σ$ ~ $10^{-16}$ cm$^{-2}$), high mobility ($μ$ ~ 196 cm$^2$/V-s) and large diffusion length (LD ~ 6 $μ$m). The study shows that non-radiative energy loss and carrier trapping are suppressed and transport properties enhanced by reducing grain boundary effects, along with interface engineering to prevent halide ion accumulation at the electrodes.
Submitted 25 April, 2023;
originally announced April 2023.