Search | arXiv e-print repository

Accurately Classifying Out-Of-Distribution Data in Facial Recognition

Authors: Gianluca Barone, Aashrit Cunchala, Rudy Nunez

Abstract: Standard classification theory assumes that the distributions of images in the test and training sets are identical. Unfortunately, real-life scenarios typically feature unseen data ("out-of-distribution data") which is different from data in the training distribution ("in-distribution"). This issue is most prevalent in social justice problems where data from under-represented groups may appear in the test data without representing an equal proportion of the training data. This may result in a model returning confidently wrong decisions and predictions. We are interested in the following question: Can the performance of a neural network improve on facial images of out-of-distribution data when it is trained simultaneously on multiple datasets of in-distribution data? We approach this problem by incorporating the Outlier Exposure model and investigate how the model's performance changes when other datasets of facial images were implemented. We observe that the accuracy and other metrics of the model can be increased by applying Outlier Exposure, incorporating a trainable weight parameter to increase the machine's emphasis on outlier images, and by re-weighting the importance of different class labels. We also experimented with whether sorting the images and determining outliers via image features would have more of an effect on the metrics than sorting by average pixel value. Our goal was to make models not only more accurate but also more fair by scanning a more expanded range of images. We also tested the datasets in reverse order to see whether a more fair dataset with balanced features has an effect on the model's accuracy.

Submitted 24 June, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

Comments: 18 pages, 6 tables, 6 figures

arXiv:2402.14489 [pdf, other]

A Class of Topological Pseudodistances for Fast Comparison of Persistence Diagrams

Authors: Rolando Kindelan Nuñez, Mircea Petrache, Mauricio Cerda, Nancy Hitschfeld

Abstract: Persistence diagrams (PDs) play a central role in topological data analysis, and are used in an ever increasing variety of applications. The comparison of PD data requires computing comparison metrics among large sets of PDs, with metrics which are accurate, theoretically sound, and fast to compute. Especially for denser multi-dimensional PDs, such comparison metrics are lacking. While on the one hand, Wasserstein-type distances have high accuracy and theoretical guarantees, they incur high computational cost. On the other hand, distances between vectorizations such as Persistence Statistics (PSs) have lower computational cost, but lack the accuracy guarantees and in general they are not guaranteed to distinguish PDs (i.e. the two PS vectors of different PDs may be equal). In this work we introduce a class of pseudodistances called Extended Topological Pseudodistances (ETDs), which have tunable complexity, and can approximate Sliced and classical Wasserstein distances at the high-complexity extreme, while being computationally lighter and close to Persistence Statistics at the lower complexity extreme, and thus allow users to interpolate between the two metrics. We build theoretical comparisons to show how to fit our new distances at an intermediate level between persistence vectorizations and Wasserstein distances. We also experimentally verify that ETDs outperform PSs in terms of accuracy and outperform Wasserstein and Sliced Wasserstein distances in terms of computational complexity.
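The abstract's remark that summary vectors of different PDs may coincide can be illustrated with a toy sketch. The summary below (mean and standard deviation of lifetimes only) is a deliberately simplified stand-in chosen for illustration; it is an assumption, not the Persistence Statistics definition used in the paper, and the function name `lifetime_stats` is hypothetical.

```python
from statistics import mean, pstdev

def lifetime_stats(diagram):
    """Toy summary vector: (mean, population std) of lifetimes death - birth.

    Assumption: a simplified, lifetimes-only summary for illustration;
    the paper's Persistence Statistics may include further statistics.
    """
    lifetimes = [death - birth for birth, death in diagram]
    return (mean(lifetimes), pstdev(lifetimes))

# Two different diagrams, given as lists of (birth, death) pairs.
pd_a = [(0.0, 1.0), (2.0, 4.0)]
pd_b = [(5.0, 6.0), (7.0, 9.0)]

# The diagrams differ, yet their toy summary vectors are identical,
# so a distance between these vectors cannot tell them apart.
same_summary = lifetime_stats(pd_a) == lifetime_stats(pd_b)
different_diagrams = pd_a != pd_b
```

Both diagrams have lifetimes 1 and 2, so any summary built from lifetimes alone collapses them to the same vector, whereas a Wasserstein-type distance between the diagrams themselves would be nonzero.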

Submitted 22 February, 2024; originally announced February 2024.

Comments: Accepted for presentation and poster on the 38th Annual AAAI Conference on Artificial Intelligence (AAAI24)

MSC Class: 62R40; 55N31; 68T09; 62R07; 68T10 ACM Class: I.2; I.3.5; I.5.1; I.5.2; I.5.3; I.5.4

arXiv:2207.03936 [pdf, other]

doi 10.1016/j.jocs.2021.101422

Search by triplet: An efficient local track reconstruction algorithm for parallel architectures

Authors: Daniel Hugo Cámpora Pérez, Niko Neufeld, Agustín Riscos Núñez

Abstract: Millions of particles are collided every second at the LHCb detector placed inside the Large Hadron Collider at CERN. The particles produced as a result of these collisions pass through various detecting devices which will produce a combined raw data rate of up to 40 Tbps by 2021. These data will be fed through a data acquisition system which reconstructs individual particles and filters the collision events in real time. This process will occur in a heterogeneous farm employing exclusively off-the-shelf CPU and GPU hardware, in a two stage process known as High Level Trigger. The reconstruction of charged particle trajectories in physics detectors, also referred to as track reconstruction or tracking, determines the position, charge and momentum of particles as they pass through detectors. The Vertex Locator subdetector (VELO) is the closest such detector to the beamline, placed outside of the region where the LHCb magnet produces a sizable magnetic field. It is used to reconstruct straight particle trajectories which serve as seeds for reconstruction of other subdetectors and to locate collision vertices. The VELO subdetector will detect up to 1000 million particles every second, which need to be reconstructed in real time in the High Level Trigger. We present Search by triplet, an efficient track reconstruction algorithm. Our algorithm is designed to run efficiently across parallel architectures. We extend on previous work and explain the algorithm evolution since its inception. We show the scaling of our algorithm under various situations, and analyze its amortized time in terms of complexity for each of its constituent parts and profile its performance. Our algorithm is the current state-of-the-art in VELO track reconstruction on SIMT architectures, and we qualify its improvements over previous results.

Submitted 11 July, 2022; v1 submitted 8 July, 2022; originally announced July 2022.

Journal ref: Journal of Computational Science, Volume 54, 2021, 101422, ISSN 1877-7503

arXiv:2201.12398 [pdf]

AI for Chemical Space Gap Filling and Novel Compound Generation

Authors: Monee Y. McGrady, Sean M. Colby, Jamie R Nuñez, Ryan S. Renslow, Thomas O. Metz

Abstract: When considering large sets of molecules, it is helpful to place them in the context of a "chemical space" - a multidimensional space defined by a set of descriptors that can be used to visualize and analyze compound grouping as well as identify regions that might be void of valid structures. The chemical space of all possible molecules in a given biological or environmental sample can be vast and largely unexplored, mainly due to current limitations in processing of 'big data' by brute force methods (e.g., enumeration of all possible compounds in a space). Recent advances in artificial intelligence (AI) have led to multiple new cheminformatics tools that incorporate AI techniques to characterize and learn the structure and properties of molecules in order to generate plausible compounds, thereby contributing to more accessible and explorable regions of chemical space without the need for brute force methods. We have used one such tool, a deep-learning software called DarkChem, which learns a representation of the molecular structure of compounds by compressing them into a latent space. With DarkChem's design, distance in this latent space is often associated with compound similarity, making sparse regions interesting targets for compound generation due to the possibility of generating novel compounds. In this study, we used 1 million small molecules (less than 1000 Da) to create a representative chemical space (defined by calculated molecular properties) of all small molecules. We identified regions with few or no compounds and investigated their location in DarkChem's latent space. From these spaces, we generated 694,645 valid molecules, all of which represent molecules not found in any chemical database to date. These molecules filled 50.8% of the probed empty spaces in molecular property space. Generated molecules are provided in the supporting information.

Submitted 28 January, 2022; originally announced January 2022.

arXiv:2112.03466 [pdf]

DEIMoS: an open-source tool for processing high-dimensional mass spectrometry data

Authors: Sean M. Colby, Christine H. Chang, Jessica L. Bade, Jamie R. Nunez, Madison R. Blumer, Daniel J. Orton, Kent J. Bloodsworth, Ernesto S. Nakayasu, Richard D. Smith, Yehia M. Ibrahim, Ryan S. Renslow, Thomas O. Metz

Abstract: We present DEIMoS: Data Extraction for Integrated Multidimensional Spectrometry, a Python application programming interface (API) and command-line tool for high-dimensional mass spectrometry data analysis workflows that offers ease of development and access to efficient algorithmic implementations. Functionality includes feature detection, feature alignment, collision cross section (CCS) calibration, isotope detection, and MS/MS spectral deconvolution, with the output comprising detected features aligned across study samples and characterized by mass, CCS, tandem mass spectra, and isotopic signature. Notably, DEIMoS operates on N-dimensional data, largely agnostic to acquisition instrumentation; algorithm implementations simultaneously utilize all dimensions to (i) offer greater separation between features, thus improving detection sensitivity, (ii) increase alignment/feature matching confidence among datasets, and (iii) mitigate convolution artifacts in tandem mass spectra. We demonstrate DEIMoS with LC-IMS-MS/MS data to illustrate the advantages of a multidimensional approach in each data processing step.

Submitted 6 December, 2021; originally announced December 2021.

arXiv:2110.12552 [pdf, ps, other]

Noisy UGC Translation at the Character Level: Revisiting Open-Vocabulary Capabilities and Robustness of Char-Based Models

Authors: José Carlos Rosales Núñez, Guillaume Wisniewski, Djamé Seddah

Abstract: This work explores the capacities of character-based Neural Machine Translation to translate noisy User-Generated Content (UGC) with a strong focus on exploring the limits of such approaches to handle productive UGC phenomena, which, almost by definition, cannot be seen at training time. Within a strict zero-shot scenario, we first study the detrimental impact on translation performance of various user-generated content phenomena on a small annotated dataset we developed, and then show that such models are indeed incapable of handling unknown letters, which leads to catastrophic translation failure once such characters are encountered. We further confirm this behavior with a simple, yet insightful, copy task experiment and highlight the importance of reducing the vocabulary size hyper-parameter to increase the robustness of character-based models for machine translation.

Submitted 24 October, 2021; originally announced October 2021.

arXiv:2110.12551 [pdf, other]

Understanding the Impact of UGC Specificities on Translation Quality

Authors: José Carlos Rosales Núñez, Djamé Seddah, Guillaume Wisniewski

Abstract: This work takes a critical look at the evaluation of user-generated content automatic translation, the well-known specificities of which raise many challenges for MT. Our analyses show that measuring the average-case performance using a standard metric on a UGC test set falls far short of giving a reliable image of the UGC translation quality. That is why we introduce a new data set for the evaluation of UGC translation in which UGC specificities have been manually annotated using a fine-grained typology. Using this data set, we conduct several experiments to measure the impact of different kinds of UGC specificities on translation quality, more precisely than previously possible.

Submitted 24 October, 2021; originally announced October 2021.

arXiv:2109.09164 [pdf, other]

Optimal Ensemble Construction for Multi-Study Prediction with Applications to COVID-19 Excess Mortality Estimation

Authors: Gabriel Loewinger, Rolando Acosta Nunez, Rahul Mazumder, Giovanni Parmigiani

Abstract: It is increasingly common to encounter prediction tasks in the biomedical sciences for which multiple datasets are available for model training. Common approaches such as pooling datasets and applying standard statistical learning methods can result in poor out-of-study prediction performance when datasets are heterogeneous. Theoretical and applied work has shown $\textit{multi-study ensembling}$ to be a viable alternative that leverages the variability across datasets in a manner that promotes model generalizability. Multi-study ensembling uses a two-stage $\textit{stacking}$ strategy which fits study-specific models and estimates ensemble weights separately. This approach ignores, however, the ensemble properties at the model-fitting stage, potentially resulting in a loss of efficiency. We therefore propose $\textit{optimal ensemble construction}$, an $\textit{all-in-one}$ approach to multi-study stacking whereby we jointly estimate ensemble weights as well as parameters associated with each study-specific model. We prove that limiting cases of our approach yield existing methods such as multi-study stacking and pooling datasets before model fitting. We propose an efficient block coordinate descent algorithm to optimize the proposed loss function. We compare our approach to standard methods by applying it to a multi-country COVID-19 dataset for baseline mortality prediction. We show that when little data is available for a country before the onset of the pandemic, leveraging data from other countries can substantially improve prediction accuracy. Importantly, our approach outperforms multi-study stacking and other standard methods in this application. We further characterize the method's performance in simulations. Our method remains competitive with or outperforms multi-study stacking and other earlier methods across a range of between-study heterogeneity levels.

Submitted 2 October, 2021; v1 submitted 19 September, 2021; originally announced September 2021.

Comments: Manuscript: 27 pages, 6 figures, 4 tables; Supplement: 18 pages, 11 figures, 10 tables

arXiv:2109.05183 [pdf, other]

Following the spacial dynamics of COVID-19 in Mexico and some notes

Authors: Genaro J. Martínez, Magali Cárdenas Tapia, Ricardo Antonio Tena Núñez, Adriana de la Paz Sánchez Moreno

Abstract: After one year, it is recognized that the evolution of COVID-19 differs in each country or region around the world. In this paper, we review the evolution of COVID-19 in Mexico to date and identify the main epicenter and the states with the highest impact. Mexico has a particular geographical position in the American continent because it is a natural bridge between the USA and Latin America. This makes it a special point of propagation because, among other factors, the virus is transported by people of different nationalities migrating to the USA. The research in this paper helps to understand why Mexico is one of the countries with the highest mortality impact from this new virus and how the lockdown works in the population. Finally, we give a practical perspective on this evolution as a complex system.

Submitted 30 October, 2021; v1 submitted 11 September, 2021; originally announced September 2021.

Comments: 14 pages, 10 figures

arXiv:2009.08249 [pdf, ps, other]

Volumes of line bundles as limits on generically nonreduced schemes

Authors: Roberto Nunez

Abstract: The volume of a line bundle is defined in terms of a limsup. It is a fundamental question whether this limsup is a limit. It has been shown that this is always the case on generically reduced schemes. We show that volumes are limits in two classes of schemes that are not necessarily generically reduced: codimension one subschemes of projective varieties such that their components of maximal dimension contain normal points and projective schemes whose nilradical squared equals zero.

Submitted 9 February, 2021; v1 submitted 15 September, 2020; originally announced September 2020.

Comments: 21 pages. arXiv admin note: text overlap with arXiv:2007.12925

arXiv:2007.12925 [pdf, ps, other]

Volumes of Line Bundles on Schemes

Authors: Steven Dale Cutkosky, Roberto Nunez

Abstract: Volumes of line bundles are known to exist as limits on generically reduced projective schemes. However, it is not known if they always exist as limits on more general projective schemes. We show that they do always exist as a limit on a codimension one subscheme of a nonsingular projective variety.

Submitted 17 July, 2021; v1 submitted 25 July, 2020; originally announced July 2020.

Comments: 11 pages. Final version

MSC Class: 14C40; 14C17

arXiv:2003.14360 [pdf]

doi 10.1039/D0CP03620J

Application and Assessment of Deep Learning for the Generation of Potential NMDA Receptor Antagonists

Authors: Katherine J. Schultz, Sean M. Colby, Yasemin Yesiltepe, Jamie R. Nuñez, Monee Y. McGrady, Ryan R. Renslow

Abstract: Uncompetitive antagonists of the N-methyl D-aspartate receptor (NMDAR) have demonstrated therapeutic benefit in the treatment of neurological diseases such as Parkinson's and Alzheimer's, but some also cause dissociative effects that have led to the synthesis of illicit drugs. The ability to generate NMDAR antagonists in silico is therefore desirable both for new medication development and for preempting and identifying new designer drugs. Recently, generative deep learning models have been applied to de novo drug design as a means to expand the amount of chemical space that can be explored for potential drug-like compounds. In this study, we assess the application of a generative model to the NMDAR to achieve two primary objectives: (i) the creation and release of a comprehensive library of experimentally validated NMDAR phencyclidine (PCP) site antagonists to assist the drug discovery community and (ii) an analysis of both the advantages conferred by applying such generative artificial intelligence models to drug design and the current limitations of the approach. We apply, and provide source code for, a variety of ligand- and structure-based assessment techniques used in standard drug discovery analyses to the deep learning-generated compounds. We present twelve candidate antagonists that are not available in existing chemical databases to provide an example of what this type of workflow can achieve, though synthesis and experimental validation of these compounds is still required.

Submitted 31 March, 2020; originally announced March 2020.

arXiv:1905.08411 [pdf]

Deep learning to generate in silico chemical property libraries and candidate molecules for small molecule identification in complex samples

Authors: Sean M. Colby, Jamie R. Nuñez, Nathan O. Hodas, Courtney D. Corley, Ryan R. Renslow

Abstract: Comprehensive and unambiguous identification of small molecules in complex samples will revolutionize our understanding of the role of metabolites in biological systems. Existing and emerging technologies have enabled measurement of chemical properties of molecules in complex mixtures and, in concert, are sensitive enough to resolve even stereoisomers. Despite these experimental advances, small molecule identification is inhibited by (i) chemical reference libraries representing <1% of known molecules, limiting the number of possible identifications, and (ii) the lack of a method to generate candidate matches directly from experimental features (i.e. without a library). To this end, we developed a variational autoencoder (VAE) to learn a continuous numerical, or latent, representation of molecular structure to expand reference libraries for small molecule identification. We extended the VAE to include a chemical property decoder, trained as a multitask network, in order to shape the latent representation such that it assembles according to desired chemical properties. The approach is unique in its application to small molecule identification, with its focus on m/z and CCS, paired with its training paradigm, which involved a cascade of transfer learning iterations. This allows the network to learn as much as possible at each stage, enabling success with progressively smaller datasets without overfitting. Once trained, the network can rapidly predict chemical properties directly from structure, as well as generate candidate structures with desired chemical properties. Additionally, the ability to generate novel molecules along manifolds, defined by chemical property analogues, positions DarkChem as highly useful in a number of application areas, including metabolomics and small molecule identification, drug discovery and design, chemical forensics, and beyond.

Submitted 20 May, 2019; originally announced May 2019.

arXiv:1810.07367 [pdf]

Advancing Standards-Free Methods for the Identification of Small Molecules in Complex Samples

Authors: Jamie R. Nuñez, Sean M. Colby, Dennis G. Thomas, Malak M. Tfaily, Nikola Tolic, Elin M. Ulrich, Jon R. Sobus, Thomas O. Metz, Justin G. Teeguarden, Ryan S. Renslow

Abstract: The current gold standard for unambiguous identification in metabolomics analysis is based on comparing two or more orthogonal properties from the analysis of authentic, pure reference materials (standards) to experimental data acquired in the same laboratory with the same analytical methods. This represents a significant limitation for comprehensive chemical identification of small molecules in complex samples since this process is time-consuming and costly, and the majority of molecules are not yet represented by standards, leading to a need for standards-free identification. To address this need, we are advancing chemical property calculations and developing multi-attribute scoring and matching algorithms to utilize data from multiple analytical platforms through the utilization and creation of the in silico Chemical Library Engine (ISiCLE) and the Multi-Attribute Matching Engine (MAME). Here, we describe our results in a blinded analysis of synthetic chemical mixtures as part of the U.S. Environmental Protection Agency's (EPA) Non-Targeted Analysis Collaborative Trial (ENTACT). The blinded false negative rate (FNR), false discovery rate (FDR), and accuracy were 57%, 77%, and 91%, respectively. For high confidence identifications, the FDR was 35%. After unblinding of the sample compositions, we improved our approach by optimizing the scoring parameters used to increase confidence. The final FNR, FDR, and accuracy were 67%, 53%, and 96%, respectively. For high confidence identifications, the FDR was 10%. This study demonstrates that standards-free small molecule identification and multi-attribute matching methods can significantly reduce reliance on standards.
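The abstract reports a high accuracy alongside high FNR and FDR values. A minimal sketch of how these three metrics relate, assuming the standard confusion-matrix definitions (the paper's exact definitions are not given in the abstract), shows how this can happen when true negatives dominate. The counts below are hypothetical and are not the study's data.

```python
def identification_metrics(tp, fp, tn, fn):
    """Return (FNR, FDR, accuracy) from confusion-matrix counts.

    Assumed standard definitions:
      FNR      = FN / (FN + TP)   fraction of present compounds that were missed
      FDR      = FP / (FP + TP)   fraction of reported identifications that are wrong
      accuracy = (TP + TN) / all  fraction of all calls that are correct
    """
    fnr = fn / (fn + tp)
    fdr = fp / (fp + tp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return fnr, fdr, accuracy

# Hypothetical counts: most candidate compounds are truly absent,
# so true negatives dominate the total.
fnr, fdr, acc = identification_metrics(tp=10, fp=30, tn=950, fn=10)
```

With these illustrative counts the FNR is 0.5 and the FDR is 0.75, yet accuracy is 0.96, because the many correctly rejected candidates outweigh the errors among the few positive calls.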

Submitted 16 October, 2018; originally announced October 2018.

arXiv:1809.08378 [pdf]

ISiCLE: A molecular collision cross section calculation pipeline for establishing large in silico reference libraries for compound identification

Authors: Sean M. Colby, Dennis G. Thomas, Jamie R. Nunez, Douglas J. Baxter, Kurt R. Glaesemann, Joseph M. Brown, Meg A Pirrung, Niranjan Govind, Justin G. Teeguarden, Thomas O. Metz, Ryan S. Renslow

Abstract: Comprehensive and confident identifications of metabolites and other chemicals in complex samples will revolutionize our understanding of the role these chemically diverse molecules play in biological systems. Despite recent advances, metabolomics studies still result in the detection of a disproportionate number of features that cannot be confidently assigned to a chemical structure. This inadequacy is driven by the single most significant limitation in metabolomics: the reliance on reference libraries constructed by analysis of authentic reference chemicals. To this end, we have developed the in silico chemical library engine (ISiCLE), a high-performance computing-friendly cheminformatics workflow for generating libraries of chemical properties. In the instantiation described here, we predict probable three-dimensional molecular conformers using chemical identifiers as input, from which collision cross sections (CCS) are derived. The approach employs state-of-the-art first-principles simulation, distinguished by use of molecular dynamics, quantum chemistry, and ion mobility calculations to generate structures and libraries, all without training data. Importantly, optimization of ISiCLE included a refactoring of the popular MOBCAL code for trajectory-based mobility calculations, improving its computational efficiency by over two orders of magnitude. Calculated CCS values were validated against 1,983 experimentally-measured CCS values and compared to previously reported CCS calculation approaches. An online database is introduced for sharing both calculated and experimental CCS values (metabolomics.pnnl.gov), initially including a CCS library with over 1 million entries. Finally, three successful applications of molecule characterization using calculated CCS are described. This work represents a promising method to address the limitations of small molecule identification.

Submitted 21 September, 2018; originally announced September 2018.

arXiv:1712.01662 [pdf]

doi 10.1371/journal.pone.0199239

Optimizing colormaps with consideration for color vision deficiency to enable accurate interpretation of scientific data

Authors: Jamie R. Nuñez, Christopher R. Anderton, Ryan S. Renslow

Abstract: Color vision deficiency (CVD) affects more than 4% of the population and leads to a different visual perception of colors. Though this has been known for decades, colormaps with many colors across the visual spectra are often used to represent data, leading to the potential for misinterpretation or difficulty with interpretation by someone with this deficiency. Until the creation of the module presented here, there were no colormaps mathematically optimized for CVD using modern color appearance models. While there have been some attempts to make aesthetically pleasing or subjectively tolerable colormaps for those with CVD, our goal was to make optimized colormaps for the most accurate perception of scientific data by as many viewers as possible. We developed a Python module, cmaputil, to create CVD-optimized colormaps, which imports colormaps and modifies them to be perceptually uniform in CVD-safe colorspace while linearizing and maximizing the brightness range. The module is made available to the science community to enable others to easily create their own CVD-optimized colormaps. Here, we present an example CVD-optimized colormap created with this module that is optimized for viewing by those without a CVD as well as those with red-green colorblindness. This colormap, cividis, enables nearly-identical visual-data interpretation to both groups, is perceptually uniform in hue and brightness, and increases in brightness linearly.

Submitted 1 August, 2018; v1 submitted 29 November, 2017; originally announced December 2017.

Journal ref: J. R. Nuñez, C. R. Anderton, and R. S. Renslow, "Optimizing colormaps with consideration for color vision deficiency to enable accurate interpretation of scientific data," PLOS ONE, 2018. 13(7): p. e0199239

arXiv:1706.04644 [pdf, ps, other]

doi 10.2140/pjm.2018.297.67

A characterization of round spheres in space forms

Authors: Francisco Fontenele, Roberto Alonso Núñez

Abstract: Let $\mathbb Q^{n+1}_c$ be the complete simply-connected $(n+1)$-dimensional space form of curvature $c$. In this paper we obtain a new characterization of geodesic spheres in $\mathbb Q^{n+1}_c$ in terms of the higher order mean curvatures. In particular, we prove that the geodesic sphere is the only complete bounded immersed hypersurface in $\mathbb Q^{n+1}_c,\;c\leq 0,$ with constant mean curvature and constant scalar curvature. The proof relies on the well known Omori-Yau maximum principle, a formula of Walter for the Laplacian of the $r$-th mean curvature of a hypersurface in a space form, and a classical inequality of Gårding for hyperbolic polynomials.

Submitted 14 June, 2017; originally announced June 2017.

Journal ref: Pacific J. Math. 297 (2018) 67-78

arXiv:1606.00806 [pdf, ps, other]

On complete hypersurfaces with constant mean and scalar curvatures in Euclidean spaces

Authors: Roberto Alonso Núñez

Abstract: Generalizing a theorem of Huang, Cheng and Wan classified the complete hypersurfaces of $\mathbb R^4$ with non-zero constant mean curvature and constant scalar curvature. In our work, we obtain results of this nature in higher dimensions. In particular, we prove that if a complete hypersurface of $\mathbb R^5$ has constant mean curvature $H\neq 0$ and constant scalar curvature $R\geq\frac{2}{3}H^2$, then $R=H^2$, $R=\frac{8}{9}H^2$ or $R=\frac{2}{3}H^2$. Moreover, we characterize the hypersurface in the cases $R=H^2$ and $R=\frac{8}{9}H^2$, and provide an example in the case $R=\frac{2}{3}H^2$. The proofs are based on the principal curvature theorem of Smyth-Xavier and a well known formula for the Laplacian of the squared norm of the second fundamental form of a hypersurface in a space form.

Submitted 2 June, 2016; originally announced June 2016.

arXiv:1503.06276 [pdf, other]

doi 10.1088/0004-637X/805/2/156

PSR J1930-1852: a pulsar in the widest known orbit around another neutron star

Authors: J. K. Swiggum, R. Rosen, M. A. McLaughlin, D. R. Lorimer, S. Heatherly, R. Lynch, S. Scoles, T. Hockett, E. Filik, J. A. Marlowe, B. N. Barlow, M. Weaver, M. Hilzendeger, S. Ernst, R. Crowley, E. Stone, B. Miller, R. Nunez, G. Trevino, M. Doehler, A. Cramer, D. Yencsik, J. Thorley, R. Andrews, A. Laws , et al. (11 additional authors not shown)

Abstract: In the summer of 2012, during a Pulsar Search Collaboratory workshop, two high-school students discovered J1930$-$1852, a pulsar in a double neutron star (DNS) system. Most DNS systems are characterized by short orbital periods, rapid spin periods and eccentric orbits. However, J1930$-$1852 has the longest spin period ($P_{\rm spin}\sim$185 ms) and orbital period ($P_{\rm b}\sim$45 days) yet measured among known, recycled pulsars in DNS systems, implying a shorter than average and/or inefficient recycling period before its companion went supernova. We measure the relativistic advance of periastron for J1930$-$1852, $\dotω=0.00078$(4) deg/yr, which implies a total mass (M$_{\rm{tot}}=2.59$(4) M$_{\odot}$) consistent with other DNS systems. The $2σ$ constraints on M$_{\rm{tot}}$ place limits on the pulsar and companion masses ($m_{\rm p}<1.32$ M$_{\odot}$ and $m_{\rm c}>1.30$ M$_{\odot}$ respectively). J1930$-$1852's spin and orbital parameters challenge current DNS population models and make J1930$-$1852 an important system for further investigation.

Submitted 21 March, 2015; originally announced March 2015.

Comments: 8 pages, 6 figures

arXiv:1111.1116 [pdf, ps, other]

Some Remarks on a Generalized Vector Product

Authors: Primitivo B. Acosta-Humánez, Moisés Aranda, Reinaldo Núñez

Abstract: In this paper we use a generalized vector product to construct an exterior form $\wedge :(\mathbb{R}^{n}) ^{k}\to \mathbb{R}^{\binom{n}{k}}$, where $\binom{n}{k}=\frac{n!}{(n-k)!k!}$, $k\leq n$. Finally, for $n=k-1$ we introduce the reversing operation to study this generalized vector product over palindromic and antipalindromic vectors.

Submitted 14 March, 2012; v1 submitted 3 November, 2011; originally announced November 2011.

Comments: 10 pages, 14 pages in the published version: Revista Integración

MSC Class: 15A75; 15A72

arXiv:cond-mat/9710047 [pdf, ps, other]

The Two - Dimensional Attractive Hubbard Model: Highly Non-Linear Superconductivity With Sum Rules

Authors: J. J. Rodriguez - Nunez, C. E. Cordeiro, A. Delfino

Abstract: We use the moment approach of Nolting (exact sum rules) (Z. Physik 255, 25 (1972)) for the attractive Hubbard model in the superconducting phase. Our diagonal and off - diagonal spectral functions are constructed and evaluated with the sum rules. They reduce to the $BCS$ limit for weak interaction. However, the presence of correlations modifies the $BCS$ picture dramatically. For example, due to the presence of correlations we have postulated a three - pole ansatz for the diagonal Green function, $G(\vec{k},ω)$, while the off - diagonal one, $B(\vec{k},ω)$, is supposed to have two poles. In the paper we present results for the three spectral weights of the diagonal Green function, $α_j(\vec{k})$, j = 1,2,3. Our results compare reasonably well with more elaborated auto - consistent highly non - linear equations (double fluctuation calculations in the $T-$ Matrix approach of one of the authors). Then, the physical picture which emerges is that the lower Hubbard band is split due to the superconducting gap and the upper Hubbard band remains mostly unmodified.

Submitted 4 October, 1997; originally announced October 1997.

Comments: 4 pages, three figures in ps (windows95). To appear in Physica A. Presented at LAWNP'97

Showing 1–21 of 21 results for author: Nunez, R