-
Fostering the integration of European Open Data into Data Spaces through High-Quality Metadata
Authors:
Javier Conde,
Alejandro Pozo,
Andrés Munoz-Arcentales,
Johnny Choque,
Álvaro Alonso
Abstract:
The term Data Space, understood as the secure exchange of data in distributed systems, ensuring openness, transparency, decentralization, sovereignty, and interoperability of information, has gained importance during the last years. However, Data Spaces are in an initial phase of definition, and new research is necessary to address their requirements. The Open Data ecosystem can be understood as one of the precursors of Data Spaces as it provides mechanisms to ensure the interoperability of information through resource discovery, information exchange, and aggregation via metadata. However, Data Spaces require more advanced capabilities including the automatic and scalable generation and publication of high-quality metadata. In this work, we present a set of software tools that facilitate the automatic generation and publication of metadata, the modeling of datasets through standards, and the assessment of the quality of the generated metadata. We validate all these tools through the YODA Open Data Portal showing how they can be connected to integrate Open Data into Data Spaces.
Submitted 8 February, 2024;
originally announced February 2024.
-
Galaxy formation with Wave/Fuzzy Dark Matter: The core-halo structure
Authors:
Alvaro Pozo,
Razieh Emami,
Philip Mocz,
Tom Broadhurst,
Lars Hernquist,
Mark Vogelsberger,
Randall Smith,
Grant Tremblay,
Ramesh Narayan,
James Steiner,
Josh Grindlay,
George Smoot
Abstract:
Dark matter-dominated cores have long been claimed for the well-studied local group dwarf galaxies. More recently, extended stellar halos have been uncovered around several of these dwarfs through deeper imaging and spectroscopy. Such core-halo structures are not a feature of conventional cold dark matter (CDM), based on collisionless particles where smooth, scale-free profiles are predicted. In contrast, smooth and prominent dark matter cores are predicted for Warm and Fuzzy/Wave Dark Matter (WDM/$ψ$DM) respectively. The question arises to what extent the visible stellar profiles should reflect this dark matter core structure. Here we compare cosmological hydrodynamical simulations of CDM, WDM $\&$ $ψ$DM, aiming to predict the stellar profiles for these three DM scenarios. We show that cores surrounded by extended halos are distinguishable for WDM and $ψ$DM, with the most prominent cores in the case of $ψ$DM, where the stellar density is enhanced in the core due to the presence of the relatively dense soliton. Our analysis demonstrates that such behavior does not appear in CDM, implying that the small-scale cut-off in the power spectrum present for WDM and $ψ$DM provides a core-halo transition. Consequently, we estimate the mass of the $ψ$DM particle at this core-halo transition point. Furthermore, we observe the anticipated asymmetry for $ψ$DM due to the soliton's random walk, a distinctive characteristic not found in the symmetric distributions of stars in Warm and CDM models.
Submitted 18 October, 2023;
originally announced October 2023.
-
ALBERTI, a Multilingual Domain Specific Language Model for Poetry Analysis
Authors:
Javier de la Rosa,
Álvaro Pérez Pozo,
Salvador Ros,
Elena González-Blanco
Abstract:
The computational analysis of poetry is limited by the scarcity of tools to automatically analyze and scan poems. In a multilingual setting, the problem is exacerbated as scansion and rhyme systems only exist for individual languages, making comparative studies very challenging and time-consuming. In this work, we present \textsc{Alberti}, the first multilingual pre-trained large language model for poetry. Through domain-specific pre-training (DSP), we further trained multilingual BERT on a corpus of over 12 million verses from 12 languages. We evaluated its performance on two structural poetry tasks: Spanish stanza type classification, and metrical pattern prediction for Spanish, English and German. In both cases, \textsc{Alberti} outperforms multilingual BERT and other transformer-based models of similar sizes, and even achieves state-of-the-art results for German when compared to rule-based systems, demonstrating the feasibility and effectiveness of DSP in the poetry domain.
Submitted 3 July, 2023;
originally announced July 2023.
-
Dwarf Galaxies United by Dark Bosons
Authors:
Alvaro Pozo,
Tom Broadhurst,
George F. Smoot,
Tzihong Chiueh,
Hoang Nhan Luu,
Mark Vogelsberger,
Philip Mocz
Abstract:
Low mass galaxies in the Local Group are dominated by dark matter and comprise the well studied ``dwarf Spheroidal'' (dSph) class, with typical masses of $10^{9-10}M_\odot$ and also the equally numerous ``ultra faint dwarfs'' (UFD), discovered recently, that are distinctly smaller and denser with masses of only $10^{7-8}M_\odot$. This bimodality amongst low mass galaxies contrasts with the scale free continuity expected for galaxies formed under gravity, as in the standard Cold Dark Matter (CDM) model for heavy particles. Within each dwarf class we find the core radius $R_c$ is inversely related to velocity dispersion $σ$, quite the opposite of standard expectations, but indicative of dark matter in a Bose-Einstein state, where the Uncertainty Principle requires that $R_c \times σ$ is fixed by Planck's constant, $h$. The corresponding boson mass, $m_b=h/R_c σ$, differs by one order of magnitude between the UFD and dSph classes, with $10^{-21.4}$eV and $10^{-20.3}$eV respectively. The presence of two boson species is reinforced by parallel relations seen between the central density and radius of UFD and dSph dwarfs respectively, each matching the steep prediction, $ρ_c \propto R_c^{-4}$, for soliton cores in the ground state. Furthermore, soliton cores accurately fit the stellar profiles of UFD and dSph dwarfs where prominent, dense cores appear surrounded by low density halos, as predicted by our simulations. Multiple bosons may point to a String Theory interpretation for dark matter, where a discrete mass spectrum of axions is generically predicted to span many decades in mass, offering a unifying "Axiverse" interpretation for the observed "diversity" of dark matter dominated dwarf galaxies.
Submitted 19 March, 2024; v1 submitted 31 January, 2023;
originally announced February 2023.
-
A Three-Parameter Elliptic Double-Box
Authors:
Alex Chaparro Pozo,
Matt von Hippel
Abstract:
We express a toy model of the ten-point elliptic double-box, first characterized in arXiv:1712.02785, in terms of elliptic polylogarithms. This toy model corresponds to a particular unphysical limit of the elliptic double-box in which it depends on only three dual conformal cross-ratios. While the diagram is fully permutation symmetric in the cross-ratios in this limit, this property is not manifest in either of the two elliptic polylogarithm formalisms we use to express it. We observe that the function is a pure elliptic polylogarithm, which is the result of nontrivial identities between elliptic integrals depending on the conformal cross-ratios.
Submitted 8 September, 2022;
originally announced September 2022.
-
Portuguese Man-of-War Image Classification with Convolutional Neural Networks
Authors:
Alessandra Carneiro,
Lorena Nascimento,
Mauricio Noernberg,
Carmem Hara,
Aurora Pozo
Abstract:
Portuguese man-of-war (PMW) is a gelatinous organism with long tentacles capable of causing severe burns, thus leading to negative impacts on human activities, such as tourism and fishing. There is a lack of information about the spatio-temporal dynamics of this species. Therefore, the use of alternative methods for collecting data can contribute to its monitoring. Given the widespread use of social networks and the eye-catching look of PMW, Instagram posts can be a promising data source for monitoring. The first task in this approach is to identify posts that refer to PMW. This paper reports on the use of convolutional neural networks for PMW image classification, in order to automate the recognition of Instagram posts. We created a suitable dataset, and trained three different neural networks: VGG-16, ResNet50, and InceptionV3, with and without a pre-training step on the ImageNet dataset. We analyzed their results using accuracy, precision, recall, and F1 score metrics. The pre-trained ResNet50 network presented the best results, obtaining 94% accuracy and 95% precision, recall, and F1 score. These results show that convolutional neural networks can be very effective for recognizing PMW images from Instagram.
Submitted 3 July, 2022;
originally announced July 2022.
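The four metrics named in the abstract above can all be derived from a binary confusion matrix. The following is a minimal sketch in Python; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the paper's experiments.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives for the "is a PMW image" class (hypothetical labels).
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Illustrative counts only; these are not the paper's results.
    print(classification_metrics(tp=95, fp=5, fn=5, tn=95))
```

Reporting precision and recall alongside accuracy matters when the positive class is rare, as posts showing PMW are likely to be among general Instagram content.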
-
An Analysis of the Admissibility of the Objective Functions Applied in Evolutionary Multi-objective Clustering
Authors:
Cristina Y. Morimoto,
Aurora Pozo,
Marcílio C. P. de Souto
Abstract:
A variety of clustering criteria has been applied as an objective function in Evolutionary Multi-Objective Clustering approaches (EMOCs). However, most EMOCs do not provide detailed analysis regarding the choice and usage of the objective functions. Aiming to support a better choice and definition of the objectives in the EMOCs, this paper proposes an analysis of the admissibility of the clustering criteria in evolutionary optimization by examining the search direction and its potential in finding optimal results. As a result, we demonstrate how the admissibility of the objective functions can influence the optimization. Furthermore, we provide insights regarding the combinations and usage of the clustering criteria in the EMOCs.
Submitted 19 June, 2022;
originally announced June 2022.
-
Understanding the "Feeble Giant" Crater II with tidally stretched Wave Dark Matter
Authors:
A. Pozo,
T. Broadhurst,
R. Emami,
G. Smoot
Abstract:
The unusually large "dwarf" galaxy Crater II, with its small velocity dispersion, $\simeq 3$ km/s, defies expectations that low mass galaxies should be small and dense. We combine the latest stellar and velocity dispersion profiles finding Crater II has a prominent dark core of radius $\simeq 0.71^{+0.09}_{-0.08}$ kpc, surrounded by a low density halo, with a transition visible between the core and the halo. We show that this profile matches the distinctive core-halo profile predicted by "Wave Dark Matter" as a Bose-Einstein condensate, $ψ$DM, where the ground state soliton core is surrounded by a tenuous halo of interfering waves, with a marked density transition predicted between the core and halo. Similar core-halo structure is seen in most dwarf spheroidal galaxies (dSph), but with smaller cores, $\simeq 0.25$ kpc and higher velocity dispersions, $\simeq 9$km/s, and we argue here that Crater II may have been a typical dSph that has lost most of its halo mass to tidal stripping, so its velocity dispersion is lower by a factor of 3 and the soliton is wider by a factor of 3, following the inverse scaling required by the Uncertainty Principle. This tidal solution for Crater II in the context of $ψ$DM, is supported by its small pericenter of $\simeq 20$ kpc established by Gaia, implying significant tidal stripping of Crater II by the Milky Way is expected.
Submitted 1 July, 2022; v1 submitted 13 December, 2021;
originally announced December 2021.
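The inverse scaling invoked in the abstract above can be checked with the quoted numbers. This is only an arithmetic sketch: it assumes the relation $R_c \times σ \simeq h/m_b$ stated in the "Dwarf Galaxies United by Dark Bosons" entry of this listing, with a single boson mass for both objects, and uses the rounded central values from the abstract.

```latex
% Assumption: for a fixed boson mass m_b, the product R_c * sigma is constant.
\begin{align*}
R_c\,\sigma &\simeq \frac{h}{m_b} = \text{const.} \\
\text{typical dSph:}\quad 0.25~\mathrm{kpc} \times 9~\mathrm{km/s} &\simeq 2.3~\mathrm{kpc\,km/s} \\
\text{Crater II:}\quad 0.71~\mathrm{kpc} \times 3~\mathrm{km/s} &\simeq 2.1~\mathrm{kpc\,km/s}
\end{align*}
```

The two products agree to within about 5%, consistent with a dispersion lower by a factor of 3 being matched by a core wider by roughly the same factor.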
-
A Review of Evolutionary Multi-objective Clustering Approaches
Authors:
Cristina Y. Morimoto,
Aurora Pozo,
Marcílio C. P. de Souto
Abstract:
Evolutionary multi-objective clustering (EMOC), a modern clustering technique, has been widely applied to extract patterns, allowing us to analyze different aspects of complex data by considering multiple criteria. In this article, we present an analysis of the advances in EMOC studies and provide a profile of this study field by considering an extensive mapping of the literature to identify the main methods and concepts that have been adopted to design the EMOC approaches. This review provides a comprehensive view of the EMOC studies that supports newcomers or busy researchers in understanding the general features of the existing algorithms and guides the generation of new approaches. For that, we introduce a general architecture of EMOC to describe the main elements applied in designing EMOC algorithms and we correlate them with the main features found in the literature. Also, we categorized the EMOC algorithms based on shared characteristics that highlight the main features or application fields. The paper ends by addressing some potential subjects for future research.
Submitted 1 April, 2022; v1 submitted 15 October, 2021;
originally announced October 2021.
-
Multi-objective Clustering: A Data-driven Analysis of MOCLE, MOCK and $Δ$-MOCK
Authors:
Adriano Kultzak,
Cristina Y. Morimoto,
Aurora Pozo,
Marcílio C. P. de Souto
Abstract:
We present a data-driven analysis of MOCK, $Δ$-MOCK, and MOCLE. These are three closely related approaches that use multi-objective optimization for crisp clustering. More specifically, based on a collection of 12 datasets presenting different properties, we investigate the performance of MOCLE and MOCK compared to the recently proposed $Δ$-MOCK. Besides performing a quantitative analysis identifying which method performs well or poorly with respect to another, we also conduct a more detailed analysis of why such behavior occurs. Indeed, the results of our analysis provide useful insights into the strengths and weaknesses of the methods investigated.
Submitted 23 October, 2021; v1 submitted 14 October, 2021;
originally announced October 2021.
-
Network Flow based approaches for the Pipelines Routing Problem in Naval Design
Authors:
Víctor Blanco,
Gabriel González,
Yolanda Hinojosa,
Diego Ponce,
Miguel A. Pozo,
Justo Puerto
Abstract:
In this paper we propose a general methodology for the optimal automatic routing of spatial pipelines motivated by a recent collaboration with Ghenova, a leading Naval Engineering company. We provide a minimum cost multicommodity network flow based model for the problem incorporating all the technical requirements for a feasible pipeline routing. A branch-and-cut approach is designed and different matheuristic algorithms are derived for solving the problem efficiently. We report the results of a battery of computational experiments to assess the problem performance as well as a case study of a real-world naval instance provided by our partner company.
Submitted 1 August, 2021;
originally announced August 2021.
-
Detection of a universal core-halo transition in dwarf galaxies as predicted by Bose-Einstein dark matter
Authors:
Alvaro Pozo,
Tom Broadhurst,
Ivan de Martino,
Tzihong Chiueh,
George F. Smoot,
Silvia Bonoli,
Raul Angulo
Abstract:
The presence of large dark matter cores in dwarf galaxies has long been puzzling and many are now known to be surrounded by an extensive halo of stars. Distinctive core-halo structure is characteristic of dark matter as a Bose Einstein condensate, $ψ$DM, with a dense, soliton core predicted in every galaxy, representing the ground state, surrounded by a large, tenuous halo of excited density waves. A marked density transition is predicted between the core and the halo set by the de Broglie wavelength, as the soliton core is a prominent standing wave that is denser by over an order of magnitude than the surrounding halo. Here we identify this predicted behavior in the stellar profiles of the well known "isolated" dwarfs that lie outside the Milky Way, each with a clear density transition at $\simeq 1.0~{\rm kpc}$, implying a very light boson, $m_ψ \simeq 10^{-22}$eV. The classical dwarf galaxies orbiting within the Milky Way also show this predicted core-halo structure but with larger density transitions of over two orders of magnitude, that we show implies tidal stripping of dwarf galaxies by the Milky Way, as the tenuous halo is more easily stripped than the stable soliton core. We conclude that dark matter as a light boson explains the observed family of classical dwarf profiles with tidal stripping included, in contrast to the standard heavy particle interpretation where low mass galaxies should be concentrated and core-less, quite unlike the core-halo structure observed.
Submitted 17 December, 2021; v1 submitted 20 October, 2020;
originally announced October 2020.
-
Concerning Quantum Identification Without Entanglement
Authors:
Carlos E. González-Guillén,
María Isabel González Vasco,
Floyd Johnson,
Ángel L. Pérez del Pozo
Abstract:
Identification schemes are interactive protocols typically involving two parties: a prover, who wants to provide evidence of his or her identity, and a verifier, who checks the provided evidence and decides whether or not it comes from the intended prover. In this paper, we comment on a recent proposal for quantum identity authentication from Zawadzki, and give a concrete attack upholding theoretical impossibility results from Lo and Buhrman et al. More precisely, we show that using a simple strategy an adversary may indeed obtain non-negligible information on the shared identification secret. While the security of a quantum identity authentication scheme is not formally defined in [1], it is clear that such a definition should somehow imply that an external entity may gain no information on the shared identification scheme (even if he actively participates injecting messages in a protocol execution, which is not assumed in our attack strategy).
Submitted 30 March, 2020; v1 submitted 26 March, 2020;
originally announced March 2020.
-
Wave Dark Matter and Ultra Diffuse Galaxies
Authors:
Alvaro Pozo,
Tom Broadhurst,
Ivan De Martino,
Hoang Nhan Luu,
George F. Smoot,
Jeremy Lim,
Mark Neyrinck
Abstract:
Dark matter as a Bose-Einstein condensate, such as the axionic scalar field particles of String Theory, can explain the coldness of dark matter on large scales. Pioneering simulations in this context predict a rich wave-like structure, with a ground state soliton core in every galaxy surrounded by a halo of excited states that interfere on the de Broglie scale. This de Broglie scale is largest for low mass galaxies as momentum is lower, providing a simple explanation for the wide cores of dwarf spheroidal galaxies. Here we extend these "wave dark matter" ($ψ$DM) predictions to the newly discovered class of "Ultra Diffuse Galaxies" (UDG) that resemble dwarf spheroidal galaxies but with more extended stellar profiles. Currently the best studied example, DF44, has a uniform velocity dispersion of $\simeq 33$km/s, extending to at least 3 kpc, that we show is reproduced by our $ψ$DM simulations with a soliton radius of $\simeq 0.5$ kpc. In the $ψ$DM context, we show the relatively flat dispersion profile of DF44 lies between massive galaxies with compact dense solitons, as may be present in the Milky Way on a scale of 100pc and lower mass galaxies where the velocity dispersion declines centrally within a wide, low density soliton, like Antlia II, of radius 3 kpc.
Submitted 31 March, 2021; v1 submitted 18 March, 2020;
originally announced March 2020.
-
Proceedings of the X International Workshop on Locational Analysis and Related Problems
Authors:
Maria Albareda-Sambola,
Marta Baldomero-Naranjo,
Luisa I. Martínez-Merino,
Diego Ponce,
Miguel A. Pozo,
Justo Puerto,
Victoria Rebillas-Loredo.
Abstract:
The International Workshop on Locational Analysis and Related Problems will take place during January 23-24, 2020 in Seville (Spain). It is organized by the Spanish Location Network and the Location Group GELOCA from the Spanish Society of Statistics and Operations Research (SEIO). The Spanish Location Network is a group of more than 140 researchers from several Spanish universities organized into 7 thematic groups. The Network has been funded by the Spanish Government since 2003.
One of the main activities of the Network is a yearly meeting aimed at promoting the communication among its members and between them and other researchers, and to contribute to the development of the location field and related problems. The last meetings have taken place in Cádiz (January 20-February 1, 2019), Segovia (September 27-29, 2017), Málaga (September 14-16, 2016), Barcelona (November 25-28, 2015), Sevilla (October 1-3, 2014), Torremolinos (Málaga, June 19-21, 2013), Granada (May 10-12, 2012), Las Palmas de Gran Canaria (February 2-5, 2011) and Sevilla (February 1-3, 2010).
The topics of interest are location analysis and related problems. This includes location models, networks, transportation, logistics, exact and heuristic solution methods, and computational geometry, among others.
Submitted 5 February, 2020;
originally announced February 2020.
-
A splitting method for the augmented Burgers equation
Authors:
Liviu I. Ignat,
Alejandro Pozo
Abstract:
In this paper we consider a splitting method for the augmented Burgers equation and prove that it is of first order. We also analyze the large-time behavior of the approximated solution by obtaining the first term in the asymptotic expansion. We prove that, when time increases, these solutions behave as the self-similar solutions of the viscous Burgers equation.
Submitted 21 November, 2016; v1 submitted 13 November, 2016;
originally announced November 2016.
-
An expanded evaluation of protein function prediction methods shows an improvement in accuracy
Authors:
Yuxiang Jiang,
Tal Ronnen Oron,
Wyatt T Clark,
Asma R Bankapur,
Daniel D'Andrea,
Rosalba Lepore,
Christopher S Funk,
Indika Kahanda,
Karin M Verspoor,
Asa Ben-Hur,
Emily Koo,
Duncan Penfold-Brown,
Dennis Shasha,
Noah Youngs,
Richard Bonneau,
Alexandra Lin,
Sayed ME Sahraeian,
Pier Luigi Martelli,
Giuseppe Profiti,
Rita Casadio,
Renzhi Cao,
Zhaolong Zhong,
Jianlin Cheng,
Adrian Altenhoff,
Nives Skunca
, et al. (122 additional authors not shown)
Abstract:
Background: The increasing volume and variety of genotypic and phenotypic data is a major defining characteristic of modern biomedical sciences. At the same time, the limitations in technology for generating data and the inherently stochastic nature of biomolecular events have led to the discrepancy between the volume of data and the amount of knowledge gleaned from it. A major bottleneck in our ability to understand the molecular underpinnings of life is the assignment of function to biological macromolecules, especially proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, accurately assessing methods for protein function prediction and tracking progress in the field remain challenging. Methodology: We have conducted the second Critical Assessment of Functional Annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. One hundred twenty-six methods from 56 research groups were evaluated for their ability to predict biological functions using the Gene Ontology and gene-disease associations using the Human Phenotype Ontology on a set of 3,681 proteins from 18 species. CAFA2 featured significantly expanded analysis compared with CAFA1, with regards to data set size, variety, and assessment metrics. To review progress in the field, the analysis also compared the best methods participating in CAFA1 to those of CAFA2. Conclusions: The top performing methods in CAFA2 outperformed the best methods from CAFA1, demonstrating that computational function prediction is improving. This increased accuracy can be attributed to the combined effect of the growing number of experimental annotations and improved methods for function prediction.
Submitted 2 January, 2016;
originally announced January 2016.
-
MOEA/D-GM: Using probabilistic graphical models in MOEA/D for solving combinatorial optimization problems
Authors:
Murilo Zangari de Souza,
Roberto Santana,
Aurora Trinidad Ramirez Pozo,
Alexander Mendiburu
Abstract:
Evolutionary algorithms based on modeling the statistical dependencies (interactions) between the variables have been proposed to solve a wide range of complex problems. These algorithms learn and sample probabilistic graphical models able to encode and exploit the regularities of the problem. This paper investigates the effect of using probabilistic modeling techniques as a way to enhance the behavior of the MOEA/D framework. MOEA/D is a decomposition-based evolutionary algorithm that decomposes a multi-objective optimization problem (MOP) into a number of scalar single-objective subproblems and optimizes them in a collaborative manner. The MOEA/D framework has been widely used to solve several MOPs. The proposed algorithm, MOEA/D using probabilistic Graphical Models (MOEA/D-GM), is able to instantiate both univariate and multi-variate probabilistic models for each subproblem. To validate the introduced framework algorithm, an experimental study is conducted on a multi-objective version of the deceptive function Trap5. The results show that the variant of the framework (MOEA/D-Tree), where tree models are learned from the matrices of the mutual information between the variables, is able to capture the structure of the problem. MOEA/D-Tree is able to achieve significantly better results than both MOEA/D using genetic operators and MOEA/D using univariate probability models, in terms of the approximation to the true Pareto front.
Submitted 17 November, 2015;
originally announced November 2015.
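As a rough illustration of the ideas named in the abstract above, the following sketch evaluates a two-objective Trap5 variant and scalarizes it into one subproblem. The abstract does not say how the multi-objective version is built or which scalarization is used, so the inverse-trap second objective, the weighted-sum aggregation, and all function names here are assumptions made for illustration, not the authors' implementation.

```python
def trap5_block(ones):
    """Deceptive trap on a 5-bit block: the optimum is all ones,
    but every other count rewards having fewer ones."""
    return 5 if ones == 5 else 4 - ones


def inv_trap5_block(ones):
    """Assumed second objective: the mirrored trap, whose optimum is all zeros."""
    return 5 if ones == 0 else ones - 1


def evaluate(bits, block_fn):
    """Sum a block function over consecutive, non-overlapping 5-bit blocks."""
    if len(bits) % 5 != 0:
        raise ValueError("bit string length must be a multiple of 5")
    return sum(block_fn(sum(bits[i:i + 5])) for i in range(0, len(bits), 5))


def bi_trap5(bits):
    """Return the pair (trap5, inverse trap5) for a bit string."""
    return evaluate(bits, trap5_block), evaluate(bits, inv_trap5_block)


def weighted_sum(objectives, weights):
    """One scalar subproblem: a weighted sum of the objective values
    (an assumed scalarization; others are possible)."""
    return sum(w * f for w, f in zip(weights, objectives))
```

Each weight vector defines one scalar subproblem. For example, `weighted_sum(bi_trap5(bits), (0.5, 0.5))` scores a candidate for the subproblem that values both objectives equally.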
-
A semi-discrete large-time behavior preserving scheme for the augmented Burgers equation
Authors:
Liviu I. Ignat,
Alejandro Pozo
Abstract:
In this paper we analyze the large-time behavior of the augmented Burgers equation. We first study the well-posedness of the Cauchy problem and obtain $L^1$-$L^p$ decay rates. The asymptotic behavior of the solution is obtained by showing that the influence of the convolution term $K*u_{xx}$ is the same as $u_{xx}$ for large times. Then, we propose a semi-discrete numerical scheme that preserves this asymptotic behavior, by introducing two correcting factors in the discretization of the non-local term. Numerical experiments illustrating the accuracy of the results of the paper are also presented.
Submitted 5 June, 2017; v1 submitted 12 September, 2014;
originally announced September 2014.
-
Large-time asymptotics, vanishing viscosity and numerics for 1-D scalar conservation laws
Authors:
Liviu I. Ignat,
Alejandro Pozo,
Enrique Zuazua
Abstract:
In this paper we analyze the large time asymptotic behavior of the discrete solutions of numerical approximation schemes for scalar hyperbolic conservation laws. We consider three monotone conservative schemes that are consistent with the one-sided Lipschitz condition (OSLC): Lax-Friedrichs, Engquist-Osher and Godunov. We mainly focus on the inviscid Burgers equation, for which we know that the large time behavior is of self-similar nature, described by a two-parameter family of N-waves. We prove that, at the numerical level, the large time dynamics depends on the amount of numerical viscosity introduced by the scheme: while Engquist-Osher and Godunov yield the same N-wave asymptotic behavior, the Lax-Friedrichs scheme leads to viscous self-similar profiles, corresponding to the asymptotic behavior of the solutions of the continuous viscous Burgers equation. The same problem is analyzed in the context of self-similar variables that lead to a better numerical performance but to the same dichotomy on the asymptotic behavior: N-waves versus viscous ones. We also give some hints to extend the results to more general fluxes. Some numerical experiments illustrating the accuracy of the results of the paper are also presented.
Submitted 29 May, 2014; v1 submitted 20 June, 2013;
originally announced June 2013.
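The three schemes named in the abstract above can be sketched for the inviscid Burgers flux f(u) = u^2/2 using their standard textbook numerical fluxes. This is a minimal sketch and not the authors' code: the periodic boundary, the function names, and the shared `(u, v, dx, dt)` signature are assumptions chosen to keep the example self-contained, and the time step is assumed to satisfy the CFL condition dt/dx * max|u| <= 1.

```python
def burgers_flux(u):
    """Flux of the inviscid Burgers equation, f(u) = u^2 / 2."""
    return 0.5 * u * u


def lax_friedrichs_flux(u, v, dx, dt):
    """Central average plus a numerical viscosity term proportional to dx/dt."""
    return 0.5 * (burgers_flux(u) + burgers_flux(v)) - 0.5 * (dx / dt) * (v - u)


def engquist_osher_flux(u, v, dx, dt):
    """Right-moving part of the left state plus left-moving part of the right state."""
    return burgers_flux(max(u, 0.0)) + burgers_flux(min(v, 0.0))


def godunov_flux(u, v, dx, dt):
    """Exact Riemann-solver flux for the convex Burgers flux."""
    return max(burgers_flux(max(u, 0.0)), burgers_flux(min(v, 0.0)))


def step(u, dx, dt, flux):
    """One conservative update u_j <- u_j - dt/dx * (g_{j+1/2} - g_{j-1/2})."""
    n = len(u)
    # g[j] is the flux through the interface between cells j and j+1 (periodic wrap).
    g = [flux(u[j], u[(j + 1) % n], dx, dt) for j in range(n)]
    # For j = 0, g[j - 1] is g[-1], the interface between the last and first cell.
    return [u[j] - dt / dx * (g[j] - g[j - 1]) for j in range(n)]


def run(u0, dx, dt, steps, flux):
    """Advance the initial data u0 by the given number of time steps."""
    u = list(u0)
    for _ in range(steps):
        u = step(u, dx, dt, flux)
    return u
```

All three updates are conservative, so the cell sum is preserved up to rounding. They differ in how much numerical viscosity they add, which is the quantity the abstract ties to the large-time behavior.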
-
A modeling framework for Ordered Weighted Average Combinatorial Optimization
Authors:
Elena Fernández,
Miguel A. Pozo,
Justo Puerto
Abstract:
Multiobjective combinatorial optimization deals with problems considering more than one viewpoint or scenario. The problem of aggregating multiple criteria to obtain a globalizing objective function is of special interest when the number of Pareto solutions becomes considerably large or when a single, meaningful solution is required. Ordered Weighted Average or Ordered Median operators are very useful when preferential information is available and objectives are comparable since they assign importance weights not to specific objectives but to their sorted values. In this paper, Ordered Weighted Average optimization problems are studied from a modeling point of view. Alternative integer programming formulations for such problems are presented and their respective domains studied and compared. In addition, their associated polyhedra are studied and some families of facets and new families of valid inequalities presented. The proposed formulations are particularized for two well-known combinatorial optimization problems, namely, shortest path and minimum cost perfect matching, and the results of computational experiments presented and analyzed. These results indicate that the new formulations reinforced with appropriate constraints can be effective for efficiently solving medium to large size instances.
Submitted 6 June, 2013;
originally announced June 2013.
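The key property of the operator described in the abstract above, that weights attach to sorted positions rather than to specific objectives, can be seen in a few lines. This is a minimal sketch: the function name `owa` and the choice to sort in non-increasing order are assumptions (conventions differ), and it does not reproduce the integer programming formulations studied in the paper.

```python
def owa(values, weights):
    """Ordered Weighted Average: the k-th weight multiplies the k-th largest value."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    ordered = sorted(values, reverse=True)  # assumed convention: largest first
    return sum(w * v for w, v in zip(weights, ordered))
```

With this convention, the weights `(1, 0, 0)` recover the maximum of three objective values, `(0, 0, 1)` the minimum, and equal weights the arithmetic mean. Permuting the input values never changes the result.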
-
Effective linkage learning using low-order statistics and clustering
Authors:
Leonardo Emmendorfer,
Aurora Pozo
Abstract:
The adoption of probabilistic models for the best individuals found so far is a powerful approach for evolutionary computation. Increasingly complex models have been used by estimation of distribution algorithms (EDAs), which often result in better effectiveness in finding the global optima for hard optimization problems. Supervised and unsupervised learning of Bayesian networks are very effective options, since those models are able to capture interactions of high order among the variables of a problem. Diversity preservation, through niching techniques, has also been shown to be very important to allow the identification of the problem structure as much as for keeping several global optima. Recently, clustering was evaluated as an effective niching technique for EDAs, but the performance of simpler low-order EDAs was not shown to be much improved by clustering, except for some simple multimodal problems. This work proposes and evaluates a combination operator guided by a measure from information theory which allows a clustered low-order EDA to effectively solve a comprehensive range of benchmark optimization problems.
Submitted 16 October, 2007; v1 submitted 15 October, 2007;
originally announced October 2007.
-
Understanding multifractality: reconstructing images from edges
Authors:
Antonio Turiel,
Angela del Pozo
Abstract:
It has recently been proven that natural images exhibit scaling properties analogous to those of turbulent flows. These properties allow regarding each image as a multifractal object, for which its most singular manifold conveys most of the non-redundant structure. In the present work, we go further in this analysis, proposing a simple propagator that reconstructs the whole image from this set. This fact could have deep implications for biology, technology and statistical mechanics.
Submitted 26 November, 1998;
originally announced November 1998.
-
QCD
Authors:
P. Nason,
B. R. Webber,
D. Ward,
D. Lanske,
L. A. del Pozo,
F. Fabbri,
B. Poli,
G. Cowan,
C. Padilla,
M. Seymour,
F. Hautmann,
Yu. L. Dokshitzer,
V. A. Khoze
Abstract:
We discuss QCD studies that will be possible at LEP2. We examine both experimental and theoretical aspects of jets, fragmentation functions, multiplicities and particle spectra.
Submitted 13 February, 1996;
originally announced February 1996.