-
Automating Easy Read Text Segmentation
Authors:
Jesús Calleja,
Thierry Etchegoyhen,
David Ponce
Abstract:
Easy Read text is one of the main forms of access to information for people with reading difficulties. One of the key characteristics of this type of text is the requirement to split sentences into smaller grammatical segments, to facilitate reading. Automated segmentation methods could foster the creation of Easy Read content, but their viability has yet to be addressed. In this work, we study novel methods for the task, leveraging masked and generative language models, along with constituent parsing. We conduct comprehensive automatic and human evaluations in three languages, analysing the strengths and weaknesses of the proposed alternatives, under scarce resource limitations. Our results highlight the viability of automated ER segmentation and remaining deficiencies compared to expert-driven human segmentation.
Submitted 17 June, 2024;
originally announced June 2024.
-
Constraint relaxation for the Discrete Ordered Median Problem
Authors:
Luisa I. Martínez-Merino,
Diego Ponce,
Justo Puerto
Abstract:
This paper compares different exact approaches to solve the Discrete Ordered Median Problem (DOMP). In recent years, DOMP has been formulated using set packing constraints, giving rise to one of its most promising formulations. The use of this family of constraints, known as strong order constraints (SOC), has been validated in the literature by its theoretical properties and because their linear relaxation provides very good lower bounds. Furthermore, embedded in branch-and-cut or branch-price-and-cut procedures as valid inequalities, they allow one to improve computational aspects of solution methods such as CPU time and use of memory. In spite of that, the above-mentioned formulations require the inclusion of another family of order constraints, e.g., the weak order constraints (WOC), which leads to coefficient matrices with elements other than {0,1}. In this work, we develop a new approach that does not consider extra families of order constraints and furthermore relaxes SOC -- in a branch-and-cut procedure that does not start with a complete formulation -- to add them iteratively using row generation techniques to certify feasibility and optimality. Exhaustive computational experiments show that it is advisable to use row generation techniques in order to only consider {0,1}-coefficient matrices modeling the DOMP. Moreover, we test how to exploit the problem structure. Implementing an efficient separation of SOC using callbacks improves the solution performance. This allows us to deal with bigger instances than using fixed cuts/constraints pools automatically added by the solver in the branch-and-cut for SOC, concerning both the formulation based on WOC and the row generation procedure.
Submitted 28 March, 2024;
originally announced March 2024.
-
Split and Rephrase with Large Language Models
Authors:
David Ponce,
Thierry Etchegoyhen,
Jesús Calleja Pérez,
Harritxu Gete
Abstract:
The Split and Rephrase (SPRP) task, which consists in splitting complex sentences into a sequence of shorter grammatical sentences, while preserving the original meaning, can facilitate the processing of complex texts for humans and machines alike. It is also a valuable testbed to evaluate natural language processing models, as it requires modelling complex grammatical aspects. In this work, we evaluate large language models on the task, showing that they can provide large improvements over the state of the art on the main metrics, although still lagging in terms of splitting compliance. Results from two human evaluations further support the conclusions drawn from automated metric results. We provide a comprehensive study that includes prompting variants, domain shift, fine-tuned pretrained language models of varying parameter size and training data volumes, contrasted with both zero-shot and few-shot approaches on instruction-tuned language models. Although the latter were markedly outperformed by fine-tuned models, they may constitute a reasonable off-the-shelf alternative. Our results provide a fine-grained analysis of the potential and limitations of large language models for SPRP, with significant improvements achievable using relatively small amounts of training data and model parameters overall, and remaining limitations for all models on the task.
Submitted 3 June, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
Network Flow based approaches for the Pipelines Routing Problem in Naval Design
Authors:
Víctor Blanco,
Gabriel González,
Yolanda Hinojosa,
Diego Ponce,
Miguel A. Pozo,
Justo Puerto
Abstract:
In this paper we propose a general methodology for the optimal automatic routing of spatial pipelines motivated by a recent collaboration with Ghenova, a leading Naval Engineering company. We provide a minimum cost multicommodity network flow based model for the problem incorporating all the technical requirements for a feasible pipeline routing. A branch-and-cut approach is designed and different matheuristic algorithms are derived for solving the problem efficiently. We report the results of a battery of computational experiments to assess the problem performance as well as a case study of a real-world naval instance provided by our partner company.
Submitted 1 August, 2021;
originally announced August 2021.
-
A Branch-and-Price approach for the Continuous Multifacility Monotone Ordered Median Problem
Authors:
Víctor Blanco,
Ricardo Gázquez,
Diego Ponce,
Justo Puerto
Abstract:
In this paper, we address the Continuous Multifacility Monotone Ordered Median Problem. This problem minimizes a monotone ordered weighted median function of the distances between given demand points in $\mathbb{R}^d$ and their closest facility among the $p$ selected, also in a continuous space. We propose a new branch-and-price procedure for this problem, and two matheuristics. One of them is a decomposition-based procedure and the other an aggregation-based heuristic. We give detailed discussions of the validity of the exact formulations and also specify the implementation details of all the solution procedures. Besides, we assess their performance in an extensive computational experience that shows the superiority of the branch-and-price approach over the compact formulation in medium-sized instances. To handle larger instances it is advisable to resort to the matheuristics that also report rather good results.
Submitted 1 August, 2021;
originally announced August 2021.
-
A Branch-and-price procedure for clustering data that are graph connected
Authors:
Stefano Benati,
Diego Ponce,
Justo Puerto,
Antonio M. Rodríguez-Chía
Abstract:
This paper studies the Graph-Connected Clique-Partitioning Problem (GCCP), a clustering optimization model in which units are characterized by both individual and relational data. This problem, introduced by Benati et al. (2017) under the name of Connected Partitioning Problem, shows that the combination of the two data types improves the clustering quality in comparison with other methodologies. Nevertheless, the resulting optimization problem is difficult to solve; only small-sized instances can be solved exactly, large-sized instances require the application of heuristic algorithms. In this paper we improve the exact and the heuristic algorithms previously proposed. Here, we provide a new Integer Linear Programming (ILP) formulation, that solves larger instances, but at the cost of using an exponential number of variables. In order to limit the number of variables necessary to calculate the optimum, the new ILP formulation is solved implementing a branch-and-price algorithm. The resulting pricing problem is itself a new combinatorial model: the Maximum-weighted Graph-Connected Single-Clique problem (MGCSC), that we solve testing various Mixed Integer Linear Programming (MILP) formulations and proposing a new fast "random shrink" heuristic. In this way, we are able to improve the previous algorithms: The branch-and-price method outperforms the computational times of the previous MILP algorithms and the new random shrink heuristic, when applied to GCCP, is both faster and more accurate than the previous heuristic methods. Moreover, the combination of column generation and random shrink is itself a new MILP-relaxed matheuristic that can be applied to large instances too. Its main advantage is that all heuristic local optima are combined together in a restricted MILP, consisting in the application of the exact branch-and-price method but solving heuristically the pricing problem.
Submitted 12 April, 2021;
originally announced April 2021.
-
Proceedings of the X International Workshop on Locational Analysis and Related Problems
Authors:
Maria Albareda-Sambola,
Marta Baldomero-Naranjo,
Luisa I. Martínez-Merino,
Diego Ponce,
Miguel A. Pozo,
Justo Puerto,
Victoria Rebillas-Loredo
Abstract:
The International Workshop on Locational Analysis and Related Problems will take place during January 23-24, 2020 in Seville (Spain). It is organized by the Spanish Location Network and the Location Group GELOCA from the Spanish Society of Statistics and Operations Research (SEIO). The Spanish Location Network is a group of more than 140 researchers from several Spanish universities organized into 7 thematic groups. The Network has been funded by the Spanish Government since 2003.
One of the main activities of the Network is a yearly meeting aimed at promoting the communication among its members and between them and other researchers, and to contribute to the development of the location field and related problems. The last meetings have taken place in Cádiz (January 20-February 1, 2019), Segovia (September 27-29, 2017), Málaga (September 14-16, 2016), Barcelona (November 25-28, 2015), Sevilla (October 1-3, 2014), Torremolinos (Málaga, June 19-21, 2013), Granada (May 10-12, 2012), Las Palmas de Gran Canaria (February 2-5, 2011) and Sevilla (February 1-3, 2010).
The topics of interest are location analysis and related problems. This includes location models, networks, transportation, logistics, exact and heuristic solution methods, and computational geometry, among others.
Submitted 5 February, 2020;
originally announced February 2020.
-
On the multisource hyperplanes location problem to fitting set of points
Authors:
Víctor Blanco,
Alberto Japón,
Diego Ponce,
Justo Puerto
Abstract:
In this paper we study the problem of locating a given number of hyperplanes minimizing an objective function of the closest distances from a set of points. We propose a general framework for the problem in which norm-based distances between points and hyperplanes are aggregated by means of ordered median functions. A compact Mixed Integer Linear (or Non Linear) programming formulation is presented for the problem and also an extended set partitioning formulation with an exponential number of variables is derived. We develop a column generation procedure embedded within a branch-and-price algorithm for solving the problem by adequately performing its preprocessing, pricing and branching. We also analyze geometrically the optimal solutions of the problem, deriving properties which are exploited to generate initial solutions for the proposed algorithms. Finally, the results of an extensive computational experience are reported. The issue of scalability is also addressed showing theoretical upper bounds on the errors assumed by replacing the original datasets by aggregated versions.
Submitted 5 February, 2020;
originally announced February 2020.
-
Autonomous Shuttle-as-a-Service (ASaaS): Challenges, Opportunities, and Social Implications
Authors:
Antonio Bucchiarone,
Sandro Battisti,
Annapaola Marconi,
Roberto Maldacea,
Diego Cardona Ponce
Abstract:
Modern cities are composed of complex socio-technical systems that exist to provide services effectively to their residents and visitors. In this context, smart mobility systems aim to support the efficient exploitation of the city transport facilities as well as sustainable mobility within the urban environment. People need to travel quickly and conveniently between locations at different scales, ranging from a trip of a few blocks within a city to a journey across cities or further. At the same time, goods need to be delivered in a timely manner considering the needs of both the users and the businesses. While most of the mobility and delivery solutions can cover significant distances and multiple requests, they suffer when the requests come from the growing neighborhoods and hard-to-reach areas such as city centers, corporate headquarters, and hospitals. In the last few years, several cities indicated interest in using Autonomous Vehicles (AV) for the "last-mile" mobility services. With them, it seems to be easier to get people and goods around using fewer vehicles. In this context, Autonomous Shuttles (AS) are beginning to be thought of as a new mobility/delivery service into the city center where narrow streets are not easily served by traditional buses. They can serve critical areas with minimal new infrastructure while reducing noise and pollution. The goal of this article is to present an innovative vision on the introduction of the Autonomous Shuttles-as-a-Service (ASaaS) concept as the key pillar for the realization of innovative and sustainable proximity mobility. Through a set of real application scenarios, we present our view, and we discuss a set of challenges, opportunities, and social implications introduced by this way of reimagining the mobility of the future.
Submitted 14 January, 2020;
originally announced January 2020.
-
Portfolio problems with two levels decision-makers: Optimal portfolio selection with pricing decisions on transaction costs. Extended version and complete risk profiles analysis
Authors:
Marina Leal,
Diego Ponce,
Justo Puerto
Abstract:
This paper presents novel bilevel leader-follower portfolio selection problems in which the financial intermediary becomes a decision-maker. This financial intermediary decides on the unit transaction costs for investing in some securities, maximizing its benefits, and the investor chooses his optimal portfolio, minimizing risk and ensuring a given expected return. Hence, transaction costs become decision variables in the portfolio problem, and two levels of decision-makers are incorporated: the financial intermediary and the investor. These situations give rise to general Nonlinear Programming formulations in both levels of the decision process. We present different bilevel versions of the problem: financial intermediary-leader, investor-leader, and social welfare; besides, their properties are analyzed. Moreover, we develop Mixed Integer Linear Programming formulations for some of the proposed problems and effective algorithms for some others. Finally, we report on some computational experiments performed on data taken from the Dow Jones Industrial Average, and analyze and compare the results obtained by the different models.
Submitted 10 December, 2019; v1 submitted 11 April, 2018;
originally announced April 2018.
-
An extended version of a Branch-Price-and-Cut Procedure for the Discrete Ordered Median Problem
Authors:
Samuel Deleplanque,
Martine Labbé,
Diego Ponce,
Justo Puerto
Abstract:
The Discrete Ordered Median Problem (DOMP) is formulated as a set partitioning problem using an exponential number of variables. Each variable corresponds to a set of demand points allocated to the same facility with the information of the sorting position of their corresponding costs. We develop a column generation approach to solve the continuous relaxation of this model. Then, we apply a branch-price-and-cut algorithm to solve small to moderate size instances of DOMP to optimality in competitive computational time.
Submitted 9 February, 2018;
originally announced February 2018.
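Several of the abstracts in this listing refer to ordered median objectives, in which costs are first sorted and then weighted by position. The sketch below illustrates only that objective, using the textbook definition from the location literature rather than anything stated in the abstract above; the function name `ordered_median` and the sample data are hypothetical, and it is not the authors' formulation or algorithm.

```python
from typing import Sequence


def ordered_median(costs: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of costs after sorting them in non-decreasing order.

    weights[i] multiplies the (i+1)-th smallest cost, so the weight is tied
    to the sorting position of a cost, not to the demand point it came from.
    """
    if len(costs) != len(weights):
        raise ValueError("costs and weights must have the same length")
    sorted_costs = sorted(costs)
    return sum(w * c for w, c in zip(weights, sorted_costs))


# Hypothetical allocation costs of three demand points.
costs = [4.0, 1.0, 3.0]

median_value = ordered_median(costs, [1, 1, 1])   # all ones: sum of all costs
center_value = ordered_median(costs, [0, 0, 1])   # last position only: largest cost
two_sum_value = ordered_median(costs, [0, 1, 1])  # k ones at the end: sum of the k largest
```

Changing only the weight vector switches between the median, center, and k-sum criteria, which is why a single ordered median model can cover several classical location objectives.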
-
Mathematical Programming formulations for the efficient solution of the $k$-sum approval voting problem
Authors:
Diego Ponce,
Justo Puerto,
Federica Ricca,
Andrea Scozzari
Abstract:
In this paper we address the problem of electing a committee among a set of $m$ candidates and on the basis of the preferences of a set of $n$ voters. We consider the approval voting method in which each voter can approve as many candidates as she/he likes by expressing a preference profile (boolean $m$-vector). In order to elect a committee, a voting rule must be established to `transform' the $n$ voters' profiles into a winning committee. The problem is widely studied in voting theory; for a variety of voting rules the problem was shown to be computationally difficult and approximation algorithms and heuristic techniques were proposed in the literature. In this paper we follow an Ordered Weighted Averaging approach and study the $k$-sum approval voting (optimization) problem in the general case $1 \leq k <n$. For this problem we provide different mathematical programming formulations that allow us to solve it in an exact solution framework. We provide computational results showing that our approach is efficient for medium-size test problems ($n$ up to 200, $m$ up to 60) since in all tested cases it was able to find the exact optimal solution in very short computational times.
Submitted 28 July, 2017;
originally announced July 2017.
-
Continuous Location under Refraction
Authors:
Victor Blanco,
Justo Puerto,
Diego Ponce
Abstract:
In this paper we address the problem of locating a new facility on a $d$-dimensional space when the distance measure ($\ell_p$- or polyhedral-norms) is different at each one of the sides of a given hyperplane $\mathcal{H}$. We relate this problem with the physical phenomenon of refraction, and extend it to any finite dimension space and different distances at each one of the sides of any hyperplane. An application of this problem is the location of a facility within or outside an urban area where different distance measures must be used. We provide a new second order cone programming formulation, based on the $\ell_p$-norm representation given in \cite{BPE2014}, that allows us to solve the problem exactly in any finite dimension space with semidefinite programming tools. We also extend the problem to the case where the hyperplane is considered as a rapid transit medium (a different third norm is also considered over $\mathcal{H}$) that allows the demand to travel faster through $\mathcal{H}$ to reach the new facility. Extensive computational experiments run in Gurobi are reported in order to show the effectiveness of the approach.
Submitted 11 April, 2014;
originally announced April 2014.