-
MultiLS-SP/CA: Lexical Complexity Prediction and Lexical Simplification Resources for Catalan and Spanish
Authors:
Stefan Bott,
Horacio Saggion,
Nelson Peréz Rojas,
Martin Solis Salazar,
Saul Calderon Ramirez
Abstract:
Automatic lexical simplification is the task of substituting lexical items that may be unfamiliar and difficult to understand with easier and more common words. This paper presents MultiLS-SP/CA, a novel dataset for lexical simplification in Spanish and Catalan. This dataset is the first of its kind for Catalan and a substantial addition to the sparse data available for automatic lexical simplification in Spanish. Specifically, MultiLS-SP is the first dataset for Spanish that includes scalar ratings of how difficult lexical items are to understand. In addition, we describe experiments with this dataset, which can serve as a baseline for future work on the same data.
Submitted 11 April, 2024;
originally announced April 2024.
-
Counting and Computing Join-Endomorphisms in Lattices (Revisited)
Authors:
Carlos Pinzón,
Santiago Quintero,
Sergio Ramírez,
Camilo Rueda,
Frank Valencia
Abstract:
Structures involving a lattice and join-endomorphisms on it are ubiquitous in computer science. We study the cardinality of the set $\mathcal{E}(L)$ of all join-endomorphisms of a given finite lattice $L$. In particular, we show for $\mathbf{M}_n$, the discrete order of $n$ elements extended with top and bottom, $| \mathcal{E}(\mathbf{M}_n) | =n!\mathcal{L}_n(-1)+(n+1)^2$ where $\mathcal{L}_n(x)$ is the Laguerre polynomial of degree $n$. We also study the following problem: Given a lattice $L$ of size $n$ and a set $S\subseteq \mathcal{E}(L)$ of size $m$, find the greatest lower bound ${\large\sqcap}_{\mathcal{E}(L)} S$. The join-endomorphism ${\large\sqcap}_{\mathcal{E}(L)} S$ has meaningful interpretations in epistemic logic, distributed systems, and Aumann structures. We show that this problem can be solved with worst-case time complexity in $O(mn)$ for distributive lattices and $O(mn + n^3)$ for arbitrary lattices. In the particular case of modular lattices, we present an adaptation of the latter algorithm that reduces its average time complexity. We provide theoretical and experimental results to support this enhancement. The complexity is expressed in terms of the basic binary lattice operations performed by the algorithm.
Submitted 1 November, 2022;
originally announced November 2022.
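The closed form for $|\mathcal{E}(\mathbf{M}_n)|$ quoted in the abstract above can be sanity-checked by brute force for small $n$. The following is a minimal sketch, not the authors' code. It assumes that a join-endomorphism preserves the bottom element (the empty join) as well as binary joins, and it uses the standard expansion $n!\,\mathcal{L}_n(-1)=\sum_{k=0}^{n}\binom{n}{k}\frac{n!}{k!}$. The integer encoding of $\mathbf{M}_n$ and the function names are hypothetical.

```python
from itertools import product
from math import comb, factorial


def join_mn(n, x, y):
    """Join in M_n: 0 is bottom, 1..n are the incomparable middle elements, n+1 is top."""
    if x == y:
        return x
    if x == 0:
        return y
    if y == 0:
        return x
    return n + 1  # two distinct non-bottom elements always join to top


def count_join_endomorphisms(n):
    """Brute-force count of maps f on M_n with f(bottom) = bottom and f(x v y) = f(x) v f(y)."""
    elems = range(n + 2)
    count = 0
    for images in product(elems, repeat=n + 1):
        f = (0,) + images  # assumption: bottom is mapped to bottom
        if all(f[join_mn(n, x, y)] == join_mn(n, f[x], f[y])
               for x in elems for y in elems):
            count += 1
    return count


def closed_form(n):
    """n! * L_n(-1) + (n + 1)^2, with the Laguerre term expanded as an integer sum."""
    laguerre_term = sum(comb(n, k) * (factorial(n) // factorial(k))
                        for k in range(n + 1))
    return laguerre_term + (n + 1) ** 2
```

For $n = 1, 2, 3$ both functions return 6, 16, and 50. The enumeration visits $(n+2)^{n+1}$ candidate maps, so it is only practical for very small $n$.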
-
On the Computation of Distributed Knowledge as the Greatest Lower Bound of Knowledge
Authors:
Santiago Quintero,
Carlos Pinzón,
Sergio Ramírez,
Frank Valencia
Abstract:
Let $L$ be a finite lattice and $\mathcal{E}(L)$ be the set of join-endomorphisms of $L$. We consider the following problem: given $L$ and $f,g \in \mathcal{E}(L)$, find the greatest lower bound $f \sqcap_{{\scriptsize \mathcal{E}(L)}} g$ in the lattice $\mathcal{E}(L)$. (1) We show that if $L$ is distributive, the problem can be solved in time $O(n)$ where $n=| L |$. The previous upper bound was $O(n^2)$. (2) We provide new algorithms for arbitrary lattices and give experimental evidence that they are significantly faster than the existing algorithm. (3) We characterize the standard notion of distributed knowledge of a group as the greatest lower bound of the join-endomorphisms representing the knowledge of each member of the group. (4) We show that deciding whether an agent has the distributed knowledge of two other agents can be computed in time $O(n^2)$ where $n$ is the size of the underlying set of states. (5) For the special case of $S5$ knowledge, we show that it can be decided in time $O(n\alpha_{n})$ where $\alpha_{n}$ is the inverse of the Ackermann function.
Submitted 24 October, 2022; v1 submitted 14 October, 2022;
originally announced October 2022.
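To make the problem in the abstract above concrete, the sketch below computes $f \sqcap_{\mathcal{E}(L)} g$ directly from the definition: it takes the pointwise join of every join-endomorphism that lies below both $f$ and $g$. This is an exponential brute-force illustration, not the $O(n)$ algorithm the paper reports. It assumes the four-element powerset lattice of $\{0,1\}$ encoded as bitmasks, and that join-endomorphisms preserve bottom; all names are hypothetical.

```python
from itertools import product

# Powerset lattice of {0, 1} encoded as bitmasks 0..3; join is bitwise OR.
ELEMS = range(4)


def is_join_endomorphism(f):
    """f preserves bottom and binary joins."""
    return f[0] == 0 and all(f[x | y] == f[x] | f[y] for x in ELEMS for y in ELEMS)


def below(h, f):
    """Pointwise order: h(x) is a subset of f(x) for every x."""
    return all(h[x] & f[x] == h[x] for x in ELEMS)


def glb(f, g):
    """Greatest lower bound in E(L): pointwise join of all common lower bounds."""
    result = [0, 0, 0, 0]
    for h in product(ELEMS, repeat=4):
        if is_join_endomorphism(h) and below(h, f) and below(h, g):
            result = [r | v for r, v in zip(result, h)]
    return tuple(result)


identity = (0, 1, 2, 3)
swap = (0, 2, 1, 3)  # exchanges the two atoms
pointwise_meet = tuple(a & b for a, b in zip(identity, swap))
```

Here `pointwise_meet` is `(0, 0, 0, 3)`, which is not a join-endomorphism because it sends both atoms to bottom but their join to top. The greatest lower bound `glb(identity, swap)` is the constant-bottom map `(0, 0, 0, 0)`, which shows why the meet in $\mathcal{E}(L)$ cannot in general be computed pointwise.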
-
Algebraic Structures from Concurrent Constraint Programming Calculi for Distributed Information in Multi-Agent Systems
Authors:
Michell Guzmán,
Sophia Knight,
Santiago Quintero,
Sergio Ramírez,
Camilo Rueda,
Frank Valencia
Abstract:
Spatial constraint systems (scs) are semantic structures for reasoning about spatial and epistemic information in concurrent systems. We develop the theory of scs to reason about the distributed information of potentially infinite groups. We characterize the notion of distributed information of a group of agents as the infimum of the set of join-preserving functions that represent the spaces of the agents in the group. We provide an alternative characterization of this notion as the greatest family of join-preserving functions that satisfy certain basic properties. For completely distributive lattices, we establish that distributed information of a group is the greatest information below all possible combinations of information in the spaces of the agents in the group that derive a given piece of information. We show compositionality results for these characterizations and conditions under which information that can be obtained by an infinite group can also be obtained by a finite group. Finally, we provide an application on mathematical morphology where dilations, one of its fundamental operations, define an scs on a powerset lattice. We show that distributed information represents a particular dilation in such scs.
Submitted 8 February, 2021; v1 submitted 20 October, 2020;
originally announced October 2020.
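The abstract above relies on dilations being join-preserving maps on a powerset lattice. The sketch below illustrates that property for the textbook one-dimensional dilation $A \oplus B = \{a + b \mid a \in A,\ b \in B\}$ on finite sets of integers. The choice of integers as the underlying space and the name `dilate` are assumptions made for illustration, not taken from the paper.

```python
def dilate(a, b):
    """Dilation of the set a by the structuring element b (Minkowski sum)."""
    return frozenset(x + y for x in a for y in b)


structuring_element = frozenset({-1, 0, 1})
left = frozenset({0, 5})
right = frozenset({5, 9})

# Join preservation: dilating a union equals the union of the dilations.
union_then_dilate = dilate(left | right, structuring_element)
dilate_then_union = dilate(left, structuring_element) | dilate(right, structuring_element)

# Bottom preservation: the empty set dilates to the empty set.
empty_dilated = dilate(frozenset(), structuring_element)
```

Both `union_then_dilate` and `dilate_then_union` evaluate to the same set, and `empty_dilated` is empty, which is the behaviour required of a join-preserving function on the powerset lattice ordered by inclusion.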
-
A Rewriting Logic Approach to Stochastic and Spatial Constraint System Specification and Verification
Authors:
Miguel Romero,
Sergio Ramírez,
Camilo Rocha,
Frank Valencia
Abstract:
This paper addresses the issue of specifying, simulating, and verifying reactive systems in rewriting logic. It presents an executable semantics for probabilistic, timed, and spatial concurrent constraint programming, here called stochastic and spatial concurrent constraint systems (SSCC), in the rewriting logic semantic framework. The approach is based on an enhanced and generalized model of concurrent constraint programming (CCP) where hierarchical computational spaces can be assigned to agents. The executable semantics faithfully represents and operationally captures the highly concurrent nature, uncertain behavior, and spatial and epistemic characteristics of reactive systems with flow of information. In SSCC, timing attributes, represented by stochastic duration, can be associated with processes, and exclusive and independent probabilistic choice is also supported. SMT solving technology, available from the Maude system, is used to realize the underlying constraint system of SSCC with quantifier-free formulas over integers and reals. This results in a fully executable real-time symbolic specification that can be used for quantitative analysis in the form of statistical model checking. The main features and capabilities of SSCC are illustrated with examples throughout the paper. This contribution is part of a larger research effort aimed at making available formal analysis techniques and tools, mathematically founded on the CCP approach, to the research community.
Submitted 2 November, 2022; v1 submitted 9 September, 2019;
originally announced September 2019.
-
Sitara: Spectrum Measurement Goes Mobile Through Crowd-sourcing
Authors:
Phillip Smith,
Anh Luong,
Shamik Sarkar,
Harsimran Singh,
Neal Patwari,
Sneha Kasera,
Kurt Derr,
Samuel Ramirez
Abstract:
Software-defined radios (SDRs) are often used in the experimental evaluation of next-generation wireless technologies. While crowd-sourced spectrum monitoring is an important component of future spectrum-agile technologies, there is no clear way to test it in the real world, i.e., with hundreds of users each with an SDR in their pocket participating in RF experiments controlled by, and data uploaded to, the cloud. Current fully functional SDRs are bulky, with components connected via wires, and last at most hours on a single battery charge. To address the needs of such experiments, we design and develop a compact, portable, untethered, and inexpensive SDR we call Sitara. Our SDR interfaces with a mobile device over Bluetooth 5 and can function standalone or as a client to a central command and control server. The Sitara offers true portability: it operates up to one week on battery power, requires no external wired connections and occupies a footprint smaller than a credit card. It transmits and receives common waveforms, uploads IQ samples or processed receiver data through a mobile device to a server for remote processing and performs spectrum sensing functions. Multiple Sitaras form a distributed system capable of conducting experiments in wireless networking and communication in addition to RF monitoring and sensing activities. In this paper, we describe our design, evaluate our solution, present experimental results from multi-sensor deployments and discuss the value of this system in future experimentation.
Submitted 30 May, 2019;
originally announced May 2019.
-
Detecting Social Influence in Event Cascades by Comparing Discriminative Rankers
Authors:
Sandeep Soni,
Shawn Ling Ramirez,
Jacob Eisenstein
Abstract:
The global dynamics of event cascades are often governed by the local dynamics of peer influence. However, detecting social influence from observational data is challenging due to confounds like homophily and practical issues like missing data. We propose a simple discriminative method to detect influence from observational data. The core of the approach is to train a ranking algorithm to predict the source of the next event in a cascade, and compare its out-of-sample accuracy against a competitive baseline which lacks access to features corresponding to social influence. We analyze synthetically generated data to show that this method correctly identifies influence in the presence of confounds, and is robust to both missing data and misspecification, unlike well-known alternatives. We apply the method to two real-world datasets: (1) the co-sponsorship of legislation in the U.S. House of Representatives on a social network of shared campaign donors; (2) rumors about the Higgs boson discovery on a follower network of $10^5$ Twitter accounts. Our model identifies the role of social influence in these scenarios and uses it to make more accurate predictions about the future trajectory of cascades.
Submitted 19 July, 2019; v1 submitted 16 February, 2018;
originally announced February 2018.