
The ModelCC Model-Based Parser Generator

Authors: Luis Quesada, Fernando Berzal, Juan-Carlos Cubero

Abstract: Formal languages let us define the textual representation of data with precision. Formal grammars, typically in the form of BNF-like productions, describe the language syntax, which is then annotated for syntax-directed translation and completed with semantic actions. When, apart from the textual representation of data, an explicit representation of the corresponding data structure is required, the language designer has to devise the mapping between the suitable data model and its proper language specification, and then develop the conversion procedure from the parse tree to the data model instance. Unfortunately, whenever the format of the textual representation has to be modified, changes have to be propagated throughout the entire language processor tool chain. These updates are time-consuming, tedious, and error-prone. Besides, in case different applications use the same language, several copies of the same language specification have to be maintained. In this paper, we introduce ModelCC, a model-based parser generator that decouples language specification from language processing, hence avoiding many of the problems caused by grammar-driven parsers and parser generators. ModelCC incorporates reference resolution within the parsing process. Therefore, instead of returning mere abstract syntax trees, ModelCC is able to obtain abstract syntax graphs from input strings.

Submitted 11 January, 2015; originally announced January 2015.

Comments: arXiv admin note: substantial text overlap with arXiv:1111.3970, arXiv:1501.02038

arXiv:1501.02795 [pdf, ps, other]

Scanning and Parsing Languages with Ambiguities and Constraints: The Lamb and Fence Algorithms

Authors: Luis Quesada, Fernando Berzal, Francisco J. Cortijo

Abstract: Traditional language processing tools constrain language designers to specific kinds of grammars. In contrast, model-based language processing tools decouple language design from language processing. These tools allow the occurrence of lexical and syntactic ambiguities in language specifications and the declarative specification of constraints for resolving them. As a result, these techniques require scanners and parsers able to parse context-free grammars, handle ambiguities, and enforce constraints for disambiguation. In this paper, we present Lamb and Fence. Lamb is a scanning algorithm that supports ambiguous token definitions and the specification of custom pattern matchers and constraints. Fence is a chart parsing algorithm that supports ambiguous context-free grammars and the definition of constraints on associativity, composition, and precedence, as well as custom constraints. Lamb and Fence, in conjunction, enable the implementation of the ModelCC model-based language processing tool.

Submitted 11 January, 2015; originally announced January 2015.

Comments: arXiv admin note: text overlap with arXiv:1111.3970, arXiv:1110.1470

arXiv:1501.02038 [pdf, other]

doi 10.4204/EPTCS.173.5

The ModelCC Model-Driven Parser Generator

Authors: Fernando Berzal, Francisco J. Cortijo, Juan-Carlos Cubero, Luis Quesada

Abstract: Syntax-directed translation tools require the specification of a language by means of a formal grammar. This grammar must conform to the specific requirements of the parser generator to be used. This grammar is then annotated with semantic actions for the resulting system to perform its desired function. In this paper, we introduce ModelCC, a model-based parser generator that decouples language specification from language processing, avoiding some of the problems caused by grammar-driven parser generators. ModelCC receives a conceptual model as input, along with constraints that annotate it. It is then able to create a parser for the desired textual syntax and the generated parser fully automates the instantiation of the language conceptual model. ModelCC also includes a reference resolution mechanism so that ModelCC is able to instantiate abstract syntax graphs, rather than mere abstract syntax trees.

Submitted 8 January, 2015; originally announced January 2015.

Comments: In Proceedings PROLE 2014, arXiv:1501.01693

Journal ref: EPTCS 173, 2015, pp. 56-70

arXiv:1401.3842 [pdf]

doi 10.1613/jair.2992

Developing Approaches for Solving a Telecommunications Feature Subscription Problem

Authors: David Lesaint, Deepak Mehta, Barry O'Sullivan, Luis Quesada, Nic Wilson

Abstract: Call control features (e.g., call-divert, voice-mail) are primitive options to which users can subscribe off-line to personalise their service. The configuration of a feature subscription involves choosing and sequencing features from a catalogue and is subject to constraints that prevent undesirable feature interactions at run-time. When the subscription requested by a user is inconsistent, one problem is to find an optimal relaxation, which is a generalisation of the feedback vertex set problem on directed graphs, and thus it is an NP-hard task. We present several constraint programming formulations of the problem. We also present formulations using partial weighted maximum Boolean satisfiability and mixed integer linear programming. We study all these formulations by experimentally comparing them on a variety of randomly generated instances of the feature subscription problem.

Submitted 15 January, 2014; originally announced January 2014.

Journal ref: Journal Of Artificial Intelligence Research, Volume 38, pages 271-305, 2010

arXiv:1301.4858 [pdf, ps, other]

A DSL for Mapping Abstract Syntax Models to Concrete Syntax Models in ModelCC

Authors: Luis Quesada, Fernando Berzal, Juan-Carlos Cubero

Abstract: ModelCC is a model-based parser generator that decouples language design from language processing. ModelCC provides two different mechanisms to specify the mapping from an abstract syntax model to a concrete syntax model: metadata annotations defined on top of the abstract syntax model specification and a domain-specific language for defining ASM-CSM mappings. Using a domain-specific language to specify the mapping from abstract to concrete syntax models allows the definition of multiple concrete syntax models for the same abstract syntax model. In this paper, we describe the ModelCC domain-specific language for abstract syntax model to concrete syntax model mappings and we showcase its capabilities by providing a meta-definition of that domain-specific language.

Submitted 21 January, 2013; originally announced January 2013.

Comments: arXiv admin note: substantial text overlap with arXiv:1202.6593

arXiv:1205.3183 [pdf, ps, other]

A Model-Driven Probabilistic Parser Generator

Authors: Luis Quesada, Fernando Berzal, Francisco J. Cortijo

Abstract: Existing probabilistic scanners and parsers impose hard constraints on the way lexical and syntactic ambiguities can be resolved. Furthermore, traditional grammar-based parsing tools are limited in the mechanisms they allow for taking context into account. In this paper, we propose a model-driven tool that allows for statistical language models with arbitrary probability estimators. Our work on model-driven probabilistic parsing is built on top of ModelCC, a model-based parser generator, and enables the probabilistic interpretation and resolution of anaphoric, cataphoric, and recursive references in the disambiguation of abstract syntax graphs. In order to prove the expressive power of ModelCC, we describe the design of a general-purpose natural language parser.

Submitted 14 May, 2012; originally announced May 2012.

arXiv:1205.3180 [pdf, ps, other]

Community-Quality-Based Player Ranking in Collaborative Games with no Explicit Objectives

Authors: Luis Quesada, Pablo J. Villacorta

Abstract: Player ranking can be used to determine the quality of the contributions of a player to a collaborative community. However, collaborative games with no explicit objectives do not support player ranking, as there is no metric to measure the quality of player contributions. An implicit objective of such communities is not being disruptive towards other players. In this paper, we propose a parameterizable approach for real-time player ranking in collaborative games with no explicit objectives. Our method computes a ranking by applying a simple heuristic community quality function. We also demonstrate the capabilities of our approach by applying several parameterizations of it to a case study and comparing the obtained results.

Submitted 14 May, 2012; originally announced May 2012.

arXiv:1203.0076 [pdf, ps, other]

Using Barriers to Reduce the Sensitivity to Edge Miscalculations of Casting-Based Object Projection Feature Estimation

Authors: Luis Quesada

Abstract: 3D motion tracking is a critical task in many computer vision applications. Unsupervised markerless 3D motion tracking systems determine the most relevant object in the screen and then track it by continuously estimating its projection features (center and area) from the edge image and a point inside the relevant object projection (namely, inner point), until the tracking fails. Existing reliable object projection feature estimation techniques are based on ray-casting or grid-filling from the inner point. These techniques assume the edge image to be accurate. However, in real case scenarios, edge miscalculations may arise from low contrast between the target object and its surroundings or motion blur caused by low frame rates or fast moving target objects. In this paper, we propose a barrier extension to casting-based techniques that mitigates the effect of edge miscalculations.

Submitted 29 February, 2012; originally announced March 2012.

Comments: arXiv admin note: substantial text overlap with arXiv:1202.6586v1 and arXiv:1111.3969

arXiv:1202.6593 [pdf, ps, other]

A Model-Driven Parser Generator, from Abstract Syntax Trees to Abstract Syntax Graphs

Authors: Luis Quesada, Fernando Berzal, Juan-Carlos Cubero

Abstract: Model-based parser generators decouple language specification from language processing. The model-driven approach avoids the limitations that conventional parser generators impose on the language designer. Conventional tools require the designed language grammar to conform to the specific kind of grammar supported by the particular parser generator (LL and LR parser generators being the most common). Model-driven parser generators, like ModelCC, do not require a grammar specification, since that grammar can be automatically derived from the language model and, if needed, adapted to conform to the requirements of the given kind of parser, all of this without interfering with the conceptual design of the language and its associated applications. Moreover, model-driven tools such as ModelCC are able to automatically resolve references between language elements, hence producing abstract syntax graphs instead of abstract syntax trees as the result of the parsing process. Such graphs are not confined to directed acyclic graphs and they can contain cycles, since ModelCC supports anaphoric, cataphoric, and recursive references.

Submitted 29 February, 2012; originally announced February 2012.

arXiv:1202.6586 [pdf, ps, other]

Filling-Based Techniques Applied to Object Projection Feature Estimation

Authors: Luis Quesada, Alejandro J. León

Abstract: 3D motion tracking is a critical task in many computer vision applications. Unsupervised markerless 3D motion tracking systems determine the most relevant object in the screen and then track it by continuously estimating its projection features (center and area) from the edge image and a point inside the relevant object projection (namely, inner point), until the tracking fails. Existing object projection feature estimation techniques are based on ray-casting from the inner point. These techniques present three main drawbacks: when the inner point is surrounded by edges, rays may not reach other relevant areas; as a consequence of that issue, the estimated features may greatly vary depending on the position of the inner point relative to the object projection; and finally, increasing the number of rays being cast and the ray-casting iterations (which would make the results more accurate and stable) increases the processing time to the point the tracking cannot be performed on the fly. In this paper, we analyze an intuitive filling-based object projection feature estimation technique that solves the aforementioned problems but is too sensitive to edge miscalculations. Then, we propose a less computing-intensive modification to that technique that would not be affected by the existing techniques' issues and would be no more sensitive to edge miscalculations than ray-casting-based techniques.

Submitted 29 February, 2012; originally announced February 2012.

Comments: arXiv admin note: substantial text overlap with arXiv:1111.3969

arXiv:1202.6583 [pdf, ps, other]

A Lexical Analysis Tool with Ambiguity Support

Authors: Luis Quesada, Fernando Berzal, Francisco J. Cortijo

Abstract: Lexical ambiguities naturally arise in languages. We present Lamb, a lexical analyzer that produces a lexical analysis graph describing all the possible sequences of tokens that can be found within the input string. Parsers can process such lexical analysis graphs and discard any sequence of tokens that does not produce a valid syntactic sentence, therefore performing, together with Lamb, a context-sensitive lexical analysis in lexically-ambiguous language specifications.

Submitted 29 February, 2012; originally announced February 2012.

arXiv:1111.3970 [pdf, ps, other]

A Tool for Model-Based Language Specification

Authors: Luis Quesada, Fernando Berzal, Juan-Carlos Cubero

Abstract: Formal languages let us define the textual representation of data with precision. Formal grammars, typically in the form of BNF-like productions, describe the language syntax, which is then annotated for syntax-directed translation and completed with semantic actions. When, apart from the textual representation of data, an explicit representation of the corresponding data structure is required, the language designer has to devise the mapping between the suitable data model and its proper language specification, and then develop the conversion procedure from the parse tree to the data model instance. Unfortunately, whenever the format of the textual representation has to be modified, changes have to be propagated throughout the entire language processor tool chain. These updates are time-consuming, tedious, and error-prone. Besides, in case different applications use the same language, several copies of the same language specification have to be maintained. In this paper, we introduce a model-based parser generator that decouples language specification from language processing, hence avoiding many of the problems caused by grammar-driven parsers and parser generators.

Submitted 16 November, 2011; originally announced November 2011.

arXiv:1111.3969 [pdf, ps, other]

The Object Projection Feature Estimation Problem in Unsupervised Markerless 3D Motion Tracking

Authors: Luis Quesada, Alejandro J. León

Abstract: 3D motion tracking is a critical task in many computer vision applications. Existing 3D motion tracking techniques require either a great amount of knowledge on the target object or specific hardware. These requirements discourage the wide spread of commercial applications based on 3D motion tracking. 3D motion tracking systems that require no knowledge on the target object and run on a single low-budget camera require estimations of the object projection features (namely, area and position). In this paper, we define the object projection feature estimation problem and we present a novel 3D motion tracking system that needs no knowledge on the target object and that only requires a single low-budget camera, as installed in most computers and smartphones. Our system estimates, in real time, the three-dimensional position of a non-modeled unmarked object that may be non-rigid, non-convex, partially occluded, self-occluded, or motion blurred, given that it is opaque, evenly colored, and contrasting enough with the background in each frame. Our system is also able to determine the most relevant object to track in the screen. Our 3D motion tracking system does not impose hard constraints, therefore it allows a market-wide implementation of applications that use 3D motion tracking.

Submitted 18 November, 2011; v1 submitted 16 November, 2011; originally announced November 2011.

arXiv:1110.1716 [pdf, ps, other]

Treating Insomnia, Amnesia, and Acalculia in Regular Expression Matching

Authors: Luis Quesada, Fernando Berzal, Francisco J. Cortijo

Abstract: Regular expressions provide a flexible means for matching strings and they are often used in data-intensive applications. They are formally equivalent to either deterministic finite automata (DFAs) or nondeterministic finite automata (NFAs). Both DFAs and NFAs are affected by two problems known as amnesia and acalculia, and DFAs are also affected by a problem known as insomnia. Existing techniques require an automata conversion and compaction step that prevents the use of existing automaton databases and hinders the maintenance of the resulting compact automata. In this paper, we propose Parallel Finite State Machines (PFSMs), which are able to run any DFA- or NFA-like state machines without a previous conversion or compaction step. PFSMs report, online, all the matches found within an input string and they solve the three aforementioned problems. Parallel Finite State Machines require quadratic time and linear memory and they are distributable. Parallel Finite State Machines make very fast distributed regular expression matching in data-intensive applications feasible.

Submitted 5 November, 2011; v1 submitted 8 October, 2011; originally announced October 2011.

arXiv:1110.1470 [pdf, ps, other]

A Constraint-Satisfaction Parser for Context-Free Grammars

Authors: Luis Quesada, Fernando Berzal, Francisco J. Cortijo

Abstract: Traditional language processing tools constrain language designers to specific kinds of grammars. In contrast, model-based language specification decouples language design from language processing. As a consequence, model-based language specification tools need general parsers able to parse unrestricted context-free grammars. As languages specified following this approach may be ambiguous, parsers must deal with ambiguities. Model-based language specification also allows the definition of associativity, precedence, and custom constraints. Therefore parsers generated by model-driven language specification tools need to enforce constraints. In this paper, we propose Fence, an efficient bottom-up chart parser with lexical and syntactic ambiguity support that allows the specification of constraints and, therefore, enables the use of model-based language specification in practice.

Submitted 2 February, 2012; v1 submitted 7 October, 2011; originally announced October 2011.

arXiv:1109.1231 [pdf, other]

A Combinatorial Optimisation Approach to Designing Dual-Parented Long-Reach Passive Optical Networks

Authors: Hadrien Cambazard, Deepak Mehta, Barry O'Sullivan, Luis Quesada, Marco Ruffini, David Payne, Linda Doyle

Abstract: We present an application focused on the design of resilient long-reach passive optical networks. We specifically consider dual-parented networks whereby each customer must be connected to two metro sites via local exchange sites. An important property of such a placement is resilience to single metro node failure. The objective of the application is to determine the optimal position of a set of metro nodes such that the total optical fibre length is minimized. We prove that this problem is NP-Complete. We present two alternative combinatorial optimisation approaches to finding an optimal metro node placement using: a mixed integer linear programming (MIP) formulation of the problem; and, a hybrid approach that uses clustering as a preprocessing step. We consider a detailed case-study based on a network for Ireland. The hybrid approach scales well and finds solutions that are close to optimal, with a runtime that is two orders-of-magnitude better than the MIP model.

Submitted 6 September, 2011; originally announced September 2011.

Comments: University of Ulster, Intelligent System Research Centre, technical report series. ISSN 2041-6407

Journal ref: Proceedings of the 22nd Irish Conference on Artificial Intelligence and Cognitive Science (AICS 2011), pp. 26-35, Derry, UK

arXiv:1107.4687 [pdf, ps, other]

Fence - An Efficient Parser with Ambiguity Support for Model-Driven Language Specification

Authors: Luis Quesada, Fernando Berzal, Francisco J. Cortijo

Abstract: Model-based language specification has applications in the implementation of language processors, the design of domain-specific languages, model-driven software development, data integration, text mining, natural language processing, and corpus-based induction of models. Model-based language specification decouples language design from language processing and, unlike traditional grammar-driven approaches, which constrain language designers to specific kinds of grammars, it needs general parser generators able to deal with ambiguities. In this paper, we propose Fence, an efficient bottom-up parsing algorithm with lexical and syntactic ambiguity support that enables the use of model-based language specification in practice.

Submitted 7 October, 2011; v1 submitted 23 July, 2011; originally announced July 2011.

Showing 1–17 of 17 results for author: Quesada, L