-
Building BESSER: an open-source low-code platform
Authors:
Iván Alfonso,
Aaron Conrardy,
Armen Sulejmani,
Atefeh Nirumand,
Fitash Ul Haq,
Marcos Gomez-Vazquez,
Jean-Sébastien Sottet,
Jordi Cabot
Abstract:
Low-code platforms (the latest reincarnation of the long tradition of model-driven engineering approaches) have the potential to save us countless hours of repetitive boilerplate coding tasks. However, as software systems grow in complexity, low-code platforms need to adapt as well. Notably, nowadays this implies adapting to the modeling and generation of smart software. At the same time, if we want to broaden the user base of this type of tool, we should also be able to provide more open source alternatives that help potential users avoid vendor lock-in and give them the freedom to explore low-code development approaches (even adapting the tool to better fit their needs). To fulfil these needs, we are building BESSER, an open source low-code platform for developing (smart) software. BESSER offers various forms (i.e., notations) for system and domain specification (e.g., UML for technical users and chatbots for business users) together with a number of generators. Both types of components can be extended and are open to contributions from the community.
Submitted 24 May, 2024; v1 submitted 22 May, 2024;
originally announced May 2024.
-
Risks and Opportunities of Open-Source Generative AI
Authors:
Francisco Eiras,
Aleksandar Petrov,
Bertie Vidgen,
Christian Schroeder,
Fabio Pizzati,
Katherine Elkins,
Supratik Mukhopadhyay,
Adel Bibi,
Aaron Purewal,
Csaba Botos,
Fabro Steibel,
Fazel Keshtkar,
Fazl Barez,
Genevieve Smith,
Gianluca Guadagni,
Jon Chun,
Jordi Cabot,
Joseph Imperial,
Juan Arturo Nolazco,
Lori Landay,
Matthew Jackson,
Philip H. S. Torr,
Trevor Darrell,
Yong Lee,
Jakob Foerster
Abstract:
Applications of Generative AI (Gen AI) are expected to revolutionize a number of different areas, ranging from science & medicine to education. The potential for these seismic changes has triggered a lively debate about the potential risks of the technology, and resulted in calls for tighter regulation, in particular from some of the major tech companies who are leading in AI development. This regulation is likely to put at risk the budding field of open-source generative AI. Using a three-stage framework for Gen AI development (near, mid and long-term), we analyze the risks and opportunities of open-source generative AI models with similar capabilities to the ones currently available (near to mid-term) and with greater capabilities (long-term). We argue that, overall, the benefits of open-source Gen AI outweigh its risks. As such, we encourage the open sourcing of models, training and evaluation data, and provide a set of recommendations and best practices for managing risks associated with open-source generative AI.
Submitted 29 May, 2024; v1 submitted 14 May, 2024;
originally announced May 2024.
-
LangBiTe: A Platform for Testing Bias in Large Language Models
Authors:
Sergio Morales,
Robert Clarisó,
Jordi Cabot
Abstract:
The integration of Large Language Models (LLMs) into various software applications raises concerns about their potential biases. Typically, those models are trained on a vast amount of data scraped from forums, websites, social media and other internet sources, which may instill harmful and discriminating behavior into the model. To address this issue, we present LangBiTe, a testing platform to systematically assess the presence of biases within an LLM. LangBiTe enables development teams to tailor their test scenarios, and automatically generate and execute the test cases according to a set of user-defined ethical requirements. Each test consists of a prompt fed into the LLM and a corresponding test oracle that scrutinizes the LLM's response for the identification of biases. LangBiTe provides users with the bias evaluation of LLMs, and end-to-end traceability between the initial ethical requirements and the insights obtained.
Submitted 29 April, 2024;
originally announced April 2024.
-
A Framework to Model ML Engineering Processes
Authors:
Sergio Morales,
Robert Clarisó,
Jordi Cabot
Abstract:
The development of Machine Learning (ML) based systems is complex and requires multidisciplinary teams with diverse skill sets. This may lead to communication issues or misapplication of best practices. Process models can alleviate these challenges by standardizing task orchestration, providing a common language to facilitate communication, and nurturing a collaborative environment. Unfortunately, current process modeling languages are not suitable for describing the development of such systems. In this paper, we introduce a framework for modeling ML-based software development processes, built around a domain-specific language and derived from an analysis of scientific and gray literature. A supporting toolkit is also available.
Submitted 29 April, 2024;
originally announced April 2024.
-
Near to Mid-term Risks and Opportunities of Open-Source Generative AI
Authors:
Francisco Eiras,
Aleksandar Petrov,
Bertie Vidgen,
Christian Schroeder de Witt,
Fabio Pizzati,
Katherine Elkins,
Supratik Mukhopadhyay,
Adel Bibi,
Botos Csaba,
Fabro Steibel,
Fazl Barez,
Genevieve Smith,
Gianluca Guadagni,
Jon Chun,
Jordi Cabot,
Joseph Marvin Imperial,
Juan A. Nolazco-Flores,
Lori Landay,
Matthew Jackson,
Paul Röttger,
Philip H. S. Torr,
Trevor Darrell,
Yong Suk Lee,
Jakob Foerster
Abstract:
In the next few years, applications of Generative AI are expected to revolutionize a number of different areas, ranging from science & medicine to education. The potential for these seismic changes has triggered a lively debate about potential risks and resulted in calls for tighter regulation, in particular from some of the major tech companies who are leading in AI development. This regulation is likely to put at risk the budding field of open-source Generative AI. We argue for the responsible open sourcing of generative AI models in the near and medium term. To set the stage, we first introduce an AI openness taxonomy system and apply it to 40 current large language models. We then outline differential benefits and risks of open versus closed source AI and present potential risk mitigation, ranging from best practices to calls for technical and scientific contributions. We hope that this report will add a much needed missing voice to the current public discourse on near to mid-term AI safety and other societal impact.
Submitted 24 May, 2024; v1 submitted 25 April, 2024;
originally announced April 2024.
-
Using Large Language Models to Enrich the Documentation of Datasets for Machine Learning
Authors:
Joan Giner-Miguelez,
Abel Gómez,
Jordi Cabot
Abstract:
Recent regulatory initiatives like the European AI Act and relevant voices in the Machine Learning (ML) community stress the need to describe datasets along several key dimensions for trustworthy AI, such as the provenance processes and social concerns. However, this information is typically presented as unstructured text in accompanying documentation, hampering its automated analysis and processing. In this work, we explore using large language models (LLMs) and a set of prompting strategies to automatically extract these dimensions from documents and enrich the dataset description with them. Our approach could aid data publishers and practitioners in creating machine-readable documentation to improve the discoverability of their datasets, assess their compliance with current AI regulations, and improve the overall quality of ML models trained on them.
In this paper, we evaluate the approach on 12 scientific dataset papers published in two scientific journals (Nature's Scientific Data and Elsevier's Data in Brief) using two different LLMs (GPT3.5 and Flan-UL2). Results show good accuracy with our prompt extraction strategies. Concrete results vary depending on the dimensions, but overall, GPT3.5 shows slightly better accuracy (81.21%) than Flan-UL2 (69.13%), although it is more prone to hallucinations. We have released an open-source tool implementing our approach and a replication package, including the experiments' code and results, in an open-source repository.
Submitted 24 May, 2024; v1 submitted 4 April, 2024;
originally announced April 2024.
-
From Image to UML: First Results of Image Based UML Diagram Generation Using LLMs
Authors:
Aaron Conrardy,
Jordi Cabot
Abstract:
In software engineering processes, systems are first specified using a modeling language such as UML. These initial designs are often created collaboratively, frequently in meetings where different domain experts use whiteboards, paper or other quick supports to create drawings and blueprints that then need to be formalized. Proper, machine-readable models are key to ensuring that models can be part of automated processes (e.g., as input to a low-code generation pipeline or a model-based testing system). But going from hand-drawn diagrams to actual models is a time-consuming process, and such drawings sometimes end up merely added as informal images to the software documentation, which greatly reduces their value. To avoid this tedious task, we explore the usage of Large Language Models (LLMs) to generate the formal representation of (UML) models from a given drawing. More specifically, we have evaluated the capabilities of different LLMs to convert images of UML class diagrams into the actual models represented in the images. While the results are good enough to use such an approach as part of a model-driven engineering pipeline, we also highlight some of their current limitations and the need to keep the human in the loop to overcome those limitations.
Submitted 18 June, 2024; v1 submitted 17 April, 2024;
originally announced April 2024.
-
Low-Modeling of Software Systems
Authors:
Jordi Cabot
Abstract:
There is a growing need for better development methods and tools to keep up with the increasing complexity of new software systems. New types of user interfaces, the need for intelligent components, sustainability concerns and other factors bring new challenges that we need to handle. In recent years, model-driven engineering has been key to improving the quality and productivity of software development, but models themselves are becoming increasingly complex to specify and manage. In this paper, we present the concept of low-modeling as a solution to enhance current model-driven engineering techniques and get them ready for this new generation of software systems.
Submitted 28 February, 2024;
originally announced February 2024.
-
On the Readiness of Scientific Data for a Fair and Transparent Use in Machine Learning
Authors:
Joan Giner-Miguelez,
Abel Gómez,
Jordi Cabot
Abstract:
To ensure the fairness and trustworthiness of machine learning (ML) systems, recent legislative initiatives and relevant research in the ML community have pointed out the need to document the data used to train ML models. Besides, data-sharing practices in many scientific domains have evolved in recent years for reproducibility purposes. In this sense, the adoption of these practices by academic institutions has encouraged researchers to publish their data and technical documentation in peer-reviewed publications such as data papers. In this study, we analyze how this scientific data documentation meets the needs of the ML community and regulatory bodies for its use in ML technologies. We examine a sample of 4041 data papers of different domains, assessing their completeness and coverage of the requested dimensions, and trends in recent years, putting special emphasis on the most and least documented dimensions. As a result, we propose a set of recommendation guidelines for data creators and scientific data publishers to increase their data's preparedness for its transparent and fairer use in ML technologies.
Submitted 18 January, 2024;
originally announced January 2024.
-
On the Suitability of Hugging Face Hub for Empirical Studies
Authors:
Adem Ait,
Javier Luis Cánovas Izquierdo,
Jordi Cabot
Abstract:
Background. The development of empirical studies in software engineering mainly relies on the data available on code hosting platforms, with GitHub being the most representative. Nevertheless, in recent years, the emergence of Machine Learning (ML) has led to the development of platforms specifically designed for developing ML-based projects, with Hugging Face Hub (HFH) being the most popular one. With over 250k repositories, and growing fast, HFH is becoming a promising ecosystem of ML artifacts and therefore a potential source of data for empirical studies. However, so far there have been no studies evaluating the potential of HFH for such studies. Objective. In this proposal for a registered report, we aim to perform an exploratory study of the current state of HFH in order to investigate its suitability as a source platform for empirical studies. Method. We conduct a qualitative and quantitative analysis of HFH for empirical studies. The former will be performed by comparing the features of HFH with those of other code hosting platforms, such as GitHub and GitLab. The latter will be performed by analyzing the data available in HFH.
Submitted 27 July, 2023;
originally announced July 2023.
-
Towards the Automatic Generation of Conversational Interfaces to Facilitate the Exploration of Tabular Data
Authors:
Marcos Gomez,
Jordi Cabot,
Robert Clarisó
Abstract:
Tabular data is the most common format to publish and exchange structured data online. A clear example is the growing number of open data portals published by all types of public administrations. However, exploitation of these data sources is currently limited to technical people able to programmatically manipulate and digest such data. As an alternative, we propose the use of chatbots to offer a conversational interface to facilitate the exploration of tabular data sources. With our approach, any regular citizen can benefit and leverage them. Moreover, our chatbots are not manually created: instead, they are automatically generated from the data source itself thanks to the instantiation of a configurable collection of conversation patterns.
Submitted 24 May, 2023; v1 submitted 18 May, 2023;
originally announced May 2023.
-
A domain-specific language for describing machine learning datasets
Authors:
Joan Giner-Miguelez,
Abel Gómez,
Jordi Cabot
Abstract:
Datasets play a central role in the training and evaluation of machine learning (ML) models. But they are also the root cause of many undesired model behaviors, such as biased predictions. To overcome this situation, the ML community is proposing a data-centric cultural shift where data issues are given the attention they deserve, and more standard practices around the gathering and processing of datasets start to be discussed and established.
So far, these proposals are mostly high-level guidelines described in natural language and, as such, they are difficult to formalize and apply to particular datasets. In this sense, and inspired by these proposals, we define a new domain-specific language (DSL) to precisely describe machine learning datasets in terms of their structure, data provenance, and social concerns. We believe this DSL will facilitate any ML initiative to leverage and benefit from this data-centric shift in ML (e.g., selecting the most appropriate dataset for a new project or better replicating other ML results). The DSL is implemented as a Visual Studio Code plugin, and it has been published under an open source license.
Submitted 8 July, 2022; v1 submitted 5 July, 2022;
originally announced July 2022.
-
The Present and Future of Bots in Software Engineering
Authors:
Emad Shihab,
Stefan Wagner,
Marco A. Gerosa,
Mairieli Wessel,
Jordi Cabot
Abstract:
We are witnessing a massive adoption of software engineering bots, applications that react to events triggered by tools and messages posted by users and run automated tasks in response, in a variety of domains. This thematic issue describes experiences and challenges with these bots.
Submitted 4 July, 2022;
originally announced July 2022.
-
Self-adaptive Architectures in IoT Systems: A Systematic Literature Review
Authors:
Iván Alfonso,
Kelly Garcés,
Harold Castro,
Jordi Cabot
Abstract:
Over the past few years, the relevance of the Internet of Things (IoT) has grown significantly and is now a key component of many industrial processes and even a transparent participant in various activities performed in our daily life. IoT systems are subjected to changes in the dynamic environments they operate in. These changes (e.g. variations in bandwidth consumption or new devices joining/leaving) may impact the Quality of Service (QoS) of the IoT system. A number of self-adaptation strategies for IoT architectures to better deal with these changes have been proposed in the literature. Nevertheless, they focus on isolated types of changes. We lack a comprehensive view of the trade-offs of each proposal and how they could be combined to cope with simultaneous events of different types.
In this paper, we identify, analyze, and interpret relevant studies related to IoT adaptation and develop a comprehensive and holistic view of the interplay of different dynamic events, their consequences on QoS, and the alternatives for the adaptation. To do so, we have conducted a systematic literature review of existing scientific proposals and defined a research agenda for the near future based on the findings and weaknesses identified in the literature.
Submitted 27 December, 2021; v1 submitted 7 September, 2021;
originally announced September 2021.
-
A Model-based Chatbot Generation Approach to Converse with Open Data Sources
Authors:
Hamza Ed-douibi,
Javier Luis Cánovas Izquierdo,
Gwendal Daniel,
Jordi Cabot
Abstract:
The Open Data movement promotes the free distribution of data. More and more companies and governmental organizations are making their data available online following the Open Data philosophy, resulting in a growing market of technologies and services to help publish and consume data. One of the emergent ways to publish such data is via Web APIs, which offer a powerful means to reuse this data and integrate it with other services. Socrata, CKAN or OData are examples of popular specifications for publishing data via Web APIs.
Nevertheless, querying and integrating these Web APIs is time-consuming and requires technical skills that limit the benefits of the Open Data movement for the regular citizen. In other contexts, chatbot applications are being increasingly adopted as a direct communication channel between companies and end-users. We believe the same could be true for Open Data as a way to bridge the gap between citizens and Open Data sources. This paper describes an approach to automatically derive full-fledged chatbots from API-based Open Data sources. Our process relies on a model-based intermediate representation (via UML class diagrams and profiles) to facilitate the customization of the chatbot to be generated.
Submitted 20 July, 2020;
originally announced July 2020.
-
A Survey of Software Foundations in Open Source
Authors:
Javier Luis Cánovas Izquierdo,
Jordi Cabot
Abstract:
A number of software foundations have been created as legal instruments to better articulate the structure, collaboration and financial model of Open Source Software (OSS) projects. Some examples are the Apache, Linux, or Mozilla foundations. However, the mission and support provided by these foundations differ widely. In this paper we perform a study on the role of foundations in OSS development. We analyze the nature, activities, role and governance of 101 software foundations and then take a closer look at the 27 whose concrete goal is the development and evolution of specific open source projects (and not just generic actions to promote the free software movement or similar). Our results reveal the existence of a significant number of foundations with the sole purpose of promoting the free software movement and/or that limit themselves to core legal aspects but do not play any role in the day-to-day operations of the project (e.g., umbrella organizations for a large variety of projects). Therefore, while useful, foundations do not remove the need for specific projects to develop their own specific governance, contribution and development policies. A website to help projects choose the foundation that best fits their needs is also available.
Submitted 20 May, 2020;
originally announced May 2020.
-
Online division of labour: emergent structures in Open Source Software
Authors:
María J. Palazzi,
Jordi Cabot,
Javier Luis Cánovas Izquierdo,
Albert Solé-Ribalta,
Javier Borge-Holthoefer
Abstract:
The development of Open Source Software fundamentally depends on the participation and commitment of volunteer developers to progress. Several works have presented strategies to increase the on-boarding and engagement of new contributors, but little is known on how these diverse groups of developers self-organise to work together. To understand this, one must consider that, on one hand, platforms like GitHub provide a virtually unlimited development framework: any number of actors can potentially join to contribute in a decentralised, distributed, remote, and asynchronous manner. On the other, however, it seems reasonable that some sort of hierarchy and division of labour must be in place to meet human biological and cognitive limits, and also to achieve some level of efficiency. These latter features (hierarchy and division of labour) should translate into recognisable structural arrangements when projects are represented as developer-file bipartite networks. In this paper we analyse a set of popular open source projects from GitHub, placing the accent on three key properties: nestedness, modularity and in-block nestedness, which typify the emergence of heterogeneities among contributors, the emergence of subgroups of developers working on specific subgroups of files, and a mixture of the two previous, respectively. These analyses show that indeed projects evolve into internally organised blocks. Furthermore, the distribution of sizes of such blocks is bounded, connecting our results to the celebrated Dunbar number both in off- and on-line environments. Our analyses create a link between bio-cognitive constraints, group formation and online working environments, opening up a rich scenario for future research on (online) work team assembly.
Submitted 8 March, 2019;
originally announced March 2019.
-
EMF-REST: Generation of RESTful APIs from Models
Authors:
Hamza Ed-Douibi,
Javier Luis Cánovas Izquierdo,
Abel Gómez,
Massimo Tisi,
Jordi Cabot
Abstract:
In recent years, RESTful Web services have become more and more popular as a lightweight solution to connect remote systems in distributed and Cloud-based architectures. However, being an architectural style rather than a specification or standard, the proper design of RESTful Web services is not trivial since developers have to deal with a plethora of recommendations and best practices. Model-Driven Engineering (MDE) emphasizes the use of models and model transformations to raise the level of abstraction and semi-automate the development of software. In this paper we present an approach that leverages MDE techniques to generate RESTful services. The approach, called EMF-REST, takes EMF data models as input and generates Web APIs following the REST principles and relying on well-known libraries and standards, thus facilitating its comprehension and maintainability. Additionally, EMF-REST integrates model and Web-specific features to provide model validation and security capabilities, respectively, to the generated API. For Web developers, our approach brings more agility to the Web development process by providing ready-to-run-and-test Web APIs out of data models. Also, our approach provides MDE practitioners the basis to develop Cloud-based modeling solutions as well as enhanced collaborative support.
Submitted 14 April, 2015;
originally announced April 2015.
-
Three Metrics to Explore the Openness of GitHub projects
Authors:
Valerio Cosentino,
Javier Luis Canovas Izquierdo,
Jordi Cabot
Abstract:
Open source software projects evolve thanks to a group of volunteers that help in their development. Thus, the success of these projects depends on their ability to attract (and keep) developers. We believe the openness of a project, i.e., how easy it is for a new user to actively contribute to it, can help to make a project more attractive. To explore the openness of a software project, we propose three metrics focused on: (1) the distribution of the project community, (2) the rate of acceptance of external contributions and (3) the time it takes to become an official collaborator of the project. We have adapted and applied these metrics to a subset of GitHub projects, thus giving some practical findings on their openness.
Submitted 15 September, 2014;
originally announced September 2014.
-
PORTOLAN: a Model-Driven Cartography Framework
Authors:
Vincent Mahe,
Salvador Martinez Perez,
Guillaume Doux,
Hugo Brunelière,
Jordi Cabot
Abstract:
Processing large amounts of data to extract useful information is an essential task within companies. To help in this task, visualization techniques have been commonly used due to their capacity to present data in synthesized views, easier to understand and manage. However, achieving the right visualization display for a data set is a complex cartography process that involves several transformation steps to adapt the (domain) data to the (visualization) data format expected by visualization tools. To maximize the benefits of visualization we propose Portolan, a generic model-driven cartography framework that facilitates the discovery of the data to visualize, the specification of view definitions for that data and the transformations to bridge the gap with the visualization tools. Our approach has been implemented on top of the Eclipse EMF modeling framework and validated on three different use cases.
Submitted 23 February, 2011;
originally announced February 2011.