
arXiv:2403.13784

The Model Openness Framework: Promoting Completeness and Openness for Reproducibility, Transparency, and Usability in Artificial Intelligence

Authors: Matt White, Ibrahim Haddad, Cailean Osborne, Xiao-Yang Liu Yanglet, Ahmed Abdelmonsef, Sachin Varghese

Abstract: Generative AI (GAI) offers unprecedented opportunities for research and innovation, but its commercialization has raised concerns about transparency, reproducibility, and safety. Many open GAI models lack the necessary components for full understanding and reproducibility, and some use restrictive licenses whilst claiming to be "open-source". To address these concerns, we propose the Model Openness Framework (MOF), a ranked classification system that rates machine learning models based on their completeness and openness, following principles of open science, open source, open data, and open access. The MOF requires specific components of the model development lifecycle to be included and released under appropriate open licenses. This framework aims to prevent misrepresentation of models claiming to be open, guide researchers and developers in providing all model components under permissive licenses, and help individuals and organizations identify models that can be safely adopted without restrictions. By promoting transparency and reproducibility, the MOF combats "openwashing" practices and establishes completeness and openness as primary criteria alongside the core tenets of responsible AI. Wide adoption of the MOF will foster a more open AI ecosystem, benefiting research, innovation, and adoption of state-of-the-art models.

Submitted 3 June, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

Comments: 22 pages

arXiv:2401.06246

doi 10.1126/science.adn7087

Solid-state continuous time crystal with a built-in clock

Authors: I. Carraro Haddad, D. L. Chafatinos, A. S. Kuznetsov, I. A. Papuccio-Fernández, A. A. Reynoso, A. E. Bruchhausen, K. Biermann, P. V. Santos, G. Usaj, A. Fainstein

Abstract: Time crystals (TCs) are many-body systems displaying spontaneous breaking of time translation symmetry. Here, we demonstrate a TC using driven-dissipative condensates of microcavity exciton-polaritons, spontaneously formed from an incoherent particle bath. In contrast to other realizations, the TC phases can be controlled by the power of continuous-wave non-resonant optical drive exciting the condensate and optomechanical interactions with phonons. Those phases are for increasing power: (i) Larmor precession of pseudo-spins - a signature of continuous TC, (ii) locking of the frequency of precession to self-sustained coherent phonons - stabilized TC, (iii) doubling of TC frequency by phonons - a discrete TC with continuous excitation. These results establish microcavity polaritons as a platform for the investigation of time-broken symmetry in non-hermitian systems.

Submitted 11 January, 2024; originally announced January 2024.

Comments: 25 pages; 15 figures

Journal ref: Science 384, 995-1000 (2024)

arXiv:2307.06824

CLAIMED -- the open source framework for building coarse-grained operators for accelerated discovery in science

Authors: Romeo Kienzler, Rafflesia Khan, Jerome Nilmeier, Ivan Nesic, Ibrahim Haddad

Abstract: In modern data-driven science, reproducibility and reusability are key challenges. Scientists are well skilled in the process from data to publication. Although some publication channels require source code and data to be made accessible, rerunning and verifying experiments is usually hard due to a lack of standards. Therefore, reusing existing scientific data processing code from state-of-the-art research is hard as well. This is why we introduce CLAIMED, which has a proven track record in scientific research for addressing the repeatability and reusability issues in modern data-driven science. CLAIMED is a framework to build reusable operators and scalable scientific workflows by supporting the scientist to draw from previous work by re-composing workflows from existing libraries of coarse-grained scientific operators. Although various implementations exist, CLAIMED is programming language, scientific library, and execution environment agnostic.

Submitted 12 July, 2023; originally announced July 2023.

Comments: Received IEEE OSS Award 2023 - https://conferences.computer.org/services/2023/symposia/oss.html

arXiv:2201.00579

doi 10.1016/j.envint.2022.107325

European Aerosol Phenomenology -- 8: Harmonised Source Apportionment of Organic Aerosol using 22 Year-long ACSM/AMS Datasets

Authors: Gang Chen, Francesco Canonaco, Anna Tobler, Wenche Aas, Andres Alastuey, James Allan, Samira Atabakhsh, Minna Aurela, Urs Baltensperger, Aikaterini Bougiatioti, Joel F. De Brito, Darius Ceburnis, Benjamin Chazeau, Hasna Chebaicheb, Kaspar R. Daellenbach, Mikael Ehn, Imad El Haddad, Konstantinos Eleftheriadis, Olivier Favez, Harald Flentje, Anna Font, Kirsten Fossum, Evelyn Freney, Maria Gini, David C Green, et al. (45 additional authors not shown)

Abstract: Organic aerosol (OA) is a key component to total submicron particulate matter (PM1), and comprehensive knowledge of OA sources across Europe is crucial to mitigate PM1 levels. Europe has a well-established air quality research infrastructure from which yearlong datasets using 21 aerosol chemical speciation monitors (ACSMs) and 1 aerosol mass spectrometer (AMS) were gathered during 2013-2019. It includes 9 non-urban and 13 urban sites. This study developed a state-of-the-art source apportionment protocol to analyse long-term OA mass spectrum data by applying the most advanced source apportionment strategies (i.e., rolling PMF, ME-2, and bootstrap). This harmonised protocol enables the quantifications of the most common OA components such as hydrocarbon-like OA (HOA), biomass burning OA (BBOA), cooking-like OA (COA), more oxidised-oxygenated OA (MO-OOA), and less oxidised-oxygenated OA (LO-OOA). Other components such as coal combustion OA (CCOA), solid fuel OA (SFOA: mainly mixture of coal and peat combustion), cigarette smoke OA (CSOA), sea salt (mostly inorganic but part of the OA mass spectrum), coffee OA, and ship industry OA could also be separated at a few specific sites. Oxygenated OA (OOA) components make up most of the submicron OA mass (average = 71.1%, a range of 43.7-100%). Solid fuel combustion-related OA components (i.e., BBOA, CCOA, and SFOA) are still considerable with in total 16.0% yearly contribution to the OA, yet mainly during winter months (21.4%).
Overall, this comprehensive protocol works effectively across all sites governed by different sources and generates robust and consistent source apportionment results. Our work presents a comprehensive overview of OA sources in Europe with a unique combination of high time resolution and long-term data coverage (9-36 months), providing essential information to improve/validate air quality, health impact, and climate models.

Submitted 4 January, 2022; v1 submitted 3 January, 2022; originally announced January 2022.

Journal ref: Environment International 165 (2022) 107325

arXiv:2101.00675

Sentiment Analysis for Open Domain Conversational Agent

Authors: Mohamad Alissa, Issa Haddad, Jonathan Meyer, Jade Obeid, Nicolas Wiecek, Sukrit Wongariyakavee

Abstract: The applicability of common sentiment analysis models to open domain human robot interaction is investigated within this paper. The models are used on a dataset specific to user interaction with the Alana system (an Alexa Prize system) in order to determine which would be more appropriate for the task of identifying sentiment when a user interacts with a non-human driven socialbot. With the identification of a model, various improvements are attempted and detailed prior to integration into the Alana system. The study showed that a Random Forest Model with 25 trees trained on the dataset specific to user interaction with the Alana system combined with the dataset present in NLTK Vader outperforms other models. The new system (called 'Rob') matches its output utterance sentiment with the user's utterance sentiment. This method is expected to improve user experience because it builds upon the overall sentiment detection, which makes it seem that the new system sympathises with the user's feelings. Furthermore, the results obtained from the user feedback confirm our expectation.

Submitted 15 July, 2021; v1 submitted 3 January, 2021; originally announced January 2021.

Comments: 9 pages, 3 figures

Showing 1–5 of 5 results for author: Haddad, I