-
Predicting Machine Translation Performance on Low-Resource Languages: The Role of Domain Similarity
Authors:
Eric Khiu,
Hasti Toossi,
David Anugraha,
**yu Liu,
Jiaxu Li,
Juan Armando Parra Flores,
Leandro Acros Roman,
A. Seza Doğruöz,
En-Shiun Annie Lee
Abstract:
Fine-tuning and testing a multilingual large language model is expensive and challenging for low-resource languages (LRLs). While previous studies have predicted the performance of natural language processing (NLP) tasks using machine learning methods, they primarily focus on high-resource languages, overlooking LRLs and shifts across domains. Focusing on LRLs, we investigate three factors: the size of the fine-tuning corpus, the domain similarity between fine-tuning and testing corpora, and the language similarity between source and target languages. We employ classical regression models to assess how these factors impact the model's performance. Our results indicate that domain similarity has the most critical impact on predicting the performance of Machine Translation models.
Submitted 4 February, 2024;
originally announced February 2024.
-
NEUROPULS: NEUROmorphic energy-efficient secure accelerators based on Phase change materials aUgmented siLicon photonicS
Authors:
Fabio Pavanello,
Cedric Marchand,
Ian O'Connor,
Regis Orobtchouk,
Fabien Mandorlo,
Xavier Letartre,
Sebastien Cueff,
Elena Ioana Vatajelu,
Giorgio Di Natale,
Benoit Cluzel,
Aurelien Coillet,
Benoit Charbonnier,
Pierre Noe,
Frantisek Kavan,
Martin Zoldak,
Michal Szaj,
Peter Bienstman,
Thomas Van Vaerenbergh,
Ulrich Ruhrmair,
Paulo Flores,
Luis Guerra e Silva,
Ricardo Chaves,
Luis-Miguel Silveira,
Mariano Ceccato,
Dimitris Gizopoulos,
et al. (12 additional authors not shown)
Abstract:
This special session paper introduces the Horizon Europe NEUROPULS project, which targets the development of secure and energy-efficient RISC-V interfaced neuromorphic accelerators using augmented silicon photonics technology. Our approach aims to develop an augmented silicon photonics platform, an FPGA-powered RISC-V-connected computing platform, and a complete simulation platform to demonstrate the neuromorphic accelerator capabilities. In particular, we will address the main advantages and limitations of the technology underpinning each platform. Then, we will discuss three targeted use cases for edge-computing applications: Global Navigation Satellite System (GNSS) anti-jamming, autonomous driving, and anomaly detection in edge devices. Finally, we will address the reliability and security aspects of the stand-alone accelerator implementation and the project use cases.
Submitted 4 May, 2023;
originally announced May 2023.
-
Site-specific weed management in corn using UAS imagery analysis and computer vision techniques
Authors:
Ranjan Sapkota,
John Stenger,
Michael Ostlie,
Paulo Flores
Abstract:
Currently, weed control in commercial corn production is performed without considering weed distribution information in the field. This kind of weed management practice leads to excessive amounts of chemical herbicides being applied in a given field. The objective of this study was to perform site-specific weed control (SSWC) in a corn field by 1) using an unmanned aerial system (UAS) to map the spatial distribution of weeds in the field; 2) creating a prescription map based on the weed distribution map; and 3) spraying the field using the prescription map and a commercial-size sprayer. In this study, we propose a Crop Row Identification (CRI) algorithm, a computer vision algorithm that identifies corn rows in UAS imagery. Once identified, the corn rows were removed from the imagery and the remaining vegetation fraction was classified as weeds. Based on that information, a grid-based weed prescription map was created and the weed control application was implemented through a commercial-size sprayer. The decision to spray herbicides on a particular grid cell was based on the presence of weeds in that cell: all grid cells that contained at least one weed were sprayed, while the cells free of weeds were not. Using our SSWC approach, we were able to save 26.23% of the land (1.97 acres) from being sprayed with chemical herbicides compared to the existing method. This study presents a full workflow from UAS image collection to field weed control implementation using a commercial-size sprayer, and it shows that some level of savings can potentially be obtained even in a situation with high weed infestation, which might provide an opportunity to reduce chemical usage in corn production systems.
Submitted 31 December, 2022;
originally announced January 2023.
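The grid-based spraying rule described in the abstract above (spray a grid cell if it contains at least one weed) can be sketched roughly as follows. This is only an illustrative toy, not the authors' implementation: the function names `prescription_map` and `unsprayed_fraction`, the pixel-level boolean weed mask, and the `cell` size parameter are all assumptions introduced here, and the mask values are made up for the example.

```python
from typing import List


def prescription_map(weed_mask: List[List[bool]], cell: int) -> List[List[bool]]:
    """Return a grid where True means "spray this cell".

    weed_mask is a hypothetical per-pixel classification (True = weed pixel),
    i.e. what remains after crop rows have been removed from the imagery.
    cell is an assumed side length of one square grid cell, in pixels.
    """
    rows, cols = len(weed_mask), len(weed_mask[0])
    grid = []
    for r0 in range(0, rows, cell):
        grid_row = []
        for c0 in range(0, cols, cell):
            # Spray the cell if at least one pixel inside it is a weed.
            has_weed = any(
                weed_mask[r][c]
                for r in range(r0, min(r0 + cell, rows))
                for c in range(c0, min(c0 + cell, cols))
            )
            grid_row.append(has_weed)
        grid.append(grid_row)
    return grid


def unsprayed_fraction(grid: List[List[bool]]) -> float:
    """Fraction of grid cells that are weed-free and therefore not sprayed."""
    cells = [flag for row in grid for flag in row]
    return cells.count(False) / len(cells)


# Toy 4x4 mask with a single weed pixel (illustrative values only).
mask = [
    [False, False, False, False],
    [False, True, False, False],
    [False, False, False, False],
    [False, False, False, False],
]
grid = prescription_map(mask, cell=2)
saved = unsprayed_fraction(grid)
```

With equally sized cells, the unsprayed fraction of cells corresponds to the fraction of land saved from spraying; in this toy mask only the top-left 2x2 cell would be sprayed.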
-
Error-Aware B-PINNs: Improving Uncertainty Quantification in Bayesian Physics-Informed Neural Networks
Authors:
Olga Graf,
Pablo Flores,
Pavlos Protopapas,
Karim Pichara
Abstract:
Physics-Informed Neural Networks (PINNs) are gaining popularity as a method for solving differential equations. While more feasible in some contexts than classical numerical techniques, PINNs still lack credibility. A remedy can be found in Uncertainty Quantification (UQ), which is just beginning to emerge in the context of PINNs. Assessing how well the trained PINN complies with the imposed differential equation is the key to tackling uncertainty, yet there is a lack of comprehensive methodology for this task. We propose a framework for UQ in Bayesian PINNs (B-PINNs) that incorporates the discrepancy between the B-PINN solution and the unknown true solution. We exploit recent results on error bounds for PINNs on linear dynamical systems and demonstrate the predictive uncertainty on a class of linear ODEs.
Submitted 13 December, 2022;
originally announced December 2022.
-
Using UAS Imagery and Computer Vision to Support Site-Specific Weed Control in Corn
Authors:
Ranjan Sapkota,
Paulo Flores
Abstract:
Currently, weed control in a corn field is performed by a blanket application of herbicides that does not consider the spatial distribution of weeds and uses an extensive amount of chemical herbicides. To reduce the amount of chemicals, we used drone-based high-resolution imagery and computer-vision techniques to perform site-specific weed control in corn.
Submitted 2 June, 2022;
originally announced June 2022.
-
Uncertainty Quantification in Neural Differential Equations
Authors:
Olga Graf,
Pablo Flores,
Pavlos Protopapas,
Karim Pichara
Abstract:
Uncertainty quantification (UQ) helps to make trustworthy predictions based on collected observations and uncertain domain knowledge. With the increased usage of deep learning in various applications, the need for efficient UQ methods that can make deep models more reliable has increased as well. Among the applications that can benefit from effective handling of uncertainty are deep-learning-based differential equation (DE) solvers. We adapt several state-of-the-art UQ methods to obtain the predictive uncertainty for DE solutions and show the results on four different DE types.
Submitted 7 November, 2021;
originally announced November 2021.
-
Audience and Streamer Participation at Scale on Twitch
Authors:
Claudia Flores-Saviaga,
Jessica Hammer,
Juan Pablo Flores,
Joseph Seering,
Stuart Reeves,
Saiph Savage
Abstract:
Large-scale streaming platforms such as Twitch are becoming increasingly popular, but detailed audience-streamer interaction dynamics remain unexplored at scale. In this paper, we perform a mixed-methods study on a dataset with over 12 million audience chat messages and 45 hours of streaming video to understand audience participation and streamer performance on Twitch. We uncover five types of streams based on size and audience participation styles: Clique Streams, small streams with close streamer-audience interactions; Rising Streamers, mid-range streams using custom technology and moderators to formalize their communities; Chatter-boxes, mid-range streams with established conversational dynamics; Spotlight Streamers, large streams that engage large numbers of viewers while still retaining a sense of community; and Professionals, massive streams with stadium-style audiences. We discuss challenges and opportunities emerging for streamers and audiences from each style and conclude by providing data-backed design implications that empower streamers, audiences, live streaming platforms, and game designers.
Submitted 30 November, 2020;
originally announced December 2020.
-
Using data science as a community advocacy tool to promote equity in urban renewal programs: An analysis of Atlanta's Anti-Displacement Tax Fund
Authors:
Jeremy Auerbach,
Hayley Barton,
Takeria Blunt,
Vishwamitra Chaganti,
Bhavya Ghai,
Amanda Meng,
Christopher Blackburn,
Ellen Zegura,
Pamela Flores
Abstract:
Cities across the United States are undergoing great transformation and urban growth. Data and data analysis have become essential elements of urban planning as cities use data to plan land use and development. One great challenge is to use the tools of data science to promote equity along with growth. The city of Atlanta is an example site of large-scale urban renewal that aims to engage in development without displacement. On the Westside of downtown Atlanta, the construction of the new Mercedes-Benz Stadium and the conversion of an underutilized rail line into a multi-use trail may result in increased property values. In response to community residents' concerns and a commitment to development without displacement, the city and philanthropic partners announced an Anti-Displacement Tax Fund to subsidize future property tax increases of owner occupants for the next twenty years. To achieve greater transparency, accountability, and impact, residents expressed a desire for a tool that would help them determine eligibility and quantify this commitment. In support of this goal, we use machine learning techniques to analyze historical tax assessments and predict future tax assessments. We then apply eligibility estimates to our predictions to estimate the total cost for the first seven years of the program. These forecasts are also incorporated into an interactive tool for community residents to determine their eligibility for the fund and the expected increase in their home value over the next seven years.
Submitted 6 October, 2017;
originally announced October 2017.