-
Region-level labels in ice charts can produce pixel-level segmentation for Sea Ice types
Authors:
Muhammed Patel,
Xinwei Chen,
Linlin Xu,
Yuhao Chen,
K Andrea Scott,
David A. Clausi
Abstract:
Fully supervised deep learning approaches have demonstrated impressive accuracy in sea ice classification, but their dependence on high-resolution labels presents a significant challenge due to the difficulty of obtaining such data. In response, our weakly supervised learning method provides a compelling alternative by utilizing lower-resolution regional labels from expert-annotated ice charts. Th…
▽ More
Fully supervised deep learning approaches have demonstrated impressive accuracy in sea ice classification, but their dependence on high-resolution labels presents a significant challenge due to the difficulty of obtaining such data. In response, our weakly supervised learning method provides a compelling alternative by utilizing lower-resolution regional labels from expert-annotated ice charts. This approach achieves exceptional pixel-level classification performance by introducing regional loss representations during training to measure the disparity between predicted and ice chart-derived sea ice type distributions. Leveraging the AI4Arctic Sea Ice Challenge Dataset, our method outperforms the fully supervised U-Net benchmark, the top solution of the AutoIce challenge, in both map** resolution and class-wise accuracy, marking a significant advancement in automated operational sea ice map**.
△ Less
Submitted 16 May, 2024;
originally announced May 2024.
-
Utilizing synthetic training data for the supervised classification of rat ultrasonic vocalizations
Authors:
K. Jack Scott,
Lucinda J. Speers,
David K. Bilkey
Abstract:
Murine rodents generate ultrasonic vocalizations (USVs) with frequencies that extend to around 120kHz. These calls are important in social behaviour, and so their analysis can provide insights into the function of vocal communication, and its dysfunction. The manual identification of USVs, and subsequent classification into different subcategories is time consuming. Although machine learning appro…
▽ More
Murine rodents generate ultrasonic vocalizations (USVs) with frequencies that extend to around 120kHz. These calls are important in social behaviour, and so their analysis can provide insights into the function of vocal communication, and its dysfunction. The manual identification of USVs, and subsequent classification into different subcategories is time consuming. Although machine learning approaches for identification and classification can lead to enormous efficiency gains, the time and effort required to generate training data can be high, and the accuracy of current approaches can be problematic. Here we compare the detection and classification performance of a trained human against two convolutional neural networks (CNNs), DeepSqueak and VocalMat, on audio containing rat USVs. Furthermore, we test the effect of inserting synthetic USVs into the training data of the VocalMat CNN as a means of reducing the workload associated with generating a training set. Our results indicate that VocalMat outperformed the DeepSqueak CNN on measures of call identification, and classification. Additionally, we found that the augmentation of training data with synthetic images resulted in a further improvement in accuracy, such that it was sufficiently close to human performance to allow for the use of this software in laboratory conditions.
△ Less
Submitted 18 January, 2024; v1 submitted 2 March, 2023;
originally announced March 2023.
-
Domain Adaptive Decision Trees: Implications for Accuracy and Fairness
Authors:
Jose M. Alvarez,
Kristen M. Scott,
Salvatore Ruggieri,
Bettina Berendt
Abstract:
In uses of pre-trained machine learning models, it is a known issue that the target population in which the model is being deployed may not have been reflected in the source population with which the model was trained. This can result in a biased model when deployed, leading to a reduction in model performance. One risk is that, as the population changes, certain demographic groups will be under-s…
▽ More
In uses of pre-trained machine learning models, it is a known issue that the target population in which the model is being deployed may not have been reflected in the source population with which the model was trained. This can result in a biased model when deployed, leading to a reduction in model performance. One risk is that, as the population changes, certain demographic groups will be under-served or otherwise disadvantaged by the model, even as they become more represented in the target population. The field of domain adaptation proposes techniques for a situation where label data for the target population does not exist, but some information about the target distribution does exist. In this paper we contribute to the domain adaptation literature by introducing domain-adaptive decision trees (DADT). We focus on decision trees given their growing popularity due to their interpretability and performance relative to other more complex models. With DADT we aim to improve the accuracy of models trained in a source domain (or training data) that differs from the target domain (or test data). We propose an in-processing step that adjusts the information gain split criterion with outside information corresponding to the distribution of the target population. We demonstrate DADT on real data and find that it improves accuracy over a standard decision tree when testing in a shifted target population. We also study the change in fairness under demographic parity and equal opportunity. Results show an improvement in fairness with the use of DADT.
△ Less
Submitted 31 May, 2023; v1 submitted 27 February, 2023;
originally announced February 2023.
-
Fairness in Agreement With European Values: An Interdisciplinary Perspective on AI Regulation
Authors:
Alejandra Bringas Colmenarejo,
Luca Nannini,
Alisa Rieger,
Kristen M. Scott,
Xuan Zhao,
Gourab K. Patro,
Gjergji Kasneci,
Katharina Kinder-Kurlanda
Abstract:
With increasing digitalization, Artificial Intelligence (AI) is becoming ubiquitous. AI-based systems to identify, optimize, automate, and scale solutions to complex economic and societal problems are being proposed and implemented. This has motivated regulation efforts, including the Proposal of an EU AI Act. This interdisciplinary position paper considers various concerns surrounding fairness an…
▽ More
With increasing digitalization, Artificial Intelligence (AI) is becoming ubiquitous. AI-based systems to identify, optimize, automate, and scale solutions to complex economic and societal problems are being proposed and implemented. This has motivated regulation efforts, including the Proposal of an EU AI Act. This interdisciplinary position paper considers various concerns surrounding fairness and discrimination in AI, and discusses how AI regulations address them, focusing on (but not limited to) the Proposal. We first look at AI and fairness through the lenses of law, (AI) industry, sociotechnology, and (moral) philosophy, and present various perspectives. Then, we map these perspectives along three axes of interests: (i) Standardization vs. Localization, (ii) Utilitarianism vs. Egalitarianism, and (iii) Consequential vs. Deontological ethics which leads us to identify a pattern of common arguments and tensions between these axes. Positioning the discussion within the axes of interest and with a focus on reconciling the key tensions, we identify and propose the roles AI Regulation should take to make the endeavor of the AI Act a success in terms of AI fairness concerns.
△ Less
Submitted 8 June, 2022;
originally announced July 2022.
-
Measuring Shifts in Attitudes Towards COVID-19 Measures in Belgium Using Multilingual BERT
Authors:
Kristen Scott,
Pieter Delobelle,
Bettina Berendt
Abstract:
We classify seven months' worth of Belgian COVID-related Tweets using multilingual BERT and relate them to their governments' COVID measures. We classify Tweets by their stated opinion on Belgian government curfew measures (too strict, ok, too loose). We examine the change in topics discussed and views expressed over time and in reference to dates of related events such as implementation of new me…
▽ More
We classify seven months' worth of Belgian COVID-related Tweets using multilingual BERT and relate them to their governments' COVID measures. We classify Tweets by their stated opinion on Belgian government curfew measures (too strict, ok, too loose). We examine the change in topics discussed and views expressed over time and in reference to dates of related events such as implementation of new measures or COVID-19 related announcements in the media.
△ Less
Submitted 20 April, 2021;
originally announced April 2021.
-
Accountability of AI Under the Law: The Role of Explanation
Authors:
Finale Doshi-Velez,
Mason Kortz,
Ryan Budish,
Chris Bavitz,
Sam Gershman,
David O'Brien,
Kate Scott,
Stuart Schieber,
James Waldo,
David Weinberger,
Adrian Weller,
Alexandra Wood
Abstract:
The ubiquity of systems using artificial intelligence or "AI" has brought increasing attention to how those systems should be regulated. The choice of how to regulate AI systems will require care. AI systems have the potential to synthesize large amounts of data, allowing for greater levels of personalization and precision than ever before---applications range from clinical decision support to aut…
▽ More
The ubiquity of systems using artificial intelligence or "AI" has brought increasing attention to how those systems should be regulated. The choice of how to regulate AI systems will require care. AI systems have the potential to synthesize large amounts of data, allowing for greater levels of personalization and precision than ever before---applications range from clinical decision support to autonomous driving and predictive policing. That said, there exist legitimate concerns about the intentional and unintentional negative consequences of AI systems. There are many ways to hold AI systems accountable. In this work, we focus on one: explanation. Questions about a legal right to explanation from AI systems was recently debated in the EU General Data Protection Regulation, and thus thinking carefully about when and how explanation from AI systems might improve accountability is timely. In this work, we review contexts in which explanation is currently required under the law, and then list the technical considerations that must be considered if we desired AI systems that could provide kinds of explanations that are currently required of humans.
△ Less
Submitted 20 December, 2019; v1 submitted 3 November, 2017;
originally announced November 2017.
-
Correlations between user voting data, budget, and box office for films in the Internet Movie Database
Authors:
Max Wasserman,
Satyam Mukherjee,
Konner Scott,
Xiao Han T. Zeng,
Filippo Radicchi,
Luís A. N. Amaral
Abstract:
The Internet Movie Database (IMDb) is one of the most-visited websites in the world and the premier source for information on films. Like Wikipedia, much of IMDb's information is user contributed. IMDb also allows users to voice their opinion on the quality of films through voting. We investigate whether there is a connection between this user voting data and certain economic film characteristics.…
▽ More
The Internet Movie Database (IMDb) is one of the most-visited websites in the world and the premier source for information on films. Like Wikipedia, much of IMDb's information is user contributed. IMDb also allows users to voice their opinion on the quality of films through voting. We investigate whether there is a connection between this user voting data and certain economic film characteristics. To this end, we perform distribution and correlation analysis on a set of films chosen to mitigate effects of bias due to the language and country of origin of films. We show that production budget, box office gross, and total number of user votes for films are consistent with double-log normal distributions for certain time periods. Both total gross and user votes are consistent with a double-log normal distribution from the late 1980s onward, while for budget, it extends from 1935 to 1979. In addition, we find a strong correlation between number of user votes and the economic statistics, particularly budget. Remarkably, we find no evidence for a correlation between number of votes and average user rating. As previous studies have found a strong correlation between production budget and marketing expenses, our results suggest that total user votes is an indicator of a film's prominence or notability, which can be quantified by its promotional costs.
△ Less
Submitted 16 January, 2014; v1 submitted 13 December, 2013;
originally announced December 2013.