Showing 1–12 of 12 results for author: Afonja, T

  1. arXiv:2406.12387  [pdf, other]

    eess.AS cs.CL cs.SD

    Performant ASR Models for Medical Entities in Accented Speech

    Authors: Tejumade Afonja, Tobi Olatunji, Sewade Ogun, Naome A. Etori, Abraham Owodunni, Moshood Yekini

    Abstract: Recent strides in automatic speech recognition (ASR) have accelerated their application in the medical domain where their performance on accented medical named entities (NE) such as drug names, diagnoses, and lab results, is largely unknown. We rigorously evaluate multiple ASR models on a clinical English dataset of 93 African accents. Our analysis reveals that despite some models achieving low ov…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Accepted at Interspeech 2024

  2. arXiv:2406.11727  [pdf, ps, other]

    eess.AS cs.CL

    1000 African Voices: Advancing inclusive multi-speaker multi-accent speech synthesis

    Authors: Sewade Ogun, Abraham T. Owodunni, Tobi Olatunji, Eniola Alese, Babatunde Oladimeji, Tejumade Afonja, Kayode Olaleye, Naome A. Etori, Tosin Adewumi

    Abstract: Recent advances in speech synthesis have enabled many useful applications like audio directions in Google Maps, screen readers, and automated content generation on platforms like TikTok. However, these systems are mostly dominated by voices sourced from data-rich geographies with personas representative of their source data. Although 3000 of the world's languages are domiciled in Africa, African v…

    Submitted 27 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted at Interspeech 2024

  3. arXiv:2406.09496  [pdf, other]

    cs.CY cs.AI

    You are what you eat? Feeding foundation models a regionally diverse food dataset of World Wide Dishes

    Authors: Jabez Magomere, Shu Ishida, Tejumade Afonja, Aya Salama, Daniel Kochin, Foutse Yuehgoh, Imane Hamzaoui, Raesetje Sefala, Aisha Alaagib, Elizaveta Semenova, Lauren Crais, Siobhan Mackenzie Hall

    Abstract: Foundation models are increasingly ubiquitous in our daily lives, used in everyday tasks such as text-image searches, interactions with chatbots, and content generation. As use increases, so does concern over the disparities in performance and fairness of these models for different people in different parts of the world. To assess these growing regional disparities, we present World Wide Dishes, a…

    Submitted 13 June, 2024; originally announced June 2024.

  4. arXiv:2402.04912  [pdf, other]

    cs.CR cs.LG

    Towards Biologically Plausible and Private Gene Expression Data Generation

    Authors: Dingfan Chen, Marie Oestreich, Tejumade Afonja, Raouf Kerkouche, Matthias Becker, Mario Fritz

    Abstract: Generative models trained with Differential Privacy (DP) are becoming increasingly prominent in the creation of synthetic data for downstream applications. Existing literature, however, primarily focuses on basic benchmarking datasets and tends to report promising results only for elementary metrics and relatively simple data distributions. In this paper, we initiate a systematic analysis of how D…

    Submitted 7 February, 2024; originally announced February 2024.

    Journal ref: Proceedings on Privacy Enhancing Technologies (PoPETs 2024)

  5. arXiv:2310.00274  [pdf, other]

    cs.CL

    AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR

    Authors: Tobi Olatunji, Tejumade Afonja, Aditya Yadavalli, Chris Chinenye Emezue, Sahib Singh, Bonaventure F. P. Dossou, Joanne Osuchukwu, Salomey Osei, Atnafu Lambebo Tonja, Naome Etori, Clinton Mbataku

    Abstract: Africa has a very low doctor-to-patient ratio. At very busy clinics, doctors could see 30+ patients per day -- a heavy patient burden compared with developed countries -- but productivity tools such as clinical automatic speech recognition (ASR) are lacking for these overworked clinicians. However, clinical ASR is mature, even ubiquitous, in developed nations, and clinician-reported performance of…

    Submitted 30 September, 2023; originally announced October 2023.

    Comments: Accepted to TACL 2023. This is a pre-MIT Press publication version

  6. arXiv:2307.07997  [pdf, other]

    cs.LG cs.AI

    MargCTGAN: A "Marginally" Better CTGAN for the Low Sample Regime

    Authors: Tejumade Afonja, Dingfan Chen, Mario Fritz

    Abstract: The potential of realistic and useful synthetic data is significant. However, current evaluation methods for synthetic tabular data generation predominantly focus on downstream task usefulness, often neglecting the importance of statistical properties. This oversight becomes particularly prominent in low sample scenarios, accompanied by a swift deterioration of these statistical measures. In this…

    Submitted 16 July, 2023; originally announced July 2023.

    Comments: ICML 2023 Workshop on Deployable Generative AI

  7. arXiv:2306.00253  [pdf, other]

    cs.CL cs.CY

    AfriNames: Most ASR models "butcher" African Names

    Authors: Tobi Olatunji, Tejumade Afonja, Bonaventure F. P. Dossou, Atnafu Lambebo Tonja, Chris Chinenye Emezue, Amina Mardiyyah Rufai, Sahib Singh

    Abstract: Useful conversational agents must accurately capture named entities to minimize error for downstream tasks, for example, asking a voice assistant to play a track from a certain artist, initiating navigation to a specific location, or documenting a laboratory result for a patient. However, where named entities such as "Ukachukwu" (Igbo), "Lakicia" (Swahili), or "Ingabire" (Rwandan) are spoken…

    Submitted 2 June, 2023; v1 submitted 31 May, 2023; originally announced June 2023.

    Comments: Accepted at Interspeech 2023 (Main Conference)

  8. arXiv:2301.04007   

    cs.LG cs.CY

    Proceedings of the NeurIPS 2021 Workshop on Machine Learning for the Developing World: Global Challenges

    Authors: Paula Rodriguez Diaz, Tejumade Afonja, Konstantin Klemmer, Aya Salama, Niveditha Kalavakonda, Oluwafemi Azeez, Simone Fobi

    Abstract: These are the proceedings of the 5th workshop on Machine Learning for the Developing World (ML4D), held as part of the Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS) on December 14th, 2021.

    Submitted 10 January, 2023; originally announced January 2023.

  9. arXiv:2207.12816  [pdf, other]

    cs.CR cs.SD eess.AS

    Generative Extraction of Audio Classifiers for Speaker Identification

    Authors: Tejumade Afonja, Lucas Bourtoule, Varun Chandrasekaran, Sageev Oore, Nicolas Papernot

    Abstract: It is perhaps no longer surprising that machine learning models, especially deep neural networks, are particularly vulnerable to attacks. One such vulnerability that has been well studied is model extraction: a phenomenon in which the attacker attempts to steal a victim's model by training a surrogate model to mimic the decision boundaries of the victim model. Previous works have demonstrated the…

    Submitted 26 July, 2022; originally announced July 2022.

  10. arXiv:2112.06199  [pdf, other]

    cs.CL cs.SD eess.AS

    Learning Nigerian accent embeddings from speech: preliminary results based on SautiDB-Naija corpus

    Authors: Tejumade Afonja, Oladimeji Mudele, Iroro Orife, Kenechi Dukor, Lawrence Francis, Duru Goodness, Oluwafemi Azeez, Ademola Malomo, Clinton Mbataku

    Abstract: This paper describes foundational efforts with SautiDB-Naija, a novel corpus of non-native (L2) Nigerian English speech. We describe how the corpus was created and curated as well as preliminary experiments with accent classification and learning Nigerian accent embeddings. The initial version of the corpus includes over 900 recordings from L2 English speakers of Nigerian languages, such as Yoruba…

    Submitted 12 December, 2021; originally announced December 2021.

  11. arXiv:2101.04347   

    cs.LG cs.CY

    Proceedings of the NeurIPS 2020 Workshop on Machine Learning for the Developing World: Improving Resilience

    Authors: Tejumade Afonja, Konstantin Klemmer, Aya Salama, Paula Rodriguez Diaz, Niveditha Kalavakonda, Oluwafemi Azeez

    Abstract: These are the proceedings of the 4th workshop on Machine Learning for the Developing World (ML4D), held as part of the Thirty-fourth Conference on Neural Information Processing Systems (NeurIPS) on Saturday, December 12th 2020.

    Submitted 12 January, 2021; originally announced January 2021.

  12. arXiv:2001.00249   

    cs.CY

    Proceedings of NeurIPS 2019 Workshop on Machine Learning for the Developing World: Challenges and Risks of ML4D

    Authors: Maria De-Arteaga, Tejumade Afonja, Amanda Coston

    Abstract: These are the proceedings of the 3rd ML4D workshop, which was held in Vancouver, Canada, on December 13, 2019, as part of the Neural Information Processing Systems conference.

    Submitted 10 April, 2020; v1 submitted 1 January, 2020; originally announced January 2020.