Showing 1–10 of 10 results for author: Mehnaz, S

  1. arXiv:2403.10557

    cs.LG cs.AI cs.CL

    Second-Order Information Matters: Revisiting Machine Unlearning for Large Language Models

    Authors: Kang Gu, Md Rafi Ur Rashid, Najrin Sultana, Shagufta Mehnaz

    Abstract: With the rapid development of Large Language Models (LLMs), we have witnessed intense competition among the major LLM products like ChatGPT, LLaMa, and Gemini. However, various issues (e.g. privacy leakage and copyright violation) of the training corpus still remain underexplored. For example, the Times sued OpenAI and Microsoft for infringing on its copyrights by using millions of its articles fo…

    Submitted 13 March, 2024; originally announced March 2024.

  2. arXiv:2311.16139

    cs.CR cs.LG

    GNNBleed: Inference Attacks to Unveil Private Edges in Graphs with Realistic Access to GNN Models

    Authors: Zeyu Song, Ehsanul Kabir, Shagufta Mehnaz

    Abstract: Graph Neural Networks (GNNs) have increasingly become an indispensable tool in learning from graph-structured data, catering to various applications including social network analysis, recommendation systems, etc. At the heart of these networks are the edges which are crucial in guiding GNN models' predictions. In many scenarios, these edges represent sensitive information, such as personal associa…

    Submitted 3 November, 2023; originally announced November 2023.

    Comments: Submitted to USENIX Security '24

  3. arXiv:2310.16152

    cs.CR cs.LG

    FLTrojan: Privacy Leakage Attacks against Federated Language Models Through Selective Weight Tampering

    Authors: Md Rafi Ur Rashid, Vishnu Asutosh Dasu, Kang Gu, Najrin Sultana, Shagufta Mehnaz

    Abstract: Federated learning (FL) has become a key component in various language modeling applications such as machine translation, next-word prediction, and medical record analysis. These applications are trained on datasets from many FL participants that often include privacy-sensitive data, such as healthcare records, phone/credit card numbers, login credentials, etc. Although FL enables computation with…

    Submitted 25 May, 2024; v1 submitted 24 October, 2023; originally announced October 2023.

    Comments: 20 pages (including bibliography and Appendix), Submitted to ACM CCS '24

  4. arXiv:2310.08808   

    cs.CR

    Attacks Meet Interpretability (AmI) Evaluation and Findings

    Authors: Qian Ma, Ziping Ye, Shagufta Mehnaz

    Abstract: To investigate the effectiveness of the model explanation in detecting adversarial examples, we reproduce the results of two papers, "Attacks Meet Interpretability: Attribute-steered Detection of Adversarial Samples" and "Is AmI (Attacks Meet Interpretability) Robust to Adversarial Examples". We then conduct experiments and case studies to identify the limitations of both works. We find that Attacks…

    Submitted 22 October, 2023; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: Need to withdraw it. The current work needs to be changed to a large extent, which would take a longer time

  5. arXiv:2308.15402

    cs.HC

    Bornil: An open-source sign language data crowdsourcing platform for AI enabled dialect-agnostic communication

    Authors: Shahriar Elahi Dhruvo, Mohammad Akhlaqur Rahman, Manash Kumar Mandal, Md. Istiak Hossain Shihab, A. A. Noman Ansary, Kaneez Fatema Shithi, Sanjida Khanom, Rabeya Akter, Safaeid Hossain Arib, M. N. Ansary, Sazia Mehnaz, Rezwana Sultana, Sejuti Rahman, Sayma Sultana Chowdhury, Sabbir Ahmed Chowdhury, Farig Sadeque, Asif Sushmit

    Abstract: The absence of annotated sign language datasets has hindered the development of sign language recognition and translation technologies. In this paper, we introduce Bornil; a crowdsource-friendly, multilingual sign language data collection, annotation, and validation platform. Bornil allows users to record sign language gestures and lets annotators perform sentence and gloss-level annotation. It al…

    Submitted 29 August, 2023; originally announced August 2023.

    Comments: 6 pages, 7 figures

  6. arXiv:2308.05832

    cs.CR cs.LG

    FLShield: A Validation Based Federated Learning Framework to Defend Against Poisoning Attacks

    Authors: Ehsanul Kabir, Zeyu Song, Md Rafi Ur Rashid, Shagufta Mehnaz

    Abstract: Federated learning (FL) is revolutionizing how we learn from data. With its growing popularity, it is now being used in many safety-critical domains such as autonomous vehicles and healthcare. Since thousands of participants can contribute in this collaborative setting, it is, however, challenging to ensure security and reliability of such systems. This highlights the need to design FL systems tha…

    Submitted 10 August, 2023; originally announced August 2023.

  7. arXiv:2306.01743

    cs.CL

    Unicode Normalization and Grapheme Parsing of Indic Languages

    Authors: Nazmuddoha Ansary, Quazi Adibur Rahman Adib, Tahsin Reasat, Asif Shahriyar Sushmit, Ahmed Imtiaz Humayun, Sazia Mehnaz, Kanij Fatema, Mohammad Mamun Or Rashid, Farig Sadeque

    Abstract: Writing systems of Indic languages have orthographic syllables, also known as complex graphemes, as unique horizontal units. A prominent feature of these languages is these complex grapheme units that comprise consonants/consonant conjuncts, vowel diacritics, and consonant diacritics, which together make a unique language. Unicode-based writing schemes of these languages often disregard this feat…

    Submitted 27 May, 2024; v1 submitted 11 May, 2023; originally announced June 2023.

    Comments: Published at LREC-COLING 2024

  8. arXiv:2206.14053

    cs.CL cs.SD eess.AS

    Bengali Common Voice Speech Dataset for Automatic Speech Recognition

    Authors: Samiul Alam, Asif Sushmit, Zaowad Abdullah, Shahrin Nakkhatra, MD. Nazmuddoha Ansary, Syed Mobassir Hossen, Sazia Morshed Mehnaz, Tahsin Reasat, Ahmed Imtiaz Humayun

    Abstract: Bengali is one of the most spoken languages in the world with over 300 million speakers globally. Despite its popularity, research into the development of Bengali speech recognition systems is hindered due to the lack of diverse open-source datasets. As a way forward, we have crowdsourced the Bengali Common Voice Speech Dataset, which is a sentence-level automatic speech recognition corpus. Collec…

    Submitted 29 June, 2022; v1 submitted 28 June, 2022; originally announced June 2022.

  9. arXiv:2201.09370

    cs.CR cs.LG

    Are Your Sensitive Attributes Private? Novel Model Inversion Attribute Inference Attacks on Classification Models

    Authors: Shagufta Mehnaz, Sayanton V. Dibbo, Ehsanul Kabir, Ninghui Li, Elisa Bertino

    Abstract: Increasing use of machine learning (ML) technologies in privacy-sensitive domains such as medical diagnoses, lifestyle predictions, and business decisions highlights the need to better understand if these ML technologies are introducing leakage of sensitive and proprietary training data. In this paper, we focus on model inversion attacks where the adversary knows non-sensitive attributes about rec…

    Submitted 23 January, 2022; originally announced January 2022.

    Comments: Conditionally accepted to USENIX Security 2022. This is not the camera-ready version. arXiv admin note: substantial text overlap with arXiv:2012.03404

  10. arXiv:2012.03404

    cs.CR cs.LG

    Black-box Model Inversion Attribute Inference Attacks on Classification Models

    Authors: Shagufta Mehnaz, Ninghui Li, Elisa Bertino

    Abstract: Increasing use of ML technologies in privacy-sensitive domains such as medical diagnoses, lifestyle predictions, and business decisions highlights the need to better understand if these ML technologies are introducing leakages of sensitive and proprietary training data. In this paper, we focus on one kind of model inversion attacks, where the adversary knows non-sensitive attributes about instance…

    Submitted 6 December, 2020; originally announced December 2020.