Showing 1–39 of 39 results for author: Nandakumar, K

Searching in archive cs.
  1. arXiv:2406.09407  [pdf, other]

    cs.CV

    Towards Evaluating the Robustness of Visual State Space Models

    Authors: Hashmat Shadab Malik, Fahad Shamshad, Muzammal Naseer, Karthik Nandakumar, Fahad Shahbaz Khan, Salman Khan

    Abstract: Vision State Space Models (VSSMs), a novel architecture that combines the strengths of recurrent neural networks and latent variable models, have demonstrated remarkable performance in visual perception tasks by efficiently capturing long-range dependencies and modeling complex visual dynamics. However, their robustness under natural and adversarial perturbations remains a critical concern. In thi…

    Submitted 13 June, 2024; originally announced June 2024.

  2. arXiv:2406.09250  [pdf, other]

    cs.CV cs.AI cs.LG

    MirrorCheck: Efficient Adversarial Defense for Vision-Language Models

    Authors: Samar Fares, Klea Ziu, Toluwani Aremu, Nikita Durasov, Martin Takáč, Pascal Fua, Karthik Nandakumar, Ivan Laptev

    Abstract: Vision-Language Models (VLMs) are becoming increasingly vulnerable to adversarial attacks as various novel attack strategies are being proposed against these models. While existing defenses excel in unimodal contexts, they currently fall short in safeguarding VLMs against adversarial threats. To mitigate this vulnerability, we propose a novel, yet elegantly simple approach for detecting adversaria…

    Submitted 13 June, 2024; originally announced June 2024.

  3. arXiv:2406.00569  [pdf, other]

    cs.LG cs.AI

    Redefining Contributions: Shapley-Driven Federated Learning

    Authors: Nurbek Tastan, Samar Fares, Toluwani Aremu, Samuel Horvath, Karthik Nandakumar

    Abstract: Federated learning (FL) has emerged as a pivotal approach in machine learning, enabling multiple participants to collaboratively train a global model without sharing raw data. While FL finds applications in various domains such as healthcare and finance, it is challenging to ensure global model convergence when participants do not contribute equally and/or honestly. To overcome this challenge, pri…

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: Accepted by IJCAI 2024

  4. arXiv:2405.14881  [pdf, other]

    cs.CV

    DiffuseMix: Label-Preserving Data Augmentation with Diffusion Models

    Authors: Khawar Islam, Muhammad Zaigham Zaheer, Arif Mahmood, Karthik Nandakumar

    Abstract: Recently, a number of image-mixing-based augmentation techniques have been introduced to improve the generalization of deep neural networks. In these techniques, two or more randomly selected natural images are mixed together to generate an augmented image. Such methods may not only omit important portions of the input images but also introduce label ambiguities by mixing images across labels resu…

    Submitted 5 April, 2024; originally announced May 2024.

    Comments: Accepted at CVPR 2024

  5. arXiv:2404.13704  [pdf, other]

    eess.IV cs.CV cs.LG

    PEMMA: Parameter-Efficient Multi-Modal Adaptation for Medical Image Segmentation

    Authors: Nada Saadi, Numan Saeed, Mohammad Yaqub, Karthik Nandakumar

    Abstract: Imaging modalities such as Computed Tomography (CT) and Positron Emission Tomography (PET) are key in cancer detection, inspiring Deep Neural Networks (DNN) models that merge these scans for tumor segmentation. When both CT and PET scans are available, it is common to combine them as two channels of the input to the segmentation model. However, this method requires both scan types during training…

    Submitted 21 April, 2024; originally announced April 2024.

  6. arXiv:2404.09342  [pdf, other]

    cs.CV cs.SD eess.AS

    Face-voice Association in Multilingual Environments (FAME) Challenge 2024 Evaluation Plan

    Authors: Muhammad Saad Saeed, Shah Nawaz, Muhammad Salman Tahir, Rohan Kumar Das, Muhammad Zaigham Zaheer, Marta Moscati, Markus Schedl, Muhammad Haris Khan, Karthik Nandakumar, Muhammad Haroon Yousaf

    Abstract: The advancements of technology have led to the use of multimodal systems in various real-world applications. Among them, the audio-visual systems are one of the widely used multimodal systems. In the recent years, associating face and voice of a person has gained attention due to presence of unique correlation between them. The Face-voice Association in Multilingual Environments (FAME) Challenge 2…

    Submitted 16 April, 2024; v1 submitted 14 April, 2024; originally announced April 2024.

    Comments: ACM Multimedia Conference - Grand Challenge

  7. arXiv:2404.00847  [pdf, other]

    cs.CV

    Collaborative Learning of Anomalies with Privacy (CLAP) for Unsupervised Video Anomaly Detection: A New Baseline

    Authors: Anas Al-lahham, Muhammad Zaigham Zaheer, Nurbek Tastan, Karthik Nandakumar

    Abstract: Unsupervised (US) video anomaly detection (VAD) in surveillance applications is gaining more popularity recently due to its practical real-world applications. As surveillance videos are privacy sensitive and the availability of large-scale video data may enable better US-VAD systems, collaborative learning can be highly rewarding in this setting. However, due to the extremely challenging nature of…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: Accepted in IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR), 2024

  8. arXiv:2403.10603  [pdf, other]

    cs.CV cs.AI cs.LG

    SurvRNC: Learning Ordered Representations for Survival Prediction using Rank-N-Contrast

    Authors: Numan Saeed, Muhammad Ridzuan, Fadillah Adamsyah Maani, Hussain Alasmawi, Karthik Nandakumar, Mohammad Yaqub

    Abstract: Predicting the likelihood of survival is of paramount importance for individuals diagnosed with cancer as it provides invaluable information regarding prognosis at an early stage. This knowledge enables the formulation of effective treatment plans that lead to improved patient outcomes. In the past few years, deep learning models have provided a feasible solution for assessing medical images, elec…

    Submitted 15 March, 2024; originally announced March 2024.

  9. arXiv:2402.08070  [pdf, other]

    cs.CV

    Multi-Attribute Vision Transformers are Efficient and Robust Learners

    Authors: Hanan Gani, Nada Saadi, Noor Hussein, Karthik Nandakumar

    Abstract: Since their inception, Vision Transformers (ViTs) have emerged as a compelling alternative to Convolutional Neural Networks (CNNs) across a wide spectrum of tasks. ViTs exhibit notable characteristics, including global attention, resilience against occlusions, and adaptability to distribution shifts. One underexplored aspect of ViTs is their potential for multi-attribute learning, referring to the…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: Code: https://github.com/hananshafi/MTL-ViT. arXiv admin note: text overlap with arXiv:2207.08677 by other authors

  10. arXiv:2312.11230  [pdf, other]

    stat.ML cs.LG

    Dirichlet-based Uncertainty Quantification for Personalized Federated Learning with Improved Posterior Networks

    Authors: Nikita Kotelevskii, Samuel Horváth, Karthik Nandakumar, Martin Takáč, Maxim Panov

    Abstract: In modern federated learning, one of the main challenges is to account for inherent heterogeneity and the diverse nature of data distributions for different clients. This problem is often addressed by introducing personalization of the models towards the data distribution of the particular client. However, a personalized model might be unreliable when applied to the data that is not typical for th…

    Submitted 18 December, 2023; originally announced December 2023.

  11. arXiv:2311.04611  [pdf, other]

    cs.LG math.OC

    Byzantine-Tolerant Methods for Distributed Variational Inequalities

    Authors: Nazarii Tupitsa, Abdulla Jasem Almansoori, Yanlin Wu, Martin Takáč, Karthik Nandakumar, Samuel Horváth, Eduard Gorbunov

    Abstract: Robustness to Byzantine attacks is a necessity for various distributed training scenarios. When the training reduces to the process of solving a minimization problem, Byzantine robustness is relatively well-understood. However, other problem formulations, such as min-max problems or, more generally, variational inequalities, arise in many modern machine learning and, in particular, distributed lea…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: NeurIPS 2023; 69 pages, 12 figures

  12. arXiv:2310.17650  [pdf, other]

    cs.CV cs.LG

    A Coarse-to-Fine Pseudo-Labeling (C2FPL) Framework for Unsupervised Video Anomaly Detection

    Authors: Anas Al-lahham, Nurbek Tastan, Zaigham Zaheer, Karthik Nandakumar

    Abstract: Detection of anomalous events in videos is an important problem in applications such as surveillance. Video anomaly detection (VAD) is well-studied in the one-class classification (OCC) and weakly supervised (WS) settings. However, fully unsupervised (US) video anomaly detection methods, which learn a complete system without any annotation or human supervision, have not been explored in depth. Thi…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: Accepted in IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024

  13. arXiv:2309.16649  [pdf, other]

    cs.CV

    FLIP: Cross-domain Face Anti-spoofing with Language Guidance

    Authors: Koushik Srivatsan, Muzammal Naseer, Karthik Nandakumar

    Abstract: Face anti-spoofing (FAS) or presentation attack detection is an essential component of face recognition systems deployed in security-critical applications. Existing FAS methods have poor generalizability to unseen spoof types, camera sensors, and environmental conditions. Recently, vision transformer (ViT) models have been shown to be effective for the FAS task due to their ability to capture long…

    Submitted 28 September, 2023; originally announced September 2023.

    Comments: Accepted to ICCV-2023. Project Page: https://koushiksrivats.github.io/FLIP/

  14. arXiv:2308.10236  [pdf, other]

    cs.CV cs.CR cs.LG

    FedSIS: Federated Split Learning with Intermediate Representation Sampling for Privacy-preserving Generalized Face Presentation Attack Detection

    Authors: Naif Alkhunaizi, Koushik Srivatsan, Faris Almalik, Ibrahim Almakky, Karthik Nandakumar

    Abstract: Lack of generalization to unseen domains/attacks is the Achilles heel of most face presentation attack detection (FacePAD) algorithms. Existing attempts to enhance the generalizability of FacePAD solutions assume that data from multiple source domains are available with a single entity to enable centralized training. In practice, data from different source domains may be collected by diverse entit…

    Submitted 22 August, 2023; v1 submitted 20 August, 2023; originally announced August 2023.

    Comments: Accepted to the IEEE International Joint Conference on Biometrics (IJCB), 2023

  15. arXiv:2308.01966  [pdf, other]

    cs.MM cs.CL cs.LG cs.SD eess.AS

    DCTM: Dilated Convolutional Transformer Model for Multimodal Engagement Estimation in Conversation

    Authors: Vu Ngoc Tu, Van Thong Huynh, Hyung-Jeong Yang, M. Zaigham Zaheer, Shah Nawaz, Karthik Nandakumar, Soo-Hyung Kim

    Abstract: Conversational engagement estimation is posed as a regression problem, entailing the identification of the favorable attention and involvement of the participants in the conversation. This task arises as a crucial pursuit to gain insights into human's interaction dynamics and behavior patterns within a conversation. In this research, we introduce a dilated convolutional Transformer for modeling an…

    Submitted 31 July, 2023; originally announced August 2023.

    Comments: Accepted in ACMM Grand Challenge

  16. arXiv:2306.14638  [pdf, other]

    cs.CV

    FeSViBS: Federated Split Learning of Vision Transformer with Block Sampling

    Authors: Faris Almalik, Naif Alkhunaizi, Ibrahim Almakky, Karthik Nandakumar

    Abstract: Data scarcity is a significant obstacle hindering the learning of powerful machine learning models in critical healthcare applications. Data-sharing mechanisms among multiple entities (e.g., hospitals) can accelerate model training and yield more accurate predictions. Recently, approaches such as Federated Learning (FL) and Split Learning (SL) have facilitated collaboration without the need to exc…

    Submitted 26 June, 2023; originally announced June 2023.

  17. arXiv:2306.13091  [pdf, other]

    cs.CV cs.CR cs.LG

    Evading Forensic Classifiers with Attribute-Conditioned Adversarial Faces

    Authors: Fahad Shamshad, Koushik Srivatsan, Karthik Nandakumar

    Abstract: The ability of generative models to produce highly realistic synthetic face images has raised security and ethical concerns. As a first line of defense against such fake faces, deep learning based forensic classifiers have been developed. While these forensic models can detect whether a face image is synthetic or real with high accuracy, they are also vulnerable to adversarial attacks. Although su…

    Submitted 22 June, 2023; originally announced June 2023.

    Comments: Accepted in CVPR 2023. Project page: https://koushiksrivats.github.io/face_attribute_attack/

  18. arXiv:2306.10008  [pdf, other]

    cs.CV cs.CR cs.LG

    CLIP2Protect: Protecting Facial Privacy using Text-Guided Makeup via Adversarial Latent Search

    Authors: Fahad Shamshad, Muzammal Naseer, Karthik Nandakumar

    Abstract: The success of deep learning based face recognition systems has given rise to serious privacy concerns due to their ability to enable unauthorized tracking of users in the digital world. Existing methods for enhancing privacy fail to generate naturalistic images that can protect facial privacy without compromising user experience. We propose a novel two-step approach for facial privacy protection…

    Submitted 20 June, 2023; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: Accepted in CVPR 2023. Project page: https://fahadshamshad.github.io/Clip2Protect/

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 20595-20605

  19. arXiv:2303.06129  [pdf, other]

    cs.CV

    Single-branch Network for Multimodal Training

    Authors: Muhammad Saad Saeed, Shah Nawaz, Muhammad Haris Khan, Muhammad Zaigham Zaheer, Karthik Nandakumar, Muhammad Haroon Yousaf, Arif Mahmood

    Abstract: With the rapid growth of social media platforms, users are sharing billions of multimedia posts containing audio, images, and text. Researchers have focused on building autonomous systems capable of processing such multimedia data to solve challenging multimodal tasks including cross-modal retrieval, matching, and verification. Existing works use separate networks to extract embeddings of each mod…

    Submitted 10 March, 2023; originally announced March 2023.

    Comments: Accepted at ICASSP 2023

  20. arXiv:2302.03003  [pdf, other]

    eess.IV cs.CV stat.ML

    OTRE: Where Optimal Transport Guided Unpaired Image-to-Image Translation Meets Regularization by Enhancing

    Authors: Wenhui Zhu, Peijie Qiu, Oana M. Dumitrascu, Jacob M. Sobczak, Mohammad Farazi, Zhangsihao Yang, Keshav Nandakumar, Yalin Wang

    Abstract: Non-mydriatic retinal color fundus photography (CFP) is widely available due to the advantage of not requiring pupillary dilation, however, is prone to poor quality due to operators, systemic imperfections, or patient-related causes. Optimal retinal image quality is mandated for accurate medical diagnoses and automated analyses. Herein, we leveraged the Optimal Transport (OT) theory to propose an…

    Submitted 8 April, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

    Comments: Accepted as a conference paper to The 28th biennial international conference on Information Processing in Medical Imaging (IPMI 2023)

  21. arXiv:2302.02991  [pdf, other]

    eess.IV cs.CV stat.ML

    Optimal Transport Guided Unsupervised Learning for Enhancing low-quality Retinal Images

    Authors: Wenhui Zhu, Peijie Qiu, Mohammad Farazi, Keshav Nandakumar, Oana M. Dumitrascu, Yalin Wang

    Abstract: Real-world non-mydriatic retinal fundus photography is prone to artifacts, imperfections and low-quality when certain ocular or systemic co-morbidities exist. Artifacts may result in inaccuracy or ambiguity in clinical diagnoses. In this paper, we proposed a simple but effective end-to-end framework for enhancing poor-quality retinal fundus images. Leveraging the optimal transport theory, we propo…

    Submitted 6 February, 2023; originally announced February 2023.

    Comments: Accepted as a conference paper to 20th IEEE International Symposium on Biomedical Imaging (ISBI 2023)

  22. arXiv:2211.13465  [pdf]

    cs.CV cs.AI

    On the Importance of Image Encoding in Automated Chest X-Ray Report Generation

    Authors: Otabek Nazarov, Mohammad Yaqub, Karthik Nandakumar

    Abstract: Chest X-ray is one of the most popular medical imaging modalities due to its accessibility and effectiveness. However, there is a chronic shortage of well-trained radiologists who can interpret these images and diagnose the patient's condition. Therefore, automated radiology report generation can be a very helpful tool in clinical practice. A typical report generation workflow consists of two main…

    Submitted 24 November, 2022; originally announced November 2022.

    Journal ref: The British Machine Vision Conference, 2022

  23. arXiv:2211.09536  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Towards Building Text-To-Speech Systems for the Next Billion Users

    Authors: Gokul Karthik Kumar, Praveen S V, Pratyush Kumar, Mitesh M. Khapra, Karthik Nandakumar

    Abstract: Deep learning based text-to-speech (TTS) systems have been evolving rapidly with advances in model architectures, training methodologies, and generalization across speakers and languages. However, these advances have not been thoroughly investigated for Indian language speech synthesis. Such investigation is computationally expensive given the number and diversity of Indian languages, relatively l…

    Submitted 17 February, 2023; v1 submitted 17 November, 2022; originally announced November 2022.

    Comments: Accepted at ICASSP 2023. Gokul and Praveen contributed equally

  24. arXiv:2210.05916  [pdf, other]

    cs.CL cs.CV cs.LG cs.MM

    Hate-CLIPper: Multimodal Hateful Meme Classification based on Cross-modal Interaction of CLIP Features

    Authors: Gokul Karthik Kumar, Karthik Nandakumar

    Abstract: Hateful memes are a growing menace on social media. While the image and its corresponding text in a meme are related, they do not necessarily convey the same meaning when viewed individually. Hence, detecting hateful memes requires careful consideration of both visual and textual information. Multimodal pre-training can be beneficial for this task because it effectively captures the relationship b…

    Submitted 17 October, 2022; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: Accepted at EMNLP 2022 Workshop on NLP for Positive Impact

  25. arXiv:2210.00825  [pdf, other]

    cs.LG q-bio.GN

    Self-omics: A Self-supervised Learning Framework for Multi-omics Cancer Data

    Authors: Sayed Hashim, Karthik Nandakumar, Mohammad Yaqub

    Abstract: We have gained access to vast amounts of multi-omics data thanks to Next Generation Sequencing. However, it is challenging to analyse this data due to its high dimensionality and much of it not being annotated. Lack of annotated data is a significant problem in machine learning, and Self-Supervised Learning (SSL) methods are typically used to deal with limited labelled data. However, there is a la…

    Submitted 3 October, 2022; originally announced October 2022.

    Comments: Preprint of an article published in Pacific Symposium on Biocomputing © 2022 World Scientific Publishing Co., Singapore, http://psb.stanford.edu/

  26. arXiv:2209.02425  [pdf, other]

    cs.CV

    Learning an Ensemble of Deep Fingerprint Representations

    Authors: Akash Godbole, Karthik Nandakumar, Anil K. Jain

    Abstract: Deep neural networks (DNNs) have shown incredible promise in learning fixed-length representations from fingerprints. Since the representation learning is often focused on capturing specific prior knowledge (e.g., minutiae), there is no universal representation that comprehensively encapsulates all the discriminatory information available in a fingerprint. While learning an ensemble of representat…

    Submitted 2 September, 2022; originally announced September 2022.

  27. arXiv:2208.02851  [pdf, other]

    cs.CV

    Self-Ensembling Vision Transformer (SEViT) for Robust Medical Image Classification

    Authors: Faris Almalik, Mohammad Yaqub, Karthik Nandakumar

    Abstract: Vision Transformers (ViT) are competing to replace Convolutional Neural Networks (CNN) for various computer vision tasks in medical imaging such as classification and segmentation. While the vulnerability of CNNs to adversarial attacks is a well-known problem, recent works have shown that ViTs are also susceptible to such attacks and suffer significant performance degradation under attack. The vul…

    Submitted 4 August, 2022; originally announced August 2022.

  28. arXiv:2207.10804  [pdf, other]

    cs.CR cs.LG math.OC

    Suppressing Poisoning Attacks on Federated Learning for Medical Imaging

    Authors: Naif Alkhunaizi, Dmitry Kamzolov, Martin Takáč, Karthik Nandakumar

    Abstract: Collaboration among multiple data-owning entities (e.g., hospitals) can accelerate the training process and yield better machine learning models due to the availability and diversity of data. However, privacy concerns make it challenging to exchange data while preserving confidentiality. Federated Learning (FL) is a promising solution that enables collaborative training through exchange of model p…

    Submitted 14 July, 2022; originally announced July 2022.

  29. arXiv:2205.09318  [pdf, other]

    cs.CV

    On Demographic Bias in Fingerprint Recognition

    Authors: Akash Godbole, Steven A. Grosz, Karthik Nandakumar, Anil K. Jain

    Abstract: Fingerprint recognition systems have been deployed globally in numerous applications including personal devices, forensics, law enforcement, banking, and national identity systems. For these systems to be socially acceptable and trustworthy, it is critical that they perform equally well across different demographic groups. In this work, we propose a formal statistical framework to test for the exi…

    Submitted 19 May, 2022; originally announced May 2022.

  30. arXiv:2205.00142  [pdf, other]

    cs.LG cs.CL cs.CV

    Multimodal Representation Learning With Text and Images

    Authors: Aishwarya Jayagopal, Ankireddy Monica Aiswarya, Ankita Garg, Srinivasan Kolumam Nandakumar

    Abstract: In recent years, multimodal AI has seen an upward trend as researchers are integrating data of different types such as text, images, speech into modelling to get the best results. This project leverages multimodal AI and matrix factorization techniques for representation learning, on text and image data simultaneously, thereby employing the widely used techniques of Natural Language Processing (NL…

    Submitted 29 April, 2022; originally announced May 2022.

  31. arXiv:2204.05814  [pdf, other]

    cs.CL cs.AI cs.LG

    MuCoT: Multilingual Contrastive Training for Question-Answering in Low-resource Languages

    Authors: Gokul Karthik Kumar, Abhishek Singh Gehlot, Sahal Shaji Mullappilly, Karthik Nandakumar

    Abstract: Accuracy of English-language Question Answering (QA) systems has improved significantly in recent years with the advent of Transformer-based models (e.g., BERT). These models are pre-trained in a self-supervised fashion with a large English text corpus and further fine-tuned with a massive English QA dataset (e.g., SQuAD). However, QA datasets on such a scale are not available for most of the othe…

    Submitted 12 April, 2022; originally announced April 2022.

    Comments: Accepted for oral presentation at ACL 2022 Workshop on Speech and Language Technologies for Dravidian Languages

  32. arXiv:2202.01672  [pdf, other]

    cs.LG q-bio.QM

    SubOmiEmbed: Self-supervised Representation Learning of Multi-omics Data for Cancer Type Classification

    Authors: Sayed Hashim, Muhammad Ali, Karthik Nandakumar, Mohammad Yaqub

    Abstract: For personalized medicines, very crucial intrinsic information is present in high dimensional omics data which is difficult to capture due to the large number of molecular features and small number of available samples. Different types of omics data show various aspects of samples. Integration and analysis of multi-omics data give us a broad view of tumours, which can improve clinical decision mak…

    Submitted 3 February, 2022; originally announced February 2022.

  33. arXiv:2110.03027  [pdf, other]

    cs.CV

    Dynamically Decoding Source Domain Knowledge for Domain Generalization

    Authors: Cuicui Kang, Karthik Nandakumar

    Abstract: Optimizing the performance of classifiers on samples from unseen domains remains a challenging problem. While most existing studies on domain generalization focus on learning domain-invariant feature representations, multi-expert frameworks have been proposed as a possible solution and have demonstrated promising performance. However, current multi-expert learning frameworks fail to fully exploit…

    Submitted 5 December, 2021; v1 submitted 6 October, 2021; originally announced October 2021.

  34. arXiv:2108.10046  [pdf, other]

    cs.CV cs.AI

    Discovering Spatial Relationships by Transformers for Domain Generalization

    Authors: Cuicui Kang, Karthik Nandakumar

    Abstract: Due to the rapid increase in the diversity of image data, the problem of domain generalization has received increased attention recently. While domain generalization is a challenging problem, it has achieved great development thanks to the fast development of AI techniques in computer vision. Most of these advanced algorithms are proposed with deep architectures based on convolution neural nets (C…

    Submitted 13 October, 2021; v1 submitted 23 August, 2021; originally announced August 2021.

  35. arXiv:2103.03411  [pdf, other]

    cs.CR cs.AI cs.LG

    Efficient Encrypted Inference on Ensembles of Decision Trees

    Authors: Kanthi Sarpatwar, Karthik Nandakumar, Nalini Ratha, James Rayfield, Karthikeyan Shanmugam, Sharath Pankanti, Roman Vaculin

    Abstract: Data privacy concerns often prevent the use of cloud-based machine learning services for sensitive personal data. While homomorphic encryption (HE) offers a potential solution by enabling computations on encrypted data, the challenge is to obtain accurate machine learning models that work within the multiplicative depth constraints of a leveled HE scheme. Existing approaches for encrypted inferenc…

    Submitted 4 March, 2021; originally announced March 2021.

    Comments: 9 pages, 6 figures

  36. arXiv:2102.00319  [pdf, other]

    cs.CR cs.LG

    Efficient CNN Building Blocks for Encrypted Data

    Authors: Nayna Jain, Karthik Nandakumar, Nalini Ratha, Sharath Pankanti, Uttam Kumar

    Abstract: Machine learning on encrypted data can address the concerns related to privacy and legality of sharing sensitive data with untrustworthy service providers. Fully Homomorphic Encryption (FHE) is a promising technique to enable machine learning and inferencing while providing strict guarantees against information leakage. Since deep convolutional neural networks (CNNs) have become the machine learni…

    Submitted 30 January, 2021; originally announced February 2021.

    Comments: The Second AAAI Workshop on Privacy-Preserving Artificial Intelligence (PPAI-21)

  37. arXiv:2007.09370  [pdf, other]

    cs.CR cs.DC cs.LG stat.ML

    How to Democratise and Protect AI: Fair and Differentially Private Decentralised Deep Learning

    Authors: Lingjuan Lyu, Yitong Li, Karthik Nandakumar, Jiangshan Yu, Xingjun Ma

    Abstract: This paper firstly considers the research problem of fairness in collaborative deep learning, while ensuring privacy. A novel reputation system is proposed through digital tokens and local credibility to ensure fairness, in combination with differential privacy to guarantee privacy. In particular, we build a fair and differentially private decentralised deep learning framework called FDPDDL, which…

    Submitted 18 July, 2020; originally announced July 2020.

    Comments: Accepted for publication in TDSC

  38. arXiv:1906.01167  [pdf, other]

    cs.CR cs.AI cs.LG stat.ML

    Towards Fair and Privacy-Preserving Federated Deep Models

    Authors: Lingjuan Lyu, Jiangshan Yu, Karthik Nandakumar, Yitong Li, Xingjun Ma, Jiong Jin, Han Yu, Kee Siong Ng

    Abstract: The current standalone deep learning framework tends to result in overfitting and low utility. This problem can be addressed by either a centralized framework that deploys a central server to train a global model on the joint data from all parties, or a distributed framework that leverages a parameter server to aggregate local model updates. Server-based solutions are prone to the problem of a sin…

    Submitted 19 May, 2020; v1 submitted 3 June, 2019; originally announced June 2019.

    Comments: Accepted for publication in TPDS

  39. arXiv:1904.10180  [pdf, other]

    cs.CV

    High-frequency crowd insights for public safety and congestion control

    Authors: Karthik Nandakumar, Sebastien Blandin, Laura Wynter

    Abstract: We present results from several projects aimed at enabling the real-time understanding of crowds and their behaviour in the built environment. We make use of CCTV video cameras that are ubiquitous throughout the developed and developing world and as such are able to play the role of a reliable sensing mechanism. We outline the novel methods developed for our crowd insights engine, and illustrate e…

    Submitted 23 April, 2019; originally announced April 2019.