Showing 1–28 of 28 results for author: Humayun, A

  1. arXiv:2406.09657  [pdf, other]

    cs.LG stat.ML

    ScaLES: Scalable Latent Exploration Score for Pre-Trained Generative Networks

    Authors: Omer Ronen, Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk, Bin Yu

    Abstract: We develop Scalable Latent Exploration Score (ScaLES) to mitigate over-exploration in Latent Space Optimization (LSO), a popular method for solving black-box discrete optimization problems. LSO utilizes continuous optimization within the latent space of a Variational Autoencoder (VAE) and is known to be susceptible to over-exploration, which manifests in unrealistic solutions that reduce its pract…

    Submitted 13 June, 2024; originally announced June 2024.

  2. arXiv:2402.15555  [pdf, other]

    cs.LG cs.AI cs.CV

    Deep Networks Always Grok and Here is Why

    Authors: Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk

    Abstract: Grokking, or delayed generalization, is a phenomenon where generalization in a deep neural network (DNN) occurs long after achieving near zero training error. Previous studies have reported the occurrence of grokking in specific controlled settings, such as DNNs initialized with large-norm parameters or transformers trained on algorithmic datasets. We demonstrate that grokking is actually much mor…

    Submitted 6 June, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

    Comments: ICML 2024. Website: https://bit.ly/grok-adversarial. Pages 24, Figures 36

  3. arXiv:2312.03533  [pdf, other]

    cs.CV

    Low-shot Object Learning with Mutual Exclusivity Bias

    Authors: Anh Thai, Ahmad Humayun, Stefan Stojanov, Zixuan Huang, Bikram Boote, James M. Rehg

    Abstract: This paper introduces Low-shot Object Learning with Mutual Exclusivity Bias (LSME), the first computational framing of mutual exclusivity bias, a phenomenon commonly observed in infants during word learning. We provide a novel dataset, comprehensive baselines, and a state-of-the-art method to enable the ML community to tackle this challenging learning task. The goal of LSME is to analyze an RGB im…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: Accepted at NeurIPS 2023, Datasets and Benchmarks Track. Project website https://ngailapdi.github.io/projects/lsme/

  4. arXiv:2310.12977  [pdf, other]

    cs.LG cs.AI cs.CV

    Training Dynamics of Deep Network Linear Regions

    Authors: Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk

    Abstract: The study of Deep Network (DN) training dynamics has largely focused on the evolution of the loss function, evaluated on or around train and test set data points. In fact, many DN phenomena were first introduced in the literature in that respect, e.g., double descent, grokking. In this study, we look at the training dynamics of the input space partition or linear regions formed by continuous piecew…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: 14 pages, 14 figures

  5. arXiv:2308.06270  [pdf]

    cs.CY cs.HC

    Joy Learning: Smartphone Application For Children With Parkinson Disease

    Authors: Mujahid Rafiq, Ibrar Hussain, Muhammad Arif, Kinza Sardar, Ahsan Humayun

    Abstract: Parkinson's is a neurological disorder that affects not only the human body but also a person's social and personal life. Children with Parkinson's disease in particular face numerous difficulties in different areas of life, mostly in social interaction, communication, connectedness, and other skills such as thinking, reasoning, learning, and remembering. This study gives the solution to learning s…

    Submitted 27 July, 2023; originally announced August 2023.

    Report number: Vol. 19 No. 12 pp. 147-150

    Journal ref: IJCSNS International Journal of Computer Science and Network Security 2019/12

  6. arXiv:2307.01850  [pdf, other]

    cs.LG cs.AI cs.CV

    Self-Consuming Generative Models Go MAD

    Authors: Sina Alemohammad, Josue Casco-Rodriguez, Lorenzo Luzi, Ahmed Imtiaz Humayun, Hossein Babaei, Daniel LeJeune, Ali Siahkoohi, Richard G. Baraniuk

    Abstract: Seismic advances in generative AI algorithms for imagery, text, and other data types have led to the temptation to use synthetic data to train next-generation models. Repeating this process creates an autophagous (self-consuming) loop whose properties are poorly understood. We conduct a thorough analytical and empirical analysis using state-of-the-art generative image models of three families of au…

    Submitted 4 July, 2023; originally announced July 2023.

    Comments: 31 pages, 31 figures, pre-print

  7. arXiv:2306.01743  [pdf]

    cs.CL

    Unicode Normalization and Grapheme Parsing of Indic Languages

    Authors: Nazmuddoha Ansary, Quazi Adibur Rahman Adib, Tahsin Reasat, Asif Shahriyar Sushmit, Ahmed Imtiaz Humayun, Sazia Mehnaz, Kanij Fatema, Mohammad Mamun Or Rashid, Farig Sadeque

    Abstract: Writing systems of Indic languages have orthographic syllables, also known as complex graphemes, as unique horizontal units. A prominent feature of these languages is these complex grapheme units that comprise consonants/consonant conjuncts, vowel diacritics, and consonant diacritics, which together make a unique language. Unicode-based writing schemes of these languages often disregard this feat…

    Submitted 27 May, 2024; v1 submitted 11 May, 2023; originally announced June 2023.

    Comments: Published at LREC-COLING 2024

  8. arXiv:2305.09688  [pdf]

    eess.AS cs.CL cs.LG

    OOD-Speech: A Large Bengali Speech Recognition Dataset for Out-of-Distribution Benchmarking

    Authors: Fazle Rabbi Rakib, Souhardya Saha Dip, Samiul Alam, Nazia Tasnim, Md. Istiak Hossain Shihab, Md. Nazmuddoha Ansary, Syed Mobassir Hossen, Marsia Haque Meghla, Mamunur Mamun, Farig Sadeque, Sayma Sultana Chowdhury, Tahsin Reasat, Asif Sushmit, Ahmed Imtiaz Humayun

    Abstract: We present OOD-Speech, the first out-of-distribution (OOD) benchmarking dataset for Bengali automatic speech recognition (ASR). Being one of the most spoken languages globally, Bengali portrays large diversity in dialects and prosodic features, which requires ASR frameworks to be robust to distribution shifts. For example, Islamic religious sermons in Bengali are delivered with a tonality that…

    Submitted 15 May, 2023; originally announced May 2023.

  9. arXiv:2303.05325  [pdf, other]

    cs.CV

    BaDLAD: A Large Multi-Domain Bengali Document Layout Analysis Dataset

    Authors: Md. Istiak Hossain Shihab, Md. Rakibul Hasan, Mahfuzur Rahman Emon, Syed Mobassir Hossen, Md. Nazmuddoha Ansary, Intesur Ahmed, Fazle Rabbi Rakib, Shahriar Elahi Dhruvo, Souhardya Saha Dip, Akib Hasan Pavel, Marsia Haque Meghla, Md. Rezwanul Haque, Sayma Sultana Chowdhury, Farig Sadeque, Tahsin Reasat, Ahmed Imtiaz Humayun, Asif Shahriyar Sushmit

    Abstract: While strides have been made in deep learning based Bengali Optical Character Recognition (OCR) in the past decade, the absence of large Document Layout Analysis (DLA) datasets has hindered the application of OCR in document transcription, e.g., transcribing historical documents and newspapers. Moreover, rule-based DLA systems that are currently being employed in practice are not robust to domain…

    Submitted 5 May, 2023; v1 submitted 9 March, 2023; originally announced March 2023.

  10. arXiv:2302.12828  [pdf, other]

    cs.CV cs.LG

    SplineCam: Exact Visualization and Characterization of Deep Network Geometry and Decision Boundaries

    Authors: Ahmed Imtiaz Humayun, Randall Balestriero, Guha Balakrishnan, Richard Baraniuk

    Abstract: Current Deep Network (DN) visualization and interpretability methods rely heavily on data space visualizations such as scoring which dimensions of the data are responsible for their associated prediction or generating new data features or samples that best match a given DN unit or representation. In this paper, we go one step further by developing the first provably exact method for computing the…

    Submitted 6 June, 2024; v1 submitted 24 February, 2023; originally announced February 2023.

    Comments: 11 pages, 20 figures

  11. arXiv:2206.14053  [pdf]

    cs.CL cs.SD eess.AS

    Bengali Common Voice Speech Dataset for Automatic Speech Recognition

    Authors: Samiul Alam, Asif Sushmit, Zaowad Abdullah, Shahrin Nakkhatra, MD. Nazmuddoha Ansary, Syed Mobassir Hossen, Sazia Morshed Mehnaz, Tahsin Reasat, Ahmed Imtiaz Humayun

    Abstract: Bengali is one of the most spoken languages in the world with over 300 million speakers globally. Despite its popularity, research into the development of Bengali speech recognition systems is hindered due to the lack of diverse open-source datasets. As a way forward, we have crowdsourced the Bengali Common Voice Speech Dataset, which is a sentence-level automatic speech recognition corpus. Collec…

    Submitted 29 June, 2022; v1 submitted 28 June, 2022; originally announced June 2022.

  12. arXiv:2203.02502  [pdf, other]

    cs.LG cs.AI

    No More Than 6ft Apart: Robust K-Means via Radius Upper Bounds

    Authors: Ahmed Imtiaz Humayun, Randall Balestriero, Anastasios Kyrillidis, Richard Baraniuk

    Abstract: Centroid based clustering methods such as k-means, k-medoids and k-centers are heavily applied as a go-to tool in exploratory data analysis. In many cases, those methods are used to obtain representative centroids of the data manifold for visualization or summarization of a dataset. Real world datasets often contain inherent abnormalities, e.g., repeated samples and sampling bias, that manifest im…

    Submitted 15 June, 2022; v1 submitted 4 March, 2022; originally announced March 2022.

    Comments: Accepted for ICASSP 2022, 8 figures, 1 table

  13. arXiv:2203.01993  [pdf, other]

    cs.CV

    Polarity Sampling: Quality and Diversity Control of Pre-Trained Generative Networks via Singular Values

    Authors: Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk

    Abstract: We present Polarity Sampling, a theoretically justified plug-and-play method for controlling the generation quality and diversity of pre-trained deep generative networks (DGNs). Leveraging the fact that DGNs are, or can be approximated by, continuous piecewise affine splines, we derive the analytical DGN output space distribution as a function of the product of the DGN's Jacobian singular values ra…

    Submitted 6 May, 2022; v1 submitted 3 March, 2022; originally announced March 2022.

    Comments: 20 pages, 16 figures, CVPR 2022 Oral, Camera Ready

  14. arXiv:2110.08009  [pdf, other]

    cs.LG cs.CV

    MaGNET: Uniform Sampling from Deep Generative Network Manifolds Without Retraining

    Authors: Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk

    Abstract: Deep Generative Networks (DGNs) are extensively employed in Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and their variants to approximate the data manifold and distribution. However, training samples are often distributed in a non-uniform fashion on the manifold, due to costs or convenience of collection. For example, the CelebA dataset contains a large fraction of smi…

    Submitted 20 January, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: ICLR Accepted version, 28 pages, 23 figures

  15. arXiv:2010.13975  [pdf, other]

    eess.SP cs.LG

    Wearing a MASK: Compressed Representations of Variable-Length Sequences Using Recurrent Neural Tangent Kernels

    Authors: Sina Alemohammad, Hossein Babaei, Randall Balestriero, Matt Y. Cheung, Ahmed Imtiaz Humayun, Daniel LeJeune, Naiming Liu, Lorenzo Luzi, Jasper Tan, Zichao Wang, Richard G. Baraniuk

    Abstract: High dimensionality poses many challenges to the use of data, from visualization and interpretation, to prediction and storage for historical preservation. Techniques abound to reduce the dimensionality of fixed-length sequences, yet these methods rarely generalize to variable-length sequences. To address this gap, we extend existing methods that rely on the use of kernels to variable-length seque…

    Submitted 17 April, 2021; v1 submitted 26 October, 2020; originally announced October 2020.

  16. A Large Multi-Target Dataset of Common Bengali Handwritten Graphemes

    Authors: Samiul Alam, Tahsin Reasat, Asif Shahriyar Sushmit, Sadi Mohammad Siddiquee, Fuad Rahman, Mahady Hasan, Ahmed Imtiaz Humayun

    Abstract: Latin has historically led the state-of-the-art in handwritten optical character recognition (OCR) research. Adapting existing systems from Latin to alpha-syllabary languages is particularly challenging due to a sharp contrast between their orthographies. The segmentation of graphical constituents corresponding to characters becomes significantly hard due to a cursive writing system and frequent u…

    Submitted 13 January, 2021; v1 submitted 30 September, 2020; originally announced October 2020.

    Comments: 15 pages, 12 figures, 6 Tables, Submitted to CVPR-21

  17. Towards Domain Invariant Heart Sound Abnormality Detection using Learnable Filterbanks

    Authors: Ahmed Imtiaz Humayun, Shabnam Ghaffarzadegan, Md. Istiaq Ansari, Zhe Feng, Taufiq Hasan

    Abstract: Cardiac auscultation is the most practiced non-invasive and cost-effective procedure for the early diagnosis of heart diseases. While machine learning based systems can aid in automatically screening patients, the robustness of these systems is affected by numerous factors including the stethoscope/sensor, environment, and data collection protocol. This paper studies the adverse effect of domain v…

    Submitted 1 October, 2020; v1 submitted 28 September, 2019; originally announced October 2019.

    Comments: Copyright 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

    Journal ref: IEEE Journal of Biomedical and Health Informatics 24 (2020) 2189 - 2198

  18. arXiv:1904.12271  [pdf, other]

    cs.CV eess.IV

    X-Ray Image Compression Using Convolutional Recurrent Neural Networks

    Authors: Asif Shahriyar Sushmit, Shakib Uz Zaman, Ahmed Imtiaz Humayun, Taufiq Hasan, Mohammed Imamul Hassan Bhuiyan

    Abstract: With the advent of a digital health revolution, vast amounts of clinical data are being generated, stored and processed on a daily basis. This has made the storage and retrieval of large volumes of health-care data, especially high-resolution medical images, particularly challenging. Effective image compression for medical images thus plays a vital role in today's healthcare information system, par…

    Submitted 9 May, 2019; v1 submitted 28 April, 2019; originally announced April 2019.

    Comments: 4 pages, 2 figures, IEEE BHI 2019

  19. arXiv:1904.10255  [pdf, other]

    cs.LG cs.CV eess.SP stat.ML

    End-to-end Sleep Staging with Raw Single Channel EEG using Deep Residual ConvNets

    Authors: Ahmed Imtiaz Humayun, Asif Shahriyar Sushmit, Taufiq Hasan, Mohammed Imamul Hassan Bhuiyan

    Abstract: Humans spend approximately a third of their lives sleeping, which makes monitoring sleep an integral part of well-being. In this paper, a 34-layer deep residual ConvNet architecture for end-to-end sleep staging is proposed. The network takes raw single channel electroencephalogram (Fpz-Cz) signal as input and yields hypnogram annotations for each 30 s segment as output. Experiments are carried out…

    Submitted 23 April, 2019; originally announced April 2019.

    Comments: 5 pages, 3 Figures, Appendix, IEEE BHI 2019

  20. arXiv:1810.07274  [pdf]

    cs.NI cs.CR

    Mathematical Modeling of Routes Maintenance and Recovery Procedure for MANETs

    Authors: Zafar Iqbal, Tahreem Saeed, Tariq Rafiq, Ahsan Humayun

    Abstract: Routing is one of the most mysterious issues from the birth of networks up till now. Designing routing protocols for Mobile Ad hoc Networks (MANETs) is a complicated task because unpredictable mobility patterns of mobile nodes greatly affect routing decisions. Various routing protocols are designed to address this very problem. Different simulator based routing protocols are designed but these pro…

    Submitted 23 September, 2018; originally announced October 2018.

    Comments: 8 pages

    Journal ref: IJCSNS International Journal of Computer Science and Network Security, VOL.18 No.8, August 2018

  21. arXiv:1810.04452  [pdf, other]

    cs.CV

    AI Learns to Recognize Bengali Handwritten Digits: Bengali.AI Computer Vision Challenge 2018

    Authors: Sharif Amit Kamran, Ahmed Imtiaz Humayun, Samiul Alam, Rashed Mohammad Doha, Manash Kumar Mandal, Tahsin Reasat, Fuad Rahman

    Abstract: Solving problems with artificial intelligence in a competitive manner has long been absent in Bangladesh and the Bengali-speaking community. On the other hand, there has not been a well-structured database of Bengali handwritten digits for mass public use. To bring out the best minds working in machine learning and use their expertise to create a model which can easily recognize Bengali Handwritten d…

    Submitted 10 October, 2018; originally announced October 2018.

    Comments: 5 pages, 3 figures

  22. An Ensemble of Transfer, Semi-supervised and Supervised Learning Methods for Pathological Heart Sound Classification

    Authors: Ahmed Imtiaz Humayun, Md. Tauhiduzzaman Khan, Shabnam Ghaffarzadegan, Zhe Feng, Taufiq Hasan

    Abstract: In this work, we propose an ensemble of classifiers to distinguish between various degrees of abnormalities of the heart using Phonocardiogram (PCG) signals acquired using digital stethoscopes in a clinical setting, for the INTERSPEECH 2018 Computational Paralinguistics (ComParE) Heart Beats SubChallenge. Our primary classification framework constitutes a convolutional neural network with 1D-CNN t…

    Submitted 7 October, 2018; v1 submitted 18 June, 2018; originally announced June 2018.

    Comments: 5 pages, 5 figures, Interspeech 2018 accepted manuscript

  23. arXiv:1806.05892  [pdf, other]

    cs.CV cs.LG eess.SP stat.ML

    Learning Front-end Filter-bank Parameters using Convolutional Neural Networks for Abnormal Heart Sound Detection

    Authors: Ahmed Imtiaz Humayun, Shabnam Ghaffarzadegan, Zhe Feng, Taufiq Hasan

    Abstract: Automatic heart sound abnormality detection can play a vital role in the early diagnosis of heart diseases, particularly in low-resource settings. The state-of-the-art algorithms for this task utilize a set of Finite Impulse Response (FIR) band-pass filters as a front-end followed by a Convolutional Neural Network (CNN) model. In this work, we propound a novel CNN architecture that integrates the…

    Submitted 15 June, 2018; originally announced June 2018.

    Comments: 4 pages, 6 figures, IEEE International Engineering in Medicine and Biology Conference (EMBC)

  24. arXiv:1806.02452  [pdf, other]

    cs.CV

    NumtaDB - Assembled Bengali Handwritten Digits

    Authors: Samiul Alam, Tahsin Reasat, Rashed Mohammad Doha, Ahmed Imtiaz Humayun

    Abstract: To benchmark Bengali digit recognition algorithms, a large publicly available dataset is required which is free from biases originating from geographical location, gender, and age. With this aim in mind, NumtaDB, a dataset consisting of more than 85,000 images of hand-written Bengali digits, has been assembled. This paper documents the collection and curation process of numerals along with the sal…

    Submitted 6 June, 2018; originally announced June 2018.

    Comments: 6 pages, 12 figures

    MSC Class: 68T10

    ACM Class: I.5.1; I.5.4

  25. arXiv:1705.10470  [pdf, other]

    stat.ML cs.LG

    Iterative Machine Teaching

    Authors: Weiyang Liu, Bo Dai, Ahmad Humayun, Charlene Tay, Chen Yu, Linda B. Smith, James M. Rehg, Le Song

    Abstract: In this paper, we consider the problem of machine teaching, the inverse problem of machine learning. Different from traditional machine teaching which views the learners as batch algorithms, we study a new paradigm where the learner uses an iterative algorithm and a teacher can feed examples sequentially and intelligently based on the current performance of the learner. We show that the teaching c…

    Submitted 17 November, 2017; v1 submitted 30 May, 2017; originally announced May 2017.

    Comments: Published in ICML 2017

  26. arXiv:1705.06021  [pdf]

    cs.CY

    Impact on the Usage of Wireless Sensor Networks in Healthcare Sector

    Authors: Ahsan Humayun, Muneeb Niaz, Muhammad Umar, Muhammad Mujahid

    Abstract: Recent advancements in wireless sensor networks have provided a platform for numerous applications in the healthcare sector. It has become an active research area due to its large-scale potential. This research focuses on the application areas of wireless sensor networks specifically in the healthcare sector. In this work, we have tried to explain the different challenges faced by the WSNs in order t…

    Submitted 17 May, 2017; originally announced May 2017.

    Comments: 4 Pages

    Journal ref: IJCSNS International Journal of Computer Science and Network Security, VOL.17 No.4, April 2017 pp. 102-105

  27. arXiv:1701.04733  [pdf]

    cs.DC

    BTAS: A Library for Tropical Algebra

    Authors: Ahsan Humayun, Dr. Muhammad Asif, Dr. Muhammmad Kashif Hanif

    Abstract: GPUs are dedicated processors used for complex calculations and simulations and they can be effectively used for tropical algebra computations. Tropical algebra is based on max-plus algebra and min-plus algebra. In this paper we propose and design a library based on tropical algebra which provides standard vector and matrix operations, namely Basic Tropical Algebra Subroutines (BTAS).…

    Submitted 17 January, 2017; originally announced January 2017.

    Journal ref: International Journal of Computer Science and Information Security 2016 Volume 14 No.12

  28. Finding Temporally Consistent Occlusion Boundaries in Videos using Geometric Context

    Authors: S. Hussain Raza, Ahmad Humayun, Matthias Grundmann, David Anderson, Irfan Essa

    Abstract: We present an algorithm for finding temporally consistent occlusion boundaries in videos to support segmentation of dynamic scenes. We learn occlusion boundaries in a pairwise Markov random field (MRF) framework. We first estimate the probability of a spatio-temporal edge being an occlusion boundary by using appearance, flow, and geometric features. Next, we enforce occlusion boundary continuity…

    Submitted 25 October, 2015; originally announced October 2015.

    Comments: Applications of Computer Vision (WACV), 2015 IEEE Winter Conference on