
Showing 1–16 of 16 results for author: Dhakal, A

Searching in archive cs.
  1. arXiv:2404.11720  [pdf, other]

    cs.AI

    GEOBIND: Binding Text, Image, and Audio through Satellite Images

    Authors: Aayush Dhakal, Subash Khanal, Srikumar Sastry, Adeel Ahmad, Nathan Jacobs

    Abstract: In remote sensing, we are interested in modeling various modalities for some geographic location. Several works have focused on learning the relationship between a location and type of landscape, habitability, audio, textual descriptions, etc. Recently, a common way to approach these problems is to train a deep-learning model that uses satellite images to infer some unique characteristics of the l…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: 2024 IEEE International Geoscience and Remote Sensing Symposium

  2. arXiv:2404.06637  [pdf, other]

    cs.CV

    GeoSynth: Contextually-Aware High-Resolution Satellite Image Synthesis

    Authors: Srikumar Sastry, Subash Khanal, Aayush Dhakal, Nathan Jacobs

    Abstract: We present GeoSynth, a model for synthesizing satellite images with global style and image-driven layout control. The global style control is via textual prompts or geographic location. These enable the specification of scene semantics or regional appearance respectively, and can be used together. We train our model on a large dataset of paired satellite imagery, with automatically generated capti…

    Submitted 9 April, 2024; originally announced April 2024.

  3. arXiv:2404.03606  [pdf, other]

    cs.SD cs.AI cs.IR eess.AS

    Analyzing Musical Characteristics of National Anthems in Relation to Global Indices

    Authors: S M Rakib Hasan, Aakar Dhakal, Ms. Ayesha Siddiqua, Mohammad Mominur Rahman, Md Maidul Islam, Mohammed Arfat Raihan Chowdhury, S M Masfequier Rahman Swapno, SM Nuruzzaman Nobel

    Abstract: Music plays a huge part in shaping people's psychology and behavioral patterns. This paper investigates the connection between national anthems and different global indices with computational music analysis and statistical correlation analysis. We analyze national anthem musical data to determine whether certain musical characteristics are associated with peace, happiness, suicide rate, crime rate…

    Submitted 4 April, 2024; originally announced April 2024.

  4. arXiv:2404.02375  [pdf, other]

    cs.CL

    Optical Text Recognition in Nepali and Bengali: A Transformer-based Approach

    Authors: S M Rakib Hasan, Aakar Dhakal, Md Humaion Kabir Mehedi, Annajiat Alim Rasel

    Abstract: Efforts on the research and development of OCR systems for Low-Resource Languages are relatively new. Low-resource languages have little training data available for training Machine Translation systems or other systems. Even though a vast amount of text has been digitized and made available on the internet, the text is still in PDF and Image format, which are not instantly accessible. This paper di…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted and Presented at ICAECC 2023, Bengaluru, India

  5. arXiv:2404.02372  [pdf, other]

    cs.CR cs.CL cs.LG

    Obfuscated Malware Detection: Investigating Real-world Scenarios through Memory Analysis

    Authors: S M Rakib Hasan, Aakar Dhakal

    Abstract: In the era of the internet and smart devices, the detection of malware has become crucial for system security. Malware authors increasingly employ obfuscation techniques to evade advanced security solutions, making it challenging to detect and eliminate threats. Obfuscated malware, adept at hiding itself, poses a significant risk to various platforms, including computers, mobile devices, and IoT d…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted and Presented at IEEE-ICTP2023, Dhaka, Bangladesh

  6. arXiv:2312.08334  [pdf, other]

    cs.CV

    LD-SDM: Language-Driven Hierarchical Species Distribution Modeling

    Authors: Srikumar Sastry, Xin Xing, Aayush Dhakal, Subash Khanal, Adeel Ahmad, Nathan Jacobs

    Abstract: We focus on the problem of species distribution modeling using global-scale presence-only data. Most previous studies have mapped the range of a given species using geographical and environmental features alone. To capture a stronger implicit relationship between species, we encode the taxonomic hierarchy of species using a large language model. This enables range mapping for any taxonomic rank an…

    Submitted 13 December, 2023; originally announced December 2023.

    Comments: 17 pages, 9 figures

  7. arXiv:2312.07389  [pdf]

    cs.CV

    Eroding Trust In Aerial Imagery: Comprehensive Analysis and Evaluation Of Adversarial Attacks In Geospatial Systems

    Authors: Michael Lanier, Aayush Dhakal, Zhexiao Xiong, Arthur Li, Nathan Jacobs, Yevgeniy Vorobeychik

    Abstract: In critical operations where aerial imagery plays an essential role, the integrity and trustworthiness of data are paramount. The emergence of adversarial attacks, particularly those that exploit control over labels or employ physically feasible trojans, threatens to erode that trust, making the analysis and mitigation of these attacks a matter of urgency. We demonstrate how adversarial attacks ca…

    Submitted 12 December, 2023; originally announced December 2023.

    Comments: Accepted at IEEE AIPR 2023

  8. arXiv:2311.16140  [pdf]

    cs.CV cs.AI cs.LG

    Adapting Segment Anything Model (SAM) through Prompt-based Learning for Enhanced Protein Identification in Cryo-EM Micrographs

    Authors: Fei He, Zhiyuan Yang, Mingyue Gao, Biplab Poudel, Newgin Sam Ebin Sam Dhas, Rajan Gyawali, Ashwin Dhakal, Jianlin Cheng, Dong Xu

    Abstract: Cryo-electron microscopy (cryo-EM) remains pivotal in structural biology, yet the task of protein particle picking, integral for 3D protein structure construction, is laden with manual inefficiencies. While recent AI tools such as Topaz and crYOLO are advancing the field, they do not fully address the challenges of cryo-EM images, including low contrast, complex shapes, and heterogeneous conformat…

    Submitted 4 November, 2023; originally announced November 2023.

  9. arXiv:2310.19168  [pdf, other]

    cs.CV

    BirdSAT: Cross-View Contrastive Masked Autoencoders for Bird Species Classification and Mapping

    Authors: Srikumar Sastry, Subash Khanal, Aayush Dhakal, Di Huang, Nathan Jacobs

    Abstract: We propose a metadata-aware self-supervised learning (SSL) framework useful for fine-grained classification and ecological mapping of bird species around the world. Our framework unifies two SSL strategies: Contrastive Learning (CL) and Masked Image Modeling (MIM), while also enriching the embedding space with metadata available with ground-level imagery of birds. We separately train uni-modal and…

    Submitted 29 October, 2023; originally announced October 2023.

    Comments: Accepted at WACV 2024

  10. arXiv:2310.02903  [pdf, other]

    cs.LG

    FroSSL: Frobenius Norm Minimization for Efficient Multiview Self-Supervised Learning

    Authors: Oscar Skean, Aayush Dhakal, Nathan Jacobs, Luis Gonzalo Sanchez Giraldo

    Abstract: Self-supervised learning (SSL) is a popular paradigm for representation learning. Recent multiview methods can be classified as sample-contrastive, dimension-contrastive, or asymmetric network-based, with each family having its own approach to avoiding informational collapse. While these families converge to solutions of similar quality, it can be empirically shown that some methods are epoch-inef…

    Submitted 19 March, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: Updated to reflect ECCV submission

  11. arXiv:2309.10667  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS

    Learning Tri-modal Embeddings for Zero-Shot Soundscape Mapping

    Authors: Subash Khanal, Srikumar Sastry, Aayush Dhakal, Nathan Jacobs

    Abstract: We focus on the task of soundscape mapping, which involves predicting the most probable sounds that could be perceived at a particular geographic location. We utilise recent state-of-the-art models to encode geotagged audio, a textual description of the audio, and an overhead image of its capture location using contrastive pre-training. The end result is a shared embedding space for the three moda…

    Submitted 19 September, 2023; originally announced September 2023.

    Comments: Accepted at BMVC 2023

  12. arXiv:2307.15904  [pdf, other]

    cs.CV

    Sat2Cap: Mapping Fine-Grained Textual Descriptions from Satellite Images

    Authors: Aayush Dhakal, Adeel Ahmad, Subash Khanal, Srikumar Sastry, Hannah Kerner, Nathan Jacobs

    Abstract: We propose a weakly supervised approach for creating maps using free-form textual descriptions. We refer to this work of creating textual maps as zero-shot mapping. Prior works have approached mapping tasks by developing models that predict a fixed set of attributes using overhead imagery. However, these models are very restrictive as they can only solve highly specific tasks for which they were t…

    Submitted 11 April, 2024; v1 submitted 29 July, 2023; originally announced July 2023.

    Comments: 16 pages

  13. arXiv:2304.13541  [pdf, other]

    cs.DC cs.PF eess.SY

    D-STACK: High Throughput DNN Inference by Effective Multiplexing and Spatio-Temporal Scheduling of GPUs

    Authors: Aditya Dhakal, Sameer G. Kulkarni, K. K. Ramakrishnan

    Abstract: Hardware accelerators such as GPUs are required for real-time, low-latency inference with Deep Neural Networks (DNN). However, due to the inherent limits to the parallelism they can exploit, DNNs often under-utilize the capacity of today's high-end accelerators. Although spatial multiplexing of the GPU leads to higher GPU utilization and higher inference throughput, there remain a number of chall…

    Submitted 31 March, 2023; originally announced April 2023.

  14. Real-Time Helmet Violation Detection in AI City Challenge 2023 with Genetic Algorithm-Enhanced YOLOv5

    Authors: Elham Soltanikazemi, Ashwin Dhakal, Bijaya Kumar Hatuwal, Imad Eddine Toubal, Armstrong Aboah, Kannappan Palaniappan

    Abstract: This research focuses on real-time surveillance systems as a means for tackling the issue of non-compliance with helmet regulations, a practice that considerably amplifies the risk for motorcycle drivers or riders. Despite the well-established advantages of helmet usage, achieving widespread compliance remains challenging due to diverse contributing factors. To effectively address this concern, re…

    Submitted 20 November, 2023; v1 submitted 13 April, 2023; originally announced April 2023.

    Journal ref: 2023 IEEE Applied Imagery Pattern Recognition Workshop (AIPR)

  15. arXiv:2206.01696  [pdf]

    cs.LG

    Deep Learning Prediction of Severe Health Risks for Pediatric COVID-19 Patients with a Large Feature Set in 2021 BARDA Data Challenge

    Authors: Sajid Mahmud, Elham Soltanikazemi, Frimpong Boadu, Ashwin Dhakal, Jianlin Cheng

    Abstract: Most children infected with COVID-19 have no or mild symptoms and can recover automatically by themselves, but some pediatric COVID-19 patients need to be hospitalized or even to receive intensive medical care (e.g., invasive mechanical ventilation or cardiovascular support) to recover from the illnesses. Therefore, it is critical to predict the severe health risk that COVID-19 infection poses to…

    Submitted 6 June, 2022; v1 submitted 3 June, 2022; originally announced June 2022.

    Comments: Acknowledgment updated, minor typos fixed

  16. arXiv:2008.03602  [pdf, other]

    cs.NE cs.DC eess.SY

    Spatial Sharing of GPU for Autotuning DNN models

    Authors: Aditya Dhakal, Junguk Cho, Sameer G. Kulkarni, K. K. Ramakrishnan, Puneet Sharma

    Abstract: GPUs are used for training, inference, and tuning of machine learning models. However, Deep Neural Networks (DNNs) vary widely in their ability to exploit the full power of high-performance GPUs. Spatial sharing of GPU enables multiplexing several DNNs on the GPU and can improve GPU utilization, thus improving throughput and lowering latency. DNN models given just the right amount of GPU resources…

    Submitted 8 August, 2020; originally announced August 2020.