Showing 1–48 of 48 results for author: Saha, R

Searching in archive cs.
  1. arXiv:2406.03986  [pdf, other]

    cs.CL cs.IR

    On The Persona-based Summarization of Domain-Specific Documents

    Authors: Ankan Mullick, Sombit Bose, Rounak Saha, Ayan Kumar Bhowmick, Pawan Goyal, Niloy Ganguly, Prasenjit Dey, Ravi Kokku

    Abstract: In an ever-expanding world of domain-specific knowledge, the increasing complexity of consuming and storing information necessitates the generation of summaries from large information repositories. However, every persona of a domain has different requirements of information and hence their summarization. For example, in the healthcare domain, a persona-based (such as Doctor, Nurse, Patient etc.)…

    Submitted 6 June, 2024; originally announced June 2024.

    Journal ref: ACL 2024 Findings (Association for Computational Linguistics)

  2. arXiv:2406.03766  [pdf, other]

    eess.SP cs.DC cs.IT cs.LG eess.SY

    Privacy Preserving Semi-Decentralized Mean Estimation over Intermittently-Connected Networks

    Authors: Rajarshi Saha, Mohamed Seif, Michal Yemini, Andrea J. Goldsmith, H. Vincent Poor

    Abstract: We consider the problem of privately estimating the mean of vectors distributed across different nodes of an unreliable wireless network, where communications between nodes can fail intermittently. We adopt a semi-decentralized setup, wherein to mitigate the impact of intermittently connected links, nodes can collaborate with their neighbors to compute a local consensus, which they relay to a cent…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 14 pages, 6 figures. arXiv admin note: text overlap with arXiv:2303.00035

  3. arXiv:2406.02648  [pdf, other]

    cs.LG cs.AI

    Exploring Effects of Hyperdimensional Vectors for Tsetlin Machines

    Authors: Vojtech Halenka, Ahmed K. Kadhim, Paul F. A. Clarke, Bimal Bhattarai, Rupsa Saha, Ole-Christoffer Granmo, Lei Jiao, Per-Arne Andersen

    Abstract: Tsetlin machines (TMs) have been successful in several application domains, operating with high efficiency on Boolean representations of the input data. However, Booleanizing complex data structures such as sequences, graphs, images, signal spectra, chemical compounds, and natural language is not trivial. In this paper, we propose a hypervector (HV) based method for expressing arbitrarily large se…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 9 pages, 17 figures

  4. arXiv:2405.18886  [pdf, ps, other]

    cs.LG cs.AI math.OC stat.ML

    Compressing Large Language Models using Low Rank and Low Precision Decomposition

    Authors: Rajarshi Saha, Naomi Sagan, Varun Srivastava, Andrea J. Goldsmith, Mert Pilanci

    Abstract: The prohibitive sizes of Large Language Models (LLMs) today make it difficult to deploy them on memory-constrained edge devices. This work introduces $\rm CALDERA$ -- a new post-training LLM compression algorithm that harnesses the inherent low-rank structure of a weight matrix $\mathbf{W}$ by approximating it via a low-rank, low-precision decomposition as…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 30 pages, 9 figures, 7 tables

  5. arXiv:2405.00024  [pdf]

    cs.DC cs.RO

    Swarm UAVs Communication

    Authors: Arindam Majee, Rahul Saha, Snehasish Roy, Srilekha Mandal, Sayan Chatterjee

    Abstract: The advancement in cyber-physical systems has opened a new way in disaster management and rescue operations. The usage of UAVs is very promising in this context. UAVs, mainly quadcopters, are small in size and their payload capacity is limited. A single UAV cannot traverse the whole area. Hence multiple UAVs or swarms of UAVs come into the picture managing the entire payload in a modular and equi…

    Submitted 24 February, 2024; originally announced May 2024.

    Comments: 50 pages, 17 figures

  6. arXiv:2404.13605  [pdf, other]

    cs.CV eess.IV

    Turb-Seg-Res: A Segment-then-Restore Pipeline for Dynamic Videos with Atmospheric Turbulence

    Authors: Ripon Kumar Saha, Dehao Qin, Nianyi Li, Jinwei Ye, Suren Jayasuriya

    Abstract: Tackling image degradation due to atmospheric turbulence, particularly in dynamic environments, remains a challenge for long-range imaging systems. Existing techniques have been primarily designed for static scenes or scenes with small motion. This paper presents the first segment-then-restore pipeline for restoring the videos of dynamic scenes in turbulent environments. We leverage mean optical flo…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 Paper

  7. arXiv:2404.04245  [pdf]

    cs.CR cs.CV cs.LG

    Evaluating Adversarial Robustness: A Comparison Of FGSM, Carlini-Wagner Attacks, And The Role of Distillation as Defense Mechanism

    Authors: Trilokesh Ranjan Sarkar, Nilanjan Das, Pralay Sankar Maitra, Bijoy Some, Ritwik Saha, Orijita Adhikary, Bishal Bose, Jaydip Sen

    Abstract: This technical report delves into an in-depth exploration of adversarial attacks specifically targeted at Deep Neural Networks (DNNs) utilized for image classification. The study also investigates defense mechanisms aimed at bolstering the robustness of machine learning models. The research focuses on comprehending the ramifications of two prominent attack methodologies: the Fast Gradient Sign Met…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: This report pertains to the Capstone Project done by Group 1 of the Fall batch of 2023 students at Praxis Tech School, Kolkata, India. The report consists of 35 pages and includes 15 figures and 10 tables. This is the preprint which will be submitted to an IEEE international conference for review

  8. arXiv:2402.04335  [pdf, other]

    cs.CL cs.AI cs.LG

    LegalLens: Leveraging LLMs for Legal Violation Identification in Unstructured Text

    Authors: Dor Bernsohn, Gil Semo, Yaron Vazana, Gila Hayat, Ben Hagag, Joel Niklaus, Rohit Saha, Kyryl Truskovskyi

    Abstract: In this study, we focus on two main tasks, the first for detecting legal violations within unstructured textual data, and the second for associating these violations with potentially affected individuals. We constructed two datasets using Large Language Models (LLMs) which were subsequently validated by domain expert annotators. Both tasks were designed specifically for the context of class-action…

    Submitted 6 February, 2024; originally announced February 2024.

  9. arXiv:2401.11021  [pdf]

    cs.CL cs.AI cs.IR

    Analysis and Detection of Multilingual Hate Speech Using Transformer Based Deep Learning

    Authors: Arijit Das, Somashree Nandy, Rupam Saha, Srijan Das, Diganta Saha

    Abstract: Hate speech is harmful content that directly attacks or promotes hatred against members of groups or individuals based on actual or perceived aspects of identity, such as racism, religion, or sexual orientation. This can affect social life on social media platforms as hateful content shared through social media can harm both individuals and communities. As the prevalence of hate speech increases o…

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: 20 pages

  10. arXiv:2311.03572  [pdf, other]

    cs.CV

    Unsupervised Region-Growing Network for Object Segmentation in Atmospheric Turbulence

    Authors: Dehao Qin, Ripon Saha, Suren Jayasuriya, Jinwei Ye, Nianyi Li

    Abstract: In this paper, we present a two-stage unsupervised foreground object segmentation network tailored for dynamic scenes affected by atmospheric turbulence. In the first stage, we utilize averaged optical flow from turbulence-distorted image sequences to feed a novel region-growing algorithm, crafting preliminary masks for each moving object in the video. In the second stage, we employ a U-Net archit…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: 9 pages, 4 figures

  11. arXiv:2310.18457  [pdf, other]

    cs.AI cs.LG

    LLMSTEP: LLM proofstep suggestions in Lean

    Authors: Sean Welleck, Rahul Saha

    Abstract: We present LLMSTEP, a tool for integrating a language model into the Lean proof assistant. LLMSTEP is a Lean 4 tactic that sends a user's proof state to a server hosting a language model. The language model generates suggestions, which are checked in Lean and displayed to a user in their development environment. We provide a baseline language model, along with code for fine-tuning and evaluation t…

    Submitted 27 October, 2023; originally announced October 2023.

    ACM Class: I.2.2; I.2.5; I.2.7

  12. arXiv:2310.17207  [pdf, other]

    cs.AI cs.CL

    Efficient Data Fusion using the Tsetlin Machine

    Authors: Rupsa Saha, Vladimir I. Zadorozhny, Ole-Christoffer Granmo

    Abstract: We propose a novel way of assessing and fusing noisy dynamic data using a Tsetlin Machine. Our approach consists in monitoring how the explanations, in the form of logical clauses, that a TM learns change with possible noise in dynamic data. This way, the TM can recognize the noise by lowering weights of previously learned clauses, or reflect it in the form of new clauses. We also perform a comprehensive experi…

    Submitted 26 October, 2023; originally announced October 2023.

  13. arXiv:2310.11028  [pdf, other]

    cs.LG cs.IT stat.ML

    Matrix Compression via Randomized Low Rank and Low Precision Factorization

    Authors: Rajarshi Saha, Varun Srivastava, Mert Pilanci

    Abstract: Matrices are exceptionally useful in various fields of study as they provide a convenient framework to organize and manipulate data in a structured manner. However, modern matrices can involve billions of elements, making their storage and processing quite demanding in terms of computational resources and memory usage. Although prohibitively large, such matrices are often approximately low rank. W…

    Submitted 17 October, 2023; originally announced October 2023.

    Comments: Accepted to the 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  14. arXiv:2310.07957  [pdf, other]

    cs.CL cs.AI

    A New Approach Towards Autoformalization

    Authors: Nilay Patel, Rahul Saha, Jeffrey Flanigan

    Abstract: Verifying mathematical proofs is difficult, but can be automated with the assistance of a computer. Autoformalization is the task of automatically translating natural language mathematics into a formal language that can be verified by a program. This is a challenging task, especially for higher-level mathematics found in research papers. Research paper mathematics requires large amounts of bac…

    Submitted 19 October, 2023; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: Under review at MATHAI 2023 @ NeurIPS 2023

  15. arXiv:2307.05827  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    Relational Extraction on Wikipedia Tables using Convolutional and Memory Networks

    Authors: Arif Shahriar, Rohan Saha, Denilson Barbosa

    Abstract: Relation extraction (RE) is the task of extracting relations between entities in text. Most RE methods extract relations from free-form running text and leave out other rich data sources, such as tables. We explore RE from the perspective of applying neural methods on tabularly organized data. We introduce a new model consisting of Convolutional Neural Network (CNN) and Bidirectional-Long Short Te…

    Submitted 11 July, 2023; originally announced July 2023.

  16. arXiv:2305.03144  [pdf, other]

    cs.LG cs.CL cs.IR

    Influence of various text embeddings on clustering performance in NLP

    Authors: Rohan Saha

    Abstract: With the advent of e-commerce platforms, reviews are crucial for customers to assess the credibility of a product. The star ratings do not always match the review text written by the customer. For example, a three star rating (out of five) may be incongruous with the review text, which may be more suitable for a five star review. A clustering approach can be used to relabel the correct star rating…

    Submitted 4 May, 2023; originally announced May 2023.

  17. arXiv:2303.00035  [pdf, other]

    cs.IT cs.CR cs.DC cs.LG

    Collaborative Mean Estimation over Intermittently Connected Networks with Peer-To-Peer Privacy

    Authors: Rajarshi Saha, Mohamed Seif, Michal Yemini, Andrea J. Goldsmith, H. Vincent Poor

    Abstract: This work considers the problem of Distributed Mean Estimation (DME) over networks with intermittent connectivity, where the goal is to learn a global statistic over the data samples localized across distributed nodes with the help of a central server. To mitigate the impact of intermittent links, nodes can collaborate with their neighbors to compute local consensus which they forward to the centr…

    Submitted 28 February, 2023; originally announced March 2023.

    Comments: 10 pages, 4 figures

  18. arXiv:2301.08190  [pdf, other]

    cs.LG cs.AI cs.LO

    Building Concise Logical Patterns by Constraining Tsetlin Machine Clause Size

    Authors: K. Darshana Abeyrathna, Ahmed Abdulrahem Othman Abouzeid, Bimal Bhattarai, Charul Giri, Sondre Glimsdal, Ole-Christoffer Granmo, Lei Jiao, Rupsa Saha, Jivitesh Sharma, Svein Anders Tunheim, Xuan Zhang

    Abstract: Tsetlin machine (TM) is a logic-based machine learning approach with the crucial advantages of being transparent and hardware-friendly. While TMs match or surpass deep learning accuracy for an increasing number of applications, large clause pools tend to produce clauses with many literals (long clauses). As such, they become less interpretable. Further, longer clauses increase the switching activi…

    Submitted 19 January, 2023; originally announced January 2023.

    Comments: 17 pages, 4 figures

  19. arXiv:2301.07526  [pdf, other]

    cs.LG

    AutoFraudNet: A Multimodal Network to Detect Fraud in the Auto Insurance Industry

    Authors: Azin Asgarian, Rohit Saha, Daniel Jakubovitz, Julia Peyre

    Abstract: In the insurance industry detecting fraudulent claims is a critical task with a significant financial impact. A common strategy to identify fraudulent claims is looking for inconsistencies in the supporting evidence. However, this is a laborious and cognitively heavy task for human experts as insurance claims typically come with a plethora of data from different modalities (e.g. images, text and m…

    Submitted 15 January, 2023; originally announced January 2023.

    Comments: Published at The AAAI-2023 Workshop On Multimodal AI For Financial Forecasting

  20. arXiv:2206.15176  [pdf, ps, other]

    cs.NI

    A Time Series Forecasting Approach to Minimize Cold Start Time in Cloud-Serverless Platform

    Authors: Akash Puliyadi Jegannathan, Rounak Saha, Sourav Kanti Addya

    Abstract: Serverless computing is a buzzword that is being used commonly in the world of technology and among developers and businesses. Using the Function-as-a-Service (FaaS) model of serverless, one can easily deploy their applications to the cloud and go live in a matter of days; it lets developers focus on their core business logic and the backend process such as managing the infrastructur…

    Submitted 30 June, 2022; originally announced June 2022.

    Journal ref: IEEE BlackSeaCom 2022

  21. arXiv:2206.09242  [pdf, other]

    cs.CV cs.LG

    GaLeNet: Multimodal Learning for Disaster Prediction, Management and Relief

    Authors: Rohit Saha, Mengyi Fang, Angeline Yasodhara, Kyryl Truskovskyi, Azin Asgarian, Daniel Homola, Raahil Shah, Frederik Dieleman, Jack Weatheritt, Thomas Rogers

    Abstract: After a natural disaster, such as a hurricane, millions are left in need of emergency assistance. To allocate resources optimally, human planners need to accurately analyze data that can flow in large volumes from several sources. This motivates the development of multimodal machine learning frameworks that can integrate multiple data sources and leverage them efficiently. To date, the research co…

    Submitted 18 June, 2022; originally announced June 2022.

    Comments: Accepted to CVPR 2022 Workshop on Multimodal Learning for Earth and Environment

  22. arXiv:2205.15543  [pdf, other]

    q-bio.QM cs.CV eess.IV

    AI-based automated Meibomian gland segmentation, classification and reflection correction in infrared Meibography

    Authors: Ripon Kumar Saha, A. M. Mahmud Chowdhury, Kyung-Sun Na, Gyu Deok Hwang, Youngsub Eom, Jaeyoung Kim, Hae-Gon Jeon, Ho Sik Hwang, Euiheon Chung

    Abstract: Purpose: Develop a deep learning-based automated method to segment meibomian glands (MG) and eyelids, quantitatively analyze the MG area and MG ratio, estimate the meiboscore, and remove specular reflections from infrared images. Methods: A total of 1600 meibography images were captured in a clinical setting. 1000 images were precisely annotated with multiple revisions by investigators and graded…

    Submitted 31 May, 2022; originally announced May 2022.

    Comments: 11 pages, 13 Figures, 5 Supplementary Figures

  23. arXiv:2205.10998  [pdf, other]

    cs.LG cs.DC cs.MA

    Semi-Decentralized Federated Learning with Collaborative Relaying

    Authors: Michal Yemini, Rajarshi Saha, Emre Ozfatura, Deniz Gündüz, Andrea J. Goldsmith

    Abstract: We present a semi-decentralized federated learning algorithm wherein clients collaborate by relaying their neighbors' local updates to a central parameter server (PS). At every communication round to the PS, each client computes a local consensus of the updates from its neighboring clients and eventually transmits a weighted average of its own update and those of its neighbors to the PS. We approp…

    Submitted 22 May, 2022; originally announced May 2022.

    Comments: Accepted for presentation at the IEEE ISIT 2022. This is a conference version of arXiv:2202.11850

  24. arXiv:2202.11850  [pdf, other]

    cs.DC cs.LG cs.MA

    Robust Federated Learning with Connectivity Failures: A Semi-Decentralized Framework with Collaborative Relaying

    Authors: Michal Yemini, Rajarshi Saha, Emre Ozfatura, Deniz Gündüz, Andrea J. Goldsmith

    Abstract: Intermittent connectivity of clients to the parameter server (PS) is a major bottleneck in federated edge learning frameworks. The lack of constant connectivity induces a large generalization gap, especially when the local data distribution amongst clients exhibits heterogeneity. To overcome intermittent communication outages between clients and the central PS, we introduce the concept of collabor…

    Submitted 20 October, 2022; v1 submitted 23 February, 2022; originally announced February 2022.

  25. arXiv:2202.11277  [pdf, other]

    cs.IT cs.LG eess.SP stat.ML

    Minimax Optimal Quantization of Linear Models: Information-Theoretic Limits and Efficient Algorithms

    Authors: Rajarshi Saha, Mert Pilanci, Andrea J. Goldsmith

    Abstract: High-dimensional models often have a large memory footprint and must be quantized after training before being deployed on resource-constrained edge devices for inference tasks. In this work, we develop an information-theoretic framework for the problem of quantizing a linear regressor learned from training data $(\mathbf{X}, \mathbf{y})$, for some underlying statistical relationship…

    Submitted 30 August, 2022; v1 submitted 22 February, 2022; originally announced February 2022.

    Comments: 50 pages, 31 figures, 9 tables

  26. arXiv:2202.10451  [pdf, other]

    cs.LG cs.AI cs.SE

    SapientML: Synthesizing Machine Learning Pipelines by Learning from Human-Written Solutions

    Authors: Ripon K. Saha, Akira Ura, Sonal Mahajan, Chenguang Zhu, Linyi Li, Yang Hu, Hiroaki Yoshida, Sarfraz Khurshid, Mukul R. Prasad

    Abstract: Automatic machine learning, or AutoML, holds the promise of truly democratizing the use of machine learning (ML), by substantially automating the work of data scientists. However, the huge combinatorial search space of candidate pipelines means that current AutoML techniques generate sub-optimal pipelines, or none at all, especially on large, complex datasets. In this work we propose an AutoML te…

    Submitted 19 April, 2022; v1 submitted 18 February, 2022; originally announced February 2022.

    Comments: Accepted to the Technical Track of ICSE 2022

  27. Elixir: Effective object-oriented program repair

    Authors: Ripon K. Saha, Yingjun Lyu, Hiroaki Yoshida, Mukul R. Prasad

    Abstract: This work is motivated by the pervasive use of method invocations in object-oriented (OO) programs, and indeed their prevalence in patches of OO-program bugs. We propose a generate-and-validate repair technique, called ELIXIR, designed to be able to generate such patches. ELIXIR aggressively uses method calls, on par with local variables, fields, or constants, to construct more expressive repair-ex…

    Submitted 20 December, 2021; originally announced December 2021.

    Journal ref: 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE) 2017 Oct 30 (pp. 648-659). IEEE

  28. arXiv:2110.01015  [pdf, other]

    cs.CV cs.AI cs.LG

    Spatio-Temporal Video Representation Learning for AI Based Video Playback Style Prediction

    Authors: Rishubh Parihar, Gaurav Ramola, Ranajit Saha, Ravi Kini, Aniket Rege, Sudha Velusamy

    Abstract: Ever-increasing smartphone-generated video content demands intelligent techniques to edit and enhance videos on power-constrained devices. Most of the best performing algorithms for video understanding tasks like action recognition, localization, etc., rely heavily on rich spatio-temporal representations to make accurate predictions. For effective learning of the spatio-temporal representation, it…

    Submitted 3 October, 2021; originally announced October 2021.

    Comments: 10 pages, 5 figures, 4 tables, ICCV Workshops 2021 - SRVU

  29. arXiv:2110.00751  [pdf, other]

    cs.LG cs.AI cs.MA cs.RO stat.ML

    Partner-Aware Algorithms in Decentralized Cooperative Bandit Teams

    Authors: Erdem Bıyık, Anusha Lalitha, Rajarshi Saha, Andrea Goldsmith, Dorsa Sadigh

    Abstract: When humans collaborate with each other, they often make decisions by observing others and considering the consequences that their actions may have on the entire team, instead of greedily doing what is best for just themselves. We would like our AI agents to effectively collaborate in a similar way by capturing a model of their partners. In this work, we propose and analyze a decentralized Multi-A…

    Submitted 16 December, 2021; v1 submitted 2 October, 2021; originally announced October 2021.

    Comments: 14 pages, 13 figures. To be presented at "Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI) 2022". Also presented at "Artificial Intelligence for Human-Robot Interaction (AI-HRI) at AAAI Fall Symposium Series 2021"

    Report number: AIHRI/2021/46

  30. arXiv:2103.07578  [pdf, other]

    cs.LG cs.IT math.OC

    Efficient Randomized Subspace Embeddings for Distributed Optimization under a Communication Budget

    Authors: Rajarshi Saha, Mert Pilanci, Andrea J. Goldsmith

    Abstract: We study first-order optimization algorithms under the constraint that the descent direction is quantized using a pre-specified budget of $R$-bits per dimension, where $R \in (0 ,\infty)$. We propose computationally efficient optimization algorithms with convergence rates matching the information-theoretic performance lower bounds for: (i) Smooth and Strongly-Convex objectives with access to an Ex…

    Submitted 15 August, 2022; v1 submitted 12 March, 2021; originally announced March 2021.

    Comments: 41 pages, 26 figures, 1 table. This work has been accepted for publication in the IEEE Journal on Selected Areas in Information Theory (JSAIT), Spl. issue on Distributed Coding and Computation

  31. arXiv:2103.03891  [pdf, other]

    cs.CV cs.LG

    LOHO: Latent Optimization of Hairstyles via Orthogonalization

    Authors: Rohit Saha, Brendan Duke, Florian Shkurti, Graham W. Taylor, Parham Aarabi

    Abstract: Hairstyle transfer is challenging due to hair structure differences in the source and target hair. Therefore, we propose Latent Optimization of Hairstyles via Orthogonalization (LOHO), an optimization-based approach using GAN inversion to infill missing hair structure details in latent space during hairstyle transfer. Our approach decomposes hair into three attributes: perceptual structure, appear…

    Submitted 10 March, 2021; v1 submitted 5 March, 2021; originally announced March 2021.

    Comments: CVPR 2021

  32. arXiv:2102.10952  [pdf, other]

    cs.CL cs.AI cs.LG cs.LO

    A Relational Tsetlin Machine with Applications to Natural Language Understanding

    Authors: Rupsa Saha, Ole-Christoffer Granmo, Vladimir I. Zadorozhny, Morten Goodwin

    Abstract: TMs are a pattern recognition approach that uses finite state machines for learning and propositional logic to represent patterns. In addition to being natively interpretable, they have provided competitive accuracy for various tasks. In this paper, we increase the computing power of TMs by proposing a first-order logic-based framework with Herbrand semantics. The resulting TM is relational and ca…

    Submitted 22 February, 2021; originally announced February 2021.

    Comments: 14 pages, 3 figures, 7 tables, relational approach to TM in NLP

    ACM Class: I.2.7; I.2.4

  33. arXiv:2102.04327  [pdf, other]

    astro-ph.CO cs.LG gr-qc hep-ph hep-th

    An Unbiased Estimator of the Full-sky CMB Angular Power Spectrum at Large Scales using Neural Networks

    Authors: Pallav Chanda, Rajib Saha

    Abstract: Accurate estimation of the Cosmic Microwave Background (CMB) angular power spectrum is enticing due to the prospect for precision cosmology it presents. Galactic foreground emissions, however, contaminate the CMB signal and need to be subtracted reliably in order to lessen systematic errors on the CMB temperature estimates. Typically bright foregrounds in a region lead to further uncertainty in te…

    Submitted 22 September, 2021; v1 submitted 8 February, 2021; originally announced February 2021.

    Comments: 10 pages, 11 figures; altered methodology, added links to references, updated analysis using latest available data, modified the write-up accordingly

  34. Analysis of Evolutionary Program Synthesis for Card Games

    Authors: Rohan Saha, Cassidy Pirlot

    Abstract: In this report, we inspect the application of an evolutionary approach to the game of Rack'O, which is a card game revolving around the notion of decision making. We first apply the evolutionary technique for obtaining a set of rules over many generations and then compare them with a script written by a human player. A high-level domain-specific language is used that determines which the sets of…

    Submitted 7 January, 2021; originally announced January 2021.

  35. Homonym Identification using BERT -- Using a Clustering Approach

    Authors: Rohan Saha

    Abstract: Homonym identification is important for WSD that require coarse-grained partitions of senses. The goal of this project is to determine whether contextual information is sufficient for identifying a homonymous word. To capture the context, BERT embeddings are used as opposed to Word2Vec, which conflates senses into one vector. SemCor is leveraged to retrieve the embeddings. Various clustering algor…

    Submitted 7 January, 2021; originally announced January 2021.

  36. arXiv:2101.01904  [pdf, other]

    astro-ph.EP astro-ph.IM cs.LG

    Comparing Classification Models on Kepler Data

    Authors: Rohan Saha

    Abstract: Even though the original Kepler mission ended due to mechanical failures, the Kepler satellite continues to collect data. Using classification models, we can understand the features exoplanets possess and then use those features to investigate further for any more information on the candidate planet. Based on the classification model, the idea is to find out the probability of the planet under obs…

    Submitted 6 January, 2021; v1 submitted 6 January, 2021; originally announced January 2021.

  37. arXiv:2009.04861  [pdf, other]

    cs.AI cs.LG

    Massively Parallel and Asynchronous Tsetlin Machine Architecture Supporting Almost Constant-Time Scaling

    Authors: K. Darshana Abeyrathna, Bimal Bhattarai, Morten Goodwin, Saeed Gorji, Ole-Christoffer Granmo, Lei Jiao, Rupsa Saha, Rohan K. Yadav

    Abstract: Using logical clauses to represent patterns, Tsetlin Machines (TMs) have recently obtained competitive performance in terms of accuracy, memory footprint, energy, and learning speed on several benchmarks. Each TM clause votes for or against a particular class, with classification resolved using a majority vote. While the evaluation of clauses is fast, being based on binary operators, the voting ma…

    Submitted 9 June, 2021; v1 submitted 10 September, 2020; originally announced September 2020.

    Comments: Accepted to ICML 2021

  38. arXiv:2008.10843  [pdf, ps, other]

    cs.CV

    Graphical Object Detection in Document Images

    Authors: Ranajit Saha, Ajoy Mondal, C. V. Jawahar

    Abstract: Graphical elements, particularly tables and figures, contain a visual summary of the most valuable information contained in a document. Therefore, localization of such graphical objects in the document images is the initial step to understand the content of such graphical objects or document images. In this paper, we present a novel end-to-end trainable deep learning based framework to localize gra…

    Submitted 25 August, 2020; originally announced August 2020.

    Comments: 8

    Journal ref: ICDAR 2019

  39. Decentralized Accessibility of e-commerce Products through Blockchain Technology

    Authors: Gulshan Kumar, Rahul Saha, William J Buchanan, G. Geetha, Reji Thomas, Tai-Hoon Kim, Mamoun Alazab

    Abstract: A distributed and transparent ledger system is considered for various e-commerce products including health medicines, electronics, security appliances, food products and many more to ensure technological and e-commerce sustainability. This solution, named as 'PRODCHAIN', is a generic blockchain framework with lattice-based cryptographic processes for reducing the complexity for tracing the e-comme…

    Submitted 10 July, 2020; originally announced July 2020.

    Journal ref: Sustainable Cities and Society, 102361 (2020)

  40. arXiv:2005.06684  [pdf, other]

    eess.IV cs.CV cs.LG

    W-Cell-Net: Multi-frame Interpolation of Cellular Microscopy Videos

    Authors: Rohit Saha, Abenezer Teklemariam, Ian Hsu, Alan M. Moses

    Abstract: Deep Neural Networks are increasingly used in video frame interpolation tasks such as frame rate changes as well as generating fake face videos. Our project aims to apply recent advances in Deep video interpolation to increase the temporal resolution of fluorescent microscopy time-lapse movies. To our knowledge, there is no previous work that uses Convolutional Neural Networks (CNN) to generate fr…

    Submitted 13 May, 2020; originally announced May 2020.

  41. arXiv:1911.05627  [pdf, other]

    cs.CV cs.LG eess.IV stat.ML

    Wavelets to the Rescue: Improving Sample Quality of Latent Variable Deep Generative Models

    Authors: Prashnna K Gyawali, Rudra Saha, Linwei Wang, VSR Veeravasarapu, Maneesh Singh

    Abstract: Variational Autoencoders (VAE) are probabilistic deep generative models underpinned by elegant theory, stable training processes, and meaningful manifold representations. However, they produce blurry images due to a lack of explicit emphasis over high-frequency textural details of the images, and the difficulty to directly model the complex joint probability distribution over the high-dimensional…

    Submitted 26 October, 2019; originally announced November 2019.

  42. arXiv:1906.08903  [pdf, other]

    cs.SE

    Harnessing Evolution for Multi-Hunk Program Repair

    Authors: Seemanta Saha, Ripon K. Saha, Mukul R. Prasad

    Abstract: Despite significant advances in automatic program repair (APR) techniques over the past decade, practical deployment remains an elusive goal. One of the important challenges in this regard is the general inability of current APR techniques to produce patches that require edits in multiple locations, i.e., multi-hunk patches. In this work, we present a novel APR technique that generalizes single-hun…

    Submitted 20 June, 2019; originally announced June 2019.

  43. arXiv:1812.03631  [pdf, other]

    cs.CV

    Spatial Knowledge Distillation to aid Visual Reasoning

    Authors: Somak Aditya, Rudra Saha, Yezhou Yang, Chitta Baral

    Abstract: For tasks involving language and vision, the current state-of-the-art methods tend not to leverage any additional information that might be present to gather relevant (commonsense) knowledge. A representative task is Visual Question Answering where large diagnostic datasets have been proposed to test a system's capability of answering questions about images. The training data is often accompanied…

    Submitted 11 December, 2018; v1 submitted 10 December, 2018; originally announced December 2018.

    Comments: Equal contribution by first two authors. Accepted in WACV 2019

  44. arXiv:1802.06947  [pdf, other]

    cs.SE

    Entropy Guided Spectrum Based Bug Localization Using Statistical Language Model

    Authors: Saikat Chakraborty, Yujian Li, Matt Irvine, Ripon Saha, Baishakhi Ray

    Abstract: Locating bugs is challenging but is one of the most important activities in the software development and maintenance phases, because there are no certain rules to identify all types of bugs. Existing automatic bug localization tools use various heuristics based on test coverage, pre-determined buggy patterns, or textual similarity with bug reports, to rank suspicious program elements. However, since these t…

    Submitted 19 February, 2018; originally announced February 2018.

    Comments: 13 pages

  45. arXiv:1606.00175  [pdf, other]

    cs.LO

    Polynomial Analysis Algorithms for Free Choice Probabilistic Workflow Nets

    Authors: Javier Esparza, Philipp Hoffmann, Ratul Saha

    Abstract: We study Probabilistic Workflow Nets (PWNs), a model extending van der Aalst's workflow nets with probabilities. We give a semantics for PWNs in terms of Markov Decision Processes and introduce a reward model. Using a result by Varacca and Nielsen, we show that the expected reward of a complete execution of the PWN is independent of the scheduler. Extending previous work on reduction of non-probab…

    Submitted 1 June, 2016; originally announced June 2016.

  46. arXiv:1408.0979  [pdf, other]

    cs.DC cs.LO

    Distributed Markov Chains

    Authors: Sumit Kumar Jha, Madhavan Mukund, Ratul Saha, P S Thiagarajan

    Abstract: The formal verification of large probabilistic models is important and challenging. Exploiting the concurrency that is often present is one way to address this problem. Here we study a restricted class of asynchronous distributed probabilistic systems in which the synchronizations determine the probability distribution for the next moves of the participating agents. The key restriction we impose i…

    Submitted 5 August, 2014; originally announced August 2014.

    ACM Class: D.2.4; F.1.2; F.3.1; F.4.1

  47. arXiv:1110.3379  [pdf]

    cs.SE

    Identifying Reference Objects by Hierarchical Clustering in Java Environment

    Authors: Rahul Saha, Dr. G. Geetha

    Abstract: Recently, the Java programming environment has become very popular. The Java programming language is designed to be portable enough to be executed on a wide range of computers, from cell phones to supercomputers. Computer programs written in Java are compiled into Java bytecode instructions that are suitable for execution by a Java Virtual Machine implementation. Java Virtual Machine…

    Submitted 15 October, 2011; originally announced October 2011.

    Comments: 8 pages, 13 tables, 2 figures

    Journal ref: IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 5, No 3, September 2011 ISSN (Online): 1694-0814

  48. arXiv:0908.0080  [pdf]

    cs.CR

    A Novel Generic Session Based Bit Level Encryption Technique to Enhance Information Security

    Authors: Manas Paul, Tanmay Bhattacharya, Suvajit Pal, Ranit Saha

    Abstract: In this paper, a session-based symmetric key encryption system is proposed, termed the Permutated Cipher Technique (PCT). This technique is faster, more suitable, and more secure for larger files. In this technique, the input file is broken down into blocks of various sizes (of the order 2^n) and encrypted by shifting the position of each bit by a certain value for a certain number of times…

    Submitted 1 August, 2009; originally announced August 2009.

    Comments: 7 Pages, International Journal of Computer Science and Information Security, IJCSIS July 2009, ISSN 1947 5500, Impact Factor 0.423

    Journal ref: International Journal of Computer Science and Information Security, IJCSIS, Vol. 3, No. 1, July 2009, USA