Showing 1–50 of 103,283 results for author: R

Searching in archive cs.
  1. arXiv:2407.03320  [pdf, other]

    cs.CV cs.CL

    InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

    Authors: Pan Zhang, Xiaoyi Dong, Yuhang Zang, Yuhang Cao, Rui Qian, Lin Chen, Qipeng Guo, Haodong Duan, Bin Wang, Linke Ouyang, Songyang Zhang, Wenwei Zhang, Yining Li, Yang Gao, Peng Sun, Xinyue Zhang, Wei Li, Jingwen Li, Wenhai Wang, Hang Yan, Conghui He, Xingcheng Zhang, Kai Chen, Jifeng Dai, Yu Qiao, et al. (2 additional authors not shown)

    Abstract: We present InternLM-XComposer-2.5 (IXC-2.5), a versatile large-vision language model that supports long-contextual input and output. IXC-2.5 excels in various text-image comprehension and composition applications, achieving GPT-4V level capabilities with merely 7B LLM backend. Trained with 24K interleaved image-text contexts, it can seamlessly extend to 96K long contexts via RoPE extrapolation. Th… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Technical Report. https://github.com/InternLM/InternLM-XComposer

  2. arXiv:2407.03314  [pdf, other]

    cs.CV cs.CL cs.DB

    BACON: Supercharge Your VLM with Bag-of-Concept Graph to Mitigate Hallucinations

    Authors: Zhantao Yang, Ruili Feng, Keyu Yan, Huangji Wang, Zhicai Wang, Shangwen Zhu, Han Zhang, Jie Xiao, Pingyu Wu, Kai Zhu, Jixuan Chen, Chen-Wei Xie, Chaojie Mao, Yue Yang, Hongyang Zhang, Yu Liu, Fan Cheng

    Abstract: This paper presents Bag-of-Concept Graph (BACON) to gift models with limited linguistic abilities to taste the privilege of Vision Language Models (VLMs) and boost downstream tasks such as detection, visual question answering (VQA), and image generation. Since the visual scenes in physical worlds are structured with complex relations between objects, BACON breaks down annotations into basic minimu… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  3. arXiv:2407.03307  [pdf, other]

    eess.IV cs.CV

    HoloHisto: End-to-end Gigapixel WSI Segmentation with 4K Resolution Sequential Tokenization

    Authors: Yucheng Tang, Yufan He, Vishwesh Nath, Pengfeig Guo, Ruining Deng, Tianyuan Yao, Quan Liu, Can Cui, Mengmeng Yin, Ziyue Xu, Holger Roth, Daguang Xu, Haichun Yang, Yuankai Huo

    Abstract: In digital pathology, the traditional method for deep learning-based image segmentation typically involves a two-stage process: initially segmenting high-resolution whole slide images (WSI) into smaller patches (e.g., 256x256, 512x512, 1024x1024) and subsequently reconstructing them to their original scale. This method often struggles to capture the complex details and vast scope of WSIs. In this… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  4. arXiv:2407.03289  [pdf, other]

    cs.IT cs.CR cs.LG

    Correlated Privacy Mechanisms for Differentially Private Distributed Mean Estimation

    Authors: Sajani Vithana, Viveck R. Cadambe, Flavio P. Calmon, Haewon Jeong

    Abstract: Differentially private distributed mean estimation (DP-DME) is a fundamental building block in privacy-preserving federated learning, where a central server estimates the mean of $d$-dimensional vectors held by $n$ users while ensuring $(ε,δ)$-DP. Local differential privacy (LDP) and distributed DP with secure aggregation (SecAgg) are the most common notions of DP used in DP-DME settings with an u… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  5. arXiv:2407.03277  [pdf, other]

    cs.CL

    Evaluating Automatic Metrics with Incremental Machine Translation Systems

    Authors: Guojun Wu, Shay B. Cohen, Rico Sennrich

    Abstract: We introduce a dataset comprising commercial machine translations, gathered weekly over six years across 12 translation directions. Since human A/B testing is commonly used, we assume commercial systems improve over time, which enables us to evaluate machine translation (MT) metrics based on their preference for more recent translations. Our study confirms several previous findings in MT metrics r… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  6. arXiv:2407.03241  [pdf, other]

    cs.RO cs.LG

    Terrain Classification Enhanced with Uncertainty for Space Exploration Robots from Proprioceptive Data

    Authors: Mariela De Lucas Álvarez, Jichen Guo, Raul Domínguez, Matias Valdenegro-Toro

    Abstract: Terrain Classification is an essential task in space exploration, where unpredictable environments are difficult to observe using only exteroceptive sensors such as vision. Implementing Neural Network classifiers can have high performance but can be deemed untrustworthy as they lack transparency, which makes them unreliable for taking high-stakes decisions during mission planning. We address this… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 6 pages, 4 figures. LatinX in AI Workshop @ ICML 2023 Camera Ready

  7. arXiv:2407.03239  [pdf, other]

    q-bio.QM cs.CV

    Solving the inverse problem of microscopy deconvolution with a residual Beylkin-Coifman-Rokhlin neural network

    Authors: Rui Li, Mikhail Kudryashev, Artur Yakimovich

    Abstract: Optic deconvolution in light microscopy (LM) refers to recovering the object details from images, revealing the ground truth of samples. Traditional explicit methods in LM rely on the point spread function (PSF) during image acquisition. Yet, these approaches often fall short due to inaccurate PSF models and noise artifacts, hampering the overall restoration quality. In this paper, we approached t… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 17 pages, 8 figures

    ACM Class: I.4; J.3

  8. arXiv:2407.03235  [pdf, other]

    physics.optics cs.ET physics.app-ph

    Programming universal unitary transformations on a general-purpose silicon photonics platform

    Authors: Jose Roberto Rausell-Campo, Daniel Pérez-López, José Capmany Francoy

    Abstract: General-purpose programmable photonic processors provide a versatile platform for integrating diverse functionalities on a single chip. Leveraging a two-dimensional hexagonal waveguide mesh of Mach-Zehnder interferometers, these systems have demonstrated significant potential in microwave photonics applications. Additionally, they are a promising platform for creating unitary linear transformation… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  9. arXiv:2407.03218  [pdf, other]

    physics.optics cs.ET

    Programmable Photonic Extreme Learning Machines

    Authors: Jose Roberto Rausell-Campo, Antonio Hurtado, Daniel Pérez-López, José Capmany Francoy

    Abstract: Photonic neural networks offer a promising alternative to traditional electronic systems for machine learning accelerators due to their low latency and energy efficiency. However, the challenge of implementing the backpropagation algorithm during training has limited their development. To address this, alternative machine learning schemes, such as extreme learning machines (ELMs), have been propos… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  10. arXiv:2407.03216  [pdf, other]

    cs.CV cs.AI

    Learning Disentangled Representation in Object-Centric Models for Visual Dynamics Prediction via Transformers

    Authors: Sanket Gandhi, Atul, Samanyu Mahajan, Vishal Sharma, Rushil Gupta, Arnab Kumar Mondal, Parag Singla

    Abstract: Recent work has shown that object-centric representations can greatly help improve the accuracy of learning dynamics while also bringing interpretability. In this work, we take this idea one step further and ask the following question: "can learning disentangled representation further improve the accuracy of visual dynamics prediction in object-centric models?" While there has been some attempt to le… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  11. arXiv:2407.03203  [pdf, other]

    cs.FL cs.AI

    TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts

    Authors: Ruida Wang, Jipeng Zhang, Yizhen Jia, Rui Pan, Shizhe Diao, Renjie Pi, Tong Zhang

    Abstract: Proving mathematical theorems using computer-verifiable formal languages like Lean significantly impacts mathematical reasoning. One approach to formal theorem proving involves generating complete proofs using Large Language Models (LLMs) based on Natural Language (NL) proofs. Similar methods have shown promising results in code generation. However, most modern LLMs exhibit suboptimal performance… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  12. arXiv:2407.03169  [pdf, other]

    cs.CL cs.SD eess.AS

    Investigating Decoder-only Large Language Models for Speech-to-text Translation

    Authors: Chao-Wei Huang, Hui Lu, Hongyu Gong, Hirofumi Inaguma, Ilia Kulikov, Ruslan Mavlyutov, Sravya Popuri

    Abstract: Large language models (LLMs), known for their exceptional reasoning capabilities, generalizability, and fluency across diverse domains, present a promising avenue for enhancing speech-related tasks. In this paper, we focus on integrating decoder-only LLMs to the task of speech-to-text translation (S2TT). We propose a decoder-only architecture that enables the LLM to directly consume the encoded sp… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Accepted to Interspeech 2024

  13. arXiv:2407.03163  [pdf, other]

    cs.CV

    Global Context Modeling in YOLOv8 for Pediatric Wrist Fracture Detection

    Authors: Rui-Yang Ju, Chun-Tse Chien, Chia-Min Lin, Jen-Shiun Chiang

    Abstract: Children often suffer wrist injuries in daily life, and radiologists usually need to analyze and interpret X-ray images of the fracture before surgeons perform surgical treatment. The development of deep learning has enabled neural network models to work as computer-assisted diagnosis (CAD) tools to help doctors and experts in diagnosis. Since the YOLOv8 models have obtained the satisfactory succes… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  14. arXiv:2407.03162  [pdf, other]

    cs.RO cs.CV cs.LG

    Bunny-VisionPro: Real-Time Bimanual Dexterous Teleoperation for Imitation Learning

    Authors: Runyu Ding, Yuzhe Qin, Jiyue Zhu, Chengzhe Jia, Shiqi Yang, Ruihan Yang, Xiaojuan Qi, Xiaolong Wang

    Abstract: Teleoperation is a crucial tool for collecting human demonstrations, but controlling robots with bimanual dexterous hands remains a challenge. Existing teleoperation systems struggle to handle the complexity of coordinating two hands for intricate manipulations. We introduce Bunny-VisionPro, a real-time bimanual dexterous teleoperation system that leverages a VR headset. Unlike previous vision-bas… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: project page: https://dingry.github.io/projects/bunny_visionpro.html

  15. arXiv:2407.03154  [pdf, other]

    cs.LG cs.AI q-bio.BM

    Reinforcement Learning for Sequence Design Leveraging Protein Language Models

    Authors: Jithendaraa Subramanian, Shivakanth Sujit, Niloy Irtisam, Umong Sain, Derek Nowrouzezahrai, Samira Ebrahimi Kahou, Riashat Islam

    Abstract: Protein sequence design, determined by amino acid sequences, is essential to protein engineering problems in drug discovery. Prior approaches have resorted to evolutionary strategies or Monte-Carlo methods for protein design, but often fail to exploit the structure of the combinatorial search space, to generalize to unseen sequences. In the context of discrete black box optimization over large se… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 22 pages, 7 figures, 4 tables

  16. arXiv:2407.03152  [pdf, other]

    cs.CV cs.LG

    Stereo Risk: A Continuous Modeling Approach to Stereo Matching

    Authors: Ce Liu, Suryansh Kumar, Shuhang Gu, Radu Timofte, Yao Yao, Luc Van Gool

    Abstract: We introduce Stereo Risk, a new deep-learning approach to solve the classical stereo-matching problem in computer vision. As it is well-known that stereo matching boils down to a per-pixel disparity estimation problem, the popular state-of-the-art stereo-matching approaches widely rely on regressing the scene disparity values, yet via discretization of scene disparity values. Such discretization o… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Accepted as an Oral Paper at ICML 2024. Draft info: 18 pages, 6 Figures, 16 Tables

  17. arXiv:2407.03140  [pdf, other]

    cs.CV

    Machine Learning Models for Improved Tracking from Range-Doppler Map Images

    Authors: Elizabeth Hou, Ross Greenwood, Piyush Kumar

    Abstract: Statistical tracking filters depend on accurate target measurements and uncertainty estimates for good tracking performance. In this work, we propose novel machine learning models for target detection and uncertainty estimation in range-Doppler map (RDM) images for Ground Moving Target Indicator (GMTI) radars. We show that by using the outputs of these models, we can significantly improve the perf… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  18. arXiv:2407.03129  [pdf, other]

    cs.CL

    Social Bias Evaluation for Large Language Models Requires Prompt Variations

    Authors: Rem Hida, Masahiro Kaneko, Naoaki Okazaki

    Abstract: Warning: This paper contains examples of stereotypes and biases. Large Language Models (LLMs) exhibit considerable social biases, and various studies have tried to evaluate and mitigate these biases accurately. Previous studies use downstream tasks as prompts to examine the degree of social biases for evaluation and mitigation. While LLMs' output highly depends on prompts, previous studies evaluat… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  19. arXiv:2407.03126  [pdf, other]

    eess.SY cs.GT cs.SI

    Game-Theoretic Protection Adoption Against Networked SIS Epidemics

    Authors: Abhisek Satapathi, Ashish R. Hota

    Abstract: In this paper, we investigate game-theoretic strategies for containing spreading processes on large-scale networks. Specifically, we consider the class of networked susceptible-infected-susceptible (SIS) epidemics where a large population of agents strategically choose whether to adopt partially effective protection. We define the utilities of the agents which depend on the degree of the agent, i… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  20. arXiv:2407.03108  [pdf, other]

    cs.LG cs.AI

    How Reliable and Stable are Explanations of XAI Methods?

    Authors: José Ribeiro, Lucas Cardoso, Vitor Santos, Eduardo Carvalho, Níkolas Carneiro, Ronnie Alves

    Abstract: Black box models are increasingly being used in the daily lives of human beings living in society. Along with this increase, there has been the emergence of Explainable Artificial Intelligence (XAI) methods aimed at generating additional explanations regarding how the model makes certain predictions. In this sense, methods such as Dalex, Eli5, eXirt, Lofo and Shap emerged as different proposals an… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 15 pages, 6 figures, submitted to BRACIS 2024

    ACM Class: I.2.6

  21. arXiv:2407.03086  [pdf, other]

    cs.LG cs.AI cs.DC

    Effective Heterogeneous Federated Learning via Efficient Hypernetwork-based Weight Generation

    Authors: Yujin Shin, Kichang Lee, Sungmin Lee, You Rim Choi, Hyung-Sin Kim, JeongGil Ko

    Abstract: While federated learning leverages distributed client resources, it faces challenges due to heterogeneous client capabilities. This necessitates allocating models suited to clients' resources and careful parameter aggregation to accommodate this heterogeneity. We propose HypeMeFed, a novel federated learning framework for supporting client heterogeneity by combining a multi-exit network architectu… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  22. arXiv:2407.03076  [pdf, other]

    cs.CL

    A Case Study on Context-Aware Neural Machine Translation with Multi-Task Learning

    Authors: Ramakrishna Appicharla, Baban Gain, Santanu Pal, Asif Ekbal, Pushpak Bhattacharyya

    Abstract: In document-level neural machine translation (DocNMT), multi-encoder approaches are common in encoding context and source sentences. Recent studies \cite{li-etal-2020-multi-encoder} have shown that the context encoder generates noise and makes the model robust to the choice of context. This paper further investigates this observation by explicitly modelling context encoding through multi-task lear… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Accepted to EAMT 2024 (poster)

  23. arXiv:2407.03070  [pdf, other]

    cs.CR cs.AI

    Federated Learning for Zero-Day Attack Detection in 5G and Beyond V2X Networks

    Authors: Abdelaziz Amara korba, Abdelwahab Boualouache, Bouziane Brik, Rabah Rahal, Yacine Ghamri-Doudane, Sidi Mohammed Senouci

    Abstract: Deploying Connected and Automated Vehicles (CAVs) on top of 5G and Beyond networks (5GB) makes them vulnerable to increasing vectors of security and privacy attacks. In this context, a wide range of advanced machine/deep learning based solutions have been designed to accurately detect security attacks. Specifically, supervised learning techniques have been widely applied to train attack detection… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  24. arXiv:2407.03068  [pdf, other]

    cs.NI cs.AI

    xApp Distillation: AI-based Conflict Mitigation in B5G O-RAN

    Authors: Hakan Erdol, Xiaoyang Wang, Robert Piechocki, George Oikonomou, Arjun Parekh

    Abstract: The advancements of machine learning-based (ML) decision-making algorithms created various research and industrial opportunities. One of these areas is ML-based near-real-time network management applications (xApps) in Open-Radio Access Network (O-RAN). Normally, xApps are designed solely for the desired objectives, and fine-tuned for deployment. However, telecommunication companies can employ mul… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 5 Pages, 4 figures

  25. arXiv:2407.03043  [pdf, other]

    cs.CV

    SlerpFace: Face Template Protection via Spherical Linear Interpolation

    Authors: Zhizhou Zhong, Yuxi Mi, Yuge Huang, Jianqing Xu, Guodong Mu, Shouhong Ding, Jingyun Zhang, Rizen Guo, Yunsheng Wu, Shuigeng Zhou

    Abstract: Contemporary face recognition systems use feature templates extracted from face images to identify persons. To enhance privacy, face template protection techniques are widely employed to conceal sensitive identity and appearance information stored in the template. This paper identifies an emerging privacy attack form utilizing diffusion models that could nullify prior protection, referred to as in… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: face template protection

  26. arXiv:2407.02994  [pdf, other]

    cs.DB cs.AI

    MedPix 2.0: A Comprehensive Multimodal Biomedical Dataset for Advanced AI Applications

    Authors: Irene Siragusa, Salvatore Contino, Massimo La Ciura, Rosario Alicata, Roberto Pirrone

    Abstract: The increasing interest in developing Artificial Intelligence applications in the medical domain suffers from the lack of high-quality datasets, mainly due to privacy-related issues. Moreover, the recent rise of Multimodal Large Language Models (MLLM) leads to a need for multimodal medical datasets, where clinical reports and findings are attached to the corresponding CT or MR scans. This paper… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  27. arXiv:2407.02984  [pdf, other]

    cs.LG cs.NE q-bio.GN

    Semantically Rich Local Dataset Generation for Explainable AI in Genomics

    Authors: Pedro Barbosa, Rosina Savisaar, Alcides Fonseca

    Abstract: Black box deep learning models trained on genomic sequences excel at predicting the outcomes of different gene regulatory mechanisms. Therefore, interpreting these models may provide novel insights into the underlying biology, supporting downstream biomedical applications. Due to their complexity, interpretable surrogate models can only be built for local explanations (e.g., a single instance). Ho… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  28. arXiv:2407.02980  [pdf, other]

    cs.SI

    Modelling the mitigation of anti-vaccine opinion propagation to suppress epidemic spread: A computational approach

    Authors: Sarah Alahmadi, Rebecca Hoyle, Michael Head, Markus Brede

    Abstract: Information regarding vaccines from sources such as health services, media, and social networks can significantly shape vaccination decisions. In particular, the dissemination of negative information can contribute to vaccine hesitancy, thereby exacerbating infectious disease outbreaks. This study investigates strategies to mitigate anti-vaccine social contagion through effective counter-campaigns… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Submitted to the PLOS ONE Journal

  29. arXiv:2407.02978  [pdf, other]

    cs.CL cs.AI

    Mast Kalandar at SemEval-2024 Task 8: On the Trail of Textual Origins: RoBERTa-BiLSTM Approach to Detect AI-Generated Text

    Authors: Jainit Sushil Bafna, Hardik Mittal, Suyash Sethia, Manish Shrivastava, Radhika Mamidi

    Abstract: Large Language Models (LLMs) have showcased impressive abilities in generating fluent responses to diverse user queries. However, concerns regarding the potential misuse of such texts in journalism, educational, and academic contexts have surfaced. SemEval 2024 introduces the task of Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection, aiming to develop automat… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: SemEval-2024

  30. arXiv:2407.02963  [pdf, ps, other]

    eess.SP cs.IT

    Subspace Coding for Spatial Sensing

    Authors: Hessam Mahdavifar, Robin Rajamäki, Piya Pal

    Abstract: A subspace code is defined as a collection of subspaces of an ambient vector space, where each information-encoding codeword is a subspace. This paper studies a class of spatial sensing problems, notably direction of arrival (DoA) estimation using multisensor arrays, from a novel subspace coding perspective. Specifically, we demonstrate how a canonical (passive) sensing model can be mapped into a… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: ©2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

  31. arXiv:2407.02960  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG

    ObfuscaTune: Obfuscated Offsite Fine-tuning and Inference of Proprietary LLMs on Private Datasets

    Authors: Ahmed Frikha, Nassim Walha, Ricardo Mendes, Krishna Kanth Nakka, Xue Jiang, Xuebing Zhou

    Abstract: This work addresses the timely yet underexplored problem of performing inference and finetuning of a proprietary LLM owned by a model provider entity on the confidential/private data of another data owner entity, in a way that ensures the confidentiality of both the model and the data. Hereby, the finetuning is conducted offsite, i.e., on the computation infrastructure of a third-party cloud provi… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Preprint

  32. arXiv:2407.02956  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG

    IncogniText: Privacy-enhancing Conditional Text Anonymization via LLM-based Private Attribute Randomization

    Authors: Ahmed Frikha, Nassim Walha, Krishna Kanth Nakka, Ricardo Mendes, Xue Jiang, Xuebing Zhou

    Abstract: In this work, we address the problem of text anonymization where the goal is to prevent adversaries from correctly inferring private attributes of the author, while keeping the text utility, i.e., meaning and semantics. We propose IncogniText, a technique that anonymizes the text to mislead a potential adversary into predicting a wrong private attribute value. Our empirical evaluation shows a redu… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Preprint

  33. arXiv:2407.02943  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG

    PII-Compass: Guiding LLM training data extraction prompts towards the target PII via grounding

    Authors: Krishna Kanth Nakka, Ahmed Frikha, Ricardo Mendes, Xue Jiang, Xuebing Zhou

    Abstract: The latest and most impactful advances in large models stem from their increased size. Unfortunately, this translates into an improved memorization capacity, raising data privacy concerns. Specifically, it has been shown that models can output personal identifiable information (PII) contained in their training data. However, reported PII extraction performance varies widely, and there is no conse… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Accepted at ACL 2024

  34. Recompression Based JPEG Tamper Detection and Localization Using Deep Neural Network Eliminating Compression Factor Dependency

    Authors: Jamimamul Bakas, Praneta Rawat, Kalyan Kokkalla, Ruchira Naskar

    Abstract: In this work, we deal with the problem of recompression-based image forgery detection, where some regions of an image are modified illegitimately, giving rise to dual compression characteristics within a single image. There has been significant research in this direction in the last decade. However, almost all existing techniques fail to detect this form of forgery, whe… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 24 pages, conference

    Journal ref: Information Systems Security: 14th International Conference, ICISS 2018, Bangalore, India, December 17-19, 2018, Proceedings. Vol. 11281. Springer, 2018

  35. arXiv:2407.02921  [pdf, other]

    cs.ET

    In-Memory Mirroring: Cloning Without Reading

    Authors: Simranjeet Singh, Ankit Bende, Chandan Kumar Jha, Vikas Rana, Rolf Drechsler, Sachin Patkar, Farhad Merchant

    Abstract: In-memory computing (IMC) has gained significant attention recently as it attempts to reduce the impact of memory bottlenecks. Numerous schemes for digital IMC are presented in the literature, focusing on logic operations. Often, an application's description has data dependencies that must be resolved. Contemporary IMC architectures perform read followed by write operations for this purpose, which… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Accepted in IFIP/IEEE VLSI-SoC 2024

  36. arXiv:2407.02920  [pdf, ps, other]

    cs.CV

    EgoFlowNet: Non-Rigid Scene Flow from Point Clouds with Ego-Motion Support

    Authors: Ramy Battrawy, René Schuster, Didier Stricker

    Abstract: Recent weakly-supervised methods for scene flow estimation from LiDAR point clouds are limited to explicit reasoning on object-level. These methods perform multiple iterative optimizations for each rigid object, which makes them vulnerable to clustering robustness. In this paper, we propose our EgoFlowNet - a point-level scene flow estimation network trained in a weakly-supervised manner and witho… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: This paper is published in BMVC2023 (pp. 441-443)

  37. arXiv:2407.02914  [pdf, other]

    cs.LG cs.SE

    The More the Merrier? Navigating Accuracy vs. Energy Efficiency Design Trade-Offs in Ensemble Learning Systems

    Authors: Rafiullah Omar, Justus Bogner, Henry Muccini, Patricia Lago, Silverio Martínez-Fernández, Xavier Franch

    Abstract: Background: Machine learning (ML) model composition is a popular technique to mitigate shortcomings of a single ML model and to design more effective ML-enabled systems. While ensemble learning, i.e., forwarding the same request to several models and fusing their predictions, has been studied extensively for accuracy, we have insufficient knowledge about how to design energy-efficient ensembles. O… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Currently under review at a journal

  38. arXiv:2407.02913  [pdf, other]

    cs.LG cs.AI eess.IV eess.SP math.NA

    SFC: Achieve Accurate Fast Convolution under Low-precision Arithmetic

    Authors: Liulu He, Yufei Zhao, Rui Gao, Yuan Du, Li Du

    Abstract: Fast convolution algorithms, including Winograd and FFT, can efficiently accelerate convolution operations in deep models. However, these algorithms depend on high-precision arithmetic to maintain inference accuracy, which conflicts with the model quantization. To resolve this conflict and further improve the efficiency of quantized convolution, we propose SFC, a new algebra transform for fast co… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: ICML 2024

  39. arXiv:2407.02911  [pdf, other]

    eess.IV cs.CV

    Non-Adversarial Learning: Vector-Quantized Common Latent Space for Multi-Sequence MRI

    Authors: Luyi Han, Tao Tan, Tianyu Zhang, Xin Wang, Yuan Gao, Chunyao Lu, Xinglong Liang, Haoran Dou, Yunzhi Huang, Ritse Mann

    Abstract: Adversarial learning helps generative models translate MRI from source to target sequence when lacking paired samples. However, implementing MRI synthesis with adversarial learning in clinical settings is challenging due to training instability and mode collapse. To address this issue, we leverage intermediate sequences to estimate the common latent space among multi-sequence MRI, enabling the rec… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  40. arXiv:2407.02883  [pdf, other]

    cs.IR cs.CL

    CoIR: A Comprehensive Benchmark for Code Information Retrieval Models

    Authors: Xiangyang Li, Kuicai Dong, Yi Quan Lee, Wei Xia, Yichun Yin, Hao Zhang, Yong Liu, Yasheng Wang, Ruiming Tang

    Abstract: Despite the substantial success of Information Retrieval (IR) in various NLP tasks, most IR systems predominantly handle queries and corpora in natural language, neglecting the domain of code retrieval. Code retrieval is critically important yet remains under-explored, with existing methods and benchmarks inadequately representing the diversity of code in various domains and tasks. Addressing this… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  41. arXiv:2407.02856  [pdf, other]

    cs.LG cs.CR

    Early-Stage Anomaly Detection: A Study of Model Performance on Complete vs. Partial Flows

    Authors: Adrian Pekar, Richard Jozsa

    Abstract: This study investigates the efficacy of machine learning models, specifically Random Forest, in anomaly detection systems when trained on complete flow records and tested on partial flow data. We explore the performance disparity that arises when models are applied to incomplete data typical in real-world, real-time network environments. Our findings demonstrate a significant decline in model perf… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 9 pages, 5 tables, 2 figures

  42. arXiv:2407.02834  [pdf, ps, other]

    cs.CL

    Aspect-Based Sentiment Analysis Techniques: A Comparative Study

    Authors: Dineth Jayakody, Koshila Isuranda, A V A Malkith, Nisansa de Silva, Sachintha Rajith Ponnamperuma, G G N Sandamali, K L K Sudheera

    Abstract: Since the dawn of the digitalisation era, customer feedback and online reviews are unequivocally major sources of insights for businesses. Consequently, conducting comparative analyses of such sources has become the de facto modus operandi of any business that wishes to give itself a competitive edge over its peers and improve customer loyalty. Sentiment analysis is one such method instrumental in…

    Submitted 3 July, 2024; originally announced July 2024.

  43. arXiv:2407.02828  [pdf]

    cs.ET quant-ph

    Quantum Serverless Paradigm and Application Development using the QFaaS Framework

    Authors: Hoa T. Nguyen, Bui Binh An Pham, Muhammad Usman, Rajkumar Buyya

    Abstract: Quantum computing has the potential to solve complex problems beyond the capabilities of classical computers. However, its practical use is currently limited due to early-stage quantum software engineering and the constraints of Noisy Intermediate-Scale Quantum (NISQ) devices. To address this issue, this chapter introduces the concept of serverless quantum computing with examples using QFaaS, a pr…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Guidelines for deploying and using the QFaaS Framework (for the original paper, see https://doi.org/10.1016/j.future.2024.01.018)

  44. arXiv:2407.02811  [pdf, other]

    cs.LG cs.IT

    SPLITZ: Certifiable Robustness via Split Lipschitz Randomized Smoothing

    Authors: Meiyu Zhong, Ravi Tandon

    Abstract: Certifiable robustness gives the guarantee that small perturbations around an input to a classifier will not change the prediction. There are two approaches to provide certifiable robustness to adversarial examples: a) explicitly training classifiers with small Lipschitz constants, and b) Randomized smoothing, which adds random noise to the input to create a smooth classifier. We propose \textit{S…

    Submitted 3 July, 2024; originally announced July 2024.

  45. arXiv:2407.02807  [pdf, other]

    cs.SI

    Regional and Temporal Patterns of Partisan Polarization during the COVID-19 Pandemic in the United States and Canada

    Authors: Zachary Yang, Anne Imouza, Maximilian Puelma Touzel, Cecile Amadoro, Gabrielle Desrosiers-Brisebois, Kellin Pelrine, Sacha Levy, Jean-Francois Godbout, Reihaneh Rabbany

    Abstract: Public health measures were among the most polarizing topics debated online during the COVID-19 pandemic. Much of the discussion surrounded specific events, such as when and which particular interventions came into practise. In this work, we develop and apply an approach to measure subnational and event-driven variation of partisan polarization and explore how these dynamics varied both across and…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 19 pages (main paper), 9 figures, 1 table

    ACM Class: J.4

  46. arXiv:2407.02794  [pdf, other]

    cs.CV

    Euler's Elastica Based Cartoon-Smooth-Texture Image Decomposition

    Authors: Roy Y. He, Hao Liu

    Abstract: We propose a novel model for decomposing grayscale images into three distinct components: the structural part, representing sharp boundaries and regions with strong light-to-dark transitions; the smooth part, capturing soft shadows and shades; and the oscillatory part, characterizing textures and noise. To capture the homogeneous structures, we introduce a combination of $L^0$-gradient and curvatu…

    Submitted 2 July, 2024; originally announced July 2024.

    MSC Class: 68U10; 94A08; 65D18

  47. arXiv:2407.02766  [pdf, other]

    cs.CR

    Balancing Patient Privacy and Health Data Security: The Role of Compliance in Protected Health Information (PHI) Sharing

    Authors: Md Al Amin, Hemanth Tummala, Rushabh Shah, Indrajit Ray

    Abstract: Protected Health Information (PHI) sharing significantly enhances patient care quality and coordination, contributing to more accurate diagnoses, efficient treatment plans, and a comprehensive understanding of patient history. Compliance with strict privacy and security policies, such as those required by laws like HIPAA, is critical to protect PHI. Blockchain technology, which offers a decentrali…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: The 21st International Conference on Security and Cryptography (SECRYPT 2024)

  48. arXiv:2407.02751  [pdf, other]

    cs.CL cs.AI

    Emotion and Intent Joint Understanding in Multimodal Conversation: A Benchmarking Dataset

    Authors: Rui Liu, Haolin Zuo, Zheng Lian, Xiaofen Xing, Björn W. Schuller, Haizhou Li

    Abstract: Emotion and Intent Joint Understanding in Multimodal Conversation (MC-EIU) aims to decode the semantic information manifested in a multimodal conversational history, while inferring the emotions and intents simultaneously for the current utterance. MC-EIU is enabling technology for many human-computer interfaces. However, there is a lack of available datasets in terms of annotation, modality, lang…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 26 pages, 8 figures, 12 tables, NeurIPS 2024 Dataset and Benchmark Track

  49. arXiv:2407.02750  [pdf, other]

    cs.CL

    Learning to Reduce: Towards Improving Performance of Large Language Models on Structured Data

    Authors: Younghun Lee, Sungchul Kim, Ryan A. Rossi, Tong Yu, Xiang Chen

    Abstract: Large Language Models (LLMs) have been achieving competent performance on a wide range of downstream tasks, yet existing work shows that inference on structured data is challenging for LLMs. This is because LLMs need to either understand long structured data or select the most relevant evidence before inference, and both approaches are not trivial. This paper proposes a framework, Learning to Redu…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: ICML 2024 Workshop on Long-Context Foundation Models, Vienna, Austria 2024. arXiv admin note: substantial text overlap with arXiv:2402.14195

  50. arXiv:2407.02748  [pdf, other]

    cs.DC cs.ET

    DRLQ: A Deep Reinforcement Learning-based Task Placement for Quantum Cloud Computing

    Authors: Hoa T. Nguyen, Muhammad Usman, Rajkumar Buyya

    Abstract: The quantum cloud computing paradigm presents unique challenges in task placement due to the dynamic and heterogeneous nature of quantum computation resources. Traditional heuristic approaches fall short in adapting to the rapidly evolving landscape of quantum computing. This paper proposes DRLQ, a novel Deep Reinforcement Learning (DRL)-based technique for task placement in quantum cloud computin…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted paper at IEEE CLOUD 2024 conference