Showing 1–50 of 260 results for author: Adriana

  1. arXiv:2407.01556  [pdf, other]

    cs.CY

    A Taxonomy of the Biases of the Images created by Generative Artificial Intelligence

    Authors: Adriana Fernández de Caleya Vázquez, Eduardo C. Garrido-Merchán

    Abstract: Generative artificial intelligence models show an amazing performance creating unique content automatically just by being given a prompt by the user, which is revolutionizing several fields such as marketing and design. Not only are there models whose generated output belongs to the text format but we also find models that are able to automatically generate high quality genuine images and videos g…

    Submitted 2 May, 2024; originally announced July 2024.

  2. arXiv:2407.00121  [pdf, other]

    cs.LG cs.AI cs.CL

    Granite-Function Calling Model: Introducing Function Calling Abilities via Multi-task Learning of Granular Tasks

    Authors: Ibrahim Abdelaziz, Kinjal Basu, Mayank Agarwal, Sadhana Kumaravel, Matthew Stallone, Rameswar Panda, Yara Rizk, GP Bhargav, Maxwell Crouse, Chulaka Gunasekara, Shajith Ikbal, Sachin Joshi, Hima Karanam, Vineet Kumar, Asim Munawar, Sumit Neelam, Dinesh Raghu, Udit Sharma, Adriana Meza Soria, Dheeraj Sreedhar, Praveen Venkateswaran, Merve Unuvar, David Cox, Salim Roukos, Luis Lastras, et al. (1 additional author not shown)

    Abstract: Large language models (LLMs) have recently shown tremendous promise in serving as the backbone to agentic systems, as demonstrated by their performance in multi-faceted, challenging benchmarks like SWE-Bench and Agent-Bench. However, to realize the true potential of LLMs as autonomous agents, they must learn to identify, call, and interact with external tools and application program interfaces (AP…

    Submitted 27 June, 2024; originally announced July 2024.

  3. arXiv:2406.18373  [pdf, other]

    cs.CL cs.SD eess.AS

    Dynamic Data Pruning for Automatic Speech Recognition

    Authors: Qiao Xiao, Pingchuan Ma, Adriana Fernandez-Lopez, Boqian Wu, Lu Yin, Stavros Petridis, Mykola Pechenizkiy, Maja Pantic, Decebal Constantin Mocanu, Shiwei Liu

    Abstract: The recent success of Automatic Speech Recognition (ASR) is largely attributed to the ever-growing amount of training data. However, this trend has made model training prohibitively costly and imposed computational demands. While data pruning has been proposed to mitigate this issue by identifying a small subset of relevant data, its application in ASR has been barely explored, and existing works…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Accepted to Interspeech 2024

  4. arXiv:2406.17614  [pdf, other]

    cs.CV cs.MM

    MSRS: Training Multimodal Speech Recognition Models from Scratch with Sparse Mask Optimization

    Authors: Adriana Fernandez-Lopez, Honglie Chen, Pingchuan Ma, Lu Yin, Qiao Xiao, Stavros Petridis, Shiwei Liu, Maja Pantic

    Abstract: Pre-trained models have been a foundational approach in speech recognition, albeit with associated additional costs. In this study, we propose a regularization technique that facilitates the training of visual and audio-visual speech recognition models (VSR and AVSR) from scratch. This approach, abbreviated as MSRS (Multimodal Speech Recognition from Scratch), introduces a sparse regulari…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Accepted at Interspeech 2024

  5. arXiv:2406.12046  [pdf, ps, other]

    cs.IT

    A Construction of Optimal Quasi-cyclic Locally Recoverable Codes using Constituent Codes

    Authors: Gustavo Terra Bastos, Angelynn Alvarez, Zachary Flores, Adriana Salerno

    Abstract: A locally recoverable code of locality $r$ over $\mathbb{F}_{q}$ is a code where every coordinate of a codeword can be recovered using the values of at most $r$ other coordinates of that codeword. Locally recoverable codes are efficient at restoring corrupted messages and data which make them highly applicable to distributed storage systems. Quasi-cyclic codes of length $n=m\ell$ and index $\ell$…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 12 pages

  6. arXiv:2406.11988  [pdf, other]

    cs.CV cs.AI cs.CY cs.LG

    Decomposed evaluations of geographic disparities in text-to-image models

    Authors: Abhishek Sureddy, Dishant Padalia, Nandhinee Periyakaruppa, Oindrila Saha, Adina Williams, Adriana Romero-Soriano, Megan Richards, Polina Kirichenko, Melissa Hall

    Abstract: Recent work has identified substantial disparities in generated images of different geographic regions, including stereotypical depictions of everyday objects like houses and cars. However, existing measures for these disparities have been limited to either human evaluations, which are time-consuming and costly, or automatic metrics evaluating full images, which are unable to attribute these dispa…

    Submitted 17 June, 2024; originally announced June 2024.

  7. arXiv:2406.10429  [pdf, other]

    cs.CV cs.AI

    Consistency-diversity-realism Pareto fronts of conditional image generative models

    Authors: Pietro Astolfi, Marlene Careil, Melissa Hall, Oscar Mañas, Matthew Muckley, Jakob Verbeek, Adriana Romero Soriano, Michal Drozdzal

    Abstract: Building world models that accurately and comprehensively represent the real world is the utmost aspiration for conditional image generative models as it would enable their use as world simulators. For these models to be successful world models, they should not only excel at image quality and prompt-image consistency but also ensure high representation diversity. However, current research in gener…

    Submitted 14 June, 2024; originally announced June 2024.

  8. arXiv:2406.04746  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    PQPP: A Joint Benchmark for Text-to-Image Prompt and Query Performance Prediction

    Authors: Eduard Poesina, Adriana Valentina Costache, Adrian-Gabriel Chifu, Josiane Mothe, Radu Tudor Ionescu

    Abstract: Text-to-image generation has recently emerged as a viable alternative to text-to-image retrieval, due to the visually impressive results of generative diffusion models. Although query performance prediction is an active research topic in information retrieval, to the best of our knowledge, there is no prior study that analyzes the difficulty of queries (prompts) in text-to-image generation, based…

    Submitted 7 June, 2024; originally announced June 2024.

  9. arXiv:2406.04551  [pdf, other]

    cs.CV cs.AI cs.LG

    Improving Geo-diversity of Generated Images with Contextualized Vendi Score Guidance

    Authors: Reyhane Askari Hemmat, Melissa Hall, Alicia Sun, Candace Ross, Michal Drozdzal, Adriana Romero-Soriano

    Abstract: With the growing popularity of text-to-image generative models, there has been increasing focus on understanding their risks and biases. Recent work has found that state-of-the-art models struggle to depict everyday objects with the true diversity of the real world and have notable gaps between geographic regions. In this work, we aim to increase the diversity of generated images of common objects…

    Submitted 6 June, 2024; originally announced June 2024.

  10. Simulation, Modelling and Classification of Wiki Contributors: Spotting The Good, The Bad, and The Ugly

    Authors: Silvia García Méndez, Fátima Leal, Benedita Malheiro, Juan Carlos Burguillo Rial, Bruno Veloso, Adriana E. Chis, Horacio González Vélez

    Abstract: Data crowdsourcing is a data acquisition process where groups of voluntary contributors feed platforms with highly relevant data ranging from news, comments, and media to knowledge and classifications. It typically processes user-generated data streams to provide and refine popular services such as wikis, collaborative maps, e-commerce sites, and social networks. Nevertheless, this modus operandi…

    Submitted 29 May, 2024; originally announced May 2024.

    Journal ref: Simulation Modelling Practice and Theory, 120, 102616 (2022)

  11. arXiv:2405.17243  [pdf, other]

    cs.LG cs.AI

    Surprise-Adaptive Intrinsic Motivation for Unsupervised Reinforcement Learning

    Authors: Adriana Hugessen, Roger Creus Castanyer, Faisal Mohamed, Glen Berseth

    Abstract: Both entropy-minimizing and entropy-maximizing (curiosity) objectives for unsupervised reinforcement learning (RL) have been shown to be effective in different environments, depending on the environment's level of natural entropy. However, neither method alone results in an agent that will consistently learn intelligent behavior across environments. In an effort to find a single entropy-based meth…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Published at the Reinforcement Learning Conference 2024

  12. arXiv:2405.11092  [pdf, other]

    cs.HC cs.RO

    What metrics of participation balance predict outcomes of collaborative learning with a robot?

    Authors: Yuya Asano, Diane Litman, Quentin King-Shepard, Tristan Maidment, Tyree Langley, Teresa Davison, Timothy Nokes-Malach, Adriana Kovashka, Erin Walker

    Abstract: One of the keys to the success of collaborative learning is balanced participation by all learners, but this does not always happen naturally. Pedagogical robots have the potential to facilitate balance. However, it remains unclear what participation balance robots should aim at; various metrics have been proposed, but it is still an open question whether we should balance human participation in h…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: To appear in Seventeenth International Conference on Educational Data Mining (EDM 2024)

  13. arXiv:2405.04457  [pdf, other]

    cs.CV cs.CY cs.HC

    Towards Geographic Inclusion in the Evaluation of Text-to-Image Models

    Authors: Melissa Hall, Samuel J. Bell, Candace Ross, Adina Williams, Michal Drozdzal, Adriana Romero Soriano

    Abstract: Rapid progress in text-to-image generative models coupled with their deployment for visual content creation has magnified the importance of thoroughly evaluating their performance and identifying potential biases. In pursuit of models that generate images that are realistic, diverse, visually appealing, and consistent with the given prompt, researchers and practitioners often turn to automated met…

    Submitted 7 May, 2024; originally announced May 2024.

  14. arXiv:2405.04324  [pdf, other]

    cs.AI cs.CL cs.SE

    Granite Code Models: A Family of Open Foundation Models for Code Intelligence

    Authors: Mayank Mishra, Matt Stallone, Gaoyuan Zhang, Yikang Shen, Aditya Prasad, Adriana Meza Soria, Michele Merler, Parameswaran Selvam, Saptha Surendran, Shivdeep Singh, Manish Sethi, Xuan-Hong Dang, Pengyuan Li, Kun-Lung Wu, Syed Zawad, Andrew Coleman, Matthew White, Mark Lewis, Raju Pavuluri, Yan Koyfman, Boris Lublinsky, Maximilien de Bayser, Ibrahim Abdelaziz, Kinjal Basu, Mayank Agarwal, et al. (21 additional authors not shown)

    Abstract: Large Language Models (LLMs) trained on code are revolutionizing the software development process. Increasingly, code LLMs are being integrated into software development environments to improve the productivity of human programmers, and LLM-based agents are beginning to show promise for handling complex tasks autonomously. Realizing the full potential of code LLMs requires a wide range of capabili…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Corresponding Authors: Rameswar Panda, Ruchir Puri; Equal Contributors: Mayank Mishra, Matt Stallone, Gaoyuan Zhang

  15. arXiv:2404.02894  [pdf, other]

    cs.CY cs.SI

    Automated Transparency: A Legal and Empirical Analysis of the Digital Services Act Transparency Database

    Authors: Rishabh Kaushal, Jacob van de Kerkhof, Catalina Goanta, Gerasimos Spanakis, Adriana Iamnitchi

    Abstract: The Digital Services Act (DSA) is a much awaited platforms liability reform in the European Union that was adopted on 1 November 2022 with the ambition to set a global example in terms of accountability and transparency. Among other obligations, the DSA emphasizes the need for online platforms to report on their content moderation decisions (`statements of reasons' - SoRs), which is a novel transp…

    Submitted 3 May, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

    Comments: accepted to FAccT 2024; camera-ready version; 19 pages

  16. arXiv:2403.17804  [pdf, other]

    cs.CV cs.CL

    Improving Text-to-Image Consistency via Automatic Prompt Optimization

    Authors: Oscar Mañas, Pietro Astolfi, Melissa Hall, Candace Ross, Jack Urbanek, Adina Williams, Aishwarya Agrawal, Adriana Romero-Soriano, Michal Drozdzal

    Abstract: Impressive advances in text-to-image (T2I) generative models have yielded a plethora of high performing models which are able to generate aesthetically appealing, photorealistic images. Despite the progress, these models still struggle to produce images that are consistent with the input prompt, oftentimes failing to capture object quantities, relations and attributes properly. Existing solutions…

    Submitted 26 March, 2024; originally announced March 2024.

  17. arXiv:2403.15749  [pdf, other]

    math.OC cs.CC cs.LG

    Horoballs and the subgradient method

    Authors: Adrian S. Lewis, Genaro Lopez-Acedo, Adriana Nicolae

    Abstract: To explore convex optimization on Hadamard spaces, we consider an iteration in the style of a subgradient algorithm. Traditionally, such methods assume that the underlying spaces are manifolds and that the objectives are geodesically convex: the methods are described using tangent spaces and exponential maps. By contrast, our iteration applies in a general Hadamard space, is framed in the underlyi…

    Submitted 2 April, 2024; v1 submitted 23 March, 2024; originally announced March 2024.

    MSC Class: 90C48; 65Y20; 49M29

    ACM Class: G.1.6

  18. arXiv:2403.15214  [pdf, other]

    cs.CY cs.CL cs.SI

    InstaSynth: Opportunities and Challenges in Generating Synthetic Instagram Data with ChatGPT for Sponsored Content Detection

    Authors: Thales Bertaglia, Lily Heisig, Rishabh Kaushal, Adriana Iamnitchi

    Abstract: Large Language Models (LLMs) raise concerns about lowering the cost of generating texts that could be used for unethical or illegal purposes, especially on social media. This paper investigates the promise of such models to help enforce legal requirements related to the disclosure of sponsored content online. We investigate the use of LLMs for generating synthetic Instagram captions with two objec…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: To appear at the 18th International AAAI Conference on Web and Social Media (ICWSM 2024) -- please cite accordingly

  19. arXiv:2403.14421  [pdf, other]

    cs.LG cs.CR cs.CV

    DP-RDM: Adapting Diffusion Models to Private Domains Without Fine-Tuning

    Authors: Jonathan Lebensold, Maziar Sanjabi, Pietro Astolfi, Adriana Romero-Soriano, Kamalika Chaudhuri, Mike Rabbat, Chuan Guo

    Abstract: Text-to-image diffusion models have been shown to suffer from sample-level memorization, possibly reproducing near-perfect replica of images that they are trained on, which may be undesirable. To remedy this issue, we develop the first differentially private (DP) retrieval-augmented generation algorithm that is capable of generating high-quality image samples while providing provable privacy guara…

    Submitted 13 May, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

  20. arXiv:2403.06009  [pdf, other]

    cs.LG

    Detectors for Safe and Reliable LLMs: Implementations, Uses, and Limitations

    Authors: Swapnaja Achintalwar, Adriana Alvarado Garcia, Ateret Anaby-Tavor, Ioana Baldini, Sara E. Berger, Bishwaranjan Bhattacharjee, Djallel Bouneffouf, Subhajit Chaudhury, Pin-Yu Chen, Lamogha Chiazor, Elizabeth M. Daly, Kirushikesh DB, Rogério Abreu de Paula, Pierre Dognin, Eitan Farchi, Soumya Ghosh, Michael Hind, Raya Horesh, George Kour, Ja Young Lee, Nishtha Madaan, Sameep Mehta, Erik Miehling, Keerthiram Murugesan, Manish Nagireddy, et al. (13 additional authors not shown)

    Abstract: Large language models (LLMs) are susceptible to a variety of risks, from non-faithful output to biased and toxic generations. Due to several limiting factors surrounding LLMs (training cost, API access, data availability, etc.), it may not always be feasible to impose direct safety constraints on a deployed model. Therefore, an efficient and reliable alternative is required. To this end, we presen…

    Submitted 13 June, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

  21. arXiv:2403.05591  [pdf, other]

    cs.HC cs.LG

    Data-Driven Ergonomic Risk Assessment of Complex Hand-intensive Manufacturing Processes

    Authors: Anand Krishnan, Xingjian Yang, Utsav Seth, Jonathan M. Jeyachandran, Jonathan Y. Ahn, Richard Gardner, Samuel F. Pedigo, Adriana Blom-Schieber, Ashis G. Banerjee, Krithika Manohar

    Abstract: Hand-intensive manufacturing processes, such as composite layup and textile draping, require significant human dexterity to accommodate task complexity. These strenuous hand motions often lead to musculoskeletal disorders and rehabilitation surgeries. We develop a data-driven ergonomic risk assessment system with a special focus on hand and finger activity to better identify and address ergonomic…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: 26 pages, 7 figures

  22. arXiv:2403.03024  [pdf, other]

    cs.SE

    Toward Improved Deep Learning-based Vulnerability Detection

    Authors: Adriana Sejfia, Satyaki Das, Saad Shafiq, Nenad Medvidović

    Abstract: Deep learning (DL) has been a common thread across several recent techniques for vulnerability detection. The rise of large, publicly available datasets of vulnerabilities has fueled the learning process underpinning these techniques. While these datasets help the DL-based vulnerability detectors, they also constrain these detectors' predictive abilities. Vulnerabilities in these datasets have to…

    Submitted 5 March, 2024; originally announced March 2024.

  23. arXiv:2402.09615  [pdf, other]

    cs.CL cs.AI cs.LG

    API Pack: A Massive Multi-Programming Language Dataset for API Call Generation

    Authors: Zhen Guo, Adriana Meza Soria, Wei Sun, Yikang Shen, Rameswar Panda

    Abstract: We introduce API Pack, a massive multi-programming language dataset containing more than 1 million instruction-API call pairs to improve the API call generation capabilities of large language models. By fine-tuning CodeLlama-13B on 20,000 Python instances from API Pack, we enable it to outperform GPT-3.5 and GPT-4 in generating unseen API calls. Fine-tuning on API Pack also facilitates cross-progr…

    Submitted 3 June, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

  24. arXiv:2402.07691  [pdf, other]

    cs.RO

    Evaluation of a Smart Mobile Robotic System for Industrial Plant Inspection and Supervision

    Authors: Georg K. J. Fischer, Max Bergau, D. Adriana Gómez-Rosal, Andreas Wachaja, Johannes Gräter, Matthias Odenweller, Uwe Piechottka, Fabian Hoeflinger, Nikhil Gosala, Niklas Wetzel, Daniel Büscher, Abhinav Valada, Wolfram Burgard

    Abstract: Automated and autonomous industrial inspection is a longstanding research field, driven by the necessity to enhance safety and efficiency within industrial settings. In addressing this need, we introduce an autonomously navigating robotic system designed for comprehensive plant inspection. This innovative system comprises a robotic platform equipped with a diverse array of sensors integrated to fa…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: Submitted for publication in IEEE Sensors Journal

  25. arXiv:2402.04334  [pdf]

    eess.SY cs.CY

    Home Automation System based on Intelligent Transducer Enablers

    Authors: Manuel Suárez-Albela, Paula Fraga-Lamas, Tiago M. Fernández-Caramés, Adriana Dapena, Miguel González-López

    Abstract: This paper presents a novel home automation system named HASITE (Home Automation System based on Intelligent Transducer Enablers), which has been specifically designed to identify and configure transducers easily and quickly. These features are especially useful in situations where many transducers are deployed, since their setup becomes a cumbersome task that consumes a significant amount of time…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: 27 pages, 17 figures, accepted version of Sensors journal article

    Journal ref: Sensors 2016, 16(10), 1595

  26. arXiv:2402.00052  [pdf, other]

    cs.AI cs.CV cs.GR

    Zero-shot Sequential Neuro-symbolic Reasoning for Automatically Generating Architecture Schematic Designs

    Authors: Milin Kodnongbua, Lawrence H. Curtis, Adriana Schulz

    Abstract: This paper introduces a novel automated system for generating architecture schematic designs aimed at streamlining complex decision-making at the multifamily real estate development project's outset. Leveraging the combined strengths of generative AI (neuro reasoning) and mathematical program solvers (symbolic reasoning), the method addresses both the reliance on expert insights and technical chal…

    Submitted 25 January, 2024; originally announced February 2024.

  27. arXiv:2401.15279  [pdf, other]

    cs.GR cs.HC

    FabHacks: Transform Everyday Objects into Functional Fixtures

    Authors: Yuxuan Mei, Benjamin Jones, Dan Cascaval, Jennifer Mankoff, Etienne Vouga, Adriana Schulz

    Abstract: Storage, organizing, and decorating are an important part of home design. While one can buy commercial items for many of these tasks, this can be costly, and re-use is more sustainable. An alternative is a "home hack", a functional assembly that can be constructed from existing household items. However, coming up with such hacks requires combining objects to make a physically valid design, which m…

    Submitted 26 January, 2024; originally announced January 2024.

  28. arXiv:2401.05891  [pdf, other]

    cs.CV

    LiDAR data acquisition and processing for ecology applications

    Authors: Ion Ciobotari, Adriana Príncipe, Maria Alexandra Oliveira, João Nuno Silva

    Abstract: The collection of ecological data in the field is essential to diagnose, monitor and manage ecosystems in a sustainable way. Since acquisition of this information through traditional methods are generally time-consuming, due to the capability of recording large volumes of data in short time periods, automation of data acquisition sees a growing trend. Terrestrial laser scanners (TLS), particularly…

    Submitted 11 January, 2024; originally announced January 2024.

  29. arXiv:2401.01990  [pdf, other]

    cs.CV cs.AI cs.LG

    GPS-SSL: Guided Positive Sampling to Inject Prior Into Self-Supervised Learning

    Authors: Aarash Feizi, Randall Balestriero, Adriana Romero-Soriano, Reihaneh Rabbany

    Abstract: We propose Guided Positive Sampling Self-Supervised Learning (GPS-SSL), a general method to inject a priori knowledge into Self-Supervised Learning (SSL) positive samples selection. Current SSL methods leverage Data-Augmentations (DA) for generating positive samples and incorporate prior knowledge - an incorrect, or too weak DA will drastically reduce the quality of the learned representation. GPS…

    Submitted 9 January, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

  30. arXiv:2401.01482  [pdf, other]

    cs.CV cs.AI cs.LG

    Incorporating Geo-Diverse Knowledge into Prompting for Increased Geographical Robustness in Object Recognition

    Authors: Kyle Buettner, Sina Malakouti, Xiang Lorraine Li, Adriana Kovashka

    Abstract: Existing object recognition models have been shown to lack robustness in diverse geographical scenarios due to domain shifts in design and context. Class representations need to be adapted to more accurately reflect an object concept under these shifts. In the absence of training data from target geographies, we hypothesize that geographically diverse descriptive knowledge of categories can enhanc…

    Submitted 29 March, 2024; v1 submitted 2 January, 2024; originally announced January 2024.

    Comments: To appear in IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR), 2024

  31. arXiv:2312.15993  [pdf]

    cs.AI cs.RO eess.SY

    Adaptive Kalman-based hybrid car following strategy using TD3 and CACC

    Authors: Yuqi Zheng, Ruidong Yan, Bin Jia, Rui Jiang, Adriana TAPUS, Xiaojing Chen, Shiteng Zheng, Ying Shang

    Abstract: In autonomous driving, the hybrid strategy of deep reinforcement learning and cooperative adaptive cruise control (CACC) can fully utilize the advantages of the two algorithms and significantly improve the performance of car following. However, it is challenging for the traditional hybrid strategy based on fixed coefficients to adapt to mixed traffic flow scenarios, which may decrease the performa…

    Submitted 26 December, 2023; originally announced December 2023.

    Comments: 32 pages, 13 figures

  32. arXiv:2312.08578  [pdf, other]

    cs.CV

    A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions

    Authors: Jack Urbanek, Florian Bordes, Pietro Astolfi, Mary Williamson, Vasu Sharma, Adriana Romero-Soriano

    Abstract: Curation methods for massive vision-language datasets trade off between dataset size and quality. However, even the highest quality of available curated captions are far too short to capture the rich visual detail in an image. To show the value of dense and highly-aligned image-text pairs, we collect the Densely Captioned Images (DCI) dataset, containing 7805 natural images human-annotated with ma…

    Submitted 17 June, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

  33. arXiv:2311.17405  [pdf, other]

    cs.RO

    Learning and Autonomy for Extraterrestrial Terrain Sampling: An Experience Report from OWLAT Deployment

    Authors: Pranay Thangeda, Ashish Goel, Erica Tevere, Yifan Zhu, Erik Kramer, Adriana Daca, Hari Nayar, Kris Hauser, Melkior Ornik

    Abstract: Extraterrestrial autonomous lander missions increasingly demand adaptive capabilities to handle the unpredictable and diverse nature of the terrain. This paper discusses the deployment of a Deep Meta-Learning with Controlled Deployment Gaps (CoDeGa) trained model for terrain scooping tasks in Ocean Worlds Lander Autonomy Testbed (OWLAT) at NASA Jet Propulsion Laboratory. The CoDeGa-powered scoopin…

    Submitted 4 December, 2023; v1 submitted 29 November, 2023; originally announced November 2023.

    Comments: Updated references to include recent work on autonomy for ocean worlds

  34. arXiv:2311.09611  [pdf, other]

    cs.HC

    DeltaLCA: Comparative Life-Cycle Assessment for Electronics Design

    Authors: Zhihan Zhang, Felix Hähnlein, Yuxuan Mei, Zachary Englhardt, Shwetak Patel, Adriana Schulz, Vikram Iyer

    Abstract: Reducing the environmental footprint of electronics and computing devices requires new tools that empower designers to make informed decisions about sustainability during the design process itself. This is not possible with current tools for life cycle assessment (LCA) which require substantial domain expertise and time to evaluate the numerous chips and other components that make up a device. We…

    Submitted 16 November, 2023; originally announced November 2023.

  35. arXiv:2311.03585  [pdf, ps, other]

    cs.CR cs.LO cs.OS

    OpenBSD formal driver verification with SeL4

    Authors: Adriana Nicolae, Paul Irofti, Ioana Leustean

    Abstract: The seL4 microkernel is currently the only kernel that has been fully formally verified. In general, the increased interest in ensuring the security of a kernel's code results from its important role in the entire operating system. One of the basic features of an operating system is that it abstracts the handling of devices. This abstraction is represented by device drivers - the software that man…

    Submitted 6 November, 2023; originally announced November 2023.

  36. arXiv:2310.02902  [pdf, other]

    cs.LG cond-mat.mtrl-sci cs.AI

    Searching for High-Value Molecules Using Reinforcement Learning and Transformers

    Authors: Raj Ghugare, Santiago Miret, Adriana Hugessen, Mariano Phielipp, Glen Berseth

    Abstract: Reinforcement learning (RL) over text representations can be effective for finding high-value policies that can search over graphs. However, RL requires careful structuring of the search space and algorithm design to be effective in this challenge. Through extensive experiments, we explore how different design choices for text grammar and algorithmic choices for training can affect an RL policy's…

    Submitted 4 October, 2023; originally announced October 2023.

  37. arXiv:2310.00158  [pdf, other]

    cs.CV cs.AI cs.LG

    Feedback-guided Data Synthesis for Imbalanced Classification

    Authors: Reyhane Askari Hemmat, Mohammad Pezeshki, Florian Bordes, Michal Drozdzal, Adriana Romero-Soriano

    Abstract: Current status quo in machine learning is to use static datasets of real images for training, which often come from long-tailed distributions. With the recent advances in generative models, researchers have started augmenting these static datasets with synthetic data, reporting moderate performance improvements on classification tasks. We hypothesize that these performance gains are limited by the…

    Submitted 29 September, 2023; originally announced October 2023.

  38. arXiv:2310.00091  [pdf, other]

    cs.HC cs.SE

    Towards Automated Accessibility Report Generation for Mobile Apps

    Authors: Amanda Swearngin, Jason Wu, Xiaoyi Zhang, Esteban Gomez, Jen Coughenour, Rachel Stukenborg, Bhavya Garg, Greg Hughes, Adriana Hilliard, Jeffrey P. Bigham, Jeffrey Nichols

    Abstract: Many apps have basic accessibility issues, like missing labels or low contrast. Automated tools can help app developers catch basic issues, but can be laborious or require writing dedicated tests. We propose a system, motivated by a collaborative process with accessibility stakeholders at a large technology company, to generate whole app accessibility reports by combining varied data collection me…

    Submitted 16 October, 2023; v1 submitted 29 September, 2023; originally announced October 2023.

    Comments: 24 pages, 8 figures

  39. arXiv:2309.13525  [pdf, other]

    cs.CV

    Semi-Supervised Domain Generalization for Object Detection via Language-Guided Feature Alignment

    Authors: Sina Malakouti, Adriana Kovashka

    Abstract: Existing domain adaptation (DA) and generalization (DG) methods in object detection enforce feature alignment in the visual space but face challenges like object appearance variability and scene complexity, which make it difficult to distinguish between objects and achieve accurate detection. In this paper, we are the first to address the problem of semi-supervised domain generalization by explori…

    Submitted 23 September, 2023; originally announced September 2023.

    Comments: Accepted at BMVC 2023

  40. arXiv:2309.12764  [pdf, other]

    cs.SI

    Multi-Modal Embeddings for Isolating Cross-Platform Coordinated Information Campaigns on Social Media

    Authors: Fabio Barbero, Sander op den Camp, Kristian van Kuijk, Carlos Soto García-Delgado, Gerasimos Spanakis, Adriana Iamnitchi

    Abstract: Coordinated multi-platform information operations are implemented in a variety of contexts on social media, including state-run disinformation campaigns, marketing strategies, and social activism. Characterized by the promotion of messages via multi-platform coordination, in which multiple user accounts, within a short time, post content advancing a shared informational agenda on multiple platform…

    Submitted 22 September, 2023; originally announced September 2023.

    Comments: To appear in the 5th Multidisciplinary International Symposium on Disinformation in Open Online Media (MISDOOM 2023)

    ACM Class: H.3.5; H.3.1

  41. arXiv:2309.12729  [pdf, other]

    cs.SI

    Coordinated Information Campaigns on Social Media: A Multifaceted Framework for Detection and Analysis

    Authors: Kin Wai Ng, Adriana Iamnitchi

    Abstract: The prevalence of coordinated information campaigns in social media platforms has significant negative consequences across various domains, including social, political, and economic processes. This paper proposes a multifaceted framework for detecting and analysing coordinated message promotion on social media. By simultaneously considering features related to content, time, and network dimensions…

    Submitted 22 September, 2023; originally announced September 2023.

    Comments: To be presented in the 5th Multidisciplinary International Symposium on Disinformation in Open Online Media (MISDOOM 2023)

    ACM Class: H.3.5; H.3.1

  42. arXiv:2309.05384  [pdf, other]

    eess.AS cs.SD

    Towards generalisable and calibrated synthetic speech detection with self-supervised representations

    Authors: Octavian Pascu, Adriana Stan, Dan Oneata, Elisabeta Oneata, Horia Cucu

    Abstract: Generalisation -- the ability of a model to perform well on unseen data -- is crucial for building reliable deepfake detectors. However, recent studies have shown that the current audio deepfake models fall short of this desideratum. In this work we investigate the potential of pretrained self-supervised representations in building general and calibrated audio deepfake detection models. We show th…

    Submitted 12 June, 2024; v1 submitted 11 September, 2023; originally announced September 2023.

    Comments: Accepted at Interspeech 2024

  43. arXiv:2309.01318  [pdf, other]

    cs.CV eess.IV

    An FPGA smart camera implementation of segmentation models for drone wildfire imagery

    Authors: Eduardo Guarduño-Martinez, Jorge Ciprian-Sanchez, Gerardo Valente Vazquez-Garcia, Gerardo Rodriguez-Hernandez, Adriana Palacios-Rosas, Lucile Rossi-Tisson, Gilberto Ochoa-Ruiz

    Abstract: Wildfires represent one of the most relevant natural disasters worldwide, due to their impact on various societal and environmental levels. Thus, a significant amount of research has been carried out to investigate and apply computer vision techniques to address this problem. One of the most promising approaches for wildfire fighting is the use of drones equipped with visible and infrared cameras…

    Submitted 3 September, 2023; originally announced September 2023.

    Comments: This paper has been accepted at the 22nd Mexican International Conference on Artificial Intelligence (MICAI 2023)

  44. arXiv:2308.10372  [pdf]

    eess.IV cs.CV cs.LG q-bio.QM

    Developing a Machine Learning-Based Clinical Decision Support Tool for Uterine Tumor Imaging

    Authors: Darryl E. Wright, Adriana V. Gregory, Deema Anaam, Sepideh Yadollahi, Sumana Ramanathan, Kafayat A. Oyemade, Reem Alsibai, Heather Holmes, Harrison Gottlich, Cherie-Akilah G. Browne, Sarah L. Cohen Rassier, Isabel Green, Elizabeth A. Stewart, Hiroaki Takahashi, Bohyun Kim, Shannon Laughlin-Tommaso, Timothy L. Kline

    Abstract: Uterine leiomyosarcoma (LMS) is a rare but aggressive malignancy. On imaging, it is difficult to differentiate LMS from, for example, degenerated leiomyoma (LM), a prevalent but benign condition. We curated a data set of 115 axial T2-weighted MRI images from 110 patients (mean [range] age=45 [17-81] years) with UTs that included five different tumor types. These data were randomly split stratifyin…

    Submitted 20 August, 2023; originally announced August 2023.

  45. arXiv:2308.06198  [pdf, other]

    cs.CV cs.HC

    DIG In: Evaluating Disparities in Image Generations with Indicators for Geographic Diversity

    Authors: Melissa Hall, Candace Ross, Adina Williams, Nicolas Carion, Michal Drozdzal, Adriana Romero Soriano

    Abstract: The unprecedented photorealistic results achieved by recent text-to-image generative systems and their increasing use as plug-and-play content creation solutions make it crucial to understand their potential biases. In this work, we introduce three indicators to evaluate the realism, diversity and prompt-generation consistency of text-to-image generative systems when prompted to generate objects f…

    Submitted 18 March, 2024; v1 submitted 11 August, 2023; originally announced August 2023.

  46. arXiv:2308.05612  [pdf, other]

    cs.RO cs.AI

    A Smart Robotic System for Industrial Plant Supervision

    Authors: D. Adriana Gómez-Rosal, Max Bergau, Georg K. J. Fischer, Andreas Wachaja, Johannes Gräter, Matthias Odenweller, Uwe Piechottka, Fabian Hoeflinger, Nikhil Gosala, Niklas Wetzel, Daniel Büscher, Abhinav Valada, Wolfram Burgard

    Abstract: In today's chemical plants, human field operators perform frequent integrity checks to guarantee high safety standards, and thus are possibly the first to encounter dangerous operating conditions. To alleviate their task, we present a system consisting of an autonomously navigating robot integrated with various sensors and intelligent data processing. It is able to detect methane leaks and estimat…

    Submitted 1 September, 2023; v1 submitted 10 August, 2023; originally announced August 2023.

    Comments: Final submission for IEEE Sensors 2023

  47. arXiv:2308.03919  [pdf, other]

    cs.DC

    The FIDS Theorems: Tensions between Multinode and Multicore Performance in Transactional Systems

    Authors: Naama Ben-David, Gal Sela, Adriana Szekeres

    Abstract: Traditionally, distributed and parallel transactional systems have been studied in isolation, as they targeted different applications and experienced different bottlenecks. However, modern high-bandwidth networks have made the study of systems that are both distributed (i.e., employ multiple nodes) and parallel (i.e., employ multiple cores per node) necessary to truly make use of the available har…

    Submitted 7 August, 2023; originally announced August 2023.

  48. arXiv:2307.14482  [pdf]

    eess.IV cs.CV cs.LG

    Role of Image Acquisition and Patient Phenotype Variations in Automatic Segmentation Model Generalization

    Authors: Timothy L. Kline, Sumana Ramanathan, Harrison C. Gottlich, Panagiotis Korfiatis, Adriana V. Gregory

    Abstract: Purpose: This study evaluated the out-of-domain performance and generalization capabilities of automated medical image segmentation models, with a particular focus on adaptation to new image acquisitions and disease type. Materials: Datasets from both non-contrast and contrast-enhanced abdominal CT scans of healthy patients and those with polycystic kidney disease (PKD) were used. A total of 400…

    Submitted 26 July, 2023; originally announced July 2023.

  49. arXiv:2307.14377  [pdf, other]

    cs.CL cs.AI

    How Can Large Language Models Help Humans in Design and Manufacturing?

    Authors: Liane Makatura, Michael Foshey, Bohan Wang, Felix HähnLein, Pingchuan Ma, Bolei Deng, Megan Tjandrasuwita, Andrew Spielberg, Crystal Elaine Owens, Peter Yichen Chen, Allan Zhao, Amy Zhu, Wil J Norton, Edward Gu, Joshua Jacob, Yifei Li, Adriana Schulz, Wojciech Matusik

    Abstract: The advancement of Large Language Models (LLMs), including GPT-4, provides exciting new opportunities for generative design. We investigate the application of this tool across the entire design and manufacturing workflow. Specifically, we scrutinize the utility of LLMs in tasks such as: converting a text-based prompt into a design specification, transforming a design into manufacturing instruction…

    Submitted 25 July, 2023; originally announced July 2023.

  50. arXiv:2307.09898  [pdf, other]

    eess.AS cs.AI

    An analysis on the effects of speaker embedding choice in non auto-regressive TTS

    Authors: Adriana Stan, Johannah O'Mahony

    Abstract: In this paper we introduce a first attempt on understanding how a non-autoregressive factorised multi-speaker speech synthesis architecture exploits the information present in different speaker embedding sets. We analyse if jointly learning the representations, and initialising them from pretrained models determine any quality improvements for target speaker identities. In a separate analysis, we…

    Submitted 19 July, 2023; originally announced July 2023.

    Comments: Accepted for publication at ISCA Speech Synthesis Workshop 2023