Search | arXiv e-print repository

arXiv:2401.10389

Inverse Problem Approach to Aberration Correction for in vivo Transcranial Imaging Based on a Sparse Representation of Contrast-enhanced Ultrasound Data

Authors: Paul Xing, Antoine Malescot, Eric Martineau, Ravi Rungta, Jean Provost

Abstract: Transcranial ultrasound imaging is currently limited by attenuation and aberration induced by the skull. First used in contrast-enhanced ultrasound (CEUS), highly echoic microbubbles allowed for the development of novel imaging modalities such as ultrasound localization microscopy (ULM). Herein, we develop an inverse problem approach to aberration correction (IPAC) that leverages the sparsity of microbubble signals. We propose to use the a priori knowledge of the medium based upon microbubble localization and wave propagation to build a forward model to link the measured signals directly to the aberration function. A standard least-squares inversion is then used to retrieve the aberration function. We first validated IPAC on simulated data of a vascular network using plane wave as well as divergent wave emissions. We then evaluated the reproducibility of IPAC in vivo in 5 mouse brains. We showed that aberration correction improved the contrast of CEUS images by 4.6 dB. For ULM images, IPAC yielded sharper vessels, reduced vessel duplications, and improved the resolution from 21.1 μm to 18.3 μm. Aberration correction also improved hemodynamic quantification for velocity magnitude and flow direction.

Submitted 14 May, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

arXiv:2312.06674

Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations

Authors: Hakan Inan, Kartikeya Upasani, Jianfeng Chi, Rashi Rungta, Krithika Iyer, Yuning Mao, Michael Tontchev, Qing Hu, Brian Fuller, Davide Testuggine, Madian Khabsa

Abstract: We introduce Llama Guard, an LLM-based input-output safeguard model geared towards Human-AI conversation use cases. Our model incorporates a safety risk taxonomy, a valuable tool for categorizing a specific set of safety risks found in LLM prompts (i.e., prompt classification). This taxonomy is also instrumental in classifying the responses generated by LLMs to these prompts, a process we refer to as response classification. For the purpose of both prompt and response classification, we have meticulously gathered a dataset of high quality. Llama Guard, a Llama2-7b model that is instruction-tuned on our collected dataset, albeit low in volume, demonstrates strong performance on existing benchmarks such as the OpenAI Moderation Evaluation dataset and ToxicChat, where its performance matches or exceeds that of currently available content moderation tools. Llama Guard functions as a language model, carrying out multi-class classification and generating binary decision scores. Furthermore, the instruction fine-tuning of Llama Guard allows for the customization of tasks and the adaptation of output formats. This feature enhances the model's capabilities, such as enabling the adjustment of taxonomy categories to align with specific use cases, and facilitating zero-shot or few-shot prompting with diverse taxonomies at the input. We are making Llama Guard model weights available and we encourage researchers to further develop and adapt them to meet the evolving needs of the community for AI safety.

Submitted 7 December, 2023; originally announced December 2023.

arXiv:2311.02772

Attention or Convolution: Transformer Encoders in Audio Language Models for Inference Efficiency

Authors: Sungho Jeon, Ching-Feng Yeh, Hakan Inan, Wei-Ning Hsu, Rashi Rungta, Yashar Mehdad, Daniel Bikel

Abstract: In this paper, we show that a simple self-supervised pre-trained audio model can achieve comparable inference efficiency to more complicated pre-trained models with speech transformer encoders. These speech transformers rely on mixing convolutional modules with self-attention modules. They achieve state-of-the-art performance on ASR with top efficiency. We first show that employing these speech transformers as an encoder significantly improves the efficiency of pre-trained audio models as well. However, our study shows that we can achieve comparable efficiency with advanced self-attention solely. We demonstrate that this simpler approach is particularly beneficial with a low-bit weight quantization technique of a neural network to improve efficiency. We hypothesize that it prevents propagating the errors between different quantized modules compared to recent speech transformers mixing quantized convolution and the quantized self-attention modules.

Submitted 8 February, 2024; v1 submitted 5 November, 2023; originally announced November 2023.

Comments: 5 pages; accepted to Self-supervision in Audio, Speech and Beyond (SASB) workshop in ICASSP24

arXiv:2309.16039

Effective Long-Context Scaling of Foundation Models

Authors: Wenhan Xiong, Jingyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz, Madian Khabsa, Han Fang, Yashar Mehdad, Sharan Narang, Kshitiz Malik, Angela Fan, Shruti Bhosale, Sergey Edunov, Mike Lewis, Sinong Wang, Hao Ma

Abstract: We present a series of long-context LLMs that support effective context windows of up to 32,768 tokens. Our model series are built through continual pretraining from Llama 2 with longer training sequences and on a dataset where long texts are upsampled. We perform extensive evaluation on language modeling, synthetic context probing tasks, and a wide range of research benchmarks. On research benchmarks, our models achieve consistent improvements on most regular tasks and significant improvements on long-context tasks over Llama 2. Notably, with a cost-effective instruction tuning procedure that does not require human-annotated long instruction data, the 70B variant can already surpass gpt-3.5-turbo-16k's overall performance on a suite of long-context tasks. Alongside these results, we provide an in-depth analysis on the individual components of our method. We delve into Llama's position encodings and discuss its limitation in modeling long dependencies. We also examine the impact of various design choices in the pretraining process, including the data mix and the training curriculum of sequence lengths -- our ablation experiments suggest that having abundant long texts in the pretrain dataset is not the key to achieving strong performance, and we empirically verify that long context continual pretraining is more efficient and similarly effective compared to pretraining from scratch with long sequences.

Submitted 13 November, 2023; v1 submitted 27 September, 2023; originally announced September 2023.

arXiv:2307.09288

Llama 2: Open Foundation and Fine-Tuned Chat Models

Authors: Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, et al. (43 additional authors not shown)

Abstract: In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases. Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be a suitable substitute for closed-source models. We provide a detailed description of our approach to fine-tuning and safety improvements of Llama 2-Chat in order to enable the community to build on our work and contribute to the responsible development of LLMs.

Submitted 19 July, 2023; v1 submitted 18 July, 2023; originally announced July 2023.

arXiv:2209.13759

Structured Summarization: Unified Text Segmentation and Segment Labeling as a Generation Task

Authors: Hakan Inan, Rashi Rungta, Yashar Mehdad

Abstract: Text segmentation aims to divide text into contiguous, semantically coherent segments, while segment labeling deals with producing labels for each segment. Past work has shown success in tackling segmentation and labeling for documents and conversations. This has been possible with a combination of task-specific pipelines, supervised and unsupervised learning objectives. In this work, we propose a single encoder-decoder neural network that can handle long documents and conversations, trained simultaneously for both segmentation and segment labeling using only standard supervision. We successfully show a way to solve the combined task as a pure generation task, which we refer to as structured summarization. We apply the same technique to both document and conversational data, and we show state of the art performance across datasets for both segmentation and labeling, under both high- and low-resource settings. Our results establish a strong case for considering text segmentation and segment labeling as a whole, and moving towards general-purpose techniques that don't depend on domain expertise or task-specific components.

Submitted 27 September, 2022; originally announced September 2022.

arXiv:2209.10650

doi 10.1109/TMI.2023.3316995

Phase Aberration Correction for in vivo Ultrasound Localization Microscopy Using a Spatiotemporal Complex-Valued Neural Network

Authors: Paul Xing, Jonathan Porée, Brice Rauby, Antoine Malescot, Éric Martineau, Vincent Perrot, Ravi L. Rungta, Jean Provost

Abstract: Ultrasound Localization Microscopy (ULM) can map microvessels at a resolution of a few micrometers (μm). Transcranial ULM remains challenging in presence of aberrations caused by the skull, which lead to localization errors. Herein, we propose a deep learning approach based on complex-valued convolutional neural networks (CV-CNNs) to retrieve the aberration function, which can then be used to form enhanced images using standard delay-and-sum beamforming. CV-CNNs were selected as they can apply time delays through multiplication with in-phase quadrature input data. Predicting the aberration function rather than corrected images also confers enhanced explainability to the network. In addition, 3D spatiotemporal convolutions were used for the network to leverage entire microbubble tracks. For training and validation, we used an anatomically and hemodynamically realistic mouse brain microvascular network model to simulate the flow of microbubbles in presence of aberration. The proposed CV-CNN performance was compared to the coherence-based method by using microbubble tracks. We then confirmed the capability of the proposed network to generalize to transcranial in vivo data in the mouse brain (n=3). Vascular reconstructions using a locally predicted aberration function included additional and sharper vessels. The CV-CNN was more robust than the coherence-based method and could perform aberration correction in a 6-month-old mouse. After correction, we measured a resolution of 15.6 μm for younger mice, representing an improvement of 25.8%, while the resolution was improved by 13.9% for the 6-month-old mouse. This work leads to different applications for complex-valued convolutions in biomedical imaging and strategies to perform transcranial ULM.

Submitted 17 July, 2023; v1 submitted 21 September, 2022; originally announced September 2022.

Showing 1–7 of 7 results for author: Rungta, R