-
Subtractive Training for Music Stem Insertion using Latent Diffusion Models
Authors:
Ivan Villa-Renteria,
Mason L. Wang,
Zachary Shah,
Zhe Li,
Soohyun Kim,
Neelesh Ramachandran,
Mert Pilanci
Abstract:
We present Subtractive Training, a simple and novel method for synthesizing individual musical instrument stems given other instruments as context. This method pairs a dataset of complete music mixes with 1) a variant of the dataset lacking a specific stem, and 2) LLM-generated instructions describing how the missing stem should be reintroduced. We then fine-tune a pretrained text-to-audio diffusion model to generate the missing instrument stem, guided by both the existing stems and the text instruction. Our results demonstrate Subtractive Training's efficacy in creating authentic drum stems that seamlessly blend with the existing tracks. We also show that we can use the text instruction to control the generation of the inserted stem in terms of rhythm, dynamics, and genre, allowing us to modify the style of a single instrument in a full song while keeping the remaining instruments the same. Lastly, we extend this technique to MIDI formats, successfully generating compatible bass, drum, and guitar parts for incomplete arrangements.
Submitted 27 June, 2024;
originally announced June 2024.
-
Algoritmos de minería de datos en la industria sanitaria (Data mining algorithms in the healthcare industry)
Authors:
Marta Li Wang
Abstract:
In this paper, we review data mining approaches for health applications. Our focus is on hardware-centric approaches. Modern computers consist of multiple processors, each equipped with multiple cores, each with a set of arithmetic/logical units. Thus, a modern computer may be composed of several thousand units capable of doing arithmetic operations like addition and multiplication. Graphics processors, in addition, may offer some thousand such units. In both cases, single instruction multiple data and multiple instruction multiple data parallelism must be exploited. We review the principles of algorithms which exploit this parallelism and also focus on the memory issues that arise when multiple processing units access main memory through caches. This is important for many health applications, such as ECG, EEG, CT, SPECT, fMRI, DTI, ultrasound, microscopy, dermoscopy, etc.
Submitted 19 April, 2021;
originally announced April 2021.
-
Extracting full-field subpixel structural displacements from videos via deep learning
Authors:
Lele Luan,
Jingwei Zheng,
Yongchao Yang,
Ming L. Wang,
Hao Sun
Abstract:
This paper develops a deep learning framework based on convolutional neural networks (CNNs) that enables real-time extraction of full-field subpixel structural displacements from videos. In particular, two new CNN architectures are designed and trained on a dataset generated by the phase-based motion extraction method from a single lab-recorded high-speed video of a dynamic structure. As displacement is only reliable in the regions with sufficient texture contrast, the sparsity of the motion field induced by the texture mask is considered via the network architecture design and loss function definition. Results show that, with the supervision of full and sparse motion fields, the trained network is capable of identifying the pixels with sufficient texture contrast as well as their subpixel motions. The performance of the trained networks is tested on various videos of other structures to extract the full-field motion (e.g., displacement time histories), which indicates that the trained networks generalize well enough to accurately extract full-field subtle displacements for pixels with sufficient texture contrast.
Submitted 3 September, 2020; v1 submitted 31 August, 2020;
originally announced August 2020.
-
A Novel Approach of using AR and Smart Surgical Glasses Supported Trauma Care
Authors:
Anurag Lal,
Ming-Hsien Hu,
Pei-Yuan Lee,
Min Liang Wang
Abstract:
BACKGROUND: Augmented reality (AR) is gaining popularity in various fields such as computer gaming and medical education. However, there are still few applications in real surgeries. Orthopedic surgical applications are currently limited and underdeveloped. - METHODS: The clinical validation was prepared with the currently available AR equipment and software. A total of 1 Vertebroplasty, 2 ORIF Pelvis fracture, 1 ORIF with PFN for Proximal Femoral Fracture, 1 CRIF for distal radius fracture, and 2 ORIF for Tibia Fracture cases were performed with fluoroscopy combined with an AR smart surgical glasses system. - RESULTS: A total of 1 Vertebroplasty, 2 ORIF Pelvis fracture, 1 ORIF with PFN for Proximal Femoral Fracture, 1 CRIF for distal radius fracture, and 2 ORIF for Tibia Fracture cases were performed to evaluate the benefits of AR surgery. During the AR surgeries, surgeons wearing the smart surgical glasses turned their eyes to focus on the monitors far less often. This paper shows the potential of augmented reality technology for trauma surgery.
Submitted 25 May, 2020;
originally announced May 2020.
-
Exploiting Spatial Degrees of Freedom for High Data Rate Ultrasound Communication with Implantable Devices
Authors:
Max L. Wang,
Amin Arbabian
Abstract:
We propose and demonstrate an ultrasonic communication link using spatial degrees of freedom to increase data rates for deeply implantable medical devices. Low attenuation and millimeter wavelengths make ultrasound an ideal communication medium for miniaturized low-power implants. While small spectral bandwidth has drastically limited achievable data rates in conventional ultrasonic implants, large spatial bandwidth can be exploited by using multiple transducers in a multiple-input/multiple-output system to provide spatial multiplexing gain without additional power, larger bandwidth, or complicated packaging. We experimentally verify the communication link in mineral oil with a transmitter and receiver 5 cm apart, each housing two custom-designed mm-sized piezoelectric transducers operating at the same frequency. Two streams of data modulated with quadrature phase-shift keying at 125 kbps are simultaneously transmitted and received on both channels, effectively doubling the data rate to 250 kbps with a measured bit error rate below 1e-4. We also evaluate the performance and robustness of the channel separation network by testing the communication link after introducing position offsets. These results demonstrate the potential of spatial multiplexing to enable more complex implant applications requiring higher data rates.
Submitted 16 February, 2017;
originally announced February 2017.