Diff-MST: Differentiable Mixing Style Transfer

Authors: Soumya Sai Vanka, Christian Steinmetz, Jean-Baptiste Rolland, Joshua Reiss, George Fazekas

Abstract: Mixing style transfer automates the generation of a multitrack mix for a given set of tracks by inferring production attributes from a reference song. However, existing systems for mixing style transfer are limited in that they often operate only on a fixed number of tracks, introduce artifacts, and produce mixes in an end-to-end fashion, without grounding in traditional audio effects, prohibiting interpretability and controllability. To overcome these challenges, we introduce Diff-MST, a framework comprising a differentiable mixing console, a transformer controller, and an audio production style loss function. By inputting raw tracks and a reference song, our model estimates control parameters for audio effects within a differentiable mixing console, producing high-quality mixes and enabling post-hoc adjustments. Moreover, our architecture supports an arbitrary number of input tracks without source labelling, enabling real-world applications. We evaluate our model's performance against robust baselines and showcase the effectiveness of our approach, architectural design, tailored audio production style loss, and innovative training methodology for the given task.

Submitted 11 July, 2024; originally announced July 2024.

Comments: Accepted to be published at the Proceedings of the 25th International Society for Music Information Retrieval Conference 2024

arXiv:2309.03404 [pdf, other]

The Role of Communication and Reference Songs in the Mixing Process: Insights from Professional Mix Engineers

Authors: Soumya Sai Vanka, Maryam Safi, Jean-Baptiste Rolland, György Fazekas

Abstract: Effective music mixing requires technical and creative finesse, but clear communication with the client is crucial. The mixing engineer must grasp the client's expectations and preferences, and collaborate to achieve the desired sound. The tacit agreement for the desired sound of the mix is often established using guides like reference songs and demo mixes exchanged between the artist and the engineer and sometimes verbalised using semantic terms. This paper presents the findings of a two-phased exploratory study aimed at understanding how professional mixing engineers interact with clients and use their feedback to guide the mixing process. For phase one, semi-structured interviews were conducted with five mixing engineers with the aim of gathering insights about their communication strategies, creative processes, and decision-making criteria. Based on the inferences from these interviews, an online questionnaire was designed and administered to a larger group of 22 mixing engineers during the second phase. The results of this study shed light on the importance of collaboration, empathy, and intention in the mixing process, and can inform the development of smart multi-track mixing systems that better support these practices. By highlighting the significance of these findings, this paper contributes to the growing body of research on the collaborative nature of music production and provides actionable recommendations for the design and implementation of innovative mixing tools.

Submitted 29 September, 2023; v1 submitted 6 September, 2023; originally announced September 2023.

arXiv:2304.03407 [pdf, other]

Adoption of AI Technology in the Music Mixing Workflow: An Investigation

Authors: Soumya Sai Vanka, Maryam Safi, Jean-Baptiste Rolland, George Fazekas

Abstract: The integration of artificial intelligence (AI) technology in the music industry is driving a significant change in the way music is being composed, produced and mixed. This study investigates the current state of AI in the mixing workflows and its adoption by different user groups. Through semi-structured interviews, a questionnaire-based study, and analyzing web forums, the study confirms three user groups comprising amateurs, pro-ams, and professionals. Our findings show that while AI mixing tools can simplify the process and provide decent results for amateurs, pro-ams seek precise control and customization options, while professionals desire control and customization options in addition to assistive and collaborative technologies. The study provides strategies for designing effective AI mixing tools for different user groups and outlines future directions.

Submitted 8 September, 2023; v1 submitted 6 April, 2023; originally announced April 2023.

Journal ref: Paper number 10653, 154th AES Convention 2023

arXiv:2102.08588 [pdf, other]

doi 10.1016/j.neucom.2022.04.058

NODE-SELECT: A Graph Neural Network Based On A Selective Propagation Technique

Authors: Steph-Yves Louis, Alireza Nasiri, Fatima J. Rolland, Cameron Mitro, Jianjun Hu

Abstract: While there exists a wide variety of graph neural networks (GNN) for node classification, only a minority of them adopt mechanisms that effectively target noise propagation during the message-passing procedure. Additionally, a very important challenge that significantly affects graph neural networks is the issue of scalability which limits their application to larger graphs. In this paper we propose our method named NODE-SELECT: an efficient graph neural network that uses subsetting layers which only allow the best sharing-fitting nodes to propagate their information. By having a selection mechanism within each layer which we stack in parallel, our proposed method NODE-SELECT is able to both reduce the amount of noise propagated and adapt the restrictive sharing concept observed in real world graphs. Our NODE-SELECT significantly outperformed existing GNN frameworks in noise experiments and matched state-of-the-art results in experiments without noise over different benchmark datasets.

Submitted 17 February, 2021; originally announced February 2021.

arXiv:1902.07769 [pdf]

Development of Head-Mounted Projection Displays for Distributed, Collaborative, Augmented Reality Applications

Authors: Jannick P. Rolland, Frank Biocca, Felix G. Hamza-Lup, Yanggang Ha, Ricardo Martins

Abstract: Distributed systems technologies supporting 3D visualization and social collaboration will be increasing in frequency and type over time. An emerging type of head-mounted display referred to as the head-mounted projection display (HMPD) was recently developed that only requires ultralight optics (i.e., less than 8 g per eye) that enables immersive multiuser, mobile augmented reality 3D visualization, as well as remote 3D collaborations. In this paper a review of the development of lightweight HMPD technology is provided, together with insight into what makes this technology timely and so unique. Two novel emerging HMPD-based technologies are then described: a teleportal HMPD (T-HMPD) enabling face-to-face communication and visualization of shared 3D virtual objects, and a mobile HMPD (M-HMPD) designed for outdoor wearable visualization and communication. Finally, the use of HMPD in medical visualization and training, as well as in infospaces, two applications developed in the ODA and MIND labs respectively, is discussed.

Submitted 20 February, 2019; originally announced February 2019.

Journal ref: Immersive Projection Technology, 2005, PRESENCE, Vol. 14(5), pp. 528-549

arXiv:1812.03322 [pdf]

Scene Synchronization for Real-Time Interaction in Distributed Mixed Reality and Virtual Reality Environments

Authors: Felix G. Hamza-Lup, Jannick P. Rolland

Abstract: Advances in computer networks and rendering systems facilitate the creation of distributed collaborative environments in which the distribution of information at remote locations allows efficient communication. One of the challenges in networked virtual environments is maintaining a consistent view of the shared state in the presence of inevitable network latency and jitter. A consistent view in a shared scene may significantly increase the sense of presence among participants and facilitate their interactivity. The dynamic shared state is directly affected by the frequency of actions applied on the objects in the scene. Mixed Reality (MR) and Virtual Reality (VR) environments contain several types of action producers including human users, a wide range of electronic motion sensors, and haptic devices. In this paper, the authors propose a novel criterion for categorization of distributed MR/VR systems and present an adaptive synchronization algorithm for distributed MR/VR collaborative environments. In spite of significant network latency, results show that for low levels of update frequencies the dynamic shared state can be maintained consistent at multiple remotely located sites.

Submitted 8 December, 2018; originally announced December 2018.

Journal ref: Special Issue: Collaborative Virtual Environments (2004), PRESENCE, Vol. 13(3), pp. 315-327 (ISSN 1054-7460)

arXiv:1811.12815 [pdf]

A Distributed Augmented Reality System for Medical Training and Simulation

Authors: Felix G. Hamza-Lup, Jannick P. Rolland, Charles Hughes

Abstract: Augmented Reality (AR) systems describe the class of systems that use computers to overlay virtual information on the real world. AR environments allow the development of promising tools in several application domains. In medical training and simulation the learning potential of AR is significantly amplified by the capability of the system to present 3D medical models in real-time at remote locations. Furthermore the simulation applicability is broadened by the use of real-time deformable medical models. This work presents a distributed medical training prototype designed to train medical practitioners' hand-eye coordination when performing endotracheal intubations. The system we present accomplishes this task with the help of AR paradigms. An extension of this prototype to medical simulations by employing deformable medical models is possible. The shared state maintenance of the collaborative AR environment is assured through a novel adaptive synchronization algorithm (ASA) that increases the sense of presence among participants and facilitates their interactivity in spite of infrastructure delays. The system will allow paramedics, pre-hospital personnel, and students to practice their skills without touching a real patient and will provide them with the visual feedback they could not otherwise obtain. Such a distributed AR training tool has the potential to allow an instructor to simultaneously train local and remotely located students, and to allow students to actually "see" the internal anatomy and therefore better understand their actions on a human patient simulator (HPS).

Submitted 28 November, 2018; originally announced November 2018.

Comments: arXiv admin note: text overlap with arXiv:1111.2993 by other authors

Journal ref: Energy, Simulation-Training, Ocean Engineering and Instrumentation: Research Papers of the Link Foundation Fellows (2004), Vol. 4, pp. 213-235

arXiv:1811.11955 [pdf]

Sensors in Distributed Mixed Reality Environments

Authors: Felix G. Hamza-Lup, Charles Hughes, Jannick P. Rolland

Abstract: With the advances in sensors and computer networks an increased number of Mixed Reality (MR) applications require large amounts of information from the real world. Such information is collected through sensors (e.g. position and orientation tracking sensors). These sensors collect data from the physical environment in real-time at different locations and a distributed system connecting them must assure data distribution among collaborative sites at interactive speeds. We propose a new architecture for sensor based interactive distributed environments that falls in-between the atomistic peer-to-peer model and the traditional client-server model. Each node in the system is autonomous and fully manages its resources and connectivity. The dynamic behavior of the nodes is triggered by the human participants that manipulate the sensors attached to the nodes.

Submitted 28 November, 2018; originally announced November 2018.

Journal ref: Journal of Systemics, Cybernetics and Informatics (2006), Vol. 3(2), pp. 96-101

arXiv:1811.11953 [pdf]

Distributed Augmented Reality with 3D Lung Dynamics -- A Planning Tool Concept

Authors: Felix G. Hamza-Lup, Anand P. Santhanam, Celina Imielinska, Sanford Meeks, Jannick P. Rolland

Abstract: Augmented Reality (AR) systems add visual information to the world by using advanced display techniques. The advances in miniaturization and reduced costs make some of these systems feasible for applications in a wide set of fields. We present a potential component of the cyber infrastructure for the operating room of the future; a distributed AR based software-hardware system that allows real-time visualization of 3D lung dynamics superimposed directly on the patient's body. Several emergency events (e.g. closed and tension pneumothorax) and surgical procedures related to the lung (e.g. lung transplantation, lung volume reduction surgery, surgical treatment of lung infections, lung cancer surgery) could benefit from the proposed prototype.

Submitted 28 November, 2018; originally announced November 2018.

Journal ref: IEEE Transactions on Information Technology in Biomedicine (2007), Vol. 11(1), pp. 40-46

arXiv:1811.08833 [pdf]

doi 10.1007/s12008-007-0027-z

Beyond the Desktop: Emerging Technologies for Supporting 3D Collaborative Teams

Authors: Jannick Rolland, Ozan Cakmakci, Jeff Covelli, Cali Fidopiastis, Florian Fournier, Ricardo Martins, Felix G. Hamza-Lup, Denise Nicholson

Abstract: The emergence of several trends, including the increased availability of wireless networks, miniaturization of electronics and sensing technologies, and novel input and output devices, is creating a demand for integrated, full-time displays for use across a wide range of applications, including collaborative environments. In this paper, we present and discuss emerging visualization methods we are developing, particularly as they relate to deployable displays and displays worn on the body to support mobile users.

Submitted 20 November, 2018; originally announced November 2018.

Journal ref: International Journal on Interactive Design and Manufacturing (2007), Vol. 4(1), pp. 239-241

arXiv:1811.08053 [pdf]

doi 10.1097/SIH.0b013e31816b5d54

Generating Classes of 3D Virtual Mandibles for AR-Based Medical Simulation

Authors: Neha R. Hippalgaonkar, Alexa D. Sider, Felix G. Hamza-Lup, Anand P. Santhanam, Bala Jaganathan, Celina Imielinska, Jannick P. Rolland

Abstract: Simulation and modeling represent promising tools for several application domains from engineering to forensic science and medicine. Advances in 3D imaging technology convey paradigms such as augmented reality (AR) and mixed reality inside promising simulation tools for the training industry. Motivated by the requirement for superimposing anatomically correct 3D models on a Human Patient Simulator (HPS) and visualizing them in an AR environment, the purpose of this research effort is to derive a method for scaling a source human mandible to a target human mandible. Results show that, given a distance between the same two landmarks on two different mandibles, a relative scaling factor may be computed. Using this scaling factor, results show that a 3D virtual mandible model can be made morphometrically equivalent to a real target-specific mandible within a 1.30 millimeter average error bound. The virtual mandible may be further used as a reference target for registering other anatomical models, such as the lungs, on the HPS. Such registration will be made possible by physical constraints among the mandible and the spinal column in the horizontal normal rest position.

Submitted 19 November, 2018; originally announced November 2018.

Journal ref: Journal of the Society for Simulation in Healthcare (2008), vol. 3(2), pp. 103-110

Showing 1–11 of 11 results for author: Rolland, J