-
Is Disentanglement enough? On Latent Representations for Controllable Music Generation
Authors:
Ashis Pati,
Alexander Lerch
Abstract:
Improving controllability or the ability to manipulate one or more attributes of the generated data has become a topic of interest in the context of deep generative models of music. Recent attempts in this direction have relied on learning disentangled representations from data such that the underlying factors of variation are well separated. In this paper, we focus on the relationship between disentanglement and controllability by conducting a systematic study using different supervised disentanglement learning algorithms based on the Variational Auto-Encoder (VAE) architecture. Our experiments show that a high degree of disentanglement can be achieved by using different forms of supervision to train a strong discriminative encoder. However, in the absence of a strong generative decoder, disentanglement does not necessarily imply controllability. The structure of the latent space with respect to the VAE-decoder plays an important role in boosting the ability of a generative model to manipulate different attributes. To this end, we also propose methods and metrics to help evaluate the quality of a latent space with respect to the afforded degree of controllability.
Submitted 1 August, 2021;
originally announced August 2021.
-
An Interdisciplinary Review of Music Performance Analysis
Authors:
Alexander Lerch,
Claire Arthur,
Ashis Pati,
Siddharth Gururani
Abstract:
A musical performance renders an acoustic realization of a musical score or other representation of a composition. Different performances of the same composition may vary in terms of performance parameters such as timing or dynamics, and these variations may have a major impact on how a listener perceives the music. The analysis of music performance has traditionally been a peripheral topic for the MIR research community, where often a single audio recording is used as representative of a musical work. This paper surveys the field of Music Performance Analysis (MPA) from several perspectives including the measurement of performance parameters, the relation of those parameters to the actions and intentions of a performer or perceptual effects on a listener, and finally the assessment of musical performance. This paper also discusses MPA as it relates to MIR, pointing out opportunities for collaboration and future research in both areas.
Submitted 18 April, 2021;
originally announced April 2021.
-
Score-informed Networks for Music Performance Assessment
Authors:
Jiawen Huang,
Yun-Ning Hung,
Ashis Pati,
Siddharth Kumar Gururani,
Alexander Lerch
Abstract:
The assessment of music performances in most cases takes into account the underlying musical score being performed. While there have been several automatic approaches for objective music performance assessment (MPA) based on extracted features from both the performance audio and the score, deep neural network-based methods incorporating score information into MPA models have not yet been investigated. In this paper, we introduce three different models capable of score-informed performance assessment. These are (i) a convolutional neural network that utilizes a simple time-series input comprising aligned pitch contours and score, (ii) a joint embedding model which learns a joint latent space for pitch contours and scores, and (iii) a distance matrix-based convolutional neural network which utilizes patterns in the distance matrix between pitch contours and musical score to predict assessment ratings. Our results provide insights into the suitability of different architectures and input representations and demonstrate the benefits of score-informed models as compared to score-independent models.
Submitted 1 August, 2020;
originally announced August 2020.
-
dMelodies: A Music Dataset for Disentanglement Learning
Authors:
Ashis Pati,
Siddharth Gururani,
Alexander Lerch
Abstract:
Representation learning focused on disentangling the underlying factors of variation in given data has become an important area of research in machine learning. However, most of the studies in this area have relied on datasets from the computer vision domain and thus, have not been readily extended to music. In this paper, we present a new symbolic music dataset that will help researchers working on disentanglement problems demonstrate the efficacy of their algorithms on diverse domains. This will also provide a means for evaluating algorithms specifically designed for music. To this end, we create a dataset comprising 2-bar monophonic melodies where each melody is the result of a unique combination of nine latent factors that span ordinal, categorical, and binary types. The dataset is large enough (approx. 1.3 million data points) to train and test deep networks for disentanglement learning. In addition, we present benchmarking experiments using popular unsupervised disentanglement algorithms on this dataset and compare the results with those obtained on an image-based dataset.
Submitted 29 July, 2020;
originally announced July 2020.
-
Attribute-based Regularization of Latent Spaces for Variational Auto-Encoders
Authors:
Ashis Pati,
Alexander Lerch
Abstract:
Selective manipulation of data attributes using deep generative models is an active area of research. In this paper, we present a novel method to structure the latent space of a Variational Auto-Encoder (VAE) to encode different continuous-valued attributes explicitly. This is accomplished by using an attribute regularization loss which enforces a monotonic relationship between the attribute values and the latent code of the dimension along which the attribute is to be encoded. Consequently, post-training, the model can be used to manipulate the attribute by simply changing the latent code of the corresponding regularized dimension. The results obtained from several quantitative and qualitative experiments show that the proposed method leads to disentangled and interpretable latent spaces that can be used to effectively manipulate a wide range of data attributes spanning image and symbolic music domains.
Submitted 28 July, 2020; v1 submitted 11 April, 2020;
originally announced April 2020.
-
Explicitly Conditioned Melody Generation: A Case Study with Interdependent RNNs
Authors:
Benjamin Genchel,
Ashis Pati,
Alexander Lerch
Abstract:
Deep generative models for symbolic music are typically designed to model temporal dependencies in music so as to predict the next musical event given previous events. In many cases, such models are expected to learn abstract concepts such as harmony, meter, and rhythm from raw musical data without any additional information. In this study, we investigate the effects of explicitly conditioning deep generative models with musically relevant information. Specifically, we study the effects of four different conditioning inputs on the performance of a recurrent monophonic melody generation model. Several combinations of these conditioning inputs are used to train different model variants which are then evaluated using three objective evaluation paradigms across two genres of music. The results indicate musically relevant conditioning significantly improves learning and performance, and reveal how this information affects learning of musical features related to pitch and rhythm. An informal subjective evaluation suggests a corresponding improvement in the aesthetic quality of generations.
Submitted 9 July, 2019;
originally announced July 2019.
-
Learning to Traverse Latent Spaces for Musical Score Inpainting
Authors:
Ashis Pati,
Alexander Lerch,
Gaëtan Hadjeres
Abstract:
Music Inpainting is the task of filling in missing or lost information in a piece of music. We investigate this task from an interactive music creation perspective. To this end, a novel deep learning-based approach for musical score inpainting is proposed. The designed model takes both past and future musical context into account and is capable of suggesting ways to connect them in a musically meaningful manner. To achieve this, we leverage the representational power of the latent space of a Variational Auto-Encoder and train a Recurrent Neural Network which learns to traverse this latent space conditioned on the past and future musical contexts. Consequently, the designed model is capable of generating several measures of music to connect two musical excerpts. The capabilities and performance of the model are showcased by comparison with competitive baselines using several objective and subjective evaluation methods. The results show that the model generates meaningful inpaintings and can be used in interactive music creation applications. Overall, the method demonstrates the merit of learning complex trajectories in the latent spaces of deep generative models.
Submitted 2 July, 2019;
originally announced July 2019.
-
Music Performance Analysis: A Survey
Authors:
Alexander Lerch,
Claire Arthur,
Ashis Pati,
Siddharth Gururani
Abstract:
Music Information Retrieval (MIR) tends to focus on the analysis of audio signals. Often, a single music recording is used as representative of a "song" even though different performances of the same song may reveal different properties. A performance is distinct in many ways from a (arguably more abstract) representation of a "song," "piece," or musical score. The characteristics of the (recorded) performance -- as opposed to the score or musical idea -- can have a major impact on how a listener perceives music. The analysis of music performance, however, has been traditionally only a peripheral topic for the MIR research community. This paper surveys the field of Music Performance Analysis (MPA) from various perspectives, discusses its significance to the field of MIR, and points out opportunities for future research in this field.
Submitted 29 June, 2019;
originally announced July 2019.
-
Quantifying coherence with quantum addition
Authors:
Chiranjib Mukhopadhyay,
Arun Kumar Pati,
Sk Sazim
Abstract:
Quantum addition channels have been recently introduced in the context of deriving entropic power inequalities for finite dimensional quantum systems. We prove a reverse entropy power equality which can be used to analytically prove an inequality conjectured recently for arbitrary dimension and arbitrary addition weight. We show that the relative entropic difference between the output of such a quantum addition channel and the corresponding classical mixture quantitatively captures the amount of coherence present in a quantum system. This new coherence measure admits an upper bound in terms of the relative entropy of coherence and is utilized to formulate a state-dependent uncertainty relation for two observables. Our results may provide deep insights into the origin of quantum coherence for mixed states that truly come from the discrepancy between quantum addition and the classical mixture.
Submitted 20 March, 2018; v1 submitted 19 March, 2018;
originally announced March 2018.
-
A Rule-Based Computational Model of Cognitive Arithmetic
Authors:
Ashis Pati,
Kantwon Rogers,
Hanqing Zhu
Abstract:
Cognitive arithmetic studies the mental processes used in solving math problems. This area of research explores the retrieval mechanisms and strategies used by people during a common cognitive task. Past research has shown that human performance in arithmetic operations is correlated to the numerical size of the problem. Past research on cognitive arithmetic has pinpointed this trend to either retrieval strength, error checking, or strategy-based approaches when solving equations. This paper describes a rule-based computational model that performs the four major arithmetic operations (addition, subtraction, multiplication and division) on two operands. We then evaluated our model to probe its validity in representing the prevailing concepts observed in psychology experiments from the related works. The experiments specifically explore the problem size effect, an activation-based model for fact retrieval, backup strategies when retrieval fails, and finally optimization strategies when faced with large operands. From our experimental results, we concluded that our model's response times were comparable to results observed when people performed similar tasks during psychology experiments. The fit of our model in reproducing these results and incorporating accuracy into our model are discussed.
Submitted 2 May, 2017;
originally announced May 2017.
-
Stronger Error Disturbance Relations for Incompatible Quantum Measurements
Authors:
Chiranjib Mukhopadhyay,
Namrata Shukla,
Arun Kumar Pati
Abstract:
We formulate a new error-disturbance relation, which is free from explicit dependence upon variances in observables. This error-disturbance relation shows improvement over the one provided by the Branciard inequality and the Ozawa inequality for some initial states and for a particular class of joint measurements under consideration. We also prove a modified form of Ozawa's error-disturbance relation. The latter relation provides a tighter bound compared to the Ozawa and the Branciard inequalities for a small number of states.
Submitted 13 December, 2016; v1 submitted 17 March, 2015;
originally announced March 2015.
-
Monogamy, polygamy, and other properties of entanglement of purification
Authors:
Shrobona Bagchi,
Arun Kumar Pati
Abstract:
For bipartite pure and mixed quantum states, in addition to the quantum mutual information, there is another measure of total correlation, namely, the entanglement of purification. We study the monogamy, polygamy, and additivity properties of the entanglement of purification for pure and mixed states. In this paper, we show that, in contrast to the quantum mutual information which is strictly monogamous for any tripartite pure states, the entanglement of purification is polygamous for the same. This shows that there can be genuinely two types of total correlation across any bipartite cross in a pure tripartite state. Furthermore, we find the lower bound and actual values of the entanglement of purification for different classes of tripartite and higher-dimensional bipartite mixed states. Thereafter, we show that if entanglement of purification is not additive on tensor product states, it is actually subadditive. Using these results, we identify some states which are additive on tensor products for entanglement of purification. The implications of these findings on the quantum advantage of dense coding are briefly discussed, whereby we show that for tripartite pure states, it is strictly monogamous and if it is nonadditive, then it is superadditive on tensor product states.
Submitted 15 November, 2015; v1 submitted 4 February, 2015;
originally announced February 2015.