Search | arXiv e-print repository

doi 10.18653/v1/2021.findings-emnlp.339

Detecting Frames in News Headlines and Lead Images in U.S. Gun Violence Coverage

Authors: Isidora Chara Tourni, Lei Guo, Hengchang Hu, Edward Halim, Prakash Ishwar, Taufiq Daryanto, Mona Jalal, Boqi Chen, Margrit Betke, Fabian Zhafransyah, Sha Lai, Derry Tanti Wijaya

Abstract: News media structure their reporting of events or issues using certain perspectives. When describing an incident involving gun violence, for example, some journalists may focus on mental health or gun regulation, while others may emphasize the discussion of gun rights. Such perspectives are called "frames" in communication research. We study, for the first time, the value of combining lead images and their contextual information with text to identify the frame of a given news article. We observe that using multiple modes of information (article- and image-derived features) improves prediction of news frames over any single mode of information when the images are relevant to the frames of the headlines. We also observe that frame image relevance is related to the ease of conveying frames via images, which we call frame concreteness. Additionally, we release the first multimodal news framing dataset related to gun violence in the U.S., curated and annotated by communication researchers. The dataset will allow researchers to further examine the use of multiple information modalities for studying media framing.

Submitted 24 June, 2024; originally announced June 2024.

Comments: published at Findings of the Association for Computational Linguistics: EMNLP 2021

arXiv:2406.17159 [pdf, other]

Exploring compressibility of transformer based text-to-music (TTM) models

Authors: Vasileios Moschopoulos, Thanasis Kotsiopoulos, Pablo Peso Parada, Konstantinos Nikiforidis, Alexandros Stergiadis, Gerasimos Papakostas, Md Asif Jalal, Jisi Zhang, Anastasios Drosou, Karthikeyan Saravanan

Abstract: State-of-the-art Text-To-Music (TTM) generative AI models are large and require desktop or server-class compute, making them infeasible for deployment on mobile phones. This paper presents an analysis of trade-offs between model compression and generation performance of TTM models. We study compression through knowledge distillation and specific modifications that enable applicability over the various components of the TTM model (encoder, generative model and the decoder). Leveraging these methods we create TinyTTM (89.2M params) that achieves a FAD of 3.66 and KL of 1.32 on the MusicBench dataset, better than MusicGen-Small (557.6M params) but not lower than MusicGen-Small fine-tuned on MusicBench.

Submitted 24 June, 2024; originally announced June 2024.

Comments: Proceedings of INTERSPEECH 2024

arXiv:2404.19570 [pdf, other]

Morphodynamics of chloroplast network control light-avoidance response in the non-motile dinoflagellate Pyrocystis lunula

Authors: Nico Schramma, Gloria Casas Canales, Maziyar Jalaal

Abstract: Photosynthetic algae play a significant role in oceanic carbon capture. Their performance, however, is constantly challenged by fluctuations in environmental light conditions. Here, we show that the non-motile single-celled marine dinoflagellate Pyrocystis lunula can internally contract its chloroplast network in response to light. By exposing the cell to various physiological light conditions and applying temporal illumination sequences, we find that network morphodynamics follows simple rules, as established in a mathematical model. Our analysis of the chloroplast structure reveals that its unusual reticulated morphology constitutes properties similar to auxetic metamaterials, facilitating drastic deformations for light-avoidance, while confined by the cell wall. Our study shows how the topologically complex network of chloroplasts is crucial in supporting the dinoflagellate's adaptation to varying light conditions, thereby facilitating essential life-sustaining processes.

Submitted 30 April, 2024; originally announced April 2024.

arXiv:2401.13146 [pdf, other]

Locality enhanced dynamic biasing and sampling strategies for contextual ASR

Authors: Md Asif Jalal, Pablo Peso Parada, George Pavlidis, Vasileios Moschopoulos, Karthikeyan Saravanan, Chrysovalantis-Giorgos Kontoulis, Jisi Zhang, Anastasios Drosou, Gil Ho Lee, Jungin Lee, Seokyeong Jung

Abstract: Automatic Speech Recognition (ASR) still faces challenges when recognizing time-variant rare phrases. Contextual biasing (CB) modules bias the ASR model towards such contextually-relevant phrases. During training, a list of biasing phrases is selected from a large pool of phrases following a sampling strategy. In this work we firstly analyse different sampling strategies to provide insights into the training of CB for ASR with correlation plots between the bias embeddings among various training stages. Secondly, we introduce a neighbourhood attention (NA) that localizes self attention (SA) to the nearest neighbouring frames to further refine the CB output. The results show that this proposed approach provides on average a 25.84% relative WER improvement on LibriSpeech sets and rare-word evaluation compared to the baseline.

Submitted 23 January, 2024; originally announced January 2024.

Comments: Accepted for IEEE ASRU 2023

arXiv:2401.12085 [pdf, other]

Consistency Based Unsupervised Self-training For ASR Personalisation

Authors: Jisi Zhang, Vandana Rajan, Haaris Mehmood, David Tuckey, Pablo Peso Parada, Md Asif Jalal, Karthikeyan Saravanan, Gil Ho Lee, Jungin Lee, Seokyeong Jung

Abstract: On-device Automatic Speech Recognition (ASR) models trained on speech data of a large population might underperform for individuals unseen during training. This is due to a domain shift between user data and the original training data, which differ in the user's speaking characteristics and environmental acoustic conditions. ASR personalisation is a solution that aims to exploit user data to improve model robustness. The majority of ASR personalisation methods assume labelled user data for supervision. Personalisation without any labelled data is challenging due to limited data size and poor quality of recorded audio samples. This work addresses unsupervised personalisation by developing a novel consistency based training method via pseudo-labelling. Our method achieves a relative Word Error Rate Reduction (WERR) of 17.3% on unlabelled training data and 8.1% on held-out data compared to a pre-trained model, and outperforms the current state-of-the-art methods.

Submitted 22 January, 2024; originally announced January 2024.

Comments: Accepted for IEEE ASRU 2023

arXiv:2401.02298 [pdf, other]

Optimal shape design of printing nozzles for extrusion-based additive manufacturing

Authors: Tomas Schuller, Maziyar Jalaal, Paola Fanzio, Francisco J. Galindo-Rosales

Abstract: The optimal design seeks the best possible solution(s) for a mechanical structure, device, or system, satisfying a series of requirements and leading to the best performance. In this work, optimized nozzle shapes have been designed for a wide range of polymer melts to be used in extrusion-based additive manufacturing, which aims to minimize pressure drop and allow greater flow control at large extrusion velocities. This is achieved with a twofold approach, combining a global optimization algorithm with computational fluid dynamics for optimizing a contraction geometry for viscoelastic fluids and validating these geometries experimentally. In the optimization process, variable coordinates for the nozzle's contraction section are defined, the objective function is selected, and the optimization algorithm is guided within manufacturing constraints. Comparisons of flow-type and streamline plots reveal that the nozzle shape significantly influences flow patterns. Depending on the rheological properties, the optimized solution either promotes shear or extensional flow, enhancing the material flow rate. Finally, experimental validation of the nozzle performance assessed the actual printing flow, the extrusion force and the overall print control. It is shown that optimizing the nozzle can significantly reduce backflow-related pressure drop, positively impacting total pressure drop (up to 41%) and reducing backflow effects.
This work has real-world implications for the additive manufacturing industry, offering opportunities for increased printing speeds, enhanced productivity, and improved printing quality and reliability. Our research contributes to advancing extrusion-based printing processes technology, addressing industry demands and enhancing the field of additive manufacturing.

Submitted 4 January, 2024; originally announced January 2024.

arXiv:2312.14278 [pdf, other]

Ductile-to-brittle transition and yielding in soft amorphous materials: perspectives and open questions

Authors: Thibaut Divoux, Elisabeth Agoritsas, Stefano Aime, Catherine Barentin, Jean-Louis Barrat, Roberto Benzi, Ludovic Berthier, Dapeng Bi, Giulio Biroli, Daniel Bonn, Philippe Bourrianne, Mehdi Bouzid, Emanuela Del Gado, Hélène Delanoë-Ayari, Kasra Farain, Suzanne Fielding, Matthias Fuchs, Jasper van der Gucht, Silke Henkes, Maziyar Jalaal, Yogesh M. Joshi, Anaël Lemaître, Robert L. Leheny, Sébastien Manneville, Kirsten Martens, et al. (15 additional authors not shown)

Abstract: Soft amorphous materials are viscoelastic solids ubiquitously found around us, from clays and cementitious pastes to emulsions and physical gels encountered in food or biomedical engineering. Under an external deformation, these materials undergo a noteworthy transition from a solid to a liquid state that reshapes the material microstructure. This yielding transition was the main theme of a workshop held from January 9 to 13, 2023 at the Lorentz Center in Leiden. The manuscript presented here offers a critical perspective on the subject, synthesizing insights from the various brainstorming sessions and informal discussions that unfolded during this week of vibrant exchange of ideas. The result of these exchanges takes the form of a series of open questions that represent outstanding experimental, numerical, and theoretical challenges to be tackled in the near future.

Submitted 21 December, 2023; originally announced December 2023.

Comments: 21 pages, 7 figures, perspective on the workshop 'Yield stress and fluidization in brittle and ductile amorphous systems' held from January 9 to 13, 2023 at the Lorentz Center in Leiden

arXiv:2307.13343 [pdf, other]

On-Device Speaker Anonymization of Acoustic Embeddings for ASR based on Flexible Location Gradient Reversal Layer

Authors: Md Asif Jalal, Pablo Peso Parada, Jisi Zhang, Karthikeyan Saravanan, Mete Ozay, Myoungji Han, Jung In Lee, Seokyeong Jung

Abstract: Smart devices serviced by large-scale AI models necessitate user data transfer to the cloud for inference. For speech applications, this means transferring private user information, e.g., speaker identity. Our paper proposes a privacy-enhancing framework that targets speaker identity anonymization while preserving speech recognition accuracy for our downstream task: Automatic Speech Recognition (ASR). The proposed framework attaches flexible gradient reversal based speaker adversarial layers to target layers within an ASR model, where speaker adversarial training anonymizes acoustic embeddings generated by the targeted layers to remove speaker identity. We propose on-device deployment by execution of initial layers of the ASR model, and transmitting anonymized embeddings to the cloud, where the rest of the model is executed while preserving privacy. Experimental results show that our method efficiently reduces speaker recognition relative accuracy by 33%, and improves ASR performance by achieving 6.2% relative Word Error Rate (WER) reduction.

Submitted 25 July, 2023; originally announced July 2023.

Comments: Proceedings of INTERSPEECH 2023

arXiv:2307.01734 [pdf, other]

Billiards with Spatial Memory

Authors: Thijs Albers, Stijn Delnoij, Nico Schramma, Maziyar Jalaal

Abstract: Many classes of active matter develop spatial memory by encoding information in space, leading to complex pattern formation. It has been proposed that spatial memory can lead to more efficient navigation and collective behaviour in biological systems and influence the fate of synthetic systems. This raises important questions about the fundamental properties of dynamical systems with spatial memory. We present a framework based on mathematical billiards in which particles remember their past trajectories and react to them. Despite the simplicity of its fundamental deterministic rules, such a system is strongly non-ergodic and exhibits highly intermittent statistics, manifesting in complex pattern formation. We show how these self-memory-induced complexities emerge from the temporal change of topology and the consequent chaos in the system. We study the fundamental properties of these billiards and particularly the long-time behaviour when the particles are self-trapped in an arrested state. We exploit numerical simulations of several millions of particles to explore pattern formation and the corresponding statistics in polygonal billiards of different geometries. Our work illustrates how the dynamics of a single-body system can dramatically change when particles feature spatial memory and provides a scheme to further explore systems with complex memory kernels.

Submitted 4 July, 2023; originally announced July 2023.

Comments: 11 pages, 6 figures

arXiv:2306.17500 [pdf, other]

Empirical Interpretation of the Relationship Between Speech Acoustic Context and Emotion Recognition

Authors: Anna Ollerenshaw, Md Asif Jalal, Rosanna Milner, Thomas Hain

Abstract: Speech emotion recognition (SER) is vital for obtaining emotional intelligence and understanding the contextual meaning of speech. Variations of consonant-vowel (CV) phonemic boundaries can enrich acoustic context with linguistic cues, which impacts SER. In practice, speech emotions are treated as single labels over an acoustic segment for a given time duration. However, phone boundaries within speech are not discrete events, therefore the perceived emotion state should also be distributed over potentially continuous time-windows. This research explores the implication of acoustic context and phone boundaries on local markers for SER using an attention-based approach. The benefits of using a distributed approach to speech emotion understanding are supported by the results of cross-corpora analysis experiments. In these experiments, phones and words are mapped to the attention vectors along with the fundamental frequency to observe the overlapping distributions and thereby the relationship between acoustic context and emotion. This work aims to bridge psycholinguistic theory research with computational modelling for SER.

Submitted 30 June, 2023; originally announced June 2023.

arXiv:2306.06640 [pdf, other]

Elasto-viscoplastic Spreading: from Plastocapillarity to Elastocapillarity

Authors: Hugo L. Franc, Maziyar Jalaal, Cassio M. Oishi

Abstract: We study the spreading of elastoviscoplastic (EVP) droplets under surface tension effects. The non-Newtonian material flows like a viscoelastic liquid above the yield stress and behaves like a viscoelastic solid below it. Hence, the droplet initially flows under surface tension forces but eventually reaches a final equilibrium shape when the stress everywhere inside the droplet falls below the resisting rheological stresses. We use numerical simulations and combine the Volume-of-Fluid (VOF) method with an EVP constitutive model to systematically study the dynamics of spreading and the final shape of the droplets. The spreading process examined in this study finds applications in coating, droplet-based inkjet printing, and 3D printing, where complex fluids such as paints, thermoplastic filaments, or bio-inks are deposited onto surfaces. Additionally, the computational framework enables the study of a wide range of multiphase interfacial phenomena, from elastocapillarity to plastocapillarity.

Submitted 11 June, 2023; originally announced June 2023.

Comments: 15 pages, 14 figures

arXiv:2303.13442 [pdf, other]

Arrested on heating: controlling the motility of active droplets by temperature

Authors: Prashanth Ramesh, Yibo Chen, Petra Räder, Svenja Morsbach, Maziyar Jalaal, Corinna C Maass

Abstract: Self-propelling active matter relies on the conversion of energy from the undirected, nanoscopic scale to directed, macroscopic motion. One of the challenges in the design of synthetic active matter lies in the control of dynamic states, or motility gaits. Here, we present an experimental system of self-propelling droplets with thermally controllable and reversible dynamic states, from unsteady over meandering to persistent to arrested motion. These states depend on the Péclet number of the molecular process powering the motion, which we can tune by using a temperature-sensitive mixture of surfactants as a fuel medium. We quantify the droplet dynamics by analysing flow and chemical fields for the individual states, comparing them to canonical models for autophoretic particles. In the context of these models, we experimentally observe, in situ, the fundamental first broken symmetry that translates an isotropic, immotile base state to self-propelled motility.

Submitted 4 November, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

arXiv:2303.00550 [pdf, other]

Towards domain generalisation in ASR with elitist sampling and ensemble knowledge distillation

Authors: Rehan Ahmad, Md Asif Jalal, Muhammad Umar Farooq, Anna Ollerenshaw, Thomas Hain

Abstract: Knowledge distillation has widely been used for model compression and domain adaptation for speech applications. In the presence of multiple teachers, knowledge can easily be transferred to the student by averaging the models' outputs. However, previous research shows that the student does not adapt well with such a combination. This paper proposes to use an elitist sampling strategy at the output of ensemble teacher models to select the best-decoded utterance generated by completely out-of-domain teacher models for generalizing to an unseen domain. The teacher models are trained on AMI, LibriSpeech and WSJ while the student is adapted for the Switchboard data. The results show that with the selection strategy based on the individual models' posteriors the student model achieves a better WER compared to all the teachers and baselines with a minimum absolute improvement of about 8.4 percent. Furthermore, insights into model adaptation with out-of-domain data are also studied via correlation analysis.

Submitted 1 March, 2023; originally announced March 2023.

arXiv:2212.13306 [pdf, other]

Viscoplastic Lines: Printing a Single Filament of Yield Stress Material on a Surface

Authors: Jesse van der Klok, Daniël Tieman, Maziyar Jalaal

Abstract: This study presents the spreading of a single filament of a yield stress (viscoplastic) fluid extruded onto a pre-wetted solid surface. The filaments spread laterally under surface tension forces until they reach a final equilibrium shape when the yield stress dominates. We use a simple experimental setup to print the filaments on a moving surface and measure their final width using optical coherence tomography. Additionally, we present a scaling law for the final width and determine the corresponding pre-factor using asymptotic analysis. We then analyse the level of agreement between the theory and experiments and discuss the possible origins of discrepancies. The process studied here has applications in extrusion-based thermoplastic and bio-3D printing.

Submitted 26 December, 2022; originally announced December 2022.

arXiv:2211.02000 [pdf, other]

Dynamic Kernels and Channel Attention for Low Resource Speaker Verification

Authors: Anna Ollerenshaw, Md Asif Jalal, Thomas Hain

Abstract: State-of-the-art speaker verification frameworks have typically focused on developing increasingly deeper (more layers) and wider (more channels) models to improve their verification performance. Instead, this paper proposes an approach to increase the model resolution capability using attention-based dynamic kernels in a convolutional neural network to adapt the model parameters to be feature-conditioned. The attention weights on the kernels are further distilled by channel attention and multi-layer feature aggregation to learn global features from speech. This approach provides an efficient solution to improving representation capacity with lower data resources. This is due to the self-adaptation to inputs of the structures of the model parameters. The proposed dynamic convolutional model achieved 1.62% EER and 0.18 miniDCF on the VoxCeleb1 test set and has a 17% relative improvement compared to the ECAPA-TDNN using the same training resources.

Submitted 27 February, 2023; v1 submitted 3 November, 2022; originally announced November 2022.

arXiv:2211.01993 [pdf, other]

Probing Statistical Representations For End-To-End ASR

Authors: Anna Ollerenshaw, Md Asif Jalal, Thomas Hain

Abstract: End-to-End automatic speech recognition (ASR) models aim to learn a generalised speech representation to perform recognition. In this domain there is little research to analyse internal representation dependencies and their relationship to modelling approaches. This paper investigates cross-domain language model dependencies within transformer architectures using SVCCA and uses these insights to exploit modelling approaches. It was found that specific neural representations within the transformer layers exhibit correlated behaviour which impacts recognition performance. Altogether, this work provides analysis of the modelling approaches affecting contextual dependencies and ASR performance, and can be used to create or adapt better performing End-to-End ASR models and also for downstream tasks.

Submitted 3 November, 2022; originally announced November 2022.

Comments: Submitted to ICASSP 2023

arXiv:2210.03228 [pdf, other]

Interfacial aggregation of self-propelled Janus colloids in sessile droplets

Authors: Maziyar Jalaal, Borge ten Hagen, Hai le The, Christian Diddens, Detlef Lohse, Alvaro Marin

Abstract: Living microorganisms in confined systems typically experience an affinity to populate boundaries. The reason for such affinity to interfaces can be a combination of their directed motion and hydrodynamic interactions at distances larger than their own size. Here we will show that self-propelled Janus particles (polystyrene particles partially coated with platinum) immersed in droplets of water and hydrogen peroxide tend to accumulate in the vicinity of the liquid/gas interface. Interestingly, the interfacial accumulation occurs despite the presence of an evaporation-driven flow caused by a solutal Marangoni flow, which typically tends to redistribute the particles within the droplet's bulk. By performing additional experiments with passive colloids (flow tracers) and comparing with numerical simulations for both particle active motion and the fluid flow, we disentangle the dominating mechanisms behind the observed interfacial particle accumulation. These results allow us to make an analogy between active Janus particles and some biological microswimmers concerning how they interact with their environment.

Submitted 1 December, 2022; v1 submitted 6 October, 2022; originally announced October 2022.

Comments: 9 pages, 7 figures

arXiv:2207.07928 [pdf, other]

Elastocapillary Worthington jets

Authors: Uddalok Sen, Detlef Lohse, Maziyar Jalaal

Abstract: The retraction of an impacting droplet on a non-wetting substrate is often associated with the formation of a Worthington jet, which is fed by the retracting liquid. A non-Newtonian rheology of the liquid is known to affect the retraction of the impacting droplet. Here we present a novel phenomenon related to the impact of viscoelastic droplets on non-wettable substrates. We reveal that the viscoelasticity of the liquid results in an elastocapillary regime in the stretching Worthington jet, distinguished by a pinned contact line and a slender jet that does not detach from the droplet. We identify the impact conditions, in the Weber number–Deborah number phase space, for observing these elastocapillary Worthington jets. Such jets exhibit an effectively nearly linear (in time) variation of the strain rate. Upon further extension, the jet exhibits beads-on-a-string structures, characteristic of the elastocapillary thinning of slender viscoelastic liquid filaments. The elastocapillary Worthington jet is not only relevant for a droplet impact on a solid substrate scenario, but can also be expected in other configurations where a Worthington jet is observed for viscoelastic liquids, such as drop impact on a liquid pool and bubble bursting at an interface.

Submitted 17 January, 2023; v1 submitted 16 July, 2022; originally announced July 2022.

arXiv:2207.02104 [pdf, ps, other]

doi 10.1109/ASRU46091.2019.9003838

A cross-corpus study on speech emotion recognition

Authors: Rosanna Milner, Md Asif Jalal, Raymond W. M. Ng, Thomas Hain

Abstract: For speech emotion datasets, it has been difficult to acquire large quantities of reliable data and acted emotions may be over the top compared to less expressive emotions displayed in everyday life. Lately, larger datasets with natural emotions have been created. Instead of ignoring smaller, acted datasets, this study investigates whether information learnt from acted emotions is useful for detecting natural emotions. Cross-corpus research has mostly considered cross-lingual and even cross-age datasets, and difficulties arise from different methods of annotating emotions causing a drop in performance. To be consistent, four adult English datasets covering acted, elicited and natural emotions are considered. A state-of-the-art model is proposed to accurately investigate the degradation of performance. The system involves a bi-directional LSTM with an attention mechanism to classify emotions across datasets. Experiments study the effects of training models in a cross-corpus and multi-domain fashion and results show the transfer of information is not successful. Out-of-domain models, followed by adapting to the missing dataset, and domain adversarial training (DAT) are shown to be more suitable to generalising to emotions across datasets. This shows positive information transfer from acted datasets to those with more natural emotions and the benefits from training on different corpora.

Submitted 5 July, 2022; originally announced July 2022.

Comments: ASRU 2019

Journal ref: IEEE Workshop on Automatic Speech Recognition and Understanding 2019

arXiv:2206.14595 [pdf, other]

doi 10.1017/jfm.2023.176

Sessile drop evaporation in a gap -- crossover between diffusion-limited and phase transition-limited regime

Authors: S. Hartmann, C. Diddens, M. Jalaal, U. Thiele

Abstract: We consider the time evolution of a sessile drop of volatile partially wetting liquid on a rigid solid substrate. Thereby, the drop evaporates under strong confinement, namely, it sits on one of the two parallel plates that form a narrow gap. First, we develop an efficient mesoscopic thin-film description in gradient dynamics form. It couples the diffusive dynamics of the vertically averaged vapour density in the narrow gap to an evolution equation for the profile of the volatile drop. The underlying free energy functional incorporates wetting, interface and bulk energies of the liquid and gas entropy. The model allows us to investigate the transition between diffusion-limited and phase transition-limited evaporation for shallow droplets. Its gradient dynamics character also allows for a full-curvature formulation. Second, we compare results obtained with the mesoscopic model to corresponding direct numerical simulations solving the Stokes equation for the drop coupled to the diffusion equation for the vapour as well as to selected experiments. In passing, we discuss the influence of contact line pinning.

Submitted 29 June, 2022; originally announced June 2022.

Journal ref: J. Fluid Mech. 960, A32 (2023)

arXiv:2205.09456 [pdf, other]

doi 10.21437/Interspeech.2021-1516

Insights on Neural Representations for End-to-End Speech Recognition

Authors: Anna Ollerenshaw, Md Asif Jalal, Thomas Hain

Abstract: End-to-end automatic speech recognition (ASR) models aim to learn a generalised speech representation. However, there are limited tools available to understand the internal functions and the effect of hierarchical dependencies within the model architecture. It is crucial to understand the correlations between the layer-wise representations, to derive insights on the relationship between neural representations and performance. Previous investigations of network similarities using correlation analysis techniques have not been explored for End-to-End ASR models. This paper analyses and explores the internal dynamics between layers during training with CNN, LSTM and Transformer based approaches using Canonical correlation analysis (CCA) and centered kernel alignment (CKA) for the experiments. It was found that neural representations within CNN layers exhibit hierarchical correlation dependencies as layer depth increases but this is mostly limited to cases where neural representation correlates more closely. This behaviour is not observed in LSTM architecture, however there is a bottom-up pattern observed across the training process, while Transformer encoder layers exhibit irregular coefficiency correlation as neural depth increases. Altogether, these results provide new insights into the role that neural architectures have upon speech recognition performance. More specifically, these techniques can be used as indicators to build better performing speech recognition models.

Submitted 19 May, 2022; originally announced May 2022.

Comments: Submitted to Interspeech 2021

Journal ref: Proc. Interspeech 2021, 4079-4083

arXiv:2204.07574 [pdf, other]

doi 10.1063/5.0105624

Spreading of droplets under various gravitational accelerations

Authors: Olfa D'Angelo, Felix Kuthe, Kasper van Nieuwland, Clint Ederveen Janssen, Thomas Voigtmann, Maziyar Jalaal

Abstract: We describe a setup to perform systematic studies on the spreading of droplets of complex fluids under microgravity conditions. Tweaking the gravitational acceleration under which droplets are deposited provides access to different regimes of the spreading dynamics, quantified through the Bond number. In particular, microgravity allows to form large droplets while remaining in the regime where surface tension effects and internal driving stresses are predominant over hydrostatic forces. The VIP-DROP2 experimental module provides a versatile platform to study a wide range of complex fluids through the deposition of axisymmetric droplets. The module offers the possibility to deposit droplets on a precursor layer, which can be composed of the same or of a different fluid. Besides, it allows to deposit four droplets simultaneously, while conducting shadowgraphy on all of them, and observing either the flow field (through particle image velocimetry), or the stress distribution inside the droplet in the case of stress birefringent fluids. Developed for a drop tower catapult system, it is designed to withstand a vertical acceleration of up to 30 times Earth's gravitational acceleration in the downwards direction, and can operate remotely, under microgravity conditions. We provide a detailed description of the module, and exemplary data analysis for droplets spreading on-ground and in microgravity.

Submitted 27 October, 2022; v1 submitted 15 April, 2022; originally announced April 2022.

Comments: Review of Scientific Instruments (2022)

arXiv:2204.07386 [pdf, other]

doi 10.1073/pnas.2216497120

Chloroplasts in plant cells show active glassy behavior under low light conditions

Authors: Nico Schramma, Cintia Perugachi Israëls, Maziyar Jalaal

Abstract: Plants have developed intricate mechanisms to adapt to changing light conditions. Besides photo- and helio- tropism -- the differential growth towards light and the diurnal motion with respect to sunlight -- chloroplast motion acts as a fast mechanism to change the intracellular structure of leaf cells. While chloroplasts move towards the sides of the plant cell to avoid strong light, they accumulate and spread out into a layer on the bottom of the cell at low light to increase the light absorption efficiency. Although the motion of chloroplasts has been studied for over a century, the collective organelle-motion leading to light adapting self-organized structures remains elusive. Here we study the active motion of chloroplasts under dim light conditions, leading to an accumulation in a densely packed quasi-2D layer. We observe burst-like re-arrangements and show that these dynamics resemble colloidal systems close to the glass transition by tracking individual chloroplasts. Furthermore, we provide a minimal mathematical model to uncover relevant system parameters controlling the stability of the dense configuration of chloroplasts. Our study suggests that the meta-stable caging close to the glass-transition in the chloroplast mono-layer serves a physiological relevance. Chloroplasts remain in a spread-out configuration to increase the light uptake, but can easily fluidize when the activity is increased to efficiently re-arrange the structure towards an avoidance state.
Our research opens new questions about the role that dynamical phase transitions could play in self-organized intracellular responses of plant cells towards environmental cues.

Submitted 28 September, 2022; v1 submitted 15 April, 2022; originally announced April 2022.

arXiv:2107.04821 [pdf, other]

doi 10.1017/jfm.2021.676

Crown formation from a cavitating bubble close to a free surface

Authors: Youssef Saade, Maziyar Jalaal, Andrea Prosperetti, Detlef Lohse

Abstract: A rapidly growing bubble close to a free surface induces jetting: a central jet protruding outwards and a crown surrounding it at later stages. While the formation mechanism of the central jet is known and documented, that of the crown remains unsettled. We perform axisymmetric simulations of the problem using the free software program basilisk, where a finite-volume compressible solver has been implemented, that uses a geometric Volume-of-Fluid method (VoF) for the tracking of the interface. We show that the mechanism of crown formation is a combination of a pressure distortion over the curved interface, inducing flow focusing, and of a flow reversal, caused by the second expansion of the toroidal bubble that drives the crown. The work culminates in a parametric study with the Weber number, the Reynolds number, the pressure ratio and the dimensionless bubble distance to the free surface as control parameters. Their effects on both the central jet and the crown are explored. For high Weber numbers, we observe the formation of weaker "secondary crowns", highly correlated with the third oscillation cycle of the bubble.

Submitted 10 July, 2021; originally announced July 2021.

Journal ref: J. Fluid Mech. 926 (2021) A5

arXiv:2102.11420 [pdf, other]

Investigating Deep Neural Structures and their Interpretability in the Domain of Voice Conversion

Authors: Samuel J. Broughton, Md Asif Jalal, Roger K. Moore

Abstract: Generative Adversarial Networks (GANs) are machine learning networks based around creating synthetic data. Voice Conversion (VC) is a subset of voice translation that involves translating the paralinguistic features of a source speaker to a target speaker while preserving the linguistic information. The aim of non-parallel conditional GANs for VC is to translate an acoustic speech feature sequence from one domain to another without the use of paired data. In the study reported here, we investigated the interpretability of state-of-the-art implementations of non-parallel GANs in the domain of VC. We show that the learned representations in the repeating layers of a particular GAN architecture remain close to their original random initialised parameters, demonstrating that it is the number of repeating layers that is more responsible for the quality of the output. We also analysed the learned representations of a model trained on one particular dataset when used during transfer learning on another dataset. This showed extremely high levels of similarity across the entire network. Together, these results provide new insight into how the learned representations of deep generative networks change during learning and the importance in the number of layers.

Submitted 22 February, 2021; originally announced February 2021.

Comments: For demo, see https://samuelbroughton.github.io/interpretability-demo-2020/

arXiv:2101.07744 [pdf, other]

doi 10.1017/jfm.2021.489

Bursting Bubble in a Viscoplastic Medium

Authors: Vatsal Sanjay, Detlef Lohse, Maziyar Jalaal

Abstract: When a rising bubble in a Newtonian liquid reaches the liquid-air interface, it can burst, leading to the formation of capillary waves and a jet on the surface. Here, we numerically study this phenomenon in a yield stress fluid. We show how viscoplasticity controls the fate of these capillary waves and their interaction at the bottom of the cavity. Unlike Newtonian liquids, the free surface converges to a non-flat final equilibrium shape once the driving stresses inside the pool fall below the yield stress. Details of the dynamics, including the flow's energy budgets, are discussed. The work culminates in a regime map with four main regimes with different characteristic behaviours.

Submitted 5 July, 2021; v1 submitted 19 January, 2021; originally announced January 2021.

Comments: Please find the supplementary videos here: https://youtube.com/playlist?list=PLf5C5HCrvhLFETl6iaRr21pzr5Xab1OCM

Journal ref: J. Fluid Mech. 922 (2021) A2

arXiv:2010.02894 [pdf, other]

doi 10.1017/jfm.2020.886

Spreading of viscoplastic droplets

Authors: Maziyar Jalaal, Boris Stoeber, Neil Balmforth

Abstract: The spreading under surface tension and gravity of a droplet of yield-stress fluid over a thin film of the same material is studied. The droplet converges to a final equilibrium shape once the driving stresses inside the droplet fall below the yield stress. Scaling laws are presented for the final radius and complemented with an asymptotic analysis for shallow droplets. Moreover, numerical simulations using the volume-of-fluid method and a regularized constitutive law, and experiments with an aqueous solution of Carbopol are presented.

Submitted 6 October, 2020; originally announced October 2020.

Comments: 18 pages, 12 figures

Journal ref: J. Fluid Mech. 914 (2021) A21

arXiv:2008.06974 [pdf, other]

OpenFraming: We brought the ML; you bring the data. Interact with your data and discover its frames

Authors: Alyssa Smith, David Assefa Tofu, Mona Jalal, Edward Edberg Halim, Yimeng Sun, Vidya Akavoor, Margrit Betke, Prakash Ishwar, Lei Guo, Derry Wijaya

Abstract: When journalists cover a news story, they can cover the story from multiple angles or perspectives. A news article written about COVID-19 for example, might focus on personal preventative actions such as mask-wearing, while another might focus on COVID-19's impact on the economy. These perspectives are called "frames," which when used may influence public perception and opinion of the issue. We introduce a Web-based system for analyzing and classifying frames in text documents. Our goal is to make effective tools for automatic frame discovery and labeling based on topic modeling and deep learning widely accessible to researchers from a diverse array of disciplines. To this end, we provide both state-of-the-art pre-trained frame classification models on various issues as well as a user-friendly pipeline for training novel classification models on user-provided corpora. Researchers can submit their documents and obtain frames of the documents. The degree of user involvement is flexible: they can run models that have been pre-trained on select issues; submit labeled documents and train a new model for frame classification; or submit unlabeled documents and obtain potential frames of the documents. The code making up our system is also open-sourced and well-documented, making the system transparent and expandable. The system is available on-line at http://www.openframing.org and via our GitHub page https://github.com/davidatbu/openFraming .

Submitted 16 August, 2020; originally announced August 2020.

Comments: 8 pages, 8 figures, EMNLP 2020 demonstration papers

arXiv:2008.05955 [pdf, other]

doi 10.1109/CVPRW.2019.00063

SIDOD: A Synthetic Image Dataset for 3D Object Pose Recognition with Distractors

Authors: Mona Jalal, Josef Spjut, Ben Boudaoud, Margrit Betke

Abstract: We present a new, publicly-available image dataset generated by the NVIDIA Deep Learning Data Synthesizer intended for use in object detection, pose estimation, and tracking applications. This dataset contains 144k stereo image pairs that synthetically combine 18 camera viewpoints of three photorealistic virtual environments with up to 10 objects (chosen randomly from the 21 object models of the YCB dataset [1]) and flying distractors. Object and camera pose, scene lighting, and quantity of objects and distractors were randomized. Each provided view includes RGB, depth, segmentation, and surface normal images, all pixel level. We describe our approach for domain randomization and provide insight into the decisions that produced the dataset.

Submitted 11 August, 2020; originally announced August 2020.

Comments: 3 pages, 4 figures, 1 table, Accepted at CVPR 2019 Workshop

arXiv:2008.05060 [pdf, other]

doi 10.1109/CVPR.2017.533

Online Graph Completion: Multivariate Signal Recovery in Computer Vision

Authors: Won Hwa Kim, Mona Jalal, Seongjae Hwang, Sterling C. Johnson, Vikas Singh

Abstract: The adoption of "human-in-the-loop" paradigms in computer vision and machine learning is leading to various applications where the actual data acquisition (e.g., human supervision) and the underlying inference algorithms are closely intertwined. While classical work in active learning provides effective solutions when the learning module involves classification and regression tasks, many practical issues such as partially observed measurements, financial constraints and even additional distributional or structural aspects of the data typically fall outside the scope of this treatment. For instance, with sequential acquisition of partial measurements of data that manifest as a matrix (or tensor), novel strategies for completion (or collaborative filtering) of the remaining entries have only been studied recently. Motivated by vision problems where we seek to annotate a large dataset of images via a crowdsourced platform or alternatively, complement results from a state-of-the-art object detector using human feedback, we study the "completion" problem defined on graphs, where requests for additional measurements must be made sequentially. We design the optimization model in the Fourier domain of the graph describing how ideas based on adaptive submodularity provide algorithms that work well in practice. On a large set of images collected from Imgur, we see promising results on images that are otherwise difficult to categorize. We also show applications to an experimental design problem in neuroimaging.

Submitted 11 August, 2020; originally announced August 2020.

Comments: 9 pages, 7 figures, CVPR 2017 Conference

arXiv:2005.12721 [pdf, other]

doi 10.1103/PhysRevX.11.011043

Emergence of bimodal motility in active droplets

Authors: Babak Vajdi Hokmabad, Ranabir Dey, Maziyar Jalaal, Devaditya Mohanty, Madina Almukambetova, Kyle A Baldwin, Detlef Lohse, Corinna C Maass

Abstract: To explore and react to their environment, living micro-swimmers have developed sophisticated strategies for locomotion - in particular, motility with multiple gaits. To understand the physical principles associated with such a behavioural variability, synthetic model systems capable of mimicking it are needed. Here, we demonstrate bimodal gait switching in autophoretic droplet swimmers. This minimal experimental system is isotropic at rest, a symmetry that can be spontaneously broken due to the nonlinear coupling between hydrodynamic and chemical fields, inducing a variety of flow patterns that lead to different propulsive modes. We report a dynamical transition from quasi-ballistic to bimodal chaotic motion, controlled by the viscosity of the swimming medium. By simultaneous visualisation of the chemical and hydrodynamic fields, supported quantitatively by an advection-diffusion model, we show that higher hydrodynamic modes become excitable with increasing viscosity, while the recurrent mode-switching is driven by the droplet's interaction with self-generated chemical gradients. We further demonstrate that this gradient interaction results in anomalous diffusive swimming akin to self-avoiding spatial exploration strategies observed in nature.

Submitted 6 March, 2021; v1 submitted 26 May, 2020; originally announced May 2020.

Journal ref: Phys. Rev. X 11, 011043 (2021)

arXiv:2003.08427 [pdf, other]

doi 10.1103/PhysRevLett.125.028102

Stress-Induced Dinoflagellate Bioluminescence at the Single Cell Level

Authors: Maziyar Jalaal, Nico Schramma, Antoine Dode, Helene de Maleprade, Christophe Raufaste, Raymond E. Goldstein

Abstract: One of the characteristic features of many marine dinoflagellates is their bioluminescence, which lights up nighttime breaking waves or seawater sliced by a ship's prow. While the internal biochemistry of light production by these microorganisms is well established, the manner by which fluid shear or mechanical forces trigger bioluminescence is still poorly understood. We report controlled measurements of the relation between mechanical stress and light production at the single-cell level, using high-speed imaging of micropipette-held cells of the marine dinoflagellate $Pyrocystis~lunula$ subjected to localized fluid flows or direct indentation. We find a viscoelastic response in which light intensity depends on both the amplitude and rate of deformation, consistent with the action of stretch-activated ion channels. A phenomenological model captures the experimental observations.

Submitted 18 March, 2020; originally announced March 2020.

Comments: 6 pages, 5 figures plus 4 pages of Supplementary Material with 4 figures; videos available on website of REG

Journal ref: Phys. Rev. Lett. 125, 028102 (2020)

arXiv:2002.05242 [pdf, other]

Leveraging Affect Transfer Learning for Behavior Prediction in an Intelligent Tutoring System

Authors: Nataniel Ruiz, Hao Yu, Danielle A. Allessio, Mona Jalal, Ajjen Joshi, Thomas Murray, John J. Magee, Jacob R. Whitehill, Vitaly Ablavsky, Ivon Arroyo, Beverly P. Woolf, Stan Sclaroff, Margrit Betke

Abstract: In this work, we propose a video-based transfer learning approach for predicting problem outcomes of students working with an intelligent tutoring system (ITS). By analyzing a student's face and gestures, our method predicts the outcome of a student answering a problem in an ITS from a video feed. Our work is motivated by the reasoning that the ability to predict such outcomes enables tutoring systems to adjust interventions, such as hints and encouragement, and to ultimately yield improved student learning. We collected a large labeled dataset of student interactions with an intelligent online math tutor consisting of 68 sessions, where 54 individual students solved 2,749 problems. The dataset is public and available at https://www.cs.bu.edu/faculty/betke/research/learning/ . Working with this dataset, our transfer-learning challenge was to design a representation in the source domain of pictures obtained "in the wild" for the task of facial expression analysis, and transferring this learned representation to the task of human behavior prediction in the domain of webcam videos of students in a classroom environment. We developed a novel facial affect representation and a user-personalized training scheme that unlocks the potential of this representation. We designed several variants of a recurrent neural network that models the temporal structure of video sequences of students solving math problems.
Our final model, named ATL-BP for Affect Transfer Learning for Behavior Prediction, achieves a relative increase in mean F-score of 50% over the state-of-the-art method on this new dataset.

Submitted 8 April, 2022; v1 submitted 12 February, 2020; originally announced February 2020.

Comments: Published at IEEE International Conference on Automatic Face and Gesture Recognition (FG), 2021 - Best Poster Award (4% award rate)

arXiv:2002.04181 [pdf, other]

Performance Comparison of Crowdworkers and NLP Tools on Named-Entity Recognition and Sentiment Analysis of Political Tweets

Authors: Mona Jalal, Kate K. Mays, Lei Guo, Margrit Betke

Abstract: We report results of a comparison of the accuracy of crowdworkers and seven Natural Language Processing (NLP) toolkits in solving two important NLP tasks, named-entity recognition (NER) and entity-level sentiment (ELS) analysis. We here focus on a challenging dataset, 1,000 political tweets that were collected during the U.S. presidential primary election in February 2016. Each tweet refers to at least one of four presidential candidates, i.e., four named entities. The groundtruth, established by experts in political communication, has entity-level sentiment information for each candidate mentioned in the tweet. We tested several commercial and open-source tools. Our experiments show that, for our dataset of political tweets, the most accurate NER system, Google Cloud NL, performed almost on par with crowdworkers, but the most accurate ELS analysis system, TensiStrength, did not match the accuracy of crowdworkers by a large margin of more than 30 percent points.

Submitted 11 August, 2020; v1 submitted 10 February, 2020; originally announced February 2020.

Comments: 4 pages, 1 figure, Accepted at WiNLP Workshop at NAACL 2018

arXiv:1909.00134 [pdf, other]

Scraping Social Media Photos Posted in Kenya and Elsewhere to Detect and Analyze Food Types

Authors: Kaihong Wang, Mona Jalal, Sankara Jefferson, Yi Zheng, Elaine O. Nsoesie, Margrit Betke

Abstract: Monitoring population-level changes in diet could be useful for education and for implementing interventions to improve health. Research has shown that data from social media sources can be used for monitoring dietary behavior. We propose a scrape-by-location methodology to create food image datasets from Instagram posts. We used it to collect 3.56 million images over a period of 20 days in March 2019. We also propose a scrape-by-keywords methodology and used it to scrape ~30,000 images and their captions of 38 Kenyan food types. We publish two datasets of 104,000 and 8,174 image/caption pairs, respectively. With the first dataset, Kenya104K, we train a Kenyan Food Classifier, called KenyanFC, to distinguish Kenyan food from non-food images posted in Kenya. We used the second dataset, KenyanFood13, to train a classifier KenyanFTR, short for Kenyan Food Type Recognizer, to recognize 13 popular food types in Kenya. The KenyanFTR is a multimodal deep neural network that can identify 13 types of Kenyan foods using both images and their corresponding captions. Experiments show that the average top-1 accuracy of KenyanFC is 99% over 10,400 tested Instagram images and of KenyanFTR is 81% over 8,174 tested data points. Ablation studies show that three of the 13 food types are particularly difficult to categorize based on image content only and that adding analysis of captions to the image analysis yields a classifier that is 9 percent points more accurate than a classifier that relies only on images. Our food trend analysis revealed that cakes and roasted meats were the most popular foods in photographs on Instagram in Kenya in March 2019.

Submitted 31 August, 2019; originally announced September 2019.

Comments: Another version of the paper was submitted to the ACM International Conference on Multimedia (ACMMM2019)

arXiv:1906.00290 [pdf, other]

Adaptive Online Learning for Gradient-Based Optimizers

Authors: Saeed Masoudian, Ali Arabzadeh, Mahdi Jafari Siavoshani, Milad Jalal, Alireza Amouzad

Abstract: As application demands for online convex optimization accelerate, the need for designing new methods that simultaneously cover a large class of convex functions and impose the lowest possible regret is highly rising. Known online optimization methods usually perform well only in specific settings, and their performance depends highly on the geometry of the decision space and cost functions. However, in practice, lack of such geometric information leads to confusion in using the appropriate algorithm. To address this issue, some adaptive methods have been proposed that focus on adaptively learning parameters such as step size, Lipschitz constant, and strong convexity coefficient, or on specific parametric families such as quadratic regularizers. In this work, we generalize these methods and propose a framework that competes with the best algorithm in a family of expert algorithms. Our framework includes many of the well-known adaptive methods including MetaGrad, MetaGrad+C, and Ader. We also introduce a second algorithm that computationally outperforms our first algorithm with at most a constant factor increase in regret. Finally, as a representative application of our proposed algorithm, we study the problem of learning the best regularizer from a family of regularizers for Online Mirror Descent. Empirically, we support our theoretical findings in the problem of learning the best regularizer on the simplex and $l_2$-ball in a multiclass learning problem.

Submitted 1 June, 2019; originally announced June 2019.

arXiv:1903.03797 [pdf, other]

doi 10.1017/jfm.2019.734

Ripples in Thin Films

Authors: Maziyar Jalaal, Carola Seyfert, Jacco Snoeijer

Abstract: Capillary ripples on thin viscous films are important features of coating and lubrication flows. Here we present experiments based on Digital Holographic Microscopy, measuring the morphology of capillary ripples ahead of a viscous drop spreading on a prewetted surface with a nanoscale resolution. Our experiments reveal that upon increasing the spreading velocity, the amplitude of the ripples first increases and subsequently decreases. Above a critical spreading velocity, the ripples even disappear completely and this transition is accompanied by a divergence of the ripple wavelength. These observations are explained quantitatively using linear wave analysis, beyond the usual lubrication approximation, illustrating that new phenomena arise when the capillary number becomes order unity.

Submitted 9 March, 2019; originally announced March 2019.

Comments: 9 pages, 4 figures

arXiv:1810.01771 [pdf, other]

SAVOIAS: A Diverse, Multi-Category Visual Complexity Dataset

Authors: Elham Saraee, Mona Jalal, Margrit Betke

Abstract: Visual complexity identifies the level of intricacy and details in an image or the level of difficulty to describe the image. It is an important concept in a variety of areas such as cognitive psychology, computer vision and visualization, and advertisement. Yet, efforts to create large, downloadable image datasets with diverse content and unbiased groundtruthing are lacking. In this work, we introduce Savoias, a visual complexity dataset that comprises more than 1,400 images from seven image categories relevant to the above research areas, namely Scenes, Advertisements, Visualization and infographics, Objects, Interior design, Art, and Suprematism. The images in each category portray diverse characteristics including various low-level and high-level features, objects, backgrounds, textures and patterns, text, and graphics. The ground truth for Savoias is obtained by crowdsourcing more than 37,000 pairwise comparisons of images using the forced-choice methodology and with more than 1,600 contributors. The resulting relative scores are then converted to absolute visual complexity scores using the Bradley-Terry method and matrix completion. When applying five state-of-the-art algorithms to analyze the visual complexity of the images in the Savoias dataset, we found that the scores obtained from these baseline tools only correlate well with crowdsourced labels for abstract patterns in the Suprematism category (Pearson correlation r=0.84). For the other categories, in particular, the objects and advertisement categories, low correlation coefficients were revealed (r=0.3 and 0.56, respectively). These findings suggest that (1) state-of-the-art approaches are mostly insufficient and (2) Savoias enables category-specific method development, which is likely to improve the impact of visual complexity analysis on specific application areas, including computer vision.

Submitted 3 October, 2018; originally announced October 2018.

Comments: 10 pages, 4 figures, 4 tables

arXiv:1110.3374 [pdf, other]

A falling droplet as it falls apart

Authors: M. Jalaal, M. Schwalbach, K. Mehravaran

Abstract: Using direct numerical simulations, the fragmentation of falling liquid droplets in a quiescent media is studied. Three simulations with different Eotvos numbers were performed. An adaptive volume of fluid (VOF) method based on octree meshing is used, providing a notable reduction of computational cost. The current video includes 4 main parts describing the fragmentation of the falling droplet.

Submitted 14 October, 2011; originally announced October 2011.

Comments: 2 pages, video for APSDFD2011

Showing 1–39 of 39 results for author: Jalal, M