Pedaling, Fast and Slow: The Race Towards an Optimized Power Strategy

Authors: Steven DiSilvio, Anthony Ozerov, Leon Zhou

Abstract: With the advent of power meters allowing cyclists to precisely track their power output throughout a race, devising optimal power output strategies for races has become increasingly important in competitive cycling. To do so, the track, weather, and individual cyclist's abilities must all be considered. We propose differential equation models of fatigue and kinematics to simulate the performance of such strategies, and an innovative optimization algorithm to find the optimal strategy. Our model for fatigue translates a cyclist's power curve (obtained by fitting the Omni-Power Duration Model to power curve data) into a differential equation to capture which power output strategies are feasible. Our kinematics model calculates the forces on the rider and, together with the power output, models the cyclist's velocity and position via a system of differential equations. Using track data, including the slope of the track and velocity of the wind, the model accurately computes race times given a power output strategy on the exact track being raced. To make power strategy optimization computationally tractable, we split the track into segments based on changes in slope and discretize the power output levels. As the space of possible strategies is large, we vectorize the differential equation model for efficient numerical integration of many simulations at once and develop a parallelized Tree Exploration with Monte-Carlo Evaluation algorithm. The algorithm is efficient, running in $O(ab\sqrt{n})$ time and $O(n)$ space where $n$ is the number of simulations done for each choice, $a$ is the number of segments, and $b$ is the number of discrete power output levels. We present results of this optimization for several different tracks and athletes. As an example, the model's time for Filippo Ganna in Tokyo 2020 differs from his real time by just 18%, supporting our model's efficacy.

Submitted 30 November, 2023; originally announced December 2023.

arXiv:2311.17343 [pdf, other]

Forays into Fungal Fighting and Mycological Moisture Modeling

Authors: John Blackwelder, Steven DiSilvio, Anthony Ozerov

Abstract: As the impending consequences of climate change loom over the Earth, it has become vital for researchers to understand the role microorganisms play in this process. In this paper, we examine how environmental factors, including moisture levels and temperature, affect the expression of certain fungal characteristics on a microscale, and how these in turn affect fungal biodiversity and ecosystem decomposition rates over time. We first present a differential equation model to understand how the distribution of different fungal isolates depends on regional moisture levels. We introduce both slow and sudden variations into the environment in order to represent the various ways climate change will impact fungal ecosystems. This model demonstrates that increased variability in moisture (both short-term and long-term) increases biodiversity and that fungal populations will shift towards more stress-tolerant fungi as aridity increases. The model further suggests the lack of any direct link between biodiversity and decomposition rates. To better describe fungal competition with respect to space, we develop a local agent-based model (ABM). Unlike the previous model, our ABM focuses on individuals, tracking each fungus and the result of its interactions. Our ABM also features a more accurate spatial combat system, allowing us to precisely discern the influence of fungal interactions on the environment. This model corroborates the results of the differential equation model and further suggests that moisture, through its link with temperature and effects on fungal population, also plays a strong role in determining fungal decomposition rates. Together, these models suggest that climate change, which portends increasing variability in regional conditions and higher average temperatures worldwide, will lead to an increase in both wood decomposition rates and, independently, fungal biodiversity.

Submitted 28 November, 2023; originally announced November 2023.

arXiv:2311.16585 [pdf, other]

Sorting Out New York City's Trash Problem

Authors: Steven DiSilvio, Anthony Ozerov, Leon Zhou

Abstract: To reduce waste and improve public health and sanitation in New York City, innovative policies tailored to the city's unique urban landscape are necessary. The first program we propose is the Dumpster and Compost Accessibility Program. This program is affordable and utilizes dumpsters placed near fire hydrants to keep waste off the street without eliminating parking spaces. It also includes legal changes and the provision of compost bins to single/two-family households, which together will increase composting rates. The second program is the Pay-As-You-Throw Program. This requires New Yorkers living in single/two-family households to purchase stickers for each refuse bag they have collected by the city, incentivizing them to sort out compostable waste and recyclables. We conduct a weighted multi-objective optimization to determine the optimal sticker price based on the City's priorities. Roughly in proportion to the price, this program will increase diversion rates and decrease the net costs to New York City's Department of Sanitation. In conjunction, these two programs will improve NYC's diversion rates, eliminate garbage bags from the streets, and potentially save New York City money.

Submitted 28 November, 2023; originally announced November 2023.

arXiv:2110.02724 [pdf, other]

ParaDiS: Parallelly Distributable Slimmable Neural Networks

Authors: Alexey Ozerov, Anne Lambert, Suresh Kirthi Kumaraswamy

Abstract: When several limited-power devices are available, one of the most efficient ways to make the most of these resources, while reducing the processing latency and communication load, is to run several neural sub-networks in parallel and to fuse the results at the end of processing. However, such a combination of sub-networks must be trained specifically for each particular configuration of devices (characterized by the number of devices and their capacities), which may vary over different model deployments and even within the same deployment. In this work we introduce parallelly distributable slimmable (ParaDiS) neural networks that are splittable in parallel among various device configurations without retraining. While inspired by slimmable networks allowing instant adaptation to resources on just one device, ParaDiS networks consist of several multi-device distributable configurations, or switches, that strongly share the parameters between them. We evaluate the ParaDiS framework on the MobileNet v1 and ResNet-50 architectures on the ImageNet classification task and on the WDSR architecture for the image super-resolution task. We show that ParaDiS switches achieve similar or better accuracy than the individual models, i.e., distributed models of the same structure trained individually. Moreover, we show that, as compared to universally slimmable networks that are not distributable, the accuracy of distributable ParaDiS switches either does not drop at all or drops by a maximum of 1% only in the worst cases. Finally, once distributed over several devices, ParaDiS greatly outperforms slimmable models.

Submitted 29 November, 2021; v1 submitted 6 October, 2021; originally announced October 2021.

arXiv:2110.00879 [pdf, other]

Traders in a Strange Land: Agent-based discrete-event market simulation of the Figgie card game

Authors: Steven DiSilvio, Yu Luo, Anthony Ozerov

Abstract: Figgie is a card game that approximates open-outcry commodities trading. We design strategies for Figgie and study their performance and the resulting market behavior. To do this, we develop a flexible agent-based discrete-event market simulation in which agents operating under our strategies can play Figgie. Our simulation builds upon previous work by simulating latencies between agents and the market in a novel and efficient way. The fundamentalist strategy we develop takes advantage of Figgie's unique notion of asset value, and is, on average, the profit-maximizing strategy in all combinations of agent strategies tested. We develop a strategy, the "bottom-feeder", which estimates value by observing orders sent by other agents, and find that it limits the success of fundamentalists. We also find that chartist strategies implemented, including one from the literature, fail by going into feedback loops in the small Figgie market. We further develop a bootstrap method for statistically comparing strategies in a zero-sum game. Our results demonstrate the wide-ranging applicability of agent-based discrete-event simulations in studying markets.

Submitted 2 October, 2021; originally announced October 2021.

ACM Class: I.6; J.4

arXiv:2102.05749 [pdf, ps, other]

doi 10.1109/ICASSP39728.2021.9414235

Self-Supervised VQ-VAE for One-Shot Music Style Transfer

Authors: Ondřej Cífka, Alexey Ozerov, Umut Şimşekli, Gaël Richard

Abstract: Neural style transfer, which applies the artistic style of one image to another, became one of the most widely showcased computer vision applications shortly after its introduction. In contrast, related tasks in the music audio domain remained, until recently, largely untackled. While several style conversion methods tailored to musical signals have been proposed, most lack the 'one-shot' capability of classical image style transfer algorithms. On the other hand, the results of existing one-shot audio style transfer methods on musical inputs are not as compelling. In this work, we are specifically interested in the problem of one-shot timbre transfer. We present a novel method for this task, based on an extension of the vector-quantized variational autoencoder (VQ-VAE), along with a simple self-supervised learning strategy designed to obtain disentangled representations of timbre and pitch. We evaluate the method using a set of objective metrics and show that it is able to outperform selected baselines.

Submitted 10 June, 2021; v1 submitted 10 February, 2021; originally announced February 2021.

Comments: ICASSP 2021. Website: https://adasp.telecom-paris.fr/s/ss-vq-vae

Journal ref: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (2021) 96-100

arXiv:2007.07663 [pdf, other]

doi 10.1109/JSTSP.2020.3042071

A survey and an extensive evaluation of popular audio declipping methods

Authors: Pavel Záviška, Pavel Rajmic, Alexey Ozerov, Lucas Rencker

Abstract: Dynamic range limitations in signal processing often lead to clipping, or saturation, in signals. The task of audio declipping is estimating the original audio signal, given its clipped measurements, and has attracted much interest in recent years. Audio declipping algorithms often make assumptions about the underlying signal, such as sparsity or low-rankness, and about the measurement system. In this paper, we provide an extensive review of audio declipping algorithms proposed in the literature. For each algorithm, we present assumptions that are made about the audio signal, the modeling domain, and the optimization algorithm. Furthermore, we provide an extensive numerical evaluation of popular declipping algorithms, on real audio data. We evaluate each algorithm in terms of the Signal-to-Distortion Ratio, and also using perceptual metrics of sound quality. The article is accompanied by a repository containing the evaluated methods.

Submitted 4 January, 2021; v1 submitted 15 July, 2020; originally announced July 2020.

Journal ref: IEEE Journal of Selected Topics in Signal Processing, vol. 15, no. 1, pp. 5-24, Jan. 2021

arXiv:1811.04000 [pdf, other]

Identify, locate and separate: Audio-visual object extraction in large video collections using weak supervision

Authors: Sanjeel Parekh, Alexey Ozerov, Slim Essid, Ngoc Duong, Patrick Pérez, Gaël Richard

Abstract: We tackle the problem of audiovisual scene analysis for weakly-labeled data. To this end, we build upon our previous audiovisual representation learning framework to perform object classification in noisy acoustic environments and integrate audio source enhancement capability. This is made possible by a novel use of non-negative matrix factorization for the audio modality. Our approach is founded on the multiple instance learning paradigm. Its effectiveness is established through experiments over a challenging dataset of music instrument performance videos. We also show encouraging visual object localization results.

Submitted 9 November, 2018; originally announced November 2018.

arXiv:1804.07345 [pdf, other]

Weakly Supervised Representation Learning for Unsynchronized Audio-Visual Events

Authors: Sanjeel Parekh, Slim Essid, Alexey Ozerov, Ngoc Q. K. Duong, Patrick Pérez, Gaël Richard

Abstract: Audio-visual representation learning is an important task from the perspective of designing machines with the ability to understand complex events. To this end, we propose a novel multimodal framework that instantiates multiple instance learning. We show that the learnt representations are useful for classifying events and localizing their characteristic audio-visual elements. The system is trained using only video-level event labels without any timing information. An important feature of our method is its capacity to learn from unsynchronized audio-visual events. We achieve state-of-the-art results on a large-scale dataset of weakly-labeled audio event videos. Visualizations of localized visual regions and audio segments substantiate our system's efficacy, especially when dealing with noisy situations where modality-specific cues appear asynchronously.

Submitted 9 July, 2018; v1 submitted 19 April, 2018; originally announced April 2018.

arXiv:1710.11385 [pdf, other]

doi 10.1109/ICASSP.2018.8461711

Audio style transfer

Authors: Eric Grinstein, Ngoc Duong, Alexey Ozerov, Patrick Pérez

Abstract: 'Style transfer' among images has recently emerged as a very active research topic, fuelled by the power of convolutional neural networks (CNNs), and has fast become a very popular technology in social media. This paper investigates the analogous problem in the audio domain: how can the style of a reference audio signal be transferred to a target audio content? We propose a flexible framework for the task, which uses a sound texture model to extract statistics characterizing the reference audio style, followed by an optimization-based audio texture synthesis to modify the target content. In contrast to mainstream optimization-based visual transfer methods, the proposed process is initialized by the target content instead of random noise, and the optimized loss is only about texture, not structure. These differences proved key for audio style transfer in our experiments. In order to extract features of interest, we investigate different architectures, whether pre-trained on other tasks, as done in image style transfer, or engineered based on the human auditory system. Experimental results on different types of audio signals confirm the potential of the proposed approach.

Submitted 7 November, 2018; v1 submitted 31 October, 2017; originally announced October 2017.

Comments: ICASSP 2018 - 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2018, Calgary, Canada. IEEE

Showing 1–10 of 10 results for author: Ozerov, A