Open-Source Conversational AI with SpeechBrain 1.0

Authors: Mirco Ravanelli, Titouan Parcollet, Adel Moumen, Sylvain de Langen, Cem Subakan, Peter Plantinga, Yingzhi Wang, Pooneh Mousavi, Luca Della Libera, Artem Ploujnikov, Francesco Paissan, Davide Borra, Salah Zaiem, Zeyu Zhao, Shucong Zhang, Georgios Karakasidis, Sung-Lin Yeh, Aku Rouhe, Rudolf Braun, Florian Mai, Juan Zuluaga-Gomez, Seyed Mahed Mousavi, Andreas Nautsch, Xuechen Liu, Sangeet Sagar, et al. (5 additional authors not shown)

Abstract: SpeechBrain is an open-source Conversational AI toolkit based on PyTorch, focused particularly on speech processing tasks such as speech recognition, speech enhancement, speaker recognition, text-to-speech, and much more. It promotes transparency and replicability by releasing both the pre-trained models and the complete "recipes" of code and algorithms required for training them. This paper presents SpeechBrain 1.0, a significant milestone in the evolution of the toolkit, which now has over 200 recipes for speech, audio, and language processing tasks, and more than 100 models available on Hugging Face. SpeechBrain 1.0 introduces new technologies to support diverse learning modalities, Large Language Model (LLM) integration, and advanced decoding strategies, along with novel models, tasks, and modalities. It also includes a new benchmark repository, offering researchers a unified platform for evaluating models across diverse tasks.

Submitted 29 June, 2024; originally announced July 2024.

Comments: Submitted to JMLR (Machine Learning Open Source Software)

arXiv:2306.08258 [pdf, other]

Transmission and Distribution Coordination for DER-rich Energy Markets: A Parametric Programming Approach

Authors: Mohammad Mousavi, Meng Wu

Abstract: In this paper, a framework is proposed to coordinate the operation of the independent system operator (ISO) and distribution system operator (DSO). The framework is compatible with current practice of the U.S. wholesale market to enable massive distributed energy resources (DERs) to participate in the wholesale market. The DSO builds a bid-in cost function to be submitted to the ISO market through parametric programming. Once the ISO clears the wholesale market, the dispatch and payment of the DSO are determined by the ISO. Then, the DSO determines the dispatch and payment of the DER aggregators. To compare the proposed framework, an ideal case is defined in which DER aggregators can participate in the wholesale market directly and the ISO oversees operation of both transmission and distribution systems. We proved that 1) the dispatches of the proposed ISO-DSO coordination framework are identical to those of the ideal case; 2) the payments to each DER aggregator are identical in the proposed framework and in the ideal case. Case studies are performed on a small illustrative example as well as a large test system which includes the IEEE 118-bus transmission system and two distribution systems: the IEEE 33-node and IEEE 240-node test systems.

Submitted 14 June, 2023; originally announced June 2023.

Comments: 10 pages

arXiv:2301.12176 [pdf]

Neural Gas Network Image Features and Segmentation for Brain Tumor Detection Using Magnetic Resonance Imaging Data

Authors: S. Muhammad Hossein Mousavi

Abstract: Accurate detection of brain tumors could save many lives, and increasing the accuracy of this binary classification by even a few percent is highly important. Neural Gas Networks (NGN) is a fast, unsupervised algorithm that could be used in data clustering, image pattern recognition, and image segmentation. In this research, we used the metaheuristic Firefly Algorithm (FA) for image contrast enhancement as pre-processing and NGN weights for feature extraction and segmentation of Magnetic Resonance Imaging (MRI) data on two brain tumor datasets from the Kaggle platform. Also, tumor classification is conducted by Support Vector Machine (SVM) classification algorithms and compared with a deep learning technique plus other features in train and test phases. Additionally, NGN tumor segmentation is evaluated by well-known performance metrics such as Accuracy, F-measure, Jaccard, and more versus ground truth data and compared with traditional segmentation techniques. The proposed method is fast and precise in both tasks of tumor classification and segmentation compared with other methods. A classification accuracy of 95.14% and a segmentation accuracy of 0.977 are achieved by the proposed method.

Submitted 28 January, 2023; originally announced January 2023.

Comments: 7 pages

arXiv:2209.05658 [pdf]

EV Charging Station Wholesale Market Participation: A Strategic Bidding and Pricing Approach

Authors: Mohammad Mousavi, Li "Lisa" Qi, Alexander Brissette, Meng Wu

Abstract: This paper presents a framework for a simultaneous bidding and pricing strategy for wholesale market participation of an electric vehicle (EV) charging station aggregator. The proposed framework incorporates the EV charging stations' technical constraints as well as EV owners' preferences. A bi-level optimization is adopted to model the problem. In the upper level, the total profit of the EV charging station aggregator is maximized. In the lower-level problem, the EV owner's utility function is maximized. The EV owners' preferences are modeled using the quadratic utility function. The bi-level optimization problem, which is non-convex and hard to solve, is converted by writing the optimality conditions of the lower-level problem into a mixed-integer convex quadratic programming model that is solvable with commercial solvers. The effectiveness of the proposed framework is investigated through simulation results.

Submitted 12 September, 2022; originally announced September 2022.

arXiv:2202.04542 [pdf, other]

Spectrally Adaptive Common Spatial Patterns

Authors: Mahta Mousavi, Eric Lybrand, Shuangquan Feng, Shuai Tang, Rayan Saab, Virginia de Sa

Abstract: The method of Common Spatial Patterns (CSP) is widely used for feature extraction of electroencephalography (EEG) data, such as in motor imagery brain-computer interface (BCI) systems. It is a data-driven method estimating a set of spatial filters so that the power of the filtered EEG signal is maximized for one motor imagery class and minimized for the other. This method, however, is prone to overfitting and is known to suffer from poor generalization especially with limited calibration data. Additionally, due to the high heterogeneity in brain data and the non-stationarity of brain activity, CSP is usually trained for each user separately resulting in long calibration sessions or frequent re-calibrations that are tiring for the user. In this work, we propose a novel algorithm called Spectrally Adaptive Common Spatial Patterns (SACSP) that improves CSP by learning a temporal/spectral filter for each spatial filter so that the spatial filters are concentrated on the most relevant temporal frequencies for each user. We show the efficacy of SACSP in providing better generalizability and higher classification accuracy from calibration to online control compared to existing methods. Furthermore, we show that SACSP provides neurophysiologically relevant information about the temporal frequencies of the filtered signals. Our results highlight the differences in the motor imagery signal among BCI users as well as spectral differences in the signals generated for each class, and show the importance of learning robust user-specific features in a data-driven manner.

Submitted 9 February, 2022; originally announced February 2022.

arXiv:2201.07433 [pdf, other]

ISO and DSO Coordination: A Parametric Programming Approach

Authors: Mohammad Mousavi, Meng Wu

Abstract: In this paper, a framework is proposed to coordinate the operation of the independent system operator (ISO) and distribution system operator (DSO) to leverage the wholesale market participation of distributed energy resource (DER) aggregators while ensuring secure operation of distribution grids. The proposed coordination framework is based on parametric programming. The DSO builds the bid-in cost function based on the distribution system market considering its market player constraints and distribution system physical constraints including the power balance equations and voltage limitation constraints. The DSO submits the resulting bid-in cost function to the wholesale market operated by the ISO. After the clearance of the wholesale market, the DSO determines the share of its retail market participants (i.e., DER aggregators). Case studies are performed to verify the effectiveness of the proposed method.

Submitted 19 January, 2022; originally announced January 2022.

arXiv:2009.08248 [pdf, other]

A Two-stage Stochastic Programming DSO Framework for Comprehensive Market Participation of DER Aggregators under Uncertainty

Authors: Mohammad Mousavi, Meng Wu

Abstract: In this paper, a distribution system operator (DSO) framework is proposed for comprehensive retail and wholesale markets participation of distributed energy resource (DER) aggregators under uncertainty based on two-stage stochastic programming. Different kinds of DER aggregators including energy storage aggregators (ESAGs), demand response aggregators (DRAGs), electric vehicle (EV) aggregating charging stations (EVCSs), dispatchable distributed generation (DDG) aggregators (DDGAGs), and renewable energy aggregators (REAGs) are modeled. Distribution network operation constraints are considered using a linearized power flow. The problem is modeled using mixed-integer linear programming (MILP) which can be solved by using commercial solvers. Case studies are conducted to investigate the performance of the proposed DSO framework.

Submitted 15 September, 2020; originally announced September 2020.

Comments: arXiv admin note: text overlap with arXiv:2006.06673

arXiv:2006.06673 [pdf, other]

A DSO Framework for Comprehensive Market Participation of DER Aggregators

Authors: Mohammad Mousavi, Meng Wu

Abstract: In this paper, a distribution system operator (DSO) framework is proposed to optimally coordinate distributed energy resources (DER) aggregators' comprehensive participation in retail energy market as well as wholesale energy and regulation markets. Various types of DER aggregators, including energy storage aggregators (ESAGs), dispatchable distributed generation aggregators (DDGAGs), electric vehicles charging stations (EVCSs), and demand response aggregators (DRAGs), are modeled in the proposed DSO framework. Distribution network constraints are considered by using a linearized power flow. The problem is modeled using mixed-integer linear programming (MILP) which can be solved by commercial solvers. Case studies are performed to analyze the interactions between DER aggregators and wholesale/retail electricity markets.

Submitted 8 June, 2020; originally announced June 2020.

arXiv:1912.01144 [pdf, other]

doi 10.1109/TGRS.2020.2988770

Bayesian-Deep-Learning Estimation of Earthquake Location from Single-Station Observations

Authors: S. Mostafa Mousavi, Gregory C. Beroza

Abstract: We present a deep learning method for single-station earthquake location, which we approach as a regression problem using two separate Bayesian neural networks. We use a multi-task temporal-convolutional neural network to learn epicentral distance and P travel time from 1-minute seismograms. The network estimates epicentral distance and P travel time with absolute mean errors of 0.23 km and 0.03 s respectively, along with their epistemic and aleatory uncertainties. We design a separate multi-input network using standard convolutional layers to estimate the back-azimuth angle, and its epistemic uncertainty. This network estimates the direction from which seismic waves arrive at the station with a mean error of 1 degree. Using this information, we estimate the epicenter, origin time, and depth along with their confidence intervals. We use a global dataset of earthquake signals recorded within 1 degree (~112 km) from the event to build the model and to demonstrate its performance. Our model can predict epicenter, origin time, and depth with mean errors of 7.3 km, 0.4 second, and 6.7 km respectively, at different locations around the world. Our approach can be used for fast earthquake source characterization with a limited number of observations, and also for estimating location of earthquakes that are sparsely recorded, either because they are small or because stations are widely separated.

Submitted 2 December, 2019; originally announced December 2019.

arXiv:1911.05975 [pdf, other]

doi 10.1029/2019GL085976

A Machine-Learning Approach for Earthquake Magnitude Estimation

Authors: S. Mostafa Mousavi, Gregory C. Beroza

Abstract: In this study we develop a single-station deep-learning approach for fast and reliable estimation of earthquake magnitude directly from raw waveforms. We design a regressor composed of convolutional and recurrent neural networks that is not sensitive to the data normalization, hence waveform amplitude information can be utilized during the training. Our network can predict earthquake magnitudes with an average error close to zero and standard deviation of ~0.2 based on single-station waveforms without instrument response correction. We test the network for both local and duration magnitude scales and show that station-based learning can be an effective approach for improving the performance. The proposed approach has a variety of potential applications from routine earthquake monitoring to early warning systems.

Submitted 14 November, 2019; originally announced November 2019.

arXiv:1911.02607 [pdf, other]

Energy and Social Cost Minimization for Data Dissemination in Wireless Networks: Centralized and Decentralized Approaches

Authors: Mahdi Mousavi, Anja Klein

Abstract: We study multi-hop data-dissemination in a wireless network from one source to multiple nodes where some of the nodes of the network act as re-transmitting nodes and help the source in data dissemination. In this network, we study two scenarios: i) the transmitting nodes do not need an incentive for transmission and ii) they do need an incentive and are paid by their corresponding receiving nodes by virtual tokens. We investigate two problems: P1) network power minimization for the first scenario and P2) social cost minimization for the second scenario, defined as the total cost paid by the nodes of the network for receiving data. In this paper, to address P1 and P2, we propose centralized and decentralized approaches that determine which of the nodes of the network should act as transmitting nodes, find their transmit powers and their corresponding receiving nodes. For the sake of energy efficiency, in our model, we employ maximal-ratio combining (MRC) at the receivers so that a receiver can be served by multiple transmitters. The proposed decentralized approach is based on a non-cooperative cost-sharing game (CSG). In our proposed game, every receiving node chooses its respective transmitting nodes and consequently, a cost is assigned to it according to the power imposed on its chosen transmitting nodes. We discuss how the network is formed in a decentralized way, find the action of the nodes in the game and show that, despite being decentralized, the proposed game converges to a stable solution. To find the centralized global optimum, which is a benchmark to our decentralized approach, we use a mixed-integer linear program (MILP). Simulation results show that our proposed decentralized approach outperforms the conventional algorithms in terms of energy efficiency and social cost while it can address the need for an incentive for collaboration.

Submitted 21 March, 2020; v1 submitted 6 November, 2019; originally announced November 2019.

Comments: 14 pages, 7 figures

arXiv:1811.02695 [pdf, other]

doi 10.1109/TGRS.2019.2926772

Seismic Signal Denoising and Decomposition Using Deep Neural Networks

Authors: Weiqiang Zhu, S. Mostafa Mousavi, Gregory C. Beroza

Abstract: Denoising and filtering are widely used in routine seismic-data-processing to improve the signal-to-noise ratio (SNR) of recorded signals and by doing so to improve subsequent analyses. In this paper we develop a new denoising/decomposition method, DeepDenoiser, based on a deep neural network. This network is able to learn simultaneously a sparse representation of data in the time-frequency domain and a non-linear function that maps this representation into masks that decompose input data into a signal of interest and noise (defined as any non-seismic signal). We show that DeepDenoiser achieves impressive denoising of seismic signals even when the signal and noise share a common frequency band. Our method properly handles a variety of colored noise and non-earthquake signals. DeepDenoiser can significantly improve the SNR with minimal changes in the waveform shape of interest, even in the presence of high noise levels. We demonstrate the effect of our method on improving earthquake detection. There are clear applications of DeepDenoiser to seismic imaging, micro-seismic monitoring, and preprocessing of ambient noise data. We also note that potential applications of our approach are not limited to these applications or even to earthquake data, and that our approach can be adapted to diverse signals and applications in other settings.

Submitted 6 November, 2018; originally announced November 2018.

arXiv:1712.00834 [pdf, other]

Femtosecond CDMA Using Dielectric Metasurfaces: Design Procedure and Challenges

Authors: Taha Rajabzadeh, Mohammad Hosein Mousavi, Sajjad Abdollahramezani, Mohammad Vahid Jamali, Jawad A. Salehi

Abstract: Inspired by the ever-increasing demand for higher data transmission rates and the tremendous attention toward all-optical signal processing based on miniaturized nanophotonics, in this paper, for the first time, we investigate the integrable design of coherent ultrashort light pulse code-division multiple-access (CDMA) technique, also known as femtosecond CDMA, using all-dielectric metasurfaces (MSs). In this technique, the data bits are firstly modulated using ultrashort femtosecond optical pulses generated by mode-locked lasers, and then by employing a unique phase metamask for each data stream, in order to provide the multiple access capability, the optical signals are spectrally encoded. This procedure spreads the optical signal in the temporal domain and generates low-intensity pseudo-noise bursts through random phase coding leading to minimized multiple access interference. This paper comprehensively presents the principles and design approach to realize fundamental components of a typical femtosecond CDMA encoder, including the grating, lens, and phase mask, by employing high-contrast CMOS-compatible MSs. By controlling the interference between the provided Mie and Fabry-Perot resonance modes, we tailor the spectral and spatial responses of the impinging light locally and independently. Accordingly, we design a MS-based grating with the highest possible refracted angle and, in the meantime, the maximized efficiency which results in a reasonable diameter for the subsequent lens. Moreover, to design our MS-based lens commensurate with the spot size and distance requirements of the pursuant phase mask, we leverage a new optimization method which splits the lens structure into central and peripheral parts, and then design the peripheral part using a collection of gratings converging the impinging light at the subsequent phase mask.

Submitted 3 December, 2017; originally announced December 2017.

arXiv:1612.04975 [pdf, ps, other]

doi 10.4204/EPTCS.232.8

Towards an Approximate Conformance Relation for Hybrid I/O Automata

Authors: Morteza Mohaqeqi, Mohammad Reza Mousavi

Abstract: Several notions of conformance have been proposed for checking the behavior of cyber-physical systems against their hybrid systems models. In this paper, we explore the initial idea of a notion of approximate conformance that allows for comparison of both observable discrete actions and (sampled) continuous trajectories. As such, this notion will consolidate two earlier notions, namely the notion of Hybrid Input-Output Conformance (HIOCO) by M. van Osch and the notion of Hybrid Conformance by H. Abbas and G.E. Fainekos. We prove that our proposed notion of conformance satisfies a semi-transitivity property, which makes it suitable for a step-wise proof of conformance or refinement.

Submitted 15 December, 2016; originally announced December 2016.

Comments: In Proceedings V2CPS-16, arXiv:1612.04023

Journal ref: EPTCS 232, 2016, pp. 53-64

Showing 1–14 of 14 results for author: Mousavi, M