-
High Significant Fault Detection in Azure Core Workload Insights
Authors:
Pranay Lohia,
Laurent Boue,
Sharath Rangappa,
Vijay Agneeswaran
Abstract:
Azure Core workload insights have time-series data with different metric units. Faults or anomalies are observed in these time-series data with respect to the metric name, resource region, dimensions, and the dimension values associated with the data. For Azure Core, an important task is to highlight faults or anomalies to the user on a dashboard that they can perceive easily. The reported anomalies should be highly significant and limited in number, e.g., 5-20 anomalies reported per hour. The reported anomalies will have significant user perception and high reconstruction error in any time-series forecasting model. Hence, our task is to automatically identify 'high significant anomalies' and their associated information for user perception.
Submitted 14 April, 2024;
originally announced April 2024.
-
A case study of Generative AI in MSX Sales Copilot: Improving seller productivity with a real-time question-answering system for content recommendation
Authors:
Manpreet Singh,
Ravdeep Pasricha,
Nitish Singh,
Ravi Prasad Kondapalli,
Manoj R,
Kiran R,
Laurent Boué
Abstract:
In this paper, we design a real-time question-answering system specifically targeted for helping sellers get relevant material/documentation they can share live with their customers or refer to during a call. Taking the Seismic content repository as a relatively large scale example of a diverse dataset of sales material, we demonstrate how LLM embeddings of sellers' queries can be matched with the relevant content. We achieve this by engineering prompts in an elaborate fashion that makes use of the rich set of meta-features available for documents and sellers. Using a bi-encoder with cross-encoder re-ranker architecture, we show how the solution returns the most relevant content recommendations in just a few seconds even for large datasets. Our recommender system is deployed as an AML endpoint for real-time inferencing and has been integrated into a Copilot interface that is now deployed in the production version of the Dynamics CRM, known as MSX, used daily by Microsoft sellers.
Submitted 4 January, 2024;
originally announced January 2024.
-
Searching, fast and slow, through product catalogs
Authors:
Dayananda Ubrangala,
Juhi Sharma,
Sharath Kumar Rangappa,
Kiran R,
Ravi Prasad Kondapalli,
Laurent Boué
Abstract:
String matching algorithms in the presence of abbreviations, such as in Stock Keeping Unit (SKU) product catalogs, remain a relatively unexplored topic. In this paper, we present a unified architecture for SKU search that provides both a real-time suggestion system (based on a Trie data structure) as well as a lower latency search system (making use of character level TF-IDF in combination with language model vector embeddings) where users initiate the search process explicitly. We carry out ablation studies that justify designing a complex search system composed of multiple components to address the delicate trade-off between speed and accuracy. Using SKU search in the Dynamics CRM as an example, we show how our system vastly outperforms, in all aspects, the results provided by the default search engine. Finally, we show how SKU descriptions may be enhanced via generative text models (using gpt-3.5-turbo) so that the consumers of the search results may get more context and a generally better experience when presented with the results of their SKU search.
Submitted 1 January, 2024;
originally announced January 2024.
-
Improving search relevance of Azure Cognitive Search by Bayesian optimization
Authors:
Nitin Agarwal,
Ashish Kumar,
Kiran R,
Manish Gupta,
Laurent Boué
Abstract:
Azure Cognitive Search (ACS) has emerged as a major contender in "Search as a Service" cloud products in recent years. However, one of the major challenges for ACS users is to improve the relevance of the search results for their specific use cases. In this paper, we propose a novel method to find the optimal ACS configuration that maximizes search relevance for a specific use case (product search, document search...). The proposed solution improves key online marketplace metrics such as click through rates (CTR) by formulating the search relevance problem as hyperparameter tuning. We have observed significant improvements in real-world search call to action (CTA) rate in multiple marketplaces by introducing optimized weights generated from the proposed approach.
Submitted 13 December, 2023;
originally announced December 2023.
-
Domain specificity and data efficiency in typo tolerant spell checkers: the case of search in online marketplaces
Authors:
Dayananda Ubrangala,
Juhi Sharma,
Ravi Prasad Kondapalli,
Kiran R,
Amit Agarwala,
Laurent Boué
Abstract:
Typographical errors are a major source of frustration for visitors of online marketplaces. Because of the domain-specific nature of these marketplaces and the very short queries users tend to search for, traditional spell checking solutions do not perform well in correcting typos. We present a data augmentation method to address the lack of annotated typo data and train a recurrent neural network to learn context-limited domain-specific embeddings. Those embeddings are deployed in a real-time inferencing API for the Microsoft AppSource marketplace to find the closest match between a misspelled user query and the available product names. Our data efficient solution shows that controlled high quality synthetic data may be a powerful tool especially considering the current climate of large language models which rely on prohibitively huge and often uncontrolled datasets.
Submitted 3 August, 2023;
originally announced August 2023.
-
A Data Source Dependency Analysis Framework for Large Scale Data Science Projects
Authors:
Laurent Boué,
Pratap Kunireddy,
Pavle Subotić
Abstract:
Dependency hell is a well-known pain point in the development of large software projects and machine learning (ML) code bases are not immune from it. In fact, ML applications suffer from an additional form, namely, "data source dependency hell". This term refers to the central role played by data and its unique quirks that often lead to unexpected failures of ML models which cannot be explained by code changes. In this paper, we present an automated dependency mapping framework that allows MLOps engineers to monitor the whole dependency map of their models in a fast paced engineering environment and thus mitigate ahead of time the consequences of any data source changes (e.g., re-train model, ignore data, set default data etc.). Our system is based on a unified and generic approach, employing techniques from static analysis, from which data sources can be identified reliably for any type of dependency on a wide range of source languages and artefacts. The dependency mapping framework is exposed as a REST web API where the only input is the path to the Git repository hosting the code base. Currently used by MLOps engineers at Microsoft, we expect such dependency map APIs to be adopted more widely by MLOps engineers in the future.
Submitted 15 December, 2022;
originally announced December 2022.
-
Real numbers, data science and chaos: How to fit any dataset with a single parameter
Authors:
Laurent Boué
Abstract:
We show how any dataset of any modality (time-series, images, sound...) can be approximated by a well-behaved (continuous, differentiable...) scalar function with a single real-valued parameter. Building upon elementary concepts from chaos theory, we adopt a pedagogical approach demonstrating how to adjust this parameter in order to achieve arbitrary precision fit to all samples of the data. Targeting an audience of data scientists with a taste for the curious and unusual, the results presented here expand on previous similar observations regarding expressiveness power and generalization of machine learning models.
Submitted 28 April, 2019;
originally announced April 2019.
-
Deep learning for pedestrians: backpropagation in CNNs
Authors:
Laurent Boué
Abstract:
The goal of this document is to provide a pedagogical introduction to the main concepts underpinning the training of deep neural networks using gradient descent; a process known as backpropagation. Although we focus on a very influential class of architectures called "convolutional neural networks" (CNNs) the approach is generic and useful to the machine learning community as a whole. Motivated by the observation that derivations of backpropagation are often obscured by clumsy index-heavy narratives that appear somewhat mathemagical, we aim to offer a conceptually clear, vectorized description that articulates well the higher level logic. Following the principle of "writing is nature's way of letting you know how sloppy your thinking is", we try to make the calculations meticulous, self-contained and yet as intuitive as possible. Taking nothing for granted, ample illustrations serve as visual guides and an extensive bibliography is provided for further explorations.
(For the sake of clarity, long mathematical derivations and visualizations have been broken up into short "summarized views" and longer "detailed views" encoded into the PDF as optional content groups. Some figures contain animations designed to illustrate important concepts in a more engaging style. For these reasons, we advise downloading the document locally and opening it using Adobe Acrobat Reader. Other viewers were not tested and may not render the detailed views and animations correctly.)
Submitted 29 November, 2018;
originally announced November 2018.
-
Energy and Vorticity Spectra in Turbulent Superfluid $^4$He from $T=0$ to $T_λ$
Authors:
Laurent Boué,
Victor S. L'vov,
Yotam Nagar,
Sergey V. Nazarenko,
Anna Pomyalov,
Itamar Procaccia
Abstract:
We discuss the energy and vorticity spectra of turbulent superfluid $^4$He over the entire temperature range from $T=0$ up to the phase transition "$λ$ point", $T_λ\simeq 2.17\,$K. Contrary to classical developed turbulence in which there are only two typical scales, i.e. the energy injection scale $L$ and the dissipation scale $η$, here the quantization of vorticity introduces two additional scales, i.e. the vortex core radius $a_0$ and the mean vortex spacing $\ell$. We present these spectra for the super- and normal-fluid components in the entire range of scales from $L$ to $a_0$ including the cross-over scale $\ell$ where the hydrodynamic eddy-cascade is replaced by the cascade of Kelvin waves on individual vortices. At this scale a bottleneck accumulation of the energy was found earlier at $T=0$.
We show that even very small mutual friction dramatically suppresses the bottleneck effect due to the dissipation of the Kelvin waves. Using our results for the spectra we estimate the Vinen "effective viscosity" $ν'$ in the entire temperature range and show agreement with numerous experimental observations for $ν'(T)$.
Submitted 3 April, 2015; v1 submitted 31 December, 2014;
originally announced April 2015.
-
Analytic solution of the dynamics of quantum vortex reconnection
Authors:
Laurent Boué,
Dmytro Khomenko,
Victor S. L'vov,
Itamar Procaccia
Abstract:
Experimental and simulational studies of the dynamics of vortex reconnections in quantum fluids showed that the distance $d$ between the reconnecting vortices is close to a universal time dependence $d=D[κ|t_0-t|]^α$, with $α$ fluctuating around 1/2, where $κ=h/m$ is the quantum of circulation. Dimensional analysis, based on the assumption that the quantum of circulation $κ=h/m$ is the only relevant parameter in the problem, predicts $α=1/2$. The theoretical calculation of the dimensionless coefficient $D$ in this formula remained an open problem. In this Letter we present an analytic calculation of $D$ in terms of the given geometry of the reconnecting vortices. We start from the numerically observed generic geometry on the way to vortex reconnection and demonstrate that the dynamics is well described by a self-similar analytic solution which provides the wanted information.
Submitted 18 July, 2013;
originally announced July 2013.
-
Enhancement of intermittency in superfluid turbulence
Authors:
Laurent Boué,
Victor L'vov,
Anna Pomyalov,
Itamar Procaccia
Abstract:
We consider the intermittent behavior of superfluid turbulence in $^4$He. Due to the similarity in the nonlinear structure of the two-fluid model of superfluidity and the Euler and Navier-Stokes equations one expects the scaling exponents of the structure functions to be the same as in classical turbulence for temperatures close to the superfluid transition $T_λ$ and also for $T\ll T_λ$. This is not the case when mutual friction becomes important. Using shell model simulations, we propose that for an intermediate regime of temperatures, such that the densities of the normal and superfluid components are comparable to each other, there exists a range of scales in which the effective exponents indicate stronger intermittency. We offer a bridge relation between these effective exponents and the classical scaling exponents. Since this effect occurs at accessible temperatures and Reynolds numbers, we propose that experiments should be conducted to further assess the validity and implications of this prediction.
Submitted 26 July, 2012;
originally announced July 2012.
-
Temperature suppression of Kelvin-wave turbulence in superfluids
Authors:
Laurent Boué,
Victor L'vov,
Itamar Procaccia
Abstract:
Kelvin waves propagating on quantum vortices play a crucial role in the phenomenology of energy dissipation of superfluid turbulence. Previous theoretical studies have consistently focused on the zero-temperature limit of the statistical physics of Kelvin-wave turbulence. In this letter, we go beyond this athermal limit by introducing a small but finite temperature in the form of a non-zero mutual friction dissipative force, a situation regularly encountered in actual experiments of superfluid turbulence. In this case we show that there exists a new typical length-scale separating a quasi-inertial range of Kelvin wave turbulence from a far dissipation range. The letter culminates with analytical predictions for the energy spectrum of the Kelvin-wave turbulence in both of these regimes.
Submitted 29 July, 2012; v1 submitted 2 May, 2012;
originally announced May 2012.
-
Energy Spectra of Superfluid Turbulence in $^3$He
Authors:
Laurent Boué,
Victor L'vov,
Anna Pomyalov,
Itamar Procaccia
Abstract:
In superfluid $^3$He, turbulence is carried predominantly by the superfluid component. To explore the statistical properties of this quantum turbulence and its differences from the classical counterpart we adopt the time-honored approach of shell models. Using this approach we provide numerical simulations of a Sabra-shell model that allows us to uncover the nature of the energy spectrum in the relevant hydrodynamic regimes. These results are in qualitative agreement with analytical expressions for the superfluid turbulent energy spectra that were found using a differential approximation for the energy flux.
Submitted 27 December, 2011;
originally announced December 2011.
-
Kelvin-wave turbulence in superfluids
Authors:
Laurent Boué,
Ratul Dasgupta,
Jason Laurie,
Victor L'vov,
Sergey Nazarenko,
Itamar Procaccia
Abstract:
We study the statistical and dynamical behavior of turbulent Kelvin waves propagating on quantized vortices in superfluids, and address the controversy concerning the energy spectrum that is associated with these excitations. Finding the correct energy spectrum is important because Kelvin waves play a major role in the dissipation of energy in superfluid turbulence at near-zero temperatures. In this paper, we show analytically that the solution proposed in Ref. \cite{10LN} enjoys existence, uniqueness and regularity of the pre-factor. Furthermore, we present numerical results of the dynamical equation that describes to leading order the non-local regime of the Kelvin wave dynamics. We compare our findings with the analytical results from the proposed local and non-local theories for Kelvin wave dynamics and show an agreement with the non-local predictions. Accordingly, the spectrum proposed in Ref. \cite{10LN} should be used in future theories of quantum turbulence. Finally, for weaker wave forcing we observe an intermittent behavior of the wave spectrum with a fluctuating dissipative scale, which we interpret as a finite-size effect characteristic of mesoscopic wave turbulence.
Submitted 17 July, 2011; v1 submitted 30 March, 2011;
originally announced March 2011.
-
Statistical Mechanics of Glass Formation in Molecular Liquids with OTP as an Example
Authors:
Laurent Boué,
H. G. E. Hentschel,
Valery Ilyin,
Itamar Procaccia
Abstract:
We extend our statistical mechanical theory of the glass transition from examples consisting of point particles to molecular liquids with internal degrees of freedom. As before, the fundamental assertion is that super-cooled liquids are ergodic, although becoming very viscous at lower temperatures, and are therefore describable in principle by statistical mechanics. The theory is based on analyzing the local neighborhoods of each molecule, and a statistical mechanical weight is assigned to every possible local organization. This results in an approximate theory that is in very good agreement with simulations regarding both thermodynamical and dynamical properties.
Submitted 30 March, 2011;
originally announced March 2011.
-
Statistical distributions in the folding of elastic structures
Authors:
Mokhtar Adda-Bedia,
Arezki Boudaoud,
Laurent Boué,
Stephanie Deboeuf
Abstract:
The behaviour of elastic structures undergoing large deformations is the result of the competition between confining conditions, self-avoidance and elasticity. This combination of multiple phenomena creates a geometrical frustration that leads to complex fold patterns. By studying the case of a rod confined isotropically into a disk, we show that the emergence of the complexity is associated with a well defined underlying statistical measure that determines the energy distribution of sub-elements, ``branches'', of the rod. This result suggests that branches act as the ``microscopic'' degrees of freedom laying the foundations for a statistical mechanical theory of this athermal and amorphous system.
Submitted 6 September, 2010;
originally announced September 2010.
-
Time Scales in the Theory of Elasto-Plasticity of Amorphous Solids
Authors:
Laurent Boue,
Peter Harrowell,
Smarajit Karmakar,
Edan Lerner,
Itamar Procaccia,
Ido Regev,
Jacques Zylberg
Abstract:
Developing a macroscopic theory of elasto-plasticity in amorphous solids calls for (i) identifying the relevant macro state-variables and (ii) discriminating the different time-scales which characterize these variables. In current theories it is assumed that the stress reaches its elasto-plastic steady state value on the same time-scale as the configurational variables (be they the configurational energy, configurational entropy or the effective temperature). By examining numerical simulations in two and three dimensions we show that this is generally not the case: the configurational degrees of freedom may reach the elasto-plastic steady state on time scales which can be very different from the time scale of the stress. We provide a physical discussion to rationalize these findings.
Submitted 24 November, 2009;
originally announced November 2009.
-
Predictive Statistical Mechanics for Glass Forming Systems
Authors:
Laurent Boue,
Edan Lerner,
Itamar Procaccia,
Jacques Zylberg
Abstract:
Using two extremely different models of glass formers in two and three dimensions we demonstrate how to encode the subtle changes in the geometric rearrangement of particles during the scenario of the glass transition. We construct a statistical mechanical description that is able to explain and predict the geometric rearrangement, the temperature dependent thermodynamic functions and the $α$-relaxation time within the measured temperature range and beyond. The theory is based on an up-scaling to proper variables (quasi-species) which is validated using a simple criterion. Once constructed, the theory provides an accurate predictive tool for quantities like the specific heat or the entropy at temperatures that cannot be reached by measurements. In addition, the theory identifies a rapidly increasing typical length scale $ξ$ as the temperature decreases. This growing spatial length scale determines the $α$-relaxation time as $τ_α\sim \exp(μξ/T)$ where $μ$ is a typical chemical potential per unit length.
Submitted 25 May, 2009;
originally announced May 2009.