-
Collinear three-photon excitation of a strongly forbidden optical clock transition
Authors:
Samuel P. Carman,
Jan Rudolph,
Benjamin E. Garber,
Michael J. Van de Graaff,
Hunter Swan,
Yijun Jiang,
Megan Nantel,
Mahiro Abe,
Rachel L. Barcklay,
Jason M. Hogan
Abstract:
The ${{^1\mathrm{S}_0}\!-\!{^3\mathrm{P}_0}}$ clock transition in strontium serves as the foundation for the world's best atomic clocks and for gravitational wave detector concepts in clock atom interferometry. This transition is weakly allowed in the fermionic isotope $^{87}$Sr but strongly forbidden in bosonic isotopes. Here we demonstrate coherent excitation of the clock transition in bosonic ${}^{88}$Sr using a novel collinear three-photon process in a weak magnetic field. We observe Rabi oscillations with frequencies of up to $50~\text{kHz}$ using $\text{W}/\text{cm}^{2}$ laser intensities and Gauss-level magnetic field amplitudes. The absence of nuclear spin in bosonic isotopes offers decreased sensitivity to magnetic fields and optical lattice light shifts, enabling atomic clocks with reduced systematic errors. The collinear propagation of the laser fields permits the interrogation of spatially separated atomic ensembles with common laser pulses, a key requirement for dark matter searches and gravitational wave detection with next-generation quantum sensors.
Submitted 27 June, 2024; v1 submitted 12 June, 2024;
originally announced June 2024.
-
Ziv-Zakai-Optimal OFDM Resource Allocation for Time-of-Arrival Estimation
Authors:
Andrew M. Graff,
Todd E. Humphreys
Abstract:
This paper presents methods of optimizing the placement and power allocations of pilots in an orthogonal frequency-division multiplexing (OFDM) signal to minimize time-of-arrival (TOA) estimation errors under power and resource allocation constraints. TOA errors in this optimization are quantified through the Ziv-Zakai bound (ZZB), which captures error thresholding effects caused by sidelobes in the signal's autocorrelation function (ACF) that are not captured by the Cramér-Rao lower bound. This paper is the first to solve for these ZZB-optimal allocations in the context of OFDM signals, under integer resource allocation constraints, and under both coherent and noncoherent reception. Under convex constraints, the optimization of the ZZB is proven to be convex; under integer constraints, the optimization is lower bounded by a convex relaxation and a branch-and-bound algorithm is proposed for efficiently allocating pilot resources. These allocations are evaluated by their ZZBs and ACFs, compared against a typical uniform allocation, and deployed on a software-defined radio TOA measurement platform to demonstrate their applicability in real-world systems.
Submitted 19 March, 2024;
originally announced March 2024.
-
Analysis of Systems' Performance in Natural Language Processing Competitions
Authors:
Sergio Nava-Muñoz,
Mario Graff,
Hugo Jair Escalante
Abstract:
Collaborative competitions have gained popularity in the scientific and technological fields. These competitions involve defining tasks, selecting evaluation scores, and devising result verification methods. In the standard scenario, participants receive a training set and are expected to provide a solution for a held-out dataset kept by organizers. An essential challenge for organizers arises when comparing algorithms' performance, assessing multiple participants, and ranking them. Statistical tools are often used for this purpose; however, traditional statistical methods often fail to capture decisive differences between systems' performance. This manuscript describes an evaluation methodology for statistically analyzing competition results and the competitions themselves. The methodology is designed to be universally applicable; however, it is illustrated using eight natural language competitions as case studies involving classification and regression problems. The proposed methodology offers several advantages, including off-the-shelf comparisons with correction mechanisms and the inclusion of confidence intervals. Furthermore, we introduce metrics that allow organizers to assess the difficulty of competitions. Our analysis shows the potential usefulness of our methodology for effectively evaluating competition results.
Submitted 7 March, 2024;
originally announced March 2024.
-
Signal Identification and Entrainment for Practical FMCW Radar Spoofing Attacks
Authors:
Andrew M. Graff,
Todd E. Humphreys
Abstract:
This paper proposes a method of passively estimating the parameters of frequency-modulated-continuous-wave (FMCW) radar signals with a wide range of structural parameter values and analyzes how a malicious actor could employ such estimates to track and spoof a target radar. When radars are implemented to support automated driver assistance systems, an intelligent spoofer has the potential to substantially disrupt safe navigation by inducing its target to perceive false objects. Such a spoofer must acquire highly accurate estimates of the target radar's chirp sweep, timing, and frequency parameters while additionally tracking and compensating for time and Doppler shifts due to clock errors and relative movement. This is a difficult task for millimeter-wave radars due to severe Doppler shifts and fast sweep rates, especially when the spoofer uses off-the-shelf FMCW equipment. Algorithms and techniques for acquiring and tracking an FMCW radar are proposed and verified through simulation, which will help guide future decisions on appropriate radar spoofing countermeasures.
Submitted 20 July, 2023;
originally announced July 2023.
-
Signal Structure of the Starlink Ku-Band Downlink
Authors:
Todd E. Humphreys,
Peter A. Iannucci,
Zacharias Komodromos,
Andrew M. Graff
Abstract:
We develop a technique for blind signal identification of the Starlink downlink signal in the 10.7 to 12.7 GHz band and present a detailed picture of the signal's structure. Importantly, the signal characterization offered herein includes the exact values of synchronization sequences embedded in the signal that can be exploited to produce pseudorange measurements. Such an understanding of the signal is essential to emerging efforts that seek to dual-purpose Starlink signals for positioning, navigation, and timing, despite their being designed solely for broadband Internet provision.
Submitted 30 August, 2023; v1 submitted 20 October, 2022;
originally announced October 2022.
-
Regionalized models for Spanish language variations based on Twitter
Authors:
Eric S. Tellez,
Daniela Moctezuma,
Sabino Miranda,
Mario Graff,
Guillermo Ruiz
Abstract:
Spanish is one of the most widely spoken languages in the world, but it is not necessarily written and spoken in the same way in different countries. Understanding local language variations can help improve model performance on regional tasks, both in understanding local structures and in improving the message's content. For instance, consider a machine learning engineer who automates a language classification task for a particular region, or a social scientist trying to understand a regional event with echoes on social media; both can take advantage of dialect-based language models to understand what is happening with more contextual information and hence more precision.
This manuscript presents and describes a set of regionalized resources for the Spanish language built on four years of public Twitter messages geotagged in 26 Spanish-speaking countries. We introduce word embeddings based on FastText, language models based on BERT, and per-region sample corpora. We also provide a broad comparison among regions covering lexical and semantic similarities, as well as examples of using regional resources on message classification tasks.
Submitted 9 December, 2022; v1 submitted 12 October, 2021;
originally announced October 2021.
-
A Case Study of Spanish Text Transformations for Twitter Sentiment Analysis
Authors:
Eric S. Tellez,
Sabino Miranda-Jiménez,
Mario Graff,
Daniela Moctezuma,
Oscar S. Siordia,
Elio A. Villaseñor
Abstract:
Sentiment analysis is a text mining task that determines the polarity of a given text, i.e., its positiveness or negativeness. Recently, it has received a lot of attention given the interest in opinion mining in micro-blogging platforms. These new forms of textual expression present new challenges for text analysis given the use of slang and orthographic and grammatical errors, among others. Along with these challenges, a practical sentiment classifier should be able to handle large workloads efficiently.
The aim of this research is to identify which text transformations (lemmatization, stemming, entity removal, among others), tokenizers (e.g., word $n$-grams), and token weighting schemes most affect the accuracy of a classifier (Support Vector Machine) trained on two Spanish corpora. The methodology is to exhaustively analyze all combinations of the text transformations and their respective parameters to find out which characteristics the best-performing classifiers have in common. Furthermore, among the different text transformations studied, we introduce a novel approach based on the combination of word-based $n$-grams and character-based $q$-grams. The results show that this novel combination of words and characters produces a classifier that outperforms the traditional word-based combination by $11.17\%$ and $5.62\%$ on the INEGI and TASS'15 datasets, respectively.
Submitted 3 June, 2021;
originally announced June 2021.
-
A Python Library for Exploratory Data Analysis on Twitter Data based on Tokens and Aggregated Origin-Destination Information
Authors:
Mario Graff,
Daniela Moctezuma,
Sabino Miranda-Jiménez,
Eric S. Tellez
Abstract:
Twitter is perhaps the social media platform most amenable to research. It requires only a few steps to obtain information, and there are plenty of libraries that can help in this regard. Nonetheless, knowing whether a particular event is expressed on Twitter is a challenging task that requires a considerable collection of tweets. This proposal aims to facilitate, for interested researchers, the process of mining events on Twitter by opening a collection of processed information taken from Twitter since December 2015. The events could be related to natural disasters, health issues, and people's mobility, among other studies that can be pursued with the proposed library. Different applications are presented in this contribution to illustrate the library's capabilities: an exploratory analysis of the topics discovered in tweets, a study on similarity among dialects of the Spanish language, and a mobility report on different countries. In summary, the Python library presented is applied to different domains and retrieves a plethora of information in terms of daily frequencies of words and word bi-grams for the Arabic, English, Spanish, and Russian languages, as well as mobility information related to the number of trips among locations for more than 200 countries or territories.
Submitted 24 November, 2021; v1 submitted 3 September, 2020;
originally announced September 2020.
-
Observation of Efimov Universality across a Non-Universal Feshbach Resonance in \textsuperscript{39}K
Authors:
Xin Xie,
Michael J. Van de Graaff,
Roman Chapurin,
Matthew D. Frye,
Jeremy M. Hutson,
José P. D'Incao,
Paul S. Julienne,
Jun Ye,
Eric A. Cornell
Abstract:
We study three-atom inelastic scattering in ultracold \textsuperscript{39}K near a Feshbach resonance of intermediate coupling strength. The non-universal character of such resonance leads to an abnormally large Efimov absolute length scale and a relatively small effective range $r_e$, allowing the features of the \textsuperscript{39}K Efimov spectrum to be better isolated from the short-range physics. Meticulous characterization of and correction for finite temperature effects ensure high accuracy on the measurements of these features at large-magnitude scattering lengths. For a single Feshbach resonance, we unambiguously locate four distinct features in the Efimov structure. Three of these features form ratios that obey the Efimov universal scaling to within 10\%, while the fourth feature, occurring at a value of scattering length closest to $r_e$, instead deviates from the universal value.
Submitted 2 August, 2020;
originally announced August 2020.
-
Selection Heuristics on Semantic Genetic Programming for Classification Problems
Authors:
Claudia N. Sánchez,
Mario Graff
Abstract:
Individuals' semantics have been used to guide the learning process of Genetic Programming when solving supervised learning problems. Semantics have been used to propose novel genetic operators as well as different ways of performing parent selection. The latter is the focus of this contribution, which proposes three heuristics for parent selection that replace the fitness function in the selection mechanism entirely. These heuristics complement previous work by being inspired by the characteristics of the addition, Naive Bayes, and Nearest Centroid functions and applying them only when the function is used to create an offspring. These heuristics use different similarity measures among the parents to decide which of them is more appropriate given a function. The similarity functions considered are the cosine similarity, Pearson's correlation, and agreement. We analyze these heuristics' performance against random selection, state-of-the-art selection schemes, and 18 classifiers, including auto-machine-learning techniques, on 30 classification problems with a variable number of samples, variables, and classes. The results indicate that the combination of parent selection based on agreement and random selection to replace an individual in the population produces statistically better results than the classical selection and state-of-the-art schemes, and it is competitive with state-of-the-art classifiers. Finally, the code is released as open-source software.
Submitted 2 April, 2021; v1 submitted 16 July, 2019;
originally announced July 2019.
-
Improving classification performance by feature space transformations and model selection
Authors:
Jose Ortiz-Bejar,
Eric S. Tellez,
Mario Graff
Abstract:
Improving the performance of classifiers is the realm of feature mapping, prototype selection, and kernel function transformations; these techniques aim to reduce the complexity and improve the accuracy of models. In particular, our objective is to combine them to transform the data's shape into another, more convenient distribution, such that some simple algorithms, such as Naïve Bayes or k-Nearest Neighbors, can produce competitive classifiers. In this paper, we introduce a family of classifiers based on feature mapping and kernel functions, orchestrated by a model selection scheme that excels in performance. We provide an extensive experimental comparison of our methods with sixteen popular classifiers on more than thirty benchmarks, supporting our claims. In addition to their competitive performance, our statistical tests also found that our methods differ from one another, supporting our claim of a compelling family of classifiers.
Submitted 2 October, 2019; v1 submitted 14 July, 2019;
originally announced July 2019.
-
Precision Test of the Limits to Universality in Few-Body Physics
Authors:
Roman Chapurin,
Xin Xie,
Michael J. Van de Graaff,
Jared S. Popowski,
Jose P. D'Incao,
Paul S. Julienne,
Jun Ye,
Eric A. Cornell
Abstract:
We perform precise studies of two- and three-body interactions near an intermediate-strength Feshbach resonance in $^{39}\mathrm{K}$ at $33.5820(14)\thinspace$G. Precise measurement of dimer binding energies, spanning three orders of magnitude, enables the construction of a complete two-body coupled-channel model for determination of the scattering lengths with an unprecedentedly low uncertainty. Utilizing an accurate scattering length map, we measure the precise location of the Efimov ground state to test van der Waals universality. Precise control of the sample's temperature and density ensures that systematic effects on the Efimov trimer state are well understood. We measure the ground Efimov resonance location to be at $-14.05(17)$ times the van der Waals length $r_{\mathrm{vdW}}$, significantly deviating from the value of $-9.7 \thinspace r_{\mathrm{vdW}}$ predicted by van der Waals universality. We find that a refined multichannel three-body model, built on our measurement of two-body physics, can account for this difference and even successfully predict the Efimov inelasticity parameter $\eta$.
Submitted 24 November, 2019; v1 submitted 1 July, 2019;
originally announced July 2019.
-
On the role of numerical viscosity in the study of the local limit of nonlocal conservation laws
Authors:
Maria Colombo,
Gianluca Crippa,
Marie Graff,
Laura V. Spinolo
Abstract:
We deal with the numerical investigation of the local limit of nonlocal conservation laws. Previous numerical experiments suggest convergence in the local limit. However, recent analytic results state that (i) in general convergence does not hold because one can exhibit counterexamples; (ii) convergence can be recovered provided viscosity is added to both the local and the nonlocal equations. Motivated by these analytic results, we investigate the role of numerical viscosity in the numerical study of the local limit of nonlocal conservation laws. In particular, we show that the numerical viscosity of Lax-Friedrichs type schemes jeopardizes the reliability of the numerical scheme and erroneously detects convergence in cases where convergence is ruled out by analytic results. We also test Godunov type schemes, less affected by numerical viscosity, and show that in some cases they provide more reliable results.
Submitted 20 February, 2019;
originally announced February 2019.
-
Recent results on the singular local limit for nonlocal conservation laws
Authors:
Maria Colombo,
Gianluca Crippa,
Marie Graff,
Laura V. Spinolo
Abstract:
We provide an informal overview of recent developments concerning the singular local limit of nonlocal conservation laws. In particular, we discuss some counterexamples to convergence and we highlight the role of numerical viscosity in the numerical investigation of the nonlocal-to-local limit. We also state some open questions and describe recent related progress.
Submitted 19 February, 2019;
originally announced February 2019.
-
EvoMSA: A Multilingual Evolutionary Approach for Sentiment Analysis
Authors:
Mario Graff,
Sabino Miranda-Jiménez,
Eric S. Tellez,
Daniela Moctezuma
Abstract:
Sentiment analysis (SA) is a task related to understanding people's feelings in written text; the starting point would be to identify the polarity level (positive, neutral or negative) of a given text, moving on to identify emotions or whether a text is humorous or not. This task has been the subject of several research competitions in a number of languages, e.g., English, Spanish, and Arabic, among others. In this contribution, we propose an SA system, namely EvoMSA, that unifies our participating systems in various SA competitions, making it domain independent and multilingual by processing text using only language-independent techniques. EvoMSA is a classifier, based on Genetic Programming, that works by combining the output of different text classifiers and text models to produce the final prediction. We analyze EvoMSA on different SA competitions to provide a global overview of its performance, and as the results show, EvoMSA is competitive, obtaining top rankings in several SA competitions. Furthermore, we performed an analysis of EvoMSA's components to measure their contribution to the performance; the idea is to help a practitioner or newcomer implement a competitive SA classifier. Finally, it is worth mentioning that EvoMSA is available as open-source software.
Submitted 30 September, 2019; v1 submitted 29 November, 2018;
originally announced December 2018.
-
A scalable solution to the nearest neighbor search problem through local-search methods on neighbor graphs
Authors:
Eric S. Tellez,
Guillermo Ruiz,
Edgar Chavez,
Mario Graff
Abstract:
Near neighbor search (NNS) is a powerful abstraction for data access; however, data indexing is troublesome even for approximate indexes. For intrinsically high-dimensional data, high-quality fast searches demand either indexes with impractically large memory usage or preprocessing time.
In this paper, we introduce an algorithm to solve a nearest-neighbor query $q$ by minimizing a kernel function defined by the distance from $q$ to each object in the database. The minimization is performed using metaheuristics to solve the problem rapidly; even when some methods in the literature use this strategy behind the scenes, our approach is the first one using it explicitly. We also provide two approaches to select edges in the graph's construction stage that limit memory footprint and reduce the number of free parameters simultaneously.
We carry out a thorough experimental comparison with state-of-the-art indexes on synthetic and real-world datasets; we find that our contributions achieve competitive performance regarding speed, accuracy, and memory in almost all of our benchmarks.
Submitted 29 June, 2021; v1 submitted 29 May, 2017;
originally announced May 2017.
-
An Automated Text Categorization Framework based on Hyperparameter Optimization
Authors:
Eric S. Tellez,
Daniela Moctezuma,
Sabino Miranda-Jiménez,
Mario Graff
Abstract:
A great variety of text tasks such as topic or spam identification, user profiling, and sentiment analysis can be posed as a supervised learning problem and tackled using a text classifier. A text classifier consists of several subprocesses; some of them are general enough to be applied to any supervised learning problem, whereas others are specifically designed to tackle a particular task, using complex and computationally expensive processes such as lemmatization, syntactic analysis, etc. Contrary to traditional approaches, we propose a minimalistic and wide system able to tackle text classification tasks independent of domain and language, namely microTC. It is composed of some easy-to-implement text transformations, text representations, and a supervised learning algorithm. These pieces produce a competitive classifier even in the domain of informally written text. We provide a detailed description of microTC along with an extensive experimental comparison with relevant state-of-the-art methods. microTC was compared on 30 different datasets. Regarding accuracy, microTC obtained the best performance in 20 datasets while achieving competitive results in the remaining 10. The compared datasets include several problems like topic and polarity classification, spam detection, user profiling and authorship attribution. Furthermore, it is important to state that our approach allows the usage of the technology even without knowledge of machine learning and natural language processing.
Submitted 14 September, 2017; v1 submitted 6 April, 2017;
originally announced April 2017.
-
A Simple Approach to Multilingual Polarity Classification in Twitter
Authors:
Eric S. Tellez,
Sabino Miranda Jiménez,
Mario Graff,
Daniela Moctezuma,
Ranyart R. Suárez,
Oscar S. Siordia
Abstract:
Recently, sentiment analysis has received a lot of attention due to the interest in mining opinions of social media users. Sentiment analysis consists of determining the polarity of a given text, i.e., its degree of positiveness or negativeness. Traditionally, sentiment analysis algorithms have been tailored to a specific language given the complexity of having a number of lexical variations and errors introduced by the people generating content. In this contribution, our aim is to provide a simple-to-implement and easy-to-use multilingual framework that can serve as a baseline for sentiment analysis contests and as a starting point to build new sentiment analysis systems. We compare our approach in eight different languages, three of which have important international contests, namely, SemEval (English), TASS (Spanish), and SENTIPOLC (Italian). Within the competitions, our approach reaches medium to high positions in the rankings, whereas in the remaining languages our approach outperforms the reported results.
Submitted 15 December, 2016;
originally announced December 2016.
-
Bose polarons in the strongly interacting regime
Authors:
Ming-Guang Hu,
Michael J. Van de Graaff,
Dhruv Kedar,
John P. Corson,
Eric A. Cornell,
Deborah S. Jin
Abstract:
When an impurity is immersed in a Bose-Einstein condensate, impurity-boson interactions are expected to dress the impurity into a quasiparticle, the Bose polaron. We superimpose an ultracold atomic gas of $^{87}$Rb with a much lower density gas of fermionic $^{40}$K impurities. Through the use of a Feshbach resonance and RF spectroscopy, we characterize the energy, spectral width and lifetime of the resultant polaron on both the attractive and the repulsive branches in the strongly interacting regime. The width of the polaron in the attractive branch is narrow compared to its binding energy, even as the two-body scattering length formally diverges.
Submitted 2 May, 2016;
originally announced May 2016.
-
Term-Weighting Learning via Genetic Programming for Text Classification
Authors:
Hugo Jair Escalante,
Mauricio A. García-Limón,
Alicia Morales-Reyes,
Mario Graff,
Manuel Montes-y-Gómez,
Eduardo F. Morales
Abstract:
This paper describes a novel approach to learning term-weighting schemes (TWSs) in the context of text classification. In text mining, a TWS determines the way in which documents will be represented in a vector space model before applying a classifier. Whereas acceptable performance has been obtained with standard TWSs (e.g., Boolean and term-frequency schemes), the definition of TWSs has traditionally been an art. Further, it is still a difficult task to determine the best TWS for a particular problem, and it is not yet clear whether schemes better than those currently available can be generated by combining known TWSs. We propose in this article a genetic program that aims at learning effective TWSs that can improve the performance of current schemes in text classification. The genetic program learns how to combine a set of basic units to give rise to discriminative TWSs. We report an extensive experimental study comprising data sets from thematic and non-thematic text classification as well as from image classification. Our study shows the validity of the proposed method; in fact, we show that TWSs learned with the genetic program outperform traditional schemes and other TWSs proposed in recent works. Further, we show that TWSs learned from a specific domain can be effectively used for other tasks.
Submitted 6 October, 2014; v1 submitted 2 October, 2014;
originally announced October 2014.
-
The static structure factor of amorphous silicon and vitreous silica
Authors:
Adam M. R. de Graff,
M. F. Thorpe
Abstract:
Liquids are in thermal equilibrium and have a non-zero static structure factor $S(Q \to 0) = [\langle N^2 \rangle - \langle N \rangle^2]/\langle N \rangle = \rho k_B T \chi_T$, where $\rho$ is the number density, $T$ is the temperature, $Q$ is the scattering vector and $\chi_T$ is the isothermal compressibility. The first part of this result involving the number $N$ (or density) fluctuations is a purely geometrical result and does not involve any assumptions about thermal equilibrium or ergodicity and so is obeyed by all materials. From a large computer model of amorphous silicon, local number fluctuations extrapolate to give $S(0) = 0.035 \pm 0.001$. The same computation on a large model of vitreous silica using only the silicon atoms and rescaling the distances gives $S(0) = 0.039 \pm 0.001$, which suggests that this numerical result is robust and similar for all amorphous tetrahedral networks. For vitreous silica, we find that $S(0) = 0.116 \pm 0.003$, close to the experimental value of $S(0) = 0.0900 \pm 0.0048$ obtained recently by small angle neutron scattering. More detailed experimental and modelling studies are needed to determine the relationship between the fictive temperature and structure.
Submitted 13 September, 2009;
originally announced September 2009.