-
Latent Structured Ranking
Authors:
Jason Weston,
John Blitzer
Abstract:
Many latent (factorized) models have been proposed for recommendation tasks like collaborative filtering and for ranking tasks like document or image retrieval and annotation. Common to all those methods is that during inference the items are scored independently by their similarity to the query in the latent embedding space. The structure of the ranked list (i.e. considering the set of items returned as a whole) is not taken into account. This can be a problem because the set of top predictions can be either too diverse (containing results that contradict each other) or not diverse enough. In this paper we introduce a method for learning latent structured rankings that improves over existing methods by providing the right blend of predictions at the top of the ranked list. Particular emphasis is put on making this method scalable. Empirical results on large scale image annotation and music recommendation tasks show improvements over existing approaches.
Submitted 16 October, 2012;
originally announced October 2012.
-
Latent Collaborative Retrieval
Authors:
Jason Weston,
Chong Wang,
Ron Weiss,
Adam Berenzweig
Abstract:
Retrieval tasks typically require a ranking of items given a query. Collaborative filtering tasks, on the other hand, learn to model users' preferences over items. In this paper we study the joint problem of recommending items to a user with respect to a given query, which is a surprisingly common task. This setup differs from the standard collaborative filtering one in that we are given a query x user x item tensor for training instead of the more traditional user x item matrix. Compared to document retrieval we do have a query, but we may or may not have content features (we will consider both cases) and we can also take account of the user's profile. We introduce a factorized model for this new task that optimizes the top-ranked items returned for the given query and user. We report empirical results where it outperforms several baselines.
Submitted 18 June, 2012;
originally announced June 2012.
-
Towards Open-Text Semantic Parsing via Multi-Task Learning of Structured Embeddings
Authors:
Antoine Bordes,
Xavier Glorot,
Jason Weston,
Yoshua Bengio
Abstract:
Open-text (or open-domain) semantic parsers are designed to interpret any statement in natural language by inferring a corresponding meaning representation (MR). Unfortunately, large scale systems cannot be easily machine-learned due to lack of directly supervised data. We propose here a method that learns to assign MRs to a wide range of text (using a dictionary of more than 70,000 words, which are mapped to more than 40,000 entities) thanks to a training scheme that combines learning from WordNet and ConceptNet with learning from raw text. The model learns structured embeddings of words, entities and MRs via a multi-task training process operating on these diverse sources of data that integrates all the learnt knowledge into a single system. This work ends up combining methods for knowledge acquisition, semantic parsing, and word-sense disambiguation. Experiments on various tasks indicate that our approach is indeed successful and can form a basis for future more sophisticated systems.
Submitted 19 July, 2011;
originally announced July 2011.
-
Large-Scale Music Annotation and Retrieval: Learning to Rank in Joint Semantic Spaces
Authors:
Jason Weston,
Samy Bengio,
Philippe Hamel
Abstract:
Music prediction tasks range from predicting tags given a song or clip of audio, predicting the name of the artist, or predicting related songs given a song, clip, artist name or tag. That is, we are interested in every semantic relationship between the different musical concepts in our database. In realistically sized databases, the number of songs is measured in the hundreds of thousands or more, and the number of artists in the tens of thousands or more, providing a considerable challenge to standard machine learning techniques. In this work, we propose a method that scales to such datasets which attempts to capture the semantic similarities between the database items by modeling audio, artist names, and tags in a single low-dimensional semantic space. This choice of space is learnt by optimizing the set of prediction tasks of interest jointly using multi-task learning. Our method both outperforms baseline methods and, in comparison to them, is faster and consumes less memory. We then demonstrate how our method learns an interpretable model, where the semantic space captures well the similarities of interest.
Submitted 25 May, 2011;
originally announced May 2011.
-
Natural Language Processing (almost) from Scratch
Authors:
Ronan Collobert,
Jason Weston,
Leon Bottou,
Michael Karlen,
Koray Kavukcuoglu,
Pavel Kuksa
Abstract:
We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including: part-of-speech tagging, chunking, named entity recognition, and semantic role labeling. This versatility is achieved by trying to avoid task-specific engineering and therefore disregarding a lot of prior knowledge. Instead of exploiting man-made input features carefully optimized for each task, our system learns internal representations on the basis of vast amounts of mostly unlabeled training data. This work is then used as a basis for building a freely available tagging system with good performance and minimal computational requirements.
Submitted 2 March, 2011;
originally announced March 2011.
-
Bayesian anomaly detection methods for social networks
Authors:
Nicholas A. Heard,
David J. Weston,
Kiriaki Platanioti,
David J. Hand
Abstract:
Learning the network structure of a large graph is computationally demanding, and dynamically monitoring the network over time for any changes in structure threatens to be more challenging still. This paper presents a two-stage method for anomaly detection in dynamic graphs: the first stage uses simple, conjugate Bayesian models for discrete time counting processes to track the pairwise links of all nodes in the graph to assess normality of behavior; the second stage applies standard network inference tools on a greatly reduced subset of potentially anomalous nodes. The utility of the method is demonstrated on simulated and real data sets.
Submitted 8 November, 2010;
originally announced November 2010.
-
The POINT-AGAPE Survey: Comparing Automated Searches of Microlensing Events toward M31
Authors:
Y. Tsapras,
B. J. Carr,
M. J. Weston,
E. Kerins,
P. Baillon,
A. Gould,
S. Paulin-Henriksson
Abstract:
Searching for microlensing in M31 using automated superpixel surveys raises a number of difficulties which are not present in more conventional techniques. Here we focus on the problem that the list of microlensing candidates is sensitive to the selection criteria or "cuts" imposed and some subjectivity is involved in this. Weakening the cuts will generate a longer list of microlensing candidates but with a greater fraction of spurious ones; strengthening the cuts will produce a shorter list but may exclude some genuine events. We illustrate this by comparing three analyses of the same data-set obtained from a 3-year observing run on the INT in La Palma. The results of two of these analyses have been already reported: Belokurov et al. (2005) obtained between 3 and 22 candidates, depending on the strength of their cuts, while Calchi Novati et al. (2005) obtained 6 candidates. The third analysis is presented here for the first time and reports 10 microlensing candidates, 7 of which are new. Only two of the candidates are common to all three analyses. In order to understand why these analyses produce different candidate lists, a comparison is made of the cuts used by the three groups...
Submitted 14 December, 2009;
originally announced December 2009.
-
POINT-AGAPE Pixel Lensing Survey of M31 : Evidence for a MACHO contribution to Galactic Halos
Authors:
S. Calchi Novati,
S. Paulin-Henriksson,
J. An,
P. Baillon,
V. Belokurov,
B. J. Carr,
M. Creze,
N. W. Evans,
Y. Giraud-Heraud,
A. Gould,
P. Hewett,
Ph. Jetzer,
J. Kaplan,
E. Kerins,
S. J. Smartt,
C. S. Stalin,
Y. Tsapras,
M. J. Weston
Abstract:
The POINT-AGAPE collaboration is carrying out a search for gravitational microlensing toward M31 to reveal galactic dark matter in the form of MACHOs (Massive Astrophysical Compact Halo Objects) in the halos of the Milky Way and M31. A high-threshold analysis of 3 years of data yields 6 bright, short-duration microlensing events, which are compared with a simulation of the observations and the analysis. The observed signal is much larger than expected from self lensing alone and we conclude, at the 95% confidence level, that at least 20% of the halo mass in the direction of M31 must be in the form of MACHOs if their average mass lies in the range 0.5-1 M$_\odot$. This lower bound drops to 8% for MACHOs with masses $\sim 0.01$ M$_\odot$. In addition, we discuss a likely binary microlensing candidate with caustic crossing. Its location, some 32' away from the centre of M31, supports our conclusion that we are detecting a MACHO signal in the direction of M31.
Submitted 18 November, 2005; v1 submitted 7 April, 2005;
originally announced April 2005.
-
The POINT-AGAPE survey II: An Unrestricted Search for Microlensing Events towards M31
Authors:
V. Belokurov,
J. An,
N. W. Evans,
P. Hewett,
P. Baillon,
S. Calchi Novati,
B. J. Carr,
M. Creze,
Y. Giraud-Heraud,
A. Gould,
Ph. Jetzer,
J. Kaplan,
E. Kerins,
S. Paulin-Henriksson,
S. J. Smartt,
C. S. Stalin,
Y. Tsapras,
M. J. Weston
Abstract:
An automated search is carried out for microlensing events using a catalogue of 44554 variable superpixel lightcurves derived from our three-year monitoring program of M31. Each step of our candidate selection is objective and reproducible by a computer. Our search is unrestricted, in the sense that it has no explicit timescale cut. So, it must overcome the awkward problem of distinguishing long-timescale microlensing events from long-period stellar variables. The basis of the selection algorithm is the fitting of the superpixel lightcurves to two different theoretical models, using variable star and blended microlensing templates. Only if microlensing is preferred is an event retained as a possible candidate. Further cuts are made with regard to (i) sampling, (ii) goodness of fit of the peak to a Paczynski curve, (iii) consistency of the microlensing hypothesis with the absence of a resolved source, (iv) achromaticity, (v) position in the colour-magnitude diagram and (vi) signal-to-noise ratio. Our results are reported in terms of first-level candidates, which are the most trustworthy, and second-level candidates, which are possible microlensing but have lower signal-to-noise and are more questionable. The pipeline leaves just 3 first-level candidates, all of which have very short full-width half-maximum timescale (<5 days) and 3 second-level candidates, which have timescales of 31, 36 and 51 days respectively. We also show 16 third-level lightcurves, as an illustration of the events that just fail the threshold for designation as microlensing candidates. They are almost certainly mainly variable stars. Two of the 3 first-level candidates correspond to known events (PA 00-S3 and PA 00-S4) already reported by the POINT-AGAPE project. The remaining first-level candidate is new.
Submitted 8 November, 2004;
originally announced November 2004.
-
Modelling horizontal and vertical concentration profiles of ozone and oxides of nitrogen within high-latitude urban area
Authors:
J. P. Nicholson,
K. J. Weston,
D. Fowler
Abstract:
A Lagrangian column model has been developed to simulate the mean (monthly and annual) three-dimensional structure in ozone and nitrogen oxides concentrations in the boundary layer within and immediately around an urban area. Short time-scale photochemical processes of ozone, as well as emissions and deposition to the ground are simulated. The results show that the average surface ozone concentration in the urban area is lower than the surrounding rural areas by typically 50%. Model results are compared with observations.
Submitted 1 December, 2000; v1 submitted 30 November, 2000;
originally announced November 2000.
-
Photon Structure Functions Beyond the SUSY Threshold
Authors:
D. A. Ross,
L. J. Weston
Abstract:
We evolve virtual photon parton densities up to the SUSY threshold and higher using coupled inhomogeneous DGLAP differential equations. Reliable input parameterizations were available from the c-quark threshold. Limited $P^2$ (target photon virtuality) dependence is observed. The difference to the photon structure function is shown to be significant with the introduction of SUSY dependent splitting functions. A negligible difference is observed by letting the gluino mass enter after the squark mass. An effort is made to include the squark threshold effect in such a way that both the renormalization group equations are satisfied and the perturbative calculation is reproduced.
Submitted 25 August, 2000;
originally announced August 2000.