-
Computerized Tomography and Reproducing Kernels
Authors:
Ho Yun,
Victor M. Panaretos
Abstract:
The X-ray transform is one of the most fundamental integral operators in image processing and reconstruction. In this article, we revisit the formalism of the X-ray transform by considering it as an operator between Reproducing Kernel Hilbert Spaces (RKHS). Within this framework, the X-ray transform can be viewed as a natural analogue of Euclidean projection. The RKHS framework considerably simplifies projection image interpolation, and leads to an analogue of the celebrated representer theorem for the problem of tomographic reconstruction. It leads to methodology that is dimension-free and stands apart from conventional filtered back-projection techniques, as it does not hinge on the Fourier transform. It also allows us to establish sharp stability results at a genuinely functional level (i.e. without recourse to discretization), but in the realistic setting where the data are discrete and noisy. The RKHS framework is versatile, accommodating any reproducing kernel on a unit ball, affording a high level of generality. When the kernel is chosen to be rotation-invariant, explicit spectral representations can be obtained, elucidating the regularity structure of the associated Hilbert spaces. Moreover, the reconstruction problem can be solved at the same computational cost as filtered back-projection.
Submitted 24 June, 2024; v1 submitted 13 November, 2023;
originally announced November 2023.
-
MICO: Selective Search with Mutual Information Co-training
Authors:
Zhanyu Wang,
Xiao Zhang,
Hyokun Yun,
Choon Hui Teo,
Trishul Chilimbi
Abstract:
In contrast to traditional exhaustive search, selective search first clusters documents into several groups, so that a query is searched within only one or a few groups rather than exhaustively across all documents. Selective search is designed to reduce the latency and computation in modern large-scale search systems. In this study, we propose MICO, a Mutual Information CO-training framework for selective search with minimal supervision using the search logs. After training, MICO not only clusters the documents but also routes unseen queries to the relevant clusters for efficient retrieval. In our empirical experiments, MICO significantly improves the performance on multiple metrics of selective search and outperforms a number of existing competitive baselines.
Submitted 9 September, 2022;
originally announced September 2022.
-
DS-MLR: Exploiting Double Separability for Scaling up Distributed Multinomial Logistic Regression
Authors:
Parameswaran Raman,
Sriram Srinivasan,
Shin Matsushima,
Xinhua Zhang,
Hyokun Yun,
S. V. N. Vishwanathan
Abstract:
Scaling multinomial logistic regression to datasets with a very large number of data points and classes is challenging. This is primarily because one needs to compute the log-partition function on every data point, which makes distributing the computation hard. In this paper, we present a distributed stochastic gradient descent based optimization method (DS-MLR) for scaling up multinomial logistic regression problems to massive-scale datasets without hitting any storage constraints on the data and model parameters. Our algorithm exploits double-separability, an attractive property that allows us to achieve data and model parallelism simultaneously. In addition, we introduce a non-blocking and asynchronous variant of our algorithm that avoids bulk-synchronization. We demonstrate the versatility of DS-MLR in various data and model parallelism scenarios through an extensive empirical study using several real-world datasets. In particular, we demonstrate the scalability of DS-MLR by solving an extreme multi-class classification problem on the Reddit dataset (159 GB data, 358 GB parameters) where, to the best of our knowledge, no other existing methods apply.
Submitted 3 August, 2018; v1 submitted 16 April, 2016;
originally announced April 2016.
-
WordRank: Learning Word Embeddings via Robust Ranking
Authors:
Shihao Ji,
Hyokun Yun,
Pinar Yanardag,
Shin Matsushima,
S. V. N. Vishwanathan
Abstract:
Embedding words in a vector space has gained a lot of attention in recent years. While state-of-the-art methods provide efficient computation of word similarities via a low-dimensional matrix embedding, their motivation is often left unclear. In this paper, we argue that word embedding can be naturally viewed as a ranking problem due to the ranking nature of the evaluation metrics. Then, based on this insight, we propose a novel framework, WordRank, that efficiently estimates word representations via robust ranking, in which the attention mechanism and robustness to noise are readily achieved via DCG-like ranking losses. The performance of WordRank is measured on word similarity and word analogy benchmarks, and the results are compared to state-of-the-art word embedding techniques. Our algorithm is very competitive with the state of the art on large corpora, and outperforms it by a significant margin when the training set is limited (i.e., sparse and noisy). With 17 million tokens, WordRank performs almost as well as existing methods using 7.2 billion tokens on a popular word similarity benchmark. Our multi-node distributed implementation of WordRank is publicly available for general usage.
Submitted 27 September, 2016; v1 submitted 8 June, 2015;
originally announced June 2015.
-
Distributed Stochastic Optimization of the Regularized Risk
Authors:
Shin Matsushima,
Hyokun Yun,
Xinhua Zhang,
S. V. N. Vishwanathan
Abstract:
Many machine learning algorithms minimize a regularized risk, and stochastic optimization is widely used for this task. When working with massive data, it is desirable to perform stochastic optimization in parallel. Unfortunately, many existing stochastic optimization algorithms cannot be parallelized efficiently. In this paper we show that one can rewrite the regularized risk minimization problem as an equivalent saddle-point problem, and propose an efficient distributed stochastic optimization (DSO) algorithm. We prove the algorithm's rate of convergence; remarkably, our analysis shows that the algorithm scales almost linearly with the number of processors. We also verify with empirical evaluations that the proposed algorithm is competitive with other parallel, general purpose stochastic and batch optimization algorithms for regularized risk minimization.
Submitted 9 June, 2015; v1 submitted 17 June, 2014;
originally announced June 2014.
-
Ranking via Robust Binary Classification and Parallel Parameter Estimation in Large-Scale Data
Authors:
Hyokun Yun,
Parameswaran Raman,
S. V. N. Vishwanathan
Abstract:
We propose RoBiRank, a ranking algorithm that is motivated by observing a close connection between evaluation metrics for learning to rank and loss functions for robust classification. The algorithm shows very competitive performance on standard benchmark datasets against other representative algorithms in the literature. On the other hand, in large-scale problems where explicit feature vectors and scores are not given, our algorithm can be efficiently parallelized across a large number of machines; for a task that requires 386,133 x 49,824,519 pairwise interactions between items to be ranked, our algorithm finds solutions that are of dramatically higher quality than those found by a state-of-the-art competitor algorithm, given the same amount of wall-clock time for computation.
Submitted 21 August, 2014; v1 submitted 11 February, 2014;
originally announced February 2014.
-
Efficiently Sampling Multiplicative Attribute Graphs Using a Ball-Dropping Process
Authors:
Hyokun Yun,
S. V. N. Vishwanathan
Abstract:
We introduce a novel and efficient sampling algorithm for the Multiplicative Attribute Graph Model (MAGM - Kim and Leskovec (2010)). Our algorithm is strictly more efficient than the algorithm proposed by Yun and Vishwanathan (2012), in the sense that our method extends the best time complexity guarantee of their algorithm to a larger fraction of the parameter space. Both in theory and in empirical evaluation on sparse graphs, our new algorithm outperforms the previous one. To design our algorithm, we first define a stochastic ball-dropping process (BDP). Although a special case of this process was introduced as an efficient approximate sampling algorithm for the Kronecker Product Graph Model (KPGM - Leskovec et al. (2010)), to the best of our knowledge neither why such an approximation works nor what distribution this process actually samples from has been addressed so far. Our rigorous treatment of the BDP enables us to clarify the rationale behind a BDP approximation of KPGM, and to design an efficient sampling algorithm for the MAGM.
Submitted 27 February, 2012; v1 submitted 27 February, 2012;
originally announced February 2012.
-
Quilting Stochastic Kronecker Product Graphs to Generate Multiplicative Attribute Graphs
Authors:
Hyokun Yun,
S. V. N. Vishwanathan
Abstract:
We describe the first sub-quadratic sampling algorithm for the Multiplicative Attribute Graph Model (MAGM) of Kim and Leskovec (2010). We exploit the close connection between MAGM and the Kronecker Product Graph Model (KPGM) of Leskovec et al. (2010), and show that to sample a graph from a MAGM it suffices to sample a small number of KPGM graphs and quilt them together. Under a restricted set of technical conditions our algorithm runs in $O((\log_2(n))^3 |E|)$ time, where $n$ is the number of nodes and $|E|$ is the number of edges in the sampled graph. We demonstrate the scalability of our algorithm via extensive empirical evaluation; we can sample a MAGM graph with 8 million nodes and 20 billion edges in under 6 hours.
Submitted 9 February, 2012; v1 submitted 24 October, 2011;
originally announced October 2011.
-
Using Logistic Regression to Analyze the Balance of a Game: The Case of StarCraft II
Authors:
Hyokun Yun
Abstract:
Recently, the online game market has been growing astonishingly fast, and so has the importance of good game design. In online games, a human user usually competes with others, so the fairness of the game system to all users is of great importance for keeping users interested in the game. Furthermore, the emergence and success of electronic sports (e-sports) and professional gaming, in which specially talented gamers compete with others, draws more attention to whether they are competing in a fair environment. No matter how fierce the debates are in the game-design community, statistical analysis is rarely employed to answer this question seriously. But considering that we can easily gather large amounts of user behavior data on games, it seems potentially beneficial to use this data to aid decisions on game design problems. In fact, modern games do not aim to design the game perfectly at once; rather, they first release the game and then monitor users' behavior to better balance it. In such a scenario, statistical analysis can be particularly helpful. Specifically, we chose to analyze the balance of StarCraft II, a very successful, recently released real-time strategy (RTS) game. It is a central icon in the current e-sports and professional gaming community: from April 1st to 15th, there were 18 tournaments of StarCraft II. However, there is endless debate on whether the winner of a tournament is actually superior to others, or whether the win is largely due to certain design flaws of the game. In this paper, we aim to answer this question using a traditional statistical tool, logistic regression.
Submitted 4 May, 2011;
originally announced May 2011.