-
The Geometry of Categorical and Hierarchical Concepts in Large Language Models
Authors:
Kiho Park,
Yo Joong Choe,
Yibo Jiang,
Victor Veitch
Abstract:
Understanding how semantic meaning is encoded in the representation spaces of large language models is a fundamental problem in interpretability. In this paper, we study the two foundational questions in this area. First, how are categorical concepts, such as {'mammal', 'bird', 'reptile', 'fish'}, represented? Second, how are hierarchical relations between concepts encoded? For example, how is the fact that 'dog' is a kind of 'mammal' encoded? We show how to extend the linear representation hypothesis to answer these questions. We find a remarkably simple structure: simple categorical concepts are represented as simplices, hierarchically related concepts are orthogonal in a sense we make precise, and (in consequence) complex concepts are represented as polytopes constructed from direct sums of simplices, reflecting the hierarchical structure. We validate these theoretical results on the Gemma large language model, estimating representations for 957 hierarchically related concepts using data from WordNet.
Submitted 3 June, 2024;
originally announced June 2024.
-
Combining Evidence Across Filtrations Using Adjusters
Authors:
Yo Joong Choe,
Aaditya Ramdas
Abstract:
In anytime-valid sequential inference, it is known that any admissible procedure must be based on e-processes, which are composite generalizations of test martingales that quantify the accumulated evidence against a composite null hypothesis at any arbitrary stopping time. This paper studies methods for combining e-processes constructed using different information sets (filtrations) for the same null. Although e-processes constructed in the same filtration can be combined effortlessly (e.g., by averaging), e-processes constructed in different filtrations cannot, because their validity in a coarser filtration does not translate to validity in a finer filtration. This issue arises in exchangeability tests, independence tests, and tests for comparing forecasts with lags. We first establish that a class of functions called adjusters allows us to lift e-processes from a coarser filtration into any finer filtration. We then introduce a characterization theorem for adjusters, formalizing a sense in which using adjusters is necessary. There are two major implications. First, if we have a powerful e-process in a coarsened filtration, then we readily have a powerful e-process in the original filtration. Second, when we coarsen the filtration to construct an e-process, there is an asymptotically logarithmic cost of recovering anytime-validity in the original filtration.
Submitted 28 May, 2024; v1 submitted 14 February, 2024;
originally announced February 2024.
-
Heterogeneous LoRA for Federated Fine-tuning of On-Device Foundation Models
Authors:
Yae Jee Cho,
Luyang Liu,
Zheng Xu,
Aldi Fahrezi,
Gauri Joshi
Abstract:
Foundation models (FMs) adapt well to specific domains or tasks with fine-tuning, and federated learning (FL) enables the potential for privacy-preserving fine-tuning of the FMs with on-device local data. For federated fine-tuning of FMs, we consider the FMs with small to medium parameter sizes of single digit billion at maximum, referred to as on-device FMs (ODFMs) that can be deployed on devices for inference but can only be fine-tuned with parameter-efficient methods. In our work, we tackle the data and system heterogeneity problem of federated fine-tuning of ODFMs by proposing a novel method using heterogeneous low-rank approximations (LoRAs), namely HetLoRA. First, we show that the naive approach of using homogeneous LoRA ranks across devices faces a trade-off between overfitting and slow convergence, and thus propose HetLoRA, which allows heterogeneous ranks across client devices and efficiently aggregates and distributes these heterogeneous LoRA modules. By applying rank self-pruning locally and sparsity-weighted aggregation at the server, HetLoRA combines the advantages of high and low-rank LoRAs, which achieves improved convergence speed and final performance compared to homogeneous LoRA. Furthermore, HetLoRA offers enhanced computation efficiency compared to full fine-tuning, making it suitable for federated fine-tuning across heterogeneous devices.
Submitted 20 February, 2024; v1 submitted 12 January, 2024;
originally announced January 2024.
-
The Linear Representation Hypothesis and the Geometry of Large Language Models
Authors:
Kiho Park,
Yo Joong Choe,
Victor Veitch
Abstract:
Informally, the 'linear representation hypothesis' is the idea that high-level concepts are represented linearly as directions in some representation space. In this paper, we address two closely related questions: What does "linear representation" actually mean? And, how do we make sense of geometric notions (e.g., cosine similarity or projection) in the representation space? To answer these, we use the language of counterfactuals to give two formalizations of "linear representation", one in the output (word) representation space, and one in the input (sentence) space. We then prove these connect to linear probing and model steering, respectively. To make sense of geometric notions, we use the formalization to identify a particular (non-Euclidean) inner product that respects language structure in a sense we make precise. Using this causal inner product, we show how to unify all notions of linear representation. In particular, this allows the construction of probes and steering vectors using counterfactual pairs. Experiments with LLaMA-2 demonstrate the existence of linear representations of concepts, the connection to interpretation and control, and the fundamental role of the choice of inner product.
Submitted 6 November, 2023;
originally announced November 2023.
-
Local or Global: Selective Knowledge Assimilation for Federated Learning with Limited Labels
Authors:
Yae Jee Cho,
Gauri Joshi,
Dimitrios Dimitriadis
Abstract:
Many existing FL methods assume clients with fully-labeled data, while in realistic settings, clients have limited labels due to the expensive and laborious process of labeling. Limited labeled local data of the clients often leads to their local model having poor generalization abilities to their larger unlabeled local data, such as having class-distribution mismatch with the unlabeled data. As a result, clients may instead look to benefit from the global model trained across clients to leverage their unlabeled data, but this also becomes difficult due to data heterogeneity across clients. In our work, we propose FedLabel where clients selectively choose the local or global model to pseudo-label their unlabeled data depending on which is more of an expert of the data. We further utilize both the local and global models' knowledge via global-local consistency regularization which minimizes the divergence between the two models' outputs when they have identical pseudo-labels for the unlabeled data. Unlike other semi-supervised FL baselines, our method does not require additional experts other than the local or global model, nor require additional parameters to be communicated. We also do not assume any server-labeled data or fully labeled clients. For both cross-device and cross-silo settings, we show that FedLabel outperforms other semi-supervised FL baselines by $8$-$24\%$, and even outperforms standard fully supervised FL baselines ($100\%$ labeled data) with only $5$-$20\%$ of labeled data.
Submitted 17 July, 2023;
originally announced July 2023.
-
Counterfactually Comparing Abstaining Classifiers
Authors:
Yo Joong Choe,
Aditya Gangrade,
Aaditya Ramdas
Abstract:
Abstaining classifiers have the option to abstain from making predictions on inputs that they are unsure about. These classifiers are becoming increasingly popular in high-stakes decision-making problems, as they can withhold uncertain predictions to improve their reliability and safety. When evaluating black-box abstaining classifier(s), however, we lack a principled approach that accounts for what the classifier would have predicted on its abstentions. These missing predictions matter when they can eventually be utilized, either directly or as a backup option in a failure mode. In this paper, we introduce a novel approach and perspective to the problem of evaluating and comparing abstaining classifiers by treating abstentions as missing data. Our evaluation approach is centered around defining the counterfactual score of an abstaining classifier, defined as the expected performance of the classifier had it not been allowed to abstain. We specify the conditions under which the counterfactual score is identifiable: if the abstentions are stochastic, and if the evaluation data is independent of the training data (ensuring that the predictions are missing at random), then the score is identifiable. Note that, if abstentions are deterministic, then the score is unidentifiable because the classifier can perform arbitrarily poorly on its abstentions. Leveraging tools from observational causal inference, we then develop nonparametric and doubly robust methods to efficiently estimate this quantity under identification. Our approach is examined in both simulated and real data experiments.
Submitted 9 November, 2023; v1 submitted 17 May, 2023;
originally announced May 2023.
-
On the Convergence of Federated Averaging with Cyclic Client Participation
Authors:
Yae Jee Cho,
Pranay Sharma,
Gauri Joshi,
Zheng Xu,
Satyen Kale,
Tong Zhang
Abstract:
Federated Averaging (FedAvg) and its variants are the most popular optimization algorithms in federated learning (FL). Previous convergence analyses of FedAvg either assume full client participation or partial client participation where the clients can be uniformly sampled. However, in practical cross-device FL systems, only a subset of clients that satisfy local criteria such as battery status, network connectivity, and maximum participation frequency requirements (to ensure privacy) are available for training at a given time. As a result, client availability follows a natural cyclic pattern. We provide (to our knowledge) the first theoretical framework to analyze the convergence of FedAvg with cyclic client participation with several different client optimizers such as GD, SGD, and shuffled SGD. Our analysis discovers that cyclic client participation can achieve a faster asymptotic convergence rate than vanilla FedAvg with uniform client participation under suitable conditions, providing valuable insights into the design of client sampling protocols.
Submitted 6 February, 2023;
originally announced February 2023.
-
Maximizing Global Model Appeal in Federated Learning
Authors:
Yae Jee Cho,
Divyansh Jhunjhunwala,
Tian Li,
Virginia Smith,
Gauri Joshi
Abstract:
Federated learning typically considers collaboratively training a global model using local data at edge clients. Clients may have their own individual requirements, such as having a minimal training loss threshold, which they expect to be met by the global model. However, due to client heterogeneity, the global model may not meet each client's requirements, and only a small subset may find the global model appealing. In this work, we explore the problem of the global model lacking appeal to the clients due to not being able to satisfy local requirements. We propose MaxFL, which aims to maximize the number of clients that find the global model appealing. We show that having a high global model appeal is important to maintain an adequate pool of clients for training, and can directly improve the test accuracy on both seen and unseen clients. We provide convergence guarantees for MaxFL and show that MaxFL achieves a $22$-$40\%$ and $18$-$50\%$ test accuracy improvement for the training clients and unseen clients respectively, compared to a wide range of FL modeling approaches, including those that tackle data heterogeneity, aim to incentivize clients, and learn personalized or fair models.
Submitted 4 February, 2023; v1 submitted 30 May, 2022;
originally announced May 2022.
-
Heterogeneous Ensemble Knowledge Transfer for Training Large Models in Federated Learning
Authors:
Yae Jee Cho,
Andre Manoel,
Gauri Joshi,
Robert Sim,
Dimitrios Dimitriadis
Abstract:
Federated learning (FL) enables edge-devices to collaboratively learn a model without disclosing their private data to a central aggregating server. Most existing FL algorithms require models of identical architecture to be deployed across the clients and server, making it infeasible to train large models due to clients' limited system resources. In this work, we propose a novel ensemble knowledge transfer method named Fed-ET in which small models (different in architecture) are trained on clients, and used to train a larger model at the server. Unlike in conventional ensemble learning, in FL the ensemble can be trained on clients' highly heterogeneous data. Cognizant of this property, Fed-ET uses a weighted consensus distillation scheme with diversity regularization that efficiently extracts reliable consensus from the ensemble while improving generalization by exploiting the diversity within the ensemble. We show the generalization bound for the ensemble of weighted models trained on heterogeneous datasets that supports the intuition of Fed-ET. Our experiments on image and language tasks show that Fed-ET significantly outperforms other state-of-the-art FL algorithms with fewer communicated parameters, and is also robust against high data-heterogeneity.
Submitted 27 April, 2022;
originally announced April 2022.
-
Comparing Sequential Forecasters
Authors:
Yo Joong Choe,
Aaditya Ramdas
Abstract:
Consider two forecasters, each making a single prediction for a sequence of events over time. We ask a relatively basic question: how might we compare these forecasters, either online or post-hoc, while avoiding unverifiable assumptions on how the forecasts and outcomes were generated? In this paper, we present a rigorous answer to this question by designing novel sequential inference procedures for estimating the time-varying difference in forecast scores. To do this, we employ confidence sequences (CS), which are sequences of confidence intervals that can be continuously monitored and are valid at arbitrary data-dependent stopping times ("anytime-valid"). The widths of our CSs are adaptive to the underlying variance of the score differences. Underlying their construction is a game-theoretic statistical framework, in which we further identify e-processes and p-processes for sequentially testing a weak null hypothesis -- whether one forecaster outperforms another on average (rather than always). Our methods do not make distributional assumptions on the forecasts or outcomes; our main theorems apply to any bounded scores, and we later provide alternative methods for unbounded scores. We empirically validate our approaches by comparing real-world baseball and weather forecasters.
Submitted 9 November, 2023; v1 submitted 30 September, 2021;
originally announced October 2021.
-
Personalized Federated Learning for Heterogeneous Clients with Clustered Knowledge Transfer
Authors:
Yae Jee Cho,
Jianyu Wang,
Tarun Chiruvolu,
Gauri Joshi
Abstract:
Personalized federated learning (FL) aims to train model(s) that can perform well for individual clients that are highly data and system heterogeneous. Most work in personalized FL, however, assumes using the same model architecture at all clients and increases the communication cost by sending/receiving models. This may not be feasible for realistic scenarios of FL. In practice, clients have highly heterogeneous system-capabilities and limited communication resources. In our work, we propose a personalized FL framework, PerFed-CKT, where clients can use heterogeneous model architectures and do not directly communicate their model parameters. PerFed-CKT uses clustered co-distillation, where clients use logits to transfer their knowledge to other clients that have similar data-distributions. We theoretically show the convergence and generalization properties of PerFed-CKT and empirically show that PerFed-CKT achieves high test accuracy with several orders of magnitude lower communication cost compared to the state-of-the-art personalized FL schemes.
Submitted 16 September, 2021;
originally announced September 2021.
-
Bandit-based Communication-Efficient Client Selection Strategies for Federated Learning
Authors:
Yae Jee Cho,
Samarth Gupta,
Gauri Joshi,
Osman Yağan
Abstract:
Due to communication constraints and intermittent client availability in federated learning, only a subset of clients can participate in each training round. While most prior works assume uniform and unbiased client selection, recent work on biased client selection has shown that selecting clients with higher local losses can improve error convergence speed. However, previously proposed biased selection strategies either require additional communication cost for evaluating the exact local loss or utilize stale local loss, which can even make the model diverge. In this paper, we present a bandit-based communication-efficient client selection strategy UCB-CS that achieves faster convergence with lower communication overhead. We also demonstrate how client selection can be used to improve fairness.
Submitted 14 December, 2020;
originally announced December 2020.
-
Client Selection in Federated Learning: Convergence Analysis and Power-of-Choice Selection Strategies
Authors:
Yae Jee Cho,
Jianyu Wang,
Gauri Joshi
Abstract:
Federated learning is a distributed optimization paradigm that enables a large number of resource-limited client nodes to cooperatively train a model without data sharing. Several works have analyzed the convergence of federated learning by accounting for data heterogeneity, communication and computation limitations, and partial client participation. However, they assume unbiased client participation, where clients are selected at random or in proportion to their data sizes. In this paper, we present the first convergence analysis of federated optimization for biased client selection strategies, and quantify how the selection bias affects convergence speed. We reveal that biasing client selection towards clients with higher local loss achieves faster error convergence. Using this insight, we propose Power-of-Choice, a communication- and computation-efficient client selection framework that can flexibly span the trade-off between convergence speed and solution bias. Our experiments demonstrate that Power-of-Choice strategies converge up to $3\times$ faster and give $10\%$ higher test accuracy than the baseline random selection.
Submitted 2 October, 2020;
originally announced October 2020.
-
An Empirical Study of Invariant Risk Minimization
Authors:
Yo Joong Choe,
Jiyeon Ham,
Kyubyong Park
Abstract:
Invariant risk minimization (IRM) (Arjovsky et al., 2019) is a recently proposed framework designed for learning predictors that are invariant to spurious correlations across different training environments. Yet, despite its theoretical justifications, IRM has not been extensively tested across various settings. In an attempt to gain a better understanding of the framework, we empirically investigate several research questions using IRMv1, which is the first practical algorithm proposed to approximately solve IRM. By extending the ColoredMNIST experiment in different ways, we find that IRMv1 (i) performs better as the spurious correlation varies more widely between training environments, (ii) learns an approximately invariant predictor when the underlying relationship is approximately invariant, and (iii) can be extended to an analogous setting for text classification.
Submitted 6 July, 2020; v1 submitted 10 April, 2020;
originally announced April 2020.
-
KorNLI and KorSTS: New Benchmark Datasets for Korean Natural Language Understanding
Authors:
Jiyeon Ham,
Yo Joong Choe,
Kyubyong Park,
Ilji Choi,
Hyungjoon Soh
Abstract:
Natural language inference (NLI) and semantic textual similarity (STS) are key tasks in natural language understanding (NLU). Although several benchmark datasets for those tasks have been released in English and a few other languages, there are no publicly available NLI or STS datasets in the Korean language. Motivated by this, we construct and release new datasets for Korean NLI and STS, dubbed KorNLI and KorSTS, respectively. Following previous approaches, we machine-translate existing English training sets and manually translate development and test sets into Korean. To accelerate research on Korean NLU, we also establish baselines on KorNLI and KorSTS. Our datasets are publicly available at https://github.com/kakaobrain/KorNLUDatasets.
Submitted 5 October, 2020; v1 submitted 7 April, 2020;
originally announced April 2020.
-
Jejueo Datasets for Machine Translation and Speech Synthesis
Authors:
Kyubyong Park,
Yo Joong Choe,
Jiyeon Ham
Abstract:
Jejueo was classified as critically endangered by UNESCO in 2010. Although diverse efforts to revitalize it have been made, there have been few computational approaches. Motivated by this, we construct two new Jejueo datasets: Jejueo Interview Transcripts (JIT) and Jejueo Single Speaker Speech (JSS). The JIT dataset is a parallel corpus containing 170k+ Jejueo-Korean sentences, and the JSS dataset consists of 10k high-quality audio files recorded by a native Jejueo speaker and a transcript file. Subsequently, we build neural systems of machine translation and speech synthesis using them. All resources are publicly available via our GitHub repository. We hope that these datasets will attract interest of both language and machine learning communities.
Submitted 27 November, 2019;
originally announced November 2019.
-
word2word: A Collection of Bilingual Lexicons for 3,564 Language Pairs
Authors:
Yo Joong Choe,
Kyubyong Park,
Dongwoo Kim
Abstract:
We present word2word, a publicly available dataset and an open-source Python package for cross-lingual word translations extracted from sentence-level parallel corpora. Our dataset provides top-k word translations in 3,564 (directed) language pairs across 62 languages in OpenSubtitles2018 (Lison et al., 2018). To obtain this dataset, we use a count-based bilingual lexicon extraction model based on the observation that not only source and target words but also source words themselves can be highly correlated. We illustrate that the resulting bilingual lexicons have high coverage and attain competitive translation quality for several language pairs. We wrap our dataset and model in an easy-to-use Python library, which supports downloading and retrieving top-k word translations in any of the supported language pairs as well as computing top-k word translations for custom parallel corpora.
Submitted 27 November, 2019;
originally announced November 2019.
-
A Neural Grammatical Error Correction System Built On Better Pre-training and Sequential Transfer Learning
Authors:
Yo Joong Choe,
Jiyeon Ham,
Kyubyong Park,
Yeoil Yoon
Abstract:
Grammatical error correction can be viewed as a low-resource sequence-to-sequence task, because publicly available parallel corpora are limited. To tackle this challenge, we first generate erroneous versions of large unannotated corpora using a realistic noising function. The resulting parallel corpora are subsequently used to pre-train Transformer models. Then, by sequentially applying transfer learning, we adapt these models to the domain and style of the test set. Combined with a context-aware neural spellchecker, our system achieves competitive results in both restricted and low resource tracks in ACL 2019 BEA Shared Task. We release all of our code and materials for reproducibility.
Submitted 2 July, 2019;
originally announced July 2019.
-
Predicting drug-target interaction using 3D structure-embedded graph representations from graph neural networks
Authors:
Jaechang Lim,
Seongok Ryu,
Kyubyong Park,
Yo Joong Choe,
Jiyeon Ham,
Woo Youn Kim
Abstract:
Accurate prediction of drug-target interaction (DTI) is essential for in silico drug design. For the purpose, we propose a novel approach for predicting DTI using a GNN that directly incorporates the 3D structure of a protein-ligand complex. We also apply a distance-aware graph attention algorithm with gate augmentation to increase the performance of our model. As a result, our model shows better performance than docking and other deep learning methods for both virtual screening and pose prediction. In addition, our model can reproduce the natural population distribution of active molecules and inactive molecules.
△ Less
Submitted 17 April, 2019;
originally announced April 2019.
-
Golden ratio algorithms with new stepsize rules for variational inequalities
Authors:
Dang Van Hieu,
Yeol Je Cho,
Yi-bin Xiao
Abstract:
In this paper, we introduce two golden ratio algorithms with new stepsize rules for solving pseudomonotone and Lipschitz variational inequalities in finite dimensional Hilbert spaces. The presented stepsize rules allow the resulting algorithms to work without prior knowledge of the Lipschitz constant of the operator. The first algorithm uses a sequence of stepsizes that is chosen in advance, diminishing, and non-summable, while the stepsizes in the second are updated at each iteration by a simple computation. Notably, the sequence of stepsizes generated by the second algorithm is bounded away from zero. The convergence and the convergence rate of the proposed algorithms are established under some standard conditions. We also give several numerical results to show the behavior of the algorithms in comparison with other algorithms.
Submitted 16 April, 2019;
originally announced April 2019.
-
Discovery of Natural Language Concepts in Individual Units of CNNs
Authors:
Seil Na,
Yo Joong Choe,
Dong-Hyun Lee,
Gunhee Kim
Abstract:
Although deep convolutional networks have achieved improved performance in many natural language tasks, they have been treated as black boxes because they are difficult to interpret. Especially, little is known about how they represent language in their intermediate layers. In an attempt to understand the representations of deep convolutional networks trained on language tasks, we show that individual units are selectively responsive to specific morphemes, words, and phrases, rather than responding to arbitrary and uninterpretable patterns. In order to quantitatively analyze such an intriguing phenomenon, we propose a concept alignment method based on how units respond to the replicated text. We conduct analyses with different architectures on multiple datasets for classification and translation tasks and provide new insights into how deep models understand natural language.
Submitted 28 February, 2019; v1 submitted 18 February, 2019;
originally announced February 2019.
-
Planar and van der Waals heterostructures for vertical tunnelling single electron transistors
Authors:
Gwangwoo Kim,
Sung-Soo Kim,
Jonghyuk Jeon,
Seong In Yoon,
Seokmo Hong,
Young ** Cho,
Abhishek Misra,
Servet Ozdemir,
Jun Yin,
Davit Ghazaryan,
Mathew Holwill,
Artem Mishchenko,
Daria V. Andreeva,
Yong-** Kim,
Hu Young Jeong,
A-Rang Jang,
Hyun-Jong Chung,
Andre K. Geim,
Kostya S. Novoselov,
Byeong-Hyeok Sohn,
Hyeon Suk Shin
Abstract:
Despite a rich choice of two-dimensional materials, which exists these days, heterostructures, both vertical (van der Waals) and in-plane, offer an unprecedented control over the properties and functionalities of the resulted structures. Thus, planar heterostructures allow p-n junctions between different two-dimensional semiconductors and graphene nanoribbons with well-defined edges; and vertical heterostructures resulted in the observation of superconductivity in purely carbon-based systems and realisation of vertical tunnelling transistors. Here we demonstrate simultaneous use of in-plane and van der Waals heterostructures to build vertical single electron tunnelling transistors. We grow graphene quantum dots inside the matrix of hexagonal boron nitride, which allows a dramatic reduction of the number of localised states along the perimeter of the quantum dots. The use of hexagonal boron nitride tunnel barriers as contacts to the graphene quantum dots make our transistors reproducible and not dependent on the localised states, opening even larger flexibility when designing future devices.
Submitted 16 January, 2019;
originally announced January 2019.
-
Approximating Fixed points of Bregman Generalized $α$-nonexpansive mappings
Authors:
K. Muangchoo,
P. Kumam,
Y. J. Cho,
S. Dhompongsa
Abstract:
In this paper, we introduce a new class of Bregman generalized $α$-nonexpansive mappings in terms of Bregman distances, and investigate the Ishikawa and Noor iterations for these mappings. We establish weak and strong convergence theorems of Ishikawa and Noor iterative schemes for Bregman generalized $α$-nonexpansive mappings in Banach spaces. Furthermore, we propose an example of our generated mapping and some numerical examples which support our main theorem. Our results are new and improve the recent ones in the literature.
Submitted 28 November, 2018;
originally announced November 2018.
-
V2X Downlink Coverage Analysis with a Realistic Urban Vehicular Model
Authors:
Yae Jee Cho,
Kaibin Huang,
Chan-Byoung Chae
Abstract:
As the realization of vehicular communication such as vehicle-to-vehicle (V2V) or vehicle-to-infrastructure (V2I) is imperative for the autonomous driving cars, the understanding of realistic vehicle-to-everything (V2X) models is needed. While previous research has mostly targeted vehicular models in which vehicles are randomly distributed and the variable of carrier frequency was not considered, a more realistic analysis of the V2X model is proposed in this paper. We use a one-dimensional (1D) Poisson cluster process (PCP) to model a realistic scenario of vehicle distribution in a perpendicular cross line road urban area and compare the coverage results with the previous research that distributed vehicles randomly by Poisson Point Process (PPP). Moreover, we incorporate the effect of different carrier frequencies, mmWave and sub-6 GHz, to our analysis by altering the antenna radiation pattern accordingly. Results indicated that while the effect of clustering led to lower outage, using mmWave had even more significance in leading to lower outage. Moreover, line-of-sight (LoS) interference links are shown to be more dominant in lowering the outage than the non-line-of-sight (NLoS) links even though they are less in number. The analytical results give insight into designing and analyzing the urban V2X channels, and are verified by actual urban area three-dimensional (3D) ray-tracing simulation.
Submitted 25 June, 2018; v1 submitted 10 May, 2018;
originally announced May 2018.
-
Local White Matter Architecture Defines Functional Brain Dynamics
Authors:
Yo Joong Choe,
Sivaraman Balakrishnan,
Aarti Singh,
Jean M. Vettel,
Timothy Verstynen
Abstract:
Large bundles of myelinated axons, called white matter, anatomically connect disparate brain regions together and compose the structural core of the human connectome. We recently proposed a method of measuring the local integrity along the length of each white matter fascicle, termed the local connectome. If communication efficiency is fundamentally constrained by the integrity along the entire length of a white matter bundle, then variability in the functional dynamics of brain networks should be associated with variability in the local connectome. We test this prediction using two statistical approaches that are capable of handling the high dimensionality of data. First, by performing statistical inference on distance-based correlations, we show that similarity in the local connectome between individuals is significantly correlated with similarity in their patterns of functional connectivity. Second, by employing variable selection using sparse canonical correlation analysis and cross-validation, we show that segments of the local connectome are predictive of certain patterns of functional brain dynamics. These results are consistent with the hypothesis that structural variability along axon bundles constrains communication between disparate brain regions.
Submitted 16 September, 2018; v1 submitted 22 April, 2018;
originally announced April 2018.
-
RF Lens-Embedded Antenna Array for mmWave MIMO: Design and Performance
Authors:
Yae Jee Cho,
Gee-Yong Suk,
Byoungnam Kim,
Dong Ku Kim,
Chan-Byoung Chae
Abstract:
The requirement of high data-rate in the fifth generation wireless systems (5G) calls for the ultimate utilization of the wide bandwidth in the mmWave frequency band. Researchers seeking to compensate for mmWave's high path loss and to achieve both gain and directivity have proposed that mmWave multiple-input multiple-output (MIMO) systems make use of beamforming systems. Hybrid beamforming in mmWave demonstrates promising performance in achieving high gain and directivity by using phase shifters at the analog processing block. What remains a problem, however, is the actual implementation of mmWave beamforming systems; to fabricate such a system is costly and complex. With the aim of reducing such cost and complexity, this article presents actual prototypes of the lens antenna as an effective device to be used in the future 5G mmWave hybrid beamforming systems. Using a lens as a passive phase shifter enables beamforming without the heavy network of active phase shifters, while gain and directivity are achieved by the energy-focusing property of the lens. Proposed in this article are two types of lens antennas, one for static and the other for mobile usage. Their performance is evaluated using measurements and simulation data along with link-level analysis via a software defined radio (SDR) platform. Results show the promising potential of the lens antenna for its high gain and directivity, and its improved beam-switching feasibility compared to when a lens is not used. System-level evaluations reveal the significant throughput enhancement in both real indoor and outdoor environments. Moreover, the lens antenna's design issues are also discussed by evaluating different lens sizes.
Submitted 22 January, 2018;
originally announced January 2018.
-
Map-based Millimeter-Wave Channel Models: An Overview, Hybrid Modeling, Data, and Learning
Authors:
Yeon-Geun Lim,
Yae Jee Cho,
MinSoo Sim,
Younsun Kim,
Chan-Byoung Chae,
Reinaldo A. Valenzuela
Abstract:
Compared to the current wireless communication systems, millimeter wave (mm-Wave) promises a wide range of spectrum. As viable alternatives to existing mm-Wave channel models, various map-based channel models with different modeling methods have been widely discussed. Map-based channel models are based on a ray-tracing algorithm and include realistic channel parameters in a given map. Such parameters enable researchers to accurately evaluate novel technologies in the mm-Wave range. Diverse map-based modeling methods result in different modeling objectives, including the characteristics of channel parameters and different complexities of the modeling procedure. This article outlines an overview of map-based mm-Wave channel models and proposes a concept of how they can be utilized to integrate a hardware testbed/sounder with a software testbed/sounder. In addition, we categorize map-based channel parameters and provide guidelines for hybrid modeling. Next, we share the measurement data and the map-based channel parameters with the public. Lastly, we evaluate a machine learning-based beam selection algorithm through the shared database. We expect that the offered guidelines and the shared database will enable researchers to readily design a map-based channel model.
Submitted 10 July, 2019; v1 submitted 24 November, 2017;
originally announced November 2017.
-
Relationship between Cross-Polarization Discrimination (XPD) and Spatial Correlation in Indoor Small-Cell MIMO Systems
Authors:
Yeon-Geun Lim,
Yae Jee Cho,
TaeckKeun Oh,
Yongshik Lee,
Chan-Byoung Chae
Abstract:
In this letter, we present a correlated channel model for a dual-polarization antenna to omnidirectional antennas in indoor small-cell multiple-input multiple-output (MIMO) systems. In an indoor environment, we confirm that the cross-polarization discrimination (XPD) in the direction of angle-of-departure can be represented as the spatial correlation of the MIMO channel. We also evaluate a dual-polarization antenna-based MIMO channel model and a spatially correlated channel model using a three-dimensional (3D) ray-tracing simulator. Furthermore, we provide the equivalent distance between adjacent antennas according to the XPD, providing insights into designing a dual-polarization antenna and its arrays.
Submitted 6 December, 2017; v1 submitted 1 July, 2017;
originally announced July 2017.
-
Random fixed point theorems for Hardy-Rogers self-random operators with applications to random integral equations
Authors:
Plern Saipara,
Poom Kumam,
Yeol Je Cho
Abstract:
In this paper, we prove some random fixed point theorems for Hardy-Rogers self-random operators in separable Banach spaces and, as some applications, we show the existence of a solution for random nonlinear integral equations in Banach spaces. Some stochastic versions of deterministic fixed point theorems for Hardy-Rogers self mappings and stochastic integral equations are obtained.
Submitted 6 June, 2017;
originally announced June 2017.
-
Effective Enzyme Deployment for Degradation of Interference Molecules in Molecular Communication
Authors:
Yae Jee Cho,
H. Birkan Yilmaz,
Weisi Guo,
Chan-Byoung Chae
Abstract:
In molecular communication, the heavy tail nature of molecular signals causes inter-symbol interference (ISI). Because of this, it is difficult to decrease symbol periods and achieve high data rate. As a probable solution for ISI mitigation, enzymes were proposed to be used since they are capable of degrading ISI molecules without deteriorating the molecular communication. While most prior work has assumed an infinite amount of enzymes deployed around the channel, from a resource perspective, it is more efficient to deploy a limited amount of enzymes at particular locations and structures. This paper considers carrying out such deployment at two structures--around the receiver (Rx) and/or the transmitter (Tx) site. For both of the deployment scenarios, channels with different system environment parameters, Tx-to-Rx distance, size of enzyme area, and symbol period, are compared with each other for analyzing an optimized system environment for ISI mitigation when a limited amount of enzymes are available.
Submitted 18 March, 2017;
originally announced March 2017.
-
A Machine Learning Approach to Model the Received Signal in Molecular Communications
Authors:
H. Birkan Yilmaz,
Changmin Lee,
Yae Jee Cho,
Chan-Byoung Chae
Abstract:
A molecular communication channel is determined by the received signal. Received signal models form the basis for studies focused on modulation, receiver design, capacity, and coding. Therefore, it is crucial to model the number of received molecules until time $t$ analytically. Modeling the diffusion-based molecular communication channel with the first-hitting process is an open issue for a spherical transmitter. In this paper, we utilize the artificial neural networks technique to model the received signal for a spherical transmitter and a perfectly absorbing receiver (i.e., first hitting process). The proposed technique may be utilized in other studies that assume a spherical transmitter instead of a point transmitter.
Submitted 18 November, 2016;
originally announced November 2016.
-
Effective inter-symbol interference mitigation with a limited amount of enzymes in molecular communications
Authors:
Yae Jee Cho,
H. Birkan Yilmaz,
Weisi Guo,
Chan-Byoung Chae
Abstract:
In molecular communication via diffusion (MCvD), the inter-symbol interference (ISI) is a well known severe problem that deteriorates both data rates and link reliability. ISI mainly occurs due to the slow and highly random propagation of the messenger molecules, which causes the emitted molecules from the previous symbols to interfere with molecules from the current symbol. An effective way to mitigate the ISI is using enzymes to degrade undesired molecules. Prior work on ISI mitigation by enzymes has assumed an infinite amount of enzymes randomly distributed around the molecular channel. Taking a different approach, this paper assumes an MCvD channel with a limited amount of enzymes. The main question this paper addresses is how to deploy these enzymes in an effective structure so that ISI mitigation is maximized. To find an effective MCvD channel environment, this study considers optimization of the shape of the transmitter node, the deployment location and structure, the size of the enzyme deployed area, and the half-lives of the enzymes. It also analyzes the dependence of the optimum size of the enzyme area on the distance and half-life.
Submitted 18 April, 2016;
originally announced April 2016.
-
Carrier-mediated antiferromagnetic interlayer exchange coupling in diluted magnetic semiconductor multilayers Ga$_{1-x}$Mn$_x$As/GaAs:Be
Authors:
J. -H. Chung,
S. J. Chung,
Sanghoon Lee,
B. J. Kirby,
J. A. Borchers,
Y. J. Cho,
X. Liu,
J. K. Furdyna
Abstract:
We use neutron reflectometry to investigate the interlayer exchange coupling between Ga$_{0.97}$Mn$_{0.03}$As ferromagnetic semiconductor layers separated by non-magnetic Be-doped GaAs spacers. Polarized neutron reflectivity measured below the Curie temperature of Ga$_{0.97}$Mn$_{0.03}$As reveals a characteristic splitting at the wave vector corresponding to twice the multilayer period, indicating that the coupling between the ferromagnetic layers is antiferromagnetic (AFM). When the applied field is increased to above the saturation field, this AFM coupling is suppressed. This behavior is not observed when the spacers are undoped, suggesting that the observed AFM coupling is mediated by charge carriers introduced via Be doping. The behavior of magnetization of the multilayers measured by DC magnetometry is consistent with the neutron reflectometry results.
Submitted 5 September, 2008;
originally announced September 2008.
-
Electron-mediated ferromagnetism and small spin-orbit interaction in a molecular-beam-epitaxy grown n-type $GaAs/Al_{0.3}Ga_{0.7}As$ heterostructure with Mn $δ$-doping
Authors:
A. Bove,
F. Altomare,
N. B. Kundtz,
Albert M. Chang,
Y. J. Cho,
X. Liu,
J. Furdyna
Abstract:
We report the first evidence of electron-mediated ferromagnetism in a molecular-beam-epitaxy (MBE) grown $GaAs/Al_{0.3}Ga_{0.7}As$ heterostructure with Mn $δ$-doping. The interaction between the magnetic dopants (Mn) and the Two-Dimensional Electron Gas (2DEG) realizes magnetic ordering when the temperature is below the Curie temperature ($T_{C} \sim 1.7K$) and the 2DEG is brought in close proximity to the Mn layer by gating. The Anomalous Hall Effect (AHE) contribution to the total Hall resistance is shown to be about three to four orders of magnitude smaller than in the case of hole-mediated ferromagnetism, indicating the presence of small spin-orbit interaction.
Submitted 2 July, 2008; v1 submitted 26 February, 2008;
originally announced February 2008.
-
A novel technique to make Ohmic contact to a buried two-dimensional electron gas in a molecular-beam-epitaxy grown $GaAs/Al_{0.3}Ga_{0.7}As$ heterostructure with Mn $δ$-doping
Authors:
A. Bove,
F. Altomare,
N. B. Kundtz,
Albert M. Chang,
Y. J. Cho,
X. Liu,
J. Furdyna
Abstract:
We report on the growth and characterization of a new Diluted Magnetic Semiconductor (DMS) heterostructure that presents a Two-Dimensional Electron Gas (2DEG) with a carrier density $n \sim 1.08 \times 10^{12} cm^{-2}$ and a mobility $μ\sim 600 cm^{2} / (Vs)$ at T $\sim$ 4.2K. As far as we know this is the highest mobility value reported in the literature for GaMnAs systems. A novel technique was developed to make Ohmic contact to the buried 2DEG without destroying the magnetic properties of our crystal.
Submitted 18 July, 2008; v1 submitted 26 February, 2008;
originally announced February 2008.
-
Definitive Evidence of Interlayer Coupling Between (Ga,Mn)As Layers Separated by a Nonmagnetic Spacer
Authors:
B. J. Kirby,
J. A. Borchers,
X. Liu,
Y. J. Cho,
M. Dobrowolska,
J. K. Furdyna
Abstract:
We have used polarized neutron reflectometry to study the structural and magnetic properties of the individual layers in a series of (Al,Be,Ga)As/(Ga,Mn)As/GaAs/(Ga,Mn)As multilayer samples. Structurally, we observe that the samples are virtually identical except for the GaAs spacer thickness (which varies from 3-12 nm), and confirm that the spacers contain little or no Mn. Magnetically, we observe that for the sample with the thickest spacer layer, modulation doping by the (Al,Be,Ga)As results in (Ga,Mn)As layers with very different temperature dependent magnetizations. However, as the spacer layer thickness is reduced, the temperature dependent magnetizations of the top and bottom (Ga,Mn)As layers become progressively more similar - a trend we find to be independent of the crystallographic direction along which spins are magnetized. These results definitively show that (Ga,Mn)As layers can couple across a non-magnetic spacer, and that such coupling depends on spacer thickness.
Submitted 10 September, 2007; v1 submitted 16 August, 2007;
originally announced August 2007.
-
Refinements of Some Reverses of Schwarz's Inequality in 2-Inner Product Spaces and Applications for Integrals
Authors:
P. Cerone,
Y. J. Cho,
S. S. Dragomir,
S. S. Kim
Abstract:
Refinements of some recent reverse inequalities for the celebrated Cauchy-Bunyakovsky-Schwarz inequality in 2-inner product spaces are given. Using this framework, applications for determinantal integral inequalities are also provided.
Submitted 5 September, 2003;
originally announced September 2003.
-
Norm Estimates for the Difference Between Bochner's Integral and the Convex Combination of Function's Values
Authors:
P. Cerone,
Y. J. Cho,
S. S. Dragomir,
J. K. Kim,
S. S. Kim
Abstract:
Norm estimates are developed between the Bochner integral of a vector-valued function in Banach spaces having the Radon-Nikodym property and the convex combination of function values taken on a division of the interval [a,b].
Submitted 4 September, 2003;
originally announced September 2003.
-
On Some Gronwall Type Inequalities Involving Iterated Integrals
Authors:
Y. J. Cho,
S. S. Dragomir,
Y. -H. Kim
Abstract:
In this paper, some new Gronwall type inequalities involving iterated integrals are given.
Submitted 2 September, 2003;
originally announced September 2003.
-
Some Reverses of the Cauchy-Bunyakovsky-Schwarz Inequality in 2-Inner Product Spaces
Authors:
S. S. Dragomir,
Y. J. Cho,
S. S. Kim
Abstract:
In this paper, some reverses of the Cauchy-Bunyakovsky-Schwarz inequality in 2-inner product spaces are given. Using this framework, some applications for determinantal integral inequalities are also provided.
Submitted 1 September, 2003;
originally announced September 2003.
-
Some Boas-Bellman Type Inequalities in 2-Inner Product Spaces
Authors:
S. S. Dragomir,
Y. J. Cho,
S. S. Kim,
A. Sofo
Abstract:
Some inequalities in 2-inner product spaces generalizing Bessel's result that are similar to the Boas-Bellman inequality from inner product spaces, are given. Applications for determinantal integral inequalities are also provided.
Submitted 27 August, 2003;
originally announced August 2003.