-
A fitted space-time finite element method for an advection-diffusion problem with moving interfaces
Authors:
Quang Huy Nguyen,
Van Chien Le,
Phuong Cuc Hoang,
Thi Thanh Mai Ta
Abstract:
This paper presents a fitted space-time finite element method for solving a parabolic advection-diffusion problem with a nonstationary interface. The jumping diffusion coefficient gives rise to the discontinuity of the spatial gradient of the solution across the interface. We use the Banach-Necas-Babuska theorem to show the well-posedness of the continuous variational problem. A fully discrete finite-element based scheme is analyzed using the Galerkin method and unstructured fitted meshes. An optimal error estimate is established in a discrete energy norm under appropriate globally low but locally high regularity conditions. Some numerical results corroborate our theoretical results.
Submitted 11 July, 2024;
originally announced July 2024.
-
Automatic Generation of Web Censorship Probe Lists
Authors:
Jenny Tang,
Leo Alvarez,
Arjun Brar,
Nguyen Phong Hoang,
Nicolas Christin
Abstract:
Domain probe lists--used to determine which URLs to probe for Web censorship--play a critical role in Internet censorship measurement studies. Indeed, the size and accuracy of the domain probe list limits the set of censored pages that can be detected; inaccurate lists can lead to an incomplete view of the censorship landscape or biased results. Previous efforts to generate domain probe lists have been mostly manual or crowdsourced. This approach is time-consuming, prone to errors, and does not scale well to the ever-changing censorship landscape.
In this paper, we explore methods for automatically generating probe lists that are both comprehensive and up-to-date for Web censorship measurement. We start from an initial set of 139,957 unique URLs from various existing test lists consisting of pages from a variety of languages to generate new candidate pages. By analyzing content from these URLs (i.e., performing topic and keyword extraction), expanding these topics, and using them as a feed to search engines, our method produces 119,255 new URLs across 35,147 domains. We then test the new candidate pages by attempting to access each URL from servers in eleven different global locations over a span of four months to check for their connectivity and potential signs of censorship. Our measurements reveal that our method discovered over 1,400 domains--not present in the original dataset--we suspect to be blocked. In short, automatically updating probe lists is possible, and can help further automate censorship measurements at scale.
Submitted 11 July, 2024;
originally announced July 2024.
-
SR-CACO-2: A Dataset for Confocal Fluorescence Microscopy Image Super-Resolution
Authors:
Soufiane Belharbi,
Mara KM Whitford,
Phuong Hoang,
Shakeeb Murtaza,
Luke McCaffrey,
Eric Granger
Abstract:
Confocal fluorescence microscopy is one of the most accessible and widely used imaging techniques for the study of biological processes. Scanning confocal microscopy allows the capture of high-quality images from 3D samples, yet suffers from well-known limitations such as photobleaching and phototoxicity of specimens caused by intense light exposure, which limits its use in some applications, especially for living cells. Cellular damage can be alleviated by changing imaging parameters to reduce light exposure, often at the expense of image quality. Machine/deep learning methods for single-image super-resolution (SISR) can be applied to restore image quality by upscaling lower-resolution (LR) images to produce high-resolution images (HR). These SISR methods have been successfully applied to photo-realistic images due partly to the abundance of publicly available data. In contrast, the lack of publicly available data partly limits their application and success in scanning confocal microscopy. In this paper, we introduce a large scanning confocal microscopy dataset named SR-CACO-2 that is comprised of low- and high-resolution image pairs marked for three different fluorescent markers. It allows the evaluation of performance of SISR methods on three different upscaling levels (X2, X4, X8). SR-CACO-2 contains the human epithelial cell line Caco-2 (ATCC HTB-37), and it is composed of 22 tiles that have been translated in the form of 9,937 image patches for experiments with SISR methods. Given the new SR-CACO-2 dataset, we also provide benchmarking results for 15 state-of-the-art methods that are representative of the main SISR families. Results show that these methods have limited success in producing high-resolution textures, indicating that SR-CACO-2 represents a challenging problem. Our dataset, code and pretrained weights are available: https://github.com/sbelharbi/sr-caco-2.
Submitted 13 June, 2024;
originally announced June 2024.
-
Generating all invertible matrices by row operations
Authors:
Petr Gregor,
Hung P. Hoang,
Arturo Merino,
Ondřej Mička
Abstract:
We show that all invertible $n \times n$ matrices over any finite field $\mathbb{F}_q$ can be generated in a Gray code fashion. More specifically, there exists a listing such that (1) each matrix appears exactly once, and (2) two consecutive matrices differ by adding or subtracting one row from a previous or subsequent row, or by multiplying or dividing a row by the generator of the multiplicative group of $\mathbb{F}_q$. This even holds if the addition and subtraction of each row is allowed to some specific rows satisfying a certain mild condition. Moreover, we can prescribe the first and the last matrix if $n\ge 3$, or $n=2$ and $q>2$. In other words, the corresponding flip graph on all invertible $n \times n$ matrices over $\mathbb{F}_q$ is Hamilton connected if it is not a cycle.
Submitted 3 May, 2024;
originally announced May 2024.
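As a small illustration of the flip graph mentioned in the abstract above, the following sketch enumerates the invertible $2 \times 2$ matrices over $\mathbb{F}_2$ and links two matrices when one is obtained from the other by adding one row to the other row. It is a brute-force check written for this listing, not the authors' construction; the function names are hypothetical, and it only covers the smallest case, where the generator of the multiplicative group is 1 and row scaling changes nothing.

```python
from itertools import product

Q = 2  # field size; this sketch only handles the prime field F_2


def invertible_2x2():
    """All 2x2 matrices over F_2 with nonzero determinant, as nested tuples."""
    mats = []
    for a, b, c, d in product(range(Q), repeat=4):
        if (a * d - b * c) % Q != 0:
            mats.append(((a, b), (c, d)))
    return mats


def row_add_neighbors(m):
    """Matrices reachable by adding one row to the other (same as subtracting over F_2)."""
    (a, b), (c, d) = m
    return {
        (((a + c) % Q, (b + d) % Q), (c, d)),  # add row 2 to row 1
        ((a, b), ((c + a) % Q, (d + b) % Q)),  # add row 1 to row 2
    }


def is_single_cycle(vertices, nbrs):
    """True if every vertex has degree 2 and the graph is connected."""
    if any(len(nbrs(v)) != 2 for v in vertices):
        return False
    seen = {vertices[0]}
    stack = [vertices[0]]
    while stack:
        v = stack.pop()
        for w in nbrs(v):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(vertices)


if __name__ == "__main__":
    mats = invertible_2x2()
    print(len(mats), is_single_cycle(mats, row_add_neighbors))
```

For $n=2$ and $q=2$ the six matrices form a single cycle, which is consistent with this case being left out of the statement about prescribing the first and last matrix.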
-
Parameterized Complexity of Efficient Sortation
Authors:
Robert Ganian,
Hung P. Hoang,
Simon Wietheger
Abstract:
A crucial challenge arising in the design of large-scale logistical networks is to optimize parcel sortation for routing. We study this problem under the recent graph-theoretic formalization of Van Dyk, Klause, Koenemann and Megow (IPCO 2024). The problem asks - given an input digraph D (the fulfillment network) together with a set of commodities represented as source-sink tuples - for a minimum-outdegree subgraph H of the transitive closure of D that contains a source-sink route for each of the commodities. Given the underlying motivation, we study two variants of the problem which differ in whether the routes for the commodities are assumed to be given, or can be chosen arbitrarily.
We perform a thorough parameterized analysis of the complexity of both problems. Our results concentrate on three fundamental parameterizations of the problem: (1) When attempting to parameterize by the target outdegree of H, we show that the problems are paraNP-hard even in highly restricted cases; (2) When parameterizing by the number of commodities, we utilize Ramsey-type arguments, kernelization and treewidth reduction techniques to obtain parameterized algorithms for both problems; (3) When parameterizing by the structure of D, we establish fixed-parameter tractability for both problems w.r.t. treewidth, maximum degree and the maximum routing length. We combine this with lower bounds which show that omitting any of the three parameters results in paraNP-hardness.
Submitted 25 April, 2024;
originally announced April 2024.
-
VLUE: A New Benchmark and Multi-task Knowledge Transfer Learning for Vietnamese Natural Language Understanding
Authors:
Phong Nguyen-Thuan Do,
Son Quoc Tran,
Phu Gia Hoang,
Kiet Van Nguyen,
Ngan Luu-Thuy Nguyen
Abstract:
The success of Natural Language Understanding (NLU) benchmarks in various languages, such as GLUE for English, CLUE for Chinese, KLUE for Korean, and IndoNLU for Indonesian, has facilitated the evaluation of new NLU models across a wide range of tasks. To establish a standardized set of benchmarks for Vietnamese NLU, we introduce the first Vietnamese Language Understanding Evaluation (VLUE) benchmark. The VLUE benchmark encompasses five datasets covering different NLU tasks, including text classification, span extraction, and natural language understanding. To provide an insightful overview of the current state of Vietnamese NLU, we then evaluate seven state-of-the-art pre-trained models, including both multilingual and Vietnamese monolingual models, on our proposed VLUE benchmark. Furthermore, we present CafeBERT, a new state-of-the-art pre-trained model that achieves superior results across all tasks in the VLUE benchmark. Our model combines the proficiency of a multilingual pre-trained model with Vietnamese linguistic knowledge. CafeBERT is developed based on the XLM-RoBERTa model, with an additional pretraining step utilizing a significant amount of Vietnamese textual data to enhance its adaptation to the Vietnamese language. CafeBERT is made publicly available for research purposes.
Submitted 23 March, 2024;
originally announced March 2024.
-
The Heisenberg-RIXS instrument at the European XFEL
Authors:
Justine Schlappa,
Giacomo Ghiringhelli,
Benjamin E. Van Kuiken,
Martin Teichmann,
Piter S. Miedema,
Jan Torben Delitz,
Natalia Gerasimova,
Serguei Molodtsov,
Luigi Adriano,
Bernard Baranasic,
Carsten Broers,
Robert Carley,
Patrick Gessler,
Nahid Ghodrati,
David Hickin,
Le Phuong Hoang,
Manuel Izquierdo,
Laurent Mercadier,
Giuseppe Mercurio,
Sergii Parchenko,
Marijan Stupar,
Zhong Yin,
Leonardo Martinelli,
Giacomo Merzoni,
Ying Ying Peng
, et al. (22 additional authors not shown)
Abstract:
Resonant Inelastic X-ray Scattering (RIXS) is an ideal X-ray spectroscopy method to push the combination of energy and time resolutions to the Fourier transform ultimate limit, because it is unaffected by the core-hole lifetime energy broadening. Moreover, in pump-probe experiments the interaction time is made very short by the same core-hole lifetime. RIXS is very photon hungry, so it takes great advantage of high repetition rate pulsed X-ray sources like the European XFEL. The hRIXS instrument is designed for RIXS experiments in the soft X-ray range with energy resolution approaching the Fourier and the Heisenberg limits. It is based on a spherical grating with variable line spacing (VLS) and a position-sensitive 2D detector. Initially, two gratings are installed to adequately cover the whole photon energy range. With optimized spot size on the sample and a small-pixel detector, the energy resolution can be better than 40 meV at any photon energy below 1000 eV. At the SCS instrument of the European XFEL the spectrometer can be easily positioned thanks to air-pads on a high-quality floor, allowing the scattering angle to be continuously adjusted over the 65-145 deg range. It can be coupled to two different sample interaction chambers, one for liquid jets and one for solids, each with state-of-the-art equipment and compatible with optical laser pumping in collinear geometry. The measured performances, in terms of energy resolution and count rate on the detector, closely match design expectations. hRIXS has been open to public users since the summer of 2022.
Submitted 13 March, 2024;
originally announced March 2024.
-
The $k$-Opt algorithm for the Traveling Salesman Problem has exponential running time for $k \ge 5$
Authors:
Sophia Heimann,
Hung P. Hoang,
Stefan Hougardy
Abstract:
The $k$-Opt algorithm is a local search algorithm for the Traveling Salesman Problem. Starting with an initial tour, it iteratively replaces at most $k$ edges in the tour with the same number of edges to obtain a better tour. Krentel (FOCS 1989) showed that the Traveling Salesman Problem with the $k$-Opt neighborhood is complete for the class PLS (polynomial time local search) and that the $k$-Opt algorithm can have exponential running time for any pivot rule. However, his proof requires $k \gg 1000$ and has a substantial gap. We show the two properties above for a much smaller value of $k$, addressing an open question by Monien, Dumrauf, and Tscheuschner (ICALP 2010). In particular, we prove the PLS-completeness for $k \geq 17$ and the exponential running time for $k \geq 5$.
Submitted 13 June, 2024; v1 submitted 10 February, 2024;
originally announced February 2024.
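The local search step described in the abstract above can be sketched for the smallest interesting case, $k = 2$ (usually called 2-opt). This is a minimal illustration under stated assumptions, not the construction analyzed in the paper: cities are points in the plane with Euclidean distances, the pivot rule is "apply the first improving move found", and the names `tour_length` and `two_opt` are hypothetical.

```python
import math


def tour_length(tour, pts):
    """Total length of the closed tour visiting pts in the order given by tour."""
    n = len(tour)
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % n]]) for i in range(n))


def two_opt(tour, pts):
    """Repeatedly replace two tour edges by two shorter ones until no such move exists."""
    tour = list(tour)
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two edges share a city, so there is nothing to swap
                a, b = pts[tour[i]], pts[tour[i + 1]]
                c, d = pts[tour[j]], pts[tour[(j + 1) % n]]
                # Change in length when edges (a,b),(c,d) become (a,c),(b,d).
                delta = math.dist(a, c) + math.dist(b, d) - math.dist(a, b) - math.dist(c, d)
                if delta < -1e-12:
                    # Reversing the segment between the two edges performs the swap.
                    tour[i + 1:j + 1] = tour[i + 1:j + 1][::-1]
                    improved = True
    return tour


if __name__ == "__main__":
    square = [(0, 0), (1, 0), (1, 1), (0, 1)]
    start = [0, 2, 1, 3]  # a tour whose two diagonals cross
    best = two_opt(start, square)
    print(best, tour_length(best, square))
```

Each accepted move strictly shortens the tour, so the loop terminates; the abstract's result concerns how many such improving steps can be needed in the worst case for larger $k$.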
-
Deep spatial context: when attention-based models meet spatial regression
Authors:
Paulina Tomaszewska,
Elżbieta Sienkiewicz,
Mai P. Hoang,
Przemysław Biecek
Abstract:
We propose the 'Deep spatial context' (DSCon) method, which serves for investigation of the attention-based vision models using the concept of spatial context. It was inspired by histopathologists; however, the method can be applied to various domains. The DSCon allows for a quantitative measure of the spatial context's role using three Spatial Context Measures: $SCM_{features}$, $SCM_{targets}$, $SCM_{residuals}$ to distinguish whether the spatial context is observable within the features of neighboring regions, their target values (attention scores) or residuals, respectively. It is achieved by integrating spatial regression into the pipeline. The DSCon helps to verify research questions. The experiments reveal that spatial relationships are much bigger in the case of the classification of tumor lesions than normal tissues. Moreover, it turns out that the larger the size of the neighborhood taken into account within spatial regression, the less valuable contextual information is. Furthermore, it is observed that the spatial context measure is the largest when considered within the feature space as opposed to the targets and residuals.
Submitted 10 March, 2024; v1 submitted 18 January, 2024;
originally announced January 2024.
-
Noncollinear electric dipoles in a polar, chiral phase of CsSnBr$_3$ perovskite
Authors:
Douglas H. Fabini,
Kedar Honasoge,
Adi Cohen,
Sebastian Bette,
Kyle M. McCall,
Constantinos C. Stoumpos,
Steffen Klenner,
Mirjam Zipkat,
Le Phuong Hoang,
Jürgen Nuss,
Reinhard K. Kremer,
Mercouri G. Kanatzidis,
Omer Yaffe,
Stefan Kaiser,
Bettina V. Lotsch
Abstract:
Polar and chiral crystal symmetries confer a variety of potentially useful functionalities upon solids by coupling otherwise noninteracting mechanical, electronic, optical, and magnetic degrees of freedom. We describe two unstudied phases of the 3D perovskite, CsSnBr$_3$, which emerge below 85 K due to the formation of Sn(II) lone pairs and their interaction with extant octahedral tilts. Phase II (77 K<$T$<85 K, space group $P2_1/m$) exhibits ferroaxial order driven by a noncollinear pattern of lone pair-driven distortions within the plane normal to the unique octahedral tilt axis, preserving the inversion symmetry observed at higher temperatures. Phase I ($T$<77 K, space group $P2_1$) additionally exhibits ferroelectric order due to distortions along the unique tilt axis, breaking both inversion and mirror symmetries. This polar and chiral phase exhibits second harmonic generation from the bulk and a large, intrinsic polarization$-$electrostriction coefficient along the polar axis ($Q_{22}\approx$1.1 m$^4$ C$^{-2}$), resulting in acute negative thermal expansion ($\alpha_V=-9\times10^{-5}$ K$^{-1}$) through the onset of spontaneous polarization. The unprecedented structures of phases I and II were predicted by recursively following harmonic phonon instabilities to generate a tree of candidate structures and subsequently corroborated by synchrotron X-ray powder diffraction and polarized Raman and $^{81}$Br nuclear quadrupole resonance spectroscopies. Relativistic electronic structure scenarios compatible with reported photoluminescence measurements are discussed. Together, the polar symmetry, small bandgap, large spin-orbit splitting of Sn 5$p$ orbitals, and predicted strain sensitivity of the symmetry-breaking distortions suggest bulk samples and epitaxial films of CsSnBr$_3$ or its neighboring solid solutions as strong candidates for bulk Rashba effects.
Submitted 25 April, 2024; v1 submitted 15 January, 2024;
originally announced January 2024.
-
Controllable Expensive Multi-objective Learning with Warm-starting Bayesian Optimization
Authors:
Quang-Huy Nguyen,
Long P. Hoang,
Hoang V. Viet,
Dung D. Le
Abstract:
Pareto Set Learning (PSL) is a promising approach for approximating the entire Pareto front in multi-objective optimization (MOO) problems. However, existing derivative-free PSL methods are often unstable and inefficient, especially for expensive black-box MOO problems where objective function evaluations are costly. In this work, we propose to address the instability and inefficiency of existing PSL methods with a novel controllable PSL method, called Co-PSL. Particularly, Co-PSL consists of two stages: (1) warm-starting Bayesian optimization to obtain quality Gaussian Processes priors and (2) controllable Pareto set learning to accurately acquire a parametric mapping from preferences to the corresponding Pareto solutions. The former is to help stabilize the PSL process and reduce the number of expensive function evaluations. The latter is to support real-time trade-off control between conflicting objectives. Performances across synthesis and real-world MOO problems showcase the effectiveness of our Co-PSL for expensive multi-objective optimization tasks.
Submitted 9 February, 2024; v1 submitted 26 November, 2023;
originally announced November 2023.
-
Probing the Surface Polarization of Ferroelectric Thin Films by X-ray Standing Waves
Authors:
Le Phuong Hoang,
Irena Spasojevic,
Tien-Lin Lee,
David Pesquera,
Kai Rossnagel,
Jörg Zegenhagen,
Gustau Catalan,
Ivan A. Vartanyants,
Andreas Scherz,
Giuseppe Mercurio
Abstract:
Understanding the mechanisms underlying a stable polarization at the surface of ferroelectric thin films is of particular importance both from a fundamental point of view and to achieve control of the surface polarization itself. In this study, it is demonstrated that the X-ray standing wave technique allows the polarization near the surface of a ferroelectric thin film to be probed directly. The X-ray standing wave technique is employed to determine, with picometer accuracy, Ti and Ba atomic positions near the surface of three differently strained $\mathrm{BaTiO_3}$ thin films grown on scandate substrates, with a $\mathrm{SrRuO_3}$ film as bottom electrode. This technique gives direct access to atomic positions, and thus to the local ferroelectric polarization, within the first 3 unit cells below the surface. By employing X-ray photoelectron spectroscopy, a detailed overview of the oxygen-containing species adsorbed on the surface, upon exposure to ambient conditions, is obtained. The combination of structural and spectroscopic information allows us to conclude on the most plausible mechanisms that stabilize the surface polarization in the three samples under study. The different amplitude and orientation of the local ferroelectric polarizations are associated with surface charges attributed to the type, amount and spatial distribution of the oxygen-containing adsorbates.
Submitted 4 September, 2023;
originally announced September 2023.
-
Measuring and Evading Turkmenistan's Internet Censorship: A Case Study in Large-Scale Measurements of a Low-Penetration Country
Authors:
Sadia Nourin,
Van Tran,
Xi Jiang,
Kevin Bock,
Nick Feamster,
Nguyen Phong Hoang,
Dave Levin
Abstract:
Since 2006, Turkmenistan has been listed as one of the few Internet enemies by Reporters without Borders due to its extensively censored Internet and strictly regulated information control policies. Existing reports of filtering in Turkmenistan rely on a small number of vantage points or test a small number of websites. Yet, the country's poor Internet adoption rates and small population can make more comprehensive measurement challenging. With a population of only six million people and an Internet penetration rate of only 38%, it is challenging to either recruit in-country volunteers or obtain vantage points to conduct remote network measurements at scale.
We present the largest measurement study to date of Turkmenistan's Web censorship. To do so, we developed TMC, which tests the blocking status of millions of domains across the three foundational protocols of the Web (DNS, HTTP, and HTTPS). Importantly, TMC does not require access to vantage points in the country. We apply TMC to 15.5M domains; our results reveal that Turkmenistan censors more than 122K domains, using different blocklists for each protocol. We also reverse-engineer these censored domains, identifying 6K over-blocking rules causing incidental filtering of more than 5.4M domains. Finally, we use Geneva, an open-source censorship evasion tool, to discover five new censorship evasion strategies that can defeat Turkmenistan's censorship at both transport and application layers. We will publicly release both the data collected by TMC and the code for censorship evasion.
Submitted 17 April, 2023; v1 submitted 10 April, 2023;
originally announced April 2023.
-
Drawings of Complete Multipartite Graphs Up to Triangle Flips
Authors:
Oswin Aichholzer,
Man-Kwun Chiu,
Hung P. Hoang,
Michael Hoffmann,
Jan Kynčl,
Yannic Maus,
Birgit Vogtenhuber,
Alexandra Weinberger
Abstract:
For a drawing of a labeled graph, the rotation of a vertex or crossing is the cyclic order of its incident edges, represented by the labels of their other endpoints. The extended rotation system (ERS) of the drawing is the collection of the rotations of all vertices and crossings. A drawing is simple if each pair of edges has at most one common point. Gioan's Theorem states that for any two simple drawings of the complete graph $K_n$ with the same crossing edge pairs, one drawing can be transformed into the other by a sequence of triangle flips (a.k.a. Reidemeister moves of Type 3). This operation refers to the act of moving one edge of a triangular cell formed by three pairwise crossing edges over the opposite crossing of the cell, via a local transformation.
We investigate to what extent Gioan-type theorems can be obtained for wider classes of graphs. A necessary (but in general not sufficient) condition for two drawings of a graph to be transformable into each other by a sequence of triangle flips is that they have the same ERS. As our main result, we show that for the large class of complete multipartite graphs, this necessary condition is in fact also sufficient. We present two different proofs of this result, one of which is shorter, while the other one yields a polynomial time algorithm for which the number of needed triangle flips for graphs on $n$ vertices is bounded by $O(n^{16})$. The latter proof uses a Carathéodory-type theorem for simple drawings of complete multipartite graphs, which we believe to be of independent interest.
Moreover, we show that our Gioan-type theorem for complete multipartite graphs is essentially tight in the sense that having the same ERS does not remain sufficient when removing or adding very few edges.
Submitted 13 March, 2023;
originally announced March 2023.
-
A Framework for Controllable Pareto Front Learning with Completed Scalarization Functions and its Applications
Authors:
Tran Anh Tuan,
Long P. Hoang,
Dung D. Le,
Tran Ngoc Thang
Abstract:
Pareto Front Learning (PFL) was recently introduced as an efficient method for approximating the entire Pareto front, the set of all optimal solutions to a Multi-Objective Optimization (MOO) problem. In the previous work, the mapping between a preference vector and a Pareto optimal solution is still ambiguous, rendering its results. This study demonstrates the convergence and completion aspects of solving MOO with pseudoconvex scalarization functions and combines them into Hypernetwork in order to offer a comprehensive framework for PFL, called Controllable Pareto Front Learning. Extensive experiments demonstrate that our approach is highly accurate and significantly less computationally expensive than prior methods in terms of inference time.
Submitted 13 August, 2023; v1 submitted 24 February, 2023;
originally announced February 2023.
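For readers unfamiliar with the term, the notion of Pareto optimality used in the abstract above can be illustrated on a finite set of objective vectors. The sketch below is a plain non-dominated filter assuming every objective is minimized; it is not the hypernetwork-based method of the paper, and the names `dominates` and `pareto_front` are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep the points that no other point dominates (all objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical two-objective values, for illustration only.
    candidates = [(1, 4), (2, 2), (4, 1), (3, 3), (4, 4)]
    print(pareto_front(candidates))
```

The filter compares every pair of points, so it is quadratic in the number of candidates; that is adequate for a small illustration but not for the continuous fronts that PFL methods aim to approximate.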
-
Augmenting Rule-based DNS Censorship Detection at Scale with Machine Learning
Authors:
Jacob Brown,
Xi Jiang,
Van Tran,
Arjun Nitin Bhagoji,
Nguyen Phong Hoang,
Nick Feamster,
Prateek Mittal,
Vinod Yegneswaran
Abstract:
The proliferation of global censorship has led to the development of a plethora of measurement platforms to monitor and expose it. Censorship of the domain name system (DNS) is a key mechanism used across different countries. It is currently detected by applying heuristics to samples of DNS queries and responses (probes) for specific destinations. These heuristics, however, are both platform-specific and have been found to be brittle when censors change their blocking behavior, necessitating a more reliable automated process for detecting censorship.
In this paper, we explore how machine learning (ML) models can (1) help streamline the detection process, (2) improve the potential of using large-scale datasets for censorship detection, and (3) discover new censorship instances and blocking signatures missed by existing heuristic methods. Our study shows that supervised models, trained using expert-derived labels on instances of known anomalies and possible censorship, can learn the detection heuristics employed by different measurement platforms. More crucially, we find that unsupervised models, trained solely on uncensored instances, can identify new instances and variations of censorship missed by existing heuristics. Moreover, both methods demonstrate the capability to uncover a substantial number of new DNS blocking signatures, i.e., injected fake IP addresses overlooked by existing heuristics. These results are underpinned by an important methodological finding: comparing the outputs of models trained using the same probes but with labels arising from independent processes allows us to more reliably detect cases of censorship in the absence of ground-truth labels of censorship.
Submitted 15 June, 2023; v1 submitted 3 February, 2023;
originally announced February 2023.
-
ViHOS: Hate Speech Spans Detection for Vietnamese
Authors:
Phu Gia Hoang,
Canh Duc Luu,
Khanh Quoc Tran,
Kiet Van Nguyen,
Ngan Luu-Thuy Nguyen
Abstract:
The rise in hateful and offensive language directed at other users is one of the adverse side effects of the increased use of social networking platforms. This could make it difficult for human moderators to review tagged comments filtered by classification systems. To help address this issue, we present the ViHOS (Vietnamese Hate and Offensive Spans) dataset, the first human-annotated corpus containing 26k spans on 11k comments. We also provide definitions of hateful and offensive spans in Vietnamese comments as well as detailed annotation guidelines. Besides, we conduct experiments with various state-of-the-art models. Specifically, XLM-R$_{Large}$ achieved the best F1-scores in Single span detection and All spans detection, while PhoBERT$_{Large}$ obtained the highest in Multiple spans detection. Finally, our error analysis demonstrates the difficulties in detecting specific types of spans in our data for future research.
Disclaimer: This paper contains real comments that could be considered profane, offensive, or abusive.
Submitted 26 January, 2023; v1 submitted 24 January, 2023;
originally announced January 2023.
-
Combinatorial generation via permutation languages. V. Acyclic orientations
Authors:
Jean Cardinal,
Hung P. Hoang,
Arturo Merino,
Ondřej Mička,
Torsten Mütze
Abstract:
In 1993, Savage, Squire, and West described an inductive construction for generating every acyclic orientation of a chordal graph exactly once, flipping one arc at a time. We provide two generalizations of this result. Firstly, we describe Gray codes for acyclic orientations of hypergraphs that satisfy a simple ordering condition, which generalizes the notion of perfect elimination order of graphs. This unifies the Savage-Squire-West construction with a recent algorithm for generating elimination trees of chordal graphs. Secondly, we consider quotients of lattices of acyclic orientations of chordal graphs, and we provide a Gray code for them, addressing a question raised by Pilaud. This also generalizes a recent algorithm for generating lattice congruences of the weak order on the symmetric group. Our algorithms are derived from the Hartung-Hoang-Mütze-Williams combinatorial generation framework, and they yield simple algorithms for computing Hamilton paths and cycles on large classes of polytopes, including chordal nestohedra and quotientopes. In particular, we derive an efficient implementation of the Savage-Squire-West construction. Along the way, we give an overview of old and recent results about the polyhedral and order-theoretic aspects of acyclic orientations of graphs and hypergraphs.
Submitted 7 December, 2022;
originally announced December 2022.
-
Improving Pareto Front Learning via Multi-Sample Hypernetworks
Authors:
Long P. Hoang,
Dung D. Le,
Tran Anh Tuan,
Tran Ngoc Thang
Abstract:
Pareto Front Learning (PFL) was recently introduced as an effective approach to obtain a mapping function from a given trade-off vector to a solution on the Pareto front, which solves the multi-objective optimization (MOO) problem. Due to the inherent trade-off between conflicting objectives, PFL offers a flexible approach in many scenarios in which the decision makers cannot specify the preference of one Pareto solution over another, and must switch between them depending on the situation. However, existing PFL methods ignore the relationship between the solutions during the optimization process, which hinders the quality of the obtained front. To overcome this issue, we propose a novel PFL framework namely PHN-HVI, which employs a hypernetwork to generate multiple solutions from a set of diverse trade-off preferences and enhance the quality of the Pareto front by maximizing the Hypervolume indicator defined by these solutions. The experimental results on several MOO machine learning tasks show that the proposed framework significantly outperforms the baselines in producing the trade-off Pareto front.
Submitted 28 April, 2023; v1 submitted 2 December, 2022;
originally announced December 2022.
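The hypervolume indicator mentioned in the abstract above can be illustrated for the simplest case. The following is a minimal sketch, not the PHN-HVI method itself: it assumes exactly two objectives, both minimized, and a user-chosen reference point `ref`; the function name `hypervolume_2d` is hypothetical.

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by `ref` (two minimized objectives)."""
    # Keep only points that strictly improve on the reference point in both objectives,
    # sorted by the first objective (ties broken by the second).
    inside = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    area = 0.0
    best_f2 = ref[1]  # lowest second-objective value seen so far
    for f1, f2 in inside:
        if f2 < best_f2:
            # This point adds a new rectangular strip below the strips counted so far.
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
        # Otherwise the point is dominated by an earlier one and adds nothing.
    return area


front = [(1, 3), (2, 2), (3, 1)]
print(hypervolume_2d(front, ref=(4, 4)))
```

A larger value means the set of solutions covers more of the objective space below the reference point, which is why maximizing it rewards fronts that are both close to optimal and well spread.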
-
Meeting Decision Tracker: Making Meeting Minutes with De-Contextualized Utterances
Authors:
Shumpei Inoue,
Hy Nguyen,
Pham Viet Hoang,
Tsungwei Liu,
Minh-Tien Nguyen
Abstract:
Meetings are a universal process to make decisions in business and project collaboration. The capability to automatically itemize the decisions in daily meetings allows for extensive tracking of past discussions. To that end, we developed Meeting Decision Tracker, a prototype system to construct decision items comprising decision utterance detector (DUD) and decision utterance rewriter (DUR). We show that DUR makes a sizable contribution to improving the user experience by dealing with utterance collapse in natural conversation. An introduction video of our system is also available at https://youtu.be/TG1pJJo0Iqo.
Submitted 20 October, 2022;
originally announced October 2022.
-
On approximating the rank of graph divisors
Authors:
Kristóf Bérczi,
Hung P. Hoang,
Lilla Tóthmérész
Abstract:
Baker and Norine initiated the study of graph divisors as a graph-theoretic analogue of the Riemann-Roch theory for Riemann surfaces. One of the key concepts of graph divisor theory is the {\it rank} of a divisor on a graph. The importance of the rank is well illustrated by Baker's {\it Specialization lemma}, stating that the dimension of a linear system can only go up under specialization from curves to graphs, leading to a fruitful interaction between divisors on graphs and curves.
Due to its decisive role, determining the rank is a central problem in graph divisor theory. Kiss and Tóthmérész reformulated the problem using chip-firing games, and showed that computing the rank of a divisor on a graph is NP-hard via reduction from the Minimum Feedback Arc Set problem.
In this paper, we strengthen their result by establishing a connection between chip-firing games and the Minimum Target Set Selection problem. As a corollary, we show that the rank is difficult to approximate to within a factor of $O(2^{\log^{1-\varepsilon}n})$ for any $\varepsilon > 0$ unless $P=NP$. Furthermore, assuming the Planted Dense Subgraph Conjecture, the rank is difficult to approximate to within a factor of $O(n^{1/4-\varepsilon})$ for any $\varepsilon>0$.
Submitted 11 April, 2024; v1 submitted 20 June, 2022;
originally announced June 2022.
-
Vietnamese Hate and Offensive Detection using PhoBERT-CNN and Social Media Streaming Data
Authors:
Khanh Q. Tran,
An T. Nguyen,
Phu Gia Hoang,
Canh Duc Luu,
Trong-Hop Do,
Kiet Van Nguyen
Abstract:
Society needs to develop a system to detect hate and offense to build a healthy and safe environment. However, current research in this field still faces four major shortcomings, including deficient pre-processing techniques, indifference to data imbalance issues, modest performance models, and lacking practical applications. This paper focused on developing an intelligent system capable of addressing these shortcomings. Firstly, we proposed an efficient pre-processing technique to clean comments collected from Vietnamese social media. Secondly, a novel hate speech detection (HSD) model, which is the combination of a pre-trained PhoBERT model and a Text-CNN model, was proposed for solving tasks in Vietnamese. Thirdly, EDA techniques are applied to deal with imbalanced data to improve the performance of classification models. Besides, various experiments were conducted as baselines to compare and investigate the proposed model's performance against state-of-the-art methods. The experiment results show that the proposed PhoBERT-CNN model outperforms SOTA methods and achieves an F1-score of 67.46% and 98.45% on two benchmark datasets, ViHSD and HSD-VLSP, respectively. Finally, we also built a streaming HSD application to demonstrate the practicality of our proposed system.
Submitted 1 June, 2022;
originally announced June 2022.
-
Measuring the Accessibility of Domain Name Encryption and Its Impact on Internet Filtering
Authors:
Nguyen Phong Hoang,
Michalis Polychronakis,
Phillipa Gill
Abstract:
Most online communications rely on DNS to map domain names to their hosting IP address(es). Previous work has shown that DNS-based network interference is widespread due to the unencrypted and unauthenticated nature of the original DNS protocol. In addition to DNS, accessed domain names can also be monitored by on-path observers during the TLS handshake when the SNI extension is used. These lingering issues with exposed plaintext domain names have led to the development of a new generation of protocols that keep accessed domain names hidden. DNS-over-TLS (DoT) and DNS-over-HTTPS (DoH) hide the domain names of DNS queries, while Encrypted Server Name Indication (ESNI) encrypts the domain name in the SNI extension.
We present DNEye, a measurement system built on top of a network of distributed vantage points, which we used to study the accessibility of DoT/DoH and ESNI, and to investigate whether these protocols are tampered with by network providers (e.g., for censorship). Moreover, we evaluate the efficacy of these protocols in circumventing network interference when accessing content blocked by traditional DNS manipulation. We find evidence of blocking efforts against domain name encryption technologies in several countries, including China, Russia, and Saudi Arabia. At the same time, we discover that domain name encryption can help with unblocking more than 55% and 95% of censored domains in China and other countries where DNS-based filtering is heavily employed, respectively.
Submitted 1 February, 2022;
originally announced February 2022.
-
Positron Driven High-Field Terahertz Waves in Dielectric Material
Authors:
N. Majernik,
G. Andonian,
O. B. Williams,
B. D. O'Shea,
P. D. Hoang,
C. Clarke,
M. J. Hogan,
V. Yakimenko,
J. B. Rosenzweig
Abstract:
Advanced acceleration methods based on wakefields generated by high energy electron bunches passing through dielectric-based structures have demonstrated $>$GV/m fields, paving the first steps on a path to applications such as future compact linear colliders. For a collider scenario, it is desirable that, in contrast to plasmas, wakefields in dielectrics do not behave differently for positron and electron bunches. In this Letter, we present measurements of large amplitude fields excited by positron bunches with collider-relevant parameters (energy 20 GeV, and $0.7 \times 10^{10}$ particles per bunch) in a 0.4 THz, cylindrically symmetric dielectric structure. Interferometric measurements of emitted coherent Cerenkov radiation permit spectral characterization of the positron-generated wakefields, which are compared to those excited by electron bunches. Statistical equivalence tests are incorporated to show the charge-sign invariance of the induced wakefield spectra. Transverse effects on positron beams resulting from off-axis excitation are examined and found to be consistent with the known linear response of the DWA system. The results are supported by numerical simulations and demonstrate high-gradient wakefield excitation in dielectrics for positron beams.
Submitted 5 November, 2021;
originally announced November 2021.
-
Assistance and Interdiction Problems on Interval Graphs
Authors:
Hung P. Hoang,
Stefan Lendl,
Lasse Wulf
Abstract:
We introduce a novel framework of graph modifications specific to interval graphs. We study interdiction problems with respect to these graph modifications. Given a list of original intervals, each interval has a replacement interval such that either the replacement contains the original, or the original contains the replacement. The interdictor is allowed to replace up to $k$ original intervals with their replacements. Using this framework we also study the contrary of interdiction problems which we call assistance problems. We study these problems for the independence number, the clique number, shortest paths, and the scattering number. We obtain polynomial time algorithms for most of the studied problems. Via easy reductions, it follows that on interval graphs, the most vital nodes problem with respect to shortest path, independence number and Hamiltonicity can be solved in polynomial time.
Submitted 30 July, 2021;
originally announced July 2021.
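One of the quantities studied in the abstract above, the independence number of an interval graph, has a well-known greedy computation. The sketch below shows only that baseline quantity, not the paper's interdiction or assistance algorithms; it assumes closed intervals given as `(left, right)` pairs, and the function name is hypothetical.

```python
def interval_independence_number(intervals):
    """Size of a largest set of pairwise disjoint closed intervals."""
    count = 0
    last_end = float("-inf")  # right endpoint of the most recently chosen interval
    # Greedy rule: always take the interval that finishes earliest
    # among those that do not touch the last chosen one.
    for left, right in sorted(intervals, key=lambda iv: iv[1]):
        if left > last_end:
            count += 1
            last_end = right
    return count


print(interval_independence_number([(1, 3), (2, 4), (5, 6)]))
```

In the interdiction setting described in the abstract, an adversary would swap up to $k$ intervals for their replacements to change this value; evaluating a given choice of swaps reduces to a call like the one above.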
-
Duality-symmetric axion electrodynamics and haloscopes of various geometries
Authors:
Dai-Nam Le,
Le Phuong Hoang,
Binh Xuan Cao
Abstract:
Within the dual symmetric point of view, the theory for seeking axion dark matter via haloscope experiments is derived by exactly solving the dual symmetric axion electrodynamics equation. Notwithstanding that the conventional theory of axion electrodynamics presented in [Phys. Rev. Lett. 51, 1415 (1983) and J. Phys. A: Math. Gen. 19, L33 (1986)] is more commonly used in haloscope theory, we show that the dual symmetric axion electrodynamics has more advantages when applied to haloscope theory. First, the dual symmetric and conventional perspectives of axion electrodynamics coincide under long-wavelength approximation. Moreover, dual symmetric theory can obtain an exact analytical expression of the axion-induced electromagnetic field for any state of axion. This solution has been used in conventional theory for long-wavelength approximation. The difference between the two theories can occur in directional axion detection or electric sensing haloscopes. For illustrative purposes, we consider various types of resonant cavities: cylindrical solenoid, spherical solenoid, two-parallel-sheet cavity, toroidal solenoid with a rectangular cross-section, and with a circular cross-section. The resonance of the axion-induced signal, as well as the ratio of the energy difference over the stored energy inside the cavity, are investigated in these types of cavities.
Submitted 20 August, 2022; v1 submitted 9 July, 2021;
originally announced July 2021.
-
Embedding-based Recommender System for Job to Candidate Matching on Scale
Authors:
Jing Zhao,
Jingya Wang,
Madhav Sigdel,
Bopeng Zhang,
Phuong Hoang,
Mengshu Liu,
Mohammed Korayem
Abstract:
The online recruitment matching system has been the core technology and service platform in CareerBuilder. One of the major challenges in an online recruitment scenario is to provide good matches between job posts and candidates using a recommender system on the scale. In this paper, we discussed the techniques for applying an embedding-based recommender system for the large scale of job to candidates matching. To learn the comprehensive and effective embedding for job posts and candidates, we have constructed a fused-embedding via different levels of representation learning from raw text, semantic entities and location information. The clusters of fused-embedding of job and candidates are then used to build and train the Faiss index that supports runtime approximate nearest neighbor search for candidate retrieval. After the first stage of candidate retrieval, a second stage reranking model that utilizes other contextual information was used to generate the final matching result. Both offline and online evaluation results indicate a significant improvement of our proposed two-staged embedding-based system in terms of click-through rate (CTR), quality and normalized discounted accumulated gain (nDCG), compared to those obtained from our baseline system. We further described the deployment of the system that supports the million-scale job and candidate matching process at CareerBuilder. The overall improvement of our job to candidate matching system has demonstrated its feasibility and scalability at a major online recruitment site.
Submitted 1 July, 2021;
originally announced July 2021.
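The nDCG metric cited in the abstract above is usually expanded as normalized discounted cumulative gain. The following is a minimal sketch of one common formulation, assuming linear gains and a $\log_2$ rank discount; the abstract does not say which variant the authors used, and the function names are hypothetical.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance scores."""
    # Rank 1 is undiscounted because log2(1 + 1) == 1.
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering; 0.0 if nothing is relevant."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# A ranking that places the only relevant candidate second instead of first.
print(ndcg([0, 1]))
```

A score of 1.0 means the candidates were returned in the best possible order, so the metric isolates ranking quality from the number of relevant candidates available.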
-
How Great is the Great Firewall? Measuring China's DNS Censorship
Authors:
Nguyen Phong Hoang,
Arian Akhavan Niaki,
Jakub Dalek,
Jeffrey Knockel,
Pellaeon Lin,
Bill Marczak,
Masashi Crete-Nishihata,
Phillipa Gill,
Michalis Polychronakis
Abstract:
The DNS filtering apparatus of China's Great Firewall (GFW) has evolved considerably over the past two decades. However, most prior studies of China's DNS filtering were performed over short time periods, leading to unnoticed changes in the GFW's behavior. In this study, we introduce GFWatch, a large-scale, longitudinal measurement platform capable of testing hundreds of millions of domains daily, enabling continuous monitoring of the GFW's DNS filtering behavior.
We present the results of running GFWatch over a nine-month period, during which we tested an average of 411M domains per day and detected a total of 311K domains censored by GFW's DNS filter. To the best of our knowledge, this is the largest number of domains tested and censored domains discovered in the literature. We further reverse engineer regular expressions used by the GFW and find 41K innocuous domains that match these filters, resulting in overblocking of their content. We also observe bogus IPv6 and globally routable IPv4 addresses injected by the GFW, including addresses owned by US companies, such as Facebook, Dropbox, and Twitter.
Using data from GFWatch, we studied the impact of GFW blocking on the global DNS system. We found 77K censored domains with DNS resource records polluted in popular public DNS resolvers, such as Google and Cloudflare. Finally, we propose strategies to detect poisoned responses that can (1) sanitize poisoned DNS records from the cache of public DNS resolvers, and (2) assist in the development of circumvention tools to bypass the GFW's DNS censorship.
Submitted 3 June, 2021;
originally announced June 2021.
-
Conflict-Free Coloring: Graphs of Bounded Clique Width and Intersection Graphs
Authors:
Sriram Bhyravarapu,
Tim A. Hartmann,
Hung P. Hoang,
Subrahmanyam Kalyanasundaram,
I. Vinod Reddy
Abstract:
A conflict-free coloring of a graph $G$ is a (partial) coloring of its vertices such that every vertex $u$ has a neighbor whose assigned color is unique in the neighborhood of $u$. There are two variants of this coloring, one defined using the open neighborhood and one using the closed neighborhood. For both variants, we study the problem of deciding whether the conflict-free coloring of a given graph $G$ is at most a given number $k$.
In this work, we investigate the relation of clique-width and minimum number of colors needed (for both variants) and show that these parameters do not bound one another. Moreover, we consider specific graph classes, particularly graphs of bounded clique-width and types of intersection graphs, such as distance hereditary graphs, interval graphs and unit square and disk graphs. We also consider Kneser graphs and split graphs. We give (often tight) upper and lower bounds and determine the complexity of the decision problem on these graph classes, which improve some of the results from the literature. Particularly, we settle the number of colors needed for an interval graph to be conflict-free colored under the open neighborhood model, which was posed as an open problem.
Submitted 11 March, 2024; v1 submitted 18 May, 2021;
originally announced May 2021.
-
UIT-E10dot3 at SemEval-2021 Task 5: Toxic Spans Detection with Named Entity Recognition and Question-Answering Approaches
Authors:
Phu Gia Hoang,
Luan Thanh Nguyen,
Kiet Van Nguyen
Abstract:
The increase in toxic comments in online spaces has serious effects on vulnerable users. For this reason, considerable efforts have been made to address the problem, and SemEval-2021 Task 5: Toxic Spans Detection is one of them. This task asks competitors to extract toxic spans from the given texts, and we conducted several analyses to understand its structure before running experiments. We approach the task in two ways, Named Entity Recognition with the spaCy library and Question-Answering with RoBERTa combined with ToxicBERT; the former achieves the highest F1-score of 66.99%.
Submitted 15 April, 2021;
originally announced April 2021.
-
Domain Name Encryption Is Not Enough: Privacy Leakage via IP-based Website Fingerprinting
Authors:
Nguyen Phong Hoang,
Arian Akhavan Niaki,
Phillipa Gill,
Michalis Polychronakis
Abstract:
Although the security benefits of domain name encryption technologies such as DNS over TLS (DoT), DNS over HTTPS (DoH), and Encrypted Client Hello (ECH) are clear, their positive impact on user privacy is weakened by--the still exposed--IP address information. However, content delivery networks, DNS-based load balancing, co-hosting of different websites on the same server, and IP address churn, all contribute towards making domain-IP mappings unstable, and prevent straightforward IP-based browsing tracking.
In this paper, we show that this instability is not a roadblock (assuming a universal DoT/DoH and ECH deployment), by introducing an IP-based website fingerprinting technique that allows a network-level observer to identify at scale the website a user visits. Our technique exploits the complex structure of most websites, which load resources from several domains besides their primary one. Using the generated fingerprints of more than 200K websites studied, we could successfully identify 84% of them when observing solely destination IP addresses. The accuracy rate increases to 92% for popular websites, and 95% for popular and sensitive websites. We also evaluated the robustness of the generated fingerprints over time, and demonstrate that they are still effective at successfully identifying about 70% of the tested websites after two months. We conclude by discussing strategies for website owners and hosting providers towards hindering IP-based website fingerprinting and maximizing the privacy benefits offered by DoT/DoH and ECH.
Submitted 16 June, 2021; v1 submitted 16 February, 2021;
originally announced February 2021.
-
A Subexponential Algorithm for ARRIVAL
Authors:
Bernd Gärtner,
Sebastian Haslebacher,
Hung P. Hoang
Abstract:
The ARRIVAL problem is to decide the fate of a train moving along the edges of a directed graph, according to a simple (deterministic) pseudorandom walk. The problem is in $NP \cap coNP$ but not known to be in $P$. The currently best algorithms have runtime $2^{\Theta(n)}$ where $n$ is the number of vertices. This is not much better than just performing the pseudorandom walk. We develop a subexponential algorithm with runtime $2^{O(\sqrt{n}\log n)}$. We also give a polynomial-time algorithm if the graph is almost acyclic. Both results are derived from a new general approach to solve ARRIVAL instances.
Submitted 9 April, 2021; v1 submitted 12 February, 2021;
originally announced February 2021.
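The baseline the abstract above compares against, "just performing the pseudorandom walk", can be sketched as a direct simulation. This is not the subexponential algorithm from the paper. It assumes the usual switch-graph formulation of ARRIVAL, which the abstract does not spell out: every vertex has an even and an odd successor, and the train alternates between them on successive visits, starting with the even one. The function name and the list-based encoding are hypothetical.

```python
def run_arrival(s_even, s_odd, origin, destination):
    """Simulate the switching walk; return the number of steps, or None if it never arrives.

    s_even[v] and s_odd[v] are the two successors of vertex v (assumed encoding).
    """
    parity = [0] * len(s_even)  # 0: next departure from v uses s_even[v], 1: uses s_odd[v]
    v = origin
    steps = 0
    seen = set()
    while v != destination:
        state = (v, tuple(parity))
        if state in seen:
            # The walk is deterministic, so a repeated state means it cycles forever.
            return None
        seen.add(state)
        nxt = s_even[v] if parity[v] == 0 else s_odd[v]
        parity[v] ^= 1  # flip the switch at v before leaving
        v = nxt
        steps += 1
    return steps


# Vertex 2 is the destination; vertex 0 first sends the train to 1, then to 2.
print(run_arrival(s_even=[1, 0, 2], s_odd=[2, 2, 2], origin=0, destination=2))
```

The set of visited states can grow exponentially in the number of vertices, which is the cost the abstract refers to when it calls existing runtimes not much better than the walk itself.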
-
The Web is Still Small After More Than a Decade
Authors:
Nguyen Phong Hoang,
Arian Akhavan Niaki,
Michalis Polychronakis,
Phillipa Gill
Abstract:
Understanding web co-location is essential for various reasons. For instance, it can help one to assess the collateral damage that denial-of-service attacks or IP-based blocking can cause to the availability of co-located web sites. However, it has been more than a decade since the first study was conducted in 2007. The Internet infrastructure has changed drastically since then, necessitating a renewed study to comprehend the nature of web co-location.
In this paper, we conduct an empirical study to revisit web co-location using datasets collected from active DNS measurements. Our results show that the web is still small and centralized to a handful of hosting providers. More specifically, we find that more than 60% of web sites are co-located with at least ten other web sites---a group comprising less popular web sites. In contrast, 17.5% of mostly popular web sites are served from their own servers.
Although a high degree of web co-location could make co-hosted sites vulnerable to DoS attacks, our findings show that it is an increasing trend to co-host many web sites and serve them from well-provisioned content delivery networks (CDN) of major providers that provide advanced DoS protection benefits. Regardless of the high degree of web co-location, our analyses of popular block lists indicate that IP-based blocking does not cause severe collateral damage as previously thought.
Submitted 9 April, 2020;
originally announced April 2020.
-
K-resolver: Towards Decentralizing Encrypted DNS Resolution
Authors:
Nguyen Phong Hoang,
Ivan Lin,
Seyedhamed Ghavamnia,
Michalis Polychronakis
Abstract:
Centralized DNS over HTTPS/TLS (DoH/DoT) resolution, which has started being deployed by major hosting providers and web browsers, has sparked controversy among Internet activists and privacy advocates due to several privacy concerns. This design decision causes the trace of all DNS resolutions to be exposed to a third-party resolver, different than the one specified by the user's access network. In this work we propose K-resolver, a DNS resolution mechanism that disperses DNS queries across multiple DoH resolvers, reducing the amount of information about a user's browsing activity exposed to each individual resolver. As a result, none of the resolvers can learn a user's entire web browsing history. We have implemented a prototype of our approach for Mozilla Firefox, and used it to evaluate the performance of web page load time compared to the default centralized DoH approach. While our K-resolver mechanism has some effect on DNS resolution time and web page load time, we show that this is mainly due to the geographical location of the selected DoH servers. When more well-provisioned anycast servers are available, our approach incurs negligible overhead while improving user privacy.
Submitted 17 February, 2020; v1 submitted 24 January, 2020;
originally announced January 2020.
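The query-dispersal idea described in the K-resolver abstract can be illustrated with a minimal sketch. The abstract does not say how a resolver is chosen for each query, so the hash-modulo-K policy below is an assumption made for illustration, and the resolver URLs are hypothetical placeholders rather than real services.

```python
import hashlib

# Hypothetical DoH endpoints; the abstract does not name any resolvers.
RESOLVERS = [
    "https://doh-a.example/dns-query",
    "https://doh-b.example/dns-query",
    "https://doh-c.example/dns-query",
]


def pick_resolver(domain, resolvers=RESOLVERS):
    """Map a domain to one resolver deterministically.

    Assumed policy (not taken from the paper): hash the lowercased
    domain and reduce it modulo the number of resolvers.
    """
    digest = hashlib.sha256(domain.lower().encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(resolvers)
    return resolvers[index]


def exposure(domains, resolvers=RESOLVERS):
    """Return, for each resolver, the set of domains it would observe."""
    seen = {resolver: set() for resolver in resolvers}
    for domain in domains:
        seen[pick_resolver(domain, resolvers)].add(domain.lower())
    return seen
```

Because the choice is deterministic, repeated lookups of the same domain always reach the same resolver, so each resolver observes only its own share of the domains a user visits.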
-
Combinatorial generation via permutation languages. II. Lattice congruences
Authors:
Hung Phuc Hoang,
Torsten Mütze
Abstract:
This paper deals with lattice congruences of the weak order on the symmetric group, and initiates the investigation of the cover graphs of the corresponding lattice quotients. These graphs also arise as the skeleta of the so-called quotientopes, a family of polytopes recently introduced by Pilaud and Santos [Bull. Lond. Math. Soc., 51:406-420, 2019], which generalize permutahedra, associahedra, hypercubes and several other polytopes. We prove that all of these graphs have a Hamilton path, which can be computed by a simple greedy algorithm. This is an application of our framework for exhaustively generating various classes of combinatorial objects by encoding them as permutations. We also characterize which of these graphs are vertex-transitive or regular via their arc diagrams, give corresponding precise and asymptotic counting results, and we determine their minimum and maximum degrees. Moreover, we investigate the relation between lattice congruences of the weak order and pattern-avoiding permutations.
Submitted 2 December, 2022; v1 submitted 27 November, 2019;
originally announced November 2019.
-
Assessing the Privacy Benefits of Domain Name Encryption
Authors:
Nguyen Phong Hoang,
Arian Akhavan Niaki,
Nikita Borisov,
Phillipa Gill,
Michalis Polychronakis
Abstract:
As Internet users have become more savvy about the potential for their Internet communication to be observed, the use of network traffic encryption technologies (e.g., HTTPS/TLS) is on the rise. However, even when encryption is enabled, users leak information about the domains they visit via DNS queries and via the Server Name Indication (SNI) extension of TLS. Two recent proposals to ameliorate this issue are DNS over HTTPS/TLS (DoH/DoT) and Encrypted SNI (ESNI). In this paper we aim to assess the privacy benefits of these proposals by considering the relationship between hostnames and IP addresses, the latter of which are still exposed. We perform DNS queries from nine vantage points around the globe to characterize this relationship. We quantify the privacy gain offered by ESNI for different hosting and CDN providers using two different metrics, the k-anonymity degree due to co-hosting and the dynamics of IP address changes. We find that 20% of the domains studied will not gain any privacy benefit since they have a one-to-one mapping between their hostname and IP address. On the other hand, 30% will gain a significant privacy benefit with a k value greater than 100, since these domains are co-hosted with more than 100 other domains. Domains whose visitors' privacy will meaningfully improve are far less popular, while for popular domains the benefit is not significant. Analyzing the dynamics of IP addresses of long-lived domains, we find that only 7.7% of them change their hosting IP addresses on a daily basis. We conclude by discussing potential approaches for website owners and hosting/CDN providers for maximizing the privacy benefits of ESNI.
Submitted 8 July, 2020; v1 submitted 1 November, 2019;
originally announced November 2019.
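The co-hosting metric mentioned in the abstract above can be sketched from a table of DNS answers. The exact definition used in the paper is not given in the abstract, so this sketch assumes that a domain's k is the size of the smallest set of domains sharing any one of its IP addresses, counting the domain itself. Under that assumption, k = 1 corresponds to a one-to-one hostname-to-IP mapping. The function name and the input layout are hypothetical.

```python
from collections import defaultdict


def k_anonymity(domain_to_ips):
    """Compute an assumed co-hosting k value for each domain.

    domain_to_ips maps a domain name to the set of IP addresses it
    resolved to. k is taken as the smallest co-hosting set over the
    domain's IPs (a worst-case view), including the domain itself.
    """
    ip_to_domains = defaultdict(set)
    for domain, ips in domain_to_ips.items():
        for ip in ips:
            ip_to_domains[ip].add(domain)

    return {
        domain: min(len(ip_to_domains[ip]) for ip in ips)
        for domain, ips in domain_to_ips.items()
        if ips  # skip domains with no observed address
    }
```

Taking the minimum over a domain's addresses is a conservative design choice: an observer who sees the least-shared address learns the most, so that address bounds the privacy gain.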
-
Automated Discovery and Classification of Training Videos for Career Progression
Authors:
Alan Chern,
Phuong Hoang,
Madhav Sigdel,
Janani Balaji,
Mohammed Korayem
Abstract:
Job transitions and upskilling are common actions taken by many industry working professionals throughout their career. With the current rapidly changing job landscape where requirements are constantly changing and industry sectors are emerging, it is especially difficult to plan and navigate a predetermined career path. In this work, we implemented a system to automate the collection and classification of training videos to help job seekers identify and acquire the skills necessary to transition to the next step in their career. We extracted educational videos and built a machine learning classifier to predict video relevancy. This system allows us to discover relevant videos at a large scale for job title-skill pairs. Our experiments show significant improvements in the model performance by incorporating embedding vectors associated with the video attributes. Additionally, we evaluated the optimal probability threshold to extract as many videos as possible with minimal false positive rate.
Submitted 23 July, 2019;
originally announced July 2019.
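The threshold evaluation mentioned at the end of the abstract above can be sketched as follows. The abstract does not state the selection rule, so this sketch assumes one plausible reading: on a labeled validation set, pick the lowest score threshold whose false positive rate stays within a chosen cap, which extracts as many videos as that cap allows. The function name, the cap parameter, and the 0/1 label encoding are assumptions.

```python
def pick_threshold(scores, labels, max_fpr):
    """Return the lowest threshold whose false positive rate is <= max_fpr.

    scores: predicted relevance probabilities, one per video.
    labels: 1 for relevant, 0 for irrelevant (assumed encoding).
    A video is extracted when its score is >= the threshold.
    Returns None if even the highest threshold exceeds the cap.
    """
    negatives = sum(1 for y in labels if y == 0)
    best = None
    # Lowering the threshold can only add false positives, so we can
    # stop at the first threshold that exceeds the cap.
    for threshold in sorted(set(scores), reverse=True):
        false_pos = sum(
            1 for s, y in zip(scores, labels) if s >= threshold and y == 0
        )
        fpr = false_pos / negatives if negatives else 0.0
        if fpr > max_fpr:
            break
        best = threshold
    return best
```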
-
Seismic performance of an infilled moment-resisting steel frame during the 2016 Central Italy Earthquake
Authors:
Phan Hoang Nam,
Fabrizio Paolacci,
Phuong Hoa Hoang
Abstract:
A sequence of earthquakes occurred between the end of August 2016 and the end of October 2016 in Central Italy, causing significant damage and major disruption over a wide area. The sequence comprised five events with magnitudes between Mw 5.5 and 6.5. Numerous residential buildings in the affected area were not particularly resistant to the shaking, resulting in collapse and heavy damage. With a particular focus on masonry-infilled steel frames, this paper evaluates the seismic performance of an infilled moment-resisting steel frame located in Amatrice, Central Italy, which suffered significant damage during the August 2016 Central Italy earthquake. The aim is to investigate the effect of the masonry infill on the seismic performance of the building. The three-dimensional (3D) frame building is modeled using the OpenSees software, where the beam and column elements are modeled using a nonlinear hinge model and the infill is idealized as diagonal struts with nonlinear hysteretic behavior. Nonlinear static and dynamic analyses are performed for both the bare and infilled frames in order to assess the effect of the masonry infill on the overall seismic response and to confirm the actual damage pattern of the case study surveyed in the aftermath of the 2016 Central Italy earthquake.
Submitted 24 July, 2019;
originally announced July 2019.
-
Measuring I2P Censorship at a Global Scale
Authors:
Nguyen Phong Hoang,
Sadie Doreen,
Michalis Polychronakis
Abstract:
The prevalence of Internet censorship has prompted the creation of several measurement platforms for monitoring filtering activities. An important challenge faced by these platforms revolves around the trade-off between depth of measurement and breadth of coverage. In this paper, we present an opportunistic censorship measurement infrastructure built on top of a network of distributed VPN servers run by volunteers, which we used to measure the extent to which the I2P anonymity network is blocked around the world. This infrastructure provides us with not only numerous and geographically diverse vantage points, but also the ability to conduct in-depth measurements across all levels of the network stack. Using this infrastructure, we measured at a global scale the availability of four different I2P services: the official homepage, its mirror site, reseed servers, and active relays in the network. Within a period of one month, we conducted a total of 54K measurements from 1.7K network locations in 164 countries. With different techniques for detecting domain name blocking, network packet injection, and block pages, we discovered I2P censorship in five countries: China, Iran, Oman, Qatar, and Kuwait. Finally, we conclude by discussing potential approaches to circumvent censorship on I2P.
Submitted 16 July, 2019;
originally announced July 2019.
-
ICLab: A Global, Longitudinal Internet Censorship Measurement Platform
Authors:
Arian Akhavan Niaki,
Shinyoung Cho,
Zachary Weinberg,
Nguyen Phong Hoang,
Abbas Razaghpanah,
Nicolas Christin,
Phillipa Gill
Abstract:
Researchers have studied Internet censorship for nearly as long as attempts to censor contents have taken place. Most studies have however been limited to a short period of time and/or a few countries; the few exceptions have traded off detail for breadth of coverage. Collecting enough data for a comprehensive, global, longitudinal perspective remains challenging. In this work, we present ICLab, an Internet measurement platform specialized for censorship research. It achieves a new balance between breadth of coverage and detail of measurements, by using commercial VPNs as vantage points distributed around the world. ICLab has been operated continuously since late 2016. It can currently detect DNS manipulation and TCP packet injection, and overt "block pages" however they are delivered. ICLab records and archives raw observations in detail, making retrospective analysis with new techniques possible. At every stage of processing, ICLab seeks to minimize false positives and manual validation.
Within 53,906,532 measurements of individual web pages, collected by ICLab in 2017 and 2018, we observe blocking of 3,602 unique URLs in 60 countries. Using this data, we compare how different blocking techniques are deployed in different regions and/or against different types of content. Our longitudinal monitoring pinpoints changes in censorship in India and Turkey concurrent with political shifts, and our clustering techniques discover 48 previously unknown block pages. ICLab's broad and detailed measurements also expose other forms of network interference, such as surveillance and malware injection.
Submitted 10 July, 2019; v1 submitted 9 July, 2019;
originally announced July 2019.
-
Combinatorial generation via permutation languages. I. Fundamentals
Authors:
Elizabeth Hartung,
Hung Phuc Hoang,
Torsten Mütze,
Aaron Williams
Abstract:
In this work we present a general and versatile algorithmic framework for exhaustively generating a large variety of different combinatorial objects, based on encoding them as permutations. This approach provides a unified view on many known results and allows us to prove many new ones. In particular, we obtain four classical Gray codes for permutations, bitstrings, binary trees and set partitions as special cases. We present two distinct applications for our new framework: The first main application is the generation of pattern-avoiding permutations, yielding new Gray codes for different families of permutations that are characterized by the avoidance of certain classical patterns, (bi)vincular patterns, barred patterns, boxed patterns, Bruhat-restricted patterns, mesh patterns, monotone and geometric grid classes, and many others. We also obtain new Gray codes for all the combinatorial objects that are in bijection to these permutations, in particular for five different types of geometric rectangulations, also known as floorplans, which are divisions of a square into $n$ rectangles subject to certain restrictions. The second main application of our framework are lattice congruences of the weak order on the symmetric group $S_n$. Recently, Pilaud and Santos realized all those lattice congruences as $(n-1)$-dimensional polytopes, called quotientopes, which generalize hypercubes, associahedra, permutahedra etc. Our algorithm generates the equivalence classes of each of those lattice congruences, by producing a Hamilton path on the skeleton of the corresponding quotientope, yielding a constructive proof that each of these highly symmetric graphs is Hamiltonian. We thus also obtain a provable notion of optimality for the Gray codes obtained from our framework: They translate into walks along the edges of a polytope.
Submitted 3 November, 2021; v1 submitted 14 June, 2019;
originally announced June 2019.
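To give a feel for the greedy approach to Gray codes that the abstract above refers to, here is a minimal sketch of the simplest special case only: listing all permutations of 1..n by adjacent transpositions. It is not the paper's general framework. The rule used, "swap the largest possible value with a neighbour so that an unvisited permutation results", is the well-known greedy description of the Steinhaus-Johnson-Trotter order. The function name is hypothetical, and the visited set is kept explicitly for clarity, which costs memory proportional to n!.

```python
def greedy_adjacent_gray_code(n):
    """List all permutations of 1..n, each differing from the previous
    one by a single adjacent transposition, using a greedy rule."""
    current = tuple(range(1, n + 1))
    visited = {current}
    order = [current]
    while True:
        for value in range(n, 0, -1):      # prefer the largest value
            i = current.index(value)
            moved = False
            for j in (i - 1, i + 1):       # left neighbour, then right
                if 0 <= j < n:
                    candidate = list(current)
                    candidate[i], candidate[j] = candidate[j], candidate[i]
                    candidate = tuple(candidate)
                    if candidate not in visited:
                        current = candidate
                        visited.add(current)
                        order.append(current)
                        moved = True
                        break
            if moved:
                break
        else:
            # No value can move to an unvisited permutation: done.
            return order
```

For n = 3 this visits 123, 132, 312, 321, 231, 213, which is a Hamilton path on the permutahedron of order 3.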
-
Fedoryuk values and stability of global Hölderian error bounds for polynomial functions
Authors:
Huy-Vui Hà,
Phi-Dũng Hoàng
Abstract:
Let $f$ be a polynomial function of $n$ variables. In this paper, we study the stability of global Hölderian error bounds for a nonempty sublevel set $[f \le t]$ under a perturbation of $t$. We give:
* Criteria for the existence of a global Hölderian error bound of $[f \le t]$;
* Formulas for computing explicitly the set $$H(f) := \{ t \in \mathbb{R}: [f \le t]\ \text{has a global Hölderian error bound}\}$$ via some Fedoryuk values of $f$ and definition of threshold for the existence of global Hölderian error bound of $f$;
* Definition of all types of stability of global Hölderian error bound of $[f \le t]$.
Submitted 16 October, 2019; v1 submitted 15 February, 2019;
originally announced February 2019.
-
An Empirical Study of the I2P Anonymity Network and its Censorship Resistance
Authors:
Nguyen Phong Hoang,
Panagiotis Kintis,
Manos Antonakakis,
Michalis Polychronakis
Abstract:
Tor and I2P are well-known anonymity networks used by many individuals to protect their online privacy and anonymity. Tor's centralized directory services facilitate the understanding of the Tor network, as well as the measurement and visualization of its structure through the Tor Metrics project. In contrast, I2P does not rely on centralized directory servers, and thus obtaining a complete view of the network is challenging. In this work, we conduct an empirical study of the I2P network, in which we measure properties including population, churn rate, router type, and the geographic distribution of I2P peers. We find that there are currently around 32K active I2P peers in the network on a daily basis. Of these peers, 14K are located behind NAT or firewalls.
Using the collected network data, we examine the blocking resistance of I2P against a censor that wants to prevent access to I2P using address-based blocking techniques. Despite the decentralized characteristics of I2P, we discover that a censor can block more than 95% of peer IP addresses known by a stable I2P client by operating only 10 routers in the network. This amounts to severe network impairment: a blocking rate of more than 70% is enough to cause significant latency in web browsing activities, while blocking more than 90% of peer IP addresses can make the network unusable. Finally, we discuss the security consequences of the network being blocked, and directions for potential approaches to make I2P more resistant to blocking.
Submitted 25 September, 2018; v1 submitted 24 September, 2018;
originally announced September 2018.
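The blocking rate discussed in the abstract above can be expressed as a simple set computation. This is a sketch under stated assumptions, not the paper's measurement pipeline: the censor's blocklist is taken to be the union of the peer IP addresses observed by the routers it operates, and the blocking rate is the fraction of a client's known peer addresses that appear on that blocklist. The function and parameter names are hypothetical.

```python
def blocking_rate(client_known_ips, censor_router_views):
    """Fraction of a client's known peer IPs that a censor could block.

    client_known_ips: iterable of peer IP addresses known to one client.
    censor_router_views: list of sets, one per censor-operated router,
        each holding the peer IPs that router has observed.
    """
    known = set(client_known_ips)
    if not known:
        return 0.0
    # Assumed blocklist: every address seen by any censor router.
    blocklist = set().union(*censor_router_views)
    return len(known & blocklist) / len(known)
```

Adding more router views can only grow the blocklist, so the rate is non-decreasing in the number of routers the censor operates.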
-
An inverse free electron laser acceleration-driven Compton scattering X-ray source
Authors:
I. Gadjev,
N. Sudar,
M. Babzien,
J. Duris,
P. Hoang,
M. Fedurin,
K. Kusche,
R. Malone,
P. Musumeci,
M. Palmer,
I. Pogorelsky,
M. Polyanskiy,
Y. Sakai,
C. Swinson,
O. Williams,
J. B. Rosenzweig
Abstract:
The generation of X-rays and γ-rays based on synchrotron radiation from free electrons, emitted in magnet arrays such as undulators, forms the basis of much of modern X-ray science. This approach has the drawback of requiring very high energy, up to the multi-GeV-scale, electron beams, to obtain the required photon energy. Due to the limit in accelerating gradients in conventional particle accelerators, reaching high energy typically demands use of instruments exceeding hundreds of meters in length. Compact, less costly, monochromatic X-ray sources based on very high field acceleration and very short period undulators, however, may revolutionize diverse advanced X-ray applications ranging from novel X-ray therapy techniques to active interrogation of sensitive materials, by making them accessible in cost and size. Such compactness may be obtained by an all-optical approach, which employs a laser-driven high gradient accelerator based on inverse free electron laser (IFEL), followed by a collision point for inverse Compton scattering (ICS), a scheme where a laser is used to provide undulator fields. We present an experimental proof-of-principle of this approach, where a TW-class CO2 laser pulse is split in two, with half used to accelerate a high quality electron beam up to 84 MeV through the IFEL interaction, and the other half acts as an electromagnetic undulator to generate up to 13 keV X-rays via ICS. These results demonstrate the feasibility of this scheme, which can be joined with other techniques such as laser recirculation to yield very compact, high brilliance photon sources, extending from the keV to MeV scale. Furthermore, use of the IFEL acceleration with the ICS interaction produces a train of very high intensity X-ray pulses, thus also permitting a unique tool that can be phase-locked to a laser pulse in frontier pump-probe experimental scenarios.
△ Less
Submitted 2 November, 2017;
originally announced November 2017.
-
Distributed Constrained Optimization over Networked Systems via A Singular Perturbation Method
Authors:
Phuong Huu Hoang,
Hyo-Sung Ahn
Abstract:
This paper studies a constrained optimization problem over networked systems with an undirected and connected communication topology. The algorithm proposed in this work utilizes singular perturbation, dynamic average consensus, and saddle point dynamics methods to tackle the problem for a general class of objective functions and affine constraints in a fully distributed manner. It is shown that the privacy of the agents' information in the interconnected network is guaranteed by our proposed strategy. Theoretical guarantees on the optimality of the solution are provided by rigorous analyses. We apply the proposed solution to energy networks and demonstrate it in two simulations.
△ Less
Submitted 23 October, 2017;
originally announced October 2017.
-
Topological invariants of plane curve singularities: Polar quotients and Łojasiewicz gradient exponents
Authors:
Hong-Duc Nguyen,
Tien-Son Pham,
Phi-Dung Hoang
Abstract:
In this paper, we study polar quotients and Łojasiewicz exponents of plane curve singularities, which are {\em not necessarily reduced}. We first show that the polar quotients are topological invariants. We next prove that the Łojasiewicz gradient exponent can be computed in terms of the polar quotients, and so it is also a topological invariant. As an application, we give effective estimates of the Łojasiewicz exponents in the gradient and classical inequalities of polynomials in two (real or complex) variables.
Submitted 28 August, 2017;
originally announced August 2017.
-
Experimental characterization of electron beam driven wakefield modes in a dielectric woodpile Cartesian symmetric structure
Authors:
P. D. Hoang,
G. Andonian,
I. Gadjev,
B. Naranjo,
Y. Sakai,
N. Sudar,
O. Williams,
M. Fedurin,
K. Kusche,
C. Swinson,
J. B. Rosenzweig
Abstract:
The use of photonic structures in the terahertz (THz) spectral region may enable the essential characteristics of confinement, modal control, and electric field shielding for very high gradient accelerators based on wakefields in dielectrics. We report here an experimental investigation of THz wakefield modes in a 3D photonic woodpile structure. Selective control in exciting or suppressing of wakefield modes with non-zero transverse wave vector is demonstrated by using drive beams of varying transverse ellipticity. Additionally, we show that the wakefield spectrum is insensitive to the offset position of the strongly elliptical beam. These results are consistent with analytic theory and 3D simulations, and illustrate a key advantage of wakefield systems with Cartesian symmetry, the suppression of transverse wakes by elliptical beams.
Submitted 31 July, 2017;
originally announced August 2017.
-
Single shot, double differential spectral measurements of inverse Compton scattering in linear and nonlinear regimes
Authors:
Y. Sakai,
I. Gadjev,
P. Hoang,
N. Majernik,
A. Nause,
A. Fukusawa,
O. Williams,
M. Fedurin,
B. Malone,
C. Swinson,
K. Kusche,
M. Polyanski,
M. Babzien,
M. Montemagno,
Z. Zhong,
P. Siddons,
I. Pogorelsky,
V. Yakimenko,
T. Kumita,
Y. Kamiya,
J. B. Rosenzweig
Abstract:
Inverse Compton scattering (ICS) is a unique mechanism for producing fast pulses - picosecond and below - of bright X- to gamma-rays. These nominally narrow spectral bandwidth electromagnetic radiation pulses are efficiently produced in the interaction between intense, well-focused electron and laser beams. The spectral characteristics of such sources are affected by many experimental parameters, such as the bandwidth of the laser, and the angles of both the electrons and laser photons at collision. The laser field amplitude induces harmonic generation and importantly, for the present work, nonlinear red shifting, both of which dilute the spectral brightness of the radiation. As the applications enabled by this source often depend sensitively on its spectra, it is critical to resolve the details of the wavelength and angular distribution obtained from ICS collisions. With this motivation, we present here an experimental study that greatly improves on previous spectral measurement methods based on X-ray K-edge filters, by implementing a multi-layer bent-crystal X-ray spectrometer. In tandem with a collimating slit, this method reveals a projection of the double-differential angular-wavelength spectrum of the ICS radiation in a single shot. The measurements enabled by this diagnostic illustrate the combined off-axis and nonlinear-field-induced red shifting in the ICS emission process. They reveal in detail the strength of the normalized laser vector potential, and provide a non-destructive measure of the temporal and spatial electron-laser beam overlap.
Submitted 2 January, 2017;
originally announced January 2017.
-
Towards an Autonomous System Monitor for Mitigating Correlation Attacks in the Tor Network
Authors:
Nguyen Phong Hoang
Abstract:
After carefully considering the scalability problem in Tor and exhaustively evaluating related works on AS-level adversaries, the author proposes ASmoniTor, which is an autonomous system monitor for mitigating correlation attacks in the Tor network. In contrast to prior works, which often released offline packets, including the source code of a modified Tor client and a snapshot of the Internet topology, ASmoniTor is an online system that assists end users with mitigating the threat of AS-level adversaries in a near real-time fashion. For Tor clients proposed in previous works, users need to compile the source code on their machine and continually update the snapshot of the Internet topology in order to obtain accurate AS-path inferences. On the contrary, ASmoniTor is an online platform that can be utilized easily by not only technical users, but also by users without a technical background, because they only need to access it via Tor and input two parameters to execute an AS-aware path selection algorithm. With ASmoniTor, the author makes three key technical contributions to the research against AS-level adversaries in the Tor network. First, ASmoniTor does not require the users to initiate complicated source code compilations. Second, it helps to reduce errors in AS-path inferences by letting users input a set of suspected ASes obtained directly from their own traceroute measurements. Third, the Internet topology database at the back-end of ASmoniTor is periodically updated to assure near real-time AS-path inferences between Tor exit nodes and the most likely visited websites. Finally, in addition to its convenience, ASmoniTor gives users full control over the information they want to input, thus preserving their privacy.
Submitted 6 October, 2016;
originally announced October 2016.
-
Your Neighbors Are My Spies: Location and other Privacy Concerns in GLBT-focused Location-based Dating Applications
Authors:
Nguyen Phong Hoang,
Yasuhito Asano,
Masatoshi Yoshikawa
Abstract:
Trilateration is one of the well-known threat models to users' location privacy in location-based apps, especially those that contain highly sensitive information, such as dating apps. The threat model relies mainly on the publicly shown distance from a targeted victim to the adversary to pinpoint the victim's location. As a countermeasure, most location-based apps have already implemented a 'hide distance' function, or added noise to the publicly shown distance, in order to protect their users' location privacy. The effectiveness of such approaches, however, is still questionable.
Submitted 20 April, 2016;
originally announced April 2016.
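The trilateration threat model described in the abstract above can be illustrated with a small planar sketch. It assumes exact, noise-free distances and flat Cartesian coordinates, which is a simplification: a real attack would work with geographic coordinates and rounded or noisy distances. The function name and argument layout are hypothetical. The position is recovered by subtracting the first circle equation from the other two, which leaves a 2x2 linear system.

```python
def trilaterate(p1, r1, p2, r2, p3, r3):
    """Locate a point from its distances to three known planar points.

    p1, p2, p3: (x, y) observation points chosen by the adversary.
    r1, r2, r3: reported distances from each point to the target.
    Raises ValueError if the observation points are collinear.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3

    # Subtracting circle 1 from circles 2 and 3 removes the quadratic
    # terms and leaves two linear equations in x and y.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2

    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("observation points are collinear")

    # Cramer's rule for the 2x2 system.
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y
```

With noisy distances the three circles generally do not meet in one point, so a practical variant would use more than three observations and a least-squares fit, which is one reason added noise alone may not defeat the attack.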