-
Vector Embeddings by Sequence Similarity and Context for Improved Compression, Similarity Search, Clustering, Organization, and Manipulation of cDNA Libraries
Authors:
Daniel H. Um,
David A. Knowles,
Gail E. Kaiser
Abstract:
This paper demonstrates the utility of organized numerical representations of genes in research involving flat string gene formats (i.e., FASTA/FASTQ5). FASTA/FASTQ files have several current limitations, such as their large file sizes, slow processing speeds for mapping and alignment, and contextual dependencies. These challenges significantly hinder investigations and tasks that involve finding similar sequences. The solution lies in transforming sequences into an alternative representation that facilitates easier clustering into similar groups compared to the raw sequences themselves. By assigning a unique vector embedding to each short sequence, it is possible to more efficiently cluster and improve upon compression performance for the string representations of cDNA libraries. Furthermore, through learning alternative coordinate vector embeddings based on the contexts of codon triplets, we can demonstrate clustering based on amino acid properties. Finally, using this sequence embedding method to encode barcodes and cDNA sequences, we can improve the time complexity of the similarity search by coupling vector embeddings with an algorithm that determines the proximity of vectors in Euclidean space; this allows us to perform sequence similarity searches in a quicker and more modular fashion.
Submitted 8 August, 2023;
originally announced August 2023.
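As a rough illustration of the idea in the abstract above, a sequence can be mapped to a fixed-length vector and compared with other sequences by Euclidean distance. The sketch below uses plain k-mer frequencies as a stand-in; it is not the learned embedding the paper describes. The names `kmer_embedding` and `euclidean`, and the default k = 3, are assumptions made for this example.

```python
from itertools import product
from math import sqrt


def kmer_embedding(seq, k=3):
    """Map a DNA string to a vector of k-mer frequencies over the alphabet ACGT.

    Illustrative stand-in only: the paper learns its embeddings, whereas this
    simply counts overlapping k-mers and normalizes by the number counted.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    vec = [0.0] * len(kmers)
    total = 0
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k].upper())
        if j is not None:  # skip k-mers containing N or other symbols
            vec[j] += 1.0
            total += 1
    if total:
        vec = [v / total for v in vec]
    return vec


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    a = kmer_embedding("ACGTACGTAC")
    b = kmer_embedding("ACGTACGTAA")  # differs from the first by one base
    c = kmer_embedding("TTTTTTTTTT")  # unrelated composition
    print(euclidean(a, b) < euclidean(a, c))
```

Because every sequence becomes a vector of the same length (4^k entries), any vector index or clustering routine can operate on the result without re-aligning the raw strings.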
-
HoechstGAN: Virtual Lymphocyte Staining Using Generative Adversarial Networks
Authors:
Georg Wölflein,
In Hwa Um,
David J Harrison,
Ognjen Arandjelović
Abstract:
The presence and density of specific types of immune cells are important to understand a patient's immune response to cancer. However, immunofluorescence staining required to identify T cell subtypes is expensive, time-consuming, and rarely performed in clinical settings. We present a framework to virtually stain Hoechst images (which are cheap and widespread) with both CD3 and CD8 to identify T cell subtypes in clear cell renal cell carcinoma using generative adversarial networks. Our proposed method jointly learns both staining tasks, incentivising the network to incorporate mutually beneficial information from each task. We devise a novel metric to quantify the virtual staining quality, and use it to evaluate our method.
Submitted 17 October, 2022; v1 submitted 13 October, 2022;
originally announced October 2022.
-
Hoechst Is All You Need: Lymphocyte Classification with Deep Learning
Authors:
Jessica Cooper,
In Hwa Um,
Ognjen Arandjelović,
David J Harrison
Abstract:
Multiplex immunofluorescence and immunohistochemistry benefit patients by allowing cancer pathologists to identify several proteins expressed on the surface of cells, enabling cell classification, better understanding of the tumour micro-environment, more accurate diagnoses, prognoses, and tailored immunotherapy based on the immune status of individual patients. However, they are expensive and time-consuming processes that require complex staining and imaging techniques by expert technicians. Hoechst staining is much cheaper and easier to perform, but is not typically used in this case as it binds to DNA rather than to the proteins targeted by immunofluorescent techniques, and it was not previously thought possible to differentiate cells expressing these proteins based only on DNA morphology. In this work we show otherwise, training a deep convolutional neural network to identify cells expressing three proteins (T lymphocyte markers CD3 and CD8, and the B lymphocyte marker CD20) with greater than 90% precision and recall, from Hoechst 33342 stained tissue only. Our model learns previously unknown morphological features associated with expression of these proteins which can be used to accurately differentiate lymphocyte subtypes for use in key prognostic metrics such as assessment of immune cell infiltration, and thereby predict and improve patient outcomes without the need for costly multiplex immunofluorescence.
Submitted 16 July, 2021; v1 submitted 9 July, 2021;
originally announced July 2021.
-
Multiple resolution residual network for automatic thoracic organs-at-risk segmentation from CT
Authors:
Hyemin Um,
Jue Jiang,
Maria Thor,
Andreas Rimner,
Leo Luo,
Joseph O. Deasy,
Harini Veeraraghavan
Abstract:
We implemented and evaluated a multiple resolution residual network (MRRN) for multiple normal organs-at-risk (OAR) segmentation from computed tomography (CT) images for thoracic radiotherapy treatment (RT) planning. Our approach simultaneously combines feature streams computed at multiple image resolutions and feature levels through residual connections. The feature streams at each level are updated as the images are passed through various feature levels. We trained our approach using 206 thoracic CT scans of lung cancer patients with 35 scans held out for validation to segment the left and right lungs, heart, esophagus, and spinal cord. This approach was tested on 60 CT scans from the open-source AAPM Thoracic Auto-Segmentation Challenge dataset. Performance was measured using the Dice Similarity Coefficient (DSC). Our approach outperformed the best-performing method in the grand challenge for hard-to-segment structures like the esophagus and achieved comparable results for all other structures. Median DSC using our method was 0.97 (interquartile range [IQR]: 0.97-0.98) for the left and right lungs, 0.93 (IQR: 0.93-0.95) for the heart, 0.78 (IQR: 0.76-0.80) for the esophagus, and 0.88 (IQR: 0.86-0.89) for the spinal cord.
Submitted 31 May, 2020; v1 submitted 27 May, 2020;
originally announced May 2020.
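The Dice Similarity Coefficient reported in the abstract above measures the overlap between a predicted and a reference segmentation as DSC = 2|A ∩ B| / (|A| + |B|). A minimal sketch on flattened binary masks follows; the function name `dice_coefficient` and the convention of returning 1.0 when both masks are empty are assumptions for this example, not details from the paper.

```python
def dice_coefficient(pred, truth):
    """Dice Similarity Coefficient, 2|A ∩ B| / (|A| + |B|), for two binary masks.

    Both masks are flat sequences of 0/1 (or False/True) values, one per voxel.
    Returning 1.0 when both masks are empty is an assumed convention.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    a = [bool(x) for x in pred]
    b = [bool(x) for x in truth]
    overlap = sum(1 for x, y in zip(a, b) if x and y)
    size = sum(a) + sum(b)
    return 1.0 if size == 0 else 2.0 * overlap / size


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 0, 0]
    # One shared voxel, mask sizes 2 and 1: 2 * 1 / (2 + 1)
    print(dice_coefficient(predicted, reference))
```

A value of 1 means the two masks coincide exactly and 0 means they share no voxels.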
-
Formation of the Hayward black hole from a collapsing shell
Authors:
Hwajin Um,
Wontae Kim
Abstract:
We consider a collapsing shell of matter to form the Hayward black hole and investigate semiclassically quantum radiation from the shell. Using Israel's formulation, we obtain the mass relation between the collapsing shell and the Hayward black hole. By using the functional Schrödinger formulation for the massless quantum radiation, the evolution of a vacuum state for a scalar field is shown to be unitary. We find that the number of quanta at a low frequency decreases for a large length parameter characterizing the Hayward black hole. Moreover, in the limit of low frequency, the Hawking temperature can be read off from the occupation number of excited states when the shell approaches its own horizon.
Submitted 30 March, 2020; v1 submitted 9 December, 2019;
originally announced December 2019.
-
Active Search for Nearest Neighbors
Authors:
Hayoung Um,
Heeyoul Choi
Abstract:
In pattern recognition or machine learning, it is a very fundamental task to find nearest neighbors of a given point. All the methods for the task work basically by comparing the given point to all the points in the data set. That is why the computational cost increases with the number of data points. However, the human visual system seems to work in a different way. When the human visual system tries to find the neighbors of one point on a map, it directly focuses on the area around the point and actively searches the neighbors by looking or zooming in and out around the point. In this paper, we propose an innovative search method for nearest neighbors, which seems very similar to how the human visual system works on the task.
Submitted 8 December, 2019; v1 submitted 1 December, 2019;
originally announced December 2019.
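The exhaustive comparison that the abstract above takes as its starting point can be sketched as a linear scan: every query computes one distance per stored point, so the cost grows with the size of the data set. This shows only that baseline, not the active search method the paper proposes; the function name `nearest_neighbors` is an assumption for this example.

```python
import heapq
from math import dist


def nearest_neighbors(query, points, k=1):
    """Return the k points closest to `query`, nearest first.

    Brute-force baseline: one Euclidean distance is computed per stored point,
    so the work per query is proportional to len(points).
    """
    return heapq.nsmallest(k, points, key=lambda p: dist(query, p))


if __name__ == "__main__":
    data = [(3.0, 4.0), (1.0, 1.0), (0.0, 2.0)]
    print(nearest_neighbors((0.0, 0.0), data, k=2))
```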
-
Local block-wise self attention for normal organ segmentation
Authors:
Jue Jiang,
Elguindi Sharif,
Hyemin Um,
Sean Berry,
Harini Veeraraghavan
Abstract:
We developed a new and computationally simple local block-wise self attention based normal structures segmentation approach applied to head and neck computed tomography (CT) images. Our method uses the insight that normal organs exhibit regularity in their spatial location and inter-relation within images, which can be leveraged to simplify the computations required to aggregate feature information. We accomplish this by using local self attention blocks that pass information between each other to derive the attention map. We show that adding additional attention layers increases the contextual field and captures focused attention from relevant structures. We developed our approach using U-net and compared it against multiple state-of-the-art self attention methods. All models were trained on 48 internal head and neck CT scans and tested on 48 CT scans from the external public domain database of computational anatomy dataset. Our method achieved the highest Dice similarity coefficient segmentation accuracy of 0.85±0.04 and 0.86±0.04 for left and right parotid glands, 0.79±0.07 and 0.77±0.05 for left and right submandibular glands, 0.93±0.01 for mandible and 0.88±0.02 for the brain stem with the lowest increase of 66.7% computing time per image and 0.15% increase in model parameters compared with standard U-net. The best state-of-the-art method, called point-wise spatial attention, achieved comparable accuracy but with 516.7% increase in computing time and 8.14% increase in parameters compared with standard U-net. Finally, we performed ablation tests and studied the impact of attention block size, overlap of the attention blocks, additional attention layers, and attention block placement on segmentation performance.
Submitted 11 September, 2019;
originally announced September 2019.
-
Unruh temperatures in circular and drifted Rindler motions
Authors:
Yongwan Gim,
Hwajin Um,
Wontae Kim
Abstract:
We study the temperatures for the circular and drifted Rindler motions by employing the Unruh-DeWitt detector method. In the circular motion, the temperature is increasing along the radius of the circular motion until it reaches the maximum, and then it is decreasing and eventually vanishing at the limit to the radius where the proper acceleration is infinite. In fact, the temperature is proportional to the proper acceleration quadratically near the origin of the circular motion as compared to the usual Unruh effect depending on the linear proper acceleration. On the other hand, in the drifted Rindler motion, the observer moves with a relative velocity in the direction transverse to the acceleration. If the detector is moving slowly in the transverse direction with a finite proper acceleration, then the temperature behaves like the usual Unruh temperature, while it vanishes for the speed of light in the transverse direction according to the infinite proper acceleration. Consequently, it turns out that the temperatures behave nonlinearly with respect to the proper acceleration and the infinite proper acceleration would not always permit the divergent temperature.
Submitted 28 June, 2018;
originally announced June 2018.
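For reference, the usual Unruh temperature that the abstract above uses as its point of comparison is linear in the proper acceleration $a$ of a uniformly accelerated observer. This is the standard textbook expression, not a formula quoted from the paper:

```latex
T_U = \frac{\hbar\, a}{2\pi c\, k_B}
```

Here $\hbar$ is the reduced Planck constant, $c$ the speed of light, and $k_B$ the Boltzmann constant. The circular and drifted Rindler temperatures described in the abstract depart from this linear dependence.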
-
Unruh effect of nonlocal field theories with a minimal length
Authors:
Yongwan Gim,
Hwajin Um,
Wontae Kim
Abstract:
The nonlocal field theory commonly requires a minimal length, and so it appears to formulate the nonlocal theory in terms of the doubly special relativity which makes the speed of light and the minimal length invariant simultaneously. We set up a generic nonlocal model having the same set of solutions as the local theory but allowing Lorentz violations due to the minimal length. It is exactly corresponding to the model with the modified dispersion relation in the doubly special relativity. For this model, we calculate the modified Wightman function and the rate of response function by using the Unruh-DeWitt detector method. It turns out that the Unruh effect should be corrected by the minimal length related to the nonlocality in the regime of the doubly special relativity. However, for the Lorentz-invariant limit, it is shown that the Wightman function and the Unruh effect remain the same as those of the local theory.
Submitted 13 August, 2018; v1 submitted 19 March, 2018;
originally announced March 2018.
-
Black hole complementarity with the generalized uncertainty principle in gravity's rainbow
Authors:
Yongwan Gim,
Hwajin Um,
Wontae Kim
Abstract:
When gravitation is combined with quantum theory, the Heisenberg uncertainty principle could be extended to the generalized uncertainty principle accompanying a minimal length. To see how the generalized uncertainty principle works in the context of black hole complementarity, we calculate the required energy to duplicate information for the Schwarzschild black hole. It shows that the duplication of information is not allowed and black hole complementarity is still valid even assuming the generalized uncertainty principle. On the other hand, the generalized uncertainty principle with the minimal length could lead to a modification of the conventional dispersion relation in light of Gravity's Rainbow, where the minimal length is also invariant as well as the speed of light. Revisiting the gedanken experiment, we show that the no-cloning theorem for black hole complementarity can be made valid in the regime of Gravity's Rainbow on a certain combination of parameters.
Submitted 1 March, 2018; v1 submitted 12 December, 2017;
originally announced December 2017.
-
RNA substructure as a random matrix ensemble
Authors:
Sang Kwan Choi,
Chaiho Rim,
Hwajin Um
Abstract:
Combinatorial analysis of a certain abstract of RNA structures has been studied to investigate their statistics. Our approach regards the backbone of secondary structures as an alternate sequence of paired and unpaired sets of nucleotides, which can be described by random matrix model. We obtain the generating function of the structures using Hermitian matrix model with Chebyshev polynomial of the second kind and analyze the statistics with respect to the number of stems. To match the experimental findings of the statistical behavior, we consider the structures in a grand canonical ensemble and find a fugacity value corresponding to an appropriate number of stems.
Submitted 9 March, 2020; v1 submitted 22 December, 2016;
originally announced December 2016.
-
Paramecium swimming in capillary tube
Authors:
Saikat Jana,
Soong Ho Um,
Sunghwan Jung
Abstract:
Swimming organisms in their natural habitat navigate through a wide array of geometries and chemical environments. Interaction with the boundaries is ubiquitous and can significantly modify the swimming characteristics of the organism as observed under ideal conditions. We study the dynamics of ciliary locomotion in Paramecium multimicronucleatum and observe the effect of the solid boundaries on the velocities in the near field of the organism. Experimental observations show that Paramecium executes helical trajectories that slowly transition to straight line motion as the diameter of the capillary tubes decreases. Theoretically, this system is modeled as an undulating cylinder with pressure gradient and compared with experiments, showing that such considerations are necessary for modeling finite sized organisms in the restrictive geometries.
Submitted 21 December, 2010;
originally announced December 2010.