License: CC BY 4.0
arXiv:2403.11038v1 [cs.CV] 16 Mar 2024

Texture Edge detection by Patch consensus (TEP)

Guangyu Cui (School of Mathematics, Georgia Institute of Technology, Atlanta, GA, USA; [email protected]) and Sung Ha Kang (School of Mathematics, Georgia Institute of Technology, Atlanta, GA, USA; [email protected])
Abstract

We propose Texture Edge detection using Patch consensus (TEP), a training-free method for detecting the boundaries of textures. We propose a new, simple way to identify the texture edge location, using the consensus of segmented local patch information. On the boundary itself, the distinction between textures is typically not clear even when local patch information is used, but the consensus of neighbors gives a clear idea of the boundary. We utilize a local patch and its response against neighboring regions to emphasize the similarities and the differences across different textures. Segmenting the response further emphasizes the edge location, and neighborhood voting provides consensus and stabilizes the edge detection. We analyze texture as a stationary process to give insight into the patch width parameter versus the quality of edge detection. We derive the necessary condition for textures to be distinguished, and analyze the patch width with respect to the scale of textures. Various experiments are presented to validate the proposed model.

1 Introduction

Texture has been explored for decades [32], and fruitful results have been established for different types of textures. Markov random fields [25] are widely used for texture synthesis; their generative nature fits well with the randomness of certain types of textures (wood surface, sand). Lattice-based methods [18] are powerful in modeling highly symmetric and periodic textures, when the texton is relatively well defined (wallpaper, honeycomb). Frequency/wavelet analysis [20, 31] utilizes spatial filters to vectorize textures and plays an important role in image compression [17] and in texture classification and segmentation [33, 14]. We refer to [30, 25] for a comprehensive review of classical models. Texture remains a challenging topic, since there is no general or precise definition of texture, and the boundaries between different textures are especially difficult to recognize.

We explore texture edge detection and segmentation. The classical Canny edge detection [5] detects edge locations by thinning the mask function, which is computed by applying thresholds to the gradient magnitude $|\nabla U_{0}|$. One of the most well-known variational segmentation models is the Mumford-Shah functional [23],

$$E_{MS}(U,\Gamma)=\alpha\int_{\Omega\backslash\Gamma}|\nabla U|^{2}\,dx+\beta\mathcal{H}^{1}(\Gamma)+\int_{\Omega\backslash\Gamma}(U-U_{0})^{2}\,dx. \tag{1}$$

Here, $\alpha$ and $\beta$ are positive parameters, $\Omega\subset\mathbb{R}^{2}$ is a bounded image domain, $U_{0}:\Omega\to\mathbb{R}$ is the given image, and $\mathcal{H}^{1}(\Gamma)$ denotes the one-dimensional Hausdorff measure of the object boundary $\Gamma$. Chan and Vese [7] proposed using the level set method, which gives a very effective piecewise-constant segmentation. For textured image segmentation, texture descriptors such as the Gabor filter can be used with these models. The Gabor filter [22] can detect localized frequency responses at various orientations and scales. Figure 1 illustrates the challenges of texture edge detection. For a comprehensive review of classical texture segmentation models, we refer to [13]. In real images, there are many different types of textures, and it is typically difficult to find a filter bank that is suitable for all types of images. More recent network-based approaches with data-adaptive properties are capable of accomplishing high-level tasks such as semantic segmentation, e.g., [3, 29, 34, 10, 19], assuming the network is well trained.

Figure 1: Challenges of texture edge detection. (a) A given image with textures. (b) Canny edge detection. (c) Chan-Vese segmentation.

In this paper, we propose a filter-free and training-free approach that utilizes local patch responses to capture the similarities and the differences between textures. The non-local filter [4] for image denoising effectively averages pixel intensity values over similar image patches. In [9], the authors extend this idea to a texture synthesis algorithm, where pixels are selected over similar image patches to regenerate textures. One of the difficulties in applying such a nonlocal filter to texture edge detection is that, near the boundary of a texture, there may not be similar patches that give a clear boundary, unlike when denoising non-texture edges.

We use local patch information, utilizing similar responses within one texture and different responses against different textures, and propose a Texture Edge detection method by Patch consensus (TEP). We propose a consensus-based edge detection, which utilizes the fact that patches away from the boundary often give a clear idea of where the boundary of the texture should be located. We analyze the statistical condition for the texture edge to be detected by patch-wise similarity, and explore the relation between the patch width and the performance of the proposed method. The contributions of this paper are as follows:

  • We propose a simple training-free filter-free Texture Edge detection method using patch consensus (TEP).

  • We statistically analyze when the texture can be separated by the patch consensus.

  • Numerical results are presented to validate the proposed model.

The paper is outlined as follows. In Section 2, the details of Texture Edge detection by Patch consensus (TEP) are presented. Statistical analysis of the proposed model is provided in Section 3. In Section 4, we present the algorithms and numerical implementation details, and in Section 5 various experiments with comparisons and applications are presented.

2 The proposed model: Texture Edge detection by Patch consensus (TEP)

Let the discrete image domain be $\Omega=[1,2,\dots,M]\times[1,2,\dots,N]$, and let the matrix $U\in\mathbb{R}^{M\times N}$ denote the given image, where $U[\mathbf{x}]=U[x_{1},x_{2}]$ represents the intensity value at a pixel location $\mathbf{x}\in\Omega$. We consider a square neighborhood of $\mathbf{x}$ with width $2r+1\in\mathbb{Z}^{+}$ to be

$$\mathcal{B}_{r}(\mathbf{x})=\{\mathbf{y}\in\Omega\mid\|\mathbf{y}-\mathbf{x}\|_{\infty}\leq r\},$$

and we denote the vectorized image patch on $\mathcal{B}_{r}(\mathbf{x})$ by $\vec{\mathcal{P}}(\mathbf{x})\in\mathbb{R}^{d}$, where $d=(2r+1)^{2}$. We refer to $r$ as the patch width parameter. We order the entries of the vector column-wise from left to right, i.e., let the matrix $C$ be the $\sqrt{d}\times\sqrt{d}$ image patch, and take the transformation matrix pair $A\in\mathbb{R}^{d\times\sqrt{d}}$, $B\in\mathbb{R}^{\sqrt{d}\times 1}$, where

$$A=\begin{pmatrix}1\\ 1\\ \vdots\\ 1\end{pmatrix}\otimes I,\qquad B=\begin{pmatrix}1\\ 0\\ \vdots\\ 0\end{pmatrix},\qquad\text{then}\quad\vec{\mathcal{P}}(\mathbf{x})=ACB$$

which transforms a square patch in $\mathbb{R}^{\sqrt{d}\times\sqrt{d}}$ to a vector in $\mathbb{R}^{d}$. Here $\otimes$ denotes the Kronecker product, and $I$ is the $\sqrt{d}$-dimensional identity matrix.
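As a small illustration of the column-wise ordering described above, the following NumPy sketch extracts a patch and stacks its columns from left to right. The function name `patch_vector` is ours, indices are 0-based rather than the 1-based indexing of the text, and the patch is assumed to lie fully inside the image (no boundary handling).

```python
import numpy as np

def patch_vector(U, x, r):
    """Column-stacked patch of half-width r centered at pixel x = (x1, x2).

    Assumption: the whole (2r+1) x (2r+1) patch lies inside the image U.
    """
    x1, x2 = x
    C = U[x1 - r:x1 + r + 1, x2 - r:x2 + r + 1]  # square patch around x
    return C.flatten(order="F")                  # stack columns, left to right

# Toy 5 x 5 image whose entries are 0, 1, ..., 24 in row-major order.
U = np.arange(25, dtype=float).reshape(5, 5)
v = patch_vector(U, (2, 2), r=1)                 # vector of length d = 9
```

With `r = 1` the patch has $d=(2r+1)^{2}=9$ entries, and the first three entries of `v` are the first column of the patch.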

The main idea of the proposed method, Texture Edge detection by Patch consensus (TEP), is as follows. From a local patch $\vec{\mathcal{P}}(\mathbf{x})$, a patch response $\mathcal{R}(\mathbf{y};\mathbf{x})$ is computed. We segment these patch responses $\mathcal{R}(\mathbf{y};\mathbf{x})$ to emphasize their similarities and differences. Then, we collect the segmentation boundaries and construct the edge function $V$ on $\Omega$.

[Step 1] For each patch $\vec{\mathcal{P}}(\mathbf{x})$, we define the patch response in a larger domain as

$$\mathcal{R}(\mathbf{y};\mathbf{x})=\frac{1}{(2r+1)^{2}}\|\vec{\mathcal{P}}(\mathbf{y})-\vec{\mathcal{P}}(\mathbf{x})\|_{2}^{2}\geq 0. \tag{2}$$

Here $\mathbf{y}\in\mathcal{B}_{R}(\mathbf{x})$, where $R>r$ represents the half width of the comparison neighborhood. We refer to $R$ as the large comparison region width parameter. The response $\mathcal{R}(\mathbf{y};\mathbf{x})$ measures how similar the patch at $\mathbf{x}$ and the patch at $\mathbf{y}$ are: when the patches $\vec{\mathcal{P}}(\mathbf{x})$ and $\vec{\mathcal{P}}(\mathbf{y})$ are similar, it is near zero, and when they are very different, it takes a high value. For computational efficiency, we take $\mathbf{y}$ from the neighborhood $\mathcal{B}_{R}(\mathbf{x})$, but one may use $\mathbf{y}\in\Omega$.
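Step 1 can be sketched directly from (2). The code below is a minimal, unoptimized illustration, not the authors' implementation: the name `patch_response`, the 0-based indexing, and the assumption that $\mathbf{x}$ is at least $R+r$ pixels away from the image border are ours.

```python
import numpy as np

def patch_response(U, x, r, R):
    """Response R(y; x) of eq. (2) for every y in B_R(x).

    Returns a (2R+1) x (2R+1) array whose center entry corresponds to y = x.
    Assumption: x is at least R + r pixels away from the image border.
    """
    x1, x2 = x
    Px = U[x1 - r:x1 + r + 1, x2 - r:x2 + r + 1]
    resp = np.empty((2 * R + 1, 2 * R + 1))
    for i, y1 in enumerate(range(x1 - R, x1 + R + 1)):
        for j, y2 in enumerate(range(x2 - R, x2 + R + 1)):
            Py = U[y1 - r:y1 + r + 1, y2 - r:y2 + r + 1]
            # Mean of squared differences = (1/d) * ||P(y) - P(x)||_2^2.
            resp[i, j] = np.mean((Py - Px) ** 2)
    return resp

# Toy image: a flat region of value 0 on the left, value 1 on the right.
U = np.zeros((20, 20))
U[:, 10:] = 1.0
resp = patch_response(U, (10, 7), r=1, R=3)
```

In this toy image the response is zero wherever the compared patch lies entirely in the left region and grows as the patch crosses into the right region, which is the contrast that Step 2 then segments.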

[Step 2] To emphasize the texture differences and capture the edge information more clearly, we segment the response $\mathcal{R}(\mathbf{y};\mathbf{x})$ on $\mathcal{B}_{R}(\mathbf{x})$ using the following unsupervised multiphase segmentation model [26]:

$$E_{\text{seg}}(\chi_{i},c_{i},K\mid\mathcal{R})=\lambda\left(\sum_{i=1}^{K}\frac{P_{i}}{A_{i}}\right)\mathcal{H}^{1}(\Gamma)+\sum_{i=1}^{K}\int_{\chi_{i}}|\mathcal{R}(\mathbf{y};\mathbf{x})-c_{i}|^{2}\,d\mathbf{x} \tag{3}$$

where $\chi_{i}$ is the indicator function of each phase $i$, which partitions $\mathcal{B}_{R}(\mathbf{x})=\bigcup_{i=1}^{K}\chi_{i}$, $K$ is the number of phases, $\mathcal{H}^{1}$ denotes the one-dimensional Hausdorff measure, $\Gamma=\cup_{i=1}^{K}\{\partial\chi_{i}\}$ is the set of all boundaries, and $c_{i}=\int_{\chi_{i}}\mathcal{R}(\mathbf{y};\mathbf{x})\,d\mathbf{x}/\int_{\chi_{i}}d\mathbf{x}$ is the intensity average of phase $i$.
Here the scale term $P_{i}/A_{i}=\mathcal{H}^{1}(\partial\chi_{i})/\int_{\chi_{i}}d\mathbf{x}$ is the perimeter over the area of each phase $i$. The model (3) automatically finds the number of phases $K$ by a greedy algorithm. In this paper, we bound the number of phases to be $K\in\{1,2\}$; thus it finds either one or two phases within the response $\mathcal{R}(\mathbf{y};\mathbf{x})$. We define the local edge function to be

$$\mathcal{W}(\mathbf{y};\mathbf{x})=\frac{1}{2}\sum_{i=1}^{K}|\nabla\chi_{i}|. \tag{4}$$

This represents the edge from the point of view of the patch $\vec{\mathcal{P}}(\mathbf{x})$.
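To make Step 2 concrete without reproducing the segmentation model of [26], the sketch below swaps in a plain two-means clustering of the response values. This stand-in has no length regularization, so it is not the model (3); the tolerance `tol` used to decide between one and two phases is a hypothetical parameter. For two phases, $\chi_{2}=1-\chi_{1}$, so $|\nabla\chi_{1}|=|\nabla\chi_{2}|$ and (4) reduces to $|\nabla\chi_{1}|$. The binary forward-difference discretization in `local_edge` is also our assumption.

```python
import numpy as np

def two_phase_labels(resp, tol=1e-3, n_iter=20):
    """Stand-in for the segmentation step: 2-means on the response values.

    Returns a boolean mask of the high-response phase, or an all-False mask
    (a single phase, K = 1) when the two class means differ by less than tol.
    This is NOT the model of eq. (3): there is no boundary-length term.
    """
    c_lo, c_hi = resp.min(), resp.max()
    for _ in range(n_iter):
        mask = np.abs(resp - c_hi) < np.abs(resp - c_lo)
        if mask.all() or not mask.any():
            break
        c_lo, c_hi = resp[~mask].mean(), resp[mask].mean()
    if c_hi - c_lo < tol:
        return np.zeros_like(resp, dtype=bool)
    return mask

def local_edge(mask):
    """Binary discretization of eq. (4) for two phases: a pixel is marked
    when its phase differs from the pixel below it or to its right."""
    W = np.zeros(mask.shape, dtype=float)
    W[:-1, :] += mask[:-1, :] != mask[1:, :]
    W[:, :-1] += mask[:, :-1] != mask[:, 1:]
    return np.minimum(W, 1.0)

# Toy response: low on the left five columns, high on the right two.
resp = np.zeros((7, 7))
resp[:, 5:] = 1.0
mask = two_phase_labels(resp)
W = local_edge(mask)   # marks the column just left of the phase boundary
```

A nearly constant response yields a single phase and therefore an empty local edge map, which mirrors the $K=1$ case in the text.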

[Step 3] A local response for points on the boundary of a texture does not give good edge information in general; thus we use consensus and collect the segmented patches to determine the edge function $V(\mathbf{x})$, by superposing $\mathcal{W}(\mathbf{y};\mathbf{x})$ for all $\mathbf{x}\in\Omega$:

$$V(\mathbf{x})=\frac{1}{\left|\mathcal{B}_{R}(\mathbf{x})\right|}\sum_{\mathbf{y}\in\mathcal{B}_{R}(\mathbf{x})}\mathcal{W}(\mathbf{x};\mathbf{y}). \tag{5}$$

This gives a non-binary edge function $V(\mathbf{x}):\Omega\to[0,1]$ representing the ratio of $\mathbf{x}$'s neighbors $\mathbf{y}\in\mathcal{B}_{R}(\mathbf{x})$ that voted $\mathbf{x}$ as an edge pixel. This superposition gives consensus among patch responses. Even when the texture boundary is not very clear, points away from the boundary can still give good information about the edge location.
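The voting in (5) can be sketched as an accumulation of local edge maps. In the code below, `consensus` and the callback `local_edge_at` are hypothetical names: the callback is assumed to return the binary local edge map $\mathcal{W}(\cdot\,;\mathbf{y})$ on the full $(2R+1)\times(2R+1)$ window around an observer $\mathbf{y}$, and the part of a window that falls outside the image is simply ignored. Because $\mathbf{x}\in\mathcal{B}_{R}(\mathbf{y})$ exactly when $\mathbf{y}\in\mathcal{B}_{R}(\mathbf{x})$, counting how many windows cover a pixel gives $|\mathcal{B}_{R}(\mathbf{x})|$.

```python
import numpy as np

def consensus(shape, R, local_edge_at):
    """Edge function V of eq. (5), built by accumulating local edge maps.

    local_edge_at(y) is assumed to return W(. ; y) on the full
    (2R+1) x (2R+1) window centered at y.
    """
    M, N = shape
    votes = np.zeros(shape)
    count = np.zeros(shape)   # number of observers y whose window covers x
    for y1 in range(M):
        for y2 in range(N):
            W = local_edge_at((y1, y2))
            a0, a1 = max(y1 - R, 0), min(y1 + R + 1, M)
            b0, b1 = max(y2 - R, 0), min(y2 + R + 1, N)
            # Keep only the part of the window that lies inside the image.
            Wc = W[a0 - (y1 - R):a1 - (y1 - R), b0 - (y2 - R):b1 - (y2 - R)]
            votes[a0:a1, b0:b1] += Wc
            count[a0:a1, b0:b1] += 1
    return votes / count

R = 2

def toy_local_edge(y):
    """Hypothetical observer that marks image column 5 whenever that
    column is visible in its window."""
    W = np.zeros((2 * R + 1, 2 * R + 1))
    j = 5 - (y[1] - R)        # window column holding image column 5
    if 0 <= j <= 2 * R:
        W[:, j] = 1.0
    return W

V = consensus((12, 12), R, toy_local_edge)
```

In this toy setting every observer agrees, so `V` equals one on column 5 and zero elsewhere. With real local edge maps the votes disagree near unclear boundaries, and `V` takes intermediate values in $[0,1]$.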

Figure 2: Texture Edge detection by Patch consensus (TEP). For each patch $\vec{\mathcal{P}}(\mathbf{x})$ in the given image $U$, the patch response $\mathcal{R}(\mathbf{y};\mathbf{x})$ is computed. The patch response is segmented, and the boundary of the phases gives the local edge $\mathcal{W}(\mathbf{y};\mathbf{x})$. The edge function $V(\mathbf{x})$ is computed by the consensus of the local edge function $\mathcal{W}(\mathbf{y};\mathbf{x})$.

The flowchart of the proposed model TEP is presented in Figure 2: for each pixel $\mathbf{x}\in\Omega$ and its patch $\vec{\mathcal{P}}(\mathbf{x})$, the patch response on a larger domain $\mathcal{B}_{R}(\mathbf{x})$ is computed as $\mathcal{R}(\mathbf{y};\mathbf{x})$. We use unsupervised segmentation to segment the patch responses and emphasize the similarities and the differences among them. The gradient of the phases is used to compute the local edges $\mathcal{W}(\mathbf{y};\mathbf{x})$. Finally, the consensus is used to obtain the edge map $V(\mathbf{x})$. Since we use the observer patch $\vec{\mathcal{P}}(\mathbf{x})$ as input, the proposed method is self-adaptive to the image without the need for training. This also reduces the number of hyper-parameters compared with filter-based approaches. The parameters needed for TEP are the patch width parameter $r$ of $\vec{\mathcal{P}}(\mathbf{x})$, the large comparison region width parameter $R$, and one regularity parameter $\lambda$ for the unsupervised multiphase segmentation.

3 Analytical properties of the proposed model

In this section, we statistically analyze when textures can be separated by the patch consensus. In particular, we model textures as random fields, derive the necessary conditions for our model to generate distinguishable patch responses for different textures in the sense of patch-wise Euclidean distance, and study the roles of the patch width parameter $r$ and the large comparison region width parameter $R$.

3.1 Texture as Stationary Random Field

Random fields model the self-similarity property of natural stochastic textures so well that statistical approaches have been proposed for structure-texture decomposition and image denoising [16, 36, 35]. In this paper, we model texture as a two-dimensional Gaussian random field [1] defined on the pixels $\Omega$, and study how the decay of correlation of the texture random field helps to identify texture boundaries from the patch responses for stochastic textures. When discussing image patches as random vectors, we use the calligraphic letter $\vec{\mathcal{P}}$ to denote a random vector and the lowercase letter $\vec{v}$ to denote a concrete vector of the same size as $\vec{\mathcal{P}}$. We start by introducing the definitions of a Gaussian random field and its related properties.

Definition 1.

Let $\mathbf{x}\in\Omega\subset\mathbb{Z}^{2}$ be the pixel index. The set of random variables $\mathcal{P}=\{\mathcal{P}(\mathbf{x})\}_{\mathbf{x}\in\Omega}$ is a Gaussian random field, if $\vec{\mathcal{P}}(\mathbf{x})=[\mathcal{P}(\mathbf{x}_{1}),\mathcal{P}(\mathbf{x}_{2}),\dots,\mathcal{P}(\mathbf{x}_{d})]^{T}$ is a $d$-dimensional Gaussian random vector for arbitrary choices of indices $\mathbf{x}_{1},\mathbf{x}_{2},\dots,\mathbf{x}_{d}\in\Omega$, where $d\in\mathbb{Z}^{+}$ and $\mathcal{P}(\mathbf{x}_{i})$ denotes a Gaussian variable indexed by pixel location $\mathbf{x}_{i}$. The probability density of $\vec{\mathcal{P}}(\mathbf{x})=\vec{v}$ is given by

$$\phi(\vec{v})=\frac{1}{(2\pi)^{d/2}|\Sigma_{p}|^{1/2}}e^{-\frac{1}{2}(\vec{v}-\vec{\mu}_{p})^{T}\Sigma_{p}^{-1}(\vec{v}-\vec{\mu}_{p})},$$

where $\vec{\mu}_{p}=\mathbb{E}(\vec{\mathcal{P}}(\mathbf{x}))$ is the expectation vector and $\Sigma_{p}=\mathrm{Cov}(\vec{\mathcal{P}}(\mathbf{x}))$ is the nonnegative definite $d\times d$ covariance matrix.

A Gaussian random field is completely determined by its first and second moments, i.e., its mean $\vec{\mu}$ and covariance $\Sigma$, and the Gaussian distribution is suitable for many natural stochastic textures [36]. In this paper, we assume fast decay of the correlation with respect to the pixelwise distance $\|\mathbf{x}-\mathbf{y}\|_{2}$ and choose a squared exponential covariance function such as

$$\mathrm{Cov}(\mathcal{P}(\mathbf{x}_{1}),\mathcal{P}(\mathbf{x}_{2}))=\gamma_{p}(\mathbf{x}_{1},\mathbf{x}_{2})=\sigma_{p}^{2}\exp\left(-\frac{\|\mathbf{x}_{1}-\mathbf{x}_{2}\|_{2}^{2}}{2l_{p}^{2}}\right), \tag{6}$$

which makes the random field $\mathcal{P}$ stationary and isotropic. Here $\sigma_{p}>0$ is the magnitude parameter and $l_{p}>0$ is the decay rate parameter. We remark that we choose the squared exponential decaying covariance (6) for the convenience of computation, and the derivations of this section can be generalized to decaying covariance functions of any order. For textures with spatially repetitive patterns, it is natural to assume the corresponding random field to be stationary [36].
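For intuition, the covariance matrix of a single patch under (6) can be assembled numerically. The sketch below is only an illustration under our own assumptions: the name `patch_covariance` and the parameter values are hypothetical, and pixels are ordered column-wise as in Section 2.

```python
import numpy as np

def patch_covariance(r, sigma_p, l_p):
    """Covariance matrix Sigma_p of a (2r+1) x (2r+1) patch under eq. (6),
    with pixels ordered column-wise (left to right)."""
    w = 2 * r + 1
    cols, rows = np.meshgrid(np.arange(w), np.arange(w))
    # Column-wise list of pixel coordinates, shape (d, 2) with d = w * w.
    pts = np.column_stack([rows.flatten(order="F"), cols.flatten(order="F")])
    # Squared Euclidean distance between every pair of pixels in the patch.
    sq_dist = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1)
    return sigma_p**2 * np.exp(-sq_dist / (2.0 * l_p**2))

# Illustrative parameter values, chosen arbitrarily.
Sigma_p = patch_covariance(r=2, sigma_p=1.5, l_p=1.0)
```

The diagonal of `Sigma_p` equals $\sigma_{p}^{2}$, and each off-diagonal entry depends only on the distance between the two pixels, so a smaller $l_{p}$ makes the correlation between pixels fall off more quickly.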

Definition 2.

A Gaussian random field $\mathcal{P}$ is called stationary, if for every $\mathbf{x}_{1},\mathbf{x}_{2},\dots,\mathbf{x}_{d}\in\Omega$ and $\mathbf{z}\in\mathbb{Z}^{2}$, the joint distribution of the Gaussian random vector $[\mathcal{P}(\mathbf{x}_{1}+\mathbf{z}),\mathcal{P}(\mathbf{x}_{2}+\mathbf{z}),\dots,\mathcal{P}(\mathbf{x}_{d}+\mathbf{z})]$ is independent of $\mathbf{z}$.

Definition 3.

A stationary Gaussian random field $\mathcal{P}$ is called isotropic, if its covariance function $\gamma_{p}(\mathbf{x},\mathbf{y})$ only depends on the relative distance of pixels $\mathbf{x}$ and $\mathbf{y}$, i.e., $\gamma_{p}(\mathbf{x}-\mathbf{y})=\gamma_{p}(\|\mathbf{x}-\mathbf{y}\|_{2})$.

An immediate consequence of Definition 2 is that the distribution of the $d$-dimensional image patch is independent of the choice of the patch center, i.e., $\vec{\mathcal{P}}(\mathbf{x})\sim\mathcal{N}\left(\vec{\mu}_{p},\Sigma_{p}\right)$ for all $\mathbf{x}\in\Omega$. The patch response $\mathcal{R}(\mathbf{y};\mathbf{x})$ involves the observation of two patches $\vec{\mathcal{P}}(\mathbf{x})$ and $\vec{\mathcal{P}}(\mathbf{y})$. The joint distribution of the two patches follows

$$\begin{pmatrix}\vec{\mathcal{P}}(\mathbf{x})\\ \vec{\mathcal{P}}(\mathbf{y})\end{pmatrix}\sim\mathcal{N}\left(\begin{pmatrix}\vec{\mu}_{p}\\ \vec{\mu}_{p}\end{pmatrix},\begin{pmatrix}\Sigma_{p}&\Sigma_{\mathrm{c}}(\tau)\\ \Sigma_{\mathrm{c}}^{T}(\tau)&\Sigma_{p}\end{pmatrix}\right) \tag{13}$$

where the $d\times d$ covariance matrix $\Sigma_{\mathrm{c}}(\tau)=\mathrm{Cov}(\vec{\mathcal{P}}(\mathbf{x}),\vec{\mathcal{P}}(\mathbf{y}))$ depends only on the relative distance $\tau=\|\mathbf{y}-\mathbf{x}\|_{2}$ of the pixels $\mathbf{x},\mathbf{y}$, as a consequence of Definition 3. The entries of this matrix are given by the covariance function (6), i.e.,

$$
\Sigma_{\mathrm{c}}(\tau)[i,j]=\sigma_{p}^{2}\exp\left(-\frac{\tau_{i,j}^{2}}{2l_{p}^{2}}\right)
$$

where $\tau_{i,j}$ denotes the distance between the $i$-th pixel in $\vec{\mathcal{P}}(\mathbf{x})$ and the $j$-th pixel in $\vec{\mathcal{P}}(\mathbf{y})$. Let $d=(2r+1)^{2}$ as in Section 2, and assume $\tau>2\sqrt{2}r$, which guarantees that $\vec{\mathcal{P}}(\mathbf{x})$ and $\vec{\mathcal{P}}(\mathbf{y})$ do not overlap; hence $\tau_{i,j}\geq\tau-2\sqrt{2}r$ for all $i,j\in[1,(2r+1)^{2}]\cap\mathbb{Z}$. This leads to an upper bound on the Frobenius norm of the covariance matrix $\Sigma_{\mathrm{c}}$:

$$
\|\Sigma_{\mathrm{c}}(\tau)\|_{F}=\sqrt{\sum_{i,j=1}^{(2r+1)^{2}}\sigma_{p}^{4}\exp\left(-\frac{\tau_{i,j}^{2}}{l_{p}^{2}}\right)}\;\leq\;\sigma_{p}^{2}(2r+1)^{2}\exp\left(-\frac{(\tau-2\sqrt{2}r)^{2}}{2l_{p}^{2}}\right). \tag{14}
$$

For fixed $r$, the cross term $\Sigma_{\mathrm{c}}(\tau)\to O_{d}$ as $\tau\to\infty$, where $O_{d}$ is the null matrix in $\mathbb{R}^{d\times d}$. Comparing (6) and (14), the decay rate of the correlation between image patches is consistent with that of the pixel-wise covariance function $\gamma(\tau)$.
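The bound (14) can be checked numerically on a small example. The following is a minimal sketch, assuming NumPy and a row-major flattening of the $(2r+1)\times(2r+1)$ patch; the helper names `patch_offsets`, `cross_cov`, and `frobenius_bound` and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def patch_offsets(r):
    """Pixel offsets of a (2r+1) x (2r+1) patch, flattened to shape (d, 2)."""
    g = np.arange(-r, r + 1)
    return np.stack(np.meshgrid(g, g, indexing="ij"), axis=-1).reshape(-1, 2)

def cross_cov(x, y, r, sigma_p, l_p):
    """Cross-covariance Sigma_c between the patches centred at x and y."""
    px = np.asarray(x, dtype=float) + patch_offsets(r)  # pixels of P(x)
    py = np.asarray(y, dtype=float) + patch_offsets(r)  # pixels of P(y)
    # tau_ij^2: squared distance between pixel i of P(x) and pixel j of P(y)
    tau2 = ((px[:, None, :] - py[None, :, :]) ** 2).sum(axis=-1)
    return sigma_p**2 * np.exp(-tau2 / (2.0 * l_p**2))

def frobenius_bound(tau, r, sigma_p, l_p):
    """Right-hand side of (14)."""
    d = (2 * r + 1) ** 2
    return sigma_p**2 * d * np.exp(-(tau - 2 * np.sqrt(2) * r) ** 2 / (2.0 * l_p**2))

# Illustrative (assumed) parameters; every tau satisfies tau > 2*sqrt(2)*r.
r, sigma_p, l_p = 2, 1.5, 3.0
x = np.array([0.0, 0.0])
results = []
for tau in (6.0, 10.0, 20.0, 40.0):
    y = np.array([tau, 0.0])
    lhs = np.linalg.norm(cross_cov(x, y, r, sigma_p, l_p), "fro")
    rhs = frobenius_bound(tau, r, sigma_p, l_p)
    results.append((tau, lhs, rhs))
    print(f"tau={tau:5.1f}  ||Sigma_c||_F={lhs:.3e}  bound={rhs:.3e}")
```

Both columns shrink rapidly as $\tau$ grows, which illustrates the convergence $\Sigma_{\mathrm{c}}(\tau)\to O_{d}$ for fixed $r$.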

The conditional distribution of $\vec{\mathcal{P}}(\mathbf{y})$ given $\vec{\mathcal{P}}(\mathbf{x})$ is again multivariate Gaussian, and is fully determined by its mean and covariance functions

$$
\vec{\mu}_{p}(\mathbf{y};\mathbf{x})=\vec{\mu}_{p}+\Sigma_{\mathrm{c}}^{T}(\tau)\Sigma_{p}^{-1}\left(\vec{\mathcal{P}}(\mathbf{x})-\vec{\mu}_{p}\right), \tag{15}
$$

$$
\Sigma_{p}(\mathbf{y};\mathbf{x})=\Sigma_{p}-\Sigma_{\mathrm{c}}^{T}(\tau)\Sigma_{p}^{-1}\Sigma_{\mathrm{c}}(\tau). \tag{16}
$$

Combining with (14), $\vec{\mu}_{p}(\mathbf{y};\mathbf{x})$ and $\Sigma_{p}(\mathbf{y};\mathbf{x})$ converge to $\vec{\mu}_{p}$ and $\Sigma_{p}$ as $\tau\to\infty$, i.e.,

$$
\lim_{\tau\to\infty}\|\vec{\mu}_{p}(\mathbf{y};\mathbf{x})-\vec{\mu}_{p}\|_{F}=0,\quad\lim_{\tau\to\infty}\|\Sigma_{p}(\mathbf{y};\mathbf{x})-\Sigma_{p}\|_{F}=0.
$$
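The conditional moments (15) and (16) and their limits can be illustrated with a short computation. This is a sketch under stated assumptions: NumPy, a squared-exponential covariance as in (6), an arbitrary observed patch `v`, and a small diagonal jitter added to $\Sigma_{p}$ purely for numerical stability. The names `patch_pixels`, `se_cov`, and `conditional` are hypothetical.

```python
import numpy as np

def patch_pixels(center, r):
    """Coordinates of the (2r+1)^2 pixels of the patch centred at `center`."""
    g = np.arange(-r, r + 1)
    off = np.stack(np.meshgrid(g, g, indexing="ij"), axis=-1).reshape(-1, 2)
    return np.asarray(center, dtype=float) + off

def se_cov(a, b, sigma_p, l_p):
    """Squared-exponential covariance between two sets of pixel locations."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return sigma_p**2 * np.exp(-d2 / (2.0 * l_p**2))

def conditional(v, x, y, r, mu_p, sigma_p, l_p, jitter=1e-8):
    """Mean (15) and covariance (16) of P(y) given P(x) = v."""
    px, py = patch_pixels(x, r), patch_pixels(y, r)
    d = px.shape[0]
    Sp = se_cov(px, px, sigma_p, l_p) + jitter * np.eye(d)  # jitter is a numerical assumption
    Sc = se_cov(px, py, sigma_p, l_p)                       # Cov(P(x), P(y))
    mean = mu_p + Sc.T @ np.linalg.solve(Sp, v - mu_p)
    cov = Sp - Sc.T @ np.linalg.solve(Sp, Sc)
    return mean, cov, Sp

# Illustrative (assumed) parameters and an arbitrary observed patch v.
rng = np.random.default_rng(0)
r, mu_p, sigma_p, l_p = 1, 0.5, 1.0, 1.0
d = (2 * r + 1) ** 2
v = mu_p + sigma_p * rng.standard_normal(d)
x = np.array([0.0, 0.0])
gaps = []
for tau in (3.0, 6.0, 12.0, 24.0):
    mean, cov, Sp = conditional(v, x, np.array([tau, 0.0]), r, mu_p, sigma_p, l_p)
    gaps.append((tau, np.linalg.norm(mean - mu_p), np.linalg.norm(cov - Sp, "fro")))
    print(f"tau={tau:5.1f}  mean gap={gaps[-1][1]:.3e}  cov gap={gaps[-1][2]:.3e}")
```

For the largest separation both gaps are zero to machine precision, in line with the two limits above.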

3.2 Characteristics of the patch response

In the following, we provide the main results of the section. To compute the expectation of $\mathcal{R}(\mathbf{y};\mathbf{x})$, we need the following lemma:

Lemma 1 (Expectation of quadratic form [27]).

Let $\vec{\mathcal{P}}$ be a $d\times 1$ random vector with mean $\vec{\mu}_{p}$ and covariance matrix $\Sigma_{p}$, and let $A$ be a $d\times d$ symmetric matrix. Then

$$
\mathbb{E}(\vec{\mathcal{P}}^{T}A\vec{\mathcal{P}})=\vec{\mu}_{p}^{T}A\vec{\mu}_{p}+\mathrm{tr}(A\Sigma_{p})
$$

where $\mathrm{tr}(\cdot)$ is the trace operator.
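For the reader's convenience, Lemma 1 can be recovered in a few steps from the cyclic property of the trace and the identity $\mathbb{E}(\vec{\mathcal{P}}\vec{\mathcal{P}}^{T})=\Sigma_{p}+\vec{\mu}_{p}\vec{\mu}_{p}^{T}$. The sketch below is a standard argument and is not taken from [27]:

```latex
\begin{align*}
\mathbb{E}\big(\vec{\mathcal{P}}^{T} A \vec{\mathcal{P}}\big)
  &= \mathbb{E}\big(\mathrm{tr}(A\,\vec{\mathcal{P}}\vec{\mathcal{P}}^{T})\big)
   = \mathrm{tr}\big(A\,\mathbb{E}(\vec{\mathcal{P}}\vec{\mathcal{P}}^{T})\big) \\
  &= \mathrm{tr}\big(A(\Sigma_{p} + \vec{\mu}_{p}\vec{\mu}_{p}^{T})\big)
   = \mathrm{tr}(A\Sigma_{p}) + \vec{\mu}_{p}^{T} A \vec{\mu}_{p}.
\end{align*}
```

Only the first two moments of $\vec{\mathcal{P}}$ enter this argument, so Gaussianity is not needed for the lemma itself.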

Theorem 1.

Let the random field $\mathcal{P}$ be defined as in Definition 1, equipped with the covariance function (6). Then the patch response $\mathcal{R}(\mathbf{y};\mathbf{x})=\frac{1}{d}\|\vec{\mathcal{P}}(\mathbf{y})-\vec{\mathcal{P}}(\mathbf{x})\|_{2}^{2}$, where $d=(2r+1)^{2}$, has expectation

$$
\mathbb{E}\left(\mathcal{R}(\mathbf{y};\mathbf{x})\right)=2\sigma_{p}^{2}\left(1-\exp\left(-\frac{\tau^{2}}{2l_{p}^{2}}\right)\right). \tag{17}
$$

The proof is presented in Appendix A. Theorem 1 describes the expectation of $\mathcal{R}(\mathbf{y};\mathbf{x})$ when the patch centered at location $\mathbf{y}$ is drawn from $\mathcal{P}$.
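Formula (17) can be cross-checked without sampling. Applying Lemma 1 with $A=I$ to the zero-mean difference $\vec{\mathcal{P}}(\mathbf{y})-\vec{\mathcal{P}}(\mathbf{x})$, whose covariance is $2\Sigma_{p}-\Sigma_{\mathrm{c}}-\Sigma_{\mathrm{c}}^{T}$ by (13), gives $\mathbb{E}(\mathcal{R})=\frac{2}{d}\left(\mathrm{tr}(\Sigma_{p})-\mathrm{tr}(\Sigma_{\mathrm{c}})\right)$. The sketch below evaluates this trace expression and compares it with (17); it is not the proof in Appendix A, and the helper names and parameter values are assumptions.

```python
import numpy as np

def patch_pixels(center, r):
    """Coordinates of the (2r+1)^2 pixels of the patch centred at `center`."""
    g = np.arange(-r, r + 1)
    off = np.stack(np.meshgrid(g, g, indexing="ij"), axis=-1).reshape(-1, 2)
    return np.asarray(center, dtype=float) + off

def se_cov(a, b, sigma_p, l_p):
    """Squared-exponential covariance between two sets of pixel locations."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return sigma_p**2 * np.exp(-d2 / (2.0 * l_p**2))

def expected_response(x, y, r, sigma_p, l_p):
    """E R(y; x) from the joint covariance (13), using Lemma 1 with A = I."""
    px, py = patch_pixels(x, r), patch_pixels(y, r)
    d = px.shape[0]
    Sp = se_cov(px, px, sigma_p, l_p)
    Sc = se_cov(px, py, sigma_p, l_p)
    # P(y) - P(x) has zero mean and covariance 2*Sp - Sc - Sc^T
    return (2.0 * np.trace(Sp) - 2.0 * np.trace(Sc)) / d

def theorem1(tau, sigma_p, l_p):
    """Closed form (17)."""
    return 2.0 * sigma_p**2 * (1.0 - np.exp(-tau**2 / (2.0 * l_p**2)))

# Illustrative (assumed) parameters; several directions of y - x are tried.
x = np.array([0.0, 0.0])
r, sigma_p, l_p = 2, 0.8, 2.5
checks = []
for y in ([3.0, 0.0], [0.0, 7.0], [4.0, 4.0], [10.0, -6.0]):
    y = np.array(y)
    tau = np.linalg.norm(y - x)
    checks.append((tau, expected_response(x, y, r, sigma_p, l_p), theorem1(tau, sigma_p, l_p)))
    print(f"tau={tau:6.3f}  trace form={checks[-1][1]:.6f}  (17)={checks[-1][2]:.6f}")
```

The agreement reflects that the $i$-th pixels of the two patches are always exactly $\tau$ apart, so every diagonal entry of $\Sigma_{\mathrm{c}}(\tau)$ equals $\sigma_{p}^{2}\exp(-\tau^{2}/(2l_{p}^{2}))$.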

When it is not, i.e. when comparing two different textures, let $\vec{\mathcal{Q}}(\mathbf{y})\sim\mathcal{N}\left(\vec{\mu}_{q},\Sigma_{q}\right)$ be another random field independent of $\mathcal{P}$, where the covariance function is given as

$$
\mathrm{Cov}(\mathcal{Q}(\mathbf{x}_{1}),\mathcal{Q}(\mathbf{x}_{2}))=\gamma_{q}(\mathbf{x}_{1},\mathbf{x}_{2})=\sigma_{q}^{2}\exp\left(-\frac{\|\mathbf{x}_{1}-\mathbf{x}_{2}\|_{2}^{2}}{2l_{q}^{2}}\right)
$$

for some $\sigma_{q},l_{q}>0$. If the patch $\vec{\mathcal{P}}(\mathbf{x})$ is observing $\vec{\mathcal{Q}}(\mathbf{y})$, we simply have

$$
\vec{\mu}_{q}(\mathbf{y};\mathbf{x})=\vec{\mu}_{q},\quad\Sigma_{q}(\mathbf{y};\mathbf{x})=\Sigma_{q},
$$

since $\mathrm{Cov}(\mathcal{P}(\mathbf{x}),\mathcal{Q}(\mathbf{y}))=0$. The expectation of the patch response is given as

$$
\begin{aligned}
\mathbb{E}\left(\frac{1}{d}\|\vec{\mathcal{P}}(\mathbf{x})-\vec{\mathcal{Q}}(\mathbf{y})\|_{2}^{2}\right)
&=\frac{1}{d}\mathbb{E}_{\mathbf{x}}\left(\mathbb{E}_{\mathbf{y}|\mathbf{x}}\left(\vec{\mathcal{P}}(\mathbf{x})^{T}\vec{\mathcal{P}}(\mathbf{x})-2\vec{\mathcal{P}}(\mathbf{x})^{T}\vec{\mathcal{Q}}(\mathbf{y})+\vec{\mathcal{Q}}(\mathbf{y})^{T}\vec{\mathcal{Q}}(\mathbf{y})\right)\right)\\
&=\frac{1}{d}\left(\vec{\mu}_{p}^{T}\vec{\mu}_{p}+\mathrm{tr}(\Sigma_{p})-2\vec{\mu}_{p}^{T}\vec{\mu}_{q}+\vec{\mu}_{q}^{T}\vec{\mu}_{q}+\mathrm{tr}(\Sigma_{q})\right)\\
&=\frac{1}{d}\left(\|\vec{\mu}_{p}-\vec{\mu}_{q}\|_{2}^{2}+\mathrm{tr}(\Sigma_{p})+\mathrm{tr}(\Sigma_{q})\right)=(\mu_{p}-\mu_{q})^{2}+\sigma_{p}^{2}+\sigma_{q}^{2}.
\end{aligned}
\tag{18}
$$
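A quick Monte Carlo experiment gives a sanity check of (18). The sketch below assumes NumPy, constant patch means $\mu_{p}$ and $\mu_{q}$, squared-exponential patch covariances, and a small jitter so that the Cholesky factorization is numerically safe. The names `patch_cov` and `sample_patches`, the seed, and all parameter values are hypothetical.

```python
import numpy as np

def patch_cov(r, sigma, l, jitter=1e-8):
    """Squared-exponential covariance of a (2r+1) x (2r+1) patch (jitter is a numerical assumption)."""
    g = np.arange(-r, r + 1)
    pts = np.stack(np.meshgrid(g, g, indexing="ij"), axis=-1).reshape(-1, 2).astype(float)
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1)
    return sigma**2 * np.exp(-d2 / (2.0 * l**2)) + jitter * np.eye(len(pts))

def sample_patches(rng, n, mu, cov):
    """Draw n Gaussian patches with constant mean mu and covariance cov."""
    L = np.linalg.cholesky(cov)
    return mu + rng.standard_normal((n, cov.shape[0])) @ L.T

# Illustrative (assumed) texture parameters.
rng = np.random.default_rng(0)
r, n = 2, 50_000
d = (2 * r + 1) ** 2
mu_p, sigma_p, l_p = 0.2, 0.5, 1.0
mu_q, sigma_q, l_q = 0.7, 0.3, 1.2

P = sample_patches(rng, n, mu_p, patch_cov(r, sigma_p, l_p))  # patches from texture P
Q = sample_patches(rng, n, mu_q, patch_cov(r, sigma_q, l_q))  # independent patches from texture Q

mc = np.mean(np.sum((P - Q) ** 2, axis=1)) / d          # empirical mean response
exact = (mu_p - mu_q) ** 2 + sigma_p**2 + sigma_q**2    # right-hand side of (18)
print(f"Monte Carlo: {mc:.4f}   formula (18): {exact:.4f}")
```

Note that the length scales $l_{p}$ and $l_{q}$ affect the spread of the samples but not the expected value, consistent with their absence from (18).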

Suppose there are two textures $\mathcal{P},\mathcal{Q}$ in $\mathcal{B}_{R}(\mathbf{x})$ while the patch in $\mathcal{B}_{r}(\mathbf{x})$ is drawn from $\mathcal{P}$. A texture edge can be detected if the quantities (17) and (18) differ, preferably significantly. This difference is described by

$$
\begin{aligned}
\mathrm{diff}(\tau)&=\left\lvert\mathbb{E}\left(\frac{1}{d}\|\vec{\mathcal{P}}(\mathbf{y})-\vec{\mathcal{P}}(\mathbf{x})\|_{2}^{2}-\frac{1}{d}\|\vec{\mathcal{P}}(\mathbf{x})-\vec{\mathcal{Q}}(\mathbf{y})\|_{2}^{2}\right)\right\rvert\\
&=\left\lvert(\mu_{p}-\mu_{q})^{2}+(\sigma_{p}^{2}-\sigma_{q}^{2})-2\sigma_{p}^{2}\exp\left(-\frac{\tau^{2}}{2l_{p}^{2}}\right)\right\rvert.
\end{aligned}
\tag{19}
$$

Notice that the difference (19) is a function of $\tau$. For larger $\tau$ the separation is clearer; yet in a real image, the patch $\vec{\mathcal{P}}(\mathbf{x})$ is also more likely to encounter another texture $\vec{\mathcal{Q}}(\mathbf{y})$ when $\mathbf{y}$ is far away from $\mathbf{x}$. It is therefore important to search for edges in $\mathcal{R}(\cdot;\mathbf{x})$ in the region where $\tau$ is large.

3.3 Stability of the patch response w.r.t. the patch width parameter $r$

We explore how the value of the patch response $\mathcal{R}(\mathbf{y};\mathbf{x})$ concentrates around its expectation with respect to the patch width parameter $r$. Since $\mathcal{R}(\mathbf{y};\mathbf{x})$ is a quadratic form of two Gaussian vectors, its distribution can be described by a variation of the $\chi^{2}$ distribution [2].

Theorem 2.

Let $\mathcal{P}$ be defined as in Theorem 1. Then the patch response $\mathcal{R}(\mathbf{y};\mathbf{x})|_{\vec{\mathcal{P}}(\mathbf{x})=\vec{v}}$ follows a generalized $\chi^{2}$ distribution [2] in $\vec{\mathcal{P}}(\mathbf{y})$. Assuming $\frac{1}{d}\|\vec{v}-\vec{\mu}_{p}\|_{2}^{2}\sim\mathcal{O}(\sigma_{p}^{2})$, the variance of the patch response becomes

$$
\mathrm{Var}\left(\mathcal{R}(\mathbf{y};\mathbf{x})|_{\vec{\mathcal{P}}(\mathbf{x})=\vec{v}}\right)\sim\mathcal{O}\left(\frac{\sigma_{p}^{4}}{r^{2}}\right).
$$
Proof.

Fix $\vec{\mathcal{P}}(\mathbf{x})=\vec{v}$, and denote $\vec{\mathcal{S}}:=\frac{1}{\sqrt{d}}\left(\vec{\mathcal{P}}(\mathbf{y})|_{\vec{\mathcal{P}}(\mathbf{x})=\vec{v}}-\vec{v}\right)$. Then $\vec{\mathcal{S}}$ follows the Gaussian distribution $\vec{\mathcal{S}}\sim\mathcal{N}\left(\vec{\mu}_{*},\Sigma_{*}\right)$, where according to (15) and (16), we have

$$
\vec{\mu}_{*}=\frac{1}{\sqrt{d}}\left(\mathbb{E}\left(\vec{\mathcal{P}}(\mathbf{y})|_{\vec{\mathcal{P}}(\mathbf{x})=\vec{v}}\right)-\vec{v}\right),\quad\text{and}\quad\Sigma_{*}=\frac{1}{d}\left(\Sigma_{p}-\Sigma_{\mathrm{c}}^{T}(\tau)\Sigma_{p}^{-1}\Sigma_{\mathrm{c}}(\tau)\right).
$$

Let $Q$ be an orthogonal matrix that diagonalizes $\Sigma_{*}$, that is, $Q^{T}\Sigma_{*}Q=\mathrm{diag}(\lambda_{1},\lambda_{2},\dots,\lambda_{d})=\Lambda$, where $\lambda_{i}>0$ are the eigenvalues of $\Sigma_{*}$. Define a new random vector

$$
\vec{\mathcal{U}}=Q^{T}\Sigma_{*}^{-\frac{1}{2}}(\vec{\mathcal{S}}-\vec{\mu}_{*}),
$$

where $\vec{\mathcal{U}}$ is standard Gaussian, i.e., $\vec{\mathcal{U}}\sim\mathcal{N}(\vec{0},I_{d})$. The observed patch response can be reformulated as

$$
\mathcal{R}(\mathbf{y};\mathbf{x})|_{\vec{\mathcal{P}}(\mathbf{x})=\vec{v}}=\|\vec{\mathcal{S}}\|_{2}^{2}=(\vec{\mathcal{U}}+\vec{b})^{T}Q^{T}\Sigma_{*}Q(\vec{\mathcal{U}}+\vec{b})=(\vec{\mathcal{U}}+\vec{b})^{T}\Lambda(\vec{\mathcal{U}}+\vec{b})=\sum_{j=1}^{d}\lambda_{j}(\mathcal{U}_{j}+b_{j})^{2}, \tag{20}
$$

where $\vec{b}=Q^{T}\Sigma_{*}^{-\frac{1}{2}}\vec{\mu}_{*}$, and $\mathcal{U}_{j}$ and $b_{j}$ denote the $j$-th elements of the vectors $\vec{\mathcal{U}}$ and $\vec{b}$, respectively. The response (20) is a weighted sum of squares of $d$ independent Gaussian variables $(\mathcal{U}_{j}+b_{j})\sim\mathcal{N}(b_{j},1)$. Each $(\mathcal{U}_{j}+b_{j})^{2}$ follows a noncentral chi-squared distribution $\chi_{\nu}^{2}(\delta)$, which is fully described by the degrees of freedom $\nu$ and the noncentrality parameter $\delta$; the mean and variance of such a distribution are given by $\nu+\delta$ and $2(\nu+2\delta)$.
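The change of variables behind (20) is purely algebraic and can be verified on a small example. The following sketch assumes NumPy and uses a random symmetric positive definite matrix and a random vector as hypothetical stand-ins for $\Sigma_{*}$ and $\vec{\mu}_{*}$, rather than the quantities derived from an actual texture. It also evaluates the variance of the weighted sum from the noncentral chi-squared variance quoted above, and compares it with the equivalent matrix expression $2\,\mathrm{tr}(\Sigma_{*}^{2})+4\vec{\mu}_{*}^{T}\Sigma_{*}\vec{\mu}_{*}$.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 6

# Hypothetical stand-ins for Sigma_* and mu_*: any SPD matrix and any vector.
B = rng.standard_normal((d, d))
Sigma_star = B @ B.T / d + 0.1 * np.eye(d)
mu_star = rng.standard_normal(d)

# Orthogonal diagonalisation Q^T Sigma_* Q = Lambda = diag(lam).
lam, Q = np.linalg.eigh(Sigma_star)
inv_sqrt = Q @ np.diag(lam**-0.5) @ Q.T  # symmetric Sigma_*^{-1/2}

# One draw S ~ N(mu_*, Sigma_*), then the whitened vector U and the shift b.
S = mu_star + Q @ (np.sqrt(lam) * rng.standard_normal(d))
U = Q.T @ inv_sqrt @ (S - mu_star)
b = Q.T @ inv_sqrt @ mu_star

lhs = S @ S                       # ||S||_2^2
rhs = np.sum(lam * (U + b) ** 2)  # weighted sum of squares in (20)

# Variance of the weighted sum, using Var = 2(nu + 2 delta) with nu = 1, delta = b_j^2.
var_sum = np.sum(lam**2 * 2.0 * (1.0 + 2.0 * b**2))
var_closed = 2.0 * np.trace(Sigma_star @ Sigma_star) + 4.0 * mu_star @ Sigma_star @ mu_star

print(f"||S||^2 = {lhs:.6f}, weighted sum = {rhs:.6f}")
print(f"variance (sum form) = {var_sum:.6f}, variance (matrix form) = {var_closed:.6f}")
```

Here the symmetric inverse square root is built from the same eigendecomposition, so $Q^{T}\Sigma_{*}^{-1/2}=\Lambda^{-1/2}Q^{T}$; other square-root choices would change $\vec{\mathcal{U}}$ and $\vec{b}$ but this sketch does not explore them.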
Specifically, we have (𝒰j+bj)2χ12(bj2)similar-tosuperscriptsubscript𝒰𝑗subscript𝑏𝑗2subscriptsuperscript𝜒21subscriptsuperscript𝑏2𝑗(\mathcal{U}_{j}+b_{j})^{2}\sim\chi^{2}_{1}(b^{2}_{j})( caligraphic_U start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT + italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ∼ italic_χ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( italic_b start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ). The density of the patch response (20) in general does not have a closed form [8]. With d=(2r+1)2𝑑superscript2𝑟12d=(2r+1)^{2}italic_d = ( 2 italic_r + 1 ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT, its variance becomes

\begin{align*}
\mathrm{Var}\left(\mathcal{R}(\mathbf{y};\mathbf{x})|_{\vec{\mathcal{P}}(\mathbf{x})=\vec{v}}\right)
&=\sum_{j=1}^{d}\lambda_{j}^{2}\,\mathrm{Var}\left((\mathcal{U}_{j}+b_{j})^{2}\right)=2\sum_{j=1}^{d}\lambda_{j}^{2}(1+2b_{j}^{2})\\
&=2\,\mathrm{tr}(\Lambda^{2})+4\vec{b}^{T}\Lambda^{2}\vec{b}=2\,\mathrm{tr}(\Sigma_{*}^{2})+4\vec{\mu}_{*}^{T}\Sigma_{*}\vec{\mu}_{*}\\
&=\frac{2}{d^{2}}\left(\mathrm{tr}\left(\Sigma_{p}(\mathbf{y};\mathbf{x})^{2}\right)+2(\vec{\mu}_{p}-\vec{v})^{T}\Sigma_{p}(\mathbf{y};\mathbf{x})(\vec{\mu}_{p}-\vec{v})\right)\sim\mathcal{O}\left(\frac{\sigma_{p}^{4}}{d}\right).
\end{align*}
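The first step of this variance computation, $\mathrm{Var}\big(\sum_{j}\lambda_{j}(\mathcal{U}_{j}+b_{j})^{2}\big)=2\sum_{j}\lambda_{j}^{2}(1+2b_{j}^{2})$, can be checked numerically. The following is a minimal Monte Carlo sketch using only the Python standard library; the function names, the weights `lams` and the shifts `bs` are illustrative assumptions and do not come from the paper.

```python
import random


def weighted_noncentral_chi2_variance(lams, bs):
    """Closed-form variance of sum_j lam_j * (U_j + b_j)^2 with U_j ~ N(0, 1) i.i.d."""
    return 2.0 * sum(l * l * (1.0 + 2.0 * b * b) for l, b in zip(lams, bs))


def empirical_variance(lams, bs, n_samples=100_000, seed=0):
    """Unbiased sample variance of the same weighted sum, estimated by simulation."""
    rng = random.Random(seed)
    samples = [
        sum(l * (rng.gauss(0.0, 1.0) + b) ** 2 for l, b in zip(lams, bs))
        for _ in range(n_samples)
    ]
    mean = sum(samples) / n_samples
    return sum((s - mean) ** 2 for s in samples) / (n_samples - 1)


# Hypothetical weights (eigenvalues) and shifts, chosen only for illustration.
lams = [0.5, 0.3, 0.2]
bs = [1.0, -0.5, 2.0]
theory = weighted_noncentral_chi2_variance(lams, bs)
estimate = empirical_variance(lams, bs)
```

With enough samples, `estimate` should agree with `theory` up to sampling error, which grows with the size of the shifts $b_{j}$.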

In Figure 3, (a) shows a synthetic image consisting of two textures $\mathcal{P}$ (left) and $\mathcal{Q}$ (right) from the Brodatz texture image set (obtained from https://sipi.usc.edu/database/). Two patches $\vec{\mathcal{P}}(\mathbf{x})$ and $\vec{\mathcal{Q}}(\mathbf{y})$ are marked with blue and yellow squares, respectively. (b) and (c) show the two patch responses $\mathcal{R}(\cdot;\mathbf{x})$ and $\mathcal{R}(\cdot;\mathbf{y})$, where the brightness is proportional to the value of the patch response. With a suitable patch width parameter $r$, the texture edge is clearly emphasized, which is consistent with the analysis in Section 3.2. The contrast between the two textured regions clearly indicates the edge location.

Figure 3: (a) Synthetic texture image with two textures $\mathcal{P}$ (left) and $\mathcal{Q}$ (right). Two patches $\vec{\mathcal{P}}(\mathbf{x})$ and $\vec{\mathcal{Q}}(\mathbf{y})$ are marked with blue and yellow. (b) and (c) show the two patch responses $\mathcal{R}(\cdot;\mathbf{x})$ and $\mathcal{R}(\cdot;\mathbf{y})$, respectively. Note that the texture edge is clearly emphasized with a suitable patch width parameter $r$.

The edge function $V$ is given by the consensus of many patch responses. For accurate edge detection, the responses from different observer patches should be consistent; that is, many patch responses should recognize that there is an edge. This can be measured by the distribution of $\mathbb{E}_{\mathbf{y}|\mathbf{x}}\left(\mathcal{R}(\mathbf{y};\mathbf{x})\right)$, the expected response from the perspective of patch $\vec{\mathcal{P}}(\mathbf{x})$. In Figure 4 (a), the intensity histograms of the two textures in Figure 3 (a) are presented. The intensity values of the two textures overlap heavily, which indicates the difficulty of detecting texture boundaries with intensity-based methods. Using the patch-based consensus, Figure 4 (b) and (c) show that the expectation becomes more concentrated as the patch width parameter increases. This is consistent with Theorem 2 and helps to distinguish the two textures. Figure 4 (b) and (c) show the estimated distribution of $\mathbb{E}_{\mathbf{y}|\mathbf{x}}\left(\mathcal{R}(\mathbf{y};\mathbf{x})\right)$: (b) assumes that the pixel $\mathbf{x}$ is from a random field $\mathcal{P}$, and (c) assumes that the pixel $\mathbf{x}$ is from a random field $\mathcal{Q}$.
Note that $\mathbb{E}_{\mathbf{x}}\left(\mathbb{E}_{\mathbf{y}|\mathbf{x}}\left(\mathcal{R}(\mathbf{y};\mathbf{x})\right)\right)$ is given by (17) in Theorem 1 if $\mathbf{y}$ is equipped with $\mathcal{P}$, and by (18) if it is equipped with $\mathcal{Q}$. Both sets of distributions concentrate around the estimated expectations, which are computed from the pixel-wise means and variances of the texture images. The concentration rates differ between the two cases, which is due to the difference in variance between the two textures. In particular, we have $\sigma_{p}>\sigma_{q}$, and according to Theorem 2, the variance of $\mathbb{E}_{\mathbf{y}|\mathbf{x}}\left(\mathcal{R}(\mathbf{y};\mathbf{x})\right)$ is $\mathcal{O}(\sigma_{p}^{4}/r^{2})$ for Figure 4 (b) and $\mathcal{O}(\sigma_{q}^{4}/r^{2})$ for Figure 4 (c). Neither of the textures $\mathcal{P}$ and $\mathcal{Q}$ is strictly stationary or isotropic, yet our model describes the behavior of the patch response well.

Figure 4: (a) The intensity histograms of the two textures in Figure 3 (a). (b) Estimated distributions of the patch response $\mathbb{E}_{\mathbf{y}|\mathbf{x}}\left(\mathcal{R}(\mathbf{y};\mathbf{x})\right)$ as $r$ increases, assuming the patch centered at $\mathbf{x}$ is sampled from $\mathcal{P}$, and $\mathbf{y}$ from $\mathcal{P}$ or $\mathcal{Q}$ as indicated in the legend. (c) Same as (b), assuming the patch centered at $\mathbf{x}$ is sampled from $\mathcal{Q}$.

3.4 The patch width parameter $r$ and edge detection

The quality of edge detection depends on the intensity contrast of the patch response $\mathcal{R}(\mathbf{y};\mathbf{x})$. This contrast is given by the responses of the patch centered at $\mathbf{x}$ when observing $\mathcal{P}$ or $\mathcal{Q}$, and the regularity of $\mathcal{R}(\mathbf{y};\mathbf{x})$ is related to the choice of $r$.

For the separation of two textures, we first assume that $R$ is chosen such that $\mathcal{B}_{R}(\mathbf{x})$ contains two different textures $\mathcal{P}$ and $\mathcal{Q}$, with a fixed pixel $\mathbf{x}$ away from any texture boundary. We use the squared Hellinger distance [11] between two probability density functions $f_{1},f_{2}$ to compare the two different patch responses:

\begin{equation}
\mathcal{H}^{2}(f_{1},f_{2})=1-\sqrt{\langle f_{1},f_{2}\rangle}\in[0,1], \tag{21}
\end{equation}

where $\langle\cdot,\cdot\rangle$ denotes the inner product. The squared Hellinger distance (21) is a bounded measure of how much the probability density functions $f_{1},f_{2}$ overlap. In Figure 5, the blue curve shows the squared Hellinger distance between the patch responses observing textures $\mathcal{P}$ and $\mathcal{Q}$ from the perspective of patch $\vec{\mathcal{P}}(\mathbf{x})$, and the red curve shows the same distance from the perspective of patch $\vec{\mathcal{Q}}(\mathbf{x})$. These curves represent the differences between the density functions shown in Figure 4 (b) and (c). As $r$ increases, the two responses become better separated in Figure 4 (b) and (c), which is reflected in the increasing value of the squared Hellinger distance. The blue and red curves grow at different rates as $r$ increases, which is due to the difference in the variances of the two textures $\mathcal{P}$ and $\mathcal{Q}$ in Figure 4. The horizontal dashed line in Figure 5 shows that a wide range of $r$ separates the two textures.
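For estimated distributions such as histograms, the overlap-based distance can be computed directly. The sketch below is only an illustration: it assumes the standard discrete form of the squared Hellinger distance, $1-\sum_{i}\sqrt{p_{i}q_{i}}$, as the reading of (21), and the function name `squared_hellinger` is hypothetical.

```python
import math


def squared_hellinger(p, q):
    """Squared Hellinger distance between two histograms over the same bins.

    Assumption: the standard discrete form 1 - sum_i sqrt(p_i * q_i) is used,
    after normalizing both histograms to sum to one.
    """
    if len(p) != len(q):
        raise ValueError("histograms must share the same bins")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("histograms must have positive mass")
    # Overlap term (Bhattacharyya coefficient) of the normalized histograms.
    overlap = sum(math.sqrt((a / total_p) * (b / total_q)) for a, b in zip(p, q))
    # Guard against tiny negative values caused by rounding.
    return max(0.0, 1.0 - overlap)
```

Identical histograms give a distance of zero and histograms with disjoint support give a distance of one, which matches the interpretation of the curves in Figure 5 as a measure of separation.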

Figure 5: Change of the distance between the distributions of $\mathbb{E}_{\mathbf{y}|\mathbf{x}}(\mathcal{R}(\mathbf{y};\mathbf{x}))$ with respect to the patch width parameter $r$. The blue line assumes the observer $\mathbf{x}$ is equipped with $\mathcal{P}$, while the red line assumes $\mathbf{x}$ is equipped with $\mathcal{Q}$. The horizontal dashed line indicates the minimal distance required for two textures to be distinguished by the segmentation model.

In practice, the patch width parameter $r$ only needs to meet the segmentation requirement of one of the two adjacent textures.

4 Numerical Details

We summarize the proposed method in Algorithm 1, which includes the following modifications for efficient computation.

Input: The given image $U$, the patch width parameter $r$, the large comparison region width parameter $R$, the regularity parameter $\lambda$ for the segmentation model (3), and the parameter $\delta$ for modification (22).
Initialize $V$ as a zero matrix of the size of $U$;
for $\mathbf{x}\in\Omega$ do
    for $\mathbf{y}\in\mathcal{B}_{R}(\mathbf{x})$ do
        Compute $\mathcal{R}(\mathbf{y};\mathbf{x})$ in (2), and modify to $\hat{\mathcal{R}}(\cdot;\mathbf{x})$ as in (22);
    end for
    Compute $\mathcal{W}(\cdot;\mathbf{x})$ from the segmentation (4), and modify to $\hat{\mathcal{W}}(\cdot;\mathbf{x})$ as in (23);
    Update $V|_{\mathcal{B}_{R}(\mathbf{x})}\leftarrow V|_{\mathcal{B}_{R}(\mathbf{x})}+\frac{1}{(2R+1)^{2}}\hat{\mathcal{W}}(\cdot;\mathbf{x})$;
end for
Output: $V$, the edge function of the given image $U$.
Algorithm 1 Texture Edge Detection by Patch consensus
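The voting structure of Algorithm 1 can be sketched in a few lines of Python. This is only an outline under stated assumptions: `local_edge_fn` is a hypothetical callable standing in for the patch response, segmentation and modifications (22)-(23), which are not implemented here, and clipping windows at the image border is an assumption of this sketch rather than something specified in the text.

```python
def tep_vote(height, width, R, local_edge_fn):
    """Accumulate local edge maps into the edge function V (voting step only).

    local_edge_fn(cx, cy) is a hypothetical callable that returns a
    (2R+1) x (2R+1) list of lists: the modified local edge function
    W-hat(.; x) for the window centered at pixel (cx, cy).
    """
    V = [[0.0] * width for _ in range(height)]
    weight = 1.0 / (2 * R + 1) ** 2
    for cy in range(height):
        for cx in range(width):
            W_hat = local_edge_fn(cx, cy)
            for dy in range(-R, R + 1):
                for dx in range(-R, R + 1):
                    y, x = cy + dy, cx + dx
                    # Assumption: votes falling outside the image are dropped.
                    if 0 <= y < height and 0 <= x < width:
                        V[y][x] += weight * W_hat[dy + R][dx + R]
    return V
```

Each pixel of `V` receives votes from every window that contains it, so a pixel that many observers mark as an edge accumulates a large value, while an isolated detection contributes only $1/(2R+1)^{2}$.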

First, when computing the patch response $\mathcal{R}(\cdot;\mathbf{x})$, if two points $\mathbf{x}$ and $\mathbf{y}$ are very close, i.e., $\mathbf{y}$ is inside $\mathcal{B}_{\delta}(\mathbf{x})$ for a small $\delta$, the patches $\vec{\mathcal{P}}(\mathbf{y})$ and $\vec{\mathcal{P}}(\mathbf{x})$ overlap in most parts. This results in an unwanted singularity around the center of $\mathcal{R}(\cdot;\mathbf{x})$. We remove this center singularity with a local average:

\begin{equation}
\hat{\mathcal{R}}(\mathbf{y};\mathbf{x})=\begin{cases}\frac{1}{\left|\mathcal{B}_{R}(\mathbf{x})/\mathcal{B}_{\delta}(\mathbf{x})\right|}\sum_{\mathbf{z}\in\mathcal{B}_{R}(\mathbf{x})/\mathcal{B}_{\delta}(\mathbf{x})}\mathcal{R}(\mathbf{z};\mathbf{x}) & \text{if }\|\mathbf{y}-\mathbf{x}\|_{\infty}\leq\delta,\\ \mathcal{R}(\mathbf{y};\mathbf{x}), & \text{otherwise}.\end{cases} \tag{22}
\end{equation}

The patch response $\mathcal{R}(\mathbf{y};\mathbf{x})$ in the subdomain $\mathcal{B}_{\delta}(\mathbf{x})$ is replaced by the average of $\mathcal{R}(\mathbf{z};\mathbf{x})$ over its complement $\mathcal{B}_{R}(\mathbf{x})/\mathcal{B}_{\delta}(\mathbf{x})$. In practice, we choose $\delta=5$ when the patch width parameter $r\in[10,30]$.
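A minimal sketch of the center modification (22) is given below. It assumes the patch response on $\mathcal{B}_{R}(\mathbf{x})$ is stored as a $(2R+1)\times(2R+1)$ list of lists with $\mathbf{x}$ at index $(R,R)$; the function name and this storage layout are assumptions made for illustration.

```python
def remove_center_singularity(resp, delta):
    """Replace the response inside B_delta(x) by the mean over B_R(x) minus B_delta(x).

    resp: (2R+1) x (2R+1) list of lists holding R(y; x), with x at index (R, R).
    Distances are measured in the sup norm, as in (22).
    """
    n = len(resp)
    R = n // 2
    outside = [
        resp[i][j]
        for i in range(n)
        for j in range(n)
        if max(abs(i - R), abs(j - R)) > delta
    ]
    if not outside:
        raise ValueError("delta must be smaller than R")
    avg = sum(outside) / len(outside)
    # Build a new array so the input response is left untouched.
    return [
        [avg if max(abs(i - R), abs(j - R)) <= delta else resp[i][j] for j in range(n)]
        for i in range(n)
    ]
```

Because the replacement value is the mean of the surrounding response, the center no longer stands out as a separate region when the window is segmented.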

Secondly, when $\mathcal{B}_{R}(\mathbf{x})$ is close to, but does not overlap with, any texture edge, the patch centered at some pixel $\mathbf{y}\in\mathcal{B}_{R}(\mathbf{x})$ may still see the texture edge outside $\mathcal{B}_{R}(\mathbf{x})$. This can cause the local edge function $\mathcal{W}(\mathbf{y};\mathbf{x})$ to report a false positive edge inside $\mathcal{B}_{R}(\mathbf{x})$, and give a thick and blurry edge in $V$. We make the local edge function $\mathcal{W}(\mathbf{y};\mathbf{x})$ respond only within $\mathcal{B}_{R}(\mathbf{x})$, away from its boundary, by the following modification:

\begin{equation}
\hat{\mathcal{W}}(\mathbf{y};\mathbf{x})=\begin{cases}\mathcal{W}(\mathbf{y};\mathbf{x}) & \text{if }\mathrm{d}(\mathbf{y},\partial\mathcal{B}_{R}(\mathbf{x}))>r,\\ 0 & \text{otherwise,}\end{cases} \tag{23}
\end{equation}

where $\mathrm{d}(\mathbf{y},\partial\mathcal{B}_{R}(\mathbf{x}))$ is the distance of pixel $\mathbf{y}$ to the boundary of $\mathcal{B}_{R}(\mathbf{x})$, i.e.,

\begin{equation*}
\mathrm{d}(\mathbf{y},\partial\mathcal{B}_{R}(\mathbf{x}))=\min\left\{\|\mathbf{y}-\mathbf{z}\|_{\infty}~|~\mathbf{z}\in\partial\mathcal{B}_{R}(\mathbf{x})\right\}.
\end{equation*}
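The boundary modification (23) can be sketched as a simple mask. The sketch assumes that $\partial\mathcal{B}_{R}(\mathbf{x})$ is the outermost ring of pixels of the square window, so that the sup-norm distance from $\mathbf{y}$ to the boundary equals $R-\|\mathbf{y}-\mathbf{x}\|_{\infty}$; this discrete reading of the boundary and the function name are assumptions.

```python
def suppress_near_boundary(W, r):
    """Zero the local edge function within sup-norm distance r of the window boundary.

    W: (2R+1) x (2R+1) list of lists holding W(y; x), with x at index (R, R).
    Assumption: the boundary is the outermost pixel ring, so the distance of
    the pixel at offset (i - R, j - R) to it is R - max(|i - R|, |j - R|).
    """
    n = len(W)
    R = n // 2
    out = []
    for i in range(n):
        row = []
        for j in range(n):
            dist_to_boundary = R - max(abs(i - R), abs(j - R))
            row.append(W[i][j] if dist_to_boundary > r else 0.0)
        out.append(row)
    return out
```

Under this reading, only the central $(2(R-r)-1)\times(2(R-r)-1)$ block of each window can vote, which requires $R>r$ for any vote to survive.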

Thirdly, we bound the number of phases to be $K\in\{1,2\}$ in the segmentation step. When $K=1$, the energy (3) reduces to the variance of the given image. The effect of the parameter $\lambda$ can be interpreted as a threshold on the segmentation model to give one or two phases. We set $\lambda$ between 0.01 and 0.05 when the normalized patch response $\mathcal{R}\in[0,1]$ is used. When the given image range is $U\in[0,255]$ and the patch response is not normalized, we use $\lambda$ between 450 and 1,000. In Figure 5, the horizontal dashed line represents the distance threshold for the two textures to be separated, i.e., for the segmentation model to choose $K=2$. The parameter $\lambda$ controls the regularity of the local edge function $\mathcal{W}(\mathbf{y};\mathbf{x})$, and effectively reduces the number of unwanted edges detected. With $\lambda$ fixed, different textures require different patch width parameters $r$ to find an edge (if there is one).

5 Numerical Experiments

In this section, we present numerical results exploring different aspects of the proposed model. First, Figure 6 illustrates the procedure of the proposed method. In the center figure, for each point $\mathbf{x}$, the yellow boxes show the local patch $\vec{\mathcal{P}}(\mathbf{x})$ with $r=3$, and the blue boxes show the patch responses $\mathcal{R}(\mathbf{y};\mathbf{x})$ in $\mathcal{B}_{R}(\mathbf{x})$. For each zoomed location, we present the yellow local patch $\vec{\mathcal{P}}(\mathbf{x})$, the patch response $\mathcal{R}(\mathbf{y};\mathbf{x})$ and the local edge function $\mathcal{W}(\mathbf{y};\mathbf{x})$. For $\mathbf{x}_{1}$, $\mathbf{x}_{3}$, and $\mathbf{x}_{4}$, two regions are identified and an edge is found between the two textures. For $\mathbf{x}_{6}$, two edges are found, separating the patch response into three regions, two of which represent the same textured region. Notice that for $\mathbf{x}_{2}$ and $\mathbf{x}_{5}$, although the textures are changing and the patch response shows some texture, they are identified as the same textured region and no edges are found.

Figure 6: For various locations $\mathbf{x}$, yellow boxes show the local patch $\vec{\mathcal{P}}(\mathbf{x})$ with $r=3$, and the blue boxes show the patch responses $\mathcal{R}(\mathbf{y};\mathbf{x})$ in $\mathcal{B}_{R}$. The zoomed-in images also show the local edge function $\mathcal{W}(\mathbf{y};\mathbf{x})$. For $\mathbf{x}_{1}$, $\mathbf{x}_{3}$, $\mathbf{x}_{4}$ and $\mathbf{x}_{6}$, edges are clearly found, while for $\mathbf{x}_{2}$ and $\mathbf{x}_{5}$, although the patch responses show some texture, they are identified as the same textured region and no edges are found.

5.1 Real images with texture

We present texture edge detection results for real textured images, and compare them with Canny edge detection [5]. In Figure 7, TEP finds texture and object boundaries without finding edges within textures. Zoom-ins of the red and yellow boxes in (a) are presented in (d)-(g), where (d) and (f) show how the TEP result $V(\mathbf{x})$ finds only the boundaries of the textures. In (d), the TEP result treats the checkerboard texture as one region, and is able to detect the subtle transitions at the corner of the table. The train rail is considered as a single entity despite the track lines in (d), whereas the Canny edge detection in (e) marks sharp gradient changes as edges and also finds the edges of the checkerboard pattern. In (f), notice that the shades caused by wrinkles are ignored by the proposed model, while they are captured by Canny edge detection in (g). For TEP, $r=3$, $R=35$, and $\lambda=1000$ are used, while for the Canny edge detection, we used $(0.04,0.1)$ for hysteresis thresholding and $\sigma=2$ for Gaussian blurring.

Figure 7: (a) The given image. (b) TEP result $V(\mathbf{x})$. (c) Canny edge detection. (d) and (e) are zoom-ins of the red box, and (f) and (g) are zoom-ins of the yellow box in (a). TEP edges $V(\mathbf{x})$ are shown in red in (d) and in yellow in (f). (e) and (g) show Canny edge detection in cyan. TEP finds the texture edges without finding the edges inside one texture.

In the first two rows of Figure 8, (a)-(g), the worm details are understood as texture in (d). For TEP, $r=5$, $R=20$, and $\lambda=800$ are used, and for the Canny edge detector, threshold parameters $\{0.04,0.1\}$ and $\sigma=2$ for the Gaussian filter are used. In the last row of Figure 8, the details of the hair are understood as texture by TEP, while Canny edge detection finds the details. For TEP, $r=5$, $R=30$, and $\lambda=450$ are used, and for the Canny edge detector, threshold parameters $\{0.12,0.3\}$ and $\sigma=2$ for the Gaussian filter are used. TEP consistently represents the regions better, even for textures with complicated and large-scale patterns.

Figure 8: (a) The given image. (b) TEP result $V(\mathbf{x})$. (c) Canny edge detection. (d) and (e) are zoom-ins of the red box, and (f) and (g) are zoom-ins of the yellow box in (a). TEP edges $V(\mathbf{x})$ are shown in red in (d) and in yellow in (f). (e) and (g) show Canny edge detection in cyan. With TEP, the worm details are clearly understood as texture in (d), and the red edge finds the boundary of the worm. (h) The given image. (i) and (j) are zoom-ins of the yellow box in (h). (i) TEP result $V(\mathbf{x})$ in yellow. (j) Canny edge detection in cyan. TEP identifies the hairy region as one texture.

TEP is a training-free method for texture edge detection. Nevertheless, in Figure 9, we present images from the Berkeley segmentation dataset BSDS500 and compare with a state-of-the-art machine learning model, Edge Detector with Transformer (EDTER) [24], as an example. Since the Berkeley segmentation dataset for edge detection was published [21], it has been a benchmark for contour detection, especially in the machine learning community [34, 19, 24, 10]. These methods are trained on color images with ground-truth data provided by human experts [21], and they aim at object detection. On the other hand, we apply TEP only to grayscale images, without any a priori knowledge of the image. TEP detects local texture edges and is not an object detection method. Even so, in Figure 9, TEP shows good edge detection and provides results comparable to those of the deep learning model. In the first-row images, TEP and EDTER both find the large-scale region with bricks (while Canny edge detection finds the details of the bricks). TEP assigns different strengths to the edges, with some parts weaker than others, while EDTER gives the same strength, since it is an object-oriented contour detector. In the second and third rows, the TEP edges are closer to the given image, grouping different textures correctly, and TEP finds irregular texture boundaries. In the last row, TEP finds more details of the dress, while the EDTER result is simplified; TEP sees the texture of the floor and finds the edge of that texture, while EDTER finds the edges of the floor tiles. With texture edge detection, TEP can give comparably good edge detection results.

Figure 9: Images from the BSDS500 dataset. (a) The given images. (b) TEP results. (c) Canny edge detection. (d) EDTER results.

5.2 The scale of the texture vs the patch width parameter

The patch width parameter $r$ can be adjusted to find different scales in the image. In Figure 10, we experiment with image (a), in which each object has a different texture scale. The background has the smallest texton (the smallest repeating unit in the texture). The triangle, the circle and the square have textons of increasing size, in that order. From image (b) to (d), the patch width parameter increases from $r=4$ and $6$ to $8$, and as $r$ increases, TEP sees bigger patches as a texture. In (b), with a small $r$, each texton in the square is identified as a separate region, since the texture is understood only at a small scale. In the circle, while the edge of the circle is identified, some texture details also appear within the circle. In image (d), even the big texton is captured by a large $r$, so that all objects clearly show the texture edge boundary. When $r$ is large enough, TEP ignores the fine details within the textured region.

Figure 10: (a) Synthetic texture image with different sizes of texton. For different $r$, (b) $r=4$, (c) $r=6$, (d) $r=8$, TEP finds different scales of texture. (Other parameters are fixed as $\lambda=0.02$, $R=20$.) When $r$ is small, only textures with small textons are identified as one region. In (b), each texton in the square is identified as one region. When $r$ is large, even the large texture is identified as one region, and a clearer texture edge is found.

Figure 11 shows a real example in (a), using $r=1$ for (b) and $r=7$ for (c). In (b), the starfish shape is identified but with many details, since with a small $r$, only very small-scale texture is identified as one region. In (c), with a large $r$, larger textures, e.g., inside the starfish, are identified as one textured region, and only the boundary of the starfish is emphasized. As the patch width parameter increases, TEP focuses on the large-scale structure of the given image, while grouping small details into one region.

(a) (b) (c)
Refer to caption Refer to caption Refer to caption
Figure 11: (a) The given image. The TEP results are shown with (b) $r=1$, (c) $r=7$. The comparison window parameter $R=35$ is fixed for both results.

5.3 Robustness Against Noise and Multiple junctions

We test the robustness of TEP against different levels and types of noise. Figure 12 shows the TEP results for additive Gaussian noise with increasing variance and for increasing salt-and-pepper noise. In the first row (a), Gaussian noise with variance 0.02, 0.04, 0.06, 0.08, and 0.1 is added, from left to right. The second row (b) shows the TEP results with parameters $r=5$, $R=20$, and $\lambda=0.018$ (using normalized patch response). In the third row (c), textured images with salt-and-pepper noise at 10%, 20%, 30%, 40%, and 50% are shown from left to right. In the fourth row (d), TEP results are shown with the same parameters $r=5$, $R=20$, and $\lambda=0.018$ (with normalized patch response). As more noise is added, the edge strength gets weaker in some places. Otherwise, TEP shows robustness against both Gaussian and salt-and-pepper noise. In row (d) of Figure 12, a clear edge is detected with up to 40-50% salt-and-pepper noise.

(a) Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption
(b) Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption
(c) Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption
(d) Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption
Figure 12: First row (a): textured images with additive Gaussian noise of variance 0.02, 0.04, 0.06, 0.08, and 0.1, from left to right (corresponding to SNR = 17.01, 14.55, 13.30, 12.48, and 11.89). Second row (b): TEP edge detection results for (a). Third row (c): textured images with salt-and-pepper noise at 10%, 20%, 30%, 40%, and 50%. Fourth row (d): TEP edge detection results for (c). Parameters $r=5$, $R=20$, and $\lambda=0.018$ (with normalized patch response) are used for both sets of experiments. TEP shows robustness against noise.
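The two noise models used in this experiment can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names, the stripe test image, the clipping to $[0,1]$, and the SNR convention ($10\log_{10}$ of signal energy over noise energy) are all assumptions, so the SNR values it produces are not expected to match those reported in Figure 12.

```python
import math
import random


def add_gaussian_noise(image, variance, rng):
    """Add zero-mean Gaussian noise of the given variance, then clip to [0, 1]."""
    sigma = math.sqrt(variance)
    return [[min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in image]


def add_salt_and_pepper(image, density, rng):
    """Replace roughly a fraction `density` of pixels by 0 (pepper) or 1 (salt)."""
    noisy = []
    for row in image:
        new_row = []
        for p in row:
            if rng.random() < density:
                new_row.append(1.0 if rng.random() < 0.5 else 0.0)
            else:
                new_row.append(p)
        noisy.append(new_row)
    return noisy


def snr_db(clean, noisy):
    """SNR in decibels (assumed convention): 10 log10(signal / noise energy)."""
    signal = sum(p * p for row in clean for p in row)
    noise = sum((q - p) ** 2
                for row_c, row_n in zip(clean, noisy)
                for p, q in zip(row_c, row_n))
    return 10.0 * math.log10(signal / noise)


# Hypothetical test texture: vertical stripes with intensities 0.25 and 0.75.
rng = random.Random(0)
stripes = [[0.25 if (j // 4) % 2 == 0 else 0.75 for j in range(32)]
           for _ in range(32)]

# Gaussian noise at the variances used in the experiment.
gaussian_levels = [0.02, 0.04, 0.06, 0.08, 0.10]
gaussian_snrs = [snr_db(stripes, add_gaussian_noise(stripes, v, rng))
                 for v in gaussian_levels]

# Salt-and-pepper noise at the densities used in the experiment.
salt_pepper_levels = [0.1, 0.2, 0.3, 0.4, 0.5]
changed_fractions = []
for density in salt_pepper_levels:
    noisy = add_salt_and_pepper(stripes, density, rng)
    changed = sum(p != q
                  for row_c, row_n in zip(stripes, noisy)
                  for p, q in zip(row_c, row_n))
    changed_fractions.append(changed / (32 * 32))
```

Each noisy image would then be passed to the edge detector with the fixed parameters quoted above.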

When detecting edges, finding sharp junctions can be challenging: where multiple edges meet at one point, the result can easily become blurry. In Figure 13, we experiment with images containing multiple junctions of various textures. TEP computes the non-binary edge function $V$ by collecting local segmentation results, so that as long as the textures can be separated, multiple junctions can also be identified. This is consistent with equation (19) in Section 3: the differences of average intensities help the proposed method identify the texture boundaries. Figure 13 (b) and (d) show TEP results with clear edges near the junction point. The experimental results show that TEP handles the $N$-junction problem well for $N=4$ and $N=8$. In (d), the edge strength $V$ may not be as high at some points near the junction in the center, and for very accurate edge detection, multiple junctions will pose challenges. One can further improve the sharpness of edge detection with small modifications, which we discuss further in Appendix B.

(a) (b) (c) (d)
Refer to caption Refer to caption Refer to caption Refer to caption
Figure 13: (a) Texture image with a 4-junction. (b) TEP result of (a). (c) Texture image with an 8-junction. (d) TEP result of (c). TEP handles the $N$-junction problem well for $N=4$ and $N=8$.

5.4 Image segmentation using the edge function $V$ and image decomposition

Using the edge function $V$, we can design a color segmentation method based on the chromaticity and brightness model [6]. We separate the given color image $\mathbf{U}_0 : \Omega \to \mathbb{R}^3$ into the brightness $U_b : \Omega \to \mathbb{R}$ and the chromaticity $\mathbf{U}_c : \Omega \to \mathbb{S}^3 = \{\mathbf{x} \in \mathbb{R}^3 \mid \|\mathbf{x}\|_2 = 1\}$:

\[
\mathbf{U}_0 = U_b \cdot \mathbf{U}_c, \quad \text{where } U_b = |\mathbf{U}_0| \text{ and } \mathbf{U}_c = \frac{\mathbf{U}_0}{U_b}.
\]
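The brightness and chromaticity split can be illustrated per pixel. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the treatment of a black pixel (where the chromaticity is undefined) is a choice made here, not something specified in the text.

```python
import math


def split_brightness_chromaticity(pixel, eps=1e-12):
    """Split an RGB triple U0 into brightness |U0| and chromaticity U0 / |U0|."""
    brightness = math.sqrt(sum(c * c for c in pixel))
    if brightness < eps:
        # Chromaticity is undefined for a black pixel; as an assumption of
        # this sketch, return the neutral gray direction.
        gray = 1.0 / math.sqrt(3.0)
        return 0.0, (gray, gray, gray)
    return brightness, tuple(c / brightness for c in pixel)


def recombine(brightness, chromaticity):
    """Rebuild the color as brightness times chromaticity."""
    return tuple(brightness * c for c in chromaticity)


# Example pixel chosen so that the norm is an integer: |(3, 4, 12)| = 13.
brightness, chroma = split_brightness_chromaticity((3.0, 4.0, 12.0))
restored = recombine(brightness, chroma)
```

The chromaticity returned for a nonzero pixel has unit Euclidean norm, so multiplying it by the brightness recovers the original color up to rounding.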

We use the edge function and propose the following segmentation functionals for the brightness and chromaticity components, with $\widetilde{\mathbf{U}} = \widetilde{U}_b \cdot \widetilde{\mathbf{U}}_c$,

\[
\begin{aligned}
\widetilde{U}_b &= \operatorname*{argmin}_{U} \int_{\Omega} \left\{ g_\alpha(V)|\nabla U|^2 + \gamma_1 |U - U_b|^2 \right\} d\mathbf{x}, \\
\widetilde{\mathbf{U}}_c &= \operatorname*{argmin}_{\mathbf{U}} \int_{\Omega} \left\{ g_\alpha(V)|\nabla \mathbf{U}|^2 + \gamma_2 |\mathbf{U} - \mathbf{U}_c|^2 + \beta (1 - |\mathbf{U}|)^2 \right\} d\mathbf{x},
\end{aligned}
\tag{24}
\]

where $g_\alpha(x) : [0,1] \to [0,1]$ is an edge indication function, $g_\alpha(x) = \dfrac{1-x^\alpha}{1+x^\alpha}$, such that $g_\alpha$ is strictly decreasing with $g_\alpha(0)=1$ and $g_\alpha(1)=0$ for $\alpha>0$. In order to utilize the texture edge $V$ effectively, $g_\alpha$ needs to control the smoothness of $U$ inversely to the strength of $V$ over the range $V \in [0,1]$. In practice, since $V$ is generated through consensus, $1-V$ is far from zero at texture edges; we therefore choose $\alpha<1$ to enhance the convexity of $g_\alpha(V)$, which creates a wider region near $V=1$ and thus properly controls the smoothness of $U$.
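To see the effect of $\alpha$ concretely, the edge indication function can be tabulated for two exponents. This is a small sketch; the function name and the sample grid are illustrative choices, and $\alpha=0.2$ is the value quoted later for the experiments.

```python
def edge_indicator(v, alpha):
    """Edge indication function g_alpha(v) = (1 - v**alpha) / (1 + v**alpha)."""
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    if not 0.0 <= v <= 1.0:
        raise ValueError("v must lie in [0, 1]")
    powered = v ** alpha
    return (1.0 - powered) / (1.0 + powered)


# Tabulate g_alpha on a coarse grid of edge strengths for two exponents.
samples = [i / 10 for i in range(11)]
table = {alpha: [edge_indicator(v, alpha) for v in samples]
         for alpha in (0.2, 1.0)}

# Compare the two exponents at a moderate edge strength V = 0.5.
g_small_alpha = table[0.2][5]
g_unit_alpha = table[1.0][5]
```

At $V=0.5$ the value for $\alpha=1$ is $1/3$, while for $\alpha=0.2$ it is considerably smaller, so a moderate consensus value already suppresses diffusion strongly when $\alpha<1$.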

The functionals (24) are minimized by gradient descent on the associated Euler-Lagrange equations, with time evolution

\begin{align*}
\frac{\partial \widetilde{U}_b}{\partial t} &= \operatorname{div}(g \nabla \widetilde{U}_b) + \gamma_1 (\widetilde{U}_b - U_b), \\
\frac{\partial \widetilde{\mathbf{U}}_c}{\partial t} &= \operatorname{div}(g \nabla \widetilde{\mathbf{U}}_c) + \gamma_2 (\widetilde{\mathbf{U}}_c - \mathbf{U}_c) + \beta \left(1 - \frac{1}{|\widetilde{\mathbf{U}}_c|}\right) \widetilde{\mathbf{U}}_c,
\end{align*}

using a finite difference scheme. We only use the brightness of the image to compute $V(\mathbf{x})$. In Figure 14, (a) is the given image, (b) is the two-phase clustering of image (a), (c) shows the segmentation result of (24), and (d) is the two-phase clustering of image (c). Within each region, small-scale details are removed, while the edge is kept very sharp.

(a) (b) (c) (d)
Refer to caption Refer to caption Refer to caption Refer to caption
Figure 14: (a) The given color image. (b) Two-phase clustering of image (a). (c) The proposed segmentation result in (24). (d) Two-phase clustering of image (c). While the interior of each region is diffused, the texture edge is kept very sharp, clearly segmenting the image. Notice how well the texture edge is found using $V$.
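The edge-weighted diffusion of the brightness component can be sketched with an explicit finite-difference step. This is a minimal illustration under stated assumptions, not the authors' scheme: the discretization (face-averaged $g$, zero-flux boundary), the toy image, the hand-made edge map, and $\gamma_1=0.05$ are hypothetical, while $\alpha=0.2$ and $dt=0.1$ follow the values quoted for the experiments. The fidelity term is implemented as $\gamma_1(U_b - U)$, the sign under which an explicit step lowers the energy in (24); this sign convention is an assumption of the sketch.

```python
def edge_indicator(v, alpha=0.2):
    """g_alpha(v) = (1 - v**alpha) / (1 + v**alpha), for v in [0, 1]."""
    powered = v ** alpha
    return (1.0 - powered) / (1.0 + powered)


def diffusion_step(u, u_ref, g, gamma, dt):
    """One explicit step of u_t = div(g grad u) + gamma * (u_ref - u)."""
    rows, cols = len(u), len(u[0])
    new = [row[:] for row in u]
    for i in range(rows):
        for j in range(cols):
            flux = 0.0
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < rows and 0 <= nj < cols:  # zero-flux boundary
                    g_face = 0.5 * (g[i][j] + g[ni][nj])
                    flux += g_face * (u[ni][nj] - u[i][j])
            new[i][j] = u[i][j] + dt * (flux + gamma * (u_ref[i][j] - u[i][j]))
    return new


def region_values(u, columns):
    """Collect the pixel values of all rows restricted to the given columns."""
    return [row[j] for row in u for j in columns]


# Toy image: two regions (means 0.2 and 0.8) carrying a checkerboard texture.
size = 8
u_ref = [[(0.2 if j < 4 else 0.8) + (0.1 if (i + j) % 2 == 0 else -0.1)
          for j in range(size)] for i in range(size)]
# Hand-made edge map: V = 1 on the two columns next to the region boundary.
v = [[1.0 if j in (3, 4) else 0.0 for j in range(size)] for _ in range(size)]
g = [[edge_indicator(x) for x in row] for row in v]

u = [row[:] for row in u_ref]
for _ in range(200):
    u = diffusion_step(u, u_ref, g, gamma=0.05, dt=0.1)

left = region_values(u, range(0, 4))
right = region_values(u, range(4, 8))
left_spread = max(left) - min(left)
mean_jump = sum(right) / len(right) - sum(left) / len(left)
```

Because $g$ vanishes on the boundary columns, nothing diffuses across the edge: the texture inside each region is flattened while the jump between the two regional means is preserved.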

We present more segmentation results in Figure 15. The brightness of the image is used to compute $V(\mathbf{x})$, and we used parameters $r=5$, $R=25$, and $\lambda=400$, with $U \in [0,255]$. For $g_\alpha$, $\alpha=0.2$ is used, with $dt=0.1$ for the evolution. In the first row, the grains are identified as one texture, and some textures on the ground as another. In the second row, branches with leaves and grass regions are identified as different textures. In the third row, grains on the rock are identified as one texture, and they are well separated from the fur of the animal, even though the colors are similar. In the fourth row, the texture within the coral is identified as one texture, and the details of the oscillatory boundary are well identified.

(a) (b) (c)
Refer to caption Refer to caption Refer to caption
Refer to caption Refer to caption Refer to caption
Refer to caption Refer to caption Refer to caption
Refer to caption Refer to caption Refer to caption
Refer to caption Refer to caption Refer to caption
Figure 15: (a) The given image. (b) The edge function $g$ using $V$, which ignores the texture within regions. (c) The segmentation result of (24). Textures in the images are well grouped into regions, e.g., in the top image, grains are identified as one region, and in the third row, grains on the rock are identified as one texture and are well separated from the fur of the animal, even though the colors are similar.

This method can be naturally extended to image decomposition, and in Figure 16, we present the remainder after the segmentation showing the details of the image.

(a) (b) (c)
Refer to caption Refer to caption Refer to caption
Refer to caption Refer to caption Refer to caption
Figure 16: (a) The given image. (b) The segmentation result using (24). There is a natural extension to image decomposition, and (c) shows the remainder, obtained by subtracting image (b) from (a), which represents the details of the image.

In the Appendix, we discuss further details of the proposed method, e.g., the proof of Theorem 1, the behavior of periodic texture, and junction refinement.

6 Concluding Remarks

We proposed a texture edge detection method which utilizes patch-based consensus. We use the local patch and its response to emphasize the similarities and the differences among textures, and segmentation of the patch response helps to locate the edge clearly. On the boundary between textures, local patch information and the patch response are not accurate enough to identify the edge location, so we utilize neighbor consensus to stabilize the process. We statistically analyze when textures can be separated, and derive necessary conditions to distinguish textures. The proposed method has three parameters, to which the results are not very sensitive, and we show how the patch width and the size of the texton are related. This method is robust to different types of noise and can handle multiple junctions. It is a training-free and filter-free approach. We provided numerical details and various experiments which illustrate the properties of the proposed model.

In the future, one may consider refining and thinning the detected edges, which will also improve the image decomposition application in Figure 16. We can also consider a multi-scale approach to further separate object-related edges, e.g., using different scales of patch responses. We may improve the performance of TEP by using a scheme with adaptive patch size, which can handle more complicated real images. Also, different types of kernels can be used instead of the squared Euclidean distance when comparing image patches, in order to enhance the sensitivity to certain types of textures.

References

  • [1] Robert J Adler and Jonathan E Taylor. Random fields and geometry. Springer Science & Business Media, 2009.
  • [2] A. M. Mathai and Serge B. Provost. Quadratic forms in random variables: theory and applications. M. Dekker, 1992.
  • [3] Gedas Bertasius, Jianbo Shi, and Lorenzo Torresani. Deepedge: A multi-scale bifurcated deep network for top-down contour detection. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 4380–4389, 2015.
  • [4] Antoni Buades, Bartomeu Coll, and Jean-Michel Morel. Non-Local Means Denoising. Image Processing On Line, 1:208–212, 2011.
  • [5] John Canny. A computational approach to edge detection. IEEE Transactions on pattern analysis and machine intelligence, PAMI-8(6):679–698, 1986.
  • [6] Tony F Chan, Sung Ha Kang, and Jianhong Shen. Total variation denoising and enhancement of color images based on the cb and hsv color models. Journal of Visual Communication and Image Representation, 12(4):422–435, 2001.
  • [7] Tony F Chan and Luminita A Vese. Active contours without edges. IEEE Transactions on image processing, 10(2):266–277, 2001.
  • [8] Robert B Davies. Algorithm AS 155: The distribution of a linear combination of $\chi^2$ random variables. Applied Statistics, pages 323–333, 1980.
  • [9] Alexei A Efros and Thomas K Leung. Texture synthesis by non-parametric sampling. In Proceedings of the seventh IEEE international conference on computer vision, volume 2, pages 1033–1038. IEEE, 1999.
  • [10] Jianzhong He, Shiliang Zhang, Ming Yang, Yanhu Shan, and Tiejun Huang. Bi-directional cascade network for perceptual edge detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3828–3837, 2019.
  • [11] Ernst Hellinger. Neue begründung der theorie quadratischer formen von unendlichvielen veränderlichen. Journal für die reine und angewandte Mathematik, 1909(136):210–271, 1909.
  • [12] Byung-Woo Hong, Stefano Soatto, Kangyu Ni, and Tony Chan. The scale of a texture and its application to segmentation. In 2008 IEEE Conference on Computer Vision and Pattern Recognition, pages 1–8. IEEE, 2008.
  • [13] Dana E Ilea and Paul F Whelan. Image segmentation based on the integration of colour–texture descriptors—a review. Pattern Recognition, 44(10-11):2479–2501, 2011.
  • [14] Anil K Jain and Farshid Farrokhnia. Unsupervised texture segmentation using gabor filters. Pattern recognition, 24(12):1167–1186, 1991.
  • [15] Peter W Jones and Triet M Le. Local scales and multiscale image decompositions. Applied and Computational Harmonic Analysis, 26(3):371–394, 2009.
  • [16] Samah Khawaled and Yehoshua Y Zeevi. On the self-similarity of natural stochastic textures. arXiv preprint arXiv:1906.06768, 2019.
  • [17] Adrian S Lewis and G Knowles. Image compression using the 2-d wavelet transform. IEEE Transactions on image Processing, 1(2):244–250, 1992.
  • [18] Yanxi Liu, Robert T Collins, and Yanghai Tsin. A computational model for periodic pattern perception based on frieze and wallpaper groups. IEEE transactions on pattern analysis and machine intelligence, 26(3):354–371, 2004.
  • [19] Yun Liu, Ming-Ming Cheng, Xiaowei Hu, Kai Wang, and Xiang Bai. Richer convolutional features for edge detection. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3000–3009, 2017.
  • [20] S Livens, P Scheunders, G Van de Wouwer, and D Van Dyck. Wavelets for texture analysis, an overview. 1997.
  • [21] D. Martin, C. Fowlkes, D. Tal, and J. Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proc. 8th Int’l Conf. Computer Vision, volume 2, pages 416–423, July 2001.
  • [22] Rajiv Mehrotra, Kameswara Rao Namuduri, and Nagarajan Ranganathan. Gabor filter-based edge detection. Pattern recognition, 25(12):1479–1494, 1992.
  • [23] David Bryant Mumford and Jayant Shah. Optimal approximations by piecewise smooth functions and associated variational problems. Communications on pure and applied mathematics, 1989.
  • [24] Mengyang Pu, Yaping Huang, Yuming Liu, Qingji Guan, and Haibin Ling. Edter: Edge detection with transformer. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 1402–1412, 2022.
  • [25] Lara Raad, Axel Davy, Agnès Desolneux, and Jean-Michel Morel. A survey of exemplar-based texture synthesis. Annals of Mathematical Sciences and Applications, 3(1):89–148, 2018.
  • [26] Berta Sandberg, Sung Ha Kang, and Tony F Chan. Unsupervised multiphase segmentation: A phase balancing model. IEEE transactions on image processing, 19(1):119–130, 2009.
  • [27] George AF Seber and Alan J Lee. Linear regression analysis, volume 330. John Wiley & Sons, 2003.
  • [28] Jean Serra and Luc Vincent. An overview of morphological filtering. Circuits, Systems and Signal Processing, 11:47–108, 1992.
  • [29] Xavier Soria, Edgar Riba, and Angel Sappa. Dense extreme inception network: Towards a robust cnn model for edge detection. In 2020 IEEE Winter Conference on Applications of Computer Vision (WACV), pages 1912–1921, 2020.
  • [30] Mihran Tuceryan and Anil K Jain. Texture analysis. Handbook of pattern recognition and computer vision, pages 235–276, 1993.
  • [31] Michael Unser. Texture classification and segmentation using wavelet frames. IEEE Transactions on image processing, 4(11):1549–1560, 1995.
  • [32] Luc Van Gool, Piet Dewaele, and André Oosterlinck. Texture analysis anno 1983. Computer vision, graphics, and image processing, 29(3):336–357, 1985.
  • [33] Li Wang and Dong-Chen He. Texture classification using texture spectrum. Pattern recognition, 23(8):905–910, 1990.
  • [34] Saining Xie and Zhuowen Tu. Holistically-nested edge detection. In Proceedings of IEEE International Conference on Computer Vision, 2015.
  • [35] Ruotao Xu, Yong Xu, and Yuhui Quan. Structure-texture image decomposition using discriminative patch recurrence. IEEE Transactions on Image Processing, 30:1542–1555, 2020.
  • [36] Ido Zachevsky and Yehoshua Y Josh Zeevi. Statistics of natural stochastic textures and their application in image denoising. IEEE Transactions on Image Processing, 25(5):2130–2145, 2016.

Appendix A Proof of Theorem 1.

Proof.

Note that (𝐲;𝐱)𝐲𝐱\mathcal{R}(\mathbf{y};\mathbf{x})caligraphic_R ( bold_y ; bold_x ) is a quadratic function of two random vectors 𝒫(𝐱)𝒫𝐱\vec{\mathcal{P}}(\mathbf{x})over→ start_ARG caligraphic_P end_ARG ( bold_x ) and 𝒫(𝐲)𝒫𝐲\vec{\mathcal{P}}(\mathbf{y})over→ start_ARG caligraphic_P end_ARG ( bold_y ), the expectation needs to be computed with double integral

𝔼((𝐲;𝐱))=𝔼𝐱(𝔼𝐲|𝐱(𝐲;𝐱)).𝔼𝐲𝐱subscript𝔼𝐱subscript𝔼conditional𝐲𝐱𝐲𝐱\mathbb{E}\left(\mathcal{R}(\mathbf{y};\mathbf{x})\right)=\mathbb{E}_{\mathbf{% x}}\left(\mathbb{E}_{\mathbf{y}|\mathbf{x}}\mathcal{R}(\mathbf{y};\mathbf{x})% \right).blackboard_E ( caligraphic_R ( bold_y ; bold_x ) ) = blackboard_E start_POSTSUBSCRIPT bold_x end_POSTSUBSCRIPT ( blackboard_E start_POSTSUBSCRIPT bold_y | bold_x end_POSTSUBSCRIPT caligraphic_R ( bold_y ; bold_x ) ) .

To handle the conditional expectation 𝔼𝐲|𝐱()subscript𝔼conditional𝐲𝐱\mathbb{E}_{\mathbf{y}|\mathbf{x}}(\cdot)blackboard_E start_POSTSUBSCRIPT bold_y | bold_x end_POSTSUBSCRIPT ( ⋅ ), one needs the conditional distribution of 𝒫(𝐲)𝒫𝐲\vec{\mathcal{P}}(\mathbf{y})over→ start_ARG caligraphic_P end_ARG ( bold_y ), which is Gaussian with mean (15) and variance (16). Then

𝔼𝐲|𝐱(𝒫(𝐲)𝒫(𝐱)22)subscript𝔼conditional𝐲𝐱superscriptsubscriptnorm𝒫𝐲𝒫𝐱22\displaystyle\mathbb{E}_{\mathbf{y}|\mathbf{x}}\left(\|\vec{\mathcal{P}}(% \mathbf{y})-\vec{\mathcal{P}}(\mathbf{x})\|_{2}^{2}\right)blackboard_E start_POSTSUBSCRIPT bold_y | bold_x end_POSTSUBSCRIPT ( ∥ over→ start_ARG caligraphic_P end_ARG ( bold_y ) - over→ start_ARG caligraphic_P end_ARG ( bold_x ) ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) =𝔼𝐲|𝐱(𝒫T(𝐲)𝒫(𝐲)2𝒫T(𝐲)𝒫(𝐱)+𝒫T(𝐱)𝒫(𝐱)).absentsubscript𝔼conditional𝐲𝐱superscript𝒫𝑇𝐲𝒫𝐲2superscript𝒫𝑇𝐲𝒫𝐱superscript𝒫𝑇𝐱𝒫𝐱\displaystyle=\mathbb{E}_{\mathbf{y}|\mathbf{x}}\left(\vec{\mathcal{P}}^{T}(% \mathbf{y})\vec{\mathcal{P}}(\mathbf{y})-2\vec{\mathcal{P}}^{T}(\mathbf{y})% \vec{\mathcal{P}}(\mathbf{x})+\vec{\mathcal{P}}^{T}(\mathbf{x})\vec{\mathcal{P% }}(\mathbf{x})\right).= blackboard_E start_POSTSUBSCRIPT bold_y | bold_x end_POSTSUBSCRIPT ( over→ start_ARG caligraphic_P end_ARG start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ( bold_y ) over→ start_ARG caligraphic_P end_ARG ( bold_y ) - 2 over→ start_ARG caligraphic_P end_ARG start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ( bold_y ) over→ start_ARG caligraphic_P end_ARG ( bold_x ) + over→ start_ARG caligraphic_P end_ARG start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ( bold_x ) over→ start_ARG caligraphic_P end_ARG ( bold_x ) ) . (25)
=tr(Σp(𝐲;𝐱))+μpT(𝐲;𝐱)μp(𝐲;𝐱)2μpT(𝐲;𝐱)P(𝐱)+𝒫T(𝐱)𝒫(𝐱)absenttrsubscriptΣ𝑝𝐲𝐱superscriptsubscript𝜇𝑝𝑇𝐲𝐱subscript𝜇𝑝𝐲𝐱2superscriptsubscript𝜇𝑝𝑇𝐲𝐱𝑃𝐱superscript𝒫𝑇𝐱𝒫𝐱\displaystyle=\mathrm{tr}\left(\Sigma_{p}(\mathbf{y;x})\right)+\vec{\mu}_{p}^{% T}(\mathbf{y};\mathbf{x})\vec{\mu}_{p}(\mathbf{y};\mathbf{x})-2\vec{\mu}_{p}^{% T}(\mathbf{y};\mathbf{x})\vec{P}(\mathbf{x})+\vec{\mathcal{P}}^{T}(\mathbf{x})% \vec{\mathcal{P}}(\mathbf{x})= roman_tr ( roman_Σ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ( bold_y ; bold_x ) ) + over→ start_ARG italic_μ end_ARG start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ( bold_y ; bold_x ) over→ start_ARG italic_μ end_ARG start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ( bold_y ; bold_x ) - 2 over→ start_ARG italic_μ end_ARG start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ( bold_y ; bold_x ) over→ start_ARG italic_P end_ARG ( bold_x ) + over→ start_ARG caligraphic_P end_ARG start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ( bold_x ) over→ start_ARG caligraphic_P end_ARG ( bold_x ) (26)

where Lemma 1 is applied to the first term of the right hand side of (25). To compute the expectation of (26) with respect to 𝒫(𝐱)𝒫𝐱\vec{\mathcal{P}}(\mathbf{x})over→ start_ARG caligraphic_P end_ARG ( bold_x ), we have the following identities:

tr(Σp(𝐲;𝐱))trsubscriptΣ𝑝𝐲𝐱\displaystyle\mathrm{tr}\left(\Sigma_{p}(\mathbf{y};\mathbf{x})\right)roman_tr ( roman_Σ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ( bold_y ; bold_x ) ) =tr(Σp)tr(Σp1ΣcT(τ)Σc(τ))absenttrsubscriptΣ𝑝trsuperscriptsubscriptΣ𝑝1superscriptsubscriptΣc𝑇𝜏subscriptΣc𝜏\displaystyle=\mathrm{tr}\left(\Sigma_{p}\right)-\mathrm{tr}\left(\Sigma_{p}^{% -1}\Sigma_{\mathrm{c}}^{T}(\tau)\Sigma_{\mathrm{c}}(\tau)\right)= roman_tr ( roman_Σ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ) - roman_tr ( roman_Σ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT roman_Σ start_POSTSUBSCRIPT roman_c end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ( italic_τ ) roman_Σ start_POSTSUBSCRIPT roman_c end_POSTSUBSCRIPT ( italic_τ ) )
𝔼𝐱(μpT(𝐲;𝐱)μp(𝐲;𝐱))subscript𝔼𝐱superscriptsubscript𝜇𝑝𝑇𝐲𝐱subscript𝜇𝑝𝐲𝐱\displaystyle\mathbb{E}_{\mathbf{x}}\left(\vec{\mu}_{p}^{T}(\mathbf{y};\mathbf% {x})\vec{\mu}_{p}(\mathbf{y};\mathbf{x})\right)blackboard_E start_POSTSUBSCRIPT bold_x end_POSTSUBSCRIPT ( over→ start_ARG italic_μ end_ARG start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ( bold_y ; bold_x ) over→ start_ARG italic_μ end_ARG start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ( bold_y ; bold_x ) ) =μpTμp+tr(Σp1ΣcT(τ)Σc(τ))absentsuperscriptsubscript𝜇𝑝𝑇subscript𝜇𝑝trsuperscriptsubscriptΣ𝑝1superscriptsubscriptΣc𝑇𝜏subscriptΣc𝜏\displaystyle=\vec{\mu}_{p}^{T}\vec{\mu}_{p}+\mathrm{tr}\left(\Sigma_{p}^{-1}% \Sigma_{\mathrm{c}}^{T}(\tau)\Sigma_{\mathrm{c}}(\tau)\right)= over→ start_ARG italic_μ end_ARG start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT over→ start_ARG italic_μ end_ARG start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT + roman_tr ( roman_Σ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT roman_Σ start_POSTSUBSCRIPT roman_c end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ( italic_τ ) roman_Σ start_POSTSUBSCRIPT roman_c end_POSTSUBSCRIPT ( italic_τ ) )
𝔼𝐱(μpT(𝐲;𝐱)𝒫(𝐱))subscript𝔼𝐱superscriptsubscript𝜇𝑝𝑇𝐲𝐱𝒫𝐱\displaystyle\mathbb{E}_{\mathbf{x}}\left(\vec{\mu}_{p}^{T}(\mathbf{y};\mathbf% {x})\vec{\mathcal{P}}(\mathbf{x})\right)blackboard_E start_POSTSUBSCRIPT bold_x end_POSTSUBSCRIPT ( over→ start_ARG italic_μ end_ARG start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ( bold_y ; bold_x ) over→ start_ARG caligraphic_P end_ARG ( bold_x ) ) =μpTμp+tr(Σc(τ))absentsuperscriptsubscript𝜇𝑝𝑇subscript𝜇𝑝trsubscriptΣc𝜏\displaystyle=\vec{\mu}_{p}^{T}\vec{\mu}_{p}+\mathrm{tr}\left(\Sigma_{\mathrm{% c}}(\tau)\right)= over→ start_ARG italic_μ end_ARG start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT over→ start_ARG italic_μ end_ARG start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT + roman_tr ( roman_Σ start_POSTSUBSCRIPT roman_c end_POSTSUBSCRIPT ( italic_τ ) )
𝔼𝐱(𝒫T(𝐱)𝒫(𝐱))subscript𝔼𝐱superscript𝒫𝑇𝐱𝒫𝐱\displaystyle\mathbb{E}_{\mathbf{x}}\left(\vec{\mathcal{P}}^{T}(\mathbf{x})% \vec{\mathcal{P}}(\mathbf{x})\right)blackboard_E start_POSTSUBSCRIPT bold_x end_POSTSUBSCRIPT ( over→ start_ARG caligraphic_P end_ARG start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ( bold_x ) over→ start_ARG caligraphic_P end_ARG ( bold_x ) ) =μpTμp+tr(Σp).absentsuperscriptsubscript𝜇𝑝𝑇subscript𝜇𝑝trsubscriptΣ𝑝\displaystyle=\vec{\mu}_{p}^{T}\vec{\mu}_{p}+\mathrm{tr}(\Sigma_{p}).= over→ start_ARG italic_μ end_ARG start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT over→ start_ARG italic_μ end_ARG start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT + roman_tr ( roman_Σ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ) .

With substitution, the expectation of (𝐲;𝐱)𝐲𝐱\mathcal{R}(\mathbf{y};\mathbf{x})caligraphic_R ( bold_y ; bold_x ) is then given as

\[
\begin{aligned}
\mathbb{E}_{\mathbf{x}}\left(\mathbb{E}_{\mathbf{y}|\mathbf{x}}\mathcal{R}(\mathbf{y};\mathbf{x})\right) &= \frac{1}{d}\,\mathbb{E}_{\mathbf{x}}\left(\mathbb{E}_{\mathbf{y}|\mathbf{x}}\left(\|\vec{\mathcal{P}}(\mathbf{y})-\vec{\mathcal{P}}(\mathbf{x})\|_{2}^{2}\right)\right) \\
&= \frac{1}{d}\left(2\,\mathrm{tr}\left(\Sigma_{p}\right)-2\,\mathrm{tr}\left(\Sigma_{c}(\tau)\right)\right)=2\sigma_{p}^{2}\left(1-\exp\left(-\frac{\tau^{2}}{2l_{p}^{2}}\right)\right).
\end{aligned}
\]
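The closed form above can be checked numerically. The following is a minimal sketch, not the authors' code, under simplifying assumptions: the patch entries are taken to be zero-mean Gaussian and independent within a patch, with variance $\sigma_p^2$, and corresponding entries of the two patches have correlation $\exp(-\tau^2/(2l_p^2))$, so that $\mathrm{tr}(\Sigma_p)=d\sigma_p^2$ and $\mathrm{tr}(\Sigma_c(\tau))=d\sigma_p^2\exp(-\tau^2/(2l_p^2))$. The function names and default values are hypothetical.

```python
import numpy as np


def expected_response(tau, sigma_p, l_p):
    """Closed form 2 * sigma_p^2 * (1 - exp(-tau^2 / (2 * l_p^2)))."""
    return 2.0 * sigma_p**2 * (1.0 - np.exp(-tau**2 / (2.0 * l_p**2)))


def simulated_response(tau, sigma_p, l_p, d=16, n_samples=50_000, seed=0):
    """Monte Carlo estimate of (1/d) * E ||P(y) - P(x)||_2^2.

    Assumption (illustrative only): entries within a patch are independent,
    and entry k of P(y) has correlation rho with entry k of P(x).
    """
    rng = np.random.default_rng(seed)
    rho = np.exp(-tau**2 / (2.0 * l_p**2))
    px = sigma_p * rng.standard_normal((n_samples, d))
    noise = sigma_p * rng.standard_normal((n_samples, d))
    # py has variance sigma_p^2 and covariance rho * sigma_p^2 with px.
    py = rho * px + np.sqrt(1.0 - rho**2) * noise
    return float(np.mean(np.sum((py - px) ** 2, axis=1)) / d)
```

For example, comparing `simulated_response(1.5, 0.7, 2.0)` with `expected_response(1.5, 0.7, 2.0)` should give values that agree up to sampling error. The mean $\vec{\mu}_p$ is omitted in the sketch because it cancels in the difference $\vec{\mathcal{P}}(\mathbf{y})-\vec{\mathcal{P}}(\mathbf{x})$.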

Appendix B Junction edge refinement

For edges near a junction point, the strength of $V$ may be weaker than along straight edges, as seen in Figure 13. This is because a local patch at a location $\mathbf{x}$ near a junction point may observe additional textures that lie slightly beyond the two textures immediately adjacent to the edge near $\mathbf{x}$, which confuses the segmentation. Figure 17 (b) shows this effect. This can be improved by a simple dilation-erosion operation from mathematical morphology [28]. Using a line-shaped structuring element whose orientation is parallel to the existing edge direction, the edge function $V$ is improved as in Figure 17 (c) and (d).

Figure 17: (a) The given image with a 4-junction. (b) The edge function $V$. (c) Improved edge function $V$ using straight-line direction enhancement. (d) TEP result with the improved $V$.
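The refinement step can be illustrated with a short sketch. It is an interpretation rather than the authors' implementation: the dilation-erosion operation is read here as a grayscale morphological closing (dilation followed by erosion), the structuring element is a discretized line of hypothetical length `length` and orientation `angle_deg`, and the edge direction is assumed to be known in advance. All function names are illustrative.

```python
import numpy as np


def line_offsets(length, angle_deg):
    """Integer (row, col) offsets of a centered line-shaped structuring element."""
    half = length // 2
    t = np.arange(-half, half + 1)
    theta = np.deg2rad(angle_deg)
    rows = np.rint(-t * np.sin(theta)).astype(int)
    cols = np.rint(t * np.cos(theta)).astype(int)
    return sorted(set(zip(rows.tolist(), cols.tolist())))


def _shifted_stack(img, offsets, pad_value):
    """Stack copies of img shifted by each offset; pixels outside img get pad_value."""
    pad = max(max(abs(r), abs(c)) for r, c in offsets)
    padded = np.pad(img, pad, mode="constant", constant_values=pad_value)
    h, w = img.shape
    return np.stack(
        [padded[pad + r: pad + r + h, pad + c: pad + c + w] for r, c in offsets]
    )


def dilate(img, offsets):
    """Grayscale dilation: maximum over the structuring element."""
    return _shifted_stack(img, offsets, -np.inf).max(axis=0)


def erode(img, offsets):
    """Grayscale erosion: minimum over the structuring element."""
    return _shifted_stack(img, offsets, np.inf).min(axis=0)


def close_along_line(V, length=7, angle_deg=0.0):
    """Closing of the edge function V with a line parallel to the edge direction."""
    V = np.asarray(V, dtype=float)
    offsets = line_offsets(length, angle_deg)
    return erode(dilate(V, offsets), offsets)
```

With a horizontal edge that has a short weak segment, `close_along_line(V, length=7, angle_deg=0.0)` restores the weak segment to the strength of its neighbors along the edge, while a line perpendicular to the edge (`angle_deg=90.0`) leaves it unchanged. This is why the orientation of the structuring element should follow the existing edge direction.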

Appendix C Periodic texture and its patch response

Extending the analysis in Section 3, we present numerical results for periodic textures. For a periodic texture, it is interesting to notice that the variance of the expectation $\mathbb{E}_{\mathbf{y}|\mathbf{x}}\left(\mathcal{R}(\mathbf{y};\mathbf{x})\right)$ is also strongly correlated with the period of the texture. In Figure 18, we show the distribution of $\mathbb{E}_{\mathbf{y}|\mathbf{x}}\left(\mathcal{R}(\mathbf{y};\mathbf{x})\right)$ with varying patch width parameter $r$. The variance is near zero whenever the patch width parameter $r$ matches the period of the texture, while the general decreasing effect discussed in Section 3 still exists. This is consistent with the work by Hong et al. [12], where the authors observed that, for a periodic image, a statistical distance measurement between an image patch and the entire image vanishes whenever the patch width parameter is a multiple of the texture period. In another work [15], the authors measured the scale of the texture by applying a time-varying Gaussian kernel to the image and observing where the averaging process exhibits a large jump.

In this paper, we choose a relatively small $r$ while keeping the patch response consistent and stable. For periodic textures, these estimations can help find the scale of the texture.

Figure 18: (a) Given synthetic periodic texture. (b) Estimation of the distribution of $\mathbb{E}_{\mathbf{y}|\mathbf{x}}\left(\mathcal{R}(\mathbf{y};\mathbf{x})\right)$. The black horizontal line in (b) shows the empirical mean, and the blue horizontal line shows the theoretical mean. The dark red region shows the location of the distribution median. The vertical lines indicate multiples of the texture period. Notice that there is a sharp decrease of the variance when the patch width parameter $r$ hits the period.
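The vanishing variance at the texture period can be reproduced in a simplified one-dimensional setting. The sketch below is not the experiment of Figure 18 and makes several assumptions: the texture is a 1D sawtooth signal treated as circular, `r` denotes the full patch length in samples (so $d=r$, which may differ from the paper's patch width parameterization), and the expectation over $\mathbf{y}$ is taken uniformly over all locations. All names are hypothetical.

```python
import numpy as np


def sawtooth_texture(period=8, repeats=8):
    """Synthetic 1D periodic texture: one sawtooth ramp per period."""
    return np.tile(np.arange(period) / period, repeats)


def circular_patches(signal, r):
    """Row x holds the patch signal[x], ..., signal[x + r - 1] with wrap-around."""
    n = signal.size
    idx = (np.arange(n)[:, None] + np.arange(r)[None, :]) % n
    return signal[idx]


def expected_response_per_location(signal, r):
    """E_{y|x} R(y; x) for every x, with R(y; x) = (1/r) * ||P(y) - P(x)||_2^2."""
    P = circular_patches(signal, r)
    diff = P[:, None, :] - P[None, :, :]   # diff[y, x, k] = P(y)_k - P(x)_k
    R = np.mean(diff**2, axis=2)           # R[y, x]
    return R.mean(axis=0)                  # average over all y


def response_variance(signal, r):
    """Variance over x of the expected patch response."""
    return float(np.var(expected_response_per_location(signal, r)))
```

In this setting, `response_variance(sawtooth_texture(8, 8), r)` is zero up to floating-point error when `r` is 8 or 16, because a patch covering whole periods sees the same set of values at every location, and it is clearly positive for `r = 4`. Scanning `r` and looking for such drops is one way the estimate could help find the scale of a periodic texture.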