PhoTOS: Topology Optimization of Photonic Components using a Shape Library

   Rahul Kumar Padhy
Department of Mechanical Engineering
University of Wisconsin-Madison
Madison, WI, USA
[email protected]
   Aaditya Chandrasekhar
Department of Mechanical Engineering
Northwestern University
Evanston, IL, USA
[email protected]
Abstract

Topology Optimization (TO) holds the promise of designing next-generation compact and efficient photonic components. However, ensuring the optimized designs comply with fabrication constraints imposed by semiconductor foundries remains a challenge. This work presents a TO framework that guarantees designs satisfy fabrication criteria, particularly minimum feature size and separation. Leveraging recent advancements in machine learning and feature mapping methods, our approach constructs components by transforming shapes from a predefined library, simplifying constraint enforcement. Specifically, we introduce a Convo-implicit Variational Autoencoder to encode the discrete shape library into a differentiable space, enabling gradient-based optimization. The efficacy of our framework is demonstrated through the design of several common photonic components.

Keywords: Topology Optimization · Photonics · Fabrication Constraints · Machine Learning

1 Introduction

Figure 1: Graphical abstract: Given a predefined library of shapes, a Convo-implicit Variational Autoencoder (VAE) is trained to encode them in a differentiable latent space. Shape instances are then selected from this latent space and subjected to rotation, scaling, and translation to populate the design space, yielding an optimized photonic component.

Topology optimization (TO) is a class of methods used to optimize material distribution within a design domain, achieving optimal performance under given constraints. These methods efficiently navigate complex design spaces that challenge conventional, intuition-based approaches. While traditionally applied in structural mechanics, TO has seen increasing adoption in photonics [1], where advances in integrated photonic device design are critical for progress in high-speed communication [2], quantum computing [3], and machine learning accelerators [4].

While various TO methods have been proposed, density-based approaches have been widely adopted [5]. Here, the design is parameterized by a set of pixels, and the optimal material is assigned to each pixel [6, 7]. This results in organic, free-form designs. Consider, for example, the design of a waveguide bend (fig. 2(a)), where light from the input port (right) is to be directed to the output port (bottom) with minimal back reflection and maximum transmission. Fig. 2(b) illustrates a typical density-based design. While offering significant design freedom, the resulting designs can be difficult to interpret [8], manufacture, and constrain to fabrication criteria [9, 7, 10].

Conversely, alternative techniques have been proposed that represent the design using simpler geometric shapes such as bars [11], plates [12], and polygons [13]. These methods, collectively referred to as feature mapping methods [14], parameterize shapes by high-level descriptors such as width, radius, and angle. Varying these parameters generates shapes of various dimensions that are projected onto the design domain. For instance, fig. 2(c) showcases the waveguide bend composed of triangles. As evident, these techniques yield more interpretable designs that are easier to constrain. However, while simplifying the enforcement of manufacturing constraints, they significantly limit design freedom [8].

This work proposes an optimization framework that strikes a balance between density-based and feature-mapping methods. Our approach offers increased design freedom [15] while retaining the interpretability and ease of constraint enforcement characteristic of feature-mapping techniques. Specifically, we make the following contributions:

  1. We extend feature-mapping methods, traditionally limited to a single shape, to accommodate multiple shapes (section 2.2).

  2. We introduce a Convo-implicit Variational Autoencoder (VAE) [16] to transform a discrete library of shapes into a continuous and differentiable form, enabling gradient-based optimization (section 2.2.2).

  3. We present an optimization framework that produces fabricable, optimized photonic components by selecting and transforming shapes from the library (section 2.4).

  4. We apply our framework to design waveguide bends and mode converters, demonstrating its ability to reliably produce photonic components that meet both performance and fabrication criteria (section 3).

Figure 2: (a) Photonic device design domain and boundary conditions. (b) Density-based topology optimization. (c) Topology optimization using single shape-based feature mapping. (d) Topology optimization using multiple generic shapes.

2 Proposed Method

2.1 Overview

In this study, we focus on the TO of photonic components. We begin by assuming that the performance criteria, typically the allowable insertion, reflection, and cross-talk losses, have been prescribed. Further, we assume that the fabrication criteria, namely the minimum feature size (MFS) and minimum separation distance (MSD) (fig. 5), have also been specified. Finally, we assume that a library of $n_L$ shapes has been prescribed (fig. 3). The objective then is to find, through gradient-based optimization, an optimal configuration (selection, translation, orientation, and scaling; see fig. 4) of shape instances from the library onto the design space such that the optimized design meets the required performance and fabrication criteria.

2.2 Shape Library

Figure 3: A library of $n_L\;(=15)$ prescribed shapes.

We compose our library (fig. 3) of generic shapes frequently employed in photomask fabrication [17] and other common shapes. The shapes are chosen to meet an MFS (fig. 5(a)) of 40 nm [18]. Importantly, we observe that these shapes are not derived from a single parametric family; rather, the library includes a variety of discrete shapes without common defining parameters.

Recall that gradient-based optimization relies on the ability to continuously vary its optimization parameters. In other words, the optimizer must be able to continuously vary between the shapes in the library. Thus, to facilitate the optimal choice of shapes by the optimizer, we transform our discrete shape library into a continuous and differentiable representation. In particular, a continuous representation of the shape library entails two key components:

  (a) a continuous representation of each individual shape within the library (section 2.2.1).

  (b) a continuous representation of the collective library itself (section 2.2.2).

Figure 4: Shape instances selected from the library are translated, oriented, and scaled onto the design domain.
Figure 5: To ensure fabricability, we impose (a) a minimum feature size constraint and (b) a minimum separation distance constraint.

2.2.1 Shape Representation

Figure 6: A shape from the library and its signed distance field (SDF). The $0^{th}$ contour of the SDF is shown in dotted lines.

To achieve continuous shape representation, we represent our shapes using signed distance fields (SDFs). SDFs ($\phi$) are $C^0$ continuous functions, with the value at any point defined as the shortest distance to the shape’s boundary (fig. 6). We compute the SDFs for the shapes in our library over the bounding box ($[\underline{\bm{x}}_b, \overline{\bm{x}}_b]$). In addition to their continuous nature, the implicit definition of distances from the boundary facilitates the computation of the MSD constraint (see fig. 5(b), section 2.4).
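As a small illustration of sampling an SDF over a bounding box, the sketch below evaluates the analytic SDF of a disk on a regular grid. The disk is only a stand-in for a library shape, and the helper names, grid resolution, and sign convention (positive inside, negative outside) are assumptions made for this example rather than details taken from the text.

```python
import numpy as np

def disk_sdf(x, y, radius):
    """Signed distance to a disk centred at the origin.

    Assumed sign convention: positive inside, negative outside.
    """
    return radius - np.sqrt(x**2 + y**2)

def sample_sdf(n_x, half_width, radius):
    """Sample the SDF on an n_x-by-n_x grid over [-half_width, half_width]^2."""
    ticks = np.linspace(-half_width, half_width, n_x)
    # xx[i, j] = ticks[j] and yy[i, j] = ticks[i] with "xy" indexing
    xx, yy = np.meshgrid(ticks, ticks, indexing="xy")
    return disk_sdf(xx, yy, radius)

# Hypothetical values: a 65 x 65 image of a disk of radius 0.5
phi = sample_sdf(n_x=65, half_width=1.0, radius=0.5)
```

The zero contour of `phi` traces the disk boundary, which is the role played by the dotted contour in fig. 6.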

2.2.2 Library Representation

To achieve continuous library representation, we employ a VAE. VAEs are a class of neural networks that, among other attributes, convert discrete data into a continuous and differentiable representation. This facilitates gradient-based optimization [19, 20, 21].

Specifically, the shape SDFs from the previous section serve as both the input and output data for the VAE. We propose a VAE architecture with the following key attributes:

  (a) A continuous and differentiable latent space that represents the shape library, facilitating gradient-based optimization.

  (b) An implicit representation of the output shape SDFs as a function of spatial coordinates. This enables the querying of SDF values at arbitrary locations and resolutions, a crucial requirement for our optimization formulation.

To achieve these attributes, we propose a Convo-implicit VAE architecture. This architecture employs a convolutional encoder in conjunction with an implicit encoder [16, 22, 23] and a decoder. The proposed Convo-implicit VAE architecture (fig. 7) comprises the following components:

  1. First, the shape SDFs ($\mathcal{I}=\{\phi^{(1)},\phi^{(2)},\ldots,\phi^{(L)}\}$) of size $\{n_L \times n_x \times n_x\}$ are propagated through the convolutional encoder. Here, $n_x$ denotes the resolution of the SDF images in the training set. This results in an output latent space $z$ of size $\{n_L \times n_z\}$. In particular, we have $n_z = 2$.

  2. The coordinates of the SDFs ($\bm{x}$) of size $\{n_p \times 2\}$ are propagated through an implicit encoder [22]. Here, $n_p = n_x^2$ is the total number of pixels in the training images. This results in an output of projected coordinates $C$ of size $\{n_p \times n_c\}$.

  3. The latent space $z$ and the projected coordinates $C$ are concatenated. This combined tensor of size $\{n_L \times n_p \times (n_z + n_c)\}$ is propagated through the decoder, producing the reconstructed feature SDFs ($\hat{\mathcal{I}}=\{\hat{\phi}^{(1)},\hat{\phi}^{(2)},\ldots,\hat{\phi}^{(L)}\}$) of size $\{n_L \times n_p\}$.
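The tensor sizes in the three steps above can be traced with a short shape-bookkeeping sketch. Random arrays stand in for the outputs of the two encoders and a single linear map stands in for the decoder; the values of $n_x$ and $n_c$ and the pairing of latent codes with coordinates by broadcasting are assumptions made for illustration, not the authors' network.

```python
import numpy as np

n_L, n_z = 15, 2            # library size and latent dimension from the text
n_x, n_c = 32, 8            # assumed image resolution and projection width
n_p = n_x * n_x             # number of pixels per SDF image

rng = np.random.default_rng(0)
z = rng.normal(size=(n_L, n_z))   # stand-in for the convolutional encoder output
C = rng.normal(size=(n_p, n_c))   # stand-in for the implicit encoder output

# Pair every shape code with every projected coordinate, then concatenate.
z_tiled = np.broadcast_to(z[:, None, :], (n_L, n_p, n_z))
C_tiled = np.broadcast_to(C[None, :, :], (n_L, n_p, n_c))
decoder_input = np.concatenate([z_tiled, C_tiled], axis=-1)

# Stand-in decoder: one linear map from (n_z + n_c) features to one SDF value.
W = rng.normal(size=(n_z + n_c,))
phi_hat = decoder_input @ W       # reconstructed SDFs, shape (n_L, n_p)
```

Because the decoder sees coordinates as inputs rather than a fixed pixel grid, the same construction can be evaluated on any set of query points after training.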

Figure 7: Architecture of the convo-implicit VAE.
Figure 8: Continuous latent space obtained from the trained Convo-implicit VAE. ($\bigstar$) represents the coordinates $[1.7, -0.3]$, corresponding to a rounded triangle.

We train the Convo-implicit VAE to minimize the discrepancy between the reconstructed and input SDFs by reducing their mean squared error [24]. Additionally, the latent space (fig. 8) is constrained to resemble a standard normal distribution $\bm{z}\sim\mathcal{N}(0,1)$ through a KL divergence loss term [19]. The Convo-implicit VAE’s net loss can be expressed as:

$$L_v = \|\mathcal{I}-\hat{\mathcal{I}}\|_2^2 \;+\; \beta\,\text{KL}(\bm{z}\,\|\,\mathcal{N}) \tag{1}$$

where the parameter $\beta$ $(=5\times 10^{-8})$ is the relative weight of the KL divergence loss term. Convergence was achieved after training for $4\times 10^{4}$ epochs with a learning rate of $5\times 10^{-4}$ using the Adam optimizer.
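A minimal sketch of eq. 1 is given below. It assumes the encoder outputs a mean `mu` and log-variance `log_var` per shape and uses the standard closed-form KL divergence between a diagonal Gaussian and $\mathcal{N}(0,1)$; the function name and this encoder parameterization are assumptions, since the text does not spell them out.

```python
import numpy as np

def vae_loss(sdf_true, sdf_pred, mu, log_var, beta=5e-8):
    """Reconstruction error plus beta-weighted KL term, following eq. 1.

    mu, log_var: assumed Gaussian encoder outputs, each of shape (n_L, n_z).
    """
    # Squared 2-norm of the reconstruction error over all shapes and pixels
    recon = np.sum((sdf_true - sdf_pred) ** 2)
    # Closed-form KL( N(mu, exp(log_var)) || N(0, 1) ), summed over entries
    kl = -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var))
    return recon + beta * kl
```

With $\beta = 5\times 10^{-8}$ the KL term acts as a very light regularizer, so the loss is dominated by reconstruction accuracy.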

After training, we discard the convolutional encoder. The retained implicit encoder and decoder can be queried with spatial $\bm{x}$ and latent $\bm{z}$ coordinates, respectively, to obtain the SDF $\hat{\phi}(\bm{x},\bm{z})$. For instance, querying at $\bm{z}=[1.7,-0.3]$ and $\bm{x}\in[\underline{\bm{x}}_b,\overline{\bm{x}}_b]$, we obtain the SDF of the rounded triangle (fig. 8).

2.2.3 Approximating the SDF

The retained implicit encoder and decoder allow us to query shape SDFs at any spatial coordinate, including those outside the training bounding box (section 2.2). However, this presents two challenges:

  1. Querying the VAE for each shape instance over a large number of spatial coordinates can be computationally expensive.

  2. The VAE must extrapolate SDF values outside the training bounding box, leading to inaccurate results.

To address these challenges, we propose approximating the shape SDF outside its bounding box. Observe that at large distances, the SDF of any shape approaches the Euclidean distance from the origin (fig. 9). Therefore, we can approximate the SDF as:

$$\hat{\phi}(\bm{x}) = \begin{cases} \hat{\phi}(\bm{x}) & \bm{x}\in[\underline{\bm{x}}_b,\overline{\bm{x}}_b] \\ \approx \|\bm{x}\|_2 & \text{otherwise} \end{cases} \tag{2}$$
Figure 9: The SDF at a large distance from an object can be approximated by a circle.

Importantly, note that since we retain the SDF values within the training bounding box, this approximation does not compromise the accuracy of our optimization.
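The piecewise rule in eq. 2 can be sketched as follows. Here `sdf_in_box` is a hypothetical callable standing in for the trained implicit encoder and decoder, and it is only invoked for points inside the bounding box. Equation 2 is evaluated as written, so the far-field branch returns the unsigned norm by default; `far_sign` is a hypothetical knob for callers whose SDF sign convention requires a negative value outside the shape.

```python
import numpy as np

def approx_sdf(points, sdf_in_box, box_lo, box_hi, far_sign=1.0):
    """Evaluate eq. 2 at points of shape (n, 2) in the shape's local frame.

    sdf_in_box: hypothetical stand-in for the trained network.
    far_sign:   assumed knob; +1.0 reproduces eq. 2 literally.
    """
    points = np.asarray(points, dtype=float)
    inside = np.all((points >= box_lo) & (points <= box_hi), axis=-1)
    # Far-field branch: Euclidean distance from the local origin
    out = far_sign * np.linalg.norm(points, axis=-1)
    # Near-field branch: query the network only where it was trained
    if np.any(inside):
        out[inside] = sdf_in_box(points[inside])
    return out

# Toy stand-in for the network: analytic SDF of a disk of radius 0.5
toy_sdf = lambda p: 0.5 - np.linalg.norm(p, axis=-1)
vals = approx_sdf([[0.0, 0.0], [3.0, 4.0]], toy_sdf,
                  np.array([-1.0, -1.0]), np.array([1.0, 1.0]))
```

Restricting network queries to the bounding box addresses both challenges listed above: fewer evaluations are needed, and the network is never asked to extrapolate.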

2.2.4 Geometric Projection

Figure 10: The SDF of a shape and its projected density field.

The central idea of geometry projection is to map shapes, as described by their SDFs, onto density fields (fig. 10). This mapping subsequently enables us to simulate the design. The density field, $\rho(\bm{x})$, is defined such that $\rho(\bm{x})=0$ denotes the region outside the shape and $\rho(\bm{x})=1$ denotes the region inside the shape. Given $\hat{\phi}^{(i)}(\bm{x})$ as the SDF at a point $\bm{x}$ of shape $i$, the corresponding density $\rho^{(i)}(\bm{x})$ is obtained as:

$$\rho^{(i)}(\bm{x}) = \sigma_{\beta}\big(\hat{\phi}(\bm{x}^{(i)})\big) = \frac{1}{1+e^{-\beta\hat{\phi}(\bm{x}^{(i)})}} \tag{3}$$

where $\sigma_{\beta}$ is the sigmoid function with a projection sharpness of $\beta$. Observe that this mapping is differentiable, thus enabling gradient-based optimization. Once the density fields of all shape instances are computed, the overall design density $\hat{\rho}(\bm{x})$ is obtained by taking the union using a $p^{th}$-norm formulation:

$$\hat{\rho}(\bm{x}) = \left(\sum_{i=1}^{n_F}\left(\rho^{(i)}(\bm{x})\right)^{p}\right)^{\frac{1}{p}} \tag{4}$$

Finally, the design domain is populated with Silicon where the density equals 1, and Silicon dioxide where the density equals 0. The permittivity at a point $\bm{x}$ can then be expressed as:

$$\varepsilon(\bm{x}) = \varepsilon^{(ox)} + \left(\varepsilon^{(si)}-\varepsilon^{(ox)}\right)\hat{\rho}(\bm{x}) \tag{5}$$

where $\varepsilon^{(ox)}\,(=2.25)$ and $\varepsilon^{(si)}\,(=12.25)$ are the permittivities of Silicon dioxide and Silicon, respectively.
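Equations 3 to 5 chain together into a short pipeline from per-instance SDF samples to a permittivity map. The sketch below assumes the SDF values have already been sampled at the mesh points and stacked along a leading shape-instance axis; the function names and the sample values of $\beta$ and $p$ are illustrative assumptions.

```python
import numpy as np

def project_density(phi, beta):
    """Eq. 3: sigmoid projection of SDF values onto densities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-beta * phi))

def union_density(rho, p):
    """Eq. 4: p-norm union over the leading (shape-instance) axis."""
    return np.sum(rho ** p, axis=0) ** (1.0 / p)

def permittivity(rho_hat, eps_ox=2.25, eps_si=12.25):
    """Eq. 5: interpolate between the oxide and silicon permittivities."""
    return eps_ox + (eps_si - eps_ox) * rho_hat

# Two instances sampled at three mesh points (assumed values)
phi = np.array([[1.0, -1.0, -1.0],
                [-1.0, -1.0, 1.0]])
rho = project_density(phi, beta=20.0)   # beta = 20 is an assumed sharpness
rho_hat = union_density(rho, p=8.0)     # p = 8 is an assumed exponent
eps = permittivity(rho_hat)
```

In this toy input the first and last points lie inside one instance each and receive a permittivity close to 12.25, while the middle point lies outside both and stays close to 2.25. Note that where two instances overlap fully, the $p$-norm in eq. 4 evaluates to $2^{1/p}$ rather than 1, so larger $p$ gives a tighter union.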

2.3 Shape Transformation

Recall that our objective is to achieve an optimal configuration of shapes from the library within the design space (fig. 4). Further, recall that each shape is defined in its local (library) frame of reference (fig. 7). We achieve the configurations through affine transformations from the local to the global (design) frame of reference. In particular, we define a translation ($\bar{x}^{(i)}, \bar{y}^{(i)}$), orientation ($\theta^{(i)}$), and scaling ($s^{(i)}$) operation for each shape instance $i=1,\ldots,n_F$. With $x_e, y_e$ being coordinates in the design frame of reference (typically, the center coordinates of the simulation mesh elements), we obtain the transformed coordinates $\tilde{x}_e^{(i)}, \tilde{y}_e^{(i)}$ as:

$$\begin{pmatrix}\tilde{x}_e^{(i)}\\ \tilde{y}_e^{(i)}\end{pmatrix} = \frac{1}{s^{(i)}}\,\bm{R}(\theta^{(i)})\begin{pmatrix}{x_t}_e^{(i)}\\ {y_t}_e^{(i)}\end{pmatrix} \tag{6}$$

where ${x_t}_e^{(i)}, {y_t}_e^{(i)}$ are the coordinates after translation:

$$\begin{pmatrix}{x_t}_e^{(i)}\\ {y_t}_e^{(i)}\end{pmatrix} = \begin{pmatrix}x_e-\bar{x}^{(i)}\\ y_e-\bar{y}^{(i)}\end{pmatrix}, \tag{7}$$

and $\bm{R}(\theta^{(i)})$ is the orientation matrix:

$$\bm{R}(\theta^{(i)}) = \begin{bmatrix}\cos\theta^{(i)} & -\sin\theta^{(i)}\\ \sin\theta^{(i)} & \cos\theta^{(i)}\end{bmatrix} \tag{8}$$
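Equations 6 to 8 amount to a translate, rotate, and scale applied to every mesh coordinate. A minimal sketch, with a hypothetical function name and an array layout of one coordinate pair per row, is:

```python
import numpy as np

def to_local_frame(xy, center, theta, scale):
    """Map design-frame points (n, 2) into a shape instance's frame.

    center = (x_bar, y_bar), theta in radians, scale = s, as in eqs. 6-8.
    """
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s],
                  [s,  c]])                                    # eq. 8
    translated = np.asarray(xy, dtype=float) - np.asarray(center, dtype=float)  # eq. 7
    # Row-wise form of (1/s) * R @ v for each translated point v (eq. 6)
    return (translated @ R.T) / scale

# Example: one mesh point, instance centred at (1, 2), rotated 90 degrees, scaled by 2
local = to_local_frame([[3.0, 2.0]], center=(1.0, 2.0), theta=np.pi / 2, scale=2.0)
```

In this example the point is translated to $(2, 0)$, rotated to $(0, 2)$, and divided by the scale to give $(0, 1)$. The resulting coordinates are the ones at which the instance's SDF would be queried.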

2.4 Optimization

Having established a differentiable representation of the shape library and the necessary methods for its transformation within the design domain, we now outline the key components of the design optimization framework.

Design Variables: The optimization process encompasses the selection (as determined by the latent coordinates $(z_1^{(i)}, z_2^{(i)})$), translation $(\bar{x}^{(i)}, \bar{y}^{(i)})$, orientation $(\theta^{(i)})$, and scaling $(s^{(i)})$ of each shape instance $i=1,\ldots,n_F$. Collectively, these form our design variables $\bar{\bm{\Lambda}}=\{\bm{\Lambda}^{(1)},\bm{\Lambda}^{(2)},\ldots,\bm{\Lambda}^{(n_F)}\}$ where $\bm{\Lambda}^{(i)}=\{z_1^{(i)},z_2^{(i)},\bar{x}^{(i)},\bar{y}^{(i)},\theta^{(i)},s^{(i)}\}$.
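One convenient way to hold these variables in code is a flat vector of length $6\,n_F$ that is reshaped into one row per shape instance. The layout and field names below are assumptions made for illustration; the text does not prescribe a storage order.

```python
import numpy as np

# Assumed per-instance ordering, mirroring the definition of Lambda^(i)
FIELDS = ("z1", "z2", "x_bar", "y_bar", "theta", "s")

def unpack_design(flat, n_F):
    """Split a flat vector of length 6 * n_F into named per-instance arrays."""
    table = np.asarray(flat, dtype=float).reshape(n_F, len(FIELDS))
    return {name: table[:, k] for k, name in enumerate(FIELDS)}

# Example with two shape instances and placeholder values 0..11
design = unpack_design(np.arange(12.0), n_F=2)
```

A flat layout of this kind is what most gradient-based optimizers expect, while the named view keeps the selection, translation, orientation, and scaling variables easy to address.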

Simulation: Upon computing the permittivities (eq. 5) at the centers of the mesh elements, we simulate the optical response of the component by employing a finite-difference frequency-domain (FDFD) solver. The electric field, $\bm{E}$, is determined by solving the linear system:

$$\bm{K}_{\omega}(\bm{\varepsilon})\,\bm{E} = \bm{J}, \tag{9}$$

where $\bm{K}_{\omega}$ represents the system matrix at frequency $\omega$ and $\bm{J}$ denotes the source(s). Specifically, we utilize Ceviche [25], an open-source FDFD electromagnetic simulator. The resulting electric field is then used to compute the scattering spectra, which serve as the metric for evaluating the objective function.

Objective: We aim to achieve photonic devices that meet performance specifications, specifically, desired levels of insertion and reflection losses. Following the approach in [9], we express these specifications by defining cutoff values $\bm{S}^{*}$ for the magnitudes of the scattering spectra at the monitored wavelengths at the input and output ports. With $\bm{S}(\bar{\bm{\Lambda}})$ representing the scattering spectra obtained with design parameters $\bar{\bm{\Lambda}}$, we can express our objective as:

$$L(\bar{\bm{\Lambda}}) = \left\|\Psi^{+}\left(\frac{|\bm{S}(\bar{\bm{\Lambda}})|^{2}-|\bm{S}^{*}|^{2}}{\bm{w}}\right)\right\|_{2}^{2} \tag{10}$$

where $\Psi^{+}(\cdot)$ denotes the softplus function, and $\bm{w}$ represents the relative weights for each entry in the scattering spectra. For a detailed discussion, we refer the reader to [9].
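Equation 10 can be evaluated directly once the scattering spectra are available. The sketch below takes the spectra, cutoffs, and weights as flat arrays and implements the expression as written; the function names are assumptions, and the treatment of lower versus upper cutoffs follows [9] and is not reproduced here.

```python
import numpy as np

def softplus(x):
    """log(1 + exp(x)), computed in a form that stays finite for large |x|."""
    return np.logaddexp(0.0, x)

def spec_loss(S, S_target, w):
    """Eq. 10: squared 2-norm of the softplus of weighted power differences.

    S may be complex; only the squared magnitudes enter the loss.
    """
    violation = (np.abs(S) ** 2 - np.abs(S_target) ** 2) / w
    return np.sum(softplus(violation) ** 2)

# Example with one monitored entry (assumed values)
loss = spec_loss(np.array([0.9 + 0.0j]), np.array([0.5]), np.array([0.1]))
```

Smaller weights in `w` sharpen the penalty around the cutoff, since the softplus argument grows faster with the same power difference.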

Minimum Separation Distance Constraint : We impose a minimum spacing constraint to promote fabricability. Following [26] [27], we offset the feature instances by an amount equal to half of the MSD (i.e., δ=MSD/2superscript𝛿𝑀𝑆𝐷2\delta^{*}=MSD/2italic_δ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT = italic_M italic_S italic_D / 2). Then, with χ+()superscript𝜒\chi^{+}(\cdot)italic_χ start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT ( ⋅ ) representing the ReLU function and nesubscript𝑛𝑒n_{e}italic_n start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT representing the number of elements on the mesh, we can express the minimum separation distance constraint as:

$$g_s(\bar{\bm{\Lambda}}) \equiv \frac{1}{n_e} \left( \sum_{e=1}^{n_e} \chi^{+}\left( \sum_{i=1}^{n_F} \sigma_{\beta}\left( \phi^{(i)}(\bm{x}_e) - \delta^{*} \right) - 1 \right) \right) \leq 0 \tag{11}$$
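The following sketch evaluates eq. 11 for two circular features on the default 80 × 80 grid. It rests on several assumptions that the text above does not fix: the SDF is taken to be negative inside a shape, the projection is taken as $\sigma_{\beta}(t) = 1/(1 + e^{\beta t})$ so that interior points map to densities near one, and analytic circles stand in for the library shapes. The value of `beta` and all function names are hypothetical.

```python
import math

def sigma_beta(t, beta):
    """Assumed projection 1 / (1 + exp(beta * t)), written with tanh to avoid overflow."""
    return 0.5 * (1.0 - math.tanh(0.5 * beta * t))

def circle_sdf(cx, cy, r):
    """Analytic circle SDF (negative inside), a stand-in for a library shape."""
    return lambda x, y: math.hypot(x - cx, y - cy) - r

def msd_constraint(points, sdfs, msd, beta=0.5):
    """Eq. 11: mean over elements of ReLU(sum_i sigma_beta(phi_i - delta*) - 1)."""
    delta = 0.5 * msd
    total = 0.0
    for x, y in points:
        stacked = sum(sigma_beta(phi(x, y) - delta, beta) for phi in sdfs)
        total += max(stacked - 1.0, 0.0)
    return total / len(points)

# Element centers of an 80 x 80 grid on a 1600 nm square (20 nm elements).
points = [(20.0 * (i + 0.5), 20.0 * (j + 0.5)) for i in range(80) for j in range(80)]
close = [circle_sdf(680.0, 800.0, 100.0), circle_sdf(920.0, 800.0, 100.0)]  # 40 nm gap
far = [circle_sdf(650.0, 800.0, 100.0), circle_sdf(950.0, 800.0, 100.0)]    # 100 nm gap
g_close = msd_constraint(points, close, msd=60.0)  # positive: offset copies overlap
g_far = msd_constraint(points, far, msd=60.0)      # zero: offset copies stay apart
```

With an MSD of 60 nm, the pair separated by 40 nm produces a positive constraint value, while the pair separated by 100 nm contributes nothing, since the offset copies no longer overlap.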

Latent Space Constraint: Observe that while the optimizer continuously varies the latent coordinates during optimization, the shapes encoded from the library occupy discrete points in the latent space. In other words, the optimizer is allowed to explore shapes outside the library during optimization. However, at convergence, we expect to select shapes exclusively from the library. To enforce this, we impose a constraint on the distances between the latent coordinates of selected shape instances and those of the library shapes. With $\mathcal{D}$ being the pairwise distances between the latent coordinates of shape instances and library shapes:

$$\mathcal{D}_{ji} = \left\| \bm{z}_{*}^{(j)} - \bm{z}^{(i)} \right\|, \quad j = 1, \ldots, n_L, \quad i = 1, \ldots, n_F \tag{12}$$

We can express the constraint as:

$$g_l \equiv \max_{i} \left( \min_{j} \mathcal{D}_{ji} \right) \leq 0 \tag{13}$$

To facilitate gradient-based optimization, we use the LogSumExp approximations of the max and min functions.
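A minimal sketch of eqs. 12 and 13 with LogSumExp smoothing is given below. The temperature `t`, the function names, and the example latent coordinates are assumptions made for illustration; the text does not state the smoothing parameter used.

```python
import math

def logsumexp(values):
    """Stable log(sum(exp(v))) via the usual max shift."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def smooth_max(values, t=50.0):
    """LogSumExp approximation of max; overestimates by at most log(n) / t."""
    return logsumexp([t * v for v in values]) / t

def smooth_min(values, t=50.0):
    """LogSumExp approximation of min; underestimates by at most log(n) / t."""
    return -logsumexp([-t * v for v in values]) / t

def latent_constraint(z_inst, z_lib, t=50.0):
    """Eq. 13: max over instances i of the min over library shapes j of D_ji (eq. 12)."""
    nearest = [smooth_min([math.dist(zj, zi) for zj in z_lib], t) for zi in z_inst]
    return smooth_max(nearest, t)

z_lib = [(0.0, 0.0), (1.0, 1.0), (-2.0, 0.5)]   # hypothetical library encodings
g_off = latent_constraint([(0.1, 0.0), (1.0, 1.3)], z_lib)  # instances off the library
g_on = latent_constraint([(0.0, 0.0), (1.0, 1.0)], z_lib)   # instances on library points
```

In this sketch `g_off` is close to 0.3, the largest nearest-library distance. Note that `g_on` is small but not exactly zero, because the smooth max adds a bias of up to $\log(n_F)/t$; a practical implementation would presumably need a tolerance or a larger temperature to account for this.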

Bound Constraints: Given that the latent space coordinates $\bm{z} \sim \mathcal{N}(0, 1)$, we constrain $z_i \in [-3, 3], \; \forall i$ (three standard deviations on either side of the mean). Further, the translation coordinates $\left(\bar{x}^{(i)}, \bar{y}^{(i)}\right)$ are constrained to lie within the bounding box of the design domain (fig. 4). The orientation parameter is constrained as $0 \leq \theta^{(i)} \leq 2\pi$. Furthermore, the scaling parameter is constrained to $s_{\text{min}} \leq s^{(i)} \leq s_{\text{max}}$. The lower bound $s_{\text{min}}$ ensures the MFS constraint is met. For example, with a library MFS of $40$ nm, imposing an MFS of $60$ nm requires $s_{\text{min}} = 60/40 = 1.5$ (see section 3.2). The upper bound $s_{\text{max}}$, defined as the ratio of the diagonal lengths of the design and shape bounding boxes, ensures the shape instance does not exceed the design domain.
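The bounds above can be assembled per shape instance as in the following sketch. The 200 nm shape bounding box, the placement of the domain origin at a corner, and the function names are assumptions for illustration; the ordering of the six variables follows line 5 of algorithm 1.

```python
import math

def scale_bounds(target_mfs, library_mfs, domain_wh, shape_wh):
    """s_min from the MFS ratio; s_max from the ratio of bounding-box diagonals."""
    s_min = target_mfs / library_mfs
    s_max = math.hypot(*domain_wh) / math.hypot(*shape_wh)
    return s_min, s_max

def instance_bounds(domain_wh, s_min, s_max):
    """Lower and upper bounds for one instance, ordered as (z1, z2, x, y, theta, s)."""
    lower = [-3.0, -3.0, 0.0, 0.0, 0.0, s_min]
    upper = [3.0, 3.0, domain_wh[0], domain_wh[1], 2.0 * math.pi, s_max]
    return lower, upper

# Library MFS of 40 nm, imposed MFS of 60 nm, 1600 nm square domain,
# and a hypothetical 200 nm x 200 nm shape bounding box.
s_min, s_max = scale_bounds(60.0, 40.0, (1600.0, 1600.0), (200.0, 200.0))
lower, upper = instance_bounds((1600.0, 1600.0), s_min, s_max)
```

Here `s_min` reproduces the value of 1.5 quoted in the text, and `s_max` evaluates to the diagonal ratio of 8 for the assumed bounding box.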

Optimization: Collecting the objective (eq. 10), the solver (eq. 9), and the constraints (eqs. 11 and 13), the optimization problem can be expressed as:

$$
\begin{aligned}
\underset{\bar{\bm{\Lambda}} = \{\bm{\Lambda}^{(1)}, \bm{\Lambda}^{(2)}, \ldots, \bm{\Lambda}^{(n)}\}}{\text{minimize}} \quad & L(\bar{\bm{\Lambda}}) && \text{(14a)} \\
\text{subject to} \quad & \bm{K}_{\omega}(\bar{\bm{\Lambda}}) \bm{E} = \bm{J} && \text{(14b)} \\
& g_s(\bar{\bm{\Lambda}}) \leq 0 && \text{(14c)} \\
& g_l(\bar{\bm{\Lambda}}) \leq 0 && \text{(14d)} \\
& \bar{\bm{\Lambda}}_{\text{min}} \leq \bar{\bm{\Lambda}} \leq \bar{\bm{\Lambda}}_{\text{max}} && \text{(14e)}
\end{aligned}
$$

where $[\bar{\bm{\Lambda}}_{\text{min}}, \bar{\bm{\Lambda}}_{\text{max}}]$ collectively represent the bound constraints on the design variables. The method of moving asymptotes (MMA) [28], a gradient-based constrained optimizer, is employed to perform the design updates. Finally, to update the design variables $\bar{\bm{\Lambda}}$, we require the gradients of the objective and constraints with respect to these variables. Leveraging the automatic differentiation (AD) capabilities of JAX [29], we automatically derive the sensitivities [30, 31]. In practice, this means we only define the forward expressions, and JAX computes all necessary derivatives to machine precision.
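To illustrate what "defining only the forward expressions" means, the toy below implements forward-mode AD with dual numbers in plain Python and differentiates a scalar version of eq. 10. This is a stand-in chosen to keep the example dependency-free: it is not how JAX is implemented and not the authors' code. In JAX, the equivalent step would be a call such as `jax.grad(loss)`. The class `Dual` and the helper names are hypothetical.

```python
import math

class Dual:
    """A value paired with its derivative with respect to a single input."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.val - other.val, self.der - other.der)

    def __mul__(self, other):
        other = Dual.lift(other)
        return Dual(self.val * other.val, self.der * other.val + self.val * other.der)
    __rmul__ = __mul__

def softplus(x):
    """Primitive with its own derivative rule (the sigmoid), as an AD library would supply."""
    e = math.exp(x.val)
    return Dual(math.log1p(e), x.der * e / (1.0 + e))

def loss(s_sq, cutoff_sq=0.01):
    """Forward expression only: softplus(|S|^2 - |S*|^2)^2 with unit weight."""
    p = softplus(s_sq - cutoff_sq)
    return p * p

def derivative(f, x):
    """Seed the input with derivative 1 and read the derivative off the output."""
    return f(Dual(x, 1.0)).der

dloss = derivative(loss, 0.01)  # analytic value at the cutoff: 2 * ln(2) * 0.5 = ln(2)
```

The function `loss` contains no derivative code, yet its sensitivity is obtained exactly (up to floating-point rounding) rather than by finite differences, which is the property the paragraph above relies on.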

2.5 Algorithm

Having defined all components of the framework, we now summarize and present the complete algorithm.

As mentioned earlier, we assume a library of shapes has been provided. The procedure begins with training the VAE as detailed in section 2.2.2. The network is trained until a sufficiently high representational accuracy is attained. The convolutional encoder is then discarded and the implicit encoder and decoder are retained ($VAE^{*}$). The network now takes the latent space coordinates and spatial coordinates as input and returns the SDF of the shape at the pertinent latent and spatial coordinates.

The main optimization algorithm is summarized in algorithm 1. Initially, the design domain is discretized, and the coordinates of the element centers are computed (line 1). The design variables are then randomly initialized (line 2). We iterate the optimization process until convergence. During each iteration, the latent space coordinates and transformation parameters of the shape instances are obtained from the design variables (line 5). The transformed coordinates of the design space are computed using the transformation parameters (line 6). The SDFs of the shape instances are then computed using the latent space coordinates and the transformed coordinates (line 7). The SDFs are then projected to obtain the density fields (line 8). The design density field is then computed as a union of the density fields of the shape instances (line 9). We then determine the permittivity field from the density field (line 10). We utilize the permittivities to simulate the optical response of the design (line 11). The scattering spectra are computed using the optical response (line 12). Then the objective, minimum separation distance and latent space constraints are computed (line 13, line 14, line 15). The sensitivities are computed in an automated fashion (line 16) and the design variable is updated using MMA (line 17). The process continues until the MMA tolerances are met or the iterations exceed a maximum value (line 19).

Algorithm 1 PhoTOS
1: $\Omega^{0}_{h} \rightarrow \bm{x}_e$ ▷ element center coordinates $\bm{x}_e$ of size $\{n_e \times 2\}$
2: Initialize $\bar{\bm{\Lambda}}^{0}$ ▷ random initialization
3: $k = 0$
4: repeat ▷ optimization loop
5:     $\bar{\bm{\Lambda}} = \{\bm{z}_1, \bm{z}_2, \bar{\bm{x}}, \bar{\bm{y}}, \bm{\theta}, \bm{s}\}$
6:     $(\bar{\bm{x}}, \bar{\bm{y}}, \bm{\theta}, \bm{s}, \bm{x}_e) \rightarrow \tilde{\bm{x}}$ ▷ affine transformations, section 2.3
7:     $(\bm{z}, \tilde{\bm{x}}, VAE^{*}) \rightarrow \hat{\bm{\phi}}$ ▷ shape instance SDFs, section 2.2.2
8:     $\hat{\bm{\phi}} \rightarrow \bm{\rho}$ ▷ projection of shape SDF to density, eq. 3
9:     $\bm{\rho} \rightarrow \hat{\bm{\rho}}$ ▷ design density, eq. 4
10:     $\hat{\bm{\rho}} \rightarrow \bm{\varepsilon}$ ▷ design permittivity, eq. 5
11:     $\bm{K}_{\omega}(\bm{\varepsilon}), \bm{J} \rightarrow \bm{E}$ ▷ EM simulation, eq. 9
12:     $\bm{E} \rightarrow \bm{S}$ ▷ scattering spectra, section 2.4
13:     $(\bm{S}, \bm{S}^{*}) \rightarrow L$ ▷ objective, eq. 10
14:     $\hat{\bm{\phi}}, \mathrm{MSD} \rightarrow g_s$ ▷ MSD constraint, eq. 11
15:     $(\bm{z}, \bm{z}_{*}) \rightarrow g_l$ ▷ latent space constraint, eq. 13
16:     Compute $\nabla_{\bar{\bm{\Lambda}}} L, \nabla_{\bar{\bm{\Lambda}}} g_s, \nabla_{\bar{\bm{\Lambda}}} g_l$ ▷ automatic differentiation, section 2.4
17:     MMA$(\bar{\bm{\Lambda}}^{k}, L, g_s, g_l, \nabla_{\bar{\bm{\Lambda}}} L, \nabla_{\bar{\bm{\Lambda}}} g_s, \nabla_{\bar{\bm{\Lambda}}} g_l) \rightarrow \bar{\bm{\Lambda}}^{k+1}$ ▷ MMA update step, section 2.4
18:     $k \leftarrow k + 1$
19: until convergence or $k \geq \text{max\_epoch}$
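The geometric part of algorithm 1 (lines 6 to 9) can be sketched as follows. Everything specific here is an assumption made for illustration: an analytic unit-square SDF stands in for the trained decoder $VAE^{*}$, the SDF is taken as negative inside a shape, the projection is a sigmoid with a hypothetical sharpness `beta`, and the union is taken as a plain maximum of instance densities. The actual forms are those of section 2.3 and eqs. 3 and 4.

```python
import math

def to_local(x, y, cx, cy, theta, s):
    """Line 6: map a design-space point into the shape's reference frame by
    undoing the translation, then the rotation, then the scaling."""
    dx, dy = x - cx, y - cy
    c, sn = math.cos(theta), math.sin(theta)
    return (c * dx + sn * dy) / s, (-sn * dx + c * dy) / s

def reference_sdf(u, v):
    """Stand-in for the decoder: SDF of a unit square centered at the origin."""
    qx, qy = abs(u) - 0.5, abs(v) - 0.5
    outside = math.hypot(max(qx, 0.0), max(qy, 0.0))
    inside = min(max(qx, qy), 0.0)
    return outside + inside

def instance_sdf(x, y, inst):
    """Line 7: SDF of a transformed instance; distances scale with s."""
    cx, cy, theta, s = inst
    u, v = to_local(x, y, cx, cy, theta, s)
    return s * reference_sdf(u, v)

def project(phi, beta):
    """Line 8 (assumed form): density near 1 inside (phi < 0), near 0 outside."""
    return 0.5 * (1.0 - math.tanh(0.5 * beta * phi))

def design_density(x, y, instances, beta=0.5):
    """Line 9 (assumed form): union of instances as the maximum density."""
    return max(project(instance_sdf(x, y, inst), beta) for inst in instances)

square = [(800.0, 800.0, 0.0, 200.0)]            # 200 nm square, unrotated
rotated = [(800.0, 800.0, math.pi / 4, 200.0)]   # same square rotated by 45 degrees
rho_center = design_density(800.0, 800.0, square)
rho_axis = design_density(930.0, 800.0, square)      # 30 nm outside the flat edge
rho_corner = design_density(930.0, 800.0, rotated)   # inside the rotated corner
```

The point 130 nm to the right of the center lies outside the unrotated square but inside the rotated one, whose corner extends roughly 141 nm along that axis, so the two densities fall on opposite sides of 0.5.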

3 Numerical Experiments

In this section, we present several experiments to demonstrate the proposed framework. Without loss of generality, the default parameters for the experiments are set as follows:

  1. The computational domain consists of a grid of $80 \times 80$ elements, representing a design domain of $1600 \times 1600 \; \text{nm}^2$.

  2. An MFS of 40 nm and an MSD of 60 nm are imposed on the design.

  3. The design is populated with 36 shape instances initialized on a $6 \times 6$ grid. The latent coordinates are initialized randomly with a seed of 27.

  4. All components are optimized for their performance within two $10$ nm wavelength bands centered at $\omega = 1270$ nm and $\omega = 1290$ nm (O-band). The ports are excited in their fundamental mode.

  5. Designs with a transmission above -0.5 dB (i.e., an insertion loss below 0.5 dB) and a back reflection below -20 dB are desired.

  6. The input and output port waveguides have a width of $400$ nm.

  7. Optimization is performed using MMA with a move limit set to $10^{-2}$ for a maximum of 150 iterations. All other default parameters correspond to the version of MMA presented in [28].

  8. All experiments are conducted on a MacBook M3 Pro, using the JAX library [29] in Python.
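The defaults listed above can be collected in a small configuration object, from which a few derived quantities follow. The class and field names are hypothetical, and the dB conversion assumes the thresholds refer to power ratios, $10 \log_{10} |S|^2$. The count of six variables per instance follows line 5 of algorithm 1.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Defaults:
    nelx: int = 80                  # elements along x
    nely: int = 80                  # elements along y
    domain_nm: float = 1600.0       # side length of the square design domain
    mfs_nm: float = 40.0
    msd_nm: float = 60.0
    n_instances: int = 36           # 6 x 6 grid of shape instances
    seed: int = 27
    band_centers_nm: tuple = (1270.0, 1290.0)
    band_width_nm: float = 10.0
    transmission_db: float = -0.5
    reflection_db: float = -20.0
    port_width_nm: float = 400.0
    move_limit: float = 1e-2
    max_iter: int = 150

def db_to_power(db):
    """Convert a dB threshold to a power ratio (assumes 10 * log10 of power)."""
    return 10.0 ** (db / 10.0)

cfg = Defaults()
elem_nm = cfg.domain_nm / cfg.nelx                # element edge length in nm
n_design_vars = cfg.n_instances * 6               # (z1, z2, x, y, theta, s) per instance
min_transmission = db_to_power(cfg.transmission_db)
max_reflection = db_to_power(cfg.reflection_db)
```

Under these assumptions each element is 20 nm wide, so the 40 nm MFS spans two elements, the design has 216 variables, and the specifications correspond to transmitting at least about 89% of the input power while reflecting at most 1%.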

3.1 Convergence

In this section, we apply our TO framework to the design of waveguide bends and mode converters. Fig. 11(i) depicts the convergence of a waveguide bend design. Given an excitation from port 1 (bottom), our objective is to maximize transmission to port 2 (right) while minimizing back reflection. Fig. 11(ii) illustrates the convergence of a mode converter design, where the goal is to achieve maximum conversion of the fundamental waveguide mode at port 1 (bottom) to the second-order mode at port 2 (top), with minimal back reflection.

Initially, both designs exhibit poor performance, characterized by low transmission to the output port and significant reflection at the input port. Further, the latent coordinates of the shapes do not align with those of our library shapes. However, subsequent updates substantially modify the design topology by optimizing the transformation parameters and latent coordinates.

By the $150^{\text{th}}$ iteration, the target performance specifications are met. Additionally, the separation and latent coordinate constraints are satisfied, with the latent coordinates of the shapes in the design completely overlapping those of our library shapes. In essence, we obtain designs that fulfill the performance requirements while adhering to the MFS and MSD constraints. Fig. 11(d) confirms that the final designs consist of well-separated shapes from our library, satisfying the MSD constraint.

Figure 11: Evolution of designs: (i) waveguide bend and (ii) mode converter. (a) Scattering spectra. (b) Latent space coordinates of selected and encoded library shapes. (c) Electric field magnitude at wavelength of 1280 nm. (d) Final design.

3.2 Feature Size Constraints

Figure 12: Performance and design for various minimum feature size (MFS) and minimum separation distance (MSD). (i) Waveguide bend. (ii) Mode converter. (a): Scattering spectra at the input and output port. (b): Electric field magnitude at a wavelength of 1280 nm. (c): Optimized design.

A key consideration during the design stage is investigating the impact of the feature sizes, namely the MFS and MSD. While larger feature sizes may enhance fabricability and be necessary for certain fabrication processes, they also restrict the solution space. Recall that the MFS is enforced through a lower bound on the scaling range, and the MSD is imposed via eq. 11.

We explore various combinations of MFS/MSD in fig. 12 for the waveguide bend and mode converter. In all cases, our framework successfully satisfied the imposed size constraints and met performance requirements. However, it is important to note that if the required feature sizes are large, the optimizer may fail to achieve the desired performance due to a lack of feasible points in the solution space. In such scenarios, exploring optimization with larger component dimensions may be necessary.

3.3 Effect of Initialization

Figure 13: Optimized performance and design obtained from different random initialization seeds: (a) 10, (b) 100, (c) 1000. Top: Scattering parameters at the input and output port. Middle: Electric field magnitude at a wavelength of 1280 nm. Bottom: Optimized design.

In this experiment, we investigate the influence of the initial design on the optimization result. Given the non-convex, non-linear nature of our optimization problem, we anticipate that different initializations will lead to distinct local optima. We initialize the optimization with a grid of equi-spaced shapes, setting the scaling to the mean of the MFS and domain diagonal length. The latent coordinates are initialized randomly as $\mathcal{N}(\mu = 0, \sigma = 2)$.

For example, consider fig. 13. Here we optimize mode converters with varying seeds for the latent coordinates. Observe that while we obtain diverse topologies, the performances are similar, as indicated by their scattering parameters. This suggests that, as expected, the loss landscape is highly non-convex with numerous local solutions. Additionally, it indicates the robustness of our TO framework, which is capable of discovering designs with the desired performance regardless of initialization.
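The seeded initialization used in this experiment can be sketched as below. The placement of instances at the centers of a $6 \times 6$ partition of the domain is an assumption, and Python's `random` module stands in for whatever random number generator the authors' code uses, so the sampled values will not match theirs. The function name `init_instances` is hypothetical.

```python
import random

def init_instances(n_side=6, domain=1600.0, seed=27, sigma=2.0, latent_dim=2):
    """Place n_side x n_side instances on an equi-spaced grid and draw
    their latent coordinates from N(0, sigma) with a fixed seed."""
    rng = random.Random(seed)
    pitch = domain / n_side
    instances = []
    for row in range(n_side):
        for col in range(n_side):
            center = ((col + 0.5) * pitch, (row + 0.5) * pitch)
            z = tuple(rng.gauss(0.0, sigma) for _ in range(latent_dim))
            instances.append({"center": center, "z": z})
    return instances

runs = {seed: init_instances(seed=seed) for seed in (10, 100, 1000)}
```

Each seed yields the same grid of centers but different latent coordinates, so the runs start from different shapes at identical locations.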

4 Conclusion

In this work, we presented a topology optimization framework for the design of fabricable photonic components. Building upon feature mapping methods, our approach extends their capabilities to encompass multiple generic shapes. Further, we incorporate fabrication constraints, namely minimum feature size and separation, in our design. We leveraged a convo-implicit variational autoencoder (VAE) to transform the discrete shape library into a continuous and differentiable latent space, facilitating gradient-based optimization. The effectiveness of our framework was demonstrated through the successful design of waveguide bends and mode converters, highlighting its ability to generate high-performance photonic components that adhere to strict fabrication constraints.

We also identify several avenues for improvement. While the current convo-implicit VAE is trained solely on SDF images, incorporating the Eikonal equation into the training process could enhance accuracy [22]. Further, the number of shape instances was predetermined in this study. Future work will investigate the inclusion of this number as a design variable, offering greater flexibility. In addition, our library was limited to approximately 15 shapes for illustrative purposes. Expanding the library to include a wider variety of shapes would increase the design space and potentially yield even better results. Furthermore, our current implementation is focused on 2D designs. Extending the framework to 3D designs and validating the fabricated components' performance is of significant interest. Finally, we aim to leverage the expanded design space offered by our approach to design more complex photonic components [10, 32, 33].

Acknowledgments

No external funding was used in supporting this work.

Compliance with ethical standards

The authors declare that they have no conflict of interest.

Replication of Results

The Python code is available at github.com/aadityacs/PhoTOS

References

  • [1] Rasmus E Christiansen and Ole Sigmund. Inverse design in photonics by topology optimization: tutorial. JOSA B, 38(2):496–509, 2021.
  • [2] David Marpaung, Jianping Yao, and José Capmany. Integrated microwave photonics. Nature photonics, 13(2):80–90, 2019.
  • [3] Juan M Arrazola, Ville Bergholm, Kamil Brádler, Thomas R Bromley, Matt J Collins, Ish Dhand, Alberto Fumagalli, Thomas Gerrits, Andrey Goussev, Lukas G Helt, et al. Quantum circuits with many photons on a programmable nanophotonic chip. Nature, 591(7848):54–60, 2021.
  • [4] Gordon Wetzstein, Aydogan Ozcan, Sylvain Gigan, Shanhui Fan, Dirk Englund, Marin Soljačić, Cornelia Denz, David AB Miller, and Demetri Psaltis. Inference in artificial intelligence with deep optics and photonics. Nature, 588(7836):39–47, 2020.
  • [5] Ole Sigmund and Kurt Maute. Topology optimization approaches: A comparative review. Structural and multidisciplinary optimization, 48(6):1031–1055, 2013.
  • [6] Ronald P Jenkins, Sawyer D Campbell, and Douglas H Werner. General-purpose algorithm for two-material minimum feature size enforcement of freeform nanophotonic devices. ACS Photonics, 10(4):845–853, 2022.
  • [7] Mingkun Chen, Jiaqi Jiang, and Jonathan A Fan. Design space reparameterization enforces hard geometric constraints in inverse-designed nanophotonic devices. ACS Photonics, 7(11):3141–3151, 2020.
  • [8] Julián A Norato. Topology optimization with supershapes. Structural and Multidisciplinary Optimization, 58(2):415–434, 2018.
  • [9] Martin F Schubert, Alfred KC Cheung, Ian AD Williamson, Aleksandra Spyra, and David H Alexander. Inverse design of photonic devices with strict foundry fabrication constraints. ACS Photonics, 9(7):2327–2336, 2022.
  • [10] Alexander Y Piggott, Jesse Lu, Konstantinos G Lagoudakis, Jan Petykiewicz, Thomas M Babinec, and Jelena Vučković. Inverse design and demonstration of a compact and broadband on-chip wavelength demultiplexer. Nature photonics, 9(6):374–377, 2015.
  • [11] JA Norato, BK Bell, and Daniel A Tortorelli. A geometry projection method for continuum-based topology optimization with discrete elements. Computer Methods in Applied Mechanics and Engineering, 293:306–327, 2015.
  • [12] Shanglong Zhang, Julián A Norato, Arun L Gain, and Naesung Lyu. A geometry projection method for the topology optimization of plate structures. Structural and Multidisciplinary Optimization, 54:1173–1190, 2016.
  • [13] Aaditya Chandrasekhar. PolyTO: Structural topology optimization using convex polygons. arXiv preprint arXiv:2305.04406, 2023.
  • [14] Fabian Wein, Peter D Dunning, and Julián A Norato. A review on feature-mapping methods for structural optimization. Structural and multidisciplinary optimization, 62:1597–1638, 2020.
  • [15] Yohan Choi, Aki Fujimura, and Abhishek Shendre. Curvilinear masks: an overview. Photomask Technology 2021, 11855:157–172, 2021.
  • [16] Jeong Joon Park, Peter Florence, Julian Straub, Richard Newcombe, and Steven Lovegrove. DeepSDF: Learning continuous signed distance functions for shape representation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 165–174, 2019.
  • [17] Krishnaparvathy Puthankovilakam, Toralf Scharf, Hans Peter Herzig, Uwe Vogler, and Reinhard Voelkel. Unified rule based correction for corners in proximity lithography mask using high resolution features. Microelectronic Engineering, 172:35–44, 2017.
  • [18] Mo Chen, Rasmus E Christiansen, Jonathan A Fan, Göktuğ Işiklar, Jiaqi Jiang, Steven G Johnson, Wenchao Ma, Owen D Miller, Ardavan Oskooi, Martin F Schubert, et al. Validation and characterization of algorithms and software for photonics inverse design. JOSA B, 41(2):A161–A176, 2024.
  • [19] Diederik P Kingma, Max Welling, et al. An introduction to variational autoencoders. Foundations and Trends® in Machine Learning, 12(4):307–392, 2019.
  • [20] Diederik P Kingma and Max Welling. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114, 2013.
  • [21] Carl Doersch. Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908, 2016.
  • [22] Vincent Sitzmann, Julien Martel, Alexander Bergman, David Lindell, and Gordon Wetzstein. Implicit neural representations with periodic activation functions. Advances in neural information processing systems, 33:7462–7473, 2020.
  • [23] Matthew Tancik, Pratul Srinivasan, Ben Mildenhall, Sara Fridovich-Keil, Nithin Raghavan, Utkarsh Singhal, Ravi Ramamoorthi, Jonathan Barron, and Ren Ng. Fourier features let networks learn high frequency functions in low dimensional domains. Advances in Neural Information Processing Systems, 33:7537–7547, 2020.
  • [24] Mahindra Rautela, J Senthilnath, Armin Huber, and S Gopalakrishnan. Towards deep generation of guided wave representations for composite materials. IEEE Transactions on Artificial Intelligence, 2022.
  • [25] Tyler W Hughes, Ian AD Williamson, Momchil Minkov, and Shanhui Fan. Forward-mode differentiation of Maxwell's equations. ACS Photonics, 6(11):3010–3016, 2019.
  • [26] Zhan Kang, Yaguang Wang, and Yiqiang Wang. Structural topology optimization with minimum distance control of multiphase embedded components by level set method. Computer Methods in Applied Mechanics and Engineering, 306:299–318, 2016.
  • [27] Boyang Deng, Kyle Genova, Soroosh Yazdani, Sofien Bouaziz, Geoffrey Hinton, and Andrea Tagliasacchi. CvxNet: Learnable convex decomposition. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 31–44, 2020.
  • [28] Krister Svanberg. MMA and GCMMA-two methods for nonlinear optimization. vol, 1:1–15, 2007.
  • [29] James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Chris Leary, Dougal Maclaurin, George Necula, Adam Paszke, Jake VanderPlas, Skye Wanderman-Milne, and Qiao Zhang. JAX: composable transformations of Python+NumPy programs, 2018.
  • [30] Aaditya Chandrasekhar, Saketh Sridhara, and Krishnan Suresh. AuTO: a framework for automatic differentiation in topology optimization. Structural and Multidisciplinary Optimization, 64(6):4355–4365, 2021.
  • [31] Momchil Minkov, Ian AD Williamson, Lucio C Andreani, Dario Gerace, Beicheng Lou, Alex Y Song, Tyler W Hughes, and Shanhui Fan. Inverse design of photonic crystals through automatic differentiation. ACS Photonics, 7(7):1729–1741, 2020.
  • [32] Weijie Chang, Xinshu Ren, Yingquan Ao, Longhui Lu, Mengfan Cheng, Lei Deng, Deming Liu, and Minming Zhang. Inverse design and demonstration of an ultracompact broadband dual-mode 3 dB power splitter. Optics Express, 26(18):24135–24144, 2018.
  • [33] Omer Yesilyurt, Zhaxylyk A Kudyshev, Alexandra Boltasseva, Vladimir M Shalaev, and Alexander V Kildishev. Efficient topology-optimized couplers for on-chip single-photon sources. ACS Photonics, 8(10):3061–3068, 2021.