\addbibresource{BFCR_paper_bib.bib}

Braced Fourier Continuation and Regression for Anomaly Detection

Josef Sabuda
Abstract

In this work, the concept of Braced Fourier Continuation and Regression (BFCR) is introduced. BFCR is a novel and computationally efficient means of finding nonlinear regressions or trend lines in arbitrary one-dimensional data sets. The Braced Fourier Continuation (BFC) and BFCR algorithms are first outlined, followed by a discussion of the properties of BFCR as well as demonstrations of how BFCR trend lines may be used effectively for anomaly detection both within and at the edges of arbitrary one-dimensional data sets. Finally, potential issues which may arise while using BFCR for anomaly detection as well as possible mitigation techniques are outlined and discussed. All source code and example data sets are either referenced or available via GitHub, and all associated code is written entirely in Python.

Keywords: Regression, Trend Finding Algorithm, Fourier Continuation, Fourier Analysis, FFT, Anomaly Detection, Outlier Detection

1 Introduction

Braced Fourier Continuation and Regression (BFCR) is a novel and computationally efficient means of finding nonlinear regressions or trend lines in arbitrary one-dimensional data sets. The main idea behind BFCR is as follows: if one can take a given data set and obtain an accurate Fourier representation of it via the FFT, then one can apply a suitable low-pass filter to remove the high-frequency components of the data set and use an IFFT to recover the overall trend, thereby creating a trend line or nonlinear regression. However, since general data sets will almost never be periodic, and hence will almost always suffer from Gibbs phenomena in their Fourier representations, any given data set first needs to be made periodic before carrying out this process. A modified version of Fourier Continuation, Braced Fourier Continuation (BFC), accomplishes this task in BFCR and allows the overall idea to work.
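As a rough illustration of why periodicity matters, the following sketch applies a crude truncation low-pass filter to a periodic signal and to a non-periodic ramp using NumPy's FFT routines. It is not taken from the BFCR source code: the helper name `lowpass`, the cut-off of 8 modes, and the two test signals are assumptions chosen purely for illustration. The periodic signal is reproduced essentially exactly, while the ramp shows a large error near its end points.

```python
import numpy as np

def lowpass(signal, keep):
    """Zero every FFT mode with index >= keep and transform back (hypothetical helper)."""
    modes = np.fft.rfft(signal)
    modes[keep:] = 0.0
    return np.fft.irfft(modes, n=signal.size)

t = np.arange(256) / 256
periodic = np.sin(2 * np.pi * t)   # exactly one period: no jump at the ends
ramp = t.copy()                    # non-periodic: jumps from ~1 back to 0

# Largest pointwise deviation between each signal and its low-pass version
err_periodic = np.abs(lowpass(periodic, 8) - periodic).max()
err_ramp = np.abs(lowpass(ramp, 8) - ramp).max()

print(f"periodic: {err_periodic:.2e}, ramp: {err_ramp:.2e}")
```

The large deviation for the ramp is concentrated at its end points, which is the behaviour that continuation to a periodic data set is meant to avoid.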

Implementations of the algorithms discussed in this paper, written entirely in Python, are available via GitHub [Sabuda2024].

Figure 1: Illustration of the BFCR algorithm on an example data set taken from [Wyrick2022], with select portions magnified.

1.1 Background: Fourier Continuation

As a brief summary, Fourier Continuation (FC) [Albin2011, Amlani2016, Bruno2022] is a means of taking arbitrary, non-periodic data sets and extending or “continuing” them to be periodic by appending special, dataset-dependent synthetic data. Once a given non-periodic data set has been continued with FC, one can take an FFT of the now periodic continued data set and obtain an accurate Fourier representation of the data without any Gibbs phenomena at the endpoints of the original data set. The added synthetic data can then be removed, and the Fourier representation of the original data set can be used as desired.

1.2 Braced Fourier Continuation

FC, while powerful, has a problem that must be addressed before it can be used for regression. The problem stems from the process by which FC creates the synthetic data it uses to continue the input data: that process is susceptible to creating continuations whose values grow exceptionally large for data sets which are sufficiently “noisy” or non-smooth around the endpoints. An example is shown in Figure 2 below.

Figure 2: Illustration of FC on a non-smooth data set taken from [FFIEC]. Figure 2a depicts the input data, while Figure 2b illustrates the output of the FC process on this data when the FC hyper-parameter $d = 12$.

This “explosion” of the FC synthetic data is a significant issue: if the synthetic data dominates the combined data set, the contributions of the original data set in the FFT output are pushed to comparatively higher frequencies, i.e., they become the noise which the algorithm seeks to eliminate.

To solve this problem in BFCR, FC is applied not to the original data set but to precomputed and appropriately scaled “bracing” data which is known to have a smooth and bounded continuation, hence the term “Braced Fourier Continuation”. It is this continued bracing data which is then appended to the original data set in step 1 of the BFCR algorithm. An important consequence and benefit of this approach is that the synthetic data used in BFCR can be precalculated once and then scaled based on two dataset-specific scalars, much like how the matrices used in the FC process need only be calculated once for a given choice of hyper-parameters.

The BFC synthetic data creation algorithm is described in more technical detail below.

1.3 The Braced Fourier Continuation Algorithm

Given some input data $X$, the BFC algorithm is as follows:

Figure 3: Illustration of select steps of the BFC algorithm visualized on example data taken from [FFIEC]. Figure 3a depicts the input data, while Figure 3b depicts the output of steps 2-6. Figure 3c depicts steps 7-9, and Figure 3d depicts the end result of the algorithm.
  1. Select the hyper-parameters to be used in the FC process, including $d$, which will determine the number of bracing data points needed.

  2. Take the last four points of the input data $X$, $\{x_{N-3}, x_{N-2}, x_{N-1}, x_{N}\}$, and enumerate them as pairs: $\{(0, x_{N-3}), (1, x_{N-2}), (2, x_{N-1}), (3, x_{N})\}$.

  3. Find the lines of best fit, in the least squares sense, through $\{(1, x_{N-2}), (2, x_{N-1}), (3, x_{N})\}$: $L_1 = a_1 x + b_1$ (Right line of best fit 1 in Figure 3b), and through $\{(0, x_{N-3}), (1, x_{N-2}), (2, x_{N-1})\}$: $L_2 = a_2 x + b_2$ (Right line of best fit 2 in Figure 3b).

  4. Project $L_1$ forward by 1, $r_1 = a_1 \cdot 4 + b_1$, and $L_2$ forward by 2, $r_2 = a_2 \cdot 4 + b_2$.

  5. Take the average of $r_1$ and $r_2$ to arrive at the Right Scaling Point $RSP$; $RSP = (r_1 + r_2)/2$.

  6. Repeat steps 2-5 using the first four points of the input data $\{x_1, x_2, x_3, x_4\}$, enumerated as pairs as follows: $\{(0, x_4), (1, x_3), (2, x_2), (3, x_1)\}$, labelling the result the Left Scaling Point $LSP$.

  7. Take the first $d$ points of your selected bracing data ($\{S_1\}$) and scale them such that the last point of $\{S_1\}$ equals $LSP$.

  8. Take the last $d$ points of your selected bracing data ($\{S_2\}$) and scale them such that the first point of $\{S_2\}$ equals $RSP$.

  9. Append $\{S_1\}$ and $\{S_2\}$ to $X$ such that $X_{ext} = \{S_1, X, S_2\}$, as shown in Figure 3c.

  10. Apply the FC process to $X_{ext}$ to reach $X_{cont}$, as shown in Figure 3d.

In practice, this algorithm can be simplified: the FC process can be run on the selected bracing data first, and the results can then be scaled by the multipliers determined in steps 7 and 8. This means that once bracing data has been selected and the FC process has been run on it, the FC process does not need to be run again unless and until one decides to use different bracing data.
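The computation of the two scaling points in the BFC algorithm above can be sketched as follows. This is a minimal illustration, not the author's implementation: the function names `scaling_point` and `bfc_scaling_points` are hypothetical, and NumPy's `polyfit` is assumed as the least-squares line fitter. The bracing data and the FC process itself are not reproduced here.

```python
import numpy as np

def scaling_point(four_points):
    """Project two 3-point least-squares lines to index 4 and average them."""
    p = np.asarray(four_points, dtype=float)
    idx = np.arange(4.0)
    # Line L1 through (1, p1), (2, p2), (3, p3); polyfit returns [slope, intercept]
    a1, b1 = np.polyfit(idx[1:], p[1:], 1)
    # Line L2 through (0, p0), (1, p1), (2, p2)
    a2, b2 = np.polyfit(idx[:3], p[:3], 1)
    r1 = a1 * 4 + b1   # L1 projected forward by 1
    r2 = a2 * 4 + b2   # L2 projected forward by 2
    return (r1 + r2) / 2

def bfc_scaling_points(x):
    """Return (LSP, RSP) for a data set with at least 4 points."""
    x = np.asarray(x, dtype=float)
    rsp = scaling_point(x[-4:])        # last four points, in order
    lsp = scaling_point(x[:4][::-1])   # first four points, reversed
    return lsp, rsp

# On exactly linear data the scaling points continue the line past each end
lsp, rsp = bfc_scaling_points(2.0 * np.arange(10) + 1.0)
print(lsp, rsp)
```

For the linear example data, whose first and last values are 1 and 19 with slope 2, the scaling points land one step beyond each end of the line.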

Note that in all of the examples shown in this paper, the hyper-parameters used in the FC process are as follows: $d = Z = 12$, $C = 27$, $E = 0$, and $N_{over} = 20$.

2 The BFCR Algorithm

Given an input data set $X$ with $N$ points, $X = \{x_i, i = 1, 2, 3, \ldots, N\}$, the BFCR algorithm is as follows:

Figure 4: Illustration of select steps of the BFCR algorithm on example data taken from [Wyrick2022]. Figure 4a depicts step 1 of the algorithm on the example data. Figure 4b depicts step 3 and part of step 4, namely the FFT modes of the continued, zero-mean data set $Y$ both with and without a Sigma Approximation filter. Figure 4c depicts the reconstructed trend from the second part of step 4. Figure 4d depicts the end result of the BFCR algorithm on the input data set.
  1. Extend $X$ to be periodic via BFC and obtain the continued data set $X_{cont} = \{x_{cont,i}, i = 1, 2, 3, \ldots, N + C\}$ with $N + C$ points. This step is demonstrated in Figure 4a above.

  2. Calculate the mean of $X_{cont}$, $\mu$, and subtract it from each point in $X_{cont}$, so that the new data set $Y$ has zero mean; $Y = \{y_i\}$, where $y_i = x_{cont,i} - \mu, \forall i \in [1, 2, 3, \ldots, N + C]$.

  3. Take the FFT of $Y$.

  4. Use the FFT modes of $Y$ with a low-pass filter based on Sigma Approximation in an IFFT to reconstruct the data as $Y' = \{y'_i, i = 1, 2, 3, \ldots, N + C\}$. A comparison of the FFT modes from the example data, with and without the Sigma Approximation filter, is shown in Figure 4b, with the resulting reconstruction shown in Figure 4c.

  5. Remove the points associated with the synthetic data from the reconstruction; $Y' = \{y'_i, i = 1, 2, 3, \ldots, N\}$.

  6. Add back the mean subtracted in step 2 to each point in the reconstructed data set $Y'$ to arrive at $X'$; $X' = \{x'_i\}$, where $x'_i = (y'_i + \mu), \forall i \in [1, 2, 3, \ldots, N]$.

  7. Calculate the average difference between the reconstruction and the original data set, $\mu' = \left(\sum_{i=1}^{N} (x'_i - x_i)\right)/N$, and subtract it from the reconstruction to arrive at the final trend line $X_{trend} = \{x_{trend,i}\}$, where $x_{trend,i} = x'_i - \mu', \forall i \in [1, 2, 3, \ldots, N]$. The final result of this step, and hence the overall algorithm on the example data, is shown in Figure 4d.

The choice of low-pass filter used in step 4 can be varied and is up to the user, but each of the examples presented in this paper, as well as the associated source code, utilizes a filter based on Sigma Approximation with each Lanczos $\sigma$ factor raised to the $4^{th}$ power. Sigma Approximation is an advantageous choice of low-pass filter in BFCR as, by design, it greatly reduces any Gibbs phenomena arising from internal jump discontinuities which may exist in a given data set.
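Steps 2 through 7 of the algorithm above can be sketched in a few lines of NumPy, assuming the continued data set from step 1 is already available. This is a hedged illustration rather than the reference implementation. In particular, the function name `bfcr_steps_2_to_7` is hypothetical, the Lanczos factors are assumed to be $\mathrm{sinc}(k/M)$ for mode index $k$ with $M$ the highest available mode, and the original data is assumed to occupy the first $N$ entries of the continued data set, following the indexing used in step 5. The example input is a noisy periodic signal standing in for real BFC output.

```python
import numpy as np

def bfcr_steps_2_to_7(x, x_cont, power=4):
    """Trend through x given a periodic continuation x_cont (assumed precomputed)."""
    x = np.asarray(x, dtype=float)
    x_cont = np.asarray(x_cont, dtype=float)
    n = x.size
    mu = x_cont.mean()                          # step 2: mean of continued data
    modes = np.fft.rfft(x_cont - mu)            # step 3: FFT of zero-mean data
    k = np.arange(modes.size)
    # Assumed filter: Lanczos sigma factors sinc(k/M), raised to the given power.
    # np.sinc is the normalized sinc, sin(pi*x)/(pi*x).
    weights = np.sinc(k / (modes.size - 1)) ** power
    y_rec = np.fft.irfft(modes * weights, n=x_cont.size)   # step 4: filtered IFFT
    x_rec = y_rec[:n] + mu                      # steps 5 and 6: truncate, restore mean
    return x_rec - np.mean(x_rec - x)           # step 7: remove the average offset

# Stand-in for BFC output: a noisy signal that is periodic over 128 samples
rng = np.random.default_rng(0)
t = np.arange(128)
clean = np.sin(2 * np.pi * t / 128)
x_cont = clean + 0.2 * rng.standard_normal(128)
x = x_cont[:100]                                # treat the first 100 points as the data

trend = bfcr_steps_2_to_7(x, x_cont)
print(trend.shape)
```

Because of step 7, the average difference between the returned trend and the input data is zero up to rounding error.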

3 Properties of BFCR

BFCR has a number of important properties that make it stand out from other regression models/methodologies, including:

  1. BFCR can be used on general, one-dimensional data sets regardless of data volatility or behavior so long as they contain at least 4 data points. Note: It is possible to generalize BFCR to higher-dimensional data sets, but this is beyond the scope of this paper. Additionally, only examples of BFCR used on real datasets will be shown in this paper.

  2. BFCR requires no assumptions about the structure of the underlying trend within the data (e.g., linear, quadratic, etc.) in order to create a regression.

  3. Assuming all bracing and FC data has been precomputed, BFCR has a computational complexity of $O((N+C)\log(N+C))$, where $N$ is the number of points in the input data set, and $C$ is the number of points added through the BFC process.

4 Anomaly Detection with BFCR

BFCR can be used effectively for anomaly/outlier detection in general, one-dimensional data sets. Generally speaking, the larger the input data set, the more effective the algorithms below will be, with an important caveat as well as other notable potential issues and associated mitigation techniques discussed in the next section. Additionally, while the BFCR algorithm can be used on data sets with as few as 4 points, the author recommends that data sets used for anomaly detection with BFCR contain at least 6 points.

Due to the nature of the BFCR algorithm, there are two cases to consider when using BFCR for anomaly detection which are discussed below. Both cases, however, share the same fundamental idea of taking a sample (from a single data point) and comparing it to a population (derived from the rest of the data points) via population statistics with an assumed statistical distribution.

4.1 Anomaly Detection Away from Edges

Non-edge anomaly detection with BFCR (i.e., for data points which are neither the first nor last points in a given data set) is straightforward, and the algorithm is as follows:

Figure 5: Illustration of the BFCR internal anomaly detection algorithm on some example data taken from [FFIEC]. Figure 5a depicts the sample data set. Figure 5b shows the results of step 1 of the algorithm, Figure 5c shows the process and result of step 2 of the algorithm, and Figure 5d shows the end result of the algorithm when assuming a normal distribution in step 3.
  1. Given an input data set $X$ with $N$ points, calculate a trend line $Y$ through the input data set using BFCR.

  2. Find the mean and standard deviation of the absolute differences between each point in the data set and its corresponding point in the trend;

    \[ \mu = \dfrac{\sum_{i=1}^{N} |x_i - y_i|}{N}, \qquad \sigma = \sqrt{\dfrac{\sum_{i=1}^{N} (|x_i - y_i| - \mu)^2}{N}} \]

    These values, $\mu$ and $\sigma$, are the population statistics.

  3. Assume a statistical distribution for the population statistics, and see if any internal points (i.e., the samples) are outliers according to the properties of that distribution. For example, if one assumes the absolute differences will be normally distributed, then one would loop through every internal data point of interest and see if any lie more than two standard deviations away from the mean absolute difference. If any do, those points would be flagged as anomalies or outliers. All of the BFCR anomaly detection examples shown in this paper assume that the population statistics derived from BFCR trend lines follow normal distributions.
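Steps 2 and 3 above can be sketched as follows, assuming a trend line has already been computed. The function name `internal_anomalies` and the `n_sigma` parameter are hypothetical, and the all-zero trend in the example is a placeholder rather than a BFCR trend line.

```python
import numpy as np

def internal_anomalies(x, trend, n_sigma=2.0):
    """Indices of internal points whose |x - trend| is an outlier (normal assumption)."""
    d = np.abs(np.asarray(x, dtype=float) - np.asarray(trend, dtype=float))
    mu = d.mean()       # population mean of the absolute differences
    sigma = d.std()     # population standard deviation (divides by N)
    flags = np.abs(d - mu) > n_sigma * sigma
    flags[[0, -1]] = False   # the two edge points are handled by the edge algorithm
    return np.flatnonzero(flags)

# Example: small alternating deviations around a flat trend, with one spike
x = 0.1 * np.where(np.arange(30) % 2 == 0, 1.0, -1.0)
x[10] = 5.0
trend = np.zeros(30)         # placeholder trend, not produced by BFCR

flagged = internal_anomalies(x, trend)
print(flagged)
```

Only the spike at index 10 lies more than two standard deviations from the mean absolute difference, so it is the only point flagged.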

4.2 Edge Anomaly Detection

While similar, anomaly detection with BFCR for the edges of a data set differs from internal anomaly detection in important ways. The algorithm for edge anomaly detection presented below assumes one is trying to determine whether the very last point in a given data set is anomalous/an outlier.

Figure 6: Illustration of the BFCR edge anomaly detection algorithm on some example data taken from [FFIEC]. Figure 6a depicts the sample data set. Figure 6b shows the results of step 1 of the algorithm, Figure 6c shows the process and result of steps 2 and 3 of the algorithm, and Figure 6d shows the end result of the algorithm when assuming a normal distribution in step 4.
  1. Given an input data set $X$ with $N$ points, calculate two BFCR trend lines, $Y_1$ and $Y_2$; $Y_1$ is calculated using all but the very last data point in the input data set, and $Y_2$ by using the entire input data set.

  2. Using $Y_1$, find the mean and standard deviation of the absolute differences between $Y_1$ and all but the very last point in the input data;

    \[ \mu = \dfrac{\sum_{i=1}^{N-1} |x_i - y_{1,i}|}{N-1}, \qquad \sigma = \sqrt{\dfrac{\sum_{i=1}^{N-1} (|x_i - y_{1,i}| - \mu)^2}{N-1}} \]

    These values, $\mu$ and $\sigma$, are the population statistics.

  3. Using $Y_2$, measure the absolute difference between the last point in $Y_2$ and the last point of the input data;

    \[ s = |y_{2,N} - x_N| \]

    This measurement, $s$, is the sample.

  4. Assume a statistical distribution for the population statistics, and see if the sample is an outlier according to the properties of that distribution. For example, if one assumes the absolute differences will be normally distributed, then one would check to see if the sample lies more than two standard deviations away from the mean absolute difference. If it does, then that point would be flagged as an anomaly or outlier. Again, all of the BFCR anomaly detection examples shown in this paper assume that the population statistics from BFCR trend lines follow normal distributions.

To check the first point in a given data set, simply reverse the order of the data set and then use the same process presented above.
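The edge algorithm above can be sketched as a function that accepts any trend-finding routine. This is an illustration under stated assumptions: `edge_anomaly` and its `trend_fn` argument are hypothetical names, and `median_trend` is a deliberately simple stand-in (a constant line at the median) used only so the example runs without a BFCR implementation. In practice `trend_fn` would be replaced by a BFCR trend routine.

```python
import numpy as np

def edge_anomaly(x, trend_fn, n_sigma=2.0):
    """True if the last point of x is an edge outlier under a normal assumption."""
    x = np.asarray(x, dtype=float)
    y1 = trend_fn(x[:-1])            # step 1: trend without the last point
    y2 = trend_fn(x)                 # step 1: trend through all points
    d = np.abs(x[:-1] - y1)          # step 2: population of absolute differences
    mu, sigma = d.mean(), d.std()
    s = abs(y2[-1] - x[-1])          # step 3: the sample
    return bool(abs(s - mu) > n_sigma * sigma)   # step 4

def median_trend(values):
    """Stand-in trend (NOT BFCR): a flat line at the median of the data."""
    return np.full_like(values, np.median(values))

base = [1.0, 1.2, 0.8, 1.1, 0.9, 1.0, 1.1, 0.9]
spike_last = edge_anomaly(base + [5.0], median_trend)
calm_last = edge_anomaly(base + [1.05], median_trend)
print(spike_last, calm_last)
```

To test the first point instead, the same function can be called on the reversed array, e.g. `edge_anomaly(np.asarray(data)[::-1], median_trend)`.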

5 Anomaly Detection with BFCR; Potential Issues and Mitigation Techniques

5.1 Data Sets with Varying Volatility

Figure 7: Illustration of volatility regime change and the effect it has on internal anomaly detection on example data taken from [Wyrick2022]. Figure 7a depicts the results of the BFCR internal anomaly detection algorithm on a sample data set which contains significant amounts of data from two distinct volatility regimes (before and after the large spike). Figure 7b shows the results of the algorithm on a portion of the same data set but with most of the data from the first volatility regime excluded.

BFCR for anomaly detection, like any trend-based anomaly detection method, is vulnerable to false positives/negatives when the underlying volatility of the data, hereafter referred to as the volatility regime, changes within the data set. Figure 7 above depicts this situation: when significant amounts of data from the previous volatility regime (before the spike) are included when analyzing points in the data for internal anomalies (Figure 7a), more points are flagged as anomalies than if most of the data from the old regime is excluded (Figure 7b). Note: If the volatility regime change were reversed, i.e., if the data went from being more volatile to significantly less volatile, fewer points would have been identified as anomalous if significant data from the more volatile regime were included.

This issue of volatility regime change within a data set can be handled in various ways. One potential method, which is simple and efficient using BFCR, is the following:

  1. Given an input data set $X$ with $N$ points, use BFCR to draw a trend $Y$ through the data.

  2. Measure the absolute differences $D$ between each point in the input data and the trend;

    \[ D = \{|y_i - x_i|, i = 1, 2, 3, \ldots, N\} \]

  3. Cut $D$ into equal (or nearly equal) halves and calculate the standard deviation of each half ($\sigma_1$, $\sigma_2$).

  4. Divide $\sigma_1$ by $\sigma_2$ (or vice versa) and see if the result falls outside some specified percentage range (e.g., 25%), i.e., whether it violates

    \[ 0.75 \leq \dfrac{\sigma_1}{\sigma_2} \leq 1.25 \]

    If it does, remove some percentage (e.g., 20%) of the input data from its beginning and repeat the process until either the resulting data halves have similar volatility, or only some specified cut-off percentage (e.g., 50%) of the input data remains.

Once this process terminates, and assuming the input data set is not very long and does not contain more than two volatility regimes, the (likely truncated) data set will likely consist mostly of a single volatility regime.
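The truncation loop described above can be sketched as follows. Several details are assumptions made for this illustration and are not specified in the text: the function and parameter names are hypothetical, the 20% is taken from the data remaining at each pass rather than from the original data, the loop stops before a removal would leave less than the cut-off fraction of the original data, and `zero_trend` is a trivial stand-in for a BFCR trend routine.

```python
import numpy as np

def trim_to_single_regime(x, trend_fn, tol=0.25, drop_frac=0.20, min_keep=0.50):
    """Drop leading data until both halves of |trend - x| have similar spread."""
    x = np.asarray(x, dtype=float)
    n_original = x.size
    while True:
        d = np.abs(trend_fn(x) - x)               # steps 1 and 2
        half = d.size // 2
        s1, s2 = d[:half].std(), d[half:].std()   # step 3
        if s2 > 0:
            ratio = s1 / s2
        else:
            ratio = 1.0 if s1 == 0 else np.inf
        if 1 - tol <= ratio <= 1 + tol:           # step 4: volatilities are similar
            return x
        n_drop = max(1, int(drop_frac * x.size))
        if x.size - n_drop < min_keep * n_original:
            return x                              # cut-off reached, stop trimming
        x = x[n_drop:]                            # remove the oldest data and retry

def zero_trend(values):
    """Stand-in trend (NOT BFCR): a flat line at zero."""
    return np.zeros_like(values)

pattern = np.array([1.0, -3.0, 2.0, -4.0])
calm_then_volatile = np.concatenate([0.1 * np.tile(pattern, 50), np.tile(pattern, 50)])
trimmed = trim_to_single_regime(calm_then_volatile, zero_trend)
unchanged = trim_to_single_regime(np.tile(pattern, 100), zero_trend)
print(trimmed.size, unchanged.size)
```

Data with a single volatility regime passes the ratio test immediately and is returned untouched, while the two-regime example loses some of its calmer leading portion.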

5.2 Data Sets with Pre-Existing Outliers

Figure 8: Illustration of the potential effect of pre-existing internal outliers while using BFCR for edge anomaly detection on example data taken from [FFIEC]. Figures 8a and 8b depict the results of the unmodified BFCR edge anomaly detection algorithm on an example data set which contains a significant internal outlier. Figures 8c and 8d show the results of the modified algorithm, which first finds and filters out internal anomalies, on the same data set.

When using BFCR for edge anomaly detection, it is possible for data sets to contain internal outliers which, if ignored, would hamper the ability of the algorithm to detect anomalous edge data points whose distances from the trend line are still extreme compared to most points in the data set, but not as extreme as those of the internal outlier points. This situation is depicted in Figure 8 above, and can be mitigated by screening for and removing the influence of these internal outliers in the population statistics. This can be done by inserting an additional step in the BFCR edge anomaly detection algorithm after step 2, namely step 3 of the BFCR internal anomaly detection algorithm. In other words, after steps 1 and 2 of the edge anomaly detection algorithm, use the already calculated population statistics from step 2 to see if any internal points are outliers according to the assumed statistical distribution the population statistics describe (e.g., a normal distribution). If any are, recalculate the population mean and standard deviation while neglecting the contributions of those internal outliers before moving on to step 3 and proceeding as normal.
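The extra screening step described above can be sketched as a small helper that recomputes the population statistics after discarding outlying differences. The name `filtered_population_stats` and the `n_sigma` parameter are hypothetical, a normal distribution with a two standard deviation threshold is assumed, and the example differences are made up for illustration.

```python
import numpy as np

def filtered_population_stats(d, n_sigma=2.0):
    """Mean and standard deviation of d after dropping points flagged as outliers."""
    d = np.asarray(d, dtype=float)
    mu, sigma = d.mean(), d.std()                  # statistics from step 2
    keep = np.abs(d - mu) <= n_sigma * sigma       # screen for internal outliers
    return d[keep].mean(), d[keep].std()           # recomputed without them

# Absolute differences with one large internal outlier (illustrative values)
d = np.array([0.1, 0.2] * 10 + [5.0])
raw_sigma = d.std()
mu_f, sigma_f = filtered_population_stats(d)
print(raw_sigma, mu_f, sigma_f)
```

With the outlier removed, the standard deviation shrinks substantially, so a moderately extreme edge point is no longer masked by the internal outlier.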

5.3 Data Sets with Little to No Noise

While the BFCR and BFCR edge anomaly detection algorithms can be used on any one-dimensional data set, the algorithms work best on data sets in which there exists a non-trivial amount of noise. If there is little to no noise present within a given input data set, then the resulting BFCR regression(s) will likely deviate from the data mostly around the end points, hence the edge anomaly detection algorithm will be prone to false positives. This phenomenon is demonstrated in the plots in Figure 9 below. The issue stems from the likely mismatch between one’s selected bracing data and the underlying trend in the input data around the end points (especially if one is using precomputed bracing data). It is this mismatch, combined with the lack of noise, that causes whatever low-pass filter is being used in the algorithms to filter out FFT modes which correspond to the underlying trend at the end points of the input data set, thereby causing the calculated trend to deviate from the underlying data around those end points.

For edge anomaly detection, this potential issue can, in practice, largely be mitigated by applying one or both of the following pre-processing methods to the input data before running the BFCR edge anomaly detection algorithm:

  1. Specify a minimum percent change threshold for the change between the last two points in any given data set (e.g., $x_N / x_{N-1} \geq 10\%$) that must be met in order for the algorithm to be run. This will generally work because, in practice, many if not most real world data sets do not vary exponentially over time (where this mitigation technique would not necessarily work).

  2. Calculate the coefficient of variation of the differences between the last $m$ (e.g., 4) points in the input data (e.g., $\{(x_N - x_{N-1}), (x_{N-1} - x_{N-2}), (x_{N-2} - x_{N-3}), \ldots, (x_{N-m+2} - x_{N-m+1})\}$). If this coefficient is less than some chosen threshold (e.g., 0.2), then the edge of the data set has low volatility and likely does not contain an outlier, hence the edge anomaly detection algorithm is unnecessary.
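Both pre-processing checks above can be sketched as short helpers. The names are hypothetical and two interpretations are assumed for this illustration: the percent change is taken to be $|x_N - x_{N-1}| / |x_{N-1}|$, and the coefficient of variation is taken to be the population standard deviation of the differences divided by the absolute value of their mean.

```python
import numpy as np

def percent_change_ok(x, min_change=0.10):
    """True if the last point changed by at least min_change relative to the previous one."""
    x = np.asarray(x, dtype=float)
    return bool(abs(x[-1] - x[-2]) >= min_change * abs(x[-2]))

def edge_coefficient_of_variation(x, m=4):
    """Coefficient of variation of the differences between the last m points."""
    diffs = np.diff(np.asarray(x, dtype=float)[-m:])
    mean = diffs.mean()
    if mean == 0:
        return float("inf")    # undefined ratio: treat the edge as volatile
    return float(diffs.std() / abs(mean))

linear = np.arange(10.0)                   # noise-free linear data
jumpy = np.array([1.0, 2.0, 1.0, 5.0])     # volatile edge

cv_linear = edge_coefficient_of_variation(linear)
cv_jumpy = edge_coefficient_of_variation(jumpy)
small_move = percent_change_ok([100.0, 101.0])
large_move = percent_change_ok([100.0, 120.0])
print(cv_linear, cv_jumpy, small_move, large_move)
```

Under these assumptions, one would skip the edge anomaly detection algorithm when `percent_change_ok` returns `False` or when the coefficient of variation falls below the chosen threshold (e.g., 0.2), as it does for the noise-free linear data.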

Figure 9: Illustration of false positives when the BFCR edge anomaly detection algorithm is run on data sets with no noise. Figures 9a and 9b depict the algorithm being run on linear data of the form $y = x$. Figures 9c and 9d depict the algorithm being run on quadratic data of the form $y = x^2$. Figures 9e and 9f depict the algorithm being run on exponential data of the form $y = e^x$.

6 Conclusion

This paper introduced the Braced Fourier Continuation and Regression (BFCR) algorithm, a novel and computationally efficient means of finding nonlinear regressions or trend lines in arbitrary, one-dimensional data sets. The use of BFCR for efficient and flexible outlier detection, both within and at the edges of general data sets, was also introduced. Possible future work includes extension of the BFCR algorithm to higher-dimensional data sets, modification of the algorithm to handle data sets with non-uniform spacing between points, and improvement of the computational complexity of the algorithm to $O(N+C)$ from $O((N+C)\log(N+C))$, where $N$ is the number of points in the input data set and $C$ is the number of points added through the BFC process.

\printbibliography