#################################
Augmented Kalman filter algorithm
#################################

Introduction
============

The subpackage `augkf`_ implements algorithms based on an augmented state Kalman filter (AugKF) initiated by [BGA17]_.

The following figure shows the Kalman filter in its simplest form:

.. image:: ./img/basic_augkf.svg
    :width: 500px
    :align: center
    :alt: basic principle of Kalman filter

The sequential data assimilation algorithm is composed of two kinds of operations:

* **Forecasts** are performed every :math:`\Delta t_f`. An ensemble of :math:`N_e` core states is time stepped between :math:`t` and :math:`t+\Delta t_f`.
* **Analyses** are performed every :math:`\Delta t_a` with :math:`\Delta t_a=n\Delta t_f` (analyses are performed every :math:`n` forecasts). A Best Linear Unbiased Estimate (BLUE) is performed using observation data.

From a technical point of view, algorithm steps take CoreState_ objects as inputs and return the CoreState_ resulting from the operations. Forecasts and analyses are handled by the AugkfAlgo_ object that implements the full AugKF algorithm.

More information can be found in the relevant tutorials:

- `CoreState usage`_
- `Forecast and analysis steps usage`_

.. _AugkfAlgo: https://geodynamo.gricad-pages.univ-grenoble-alpes.fr/pygeodyn/pygeodyn.augkf.algo.html
.. _AugkfAnalyser: https://geodynamo.gricad-pages.univ-grenoble-alpes.fr/pygeodyn/pygeodyn.augkf.analyser.html
.. _augkf: https://geodynamo.gricad-pages.univ-grenoble-alpes.fr/pygeodyn/pygeodyn.augkf.html
.. _CoreState: https://geodynamo.gricad-pages.univ-grenoble-alpes.fr/pygeodyn/pygeodyn.corestates.html
.. _CoreState usage: https://geodynamo.gricad-pages.univ-grenoble-alpes.fr/pygeodyn/usage_corestate.html
.. _AugkfForecaster: https://geodynamo.gricad-pages.univ-grenoble-alpes.fr/pygeodyn/pygeodyn.augkf.forecaster.html
.. _Forecast and analysis steps usage: https://geodynamo.gricad-pages.univ-grenoble-alpes.fr/pygeodyn/usage_steps.html
.. _config: https://geodynamo.gricad-pages.univ-grenoble-alpes.fr/pygeodyn/configuration.html

The induction equation
======================
The basic equation coupling the components of the core state at a given time is the radial induction equation at the core surface in the spherical harmonic spectral domain:

.. math::
    :label: induction

    \dot{\mathbf{b}} = \mathrm{A}(\mathbf{b})\mathbf{u} + \mathbf{e_r}


* :math:`\mathbf{b}` (resp. :math:`\mathbf{u}` and :math:`\dot{\mathbf{b}}`) stores the Schmidt semi-normalized spherical harmonic coefficients of the magnetic field (resp. the core flow and the secular variation) up to a truncation degree :math:`L_b` (resp. :math:`L_u` and :math:`L_{sv}` ). The number of stored coefficients in those vectors are respectively :math:`N_b = L_b (L_b + 2)`, :math:`N_u = 2L_u(L_u+2)` and :math:`N _{sv} = L_{sv}(L_{sv}+2)`.
* :math:`\mathrm{A}(\mathbf{b})` is the matrix of Gaunt-Elsasser integrals [Moon79]_ of dimensions :math:`N_{sv} \times N_u` , depending on :math:`\mathbf{b}`.
* :math:`\mathbf{e_r}` stands for errors of representativeness (of dimension :math:`N_{sv}` ). This term accounts for both subgrid induction (arising due to the truncation of the fields) and magnetic diffusion.

Priors
======

The data assimilation steps require spatial cross-covariances that are derived from priors. The priors consist in a sampled series of spectral coefficients for the magnetic field MF, the core flow U and errors of representativeness ER. The error of representativeness (ER) can be computed from the core flow (U) and the main field (MF) using the script *compute_ER.py*.

The supplied priors in *pygeodyn* are the geodynamo free runs *0path*, *50path*, *71path*, *100path* codes but other types of priors can be used (more information in the section about `prior types`_).


AR(1) version
=============

Forecast and AR(1) processes
----------------------------

We wish to fit the data with a multivariate AR-1 model of the form

.. math::

    \text{d}{\bf x} +{\sf \tilde A} {\bf x}\text{d}t = \text{d}\boldsymbol{\zeta}

with :math:`\boldsymbol{\zeta}` a Wiener process and :math:`{\sf \tilde A}` is the drift matrix, of dimension :math:`P\times P` with :math:`P` the dimension of the vector :math:`{\bf x}_t={\bf x}(t)` 

:math:`\mathbf{x}` is centered (substract average) and stands for the core flow U or the subgrid error ER or the augmented state Z ([U, ER])

Once discretized with an Euler-Maruyama scheme, it gives:

.. math::
    :label: Euler-Maruyama

    {\bf x}_{i+1} = -{\bf x_i} {\sf A} + {\sf D}

with: :math:`{\sf A} = {\Delta t \sf \tilde A} - {\bf I}` and :math:`{\sf D} = \sqrt{\Delta t} \mathbf{L} \mathbf{w_i}`, with :math:`\mathbf{w_i}` a normal draw (:math:`\mathbf{w_i} = \mathcal{N}(0,1)`)


Thus, we seek :math:`\mathbf{L}` and :math:`{\sf \tilde A}`.


Diagonal AR(1)
^^^^^^^^^^^^^^

In the case where the geodynamo series used as priors do not allow to derive meaningful temporal statistics, the two drift matrices are simply diagonal, and controlled by a single free parameter tunable in the configuration file (:math:`\tau_x` for :math:`\mathbf{x}`, U and ER only):

.. math::

    \mathrm{\tilde A} = \frac{1}{\tau_{x}}\mathrm{I}_{x}

with :math:`\mathrm{I}_{x}` the identity matrix of rank :math:`N_{x}`.

The covariance matrix of :math:`{\bf x}` writes: :math:`\mathrm{P}_{xx}= \mathbb{E}\left(\mathbf{x}\mathbf{x}^T\right)`

:math:`{\sf L}` is the lower Cholesky matrix of :math:`\frac{\sqrt 2}{Tau_x} \mathrm{P}_{xx}`

Dense AR(1)
^^^^^^^^^^^

We only treat the case of a single geodynamo prior. To know more about the mixing of priors (71path with 100path) please refer to the scientific paper ...

The filtered-version :math:`\hat{\bf x}(t)` of the geodynamo series :math:`{\bf x}(t)` relate through the smoothing window :math:`\omega`:

.. math::

    \hat{\bf x}(t) =\int w(\tau){\bf x}(t+\tau)d\tau

We emphasize on the difference between:

- :math:`\Delta t` the time step of the geodynamo series
- :math:`\Delta s` the time duration of the smoothing window :math:`\omega` (configuration parameter ``dt_smoothing``)

We write :math:`N_\tau = \frac{\Delta s}{\Delta t}` the size of :math:`\omega` and :math:`M_\tau = \sum_{j=-N_\tau}^{N_\tau}w_j^2`, the :math:`L_2` norm of :math:`\omega`

The model is discretized as:

.. math::

    \hat{\bf x}_{t_i} =\sum_{j=-N_\tau}^{N_\tau} w_j {\bf x}_{t_{i+j}}


We now note:

.. math::

    \hat{\bf X}_i = \hat{\bf x}_i

.. math::

    \Delta \hat{\bf X}_i = \hat{\bf{x}}_{i+1} - \hat{\bf{x}}_i

The drift matrix :math:`\sf \tilde A` is computed as:

.. math::

    {\sf  \tilde A} = - \frac{1}{\Delta t}\left[\frac{\Delta\hat{\bf X}\hat{\bf X}^{T}}{M_\tau N_\tau} \right] \left[\frac{\hat{\bf X}\hat{\bf X}^{T}}{{M_\tau N_\tau}} \right]^{-1}

Then we deduce :math:`\hat{\bf W}`:

.. math::

    \hat{\bf W} = \left(\Delta \hat{\bf X} + \Delta t {\sf \tilde A} \hat{\bf X}\right)/\sqrt{\Delta t}

And finally compute :math:`{\sf S}`:

.. math::

    {\sf S} = \displaystyle\frac{1}{\displaystyle N_s} 
    \left[   \frac{\hat{\bf W}  \hat{\bf W}^{T}}{{M_\tau N_\tau}} \right]

with, :math:`N_s = \frac{N_t}{N_\tau}` the number of independant data in the smoothed geodynamo series

:math:`\mathbf{L}` is the lower Cholesky matrix of :math:`{\sf S}`

Complete forecast step
^^^^^^^^^^^^^^^^^^^^^^

The first step of the forecast is to compute :math:`\mathbf{u}(t+\Delta t_f)` and :math:`\mathbf{e_r}(t+\Delta t_f)` using Eqs. :eq:`Euler-Maruyama`.

If U ER are independant:

:math:`\mathbf{u}'` and :math:`\mathbf{e}'` are centered (average zero)

.. math::

    \mathbf{u}'(t+\Delta t_f) = -{\mathbf{u}'(t)} {\sf A_u} + {\sf D_u}

.. math::

    \mathbf{e}'(t+\Delta t_f) = -{\mathbf{e}'(t)} {\sf A_e} + {\sf D_e}

Otherwise we consider the augmented state Z ([U, ER]):

:math:`\mathbf{z}'` is centered (average zero)

.. math::

     \mathbf{z}'(t+\Delta t_f) = -{\mathbf{z}'(t)} {\sf A_z} + {\sf D_z}

Then, the vector :math:`\dot{\mathbf{b}}(t+\Delta t_f)` is evaluated thanks to Eq. :eq:`induction` by using :math:`\mathbf{u}(t+\Delta t_f)`, :math:`\mathbf{e_r}(t+\Delta t_f)` and :math:`\mathbf{b}(t)`.

.. math::

    \dot{\mathbf{b}}(t+\Delta t_f) = \mathrm{A}(\mathbf{b}(t))\mathbf{u}(t+\Delta t_f) + \mathbf{e_r}(t+\Delta t_f)

where, :math:`\mathrm{A}` is the matrix of Gaunt-Elsasser integrals [Moon79]_

Finally, the vector :math:`\mathbf{b}(t+\Delta t_f)` is evaluated by finite difference.

.. math::

	\mathbf{b}(t+\Delta t_f) = \mathbf{b}(t) + \Delta t_f \left[\dot{\mathbf{b}}(t+\Delta t_f)\right]

yielding the complete CoreState at :math:`t+\Delta t_f`.

Analysis
--------

The analysis step takes as input the ensemble of forecast core states  :math:`\mathbf{X}^f(t_a)`, the geodynamo statistics, plus MF and SV observations at :math:`t=t_a` together with their uncertainties. It is performed in two steps:

* First, a best linear unbiased estimate (BLUE) of an ensemble of realisations of :math:`\mathbf{b}` is performed from magnetic field observations :math:`\mathbf{b}^o(t)` and the ensemble of forecasts :math:`\mathbf{b}^f(t)`. The two are linked through an observation operator :math:`\mathrm{H}_{b}` that relates :math:`\mathbf{b}^f(t)` to :math:`\mathbf{b}^o(t)` (:math:`\mathbf{b}^o(t) = \mathrm{H}_{b}\mathbf{b}^f(t)`). The Kalman gain matrix is then given by

.. math::

    \mathrm{K}_{bb} = \mathrm{P}^f_{bb}\mathrm{H}_b^T\left(\mathrm{H}_b \mathrm{P}^f_{bb} \mathrm{H}_b^T + \mathrm{R}_{bb}\right)^{-1}

where :math:`\mathrm{P}^f_{bb}` is the correlation matrix of the forecasted magnetic field and :math:`\mathrm{R}_{bb}` the MF observation error matrix. We use the glasso algorithm to approximate :math:`\mathrm{P}^f_{bb}` [IGF23]_ (see ``remove_spurious`` in config_).

The analysed magnetic field vector :math:`\mathbf{b}^a(t_a)` comes from the BLUE:

.. math::

    \mathbf{b}^a(t_a) = \mathbf{b}^f(t_a) + \mathrm{K}_{bb}\left(\mathbf{b}^o(t_a) - \mathrm{H}_b\mathbf{b}^f(t_a)\right)

* Second, a BLUE of an ensemble of realisations of the augmented state :math:`\mathbf{z}`, that is a concatenation of :math:`\mathbf{u}` and :math:`\mathbf{e}_r` (:math:`\mathbf{z} = [\mathbf{u}^T\mathbf{e}_r^T]^T`), using the observations of the SV.

The observation operator :math:`\mathrm{H}_z` relates :math:`\mathbf{z}^f(t)` to :math:`\mathbf{\dot{b}}^o(t)`. It is composed of the observation operator for SV :math:`\mathrm{H}_{sv}` that relates :math:`\mathbf{\dot{b}}^f(t)` to :math:`\mathbf{\dot{b}}^o(t)` and of another matrix deriving from the induction equation  :eq:`induction` at :math:`t_a`:

.. math::

    \mathbf{\dot{b}^f}(t_a) = \mathrm{A}(\mathbf{b}^a(t_a))\mathbf{u}^f(t_a) + \mathbf{e_r}^f(t_a).

We have then :math:`\mathrm{H}_z(t_a) = \mathrm{H}_{sv}[\mathrm{A}(\mathbf{b}^a(t_a))\mathrm{I}_e]` where :math:`\mathrm{I}_e` is the identity of rank :math:`N_e=N_{sv}`.

The Kalman gain matrix is given by

.. math::

    \mathrm{K}_{zz} = \mathrm{P}^f_{zz}\mathrm{H}_z^T\left(\mathrm{H}_z \mathrm{P}^f_{zz} \mathrm{H}_z^T + \mathrm{R}_{\dot{b}\dot{b}}\right)^{-1}

where :math:`\mathrm{P}^f_{zz}` is the correlation matrix of the forecasted augmented state and :math:`\mathrm{R}_{\dot{b}\dot{b}}` the SV observation error matrix. We use the glasso algorithm to approximate :math:`\mathrm{P}^f_{zz}` [IGF23]_ (see ``remove_spurious`` in config_).

The analysed core flow and subgrid errors are then given by the BLUE

.. math::

    \mathbf{z}^a(t_a) = \mathbf{z}^f(t_a) + \mathrm{K}_{zz}\left(\mathbf{\dot{b}}^o(t_a) - \mathrm{H}_z\mathbf{z}^f(t_a)\right)

Having now :math:`\mathbf{b}^a(t_a)`, :math:`\mathbf{u}^a(t_a)`, :math:`\mathbf{e_r}^a(t_a)`, all is left is to derive :math:`\mathbf{\dot{b}}^a(t_a)` from the induction equation :eq:`induction`:

.. math::

    \mathbf{\dot{b}}^a(t_a) = \mathrm{A}(\mathbf{b}^a(t_a))\mathbf{u}^a(t_a) + \mathbf{e_r}^a(t_a).


Complete analysis step
^^^^^^^^^^^^^^^^^^^^^^

The complete analysis step is therefore:

.. math::

    \mathbf{b}^a(t_a) &= \mathbf{b}^f(t_a) + \mathrm{K}_{bb}\left(\mathbf{b}^o(t_a) - \mathrm{H}_b\mathbf{b}^f(t_a)\right)

    \mathbf{z}^a(t_a) &= \mathbf{z}^f(t_a) + \mathrm{K}_{zz}\left(\mathbf{\dot{b}}^o(t_a) - \mathrm{H}_z\mathbf{z}^f(t_a)\right)

    \mathbf{\dot{b}}^a(t_a) &= \mathrm{A}(\mathbf{b}^a(t_a))\mathbf{u}^a(t_a) + \mathbf{e_r}^a(t_a)

yielding the complete CoreState at :math:`t_a`.

AR(3) version
=============

Forecast and AR(3) processes
----------------------------

We wish to fit the data with a multivariate AR-3 model of the form

.. math::

    \text{d}{\bf x}''(t) + {\sf \tilde A} \text{d}{\bf x}'(t) + {\sf \tilde B} \text{d}{\bf x}(t) + {\sf \tilde C}{\bf x}(t) = {\sf  \tilde D} \text{d}\boldsymbol{\zeta}


with :math:`\boldsymbol{\zeta}` a Wiener process and :math:`{\sf \tilde A}`, :math:`{\sf \tilde B}`, :math:`{\sf  \tilde C}` are drift matrices, of dimension :math:`P\times P` with :math:`P` the dimension of the vector :math:`{\bf x}_t={\bf x}(t)` 

:math:`\mathbf{x}` is centered (substract average) and stands for the core flow U or the subgrid error ER or the augmented state Z ([U, ER])

We can write the AR(3) process as an AR(1) process:

.. math::

    \text{d}{\bf z}(t) = - \begin{bmatrix} 0&-I&0\\ 0&0&-I\\ {\sf \tilde C}&{\sf \tilde B}&{\sf \tilde A} \end{bmatrix} {\bf z}(t) \text{d}t + \begin{bmatrix} 0&0&0\\ 0&0&0\\ 0&0&{\sf \tilde D} \end{bmatrix} \begin{bmatrix} {\bf 0}\\ {\bf 0}\\ {\sf \tilde D} \end{bmatrix}

with, :math:`{\bf z}(t) = \begin{bmatrix} {\bf x}(t)\\ {\bf x}'(t)\\ {\bf x}''(t) \end{bmatrix}`

A first order differenciation gives

.. math:: 
    
    {\bf z}_i = \begin{bmatrix} {\bf x}_i\\ \frac{{\bf x}_{i+1} - {\bf x}_i}{\Delta t}\\ \frac{{\bf x}_{i+2} - 2 {\bf x}_{i+1} + {\bf x}_i}{\Delta t^2} \end{bmatrix}, {\Delta}{\bf z}_i = \begin{bmatrix} {\bf x}_{i+1} - {\bf x}_i\\ \frac{{\bf x}_{i+2} - 2 {\bf x}_{i+1} + {\bf x}_i}{\Delta t}\\ \frac{{\bf x}_{i+3} - 3 {\bf x}_{i+2} + 3{\bf x}_{i+1} - {\bf x}_i}{\Delta t^2} \end{bmatrix}

Once discretized with an Euler-Maruyama scheme, it gives:

.. math::
    :label: Euler-Maruyama

    {\bf x}_{i+3} = - {\bf x_{i+2}} {\sf A} - {\bf x_{i+1}} {\sf B} - {\bf x_i} {\sf C} + {\sf D}

with: 

:math:`{\sf A} = {\Delta t \sf \tilde A} - 3 {\bf I}`
:math:`{\sf B} = {\Delta t^2 \sf \tilde B} - 2 {\Delta t \sf \tilde A} + 3 {\bf I}`
:math:`{\sf C} = {\Delta t^3 \sf \tilde C} - {\Delta t^2 \sf \tilde B} + {\Delta t \sf \tilde A} - {\bf I}`
:math:`{\sf D} = \Delta t^{5/2} \mathbf{L} \mathbf{w_i}`, with :math:`\mathbf{w_i}` a normal draw (:math:`\mathbf{w_i} = \mathcal{N}(0,1)`)


Thus, we seek :math:`\mathbf{L}`, :math:`{\sf \tilde A}`, :math:`{\sf \tilde B}` and :math:`{\sf \tilde C}`.

The computation of these matrices is similar to the Dense AR(1) process.

Complete forecast step
^^^^^^^^^^^^^^^^^^^^^^

The first step of the forecast is to compute :math:`\mathbf{u}(t+\Delta t_f)` and :math:`\mathbf{e_r}(t+\Delta t_f)` using Eqs. :eq:`Euler-Maruyama`.

If U ER are independant:

:math:`\mathbf{u}'` and :math:`\mathbf{e}'` are centered (average zero)

.. math::

    \mathbf{u}'(t+\Delta t_f) = -{\mathbf{u}'(t)} {\sf A_u} -{\mathbf{u}'(t-\Delta t_f)} {\sf B_u} -{\mathbf{u}'(t- 2 \Delta t_f)} {\sf C_u} + {\sf D_u}

.. math::

    \mathbf{e}'(t+\Delta t_f) = -{\mathbf{e}'(t)} {\sf A_e} -{\mathbf{e}'(t-\Delta t_f)} {\sf B_e} -{\mathbf{e}'(t- 2 \Delta t_f)} {\sf C_e} + {\sf D_e}

Otherwise we consider the augmented state Z ([U, ER]):

:math:`\mathbf{z}'` is centered (average zero)

.. math::

     \mathbf{z}'(t+\Delta t_f) = -{\mathbf{z}'(t)} {\sf A_z} -{\mathbf{z}'(t-\Delta t_f)} {\sf B_z} -{\mathbf{z}'(t- 2 \Delta t_f)} {\sf C_z} + {\sf D_z}

Then, the vector :math:`\dot{\mathbf{b}}(t+\Delta t_f)` is evaluated thanks to Eq. :eq:`induction` by using :math:`\mathbf{u}(t+\Delta t_f)`, :math:`\mathbf{e_r}(t+\Delta t_f)` and :math:`\mathbf{b}(t)`.

.. math::

    \dot{\mathbf{b}}(t+\Delta t_f) = \mathrm{A}(\mathbf{b}(t))\mathbf{u}(t+\Delta t_f) + \mathbf{e_r}(t+\Delta t_f)

where, :math:`\mathrm{A}` is the matrix of Gaunt-Elsasser integrals [Moon79]_

Finally, the vector :math:`\mathbf{b}(t+\Delta t_f)` is evaluated by finite difference.

.. math::

	\mathbf{b}(t+\Delta t_f) = \mathbf{b}(t) + \Delta t_f \left[\dot{\mathbf{b}}(t+\Delta t_f)\right]

yielding the complete CoreState at :math:`t+\Delta t_f`.

Analysis
--------
In the following, we note :math:`ta^- = ta - dta` and :math:`ta^+ = ta + dta`

The analysis step takes as input the ensemble of forecast core states  :math:`\mathbf{X}^f` between :math:`t = ta^-` and :math:`t = ta^+`, the geodynamo statistics, plus MF and SV observations at :math:`t = ta^-`, :math:`t=t_a` and :math:`t = ta^+` together with their uncertainties.

It is performed in two steps:

* MF analysis

First, we perform a linear regression to fit every realisation of the forecast core states from :math:`t=ta^-` to :math:`t=ta^+` using a 2nd order Taylor expansion.

We obtain an ensemble :math:`\mathbf{B}_{t^{a}}^{f}=\left[\mathbf{b}_{t^{a}}^{f }, \mathbf{b}_{t^{a}}^{f^{\prime}}, \mathbf{b}_{t^{a}}^{f^{\prime \prime}}\right]`   

We write: :math:`\mathbf{Y}_{t^{a}}^{M F}=\mathrm{H}_{B} \mathbf{B}_{t^{a}}+\mathbf{E}_{t^{a}}^{M F}`

where :math:`\mathbf{E}_{t^{a}}^{M F}=\left[\mathbf{e}_{t^{a-}}^{M F T}, \mathbf{e}_{t^{a}}^{M F T}, \mathbf{e}_{t^{a+}}^{M F T}\right]^{T}, \mathbf{Y}_{t^{a}}^{M F}=\left[\mathbf{y}_{t^{a-}}^{M F T}, \mathbf{y}_{t^{a}}^{M F T}, \mathbf{y}_{t^{a+}}^{M F T}\right]^{T}`, and

:math:`\mathrm{H}_{B}=\left[\begin{array}{ccc} \mathrm{H}_{b} & -\delta t^{a} \mathrm{H}_{b} & \frac{1}{2} \delta t^{a 2} \mathrm{H}_{b} \\ \mathrm{H}_{b} & 0 & 0 \\ \mathrm{H}_{b} & +\delta t^{a} \mathrm{H}_{b} & \frac{1}{2} \delta t^{a 2} \mathrm{H}_{b} \end{array}\right]`

From an ensemble of :math:`\mathbf{B}_{t^{a}}^{f}` we compute empirically the model state forecast covariance matrix :math:`\mathrm{P}_{B B}^{f}\left(t^{a}\right)=\mathbb{E}\left(\delta \mathbf{B}_{t^{a}}^{f} \delta \mathbf{B}_{t^{a}}^{f T}\right)`.
We use the glasso algorithm to approximate :math:`\mathrm{P}_{B B}^{f}\left(t^{a}\right)` [IGF23]_ (see ``remove_spurious`` in config_).

The Kalman gain matrix writes:

.. math::

    \mathrm{K}_{B}=\mathrm{P}_{B B}^{f} \mathrm{H}_{B}^{T}\left[\mathrm{H}_{B} \mathrm{P}_{B B}^{f} \mathrm{H}_{B}^{T}+\tilde{\mathrm{R}}^{M F}\right]^{-1}

to be used for the analysis step for each ensemble member:

.. math::

    \mathbf{B}_{t^{a}}^{a}=\mathbf{B}_{t^{a}}^{f}+\mathrm{K}_{B}\left(\mathbf{Y}_{t^{a}}^{M F}-\mathrm{H}_{B} \mathbf{B}_{t^{a}}^{f}\right)

The analyzed MF at the three epochs can then be deduced from :math:`\mathbf{B}_{t^{a}}` using a Taylor expansion:

.. math::

    \left[\begin{array}{l}
    \mathbf{b}_{t^{a-}}^{a} \\
    \mathbf{b}_{t^{a}}^{a} \\
    \mathbf{b}_{t^{a+}}^{a}
    \end{array}\right]=\left[\begin{array}{ccc}
    \mathbf{I}_{b} & -\delta t^{a} \mathbf{I}_{b} & \frac{1}{2} \delta t^{a 2} \mathbf{I}_{b} \\
    \mathbf{I}_{b} & 0 & 0 \\
    \mathbf{I}_{b} & +\delta t^{a} \mathbf{I}_{b} & \frac{1}{2} \delta t^{a 2} \mathbf{I}_{b}
    \end{array}\right] \mathbf{B}_{t^{a}}^{a}

:math:`\mathrm{I}_{b}` is the identity matrix of the size of the vector :math:`\mathbf{b}`.

* SV analysis

First, we perform a linear regression to fit every realisation of the forecast core states from :math:`t=ta^-` to :math:`t=ta^+` using a 2nd order Taylor expansion.

We obtain an ensemble :math:`\mathbf{Z}_{t^{a}}^{f}=\left[\mathbf{z}_{t^{a}}^{f }, \mathbf{z}_{t^{a}}^{f^{\prime}}, \mathbf{z}_{t^{a}}^{f^{\prime \prime}}\right]`   


We write: :math:`\Delta \mathbf{Y}_{t^{a}}^{S V}=\mathrm{H}_{Z}\left(\mathbf{B}_{t^{a}}\right) \mathbf{Z}_{t^{a}}+\mathbf{E}_{t^{a}}^{S V}`
where :math:`\Delta \mathbf{y}_{t}^{S V}` is a SV observation vector at epoch $t$ once corrected from the SV prediction from the average dynamo state.

where :math:`\mathbf{E}_{t^{a}}^{S V}=\left[\mathbf{e}_{t^{a-}}^{S V T}, \mathbf{e}_{t^{a}}^{S V T}, \mathbf{e}_{t^{a+}}^{S V T}\right]^{T}, \Delta \mathbf{Y}_{t^{a}}^{S V}=\left[\Delta \mathbf{y}_{t^{a-}}^{S V T}, \Delta \mathbf{y}_{t^{a}}^{S V T}, \Delta \mathbf{y}_{t^{a+}}^{S V T}\right]^{T}`, and

.. math::

    \mathrm{H}_{Z}=\left[\begin{array}{ccc}
    \mathrm{H}_{z}\left(\mathbf{b}_{t^{a-}}^{a}\right) & -\delta t^{a} \mathrm{H}_{z}\left(\mathbf{b}_{t^{a-}}^{a}\right) & \frac{1}{2} \delta t^{a 2} \mathrm{H}_{z}\left(\mathbf{b}_{t^{a-}}^{a}\right) \\
    \mathrm{H}_{z}\left(\mathbf{b}_{t^{a}}^{a}\right) & 0 & 0 \\
    \mathrm{H}_{z}\left(\mathbf{b}_{t^{a+}}^{a}\right) & +\delta t^{a} \mathrm{H}_{z}\left(\mathbf{b}_{t^{a+}}^{a}\right) & \frac{1}{2} \delta t^{a} \mathrm{H}_{z}\left(\mathbf{b}_{t^{a+}}^{a}\right)
    \end{array}\right]

with,

.. math::

    \mathbf{H}_{z}\left(\mathbf{b}_{t}^{a}\right)=\left[\begin{array}{ll}
    \mathbf{H}_{b} \tilde{\mathrm{A}}\left(\mathbf{b}_{t}^{a}\right) & \mathbf{H}_{b}
    \end{array}\right],

where, :math:`\mathrm{A}` is the matrix of Gaunt-Elsasser integrals [Moon79]_

From an ensemble of :math:`\mathbf{Z}_{t^{a}}^{f}` we compute empirically the model state forecast covariance matrix :math:`\mathrm{P}_{Z Z}^{f}\left(t^{a}\right)=\mathbb{E}\left(\delta \mathbf{Z}_{t^{a}}^{f} \delta \mathbf{Z}_{t^{a}}^{f T}\right)`.
We use the glasso algorithm to approximate :math:`\mathrm{P}_{Z Z}^{f}\left(t^{a}\right)` [IGF23]_ (see ``remove_spurious`` in config_).

The Kalman gain matrix writes:

.. math::

    \mathrm{K}_{Z}=\mathrm{P}_{Z Z}^{f} \mathrm{H}_{Z}^{T}\left[\mathrm{H}_{Z} \mathrm{P}_{Z Z}^{f} \mathrm{H}_{Z}^{T}+\tilde{\mathrm{R}}^{S V}\right]^{-1}

used for the analysis step for each ensemble member:

.. math::

    \mathbf{Z}_{t^{a}}^{a}=\mathbf{Z}_{t^{a}}^{f}+\mathrm{K}_{Z}\left(\Delta \mathbf{Y}_{t^{a}}^{S V}-\mathrm{H}_{Z} \mathbf{Z}_{t^{a}}^{f}\right)


Complete analysis step
^^^^^^^^^^^^^^^^^^^^^^

The complete analysis step is therefore:

.. math::

    \mathbf{B}_{t^{a}}^{a} &= \mathbf{B}_{t^{a}}^{f}+\mathrm{K}_{B}\left(\mathbf{Y}_{t^{a}}^{M F}-\mathrm{H}_{B} \mathbf{B}_{t^{a}}^{f}\right)
    
    \mathbf{Z}_{t^{a}}^{a} &= \mathbf{Z}_{t^{a}}^{f}+\mathrm{K}_{Z}\left(\Delta \mathbf{Y}_{t^{a}}^{S V}-\mathrm{H}_{Z} \mathbf{Z}_{t^{a}}^{f}\right)

    \mathbf{\dot{b}}_{t^{a}}^{a} &= \mathrm{A}(\mathbf{b}_{t^{a}}^{a})\mathbf{u}_{t^{a}}^{a} + \mathbf{e_r}_{t^{a}}^{a}

where, :math:`\mathrm{A}` is the matrix of Gaunt-Elsasser integrals [Moon79]_

yielding the complete CoreState at :math:`t_a`.


Restart of a forecast for an AR-3 model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Using an AR model of order higher than 3 requires to know, at each initialization of the forecast, the model state at 3 successive epochs.

During the analysis we have computed: :math:`\mathbf{Z}_{t^{a}}^{a}=\left[\mathbf{z}_{t^{a}}^{a}, \mathbf{z}_{t^{a}}^{a^{\prime}}, \mathbf{z}_{t^{a}}^{a^{\prime \prime}}\right]`

So we restart from a Taylor expansion:

.. math::

    \left\{\begin{array}{rl}
    \mathbf{z}_{t^{a}} &= \mathbf{z}_{t^{a}}^{a}\\
    \mathbf{z}_{t^{a}-\Delta t^{f}} & =\mathbf{z}_{t^{a}}^{a}-\Delta t^{f} \mathbf{z}_{t^{a}}^{a^{\prime}}+\frac{(\Delta t^{f})^{2}}{2}\mathbf{z}_{t^{a}}^{a^{\prime \prime}}\\
    \mathbf{z}_{t^{a}-2 \Delta t^{f}} & =\mathbf{z}_{t^{a}}^{a}-2 \Delta t^{f} \mathbf{z}_{t^{a}}^{a^{\prime}}+\frac{(2 \Delta t^{f})^{2}}{2}\mathbf{z}_{t^{a}}^{a^{\prime \prime}}
    \end{array} .\right.


References
==========

.. [BGA17] Barrois, O., Gillet, N. & Aubert, J. Contributions to the geomagnetic secular variation from a reanalysis of core surface dynamics, *Geophysical Journal International*, 211, 50–68, 2017. doi:10.1093/gji/ggx280
.. [Moon79] Moon, W. Numerical evaluation of geomagnetic dynamo integrals (Elsasser and Adams-Gaunt integrals), *Computer Physics Communications*, 16, 267–271, 1979. doi:10.1016/0010-4655(79)90092-4

For more details on the above steps, we refer to [BGA17]_ and to:

.. [BHF18] Barrois, O., Hammer, M. D., Finlay, C. C., Martin, Y. & Gillet, N. Assimilation of ground and satellite magnetic measurements: inference of core surface magnetic and velocity field changes, *Geophysical Journal International*, 2018. doi:10.1093/gji/ggy297
.. [BGA19] Barrois, O. et al. Erratum: ‘Contributions to the geomagnetic secular variation from a reanalysis of core surface dynamics’ and ‘Assimilation of ground and satellite magnetic measurements: inference of core surface magnetic and velocity field changes’, *Geophysical Journal International*, 216, 2106–2113, 2019. doi:10.1093/gji/ggy471
.. [GHA19] Gillet, N., Huder, L. & Aubert, J. A reduced stochastic model of core surface dynamics based on geodynamo simulations, *Geophysical Journal International*, 219, 522–539, 2019. doi:10.1093/gji/ggz313
.. [IGF23] Istas, M., Gillet, N., Finlay, C. C., Hammer, M. D., & Huder, L. Transient core surface dynamics from ground and satellite geomagnetic data, *Geophysical Journal International*, 233, 1890–1915, 2023. doi:10.1093/gji/ggad039