271124 commit (71b5f6fd) · Commits · Stefano Covino / TimeDomainAstrophysics

Lectures/Lecture - Non Parametric Analysis/Buhlmann (2002) - Bootstraps for Time Series.pdf

0 → 100644

+2.62 MiB

File added.

Lectures/Lecture - Non Parametric Analysis/Lecture-NonParametricAnalysis.ipynb

+11 −3

Original line number	Diff line number	Diff line
		@@ -352,7 +352,7 @@
		"## String Length Method\n",
		"***\n",
		"\n",
		"This algorithm was proposed originally in [Lafler & Kinman (1965](https://ui.adsabs.harvard.edu/abs/1965ApJS...11..216L/abstract) and than discussed in [Dworetsky (1983)](https://ui.adsabs.harvard.edu/abs/1983MNRAS.203..917D/abstract) and extended in [Clarke (2002)](http://localhost:8888/notebooks/Lecture-NonParametricAnalysis.ipynb#:~:text=Clarke%20(2002)%20%2D%20%22String/Rope%20length%20methods%20using%20the%20Lafler%2DKinman%20statistic%22). "
		"This algorithm was proposed originally in [Lafler & Kinman (1965](https://ui.adsabs.harvard.edu/abs/1965ApJS...11..216L/abstract)) and than discussed in [Dworetsky (1983)](https://ui.adsabs.harvard.edu/abs/1983MNRAS.203..917D/abstract) and extended in [Clarke (2002)](http://localhost:8888/notebooks/Lecture-NonParametricAnalysis.ipynb#:~:text=Clarke%20(2002)%20%2D%20%22String/Rope%20length%20methods%20using%20the%20Lafler%2DKinman%20statistic%22). "
		]
		},
		{
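For readers of this hunk, the string-length idea can be illustrated with a minimal sketch. It assumes one common form of the Lafler-Kinman statistic (sum of squared differences between phase-adjacent points, normalized by the total scatter); the function name `lafler_kinman`, the synthetic light curve and the period grid are hypothetical and not taken from the notebook.

```python
import numpy as np

def lafler_kinman(t, y, period):
    """String-length statistic for one trial period (smaller = smoother folded curve)."""
    phase = np.mod(t / period, 1.0)           # fold the times on the trial period
    y_sorted = y[np.argsort(phase)]           # order the measurements by phase
    # Differences between phase-adjacent points, wrapping the last point to the first
    diffs = np.diff(np.append(y_sorted, y_sorted[0]))
    return np.sum(diffs**2) / np.sum((y - y.mean())**2)

# Synthetic, irregularly sampled sinusoid (assumed test data, true period 3.7)
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 100.0, 200))
y = np.sin(2 * np.pi * t / 3.7) + 0.05 * rng.normal(size=t.size)

periods = np.linspace(2.0, 6.0, 2001)
theta = np.array([lafler_kinman(t, y, p) for p in periods])
best_period = periods[np.argmin(theta)]       # the minimum marks the candidate period
```

The statistic is evaluated directly on the folded data, so no binning is needed; the period grid must be fine enough that the narrow minimum is not missed.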
		@@ -519,7 +519,7 @@
		"## Generalized Correlation Function: Correntropy\n",
		"****\n",
		"\n",
 Recently, non-parametric">
		"> Recently, non-parametric methods have attracted considerable attention and novel approaches, based on ideas and algorithms developed for machine-learning in a big-data scenario have been developed.\n",
		"> Recently, non-parametric methods have attracted considerable attention and novel approaches, based on ideas and algorithms developed for machine-learning in a big-data scenario, have been developed.\n",
		"\n",
		"\n",
		"- [Huijse et al. (2012)](https://ui.adsabs.harvard.edu/abs/2012ITSP...60.5135H/abstract) discussed a methodology based on information theoretic (IT) based criteria.\n",
		@@ -564,6 +564,12 @@
		"- Kernels can also be viewed as covariance functions for correlated observations at different points of the input domain (see lectures about Gaussian Processes). \n",
		" - A kernel can be any semi-definite positive function. And it is possible to develop periodic kernels.\n",
		" \n",
		"- For instance, one possibility is:\n",
		"\n",
		"$$ G_{\\sigma;P}(z-y) = \\frac{1}{\\sqrt{2\\pi}\\sigma} \\exp \\left ( - \\frac{\\sin^2 \\left( \\frac{\\pi}{P} (z - y) \\right)}{0.5\\sigma^2}\\right ) $$\n",
		"\n",
		"- where now the kernel depends explictly on the period.\n",
		"\n",
		"- This brings to a metric combining the correntropy with a periodic kernel to measure similarity among samples separated by a given period. \n",
		"\n",
		"- This algorithm is known as Correntropy Kernelized Periodogram, and does not require any resampling, slotting or folding scheme, as it is computed directly from the available samples. \n",
		@@ -598,6 +604,8 @@
		"\n",
		"- Technically speaking, the MI is the divergence (i.e. statistical distance) between the joint PDF of the RVs and the product of their marginal PDFs.\n",
		"\n",
		"- Shannon's MI for continuous RVs $X$ and $Y$ with joint PDF $f_{X, Y}(\\cdot, \\cdot)$ is defined as:\n",
		"\n",
		"$$ \\text{MI}_S(X, Y) = D_{KL}(f_{X,Y} \|\| f_X f_Y) = \\iint f_{X,Y} \\log f_{X,Y} \\,dx \\,dy - \\int f_{X} \\log f_X \\,dx - \\int f_{Y} \\log f_Y \\,dy $$\n",
		"\n",
		"where $D_{KL}(\\cdot \|\| \\cdot)$ is the Kullback-Leibler divergence and $f_X (x)= \\int f_{X,Y} (x,y)\\,dy$, $f_Y (y) = \\int f_{X,Y} (x, y)\\,dx$ are the marginal PDFs of $X$ and $Y$, respectively.\n",
		@@ -620,7 +628,7 @@
		"where $f_{X,Y}(\\cdot, \\cdot)$ is the joint PDF of $X$ and $Y$ while $f_X(\\cdot)$ and $f_Y(\\cdot)$ are the marginal PDFs, respectively. \n",
		"\n",
		"- The terms $V_J$, $V_M$ and $V_C$ correspond to the integrals of the squared joint PDF, squared product of the marginal PDFs and product of joint PDF and marginal PDFs, respectively. \n",
		"- In the ITL framework, there are estimators of these quantities that can be computed directly from data samples.\n",
		"- In the ITL framework, there are estimators of these quantities that can be computed directly from data samples (see quoted paper for details).\n",
		" - This estimator is called, in literature, the information potential, IP, of an RV and it corresponds to the expected value of its PDF.\n",
		"\n",
		"- Skipping further technical details that a concerned reader can find in the quoted papers, the analysis follow these guidelines:\n",

Lectures/Lecture - Non Parametric Analysis/Lecture-NonParametricPeriodogram.ipynb

+14 −2

		@@ -744,7 +744,7 @@
		"- In IID bootstrap the data points are randomly sampled with replacement. The IID bootstrap destroys not only the periodicity but also any time correlation or structure in the time series. This results in underestimation of the confidence bars. \n",
		" - For data with serial correlations it is better to use moving block (MB) bootstrap. In MB bootstrap blocks of data of a given length are patched together to create a new time series. The block length is a parameter. Because light curves are irregularly sampled we set a block length in days rather than number of points. The ideal is to set the length so that it destroys the periodicity and preserves most of the serial correlation\n",
		"\n",
		"- Bootstrap applied to time series is discussed in [Bühlmann (2002) - \"Bootstraps for time series.\"](https://www.jstor.org/stable/3182810?casa_token=fuWC-ffm10sAAAAA%3A2PyzOtU2pxoXzaTaaYIaWjrCA3nokR5BjvLdggYcci8Rn8G7UPz4ceMEXfuDOOHg1NtuVWO6ZMkXUjwJl5pLcKxt1ojZTK9WpgrjJPdGsE6o83LjNcuG)."
		"- Bootstrap applied to time series is discussed in [Bühlmann (2002) - \"Bootstraps for time series.\"](https://www.jstor.org/stable/3182810)."
		]
		},
		{
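The block bootstrap described in this hunk can be sketched as follows. This is a simplified variant with non-overlapping blocks defined in days, and it is not necessarily the scheme used in the lecture; the function name `block_bootstrap`, the block length and the synthetic light curve are assumptions, and the times are assumed to be sorted.

```python
import numpy as np

def block_bootstrap(t, y, block_days, rng):
    """Surrogate series built by patching together randomly drawn blocks of y."""
    # Label each point with the index of the block_days-wide window it falls in
    labels = np.floor((t - t.min()) / block_days).astype(int)
    blocks = [y[labels == k] for k in np.unique(labels)]
    # Draw blocks with replacement until the surrogate is long enough, then truncate
    pieces, n = [], 0
    while n < y.size:
        block = blocks[rng.integers(len(blocks))]
        pieces.append(block)
        n += block.size
    return np.concatenate(pieces)[:y.size]

# Synthetic, irregularly sampled light curve (assumed test data)
rng = np.random.default_rng(2)
t = np.sort(rng.uniform(0.0, 60.0, 150))
y = np.sin(2 * np.pi * t / 3.7) + 0.1 * rng.normal(size=t.size)
y_boot = block_bootstrap(t, y, block_days=5.0, rng=rng)  # to be paired with the original times t
```

Within each block the original ordering, and hence the short-term correlation, is kept, while the random block order breaks the long-term phase coherence. Repeating the draw many times and recomputing the periodogram on each surrogate gives an empirical null distribution for the peak height.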
		@@ -759,6 +759,18 @@
		"- [Süveges (2014) - \"Extreme-value modelling for the significance assessment of periodogram peaks\"](https://ui.adsabs.harvard.edu/abs/2014MNRAS.440.2099S/abstract). "
		]
		},
		{
		"cell_type": "markdown",
		"metadata": {},
		"source": [
		"## Further Material\n",
		"\n",
		"Papers for examining more closely some of the discussed topics.\n",
		"\n",
		"- [Bühlmann (2002) - \"Bootstraps for Time Series\"](https://www.jstor.org/stable/3182810).\n",
		" "
		]
		},
		{
		"cell_type": "markdown",
		"metadata": {},
		@@ -829,7 +841,7 @@
		"name": "python",
		"nbconvert_exporter": "python",
		"pygments_lexer": "ipython3",
		"version": "3.10.11"
		"version": "3.12.7"
		}
		},
		"nbformat": 4,

README.md

+1 −1

		@@ -2,4 +2,4 @@

		This is a repository with material (notebooks, papers, etc.) for the Time Domain Astrophysics course delivered at the Università dell'Insubria by Stefano Covino.

		Last update: 25 November 2024.
		Last update: 27 November 2024.