Commit 941e3ce4 authored by Alice Donini

upload docs

parent 0d5ad8e2
# Minimal makefile for Sphinx documentation
#
# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = source
BUILDDIR = build
# Put it first so that "make" without argument is like "make help".
help:
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
.PHONY: help Makefile
# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
@ECHO OFF
pushd %~dp0
REM Command file for Sphinx documentation
if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=source
set BUILDDIR=build
if "%1" == "" goto help
%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
echo.
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
echo.installed, then set the SPHINXBUILD environment variable to point
echo.to the full path of the 'sphinx-build' executable. Alternatively you
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.https://www.sphinx-doc.org/
exit /b 1
)
%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end
:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
:end
popd
sphinx
sphinx-rtd-theme
docutils==0.16 # v0.17 problem in displaying bullet lists with RTD theme
\ No newline at end of file
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html
# -- Path setup --------------------------------------------------------------
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import sys
sys.path.insert(0, os.path.abspath('../..'))
# -- Project information -----------------------------------------------------
project = 'LST analysis'
copyright = '2022, Alice Donini'
author = 'Alice Donini'
# The full version, including alpha/beta/rc tags
release = '0.0.1'
# -- General configuration ---------------------------------------------------
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
"sphinx_rtd_theme",
"sphinx.ext.autodoc",
]
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = []
# -- Options for HTML output -------------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
# html_theme = 'alabaster'
html_theme = 'sphinx_rtd_theme'
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
html_theme_options = {
'titles_only': True
}
A "How to" Guide for LST analysis at IT
===============================================================================================
This document aims to describe how to perform an analysis at the IT cluster in La Palma using the scripts hosted on the `GitLab repository <https://www.ict.inaf.it/gitlab/alice.donini/lst-analysis>`_ .
.. toctree::
:maxdepth: 1
:caption: Contents
install/index
usage/index
.. Indices and tables
.. ==================
.. * :ref:`genindex`
.. * :ref:`modindex`
.. * :ref:`search`
.. _install:
Installation
============
First, make sure ``lstchain`` is installed, since it is needed to run the scripts. If you don't have ``lstchain``, get the latest stable release, or the development version if you also want to participate in its development. For the installation of either version, please refer to `lstchain <https://github.com/cta-observatory/cta-lstchain>`_.
.. note::
Under the ASWG workspace at the IT you can find both the Anaconda package and the latest stable release of ``lstchain`` installed.
You can use these pre-installed versions, or create your own installation/virtual environment under your personal workspace.
Create an `init.sh` file with variables useful for later:
.. code-block:: bash
conda activate lst-dev
export CODE_DIR=../lst_scripts
export PYTHONPATH=$CODE_DIR:$PYTHONPATH
export CONFIG_FOLDER=../
``CODE_DIR`` is the directory in which you will have the scripts, while ``CONFIG_FOLDER`` is the directory where the configuration files should be stored.
Replace the placeholder path `../` with the actual path of your installation.
The first command activates the lstchain conda environment that you installed. If your environment has a different name, change it accordingly.
A common lstchain conda environment is also installed at the IT. It is kept up to date with every new release; you only need to change the environment name according to the version of ``lstchain`` you need.
The available environments are listed in ``/fefs/aswg/software/conda/envs/``.
In that case the `init.sh` file will be:
.. code-block:: bash
source /fefs/aswg/software/conda/etc/profile.d/conda.sh
conda activate lstchain-v0-9.1
export CODE_DIR=../lst_scripts
export PYTHONPATH=$CODE_DIR:$PYTHONPATH
export CONFIG_FOLDER=../
An example `init.sh` file is already present in the repository. Edit it with your directory paths and, before starting the analysis, run:
.. code-block::
source init.sh
This way, the lstchain environment and all the variables needed for the analysis will be set.
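As a quick sanity check before launching a script, the snippet below verifies that the variables exported by `init.sh` are visible from Python. It is only a minimal sketch: the variable names ``CODE_DIR`` and ``CONFIG_FOLDER`` come from the `init.sh` example above, while the helper ``missing_variables`` is hypothetical and not part of the repository.

```python
import os

# Variable names taken from the init.sh example; the helper is hypothetical.
REQUIRED_VARS = ("CODE_DIR", "CONFIG_FOLDER")


def missing_variables(environ=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_variables()
    if missing:
        print("Run 'source init.sh' first; missing:", ", ".join(missing))
    else:
        print("Environment looks ready.")
```

Passing a plain dictionary instead of the real environment makes the check easy to try without touching your shell session.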
.. _db_generation:
Database generation
===================
The script `GenerateDB_Seiya.py` generates a database file of the data taken by LST from the Excel file that Seiya keeps up to date.
The database is used to extract information about targets and runs, so that the desired data files can be retrieved at the IT.
To generate the database file run:
.. code-block::
python GenerateDB.py
Add the argument ``-n`` if you want to change the name given to the file. The default name is `database.csv`.
A run selection can be made later. Several ways of getting a run list are implemented in each script:
* The user provides the run numbers and nights in a file.
* Selection of the runs based on the `night` provided by the user in the configuration file.
* Selection of the runs based on the `TCU name` provided by the user.
* Selection of the runs based on the `angular distance`, i.e. the maximum distance in degrees between the source position and the run pointing position.
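The angular-distance criterion can be illustrated with a short sketch. This is not the code used by the scripts: the run dictionaries and their keys (``run``, ``ra``, ``dec``) are assumptions made for illustration, and the separation is computed with the standard haversine formula. A negative distance disables the selection, which mirrors the documented default of ``--distance``.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, numerically stable for small separations.
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def select_runs(runs, source_ra, source_dec, max_distance):
    """Keep runs whose pointing lies within max_distance degrees of the source.

    A negative max_distance means no selection on this parameter.
    The 'ra' and 'dec' keys are hypothetical names for the run pointing.
    """
    if max_distance < 0:
        return list(runs)
    return [
        run for run in runs
        if angular_separation_deg(run["ra"], run["dec"],
                                  source_ra, source_dec) <= max_distance
    ]
```

For example, with the Crab position used elsewhere in this guide (RA 83.633080, Dec 22.014500) and a maximum distance of 2 degrees, a run pointing a few tenths of a degree away is kept, while a run pointing at a different part of the sky is dropped.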
.. note::
Remember to regenerate the database file every now and then to stay up to date with the latest data taken.
.. warning::
The search for runs through the database currently has an issue with dates. The database is generated from the drive log, so all runs taken after midnight are saved under the following day. This doesn't happen at the IT, where runs are stored under the date of the data-taking night.
.. _dl2:
DL2 files generation
====================
The script `DL1_to_DL2.py` analyzes DL1 data and produces DL2 files.
Usage:
.. code-block::
usage: DL1_to_DL2.py [-h] [--prod PROD] [--outdir OUTDIR] [--config CONFIG] [--config-analysis CONFIG_ANALYSIS]
[--verbose] [--source_name SOURCE_NAME] [--tcuname TCUNAME] [--runlist RUNLIST]
[--distance DISTANCE] [--ra RA] [--dec DEC] [--submit] [--dry] [--globber]
DL1 to DL2 converter
optional arguments:
--config CONFIG, -c CONFIG
Specify a personal config file for the analysis
--config-analysis CONFIG_ANALYSIS
Specify a config file which described analysis profile to use
--dec DEC Dec coordinate of the target. To add if you want to use custom position
--distance DISTANCE, -dis DISTANCE
Max distance in degrees between the target position and the run pointing position for the
run selection, negative value means no selection using this parameter (default: -1).
--dry Make a dry run, no true submission
--globber, -g If True, overwrites existing output file without asking
--outdir OUTDIR, -o OUTDIR
Directory to store the output
--prod PROD, -p PROD Prod to use (default: v0.8.4)
--ra RA RA coordinate of the target. To add if you want to use custom position
--runlist RUNLIST, -rl RUNLIST
File with a list of run and the associated night to be analysed
--source_name SOURCE_NAME, -n SOURCE_NAME
Name of the source
--submit Submit the cmd to slurm on site
--tcuname TCUNAME Apply run selection based on TCU source name
--verbose, -v Increase output verbosity
-h, --help show this help message and exit
It makes use of a configuration file `config_dl1_to_dl2.yaml` (option ``--config``):
.. code-block:: yaml
# Directory where job file are written
jobmanager: ../jobmanager
# Database file name
db: database.csv
# LST real data path (don't modify it)
data_folder: /fefs/aswg/data/real
# path to main data folder of the user
# change it accordingly to your working env
base_dir: /fefs/aswg/alice.donini/Analysis/data
# Path to personal directory where output data will be saved.
# Uncomment and modify in case you want to use a non standard path
#output_folder: ../DL2/Crab
# Directory where config files are stored
#config_folder: ./
# Path to trained RF files
path_models: ../models
# Values for automatic selection of DL1 data
dl1_data:
DL1_dir: /fefs/aswg/data/real/DL1 # path to DL1 directory
night: [20210911, 20210912] # day(s) of observation (more than one is possible)
version: v0.9.1 # v0.7.3, v0.8.4, v0.9, v0.9.1
cleaning: tailcut84
Edit the configuration file: change the paths based on your working directory and modify the DL1 data information used to search for the files at the IT.
The script uses the created database (:ref:`db_generation`) to find the runs to analyze, based on the nights specified in the configuration file `config_dl1_to_dl2.yaml`.
An extra selection can be made by coordinates (in which case ``--distance``, ``--ra`` and ``--dec`` are all mandatory) or by the name of the source as saved in TCU (argument ``--tcuname``).
If none of these selection methods is given, all the runs available on the dates specified in the configuration file are considered for the analysis.
The search for DL1 files can also be done by providing a file listing the runs and nights to be analyzed (option ``--runlist``).
No database file is needed in this case.
.. warning::
The search for runs through the database currently has an issue with dates. The database is generated from the drive log, so all runs taken after midnight are saved under the following day. This doesn't happen at the IT, where runs are stored under the day on which the night started. For some runs the search could therefore fail, even if the files are there.
If you use the database search, always add the date of the following day to the config file as well, so that all the runs taken after midnight are considered too.
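The workaround above amounts to extending the ``night`` list with the day after each requested night. A minimal sketch of that expansion follows; the helper name ``with_following_days`` is hypothetical, and the dates are assumed to use the ``YYYYMMDD`` format shown in the configuration file.

```python
from datetime import datetime, timedelta


def with_following_days(nights):
    """Return the sorted list of nights extended with the day after each one."""
    expanded = set()
    for night in nights:
        day = datetime.strptime(str(night), "%Y%m%d")
        expanded.add(int(day.strftime("%Y%m%d")))
        # The drive log files post-midnight runs under the next calendar day.
        expanded.add(int((day + timedelta(days=1)).strftime("%Y%m%d")))
    return sorted(expanded)
```

The resulting list can be pasted into the ``night`` entry of the configuration file. Using real date arithmetic rather than adding 1 to the integer keeps month and year boundaries correct.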
Example of a run list file:
.. code-block::
2909 20201117
2911 20201117
3089 20201206
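To make the expected layout explicit, here is a small sketch of how such a file could be read: one run per line, with the run number followed by the night. It is an illustration only, not the parser used by `DL1_to_DL2.py`; the function name ``parse_run_list`` is hypothetical, and skipping blank lines and ``#`` comments is an assumption.

```python
def parse_run_list(text):
    """Parse lines of '<run> <night>' into a list of (run, night) integer pairs."""
    pairs = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        # Assumption: blank lines and '#' comments are ignored.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(
                f"line {line_number}: expected '<run> <night>', got {line!r}"
            )
        run, night = fields
        pairs.append((int(run), int(night)))
    return pairs
```

Failing loudly on malformed lines helps catch a mistyped run list before any job is submitted.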
The argument ``--dry`` performs a dry run.
No jobs are submitted and only the verbose output is printed.
This is useful to check which runs are selected and whether the command to be submitted is correct.
Some examples of how to run the script:
.. code-block::
python DL1_to_DL2.py -c config_dl1_to_dl2.yaml -n Crab --tcuname Crab -v --submit
python DL1_to_DL2.py -c config_dl1_to_dl2.yaml -n Crab --distance 2 --ra 83.633080 --dec 22.014500 -v --submit
python DL1_to_DL2.py -c config_dl1_to_dl2.yaml -n Crab --runlist $CONFIG_FOLDER/Crab_Nov2020.txt -v --submit
.. _dl3:
DL3 files generation
====================
There are two different scripts to generate the DL3 file:
* `DL2_to_DL3.py`, which analyzes DL2 data generated by the user beforehand from DL1 data (:ref:`dl2`).
* `create_DL3_files_from_database.py`, which uses the DL2 files created automatically by LST OSA.
.. note::
At the moment it is recommended to use the first script, `DL2_to_DL3.py`, and to generate your own DL2 data, since the automated generation of DL2 data is not optimized.
DL2_to_DL3.py
--------------
The script `DL2_to_DL3.py` uses DL2 data generated by the user beforehand (:ref:`dl2`) and produces DL3 files.
Usage:
.. code-block::
usage: DL2_to_DL3.py [-h] [--prod PROD] [--outdir OUTDIR] [--config CONFIG] [--config-analysis CONFIG_ANALYSIS]
[--verbose] [--source_name SOURCE_NAME] [--tcuname TCUNAME] [--runlist RUNLIST]
[--distance DISTANCE] [--ra RA] [--dec DEC] [--submit] [--dry] [--globber] [--cut_file CUT]
[--gh_cut GH_CUT] [--theta_cut THETA_CUT]
DL2 to DL3 converter
optional arguments:
--config CONFIG, -c CONFIG
Specify a personal config file for the analysis
--config-analysis CONFIG_ANALYSIS
Specify a config file which describes analysis profile to use
--cut_file CUT, -cut CUT
Cut file
--dec DEC Dec coordinate of the target. To add if you want to use custom position
--distance DISTANCE, -dis DISTANCE
Max distance in degrees between the target position and the run pointing position for the run
selection, negative value means no selection using this parameter (default: -1).
--dry Make a dry run, no true submission
--gh_cut GH_CUT Fixed selection cut for gh_score (gammaness)
--globber, -g If True, overwrites existing output file without asking
--outdir OUTDIR, -o OUTDIR
Directory to store the output
--prod PROD, -p PROD Prod to use (default: v0.9.4)
--ra RA RA coordinate of the target. To add if you want to use custom position
--runlist RUNLIST, -rl RUNLIST
File with a list of run and the associated night to be analysed
--source_name SOURCE_NAME, -n SOURCE_NAME
Name of the source
--submit Submit the cmd to slurm on site
--tcuname TCUNAME Apply run selection based on TCU source name
--theta_cut THETA_CUT
Fixed selection cut for theta
--verbose, -v Increase output verbosity
-h, --help show this help message and exit
It makes use of a configuration file `config_dl2_to_dl3.yaml` (option ``--config``).
Edit the configuration file changing the paths based on the analysis you want to perform.
For the moment the IRF file to use has to be specified in the configuration file.
IRF files with different settings can be found in ``/fefs/aswg/data/mc/IRF/``, otherwise create your own (:ref:`irf_generation`).
Configuration file:
.. code-block:: yaml
# Directory where job files are written
jobmanager: ../jobmanager
# Database file name
db: database.csv
# Path to main data folder of the user
# Change it accordingly to your working env
base_dir: /fefs/aswg/alice.donini/Analysis/data
# Path to personal directory where output data will be saved.
# Uncomment and modify in case you want to use a non standard path
#output_folder: ../DL3/Crab
# Path to the folder where cut files are stored
cut_folder: ../cuts
# Path to IRF file, change it based on your analysis
irf_file: "/fefs/aswg/data/mc/IRF/20200629_prod5_trans_80/zenith_20deg/south_pointing/20210416_v0.7.3_prod5_trans_80_local_taicut_8_4/off0.4deg/irf_20210416_v073_prod5_trans_80_local_taicut_8_4_gamma_point-like_off04deg.fits.gz"
# Values for automatic selection of DL2 data from personal directories
dl2_data:
night: [20210911, 20210912] # day(s) of observation (more than one is possible)
version: v0.9.1 # v0.7.3, v0.8.4, v0.9, v0.9.1
cleaning: tailcut84
The generated DL3 data will be saved in a personal folder following the tree structure visible in :ref:`install`.
It is possible to define a different output folder by adding the option ``-o``.
.. note::
The same cuts as those used for the generation of the IRF should be used.
create_DL3_files_from_database.py
----------------------------------
For a fast analysis using the DL2 files automatically merged by LST OSA, run for example:
.. code-block:: bash
python DL2_to_DL3.py --tcuname "grb210807A" --ra 76.677 --dec 58.245 --outdir ../data/ --submit -v
python DL2_to_DL3.py --tcuname "Crab" --ra 83.633080 --dec 22.014500 --outdir ../data/ --submit -v
python DL2_to_DL3.py --distance 2 --ra 83.633080 --dec 22.014500 --outdir ../data/ --submit -v
``--tcuname`` is the name of the TCU configuration and is used to select the runs to analyze from the database file, unless specific nights or a list of runs are specified using the options ``--night`` and ``--run``. The RA and Dec position can also be specified if already known, and is mandatory if the name of the target cannot be resolved. If the option ``--distance`` is specified, the run list is created using all runs with a pointing position within the specified angular distance.
If no cut configuration file is given, the standard configuration is applied.
.. _usage:
Workflow and usage
===================
.. toctree::
:maxdepth: 1
:hidden:
db_generation
irf_generation
dl1_production
dl2_production
dl3_production
index_generation
You should have at least a working installation of ``lstchain`` (:ref:`install`) and access to the IT.
Analysis steps
--------------
There are several data levels, and a different script is used to call the corresponding ``lstchain`` tool for each of them. The steps, in order, are:
* Database Generation (:ref:`db_generation`)
* DL1 generation (:ref:`dl1`)
* DL2 generation (:ref:`dl2`)
* DL3 generation (:ref:`dl3`)
.. note::
For the scripts to work, you always have to source the `init.sh` file (:ref:`install`) before starting, so that all the needed variables are set.
The scripts are created to adapt to the following directory tree:
.. code-block:: bash
Parent_Folder
└──DL1
├── source
│ └── night
│ └── version
│ └── cleaning
DL2
├── source
│ └── night
│ └── version
│ └── cleaning
DL3
└── source
└── night
└── version
└── cleaning
The script ``create_analysis_tree.py`` creates the needed directories, following the scheme above.
Set the default value of the "Parent Folder" argument in the argparse definition to your own path.
The "Parent Folder" path corresponds to the variable ``base_dir`` in the configuration files used by the scripts later in the analysis.
You can modify the script to obtain a different structure, but be aware that the I/O paths in the scripts are based on this structure, so you may need to adapt the scripts too.
Multiple nights can be specified.
.. code-block:: bash
usage: create_analysis_tree.py [-h] [--main_dir MAIN_DIR] --source SOURCE --night NIGHT [NIGHT ...] [--version VERSION] [--cleaning CLEANING]
Create a directory structure
optional arguments:
-h, --help show this help message and exit
--main_dir MAIN_DIR Path to parent folder
--source SOURCE Source name
--night NIGHT [NIGHT ...]
Night date
--version VERSION lstchain version (default: v0.9.2)
--cleaning CLEANING Cleaning type (default: tailcut84)
Example of use:
.. code-block:: bash
python create_analysis_tree.py --source Crab --night 20220304
Output:
.. code-block:: bash
Directory /fefs/aswg/workspace/alice.donini/Analysis/data/DL1/Crab/20220304/v0.9.2/tailcut84 already exists
Directory /fefs/aswg/workspace/alice.donini/Analysis/data/DL2/Crab/20220304/v0.9.2/tailcut84 created
Directory /fefs/aswg/workspace/alice.donini/Analysis/data/DL3/Crab/20220304/v0.9.2/tailcut84 created
Directory structure for analysis on Crab was created.
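The way the paths in the output above are composed can be sketched as follows. This is an illustration of the directory scheme, not the code of ``create_analysis_tree.py``: the function name ``analysis_directories`` is hypothetical, and the default version and cleaning values are taken from the help text above.

```python
from pathlib import Path

# Data levels of the directory tree described in this section.
DATA_LEVELS = ("DL1", "DL2", "DL3")


def analysis_directories(main_dir, source, nights,
                         version="v0.9.2", cleaning="tailcut84"):
    """Return <main_dir>/<level>/<source>/<night>/<version>/<cleaning> paths."""
    return [
        Path(main_dir) / level / source / str(night) / version / cleaning
        for level in DATA_LEVELS
        for night in nights
    ]


if __name__ == "__main__":
    # Print the paths only; call path.mkdir(parents=True, exist_ok=True)
    # on each one if you actually want to create them.
    for path in analysis_directories("Parent_Folder", "Crab", [20220304]):
        print(path)
```

Keeping the path construction in one place makes it easier to adapt the layout consistently if you decide to use a different structure.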
Database generation
~~~~~~~~~~~~~~~~~~~~
Unless you specify a file with a list of runs and nights, a database is needed, through which the run selection is made.
For the generation of the database file refer to :ref:`db_generation`.
DL1 generation
~~~~~~~~~~~~~~~~~~~~~
R0 data to DL1 data, i.e. from camera raw waveforms to calibrated and parametrized images:
* Low level camera calibration
* High level camera calibration
* Image cleaning
* Image parameter calculation
This step uses ``lstchain.scripts.lstchain_data_r0_to_dl1`` for real data and ``lstchain.scripts.lstchain_mc_r0_to_dl1`` for MC.
If you already have a DL1 file containing images and parameters (DL1a and DL1b), you can recalculate the parameters
with a different cleaning using ``lstchain.scripts.lstchain_dl1ab``.
Refer to :ref:`dl1` (yet to be implemented).
DL2 generation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
DL1 to DL2 data, i.e. training of machine learning methods to perform:
* Energy estimation
* Arrival direction
* gamma/hadron separation
This step uses ``lstchain.scripts.lstchain_dl1_to_dl2`` for both real data and MC.
Refer to :ref:`dl2`.
DL3 generation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
DL2 to DL3 data. At this stage, gammaness and direction cuts are applied to produce a list of gamma candidates.
To generate DL3 files, an IRF file has to be provided. If one is not available, it has to be produced before the generation of the DL3 files.
This step uses:
* ``lstchain.tools.lstchain_create_irf_files`` (:ref:`irf_generation`)
* ``lstchain.tools.lstchain_create_dl3_file`` (:ref:`dl3`)
To analyze the results using `Gammapy <https://gammapy.org>`_, an index is needed. This is produced using:
* ``lstchain.tools.lstchain_create_dl3_index_files`` (:ref:`index_generation`)
For a quick look at the data, and to produce :math:`{\theta}^2/{\alpha}` plots starting from DL2 data, you can also use ``lstchain.scripts.lstchain_post_dl2``.
You will need to provide a toml configuration file; two examples can be found in `lstchain <https://github.com/cta-observatory/cta-lstchain/tree/master/docs/examples/post_dl2_analysis>`_. In it you specify the runs to analyze and the `data_tag`, which identifies the version of lstchain used to process the data, e.g. v0.7.3. Different cuts can also be specified.
To run the script simply do:
.. code-block:: bash
lstchain_post_dl2 -c config_wobble.toml -v
.. _index_generation:
Index creation
==============
Gammapy needs an index file to read the generated DL3 FITS files (:ref:`dl3`). This can be created using the `create_DL3_index.py` script.
The script uses the same configuration file as the script that generates the DL3 files: `config_dl2_to_dl3.yaml`.
Usage:
.. code-block:: bash
usage: create_DL3_index.py [-h] [--prod PROD] [--outdir OUTDIR] [--config CONFIG]
[--config-analysis CONFIG_ANALYSIS] [--verbose] [--source_name SOURCE_NAME]
[--tcuname TCUNAME] [--runlist RUNLIST [RUNLIST ...]] [--distance DISTANCE]
[--ra RA] [--dec DEC]
DL3 index maker
optional arguments:
--config CONFIG, -c CONFIG
Specify a personal config file for the analysis
--config-analysis CONFIG_ANALYSIS
Specify a config file which described analysis profile to use
--dec DEC Dec coordinate of the target. To add if you want to use custom position
--distance DISTANCE, -dis DISTANCE
Max distance in degrees between the target position and the run pointing position
for the run selection, negative value means no selection using this parameter
(default: -1).
--outdir OUTDIR, -o OUTDIR