This document aims to describe how to perform an analysis at the IT cluster in La Palma using the scripts hosted on the `GitLab repository <https://gitlab.com/davide.miceli1993/lst_scripts>`_ .
The script `DL1_to_DL2.py` analyzes DL1 data and produces DL2 files.
Mandatory arguments are the name of the source (option ``-n``) and a configuration file `config_dl1_to_dl2.yaml` (option ``--config``). The script will search for the configuration file in the ``$CONFIG`` folder specified in the initial settings.
Specify a config file which describes the analysis profile to use
--dec DEC             Dec coordinate of the target. Add it if you want to use a custom position
--distance DISTANCE, -dis DISTANCE
Max distance in degrees between the target position and the run pointing position for the run selection. Negative value means no selection using this parameter (default: -1).
--dry                 Make a dry run; no jobs are actually submitted
--globber, -g         If True, overwrites existing output file without asking (default: False).
--outdir OUTDIR, -o OUTDIR
Directory to store the output
--prod PROD, -p PROD Prod to use (default: v0.8.4)
--ra RA               RA coordinate of the target. Add it if you want to use a custom position
--runlist RUNLIST, -rl RUNLIST
File with a list of runs and the associated nights to be analysed
--verbose, -v Increase output verbosity
-h, --help show this help message and exit
It makes use of a configuration file `config_dl1_to_dl2.yaml` (option ``--config``):
Preview of the configuration file:
.. code-block:: yaml
# Directory where job files are written
jobmanager: ../jobmanager
# Database file name
# Path to personal directory where output data will be saved.
# Uncomment and modify in case you want to use a non-standard path
#output_folder: ../DL2/Crab
# Directory where config files are stored
#config_folder: ./
# Otherwise files will be saved to {base_dir}/DL2/{source_name}/{night}/{version}/{cleaning}
#dl2_output_folder: ../DL2/Crab
# Path to trained RF files
path_models: ../models
# Uncomment and modify in case you want to specify a custom configuration file for the lstchain script
# lstchain_config: ../lstchain_84.json
# Values for automatic selection of DL1 data
dl1_data:
DL1_dir: /fefs/aswg/data/real/DL1 # path to DL1 directory
version: v0.9 # v0.7.3, v0.8.4, v0.9
cleaning: tailcut84
# uncomment the line below and specify the nights only if database search is used and not a custom runlist file
#night: [20210911, 20210912] # day(s) of observation (more than one is possible)
Edit the configuration file: change the paths based on your working directory and modify the DL1 data information used to search for the files at the IT.
There are two ways to select the data to analyze: through a search in the database (outdated for now) or through a run list given by the user.
If you want to use the created database (:ref:`db_generation`) to find the runs to analyze, then fill in the information about the wanted nights in the configuration file `config_dl1_to_dl2.yaml`.
An extra selection can be done on coordinates (in this case ``--distance``, ``--ra`` and ``--dec`` are mandatory arguments) or by the name of the source as saved in TCU (argument ``--tcuname``).
If none of these selection methods is given, then all the runs available in the dates specified in the configuration file are considered for the analysis.
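The coordinate selection amounts to a great-circle distance cut between the target position and each run's pointing. A minimal sketch in plain Python of how such a cut works (the run records and field names are hypothetical; the actual script reads the pointings from the database):

```python
import math

def angular_separation(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two sky positions given
    as (RA, Dec) in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    cos_sep = (math.sin(dec1) * math.sin(dec2)
               + math.cos(dec1) * math.cos(dec2) * math.cos(ra1 - ra2))
    # Clamp to [-1, 1] to guard against floating-point rounding
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_sep))))

def select_runs(runs, ra, dec, distance):
    """Keep runs whose pointing lies within `distance` degrees of the target.
    A negative distance disables the selection, as for the --distance option."""
    if distance < 0:
        return list(runs)
    return [r for r in runs
            if angular_separation(ra, dec, r["ra"], r["dec"]) <= distance]

# Hypothetical run pointings (Crab-like field)
runs = [
    {"run": 2911, "ra": 83.6, "dec": 22.0},
    {"run": 3089, "ra": 90.0, "dec": 25.0},
]
print([r["run"] for r in select_runs(runs, ra=83.63, dec=22.01, distance=2.0)])
# [2911]
```

With ``distance=-1`` both runs would be kept, mirroring the default behaviour of the ``--distance`` option.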
.. warning::
The search of runs through the database has a "feature". The database is generated from the drive log, so all the runs taken after midnight are saved under the following day. This doesn't happen at the IT, where the runs are stored under the day of the starting night. So for some runs the search could fail, even if they are there.
Thus, if you use the database search, always also add the date of the following night to the configuration file, so that all the runs taken after midnight are considered too.
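The workaround above can be automated by extending the list of nights before querying the database. A minimal, illustrative sketch (the helper name is hypothetical; the night format ``YYYYMMDD`` matches the configuration example):

```python
from datetime import datetime, timedelta

def extend_with_next_night(nights):
    """For each night in YYYYMMDD form, also include the following calendar
    day, so runs taken after midnight are not missed by the database search."""
    extended = set()
    for night in nights:
        day = datetime.strptime(str(night), "%Y%m%d")
        extended.add(day.strftime("%Y%m%d"))
        extended.add((day + timedelta(days=1)).strftime("%Y%m%d"))
    return sorted(extended)

print(extend_with_next_night([20210911, 20210912]))
# ['20210911', '20210912', '20210913']
```

Since the ``YYYYMMDD`` strings sort chronologically, the result stays ordered even across month boundaries.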
The search for DL1 files can be also done by giving a file list of runs and nights to be analyzed (option ``--runlist``).
No database file is needed in this case.
The runlist can be either manually created or produced using the `create_run_list.py` script (:ref:`runlist_generation`).
Example of a runlist file:
.. code-block::
2911 20201117
3089 20201206
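A runlist file like the one above is just whitespace-separated ``run night`` pairs, and can be parsed in a few lines. This is only an illustrative sketch, not the parser used by the scripts:

```python
def read_runlist(path):
    """Parse a runlist file with one 'run night' pair per line;
    blank lines and lines starting with '#' are skipped."""
    pairs = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            run, night = line.split()
            pairs.append((int(run), int(night)))
    return pairs

# For the example file shown above, this would return
# [(2911, 20201117), (3089, 20201206)]
```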
The argument ``--dry`` allows to perform a dry run. No jobs are submitted and only the verbose output is printed.
This option is useful to check which runs are selected and that the command to be submitted is correct.
There are several data levels, and a dedicated script is used to call the corresponding ``lstchain`` one at each stage. The steps in order are:
* Database Generation (:ref:`db_generation`) or Runlist generation (:ref:`runlist_generation`)
* DL1 generation (:ref:`dl1`)
* DL2 generation (:ref:`dl2`)
* DL3 generation (:ref:`dl3`)
Create a directory structure
options:
-h, --help show this help message and exit
--main_dir MAIN_DIR Path to parent folder
--source SOURCE Source name
--night NIGHT [NIGHT ...]
Night date
--version VERSION lstchain version (default: v0.9)
--cleaning CLEANING Cleaning type (default: tailcut84)
Example of use:
Directory /fefs/aswg/workspace/alice.donini/Analysis/data/DL3/Crab/20220304/v0.9.2/tailcut84 created
Directory structure for analysis on Crab was created.
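The layout created above follows the convention mentioned in the configuration comments, ``{base_dir}/{data_level}/{source_name}/{night}/{version}/{cleaning}``. A minimal sketch of how such a path is composed (the function name is illustrative, not the script's actual API):

```python
import os

def output_dir(base_dir, data_level, source, night, version, cleaning):
    """Compose the standard output path
    {base_dir}/{data_level}/{source}/{night}/{version}/{cleaning}."""
    return os.path.join(base_dir, data_level, source,
                        str(night), version, cleaning)

print(output_dir("/fefs/aswg/workspace/alice.donini/Analysis/data",
                 "DL3", "Crab", 20220304, "v0.9.2", "tailcut84"))
# /fefs/aswg/workspace/alice.donini/Analysis/data/DL3/Crab/20220304/v0.9.2/tailcut84
```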
Database/Runlist generation
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Run selection can be done through two methods:

* generation of a file with a list of runs and corresponding nights that is given as input to the analysis scripts; refer to :ref:`runlist_generation`.
* creation of a database through which the run selection is made; refer to :ref:`db_generation`.