rename install, remove trailing cells (a18a975f) · Commits · Akke Esa Tapio Viitanen / lsst_inaf_agile

.gitignore

+1 −1

Original line number	Diff line number	Diff line
		@@ -66,7 +66,7 @@ demo
		.venv/
		venv/
		doc/
		notebooks/
		#notebooks/
		overleaf/
		paper/
		table/

docs/index.rst

+1 −1

Original line number	Diff line number	Diff line
		@@ -54,6 +54,6 @@ Useful resources
		:hidden:

		self
		install
		installation
		userguide
		autoapi/index

docs/install.rst→docs/installation.rst

+2 −2

Original line number	Diff line number	Diff line
		Install
		=======
		Installation
		============

		Due to the modular nature, installation of AGILE is done in a few distinct
		steps that are detailed here.

docs/notebooks/create_photometric_catalog.ipynb

+0 −5

Original line number	Diff line number	Diff line
		%% Cell type:markdown id:4ce8b6f0-0ff4-4fc8-a317-d9e303f590da tags:

		# Create photometric catalogs

		This notebooks demonstrates how to use the AGILE truth catalog in order to reduce the simulated images and create photometric catalogs using the LSST Science pipelines

		We use imSim (https://lsstdesc.org/imSim/index.html) to simulate the images. The purpose of this notebook is not to doucment the feature of imSim, but only to demonstrate how to pass our truth catalog as input to imSim, and how to run software using the tools available in AGILE. Note that imSim only simulates raw LSSTCam data (incl. e.g. sky and instrumental noise), which are not science ready products. The raw images are processed in subsequent steps by using the LSST Science pipelines.

		Below, we provide helpful links and resources to further understand the LSST Science Pipelines and the butler:

		+ LSST Science Pipelines: https://pipelines.lsst.io/
		+ butler: https://pipelines.lsst.io/modules/lsst.daf.butler/index.html
		+ DRP: https://github.com/lsst/drp_pipe
		+ DP0.2: https://dp0-2.lsst.io/data-products-dp0-2/index.html
		+ DP1: https://dp1.lsst.io/

		The following notebook may be accessed [here](https://www.ict.inaf.it/gitlab/akke.viitanen/lsst_inaf_agile/-/blob/a619e4244f56e731d11724f8a70dd5e7d6e1fda4/docs/notebooks/create_photometric_catalog.ipynb)

		%% Cell type:code id:ae4ce30f-f883-4878-a3f8-8ed90ee112bb tags:

		``` python
		import os

		if not os.path.exists("src/lsst_inaf_agile"):
		os.chdir("../../")
		os.getcwd()
		dirname = "data/tests/test_agile"
		```

		%% Cell type:markdown id:073add4f-0080-4f7d-a458-209e2bfdc50a tags:

		## Make a butler repository

		%% Cell type:code id:03466797-0fb3-4d25-963f-c12b2468f427 tags:

		``` python
		repo = f"{dirname}/repo"
		print(f"Name of repository is '{repo}'")
		```

		%% Cell type:code id:29666986-d87c-4b43-8ffe-4af60a122c74 tags:

		``` python
		%%file src/scripts/make_repo.sh
		#!/bin/bash -ex
		if [ $# -ne 1 ] ; then
		echo "Initialize a repository"
		echo "Usage: $0 [repo]"
		exit 1
		fi

		REPO=$1
		REPO_ASSET_DIR=/beegfs/AGILE/Pleiadi_at_IRA/pleia15/lsst/repo_assets
		CALIBS=${REPO_ASSET_DIR}/calib_imSim2.0_w_2023_04

		butler create ${REPO}
		butler register-instrument ${REPO} lsst.obs.lsst.LsstCam
		butler import --export-file "${CALIBS}/export.yaml" \
		--skip-dimensions 'instrument,detector,physical_filter,band' \
		--transfer direct ${REPO} ${CALIBS}
		butler write-curated-calibrations ${REPO} 'LSSTCam'
		butler register-skymap --config-file ${REPO_ASSET_DIR}/DC2_skymap_config.py ${REPO}
		cp ${REPO}/gen3.sqlite3 ${REPO}/gen3.sqlite3_orig
		```

		%% Cell type:code id:a0fc5980-1dc8-455d-a972-1a97273a2615 tags:

		``` python
		os.system(f"sh src/scripts/make_repo.sh {repo}")
		```

		%% Cell type:markdown id:2db244d6-5770-4c12-904e-7e20d1989171 tags:

		## Ingest files into the repository

		%% Cell type:code id:e27db403-c026-46bc-80d4-ca29dd4ea868 tags:

		``` python
		%%file src/scripts/ingest.sh
		#!/bin/bash -x

		if [ $# -lt 2 ] ; then
		echo "Usage: $0 [repo] (raws)"
		exit 1
		fi

		repo=$1
		shift
		raws="$@"

		butler ingest-raws --transfer direct ${repo} ${raws}
		butler define-visits --collections LSSTCam/raw/all ${repo} LSSTCam
		butler collection-chain ${repo} LSSTCam/defaults 'LSSTCam/raw/all,LSSTCam/calib,u/jchiang/calib_imSim2.0_w_2023_04,refcats,skymaps'
		```

		%% Cell type:code id:b45f0700-af53-4181-95f2-b4bf4b3e0a4e tags:

		``` python
		import glob

		raws = " ".join(glob.glob(f"{dirname}/imsim/*/amp.fits.fz", recursive=True))
		print(raws)
		```

		%% Cell type:code id:82461ca7-eb77-4453-82fd-341edf414392 tags:

		``` python
		os.system(f"sh src/scripts/ingest.sh {repo} {raws}")
		```

		%% Cell type:markdown id:d4c75520-2a98-4812-947e-d2d95264fd65 tags:

		## Run a pipeline processing task

		%% Cell type:code id:2546962b-8e97-4aad-a519-aae2f7092656 tags:

		``` python
		%%file src/scripts/pipetask.sh
		#!/bin/sh -ex

		###############################################################################
		# Run an LSST pipeline task
		###############################################################################

		if [ $# -lt 2 ] ; then
		echo "Usage: $0 [repo] [task] (task) ..."
		exit 1
		fi

		# Handles input arguments
		REPO=$1
		shift
		TASKS="$@"
		FLAGS="-j 10 -b $REPO -i LSSTCam/defaults -o u/agile/output --register-dataset-types --clobber-outputs --skip-existing-in u/agile/output"

		for TASK in $TASKS ; do
		FLAGS1=$FLAGS
		[ $TASK = "calibrate" ] \|\| [ $TASK = "step1" ] && FLAGS1="$FLAGS -C calibrate:etc/cmodel.py"
		pipetask --log-level=DEBUG run $FLAGS1 -p "etc/my_DRP2.yaml#$TASK"
		done
		exit $?
		```

		%% Cell type:code id:eeb74500-6291-458a-ae94-24becfd97aef tags:

		``` python
		# run step1 from etc/my_DRP2.yaml. See the file for the definition of this step.
		os.system(f"sh src/scripts/pipetask.sh {repo} step1")
		```

		%% Cell type:markdown id:f621dec2-9c5f-493d-8b47-4fbe41cf6a28 tags:

		## Examine at the output products

		%% Cell type:code id:e9eb56ff-43e8-4f86-9e9e-982e6f6bfd95 tags:

		``` python
		from lsst.daf.butler import Butler

		butler = Butler(f"{repo}", collections="u/agile/output")
		# do a butler query
		```

		%% Cell type:markdown id:a7b8f30f-78a2-45e0-822e-97a4d5931b6a tags:

		## A note on large-scale production

		Butler repositories have a limitation of working only in single-threaded applications. To circumvent this, it is suggested that any large-scale production does not call pipetask and/or butler directly, but through some driver software instead. BPS is one solution for these, but others exist, as well. Refer to: https://pipelines.lsst.io/modules/lsst.ctrl.bps/quickstart.html and references therein.

		%% Cell type:code id:a1b131c0-e579-4738-8775-c4eaec84d6d9 tags:

		``` python
		```

docs/notebooks/create_truth_catalog.ipynb

+0 −5

Original line number	Diff line number	Diff line
		%% Cell type:markdown id:70b92b7a-6039-4468-bf49-87dc87c2d965 tags:

		# Create an example AGILE truth catalog

		This notebooks demonstrates the basic usage of the truth catalog library, and generates an extermely small truth catalog for testing purposes.

		In AGILE, each individual class of source (e.g. AGN, galaxy, star) is first contained in its own catalog. The final step in the truth catalog creation is to simply combine these individual source catalogs into one combined catalog that can be used for future image simulations.

		The different classes of objects considered are:

		1. Galaxies
		2. AGNs
		3. Stars
		4. Binary stars


		The following notebook may be accessed [here](https://www.ict.inaf.it/gitlab/akke.viitanen/lsst_inaf_agile/-/blob/a619e4244f56e731d11724f8a70dd5e7d6e1fda4/docs/notebooks/create_truth_catalog.ipynb)

		%% Cell type:markdown id:3cbfa609-8950-474a-97f0-9fcefb71c103 tags:

		## Initialize

		%% Cell type:code id:9759604f-8be1-4a5d-b63a-88da53fab067 tags:

		``` python
		import os

		if not os.path.exists("src/lsst_inaf_agile"):
		os.chdir("../../")
		os.getcwd()
		dirname = "data/tests/test_agile"
		```

		%% Cell type:code id:a01c8b5b-8e8e-409d-94c2-185806a70646 tags:

		``` python
		import logging
		import numpy as np
		from lsst_inaf_agile.catalog_agn import CatalogAGN
		from lsst_inaf_agile.catalog_combined import CatalogCombined
		from lsst_inaf_agile.catalog_galaxy import CatalogGalaxy
		from lsst_inaf_agile.catalog_star import CatalogStar
		from lsst_inaf_agile.egg import Egg
		from lsst_inaf_agile.image_simulator import ImageSimulator
		from lsst_inaf_agile.merloni2014 import Merloni2014
		from lsst_inaf_agile import util
		```

		%% Cell type:markdown id:897ccf7a-3bdc-4ae6-943d-20a7eb2b6c9e tags:

		## Create the EGG galaxy catalog

		%% Cell type:code id:37224123-c322-4cc8-beab-fe8d56d3455b tags:

		``` python
		# The following list of keyword arguments may be expanded on.
		# Look into 'egg-gencat help'for all the available EGG arguments.
		egg_kwargs = Egg.get_example_egg_kwargs(dirname + "/egg.fits")
		egg_kwargs
		```

		%% Cell type:code id:e004eed5-a555-4307-80c2-ebe8460d4ae0 tags:

		``` python
		egg = Egg(egg_kwargs)
		egg.run()
		```

		%% Cell type:code id:62685aff-dd9b-4054-92b6-5bf4f3e6cb6c tags:

		``` python
		catalog_egg = Egg.read(egg_kwargs["out"])
		```

		%% Cell type:code id:cb439284-3a83-45d6-8727-ee749892eb03 tags:

		``` python
		# Convert to the AGILE galaxy catalog class
		catalog_galaxy = CatalogGalaxy(dirname, catalog_egg)
		```

		%% Cell type:markdown id:f0634979-f64c-40af-ac05-b8ab7095e39d tags:

		## Create the AGN catalog

		%% Cell type:code id:4f39b065-f97b-4aa7-9f59-8ed05794b724 tags:

		``` python
		# Create the AGN catalog
		kwargs_catalog_agn = dict(
		dirname=dirname,
		catalog_galaxy=catalog_galaxy,
		type_plambda="zou+2024",
		save_sed=1,
		seed=20251005,
		merloni2014=Merloni2014(1, 0, 0.05, 0.95),
		)
		catalog_agn = CatalogAGN(**kwargs_catalog_agn)
		```

		%% Cell type:markdown id:1fd2389a-8048-41de-bbc1-b7c8848fb87d tags:

		### Examine an example AGN

		%% Cell type:code id:6a915c09-565d-4cc3-87ce-c7a1bb2ba24e tags:

		``` python
		# Select a type1 AGN
		select = (catalog_agn["is_agn"] == 1) & (catalog_agn["is_optical_type2"] == 0)

		# Select the first AGN by ID
		ids = catalog_agn["ID"][select]
		my_id = ids[0]

		print(f"Found a type1 AGN with ID = {my_id}")
		```

		%% Cell type:markdown id:5ca1fcb0-67b4-4d9a-b72a-75ee712c4b11 tags:

		### Examine an AGN SED

		%% Cell type:code id:b00e821a-689e-47d8-a3af-841fef3b261a tags:

		``` python
		lam, flux = catalog_agn.get_sed(my_id)
		plt.loglog(lam, flux)
		```

		%% Cell type:markdown id:b5c2b317-bae9-4cc7-a221-075c73ae7a77 tags:

		### Estimate a single AGN light curve

		%% Cell type:code id:c345c7d1-4bcb-439a-98ca-803689db5604 tags:

		``` python
		# Select a type1 AGN
		select = (catalog_agn["is_agn"] == 1) & (catalog_agn["is_optical_type2"] == 1)

		# Select the first AGN by ID
		my_id = catalog_agn["ID"][select][0]

		# Estimate its lightcurve.
		# The lightcurve is estimated for 10-years with a cadence of one day.
		# The returned flux is in the observer frame.
		lc = catalog_agn.get_lightcurve(my_id, "lsst-r")
		plt.plot(lc)
		plt.xlabel("MJD")
		plt.ylabel(r"flux $r$ [uJy]")
		```

		%% Cell type:markdown id:448f5d18-f961-4410-81e3-b5bbfe23c15c tags:

		## Create the star and binary star catalogs

		%% Cell type:code id:3b21fd72-661e-4094-b6d5-f09cef1f8403 tags:

		``` python
		# Create the star catalogs
		catalog_star = CatalogStar(dirname, catalog_galaxy, is_binary=False)
		catalog_binary = CatalogStar(dirname, catalog_galaxy, is_binary=True)
		```

		%% Cell type:code id:93b9dbe2-c79e-4c2d-a2f8-74e6195ddeff tags:

		``` python
		# NOTE: dal Tio+ simulated only 10% of the toal number of binary systems. They suggest a value of
		# fbin=0.40 to account for this. To achieve this:
		# 1) input star catalog is downsampled to 1 - fbin = 60% of the original size
		# 2) the binary catalog is repeated 4 times
		# The helper function below does these two steps automatically.
		if "_once" not in globals():
		print(f"Old sizes are {len(catalog_star.stars)=}, {len(catalog_binary.stars)=}")
		catalog_star, catalog_binary = CatalogStar.get_star_binary_fbin(
		catalog_star, catalog_binary, fbin=0.40, nrepeat=4
		)
		_once = True
		print(f"New sizes are {len(catalog_star.stars)=}, {len(catalog_binary.stars)=}")
		```

		%% Cell type:markdown id:5c6ad011-601a-433e-86e5-f67ffb50bc12 tags:

		## Create the combined catalog of AGNs, galaxies, and stars

		%% Cell type:code id:8901a2f1-86a1-4d43-a32e-6158f8f017a8 tags:

		``` python
		# NOTE: we set cache=False to prevent an existing catalog file to be loaded from the disk.
		catalog_combined = CatalogCombined(
		dirname, catalog_galaxy, catalog_agn, catalog_star, catalog_binary, cache=False
		)
		```

		%% Cell type:markdown id:d0d0a3cc-dbca-4675-b79b-665308717ba5 tags:

		### Examine the truth catalog columns

		Here we list all the available columns in the truth catalog. The meaning of these columns is described in the appendix of Viitanen+2026.

		It is important to note that the truth catalog contains ONE row per object. That is, a single row corresponds to either an AGN, galaxy, or a star. Consequently, not all the different classes of objects have all the different columns available. For example, (host) galaxy stellar mass ('M') is only available for galaxies (and AGNs).

		%% Cell type:code id:cd17c16a-2e08-48c8-b45a-46542673d5b2 tags:

		``` python
		catalog_combined.get_dtype()
		```

		%% Cell type:markdown id:64b46ad7-07a2-470a-b035-2a690d5dd0ee tags:

		## Examine the output files

		At this stage, the static truth catalog is completed. To find out which files have been written to disk, we check the output directory.

		%% Cell type:code id:61db188b-6ca2-45f9-96e8-971daf27eb6d tags:

		``` python
		os.listdir(dirname)
		```

		%% Cell type:markdown id:a0074fd5-14b0-4a53-9c02-f9baa36cfb6f tags:

		The explanation of the catalog files is as follows:

		+ egg.fits -- the EGG galaxy catalog
		+ agn.fits -- the AGN catalog
		+ stars.fits -- the full star catalog
		+ binaries.fits -- the full binary star catalog
		+ catalog.fits -- the combined truth catalog

		In addition to:

		+ egg-seds* -- databse files needed to generate EGG SEDs (refer to EGG documentation)
		+ seds/ -- AGN SEDs in EGG format (refer to EGG documentation)
		+ lightcurves/ -- stored lightcurves (see the AGN catalog section above)

		Note that by default, all galaxy SEDs only reside in the EGG database. These can be stored on disk on demand, but essentially doubles the disk space used which can be problematic on larger catalogs.

		Also, light curves are only generated by request. In case images are simulated, then a subset of light curves will be estimated automatically for the sources within the region of interest.

		%% Cell type:markdown id:01298432-fed8-4570-bf87-fbb507302845 tags:

		## Example plot: (host) galaxy stellar mass versus redshift

		%% Cell type:code id:73045582-05f0-4fb3-8202-e77862cfed06 tags:

		``` python
		plt.figure(dpi=200)
		select = catalog_combined["Z"] > 0
		plt.plot(catalog_combined["Z"][select], catalog_combined["M"][select], ".")
		plt.xlabel(r"$z$")
		plt.ylabel(r"$\log (M\,/\,M_\mathrm{star})$");
		```

		%% Cell type:markdown id:22e8a779-187b-4482-a750-2d72bfbd41eb tags:

		## Example plot: g-band flux distribution

		%% Cell type:code id:eb4c0774-7cc7-45e2-b84d-8a21826add84 tags:

		``` python
		# Example plot: distribution of the observed g-band flux.
		plt.figure(dpi=200)
		plt.hist(catalog_combined["lsst-g_total"], bins=np.logspace(-3, 3, 61))
		plt.loglog()
		plt.xlabel(r"lsst-g_total [uJy]")
		plt.ylabel("frequency");
		```

		%% Cell type:markdown id:0bb771ab-a997-4f51-87aa-3fc3826c08db tags:

		## A note on the occupation fraction

		%% Cell type:markdown id:73415ddf-6699-4d14-8edd-f3f18d7dca7b tags:

		In the simulation, it is preliminarily assumed that each galaxy has a SMBH, and
		a corresponding value $M_\mathrm{BH}$. Local observations suggest that this
		might not be the case, and especially at low values of Mstar, a BH might be
		missing altogether. To facilitate for this, in the truth catalog a flag
		``has_bh'' is provided. The formal definition of this flag is:

		\begin{equation}
		\mathrm{has\_bh} = U < f_\mathrm{occ}(M_\mathrm{star}),
		\end{equation}

		where $U$ is a uniform random variable $U \sim \mathrm{Unif}(0, 1)$, and
		$f_\mathrm{occ}$ follows the observational results of Zou+2025. This column may
		be used as a weight for all calculations of statistical distributions. For
		example, the galaxy BH mass function can be calculated with and without the
		occupation fraction by simply weighting the mass function by unity (every
		galaxy is expected to host a BH), or ``has_bh'' (galaxies are expected to host
		BHs in accordance with the occupation fraction). An example is provided below.

		%% Cell type:code id:385e6d41-6bca-4fea-975e-b8979609a91a tags:

		``` python
		# Select all galaxies
		is_galaxy = catalog_combined["Z"] > 0.0

		# Select central BHs
		has_bh = catalog_combined["has_bh"]

		# Combine the two selections
		is_galaxy_and_has_bh = is_galaxy & has_bh

		# Do some statistical tests
		print(f"Total number of galaxies: {is_galaxy.sum()}")
		print(f"Total number of galaxies with BHs: {is_galaxy_and_has_bh.sum()}")

		# Plot Mstar vs. occupation fraction
		plt.title("BH occupation fraction vs. stellar mass")
		plt.plot(catalog_combined["M"][is_galaxy], catalog_combined["occupation_fraction"][is_galaxy], ".")
		plt.xlabel(r"$\log \left( M_\mathrm{star} / M_\odot \right)$")
		plt.ylabel(r"$f_\mathrm{occ}$")
		```

		%% Cell type:markdown id:99cf5e38-5d31-4aab-9112-b6ba6b76a0a3 tags:

		For AGN, the logic remains the same. To weigh AGN by the occupation fraction,
		one would use the product $\mathrm{is\_agn} \times \mathrm{has\_bh}$ instead of
		simply $\mathrm{is\_agn}$. Logically, this is equivalent to the condition "has
		BH AND BH is active". For $L_\mathrm{X} >
		10^{42}\,\mathrm{erg}\,\mathrm{s}^{-1}$, this has a negligible effect on the
		AGN population at large, as was investigated in Viitanen+2026.

		%% Cell type:code id:54c2f484-11fb-46d8-a9db-a0cca9dc5664 tags:

		``` python
		# Select galaxies with BHs
		has_bh = catalog_combined["has_bh"]

		# Select AGNs
		is_agn = catalog_combined["is_agn"]

		# Select logLX > 42 objects for the sake of an example
		is_loglx_gt_42 = catalog_combined["log_LX_2_10"] > 42

		# Combine the selections
		has_bh_and_is_agn = has_bh & is_agn
		has_bh_and_is_agn = has_bh & is_agn & is_loglx_gt_42


		print(f"Total number of AGNs: {is_agn.sum()}")
		print(f"Total number of BHs with AGNs: {has_bh_and_is_agn.sum()}")
		```

		%% Cell type:code id:522423f3-dff7-40da-938d-3b3606160cf7 tags:

		``` python
		```