Unverified Commit 67026177 authored by Akke Viitanen's avatar Akke Viitanen
Browse files

add AGILE DR1 notebooks

parent 76b9d7e8
Loading
Loading
Loading
Loading
Loading
+66 −0
Original line number Diff line number Diff line
%% Cell type:markdown id:a737c50d-9836-409d-94a9-79fc98641a02 tags:

# AGILE DR1 (Viitanen et al., 2026)

AGILE: AGN in the LSST era (Viitanen et al., 2026) is an end-to-end
simulation pipeline and part of the official LSST INAF in-kind contribution.
AGILE is designed to allow for the characterization of the AGN population as
seen by Rubin-LSST. The pipeline starts with the creation of an empirical truth
catalog of AGN, galaxies, and stars. Using variability recipes for AGN and
stars, the truth catalog is then fed to the image simulator (imSim), capable of
generating raw images from the LSSTCam. Finally, the resulting raw images are
analyzed using the official science pipelines.

The first data release of AGILE corresponds to $1\,{\rm deg}^2$ (21 LSSTCam
detectors) and 3 years of LSST survey in the COSMOS (150.117, +2.205)
deep-drilling field. The underlying truth catalog has an area of 24 deg2, and
extends to $0.2 < z < 5.5$ and host galaxy mass $\log(M/M_\odot) > 8.5$.

The survey baseline is based on baseline v4.0. However, to optimize lightcurve
cadence over depth, we prune the observation sequence by selecting only the
first exposure of each sequence. The total number of visits is $1\,441$.

The LSST Science Pipelines (version 8.0.0) are run on the simulated raw images
in order to produce single-visit calibrated exposures and source catalogs.
Then, the single-visit images are combined to produce the coadded image and
object catalogs. Finally, forced photometry is performed on the single-visit
images based on the object catalog.

With respect to the LSST data product, the AGILE data model follow closely that
of DP0.2 (https://dp0-2.lsst.io/data-products-dp0-2/index.html). LSST Butler
may be used to access both images and catalog. For the access of catalogs we
also provide a sqlite3 database which mimics the functionality and schema of
the TAP service in DP0.2 / DP1.

%% Cell type:markdown id:85b8bc23-e64e-4d0f-bcd8-5e27d4def1a7 tags:

## Demonstration notebooks

We provide the following notebooks demonstrating the usage of AGILE DR1:

- [1_agile_introduction.ipynb](1_agile_introduction.ipynb): this notebook
- [2_agile_truth.ipynb](2_agile_truth.ipynb): explore the underlying truth catalog
- [3_agile_sed.ipynb](3_agile_sed.ipynb): explore the measured and truth SEDs
- [4_agile_lightcurve.ipynb](4_agile_lightcurve.ipynb): explore the measured forced photometry and truth lightcurves
- [5_agile_images.ipynb](5_agile_images.ipynb): explore simulated single-visit and deep coadded images
- [6_agile_compare_dp1.ipynb](6_agile_compare_dp1.ipynb): compare AGILE DR1 to DP1 ECDFS

The full AGILE DR1 (incl. catalogs, images, SEDs, lightcurves) is hosted in the
INAF Google Drive: https://drive.google.com/drive/u/0/folders/1xxaBrTamUZ5U3ncuG5VtkEIiF76UCghK.

%% Cell type:markdown id:a2b3ce63-c6d4-4b9b-995b-d73a1feccb37 tags:

## Contact:

- Angela Bongiorno (angela.bongiorno@inaf.it, INAF-OAR)
- Akke Viitanen (akke.viitanen@iki.fi, INAF-OAR, Univ. Geneva)
- Ivano Saccheo (ivano.saccheo@bristol.ac.uk, INAF-OAR, Univ. Bristol)

Created by AV on 2025/07/09

Updated by AV on 2025/02/06

%% Cell type:code id:0f69a013-bb12-43ab-95bb-ae40917b103d tags:

``` python
```
+340 −0
Original line number Diff line number Diff line
%% Cell type:markdown id:d478f2c3-dc74-47a7-9f01-9c2503b99304 tags:

# AGILE DR1 Truth Catalog

This notebook demonstrates the exploration of the underlying AGILE truth catalog. The truth catalog is the static representation of AGN, galaxies and stars, and the combined truth catalog contains properties from all these three populations.

Here we demonstrate how to perform some typical catalog-level tasks using the truth catalog, such as selection AGN or galaxies or stars, and having a look at their properties.

%% Cell type:code id:94394f27-1e11-45b3-a5cd-110cc3d237f1 tags:

``` python
import os
import sqlite3

from astropy.table import Table
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.stats import binned_statistic
```

%% Cell type:code id:b50ec673-5275-4f47-83a4-79bb2aa7e799 tags:

``` python
filename_database = "../master.db"
```

%% Cell type:code id:36a98d03-7ead-4dbe-996a-6235868b4fb8 tags:

``` python
mpl.style.use("agile.mplstyle")
```

%% Cell type:code id:c24f3ae1-0867-4ded-8310-35bf5b3527f0 tags:

``` python
def flux2mag(flux):
    """Convert flux from nJy to AB magnitude"""
    return -2.5 * np.ma.log10(flux) + 31.4
```

%% Cell type:markdown id:a1c0244a-5054-402e-a827-58a60c544f23 tags:

# Show available columns

%% Cell type:code id:432b2db0-d38d-4f35-a4dc-e59b04b8aa5a tags:

``` python
with sqlite3.connect(filename_database) as con:
    df = pd.read_sql_query("SELECT * FROM Truth LIMIT 0", con)
table = Table.from_pandas(df)
for column in table.columns:
    print(column)
```

%% Cell type:markdown id:2b86e97c-334b-4df6-8328-8567a65ff262 tags:

# Query the database

%% Cell type:code id:96ff8923-5244-4ba0-9c43-f4d84e388f92 tags:

``` python
columns = [
    "ID",
    "RA",
    "DEC",
    "Z",
    "M",
    "SFR",
    "is_agn_ctn",
    "is_agn_ctk",
    "is_agn",
    "is_optical_type2",
]
for band in "ugrizy":
    columns += [f'"lsst-{band}_total"']
    columns += [f'"lsst-{band}_point"']
    columns += [f'"magabs_lsst-{band}_point"']
```

%% Cell type:code id:1b415873-4d03-4c9b-ad0f-edc8a329450e tags:

``` python
%%time
with sqlite3.connect(filename_database) as con:
    df = pd.read_sql_query(f"SELECT {', '.join(columns)} FROM Truth", con)
catalog = Table.from_pandas(df)
catalog
```

%% Cell type:markdown id:ac64afcb-39ac-4e70-88d1-82ea09c62d54 tags:

# Select AGN, galaxies or stars

%% Cell type:code id:ac8c5760-70a5-4c30-8196-945bd7a53885 tags:

``` python
# Calculate the number of objects in each truth class
is_agn = catalog["is_agn"] == 1
is_star = catalog["Z"].mask
is_galaxy = ~catalog["Z"].mask & (catalog["is_agn"] == 0)

print("Number of galaxies", is_galaxy.sum())
print("Number of AGN", is_agn.sum())
print("Number of stars", is_star.sum())
```

%% Cell type:markdown id:7b77fe12-901c-4157-8447-36cb36292e79 tags:

# Select type1/type2 AGN

%% Cell type:code id:abc69c9f-eb6d-4db2-a326-11408b069be5 tags:

``` python
agn = catalog[is_agn]
agn_type1 = catalog[is_agn & (catalog["is_optical_type2"] == 0)]
agn_type2 = catalog[is_agn & (catalog["is_optical_type2"] == 1)]
print("Number of AGN:", len(agn))
print("Number of AGN type1:", len(agn_type1))
print("Number of AGN tyep2:", len(agn_type2))
```

%% Cell type:markdown id:2fd7b863-67b7-4f19-b71c-641e9b497055 tags:

# AGN and host galaxy properties

%% Cell type:code id:64150bbb-35f2-4d5f-9ddd-331cf02454f1 tags:

``` python
plt.plot(agn_type2["RA"], agn_type2["DEC"], ".", label="AGN type2")
plt.plot(agn_type1["RA"], agn_type1["DEC"], ".", label="AGN type1")
plt.xlabel("RA")
plt.ylabel("DEC")
plt.xlim(plt.xlim()[::-1])
plt.legend(frameon=True)
```

%% Cell type:code id:351ed27b-9f2c-4748-9d32-75f2af85e86f tags:

``` python
plt.plot(agn_type2["M"], agn_type2["SFR"], ".", label="AGN type2")
plt.plot(agn_type1["M"], agn_type1["SFR"], ".", label="AGN type1")
plt.xlabel(r"$\log M_{\rm star}$ [$M_{\odot}$]")
plt.ylabel(r"${\rm SFR}$ [$M_{\odot}/yr$]")
plt.semilogy()
plt.legend()
```

%% Cell type:code id:0d1245f9-fe93-4860-8abf-f613b58cad1d tags:

``` python
plt.plot(agn_type2["Z"], agn_type2["magabs_lsst-g_point"], ".", label="AGN type2")
plt.plot(agn_type1["Z"], agn_type1["magabs_lsst-g_point"], ".", label="AGN type1")
plt.xlabel("Redshift")
plt.ylabel(r"$M_g$ [ABmag]")
plt.ylim(plt.ylim()[::-1])
plt.legend()
```

%% Cell type:markdown id:10f38787-82fc-414b-9c23-f2288e84389f tags:

# Compute observed ugrizy number counts

%% Cell type:code id:63278349-0f44-4fba-b853-22321d564a71 tags:

``` python
area = 24.0


def get_counts(label, band, dbin=0.10):
    """
    Computes the number of truth object defined by 'label'
    brighter than some limiting magnitude in 'band'.
    """

    # Get the selection corresponding to 'label'
    select = {
        "AGN1": (catalog["is_agn_ctn"] == 1) & (catalog["is_optical_type2"] == 0),
        "AGN2 (CTN)": (catalog["is_agn_ctn"] == 1) & (catalog["is_optical_type2"] == 1),
        "AGN2 (CTK)": (catalog["is_agn_ctk"] == 1) & (catalog["is_optical_type2"] == 1),
        "Galaxy": ~catalog["Z"].mask & (catalog["is_agn"] == 0),
        "Star": catalog["Z"].mask,
    }.get(label)

    # NOTE: for AGN we retrieve only the AGN flux and not that of the host galaxy.
    # This is assuming a scenario of high-quality AGN-host decomposition.
    if "AGN" in label:
        mag = flux2mag(1000 * catalog[select][f"lsst-{band}_point"])
    else:
        mag = flux2mag(1000 * catalog[select][f"lsst-{band}_total"])

    # Calculate the number counts
    bins = np.arange(0, 30 + dbin / 2, dbin)
    counts = np.histogram(mag, bins=bins)[0]
    counts = np.cumsum(counts) / area / dbin

    return bins[1:], counts
```

%% Cell type:code id:985e7d65-a9a0-482d-bdb0-68d79f2b72ab tags:

``` python
for band in "ugrizy":
    plt.figure()

    labels = "AGN1", "AGN2 (CTN)", "AGN2 (CTK)", "Galaxy", "Star"
    for label in labels:
        x, y = get_counts(label, band)
        plt.semilogy(x, y, label=label, lw=1 + 2 * ("AGN" in label))

    plt.xlim(10, 30)
    plt.xlabel(f"${band}$ [ABmag]")
    plt.ylabel(f"$N(<{band})$ [1/deg2/mag]")
    plt.legend()
```

%% Cell type:markdown id:72c0157d-2ac7-4b25-a7d5-8cc9bc40cfd3 tags:

# Compare truth flux vs. measured flux

%% Cell type:code id:f5875731-96ec-45d5-ab26-a8644fc42fc3 tags:

``` python
def get_select(object_table, band, col_flux, extendedness=None, maglim=np.inf):
    mag = flux2mag(object_table[f"{band}_{col_flux}"])
    select = mag < maglim

    if "cModel" in col_flux:
        col_flux = col_flux.replace("Flux", "")
    select &= object_table[f"{band}_{col_flux}_flag"] == 0

    if extendedness is not None:
        select &= object_table[f"{band}_extendedness"] == extendedness
        select &= object_table[f"{band}_extendedness_flag"] == 0

    return select
```

%% Cell type:code id:291592f3-b789-49a9-b2c9-e0f9bdaaae28 tags:

``` python
def get_agile(query):
    """
    Retrieves the table corresponding to 'query' from the agile database.
    """
    with sqlite3.connect(filename_database) as con:
        df = pd.read_sql_query(query, con)
    return Table.from_pandas(df)
```

%% Cell type:code id:5a702557-5806-4625-aa8b-91587a507e58 tags:

``` python
columns = ["objectId", "coord_ra", "coord_dec", "refExtendedness", "refBand"]
for band in "ugrizy":
    columns += [f"{band}_extendedness"]
    columns += [f"{band}_extendedness_flag"]
    for flux in "psf", "calib", "cModel":
        columns += [f"{band}_{flux}Flux"]
        columns += [f"{band}_{flux}FluxErr"]
        if flux == "cModel":
            columns += [f"{band}_{flux}_flag"]
        else:
            columns += [f"{band}_{flux}Flux_flag"]
columns
```

%% Cell type:code id:848f95e5-9dc8-4c22-b604-ce266e6333ca tags:

``` python
%%time
query = f"""
SELECT
    {', '.join(columns)},
    match_truth_type,
    match_candidate,
    t."lsst-r_total"
FROM Object
    JOIN MatchesTruth AS m USING (objectId)
    JOIN Truth AS t ON (t.ID = m.match_id)
WHERE
    1 = detect_isPrimary
"""
object_table_agile = get_agile(query)
object_table_agile
```

%% Cell type:code id:9370f735-18ff-4ee7-b93a-697c3085894b tags:

``` python
band = "r"

col_fluxes = "psfFlux", "calibFlux", "cModelFlux"
truth_names = "AGN1", "AGN2", "Galaxy", "Star"

fig, axes = plt.subplots(3, 4, figsize=(9, 9), sharex=True, sharey=True)

for row, col_flux in enumerate(col_fluxes):
    select_flux = get_select(object_table_agile, band, col_flux, extendedness=None)
    for col, truth_name in enumerate(truth_names):
        select = select_flux & (object_table_agile["match_truth_type"] == col)
        print(col_flux, truth_name, select.sum())

        bins = np.logspace(0, 6, 31)
        cens = (bins[:-1] * bins[1:]) ** 0.5

        flux1 = object_table_agile[f"lsst-{band}_total"][select] * 1000.0
        flux2 = object_table_agile[f"{band}_{col_flux}"][select]
        ratio = flux2 / flux1

        from scipy.stats import binned_statistic

        x = cens
        y16 = binned_statistic(flux1, ratio, bins=bins, statistic=lambda a: np.quantile(a, 0.16))[0]
        y50 = binned_statistic(flux1, ratio, bins=bins, statistic=lambda a: np.quantile(a, 0.50))[0]
        y84 = binned_statistic(flux1, ratio, bins=bins, statistic=lambda a: np.quantile(a, 0.84))[0]

        color = "C%d" % col
        axes[row, col].semilogx(x, y16, ls="dashed", color=color)
        axes[row, col].semilogx(x, y50, ls="solid", color=color)
        axes[row, col].semilogx(x, y84, ls="dashed", color=color)

        axes[row, col].set_xlim(1e1, 1e4)
        axes[row, col].set_ylim(0.6, 1.4)
        axes[row, col].axhline(1.0)

        if row == 0:
            axes[row, col].set_title(truth_name)
        if col == 0:
            axes[row, col].set_ylabel(f"{col_flux} / truth")
        fig.supxlabel("Truth flux [nJy]")

fig.tight_layout()
```

%% Cell type:code id:3e944bd2-f040-4a74-8ca2-c56c224dc421 tags:

``` python
```
+286 −0

File added.

Preview size limit exceeded, changes collapsed.

+266 −0

File added.

Preview size limit exceeded, changes collapsed.

+457 −0

File added.

Preview size limit exceeded, changes collapsed.

Loading