Unverified Commit 1b3bec4d authored by Akke Viitanen's avatar Akke Viitanen
Browse files

add requirements, omit writing dp1 visits

parent 67026177
Loading
Loading
Loading
Loading
+5 −0
Original line number Diff line number Diff line
%% Cell type:markdown id:a737c50d-9836-409d-94a9-79fc98641a02 tags:

# AGILE DR1 (Viitanen et al., 2026)

AGILE: AGN in the LSST era (Viitanen et al., 2026) is an end-to-end
simulation pipeline and part of the official LSST INAF in-kind contribution.
AGILE is designed to allow for the characterization of the AGN population as
seen by Rubin-LSST. The pipeline starts with the creation of an empirical truth
catalog of AGN, galaxies, and stars. Using variability recipes for AGN and
stars, the truth catalog is then fed to the image simulator (imSim), capable of
generating raw images from the LSSTCam. Finally, the resulting raw images are
analyzed using the official science pipelines.

The first data release of AGILE corresponds to $1\,{\rm deg}^2$ (21 LSSTCam
detectors) and 3 years of LSST survey in the COSMOS (150.117, +2.205)
deep-drilling field. The underlying truth catalog has an area of 24 deg2, and
extends to $0.2 < z < 5.5$ and host galaxy mass $\log(M/M_\odot) > 8.5$.

The survey baseline is based on baseline v4.0. However, to optimize lightcurve
cadence over depth, we prune the observation sequence by selecting only the
first exposure of each sequence. The total number of visits is $1\,441$.

The LSST Science Pipelines (version 8.0.0) are run on the simulated raw images
in order to produce single-visit calibrated exposures and source catalogs.
Then, the single-visit images are combined to produce the coadded image and
object catalogs. Finally, forced photometry is performed on the single-visit
images based on the object catalog.

With respect to the LSST data product, the AGILE data model follow closely that
of DP0.2 (https://dp0-2.lsst.io/data-products-dp0-2/index.html). LSST Butler
may be used to access both images and catalog. For the access of catalogs we
also provide a sqlite3 database which mimics the functionality and schema of
the TAP service in DP0.2 / DP1.

%% Cell type:markdown id:85b8bc23-e64e-4d0f-bcd8-5e27d4def1a7 tags:

## Demonstration notebooks

We provide the following notebooks demonstrating the usage of AGILE DR1:

- [1_agile_introduction.ipynb](1_agile_introduction.ipynb): this notebook
- [2_agile_truth.ipynb](2_agile_truth.ipynb): explore the underlying truth catalog
    - requires `master.db`
- [3_agile_sed.ipynb](3_agile_sed.ipynb): explore the measured and truth SEDs
    - requires `master.db` and `seds/`
- [4_agile_lightcurve.ipynb](4_agile_lightcurve.ipynb): explore the measured forced photometry and truth lightcurves
    - requires `master.db` and `lightcurves/`
- [5_agile_images.ipynb](5_agile_images.ipynb): explore simulated single-visit and deep coadded images
    - requires `master.db` and `repo_public/`
- [6_agile_compare_dp1.ipynb](6_agile_compare_dp1.ipynb): compare AGILE DR1 to DP1 ECDFS
    - requires `master.db`

The full AGILE DR1 (incl. catalogs, images, SEDs, lightcurves) is hosted in the
INAF Google Drive: https://drive.google.com/drive/u/0/folders/1xxaBrTamUZ5U3ncuG5VtkEIiF76UCghK.

%% Cell type:markdown id:a2b3ce63-c6d4-4b9b-995b-d73a1feccb37 tags:

## Contact:

- Angela Bongiorno (angela.bongiorno@inaf.it, INAF-OAR)
- Akke Viitanen (akke.viitanen@iki.fi, INAF-OAR, Univ. Geneva)
- Ivano Saccheo (ivano.saccheo@bristol.ac.uk, INAF-OAR, Univ. Bristol)

Created by AV on 2025/07/09

Updated by AV on 2025/02/06

%% Cell type:code id:0f69a013-bb12-43ab-95bb-ae40917b103d tags:

``` python
```
+9 −12
Original line number Diff line number Diff line
%% Cell type:markdown id:9895c6ba-7383-4bef-ac0a-ff6009f87e99 tags:

# AGILE DR1 and LSST DP1 comparison

This notebooks compares the AGILE DR1 to the LSST DP1 ECDFS. For the LSST DP1, refer to the documentation available on-line (https://dp1.lsst.io/).

## Requirements

This notebooks download data directly from the LSST DP1 database using TAP. To use TAP, an API access token needs to be generated. Follow the instructions

1. go to https://data.lsst.cloud and select the API aspect
2. follow the links to generate an API access token
3. select `read:tap` to be able to perform TAP queries
4. save the API access token to `$HOME/.rsp-tap.token`

It is a good idea to remove read/write permissions from others than yourself by using

`$ chmod 400 $HOME/.rsp-tap.token`.

%% Cell type:code id:69dab423-d2b2-4249-9a4c-190b46aebeac tags:

``` python
import os
import sqlite3

from astropy.wcs import WCS
from astropy import units as u
from astropy.coordinates import SkyCoord
from astropy.table import Table
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pyvo
from scipy.stats import binned_statistic
```

%% Cell type:code id:c2119bdb-2ba5-4370-8987-4ac339ffcdfb tags:

``` python
mpl.style.use("agile.mplstyle")
```

%% Cell type:code id:bb90a0f0-fbb5-42cb-a2c6-ffeff11281a1 tags:

``` python
COORD_ECDFS = SkyCoord(53.13, -28.10, unit="deg")
COORD_47TUC = SkyCoord(6.02, -72.08, unit="deg")
```

%% Cell type:code id:npYymEcxS7qB tags:

``` python
def get_tap():
    RSP_TAP_SERVICE = "https://data.lsst.cloud/api/tap"
    homedir = os.path.expanduser("~")
    token_file = os.path.join(homedir, ".rsp-tap.token")
    with open(token_file, "r") as f:
        token_str = f.readline()
    cred = pyvo.auth.CredentialStore()
    cred.set_password("x-oauth-basic", token_str)
    credential = cred.get("ivo://ivoa.net/sso#BasicAA")
    return pyvo.dal.TAPService(RSP_TAP_SERVICE, session=credential)


rsp_tap = get_tap()
```

%% Cell type:code id:f2b03b53-2ac6-40be-90fc-c2f7d2c718d3 tags:

``` python
def flux2mag(flux):
    """Convert flux from nJy to AB magnitude"""
    return -2.5 * np.ma.log10(flux) + 31.4
```

%% Cell type:code id:5c909a91-fbe6-440a-9a2d-17983dbdd52c tags:

``` python
def get_agile(query):
    """
    Retrieves the table corresponding to 'query' from the agile database.
    """
    with sqlite3.connect("../master.db") as con:
        df = pd.read_sql_query(query, con)
    return Table.from_pandas(df)
```

%% Cell type:code id:b9095ef5-1f90-42c6-ab59-0a08ef86da55 tags:

``` python
def get_dp1(query):
    """
    Retrieves the table corresponding to 'query' from the LSST dp1 database.
    """
    job = rsp_tap.submit_job(query)
    job.run()
    job.wait(phases=["COMPLETED", "ERROR"])
    if job.phase == "ERROR":
        job.raise_if_error()
    assert job.phase == "COMPLETED"
    results = job.fetch_result().to_table()
    return results
```

%% Cell type:markdown id:2ab1c1d6-6d98-4a72-9fd2-77c188906142 tags:

# Compare number of visits

%% Cell type:code id:b69deb23-489b-43b0-a70b-986655b6d2c6 tags:

``` python
visit_agile = get_agile("SELECT * FROM Visit")
```

%% Cell type:code id:b93d5e58-963c-47c3-b334-9fe41ecbb110 tags:

``` python
if not os.path.exists("dp1_visit.fits"):
    ra = COORD_ECDFS.ra.value
    dec = COORD_ECDFS.dec.value
    visit_dp1 = get_dp1(
        f"""
        SELECT *
        FROM dp1.Visit
        WHERE 1=CONTAINS(POINT('ICRS', ra, dec), CIRCLE('ICRS', {ra}, {dec}, 1))
        """
    )
    visit_dp1.write("dp1_visits.fits")
visit_dp1 = Table.read("dp1_visit.fits")
ra = COORD_ECDFS.ra.value
dec = COORD_ECDFS.dec.value
visit_dp1 = get_dp1(
    f"""
    SELECT visit, obsStartMJD, band
    FROM dp1.Visit
    WHERE 1=CONTAINS(POINT('ICRS', ra, dec), CIRCLE('ICRS', {ra}, {dec}, 1))
    """
)
```

%% Cell type:code id:9a6dca64-7e79-4bcd-864d-267307235c93 tags:

``` python
visits = visit_agile, visit_dp1
linestyles = "solid", "dashed"
for visit, ls in zip(visits, linestyles):
    for i, band in enumerate("ugrizy"):
        select = visit["band"] == band
        plt.hist(
            visit["obsStartMJD"][select],
            bins=np.arange(60300, 62000),
            cumulative=True,
            histtype="step",
            color="C%d" % i,
            ls=ls,
        )
plt.xlabel("MJD")
plt.ylabel("Cumulative visits")

plt.plot([], [], ls="solid", color="k", label="AGILE DR1")
plt.plot([], [], ls="dashed", color="k", label="DP1 ECDFS")
for band in "ugrizy":
    plt.plot([], [], ".", label=band)
plt.legend(loc="upper left")
```

%% Cell type:markdown id:0fddb51a-282c-4dbb-b72e-eef7a151197d tags:

# Compare catalogs

%% Cell type:markdown id:ebbec11f-f1de-439e-a75d-775a920855e1 tags:

## Load catalog AGILE

%% Cell type:code id:11ee646c-92fe-4971-a4db-db659243a0f6 tags:

``` python
columns = ["objectId", "coord_ra", "coord_dec", "refExtendedness", "refBand"]
for band in "ugrizy":
    columns += [f"{band}_extendedness"]
    columns += [f"{band}_extendedness_flag"]
    for flux in "psf", "calib", "cModel":
        columns += [f"{band}_{flux}Flux"]
        columns += [f"{band}_{flux}FluxErr"]
        if flux == "cModel":
            columns += [f"{band}_{flux}_flag"]
        else:
            columns += [f"{band}_{flux}Flux_flag"]
columns
```

%% Cell type:code id:146e7cbe-d638-4daf-a2c8-95bcf8343852 tags:

``` python
%%time
query = f"""
SELECT
    {', '.join(columns)},
    match_truth_type,
    match_candidate,
    t."lsst-r_total"
FROM Object
    JOIN MatchesTruth AS m USING (objectId)
    JOIN Truth AS t ON (t.ID = m.match_id)
WHERE
    1 = detect_isPrimary
"""
object_table_agile = get_agile(query)
object_table_agile
```

%% Cell type:markdown id:90f11277-c82c-4709-ac7b-e121981fc5f4 tags:

## Load catalog DP1

%% Cell type:code id:b8454e68-6b8a-46c4-92c2-08fa918501fb tags:

``` python
%%time
ra = COORD_ECDFS.ra.value
dec = COORD_ECDFS.dec.value
query = f"""
SELECT {', '.join(columns)} FROM dp1.Object
WHERE
    1 = CONTAINS(POINT('ICRS', coord_ra, coord_dec), CIRCLE('ICRS', {ra}, {dec}, 1.0))
"""

filename = "dp1_ecdfs_object.fits"
if not os.path.exists(filename):
    object_table = get_dp1(query)
    object_table.write(filename)
object_table_dp1_ecdfs = Table.read(filename)
```

%% Cell type:code id:27844c78-7d89-4912-b723-166f906ed07b tags:

``` python
%%time
ra = COORD_47TUC.ra.value
dec = COORD_47TUC.dec.value
query = f"""
SELECT {', '.join(columns)} FROM dp1.Object
WHERE
    1 = CONTAINS(POINT('ICRS', coord_ra, coord_dec), CIRCLE('ICRS', {ra}, {dec}, 1.0))
"""

filename = "dp1_47tuc_object.fits"
if not os.path.exists(filename):
    object_table = get_dp1(query)
    object_table.write(filename)
object_table_dp1_47tuc = Table.read(filename)
```

%% Cell type:code id:61c67b43-012f-4216-ad3f-2c338e303780 tags:

``` python
def get_select(object_table, band, col_flux, extendedness=None, maglim=np.inf):
    mag = flux2mag(object_table[f"{band}_{col_flux}"])
    select = mag < maglim

    if "cModel" in col_flux:
        col_flux = col_flux.replace("Flux", "")
    select &= object_table[f"{band}_{col_flux}_flag"] == 0

    if extendedness is not None:
        select &= object_table[f"{band}_extendedness"] == extendedness
        select &= object_table[f"{band}_extendedness_flag"] == 0

    return select
```

%% Cell type:markdown id:0cb04306-03bf-4e43-854e-e129e5174d54 tags:

## Sky positions

Here we show the ra, dec distribution of the object tables. In AGILE DR1, some regions of sky are missing sources due to 1) bright stars 2) deblending being skipped in crowded regions of sky. This is due to the internal parameters of LSST Science Pipelines on deciding when to not attempt deblending. The effect is also present in DP1, most strikingly in 47 Tuc, where a majority of the stars from globular cluster are not automatically identified.

%% Cell type:code id:66e744af-3a79-417d-9e7e-caca830136a5 tags:

``` python
def get_wcs(ra, dec):
    w = WCS(naxis=2)
    w.wcs.crval = [np.mean(ra), np.mean(dec)] * u.deg
    w.wcs.crpix = [0.0, 0.0]
    w.wcs.cdelt = [-0.2, 0.2] * u.deg
    w.wcs.ctype = ["RA---TAN", "DEC--TAN"]
    return w


def plot_ra_dec(fig, ax, ra, dec):
    coord = SkyCoord(ra, dec, unit="deg")
    ax.plot(coord.ra, coord.dec, ".", markersize=0.2)
    return fig, ax


object_tables = object_table_agile, object_table_dp1_ecdfs, object_table_dp1_47tuc
labels = "AGILE DR1", "DP1 ECDFS", "DP1 47 Tuc"

for object_table, label in zip(object_tables, labels):
    ra = object_table["coord_ra"]
    dec = object_table["coord_dec"]

    fig = plt.figure()
    ax = fig.gca()
    ax.set_title(label)
    plot_ra_dec(fig, ax, ra, dec)
    ax.set_xlim(ax.get_xlim()[::-1])
```

%% Cell type:markdown id:6e77462c-93a4-484b-9b38-f37045fef000 tags:

## Magnitude distribution

%% Cell type:code id:5761c3a3-cbe4-4725-9508-1b5bcb2dc745 tags:

``` python
def plot_magnitude_histogram(band):
    bins = np.arange(15, 30, 0.25)
    object_tables = object_table_agile, object_table_dp1_ecdfs
    labels = "AGILE DR1", "DP1 ECDFS"
    colors = "C0", "C1"

    fig = plt.figure()
    ax = fig.gca()

    for object_table, label, color in zip(object_tables, labels, colors):
        mag = flux2mag(object_table[f"{band}_cModelFlux"])
        select = get_select(object_table, band, "cModelFlux", 1)
        ax.hist(
            mag[select], bins=bins, label=label + " extended", color=color, histtype="step", ls="solid", lw=2
        )

        mag = flux2mag(object_table[f"{band}_psfFlux"])
        select = get_select(object_table, band, "psfFlux", 0)
        ax.hist(
            mag[select],
            bins=bins,
            label=label + " pointlike",
            color=color,
            histtype="step",
            ls="dotted",
            lw=1,
        )

    ax.set_xlabel(f"${band}$ [ABmag]")
    ax.set_ylabel("counts")
    ax.semilogy()
    ax.legend()


for band in "ugrizy":
    plot_magnitude_histogram(band)
```

%% Cell type:markdown id:b06315a9-fe36-410c-ab0d-f2f133851d62 tags:

## Signal-to-noise

%% Cell type:code id:33a3bd6c-5c89-4276-9d7f-c4599821168c tags:

``` python
def get_snr(flux, dflux):
    return np.ma.true_divide(flux, dflux)


def get_quantile(mag, snr, bins, q):
    return binned_statistic(mag, snr, bins=bins, statistic=lambda a: np.quantile(a, q))[0]


def get_mag_snr(object_table, band, col_flux, extendedness):
    select = get_select(object_table, band, col_flux, extendedness)
    mag = flux2mag(object_table[f"{band}_{col_flux}"])[select]
    snr = get_snr(object_table[f"{band}_{col_flux}"], object_table[f"{band}_{col_flux}Err"])[select]
    return mag, snr


def plot_snr(object_table, band, col_flux, extendedness, *args, **kwargs):
    bins = np.linspace(15, 30, 151)
    cens = (bins[1:] + bins[:-1]) / 2.0

    mag, snr = get_mag_snr(object_table, band, col_flux, extendedness)
    q16 = get_quantile(mag, snr, bins, 0.16)
    q50 = get_quantile(mag, snr, bins, 0.50)
    q84 = get_quantile(mag, snr, bins, 0.84)

    plt.plot(cens, q50, *args, **kwargs)
    plt.fill_between(cens, q16, q84, alpha=0.20)
    plt.semilogy()
    plt.xlabel(f"${band}$ [ABmag]")
    plt.ylabel(f"SNR {col_flux}")


plot_snr(object_table_agile, "r", "psfFlux", 0, ls="solid", label="AGILE DR1")
plot_snr(object_table_dp1_ecdfs, "r", "psfFlux", 0, ls="dashed", label="DP1 ECDFS")
plt.legend()
```

%% Cell type:code id:4dc429dc-f340-433a-b4fd-a6bcc278b5ec tags:

``` python
bins = np.linspace(15, 30, 151)

snr_limits = 3, 5, 10
names = ["label", "band"] + ["snr%d" % s for s in snr_limits]
dtype = [str, str] + [float] * len(snr_limits)
snr_table = Table(names=names, dtype=dtype)
for column in ["snr%d" % s for s in snr_limits]:
    snr_table[column].format = "%.2f"

object_tables = object_table_agile, object_table_dp1_ecdfs
labels = "AGILE DR1", "DP1 ECDFS"
for object_table, label in zip(object_tables, labels):
    for band in "ugrizy":
        mag, snr = get_mag_snr(object_table, band, "psfFlux", 0)
        q50 = get_quantile(mag, snr, bins, 0.50)

        idx = np.argsort(snr)
        mag_limits = [np.interp(snr_limit, snr[idx], mag[idx], left=0, right=0) for snr_limit in snr_limits]
        row = dict(label=label, band=band)
        for snr_limit, mag_limit in zip(snr_limits, mag_limits):
            row["snr%d" % snr_limit] = mag_limit
        snr_table.add_row(row)

snr_table
```

%% Cell type:markdown id:511274c9-d8b8-403a-aa1f-b1ea49571748 tags:

## Color-color diagrams

%% Cell type:code id:9cab49be-35f7-4c74-9765-6b4ab39e5e87 tags:

``` python
def get_color(flux1, flux2):
    """
    Returns the color defined as m1 - m2 = -2.5 * log10(flux1 / flux2)
    """
    return -2.5 * np.ma.log10(np.ma.true_divide(flux1, flux2))


def plot_color_color(
    fig, ax, object_table, band1, band2, band3, col_flux, extendedness, maglim, *args, **kwargs
):
    """
    Plots a color-color diagram from 'object_table' using the photometric bands
    'band1', 'band2', and 'band3'. The flux definition 'col_flux', 'extendedness'
    and 'maglim' are passed on to the selection function.

    The magnitude limit 'maglim' and 'extendedness' corresponds always to a
    selection in the r-band.
    """

    select = get_select(object_table, "r", col_flux, 0, maglim)
    select &= get_select(object_table, band1, col_flux, None)
    select &= get_select(object_table, band2, col_flux, None)
    select &= get_select(object_table, band3, col_flux, None)

    flux1 = object_table[f"{band1}_{col_flux}"][select]
    flux2 = object_table[f"{band2}_{col_flux}"][select]
    flux3 = object_table[f"{band3}_{col_flux}"][select]

    color1 = get_color(flux1, flux2)
    color2 = get_color(flux2, flux3)

    ax.plot(color1, color2, ".", *args, **kwargs)

    return fig, ax
```

%% Cell type:code id:b110fbdb-7863-467c-9c2d-12c6a8362cf7 tags:

``` python
maglim = 24

for i in range(4):
    band1, band2, band3 = "ugrizy"[i : i + 3]

    fig = plt.figure()
    ax = fig.gca()

    plot_color_color(
        fig, ax, object_table_agile, band1, band2, band3, "psfFlux", 0, maglim, label="AGILE DR1"
    )
    plot_color_color(
        fig, ax, object_table_dp1_ecdfs, band1, band2, band3, "psfFlux", 0, maglim, label="DP1 ECDFS"
    )

    ax.set_xlim(-1, 3)
    ax.set_ylim(-1, 3)
    ax.set_xlabel(f"{band1} - {band2}")
    ax.set_ylabel(f"{band2} - {band3}")
    ax.set_title(f"r-pointlike, r < {maglim}")
    ax.legend()
```

%% Cell type:code id:b4047dc5-9806-407d-89f8-4e091d1eedac tags:

``` python
maglim = 24

for i in range(4):
    band1, band2, band3 = "ugrizy"[i : i + 3]

    fig = plt.figure()
    ax = fig.gca()

    labels = "AGN1", "AGN2", "Galaxy", "star"
    for i, label in enumerate(labels):
        select = object_table_agile["match_truth_type"] == i
        plot_color_color(
            fig,
            ax,
            object_table_agile[select],
            band1,
            band2,
            band3,
            "psfFlux",
            0,
            maglim,
            label=f"AGILE DR1 {label}",
        )

    ax.set_xlim(-1, 3)
    ax.set_ylim(-1, 3)
    ax.set_xlabel(f"{band1} - {band2}")
    ax.set_ylabel(f"{band2} - {band3}")
    ax.set_title(f"r-pointlike, r < {maglim}")
    ax.legend()
```

%% Cell type:code id:d643176b-6b15-4cd7-b90d-7739de541b7f tags:

``` python
```

%% Cell type:code id:883d8b7e-0e0f-4fe3-98bc-9badaa71b762 tags:

``` python
```

%% Cell type:code id:17d08c75-42eb-4121-91f9-cc4d8025f589 tags:

``` python
```