Batch Module
Multi-source batch photometry over sky regions.
Configuration and utilities for batch photometry.
- class spxquery.batch.config.BatchConfig(center_ra: float, center_dec: float, radius: float, catalog_path: Path, coverage_mode: str = 'any', bands: List[str] | None = None, mjd_range: Tuple[float, float] | None = None, max_images: int = 500, output_dir: Path = <factory>, max_download_workers: int = 4, max_extract_workers: int = 12, photometry: PhotometryConfig = <factory>, num_buckets: int = 64, keep_bucket_files: bool = False)
Bases: object
Configuration for multi-source batch photometry over a sky region.
- Parameters:
center_ra (float) – Right ascension of the sky region center, in degrees (ICRS).
center_dec (float) – Declination of the sky region center, in degrees (ICRS).
radius (float) – Search radius in degrees.
catalog_path (Path) – CSV file with columns targetid, ra, dec.
coverage_mode (str) – "any" (INTERSECTS) or "full" (CONTAINS).
bands (list of str or None) – Bands to query, e.g. ["D1", "D3"]. None = all.
mjd_range (tuple of (float, float) or None) – (mjd_min, mjd_max) to filter observations by time. None = no time filter (all epochs).
max_images (int) – Safety gate: raise if the query returns more images than this.
output_dir (Path) – Root directory for all batch outputs.
max_download_workers (int) – Parallel download threads.
max_extract_workers (int) – Parallel extraction processes (spawn-based).
photometry (PhotometryConfig) – Photometry parameters forwarded to extraction.
num_buckets (int) – Hash-partition buckets for aggregation.
keep_bucket_files (bool) – Keep temporary bucket CSVs after aggregation.
- catalog_path: Path
- output_dir: Path
- photometry: PhotometryConfig
- property image_dir: Path
- property per_image_dir: Path
- property lightcurve_dir: Path
- property bucket_dir: Path
- __init__(center_ra: float, center_dec: float, radius: float, catalog_path: Path, coverage_mode: str = 'any', bands: List[str] | None = None, mjd_range: Tuple[float, float] | None = None, max_images: int = 500, output_dir: Path = <factory>, max_download_workers: int = 4, max_extract_workers: int = 12, photometry: PhotometryConfig = <factory>, num_buckets: int = 64, keep_bucket_files: bool = False) → None
- spxquery.batch.config.load_catalog(catalog_path: Path) → List[Source]
Load a source catalog CSV into a list of Source objects.
Expected columns: targetid, ra, dec.
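As a rough illustration of what loading such a catalog involves, the sketch below reads a CSV with the three expected columns into simple records. The `Source` dataclass and the `load_catalog_sketch` function are hypothetical stand-ins written for this example; they are not the library's own `Source` class or its `load_catalog` implementation, whose internals are not shown here.

```python
import csv
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass(frozen=True)
class Source:
    """Hypothetical stand-in for the library's Source class."""
    targetid: str
    ra: float   # degrees
    dec: float  # degrees


def load_catalog_sketch(catalog_path: Path) -> List[Source]:
    """Read a CSV with targetid, ra, dec columns into Source records."""
    required = {"targetid", "ra", "dec"}
    with open(catalog_path, newline="") as fh:
        reader = csv.DictReader(fh)
        # Fail early if the header lacks any of the expected columns.
        missing = required - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"catalog is missing columns: {sorted(missing)}")
        return [
            Source(row["targetid"], float(row["ra"]), float(row["dec"]))
            for row in reader
        ]
```

Validating the header before reading any rows means a malformed catalog is reported once, rather than as a `KeyError` on the first row.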
Region-based query for SPHEREx full-frame images.
Delegates to spxquery.core.query.query_spherex_region().
- spxquery.batch.query.query_region_observations(config: BatchConfig) → QueryResults
Query SPHEREx archive for full-frame images covering a sky region.
Thin wrapper that translates BatchConfig fields into query_spherex_region() parameters.
- Parameters:
config (BatchConfig) – Batch configuration with region definition and query parameters.
- Returns:
Matching observations with download URLs.
- Return type:
QueryResults
Multi-source aperture photometry extraction from SPHEREx images.
- spxquery.batch.extract.process_single_image(image_path: Path, sources: List[Source], config: PhotometryConfig, output_dir: Path, skip_existing: bool = True) → Path | None
Extract aperture photometry for all catalog sources in one image.
Optimized for batch processing: pre-computes shared arrays (background mask, error map, pixel scale) once per image, then uses local cutouts for per-source photometry instead of operating on the full image.
- Parameters:
image_path (Path) – Path to a SPHEREx MEF FITS file.
sources (list of Source) – Catalog sources to extract photometry for.
config (PhotometryConfig) – Photometry extraction parameters.
output_dir (Path) – Directory for per-image CSV output.
skip_existing (bool) – Skip images that already have an output CSV.
- Returns:
Path to the output CSV, or None if skipped / no results.
- Return type:
Path or None
- spxquery.batch.extract.run_extraction(image_dir: Path, sources: List[Source], config: PhotometryConfig, output_dir: Path, n_workers: int = 12, skip_existing: bool = True) → int
Run multi-source extraction across all images in a directory.
- Parameters:
image_dir (Path) – Directory containing SPHEREx FITS files (searched recursively).
sources (list of Source) – Catalog sources to extract photometry for.
config (PhotometryConfig) – Photometry parameters.
output_dir (Path) – Per-image CSV output directory.
n_workers (int) – Number of parallel workers.
skip_existing (bool) – Skip images with existing output CSVs.
- Returns:
Number of newly processed images.
- Return type:
int
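The resume behavior implied by skip_existing and the "newly processed" return count can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for the example: `run_resumable` and the `process_one` callback are hypothetical names, the output CSV is assumed to be named after the image stem, and a thread pool is swapped in for the spawn-based worker processes mentioned for max_extract_workers so the sketch stays short.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable


def run_resumable(image_dir: Path, output_dir: Path,
                  process_one: Callable[[Path, Path], None],
                  n_workers: int = 4, skip_existing: bool = True) -> int:
    """Process every *.fits under image_dir; return how many were newly processed."""
    output_dir.mkdir(parents=True, exist_ok=True)

    def task(image_path: Path) -> bool:
        # Assumed naming scheme: one output CSV per image, keyed by file stem.
        out_csv = output_dir / f"{image_path.stem}.csv"
        if skip_existing and out_csv.exists():
            return False  # already done in an earlier run
        process_one(image_path, out_csv)
        return True

    # rglob searches subdirectories too, matching "searched recursively".
    images = sorted(image_dir.rglob("*.fits"))
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Each task returns True when it did work; summing counts new outputs.
        return sum(pool.map(task, images))
```

Running the same call twice should report zero new images the second time, which is what makes an interrupted batch cheap to restart.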
Aggregate per-image photometry CSVs into per-source light curves.
- spxquery.batch.aggregate.aggregate_lightcurves(image_csv_dir: Path, lightcurve_dir: Path, bucket_dir: Path, num_buckets: int = 64, clean: bool = False, keep_bucket_files: bool = False) → int
Aggregate per-image CSVs into individual source light curves.
- Two-phase bucket design keeps memory bounded:
1. Stream per-image CSVs into hash-partitioned bucket files.
2. Process one bucket at a time, sort, write per-source CSVs.
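The two phases can be sketched in a few lines of standard-library Python. This is only an illustration of the general hash-partitioning technique, not the library's code: `aggregate_sketch`, the column names `targetid`, `mjd`, `flux`, the bucket file naming, and the use of CRC32 as the hash are all assumptions made for the example.

```python
import csv
import zlib
from collections import defaultdict
from pathlib import Path


def aggregate_sketch(image_csv_dir: Path, lightcurve_dir: Path,
                     bucket_dir: Path, num_buckets: int = 64) -> int:
    """Two-phase aggregation; returns the number of per-source files written."""
    lightcurve_dir.mkdir(parents=True, exist_ok=True)
    bucket_dir.mkdir(parents=True, exist_ok=True)
    for stale in bucket_dir.glob("bucket_*.csv"):
        stale.unlink()  # start from a clean set of bucket files
    fields = ["targetid", "mjd", "flux"]  # assumed per-image CSV columns

    # Phase 1: stream rows into bucket files chosen by a stable hash of targetid,
    # so all rows for one source end up in the same bucket.
    handles = {}
    try:
        for csv_path in sorted(image_csv_dir.glob("*.csv")):
            with open(csv_path, newline="") as fh:
                for row in csv.DictReader(fh):
                    b = zlib.crc32(row["targetid"].encode()) % num_buckets
                    if b not in handles:
                        f = open(bucket_dir / f"bucket_{b:04d}.csv", "w", newline="")
                        w = csv.DictWriter(f, fieldnames=fields)
                        w.writeheader()
                        handles[b] = (f, w)
                    handles[b][1].writerow({k: row[k] for k in fields})
    finally:
        for f, _ in handles.values():
            f.close()

    # Phase 2: load one bucket at a time, group by source, sort by time, write.
    n_written = 0
    for bucket_path in sorted(bucket_dir.glob("bucket_*.csv")):
        per_source = defaultdict(list)
        with open(bucket_path, newline="") as fh:
            for row in csv.DictReader(fh):
                per_source[row["targetid"]].append(row)
        for targetid, rows in per_source.items():
            rows.sort(key=lambda r: float(r["mjd"]))
            with open(lightcurve_dir / f"{targetid}.csv", "w", newline="") as out:
                w = csv.DictWriter(out, fieldnames=fields)
                w.writeheader()
                w.writerows(rows)
            n_written += 1
    return n_written
```

Because only one bucket is held in memory during the second phase, peak memory scales with the largest bucket rather than with the full set of per-image CSVs; raising num_buckets shrinks each bucket at the cost of more open files in the first phase.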
Batch photometry pipeline — orchestrates query, download, extract, aggregate.
- spxquery.batch.pipeline.load_query_summary(output_dir: Path) → dict
Load a previously saved query_summary.yaml.
- Parameters:
output_dir (Path) – Root batch output directory containing the YAML file.
- class spxquery.batch.pipeline.BatchPipeline(config: BatchConfig)
Bases: object
Multi-source batch photometry pipeline.
Four stages: query -> download -> extract -> aggregate. Each stage can be run independently for resumable execution.
- Parameters:
config (BatchConfig) – Region, catalog, and processing configuration.
- __init__(config: BatchConfig)
- run_download(skip_existing: bool = True) → List[DownloadResult]
Stage 2: Download full-frame FITS images (no cutouts).
- run_extract(skip_existing: bool = True) → int
Stage 3: Extract multi-source photometry from each image.
- spxquery.batch.pipeline.run_batch(catalog: str, center_ra: float, center_dec: float, radius: float, output_dir: str = 'batch_output', bands: List[str] | None = None, coverage_mode: str = 'any', max_images: int = 500, max_download_workers: int = 4, max_extract_workers: int = 12, skip_existing: bool = True, photometry_config: PhotometryConfig | None = None) → BatchPipeline
Run the full batch photometry pipeline with one function call.
- Parameters:
catalog (str) – Path to CSV with columns targetid, ra, dec.
center_ra (float) – Region center in degrees.
center_dec (float) – Region center in degrees.
radius (float) – Region radius in degrees.
output_dir (str) – Root output directory.
coverage_mode (str) – "any" (INTERSECTS) or "full" (CONTAINS).
max_images (int) – Safety gate: raise if exceeded.
max_download_workers (int) – Parallel download threads.
max_extract_workers (int) – Parallel extraction processes.
skip_existing (bool) – Resume mode — skip already-processed images.
photometry_config (PhotometryConfig or None) – Override default photometry parameters.
- Returns:
The pipeline instance (for inspecting results).
- Return type:
BatchPipeline