Quality Control
SPXQuery applies quality control filtering to identify reliable photometric measurements and flag problematic data.
Overview
Quality control operates on two criteria:
Signal-to-Noise Ratio (SNR) - Filters low-significance detections
Pixel Flags - Rejects measurements affected by instrumental or processing issues
Important: Quality filtering applies only to visualization. All measurements are saved to the CSV file, allowing users to apply custom filtering for their analysis.
Variance Repair
Automatic Handling of Flagged Pixels
SPXQuery automatically repairs variance estimates for pixels with valid flux but NaN (not-a-number) variance values. This occurs when pixel flags indicate quality issues but the flux measurement itself is valid.
How Variance Repair Works
During photometry extraction, if the variance at the source position is NaN:
Validation: Check that the NaN variance correlates with pixel flags (e.g., non-functional pixels)
Repair: Replace NaN variance with the median variance from valid (unflagged) pixels in the image
Logging: Record that variance repair was applied for this observation
Example log message:
WARNING: Variance at source position is NaN for file_D3_20250325_062.fits
INFO: Median variance from valid pixels: 2.34e-05
INFO: Using median variance as fallback for flux uncertainty calculation
Why Variance Repair Matters
Without variance repair, observations with NaN variance would be discarded even when the flux measurement is valid. Variance repair preserves this data while providing a conservative uncertainty estimate.
Impact:
More complete light curves: Preserves observations that would otherwise be lost
Conservative uncertainties: Median variance provides a reasonable fallback estimate
Quality tracking: Flagged pixels are still tracked, allowing users to filter if desired
When Variance Repair is Applied
Variance repair is only applied when:
The source pixel has valid (non-NaN) flux
The variance at the source position is NaN
Valid pixels exist elsewhere in the image to compute median variance
If all pixels have NaN variance, the observation is skipped with an error message.
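The repair steps and conditions above can be sketched in plain Python as follows. This is a minimal, dependency-free illustration, not SPXQuery's implementation: the function name `repair_variance`, the nested-list image layout, and the definition of a valid pixel (finite variance and none of the default bad flag bits set) are assumptions, and the flux check and logging are omitted.

```python
import math
from statistics import median

# Default bad flag bits as listed in this document
DEFAULT_BAD_FLAGS = (0, 1, 2, 6, 7, 9, 10, 11, 15)


def repair_variance(variance, flags, row, col, bad_flags=DEFAULT_BAD_FLAGS):
    """Return (variance_value, was_repaired) for the pixel at (row, col).

    variance and flags are equally shaped 2D lists (hypothetical layout).
    Raises ValueError when no valid pixel exists to compute a fallback.
    """
    value = variance[row][col]
    if not math.isnan(value):
        return value, False  # variance is usable, nothing to repair

    # Assumption: a pixel is "valid" if its variance is not NaN and
    # none of the bad flag bits are set.
    bad_mask = sum(1 << bit for bit in bad_flags)
    valid = [
        v
        for var_row, flag_row in zip(variance, flags)
        for v, f in zip(var_row, flag_row)
        if not math.isnan(v) and not (f & bad_mask)
    ]
    if not valid:
        raise ValueError("no valid pixels: cannot repair variance")
    return median(valid), True


# Made-up 3x3 cutout: the centre pixel is NONFUNC (bit 6) with NaN variance
nan = float("nan")
variance = [[1.0, 2.0, 3.0], [4.0, nan, 6.0], [7.0, 8.0, 100.0]]
flags = [[0, 0, 0], [0, 64, 0], [0, 0, 1]]
value, repaired = repair_variance(variance, flags, 1, 1)
print(value, repaired)
```

The flagged corner pixel (variance 100.0, bit 0 set) is excluded from the median, so a single bad pixel does not inflate the fallback value.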
Signal-to-Noise Ratio (SNR)
Definition
SNR is computed as:
SNR = flux / flux_error
Where:
flux is the aperture-corrected flux (MJy/sr)
flux_error is the combined uncertainty from photon noise and background variance
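As a worked instance of the formula, the sketch below computes the SNR for the sample row shown under CSV Output Format (flux 1007.005, flux_error 43.199). The helper name `compute_snr` and the choice to return NaN for a non-positive or non-finite error are illustrative assumptions, not documented SPXQuery behavior.

```python
import math


def compute_snr(flux, flux_error):
    """Return flux / flux_error, or NaN when the error is unusable."""
    if not math.isfinite(flux_error) or flux_error <= 0:
        return math.nan  # assumption: an unusable error gives an undefined SNR
    return flux / flux_error


snr = compute_snr(1007.005, 43.199)
print(f"{snr:.1f}")  # prints 23.3, the snr value in the sample CSV row
```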
SNR Threshold
The sigma_threshold parameter (in VisualizationConfig) sets the minimum SNR for “good” measurements in plots:
from spxquery.utils.params import export_default_parameters
# Export and customize visualization config
params_file = export_default_parameters("config", "my_params.yaml")
# Edit the YAML file:
# visualization:
#   sigma_threshold: 5.0  # Adjust as needed
# Load in pipeline
from spxquery.core.pipeline import run_pipeline
run_pipeline(
    ra=304.69,
    dec=42.44,
    output_dir="output",
    advanced_params_file="config/my_params.yaml",
)
Typical values:
3.0 - Marginal detections (relaxed)
5.0 - Standard detection threshold (default, recommended)
10.0 - High-confidence detections only (strict)
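To get a feel for how the three typical thresholds trade completeness for reliability, the sketch below counts how many measurements pass each one. The SNR values are made up for illustration and `count_good` is a hypothetical helper. The comparison uses `>=`, matching the "SNR ≥ threshold" rule for good measurements.

```python
# Made-up SNR values, for illustration only
snr_values = [1.2, 2.8, 3.4, 4.9, 5.0, 7.5, 12.1, 23.3]


def count_good(snrs, sigma_threshold):
    """Count measurements with SNR at or above the threshold."""
    return sum(1 for s in snrs if s >= sigma_threshold)


counts = {t: count_good(snr_values, t) for t in (3.0, 5.0, 10.0)}
for threshold, n_good in counts.items():
    print(f"sigma_threshold={threshold}: {n_good} of {len(snr_values)} pass")
```

Note that a measurement with SNR exactly 5.0 passes the default threshold.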
Effect on Visualization
In the combined plot:
Good measurements (SNR ≥ threshold): Filled circles, colored by wavelength/date
Rejected measurements (SNR < threshold): Gray crosses (×)
This allows you to see both the reliable measurements and the rejected data points for context.
Pixel Flags
SPHEREx Flag System
The SPHEREx FLAGS extension uses a bitmap where each bit represents a different quality issue. Multiple flags can be set for a single pixel.
Default Bad Flags
SPXQuery uses this default set of bad pixel flags (configured in PhotometryConfig):
bad_flags = [0, 1, 2, 6, 7, 9, 10, 11, 15]
Flag definitions (bits 12, 14, 17, and 19 are listed for reference but are not in the default bad_flags list above):
| Bit | Flag Name | Description |
|---|---|---|
| 0 | TRANSIENT | Transient event detected (cosmic ray, etc.) |
| 1 | OVERFLOW | Pixel overflow/saturation |
| 2 | SUR_ERROR | Sample-up-the-ramp error |
| 6 | NONFUNC | Non-functional pixel |
| 7 | DICHROIC | Dichroic reflection artifact |
| 9 | MISSING_DATA | Missing data |
| 10 | HOT | Hot pixel |
| 11 | COLD | Cold pixel |
| 15 | NONLINEAR | Non-linear response |
| 12 | FULLSAMPLE | Full sample available |
| 14 | PHANMISS | Phantom or missing |
| 17 | PERSIST | Detector persistence |
| 19 | OUTLIER | Statistical outlier |
Other Available Flags
SPHEREx provides additional flags that are not rejected by default:
| Bit | Flag Name | Description | Why Not Default |
|---|---|---|---|
| 21 | SOURCE | Source detected | Informational |
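Because several bits can be set at once, a raw flag integer is hard to read. The sketch below turns a flag value into flag names using only the bit assignments listed in the tables above. `FLAG_NAMES` and `flag_names` are hypothetical helpers, not part of SPXQuery, and any bit not listed in this document is reported as `BIT_<n>` rather than guessed.

```python
# Bit-to-name mapping copied from the tables in this document
FLAG_NAMES = {
    0: "TRANSIENT", 1: "OVERFLOW", 2: "SUR_ERROR", 6: "NONFUNC",
    7: "DICHROIC", 9: "MISSING_DATA", 10: "HOT", 11: "COLD",
    12: "FULLSAMPLE", 14: "PHANMISS", 15: "NONLINEAR", 17: "PERSIST",
    19: "OUTLIER", 21: "SOURCE",
}


def flag_names(flag_value):
    """Return the names of all set bits, lowest bit first."""
    value = int(flag_value)
    return [
        FLAG_NAMES.get(bit, f"BIT_{bit}")  # unknown bits keep a generic label
        for bit in range(32)
        if value & (1 << bit)
    ]


print(flag_names(2097152))               # only bit 21 is set
print(flag_names((1 << 6) | (1 << 10)))  # two flags set on the same pixel
```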
Customizing Bad Flags
Use YAML configuration to customize bad flags:
# my_params.yaml
photometry:
  bad_flags: [0, 1, 2]  # Relaxed: reject only TRANSIENT, OVERFLOW, and SUR_ERROR
# Or strict filtering (keep only one photometry block in the file)
photometry:
  bad_flags: [0, 1, 2, 4, 6, 7, 9, 10, 11, 14, 15, 17]  # Add PHANTOM, PHANMISS, PERSIST
# Or no flag filtering
photometry:
  bad_flags: []  # Accept all flags
Then load in pipeline:
run_pipeline(
    ra=304.69,
    dec=42.44,
    output_dir="output",
    advanced_params_file="my_params.yaml",
)
How Flag Filtering Works
The FLAGS extension in SPHEREx FITS files contains integer values where each bit represents a flag. A pixel is rejected if any of the specified flag bits are set.
Example:
pixel_flag = 2097152  # 1 << 21: only bit 21 (SOURCE) is set
bad_flags = [0, 1, 2]
# Reject the pixel if any of the bad flag bits is set
rejected = any(pixel_flag & (1 << bit) for bit in bad_flags)
print(rejected)
# Result: False, not rejected (bit 21 is not in bad_flags)
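The per-bit check above can also be written with a single combined mask, which is the usual idiom when the same test is applied to many pixels. The sketch below is an equivalent formulation under that assumption. `build_bad_mask` and `is_rejected` are hypothetical names, and this is not necessarily how SPXQuery implements the check internally.

```python
def build_bad_mask(bad_flags):
    """Fold a list of bit positions into one integer mask."""
    mask = 0
    for bit in bad_flags:
        mask |= 1 << bit
    return mask


def is_rejected(pixel_flag, bad_mask):
    """A pixel is rejected if it shares at least one set bit with the mask."""
    return (int(pixel_flag) & bad_mask) != 0


# Mask for the default bad flags listed in this document
default_mask = build_bad_mask([0, 1, 2, 6, 7, 9, 10, 11, 15])

source_only = 1 << 21              # informational SOURCE flag only
source_and_hot = (1 << 21) | (1 << 10)  # SOURCE plus HOT

print(is_rejected(source_only, default_mask))
print(is_rejected(source_and_hot, default_mask))
```

With an empty `bad_flags` list the mask is 0, so no pixel is ever rejected, which matches the "accept all flags" configuration.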
Quality Assessment Workflow
1. Check Distribution
After running the pipeline, examine the light curve CSV to assess quality:
import pandas as pd
df = pd.read_csv("output/results/lightcurve.csv", comment="#")
# Check SNR distribution
print("SNR statistics:")
print(df['snr'].describe())
# Check flag distribution
print("\nFlag counts:")
print(df['flag'].value_counts())
2. Identify Patterns
Look for systematic issues:
# Identify low-SNR measurements
low_snr = df[df['snr'] < 5.0]
print(f"Low SNR: {len(low_snr)} / {len(df)} ({100*len(low_snr)/len(df):.1f}%)")
# Check which flags are most common
def decode_flags(flag_value):
    """Extract which bits are set."""
    return [bit for bit in range(32) if int(flag_value) & (1 << bit)]
# Get all set flags across dataset
all_flags = []
for flag in df['flag']:
    all_flags.extend(decode_flags(flag))
flag_counts = pd.Series(all_flags).value_counts()
print("\nMost common flag bits:")
print(flag_counts.head(10))
3. Adjust Filtering
Based on the assessment, adjust quality control parameters:
# If too few good measurements, relax threshold
run_pipeline(..., sigma_threshold=3.0)
# If specific flag is problematic, add to bad_flags
run_pipeline(..., bad_flags=[0, 1, 2, 6, 7, 9, 10, 11, 15, 17])
Visualization Quality Indicators
Combined Plot
The visualization shows three types of data points:
Good measurements (filled circles)
SNR ≥ sigma_threshold
No bad pixel flags set
Colored by wavelength (left panel) or date (right panel)
Rejected measurements (gray crosses ×)
SNR < sigma_threshold OR bad pixel flags set
Shown for context but not used in trend analysis
Upper limits (downward arrows, if applicable)
Non-detections (negative flux or SNR < threshold)
Plotted at 3σ upper limit
Interpreting the Plot
High rejection rate:
Many gray crosses → adjust sigma_threshold or bad_flags
Check if source is too faint for aperture size
Clustered rejections:
Rejections at specific wavelengths → instrumental issue
Rejections at specific dates → transient contamination
No rejections:
All measurements pass quality control
May indicate overly relaxed filtering
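To check for clustered rejections numerically rather than by eye, the rejection fraction can be computed per group. The sketch below uses only the standard library and the `band`, `flag`, and `snr` columns of the light curve CSV. The function name `rejection_rate_by`, the sample rows, and the default threshold and flag list passed as arguments are illustrative assumptions.

```python
import csv
import io
from collections import defaultdict


def rejection_rate_by(rows, key, sigma_threshold=5.0,
                      bad_flags=(0, 1, 2, 6, 7, 9, 10, 11, 15)):
    """Return {group: fraction rejected}, grouping rows by the given column."""
    bad_mask = sum(1 << bit for bit in bad_flags)
    total = defaultdict(int)
    rejected = defaultdict(int)
    for row in rows:
        group = row[key]
        total[group] += 1
        low_snr = float(row["snr"]) < sigma_threshold
        flagged = bool(int(row["flag"]) & bad_mask)
        if low_snr or flagged:
            rejected[group] += 1
    return {group: rejected[group] / total[group] for group in total}


# Made-up rows with the same column names as the light curve CSV
sample = io.StringIO(
    "band,flag,snr\n"
    "D1,0,12.0\n"
    "D1,2097152,8.5\n"
    "D3,64,9.1\n"
    "D3,0,2.2\n"
    "D3,0,15.0\n"
)
rates = rejection_rate_by(csv.DictReader(sample), "band")
for band, rate in sorted(rates.items()):
    print(f"{band}: {100 * rate:.0f}% rejected")
```

For a real file, comment lines can be skipped before parsing, for example with `csv.DictReader(line for line in f if not line.startswith("#"))`. Grouping by a date-derived key instead of `band` gives the per-date view.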
CSV Output Format
The light curve CSV contains all measurements with quality flags:
obs_id,mjd,flux,flux_error,wavelength,bandwidth,band,flag,snr,is_upper_limit
2025W25_1B_0062_1,60842.269794,1007.005,43.199,1.940,0.048,D3,2097152,23.3,False
...
Quality-related columns:
flag - Integer bitmap of pixel flags
snr - Signal-to-noise ratio (flux / flux_error)
is_upper_limit - Boolean indicating non-detection
Users can apply custom filtering:
import pandas as pd
df = pd.read_csv("output/results/lightcurve.csv", comment="#")
# Custom filtering: SNR cut and no flag bits at all
# (this also drops rows that carry only informational flags such as SOURCE)
good = df[(df['snr'] >= 5.0) & (df['flag'] == 0)]
# Or more complex criteria: reject only selected flag bits
def has_bad_flags(flag_value, bad_flags=(0, 1, 2)):
    return any(int(flag_value) & (1 << bit) for bit in bad_flags)
df['is_good'] = (df['snr'] >= 5.0) & ~df['flag'].apply(has_bad_flags)
good = df[df['is_good']]
See Also
Pipeline Architecture - How quality control is applied
Parameters - Customizing quality thresholds
Tutorial - Practical examples