QML Power Spectrum Estimation Pipeline ====================================== .. currentmodule:: cosmoforge.qube The QML (Quadratic Maximum Likelihood) power spectrum estimation pipeline provides unbiased, minimum-variance power spectrum estimates from observed data. This script orchestrates the complete analysis workflow from data input through statistical inference, for any spin-0 or spin-2 field on the sphere. Overview -------- The ``main_qml.py`` script implements a comprehensive QML estimation pipeline optimized for power spectrum analysis of spherical fields. Key features include: * Unbiased power spectrum estimation with exact error propagation * Support for auto-correlation and cross-correlation analyses * MPI parallelization for computational efficiency * Integration with Fisher information matrix results * Comprehensive likelihood data for cosmological parameter fitting The pipeline handles both simulated and observed data, providing production-ready power spectrum estimates suitable for cosmological inference. Mathematical Background ----------------------- QML Estimator Theory ^^^^^^^^^^^^^^^^^^^^ For each multipole :math:`\ell`, the QML estimator provides the minimum-variance unbiased estimate: .. math:: \hat{C}_\ell = \frac{\mathbf{x}^T \mathbf{E}_\ell \mathbf{x}}{\text{Tr}[\mathbf{E}_\ell \mathbf{S}]} where: - :math:`\mathbf{x}` is the observed data vector (e.g. temperature and polarization maps) - :math:`\mathbf{E}_\ell = \mathbf{C}^{-1} \frac{\partial \mathbf{C}}{\partial C_\ell} \mathbf{C}^{-1}` is the estimator matrix - :math:`\mathbf{C}` is the total covariance matrix (signal + noise + systematics) - :math:`\mathbf{S}` is the signal covariance matrix Statistical Properties ^^^^^^^^^^^^^^^^^^^^^^ The QML estimator achieves optimal statistical properties: * **Unbiasedness**: :math:`\langle \hat{C}_\ell \rangle = C_\ell` for all multipoles * **Minimum Variance**: Saturates the Cramér-Rao bound given by Fisher information * **Gaussianity**: Asymptotically Gaussian for large sky coverage * **Correlation Structure**: Full covariance matrix available for likelihood analysis Usage ----- Command Line Interface ^^^^^^^^^^^^^^^^^^^^^^ .. code-block:: bash # Single-process execution (small analyses) python main_qml.py # MPI parallel execution (recommended) mpirun -n 16 python main_qml.py # With custom configuration file python main_qml.py --config qml_analysis.yaml Script Workflow ^^^^^^^^^^^^^^^ The execution pipeline consists of the following stages: 1. **Environment Initialization**: Set up MPI communicator and process management 2. **Configuration Parsing**: Load and validate YAML configuration parameters 3. **Data Loading**: Read sky maps, noise covariance, and auxiliary data files 4. **Geometry Setup**: Define pixel selection, multipole binning, and field combinations 5. **Matrix Preparation**: Compute or load estimator and normalization matrices 6. **QML Computation**: Execute parallel power spectrum estimation 7. **Result Gathering**: Collect and synchronize estimates across MPI processes 8. **Output Generation**: Save power spectra, covariance matrices, and likelihood data 9. **Validation**: Perform sanity checks and quality assessment Configuration Format -------------------- The script requires a YAML configuration file specifying analysis parameters: .. code-block:: yaml # QML Analysis Configuration # Data specification mapfiles1: ["map_T.fits", "map_Q.fits", "map_U.fits"] covmatfile1: "noise_covariance.bin" maskfile: "analysis_mask.fits" # Cross-correlation (optional) do_cross: false mapfiles2: ["map2_T.fits", "map2_Q.fits", "map2_U.fits"] # if do_cross=true covmatfile2: "noise_covariance2.bin" # if do_cross=true # Analysis parameters nside: 512 lmin: 2 lmax: 2000 nbands: 40 # Number of bandpowers nfields: 3 # T, Q, U physical_labels: ["T", "Q", "U"] # Fiducial model clfile: "fiducial_power_spectra.txt" beamfile: "beam_transfer_function.fits" # Output files outfile: "qml_power_spectra.dat" outfilelhood: "likelihood_data.dat" outfilecov: "bandpower_covariance.dat" # Computation options compute_fisher: true # Include Fisher information save_intermediate: false # Save estimator matrices (large files) Implementation Features ----------------------- Core Algorithms ^^^^^^^^^^^^^^^ The pipeline implements several sophisticated algorithms: * **Iterative Matrix Methods**: Conjugate gradient for large matrix operations * **Bandpower Binning**: Optimal multipole averaging for noise reduction * **Cross-Spectrum Handling**: Proper treatment of cross-correlation estimates * **Numerical Stability**: Regularization and conditioning for robust computation Performance Optimization ^^^^^^^^^^^^^^^^^^^^^^^^^ Key optimization strategies include: * **Distributed Memory**: MPI parallelization across estimator computation * **Memory Management**: Efficient handling of large covariance matrices * **I/O Optimization**: Parallel file operations and memory mapping * **Computational Kernels**: Optimized linear algebra operations (BLAS/LAPACK) Quality Control ^^^^^^^^^^^^^^^ The pipeline includes comprehensive validation: * **Input Verification**: File format, dimensions, and consistency checking * **Numerical Stability**: Matrix conditioning and eigenvalue analysis * **Statistical Tests**: Unbiasedness checks and chi-squared validation * **Convergence Monitoring**: Iterative solver diagnostics Output Products --------------- The format of power spectrum outputs depends on the normalization mode selected when calling ``get_power_spectra(mode=...)``. The default ``"deconvolved"`` mode returns estimates of the true :math:`C_\ell`; ``"decorrelated"`` returns uncorrelated bandpowers with unit-variance errors; ``"convolved"`` returns raw estimates with the window matrix for forward-modelling comparisons. See :doc:`spectra` for full details. Primary Outputs ^^^^^^^^^^^^^^^ **Power Spectrum File** (``qml_power_spectra.dat``): .. code-block:: text # QML Power Spectrum Estimates # Column 1: Effective multipole # Column 2: TT power (μK²) # Column 3: EE power (μK²) # Column 4: BB power (μK²) # Column 5: TE cross-power (μK²) # Column 6: TB cross-power (μK²) # Column 7: EB cross-power (μK²) 2.5 5247.3 4.2 0.015 -142.1 0.002 0.001 7.5 4923.8 12.8 0.023 -129.4 0.003 0.002 ... **Likelihood Data** (``likelihood_data.dat``): Contains full statistical information for cosmological parameter fitting: - Bandpower central values and uncertainties - Complete covariance matrix between bandpowers - Window functions and effective multipoles - Calibration and beam uncertainties **Covariance Matrix** (``bandpower_covariance.dat``): Full covariance matrix enabling proper likelihood construction for parameter fitting. Diagnostic Outputs ^^^^^^^^^^^^^^^^^^ **Analysis Log**: Detailed execution information including: - Timing and memory usage statistics - Convergence diagnostics for iterative methods - Quality metrics and validation results - Parameter summary and configuration record **Intermediate Products** (optional): - Estimator matrices for reanalysis - Normalization factors and window functions - Pixel weights and effective areas Examples -------- Standard Auto-Correlation Analysis ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ .. code-block:: python """ Example: QML power spectrum estimation from CMB maps """ import subprocess import yaml # Configure standard analysis config = { 'mapfiles1': ['planck_T.fits', 'planck_Q.fits', 'planck_U.fits'], 'covmatfile1': 'planck_noise_cov.bin', 'maskfile': 'planck_analysis_mask.fits', 'nside': 1024, 'lmin': 2, 'lmax': 2500, 'nbands': 50, 'clfile': 'planck2018_fiducial.txt', 'beamfile': 'planck_beam.fits', 'outfile': 'planck_qml_spectra.dat', 'compute_fisher': True } # Save and run with open('qml_config.yaml', 'w') as f: yaml.dump(config, f) subprocess.run(['mpirun', '-n', '16', 'python', 'main_qml.py', '--config', 'qml_config.yaml']) Cross-Correlation Analysis ^^^^^^^^^^^^^^^^^^^^^^^^^^ .. code-block:: python """ Example: Cross-correlation between two CMB datasets """ # Configure cross-correlation cross_config = { 'do_cross': True, 'mapfiles1': ['dataset1_T.fits', 'dataset1_Q.fits', 'dataset1_U.fits'], 'mapfiles2': ['dataset2_T.fits', 'dataset2_Q.fits', 'dataset2_U.fits'], 'covmatfile1': 'dataset1_noise.bin', 'covmatfile2': 'dataset2_noise.bin', 'maskfile': 'common_mask.fits', 'nside': 512, 'lmin': 2, 'lmax': 2000, 'nbands': 30, 'outfile': 'cross_correlation_spectra.dat' } Multi-Scale Analysis Pipeline ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ .. code-block:: bash # Progressive resolution analysis # Low-resolution initial estimate python main_qml.py --config configs/qml_nside256.yaml # Medium resolution mpirun -n 8 python main_qml.py --config configs/qml_nside512.yaml # High resolution (production) mpirun -n 32 python main_qml.py --config configs/qml_nside1024.yaml Systematic Error Studies ^^^^^^^^^^^^^^^^^^^^^^^^^ .. code-block:: python """ Example: Impact of analysis choices on power spectrum estimates """ # Test different multipole ranges lmax_values = [1500, 2000, 2500, 3000] for lmax in lmax_values: config = base_config.copy() config['lmax'] = lmax config['outfile'] = f'qml_spectra_lmax{lmax}.dat' with open(f'config_lmax{lmax}.yaml', 'w') as f: yaml.dump(config, f) subprocess.run(['mpirun', '-n', '16', 'python', 'main_qml.py', '--config', f'config_lmax{lmax}.yaml']) Performance Guidelines ---------------------- Computational Scaling ^^^^^^^^^^^^^^^^^^^^^^ QML estimation scales with system size as: * **Memory**: :math:`O(N_{ell} \times N_{pix}^2)` for estimator matrices * **Computation**: :math:`O(N_{ell} \times N_{pix}^2)` for quadratic forms * **Communication**: :math:`O(N_{ell} \times N_{pix})` for MPI coordination Resource Requirements ^^^^^^^^^^^^^^^^^^^^^ Typical resource needs for different analysis configurations: .. list-table:: :header-rows: 1 :widths: 15 20 20 25 20 * - Configuration - Memory (GB) - Compute Time - Storage (GB) - Recommended Cores * - Quick test (nside=128) - ~4 - ~1 hour - ~1 - 4-8 * - Standard (nside=512) - ~64 - ~8 hours - ~10 - 16-32 * - High-res (nside=1024) - ~512 - ~48 hours - ~100 - 32-64 * - Ultimate (nside=2048) - ~4000 - ~2 weeks - ~1000 - 64-128 Optimization Strategies ^^^^^^^^^^^^^^^^^^^^^^^ For large-scale analyses: 1. **Iterative Solvers**: Use preconditioned conjugate gradient for matrix operations 2. **Block Processing**: Process multipole ranges in blocks to reduce memory 3. **Checkpointing**: Save intermediate results for fault tolerance 4. **Load Balancing**: Distribute estimator computation optimally across processes Validation and Quality Assessment --------------------------------- Statistical Tests ^^^^^^^^^^^^^^^^^ The pipeline includes several validation checks: **Unbiasedness Test**: Compare estimates from multiple simulations to input spectra **Chi-squared Test**: Verify statistical consistency of bandpower residuals **Cross-Validation**: Compare estimates from different sky regions or data splits **Null Tests**: Analyze difference maps and systematic-dominated modes Error Analysis ^^^^^^^^^^^^^^ Comprehensive error characterization includes: * **Statistical Errors**: From Fisher information matrix and Monte Carlo * **Systematic Errors**: From pipeline variations and null tests * **Calibration Uncertainties**: Propagated through covariance matrices * **Theoretical Modeling**: Impact of fiducial spectrum assumptions Troubleshooting --------------- Common Issues and Solutions ^^^^^^^^^^^^^^^^^^^^^^^^^^^ **Convergence Problems**: * Check covariance matrix conditioning * Verify input map and mask consistency * Examine numerical precision requirements * Consider iterative solver tolerances **Memory Limitations**: * Reduce nside or multipole range * Use more MPI processes for distribution * Enable out-of-core computation modes **Unexpected Results**: * Validate input files and units * Run on known simulations for comparison * Check for systematic contamination * Verify calibration and beam corrections **Performance Issues**: * Profile MPI communication patterns * Optimize I/O and memory access patterns * Consider alternative solver algorithms * Balance computation vs. communication See Also -------- * :doc:`spectra` : Spectra class API documentation * :doc:`main_fisher` : Fisher matrix analysis pipeline * :doc:`produce_mock_inputs` : Mock data generation utilities * :mod:`cosmoforge.cosmocore.harmonic` : Spherical harmonic utilities