\documentclass[11pt, oneside]{article}   % use "amsart" instead of "article" for AMSLaTeX format
\usepackage{geometry}                    % See geometry.pdf to learn the layout options. There are lots.
\geometry{letterpaper}                   % ... or a4paper or a5paper or ...
%\geometry{landscape}                    % Activate for rotated page geometry
%\usepackage[parfill]{parskip}           % Activate to begin paragraphs with an empty line rather than an indent
\usepackage{graphicx}
\usepackage{amssymb}
\usepackage{hyperref}
\hypersetup{
    colorlinks = true
}

\title{Memo: pyuvdata.uvcal}
\author{Zaki Ali, Bryna Hazelton, Adam Beardsley, Paul La Plante and the pyuvdata team}
\date{July 3, 2017}                      % Activate to display a given date or no date

\begin{document}
\maketitle

\section{Introduction}
This memo introduces the new \textit{calfits} file format for storing calibration solutions using pyuvdata\footnote{\url{https://github.com/HERA-Team/pyuvdata}}, a Python package that provides an interface to interferometric data. The file format is defined together with its code interface. Here, we describe the required and optional parameters of the \textit{calfits} format and the interface for reading and writing, in addition to the structure of the underlying FITS file. For examples, please see the pyuvdata tutorial on ReadTheDocs: \url{http://pyuvdata.readthedocs.io/en/latest/tutorial.html}.

\section{Basics}
The \textit{calfits} format is a specification for the storage of radio interferometer calibration information in a FITS file. The contents of the file are defined via an explicit mapping to the UVCal object in the pyuvdata package. The UVCal object (strictly, a class) is a subclass of the UVBase class, with a set of \textbf{uvparameters} that defines it. A \textbf{uvparameter} is a pyuvdata object that has a name, form, description, value, required flag, expected type, acceptable values, and tolerances associated with it. \textbf{Uvparameters} are accessed as attributes of the UVBase class (and its subclasses, like UVCal).

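
As a rough illustration of the uvparameter idea described above, the following Python sketch models a parameter with a name, description, value, required flag, expected type and acceptable values, together with a container that exposes parameter values as attributes. The class names \texttt{SketchParameter} and \texttt{SketchBase} are hypothetical and are not part of pyuvdata; the form and tolerance fields are omitted for brevity, and the actual UVBase implementation may differ substantially.

\begin{verbatim}
```python
from dataclasses import dataclass
from typing import Any, Optional, Sequence


@dataclass
class SketchParameter:
    """Hypothetical stand-in for a uvparameter (not pyuvdata code)."""

    name: str
    description: str = ""
    value: Any = None
    required: bool = True
    expected_type: type = object
    acceptable_vals: Optional[Sequence[Any]] = None

    def problems(self):
        """Return a list of problems with the current value."""
        if self.value is None:
            if self.required:
                return [f"{self.name} is required but not set"]
            return []
        found = []
        if not isinstance(self.value, self.expected_type):
            type_name = self.expected_type.__name__
            found.append(f"{self.name} should be of type {type_name}")
        if (self.acceptable_vals is not None
                and self.value not in self.acceptable_vals):
            allowed = list(self.acceptable_vals)
            found.append(f"{self.name} must be one of {allowed}")
        return found


class SketchBase:
    """Hypothetical container exposing parameter values as attributes."""

    def __init__(self, *parameters):
        # Bypass the custom __setattr__ while storing the parameters.
        object.__setattr__(self, "_params",
                           {p.name: p for p in parameters})

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        params = object.__getattribute__(self, "_params")
        if name in params:
            return params[name].value
        raise AttributeError(name)

    def __setattr__(self, name, value):
        if name in self._params:
            self._params[name].value = value
        else:
            object.__setattr__(self, name, value)

    def check(self):
        """Collect the problems reported by every parameter."""
        return [msg for p in self._params.values()
                for msg in p.problems()]


cal = SketchBase(
    SketchParameter("cal_type", expected_type=str,
                    acceptable_vals=("delay", "gain", "unknown")),
    SketchParameter("Nfreqs", expected_type=int),
    SketchParameter("observer", expected_type=str, required=False),
)
cal.cal_type = "gain"
print(cal.cal_type)
print(cal.check())
```
\end{verbatim}

In this sketch, with only the calibration type set, the check reports the one required parameter that is still unset, while the unset optional parameter is not reported.
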
\section{Parameters}
To conform to the \textit{calfits} file format, a number of required parameters must be set as attributes of the UVCal class. Below is a list of the required parameters and their descriptions.
\begin{itemize}
\item{\textbf{cal\_type}: Calibration type. Values are delay, gain or unknown.}
\item{\textbf{Nants\_data}: Number of antennas that have data associated with them (i.e. number of unique entries in ant\_array). May be smaller than the number of antennas in the telescope.}
\item{\textbf{Nants\_telescope}: Number of antennas in the array. May be larger than the number of antennas with data.}
\item{\textbf{Nfreqs}: Number of frequency channels.}
\item{\textbf{Njones}: Number of Jones calibration parameters (number of Jones matrix elements calculated in calibration).}
\item{\textbf{Nspws}: Number of spectral windows (i.e. non-contiguous spectral chunks). More than one spectral window is not currently supported.}
\item{\textbf{Ntimes}: Number of times with different calibrations calculated (if a calibration is calculated over a range of integrations, this gives the number of separate calibrations along the time axis).}
\item{\textbf{ant\_array}: Array of antenna indices for data arrays, shape (Nants\_data), type = int, 0 indexed.}
\item{\textbf{antenna\_names}: List of antenna names, shape (Nants\_telescope), with numbers given by antenna\_numbers (which can be matched to ant\_array). There must be one entry here for each unique entry in ant\_array, but there may be extras as well.}
\item{\textbf{antenna\_numbers}: List of integer antenna numbers corresponding to antenna\_names, shape (Nants\_telescope). There must be one entry here for each unique entry in ant\_array, but there may be extras as well.}
\item{\textbf{channel\_width}: Channel width of a frequency bin, units Hz.}
\item{\textbf{flag\_array}: Array of flags to be applied to calibrated data (logical OR of input and flag generated by calibration).
True is flagged. Shape: (Nants\_data, Nspws, Nfreqs, Ntimes, Njones), type = bool.}
\item{\textbf{freq\_array}: Array of frequencies, shape (Nspws, Nfreqs), units Hz.}
\item{\textbf{gain\_convention}: The convention for applying the calibration solutions to data. Values are ``divide'' or ``multiply'', indicating that to calibrate one should divide or multiply uncalibrated data by gains. Mathematically, this sets the exponent $\alpha$ in the equation: calibrated data $=$ gain$^{\alpha} \times$ uncalibrated data. A value of ``divide'' represents $\alpha=-1$ and ``multiply'' represents $\alpha=1$.}
\item{\textbf{history}: String of history.}
\item{\textbf{integration\_time}: Integration time of a time bin, units seconds.}
\item{\textbf{jones\_array}: Array of antenna polarization integers, shape (Njones). Linear pols: $-5$ to $-8$ (jxx, jyy, jxy, jyx). Circular pols: $-1$ to $-4$ (jrr, jll, jrl, jlr).}
\item{\textbf{quality\_array}: Array of qualities of calibration solutions, type = float. The shape depends on cal\_type: if the cal\_type is ``gain'' or ``unknown'', the shape is (Nants\_data, Nspws, Nfreqs, Ntimes, Njones); if the cal\_type is ``delay'', the shape is (Nants\_data, Nspws, 1, Ntimes, Njones).}
\item{\textbf{spw\_array}: Array of spectral window numbers, shape (Nspws).}
\item{\textbf{telescope\_name}: Name of telescope, e.g. HERA. String.}
\item{\textbf{time\_array}: Array of calibration solution times, center of integration, shape (Ntimes), units Julian Date.}
\item{\textbf{time\_range}: Time range that gain solutions are valid for. List: [start\_time, end\_time] in JD.}
\item{\textbf{x\_orientation}: Orientation of the physical dipole corresponding to what is labelled as the x polarization. Values are east (east/west orientation), north (north/south orientation) or unknown.}
\end{itemize}

There are also some conditionally required parameters that depend on the calibration type. These parameters are:
\begin{itemize}
\item{\textbf{delay\_array}: Required if cal\_type = ``delay''. Array of delays with units of seconds. Shape: (Nants\_data, Nspws, 1, Ntimes, Njones), type = float.}
\item{\textbf{gain\_array}: Required if cal\_type = ``gain''. Array of gains, shape: (Nants\_data, Nspws, Nfreqs, Ntimes, Njones), type = complex float.}
\item{\textbf{freq\_range}: Required if cal\_type = ``delay''. Frequency range that solutions are valid for. List: [start\_frequency, end\_frequency] in Hz.}
\end{itemize}

In addition to the required parameters, there are a number of truly optional parameters that may be set. These include:
\begin{itemize}
\item{\textbf{git\_origin\_cal}: Origin (e.g. on GitHub) of calibration software. URL and branch.}
\item{\textbf{git\_hash\_cal}: Commit hash of calibration software (from git\_origin\_cal) used to generate solutions.}
\item{\textbf{input\_flag\_array}: Array of input flags, True is flagged. Shape: (Nants\_data, Nspws, Nfreqs, Ntimes, Njones), type = bool.}
\item{\textbf{observer}: Name of observer who calculated solutions in this file.}
\item{\textbf{total\_quality\_array}: Array of qualities of the calibration solution for the entire array, type = float. The shape depends on cal\_type: if the cal\_type is ``gain'' or ``unknown'', the shape is (Nspws, Nfreqs, Ntimes, Njones); if the cal\_type is ``delay'', the shape is (Nspws, 1, Ntimes, Njones).}
\end{itemize}

Once these parameters are set in the UVCal object, a \textit{calfits} file may be written out.

\section{Reading and Writing a \textit{calfits} File}
Writing out the UVCal object to a file is very simple: just run UVCal.write\_calfits(filename). That will write a FITS file called ``filename''. Note that the filename is checked first, and an existing file with the same name will not be overwritten. You can override this behavior with the clobber keyword.

Reading in a \textit{calfits} file is also straightforward.
First instantiate the UVCal object and then run UVCal.read\_calfits(filename). This updates the UVCal object with all the parameters from the FITS file. There are examples of working with pyuvdata UVCal objects and \textit{calfits} files in the tutorial (\url{http://pyuvdata.readthedocs.io/en/latest/tutorial.html}).

\subsection{The FITS file}
Depending on the calibration type (gain vs.\ delay), the \textit{calfits} file format can consist of up to 4 HDUs. The primary header is the same in either case and contains the metadata needed for a UVCal object to be instantiated. The second HDU is also the same in either case: it is the ANTENNAS HDU, a binary table consisting of ANTNAME, ANTINDEX, and ANTARR, corresponding to antenna\_names, antenna\_numbers, and ant\_array in the above list, respectively.

When the calibration type is ``gain'', the essential data is contained in these 2 HDUs alone. In this case, the image data in the primary HDU is a 6-dimensional array whose dimensions correspond to (Nants, Nspws, Nfreqs, Ntimes, Njones, number of arrays in image array), respectively. In other words, the primary data HDU contains the 5 axes of the data given in the list above, and then a sixth axis corresponding to the individual quantities being saved. For instance, if there is an input\_flag\_array, the image array consists of [real(gain\_array), imag(gain\_array), flag\_array, input\_flag\_array, quality\_array], concatenated along the last axis, so the last dimension is equal to 5. However, if no input\_flag\_array is given, it is left out of the above array and the last dimension is equal to 4.

When the calibration type is ``delay'', the essential data is spread over 3 HDUs. The image data in the primary HDU is still a 6-dimensional array as before (dimensions are Nants, Nspws, Nfreqs, Ntimes, Njones, number of arrays in image array), but with Nfreqs = 1 as a placeholder axis.
This axis is added to keep the data arrays at the same number of dimensions in the delay-type and gain-type formats. In this case the image data is [delay\_array, quality\_array], concatenated along the last axis. The flag arrays are stored in the third HDU (an ImageHDU), which holds the flag\_array and may also hold an input\_flag\_array.

For both calibration types, there is also an optional total\_quality\_array HDU, which contains information about the overall $\chi^2$ value of the whole array. The shape of the array is (Nspws, Nfreqs, Ntimes, Njones). For delay-type calibrations, Nfreqs = 1 as above. If present, there will be 3 total HDUs for gain-type files, and 4 total HDUs for delay-type files.

Note that self-consistency checks are run when reading and writing \textit{calfits} files to ensure that arrays have the proper size across different HDUs.

\end{document}