
\documentclass[11pt, oneside]{article}   	% use "amsart" instead of "article" for AMSLaTeX format
\usepackage{geometry}                		% See geometry.pdf to learn the layout options. There are lots.
\geometry{letterpaper}                   		% ... or a4paper or a5paper or ... 
%\geometry{landscape}                		% Activate for rotated page geometry
%\usepackage[parfill]{parskip}    		% Activate to begin paragraphs with an empty line rather than an indent
\usepackage{graphicx}			
							
\usepackage{amssymb}

\usepackage{hyperref} 
\hypersetup{
    colorlinks = true
}

\title{Memo : pyuvdata.uvcal}
\author{Zaki Ali, Bryna Hazelton, Adam Beardsley, Paul La Plante and the pyuvdata team}
\date{July 3, 2017}							% Activate to display a given date or no date

\begin{document}
\maketitle
\section{Introduction}
This memo introduces the new \textit{calfits} file format for storing
calibration solutions using
pyuvdata\footnote{\url{https://github.com/RadioAstronomySoftwareGroup/pyuvdata}}, a python package that
provides an interface to interferometric data. The file format is defined
together with its code interface. Here, we describe the required and optional
parameters of the \textit{calfits} format, the interface for reading and writing
files, and the structure of the underlying fits file. For examples, please see the
pyuvdata tutorial on ReadTheDocs: \url{http://pyuvdata.readthedocs.io/en/latest/tutorial.html}.

\section{Basics}

The \textit{calfits} format is a specification for the storage of radio
interferometer calibration information in a fits file.  The contents of the file
are defined via explicit mapping to the UVCal object in the pyuvdata package.
The UVCal object (really, it's a class) is a subclass of the UVBase class
with a set of \textbf{uvparameters} that defines the UVCal object. A
\textbf{uvparameter} is a pyuvdata object that has a name, form, description,
value, required flag, expected type, acceptable values, and tolerances
associated with it.  \textbf{Uvparameters} are accessed as attributes of the
UVBase class (and its subclasses, like UVCal).

\section{Parameters}
In order to conform to the \textit{calfits} file format there are a number of
required parameters that need to be set as attributes of the UVCal class. Below
is a list of the required parameters and their descriptions.

\begin{itemize}
\item{\textbf{cal\_type}: Calibration type. Values are ``delay'', ``gain'' or
    ``unknown''.}
\item{\textbf{Nants\_data}: Number of antennas that have data associated with them 
    (i.e. number of unique entries in ant\_array). May be smaller than the number of 
    antennas in the telescope.}
\item{\textbf{Nants\_telescope}: Number of antennas in the array. May be larger
    than the number of antennas with data}
\item{\textbf{Nfreqs}: Number of frequency channels.}
\item{\textbf{Njones}: Number of Jones calibration parameters (number of
    Jones matrix elements calculated in calibration).}
\item{\textbf{Nspws}: Number of spectral windows (i.e. non-contiguous spectral
    chunks). More than one spectral window is not currently supported.}
\item{\textbf{Ntimes}: Number of times with different calibrations calculated
    (if a calibration is calculated over a range of integrations, this gives the
    number of separate calibrations along the time axis).}
\item{\textbf{ant\_array}: Array of antenna indices for data arrays, shape
    (Nants\_data). type = int, 0 indexed}
\item{\textbf{antenna\_names}: List of antenna names, shape (Nants\_telescope),
    with numbers given by antenna\_numbers (which can be matched to ant\_array).
    There must be one entry here for each unique entry in ant\_array, but there may be extras as well.}
\item{\textbf{antenna\_numbers}: List of integer antenna numbers corresponding
    to antenna\_names, shape (Nants\_telescope). There must be one entry here for each unique entry 
    in ant\_array, but there may be extras as well}
\item{\textbf{channel\_width}: Channel width of a frequency bin. Units Hz.}
\item{\textbf{flag\_array}: Array of flags to be applied to calibrated data
    (logical OR of input and flag generated by calibration). True is flagged.
    Shape: (Nants\_data, Nspws, Nfreqs, Ntimes, Njones), type = bool.}
\item{\textbf{freq\_array}: Array of frequencies, shape (Nspws, Nfreqs), units
    Hz.}
\item{\textbf{gain\_convention}: The convention for applying the calibration solutions to data.
    Values are ``divide'' or ``multiply'', indicating that to calibrate one should divide or multiply
    uncalibrated data by gains. Mathematically, this sets the exponent $\alpha$ in the equation: 
    calibrated data = gain$^{\alpha} \times $ uncalibrated data. A value of
    ``divide'' represents $\alpha=-1$ and ``multiply'' represents $\alpha=1$.}
\item{\textbf{history}: String of history}
\item{\textbf{integration\_time}: Integration time of a time bin, units seconds.}
\item{\textbf{jones\_array}: Array of antenna polarization integers, shape
    (Njones). Linear pols -5:-8 (jxx, jyy, jxy, jyx). Circular pols -1:-4 (jrr,
    jll, jrl, jlr).}
\item{\textbf{quality\_array}: Array of qualities of calibration solutions. The
    shape depends on cal\_type: if the cal\_type is ``gain'' or ``unknown'', the shape is
    (Nants\_data, Nspws, Nfreqs, Ntimes, Njones); if the cal\_type is ``delay'', the shape is 
    (Nants\_data, Nspws, 1, Ntimes, Njones). type = float}
\item{\textbf{spw\_array}: Array of spectral window numbers, shape (Nspws)}
\item{\textbf{telescope\_name}: Name of telescope. e.g. HERA. String.}
\item{\textbf{time\_array}: Array of calibration solution times, center of integration, shape
    (Ntimes), units Julian Date}
\item{\textbf{time\_range}: Time range (in JD) that gain solutions are valid
    for. list: [start\_time, end\_time] in JD.}
\item{\textbf{x\_orientation}: Orientation of the physical dipole corresponding
    to what is labelled as the x polarization. Values are east (east/west
    orientation), north (north/south orientation) or unknown.}
\end{itemize}
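
As an illustration of the gain\_convention parameter, the following minimal sketch applies the equation calibrated data = gain$^{\alpha} \times$ uncalibrated data to a single complex number. The function name apply\_gain and the sample values are hypothetical and chosen only for this example; this is a scalar reading of the equation in the list above, not the pyuvdata implementation.

```python
# Illustrative only: a scalar version of the gain_convention equation
#     calibrated = gain**alpha * uncalibrated
# `apply_gain` is a hypothetical helper, not part of pyuvdata.

GAIN_EXPONENT = {"divide": -1, "multiply": 1}


def apply_gain(uncalibrated, gain, gain_convention):
    """Return gain**alpha * uncalibrated for the given convention."""
    if gain_convention not in GAIN_EXPONENT:
        raise ValueError(
            "gain_convention must be 'divide' or 'multiply', got %r"
            % (gain_convention,)
        )
    alpha = GAIN_EXPONENT[gain_convention]
    return (gain ** alpha) * uncalibrated


raw = 3.0 + 4.0j    # made-up uncalibrated value
gain = 2.0 + 0.0j   # made-up gain

divided = apply_gain(raw, gain, "divide")       # raw / gain
multiplied = apply_gain(raw, gain, "multiply")  # raw * gain
```

Keeping the exponent in a lookup table makes the two allowed values explicit and rejects anything else early.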

There are also some conditionally required parameters that depend on the
calibration type. These parameters are:
\begin{itemize}
\item{\textbf{delay\_array}: Required if cal\_type = ``delay''. Array of delays with
    units of seconds. Shape: (Nants\_data, Nspws, 1, Ntimes, Njones), type = float.}
\item{\textbf{gain\_array}: Required if cal\_type = ``gain''. Array of gains, 
    shape: (Nants\_data, Nspws, Nfreqs, Ntimes, Njones), type = complex float.}
\item{\textbf{freq\_range}: Required if cal\_type = ``delay''. Frequency range that
   solutions are valid for. list: [start\_frequency, end\_frequency] in Hz.}
\end{itemize}

In addition to the required parameters, there are a number of truly optional
parameters that may be passed in. These include:

\begin{itemize}
\item{\textbf{git\_origin\_cal}: Origin (e.g. on GitHub) of calibration
    software. URL and branch.}
\item{\textbf{git\_hash\_cal}: Commit hash of calibration software (from
    git\_origin\_cal) used to generate solutions.}
\item{\textbf{input\_flag\_array}: Array of input flags, True is flagged. shape:
    (Nants\_data, Nspws, Nfreqs, Ntimes, Njones), type = bool.}
\item{\textbf{observer}: Name of observer who calculated solutions in this
    file.}
\item{\textbf{total\_quality\_array}: Array of qualities of the calibration
    solution for the entire array. The shape depends on cal\_type: if the cal\_type is
    ``gain'' or ``unknown'', the shape is (Nspws, Nfreqs, Ntimes, Njones); if the 
    cal\_type is ``delay'', the shape is (Nspws, 1, Ntimes, Njones). type = float.}
\end{itemize}
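
The shape rules in the lists above can be gathered into one place. The sketch below is a hypothetical helper (expected\_shapes is not a pyuvdata function) that returns the array shapes the memo lists for a given cal\_type; the argument names follow the parameter names used here.

```python
# Hypothetical helper collecting the array shapes listed in this memo.
# It is not part of pyuvdata; it only restates the rules above.

def expected_shapes(cal_type, Nants_data, Nspws, Nfreqs, Ntimes, Njones,
                    has_input_flags=False, has_total_quality=False):
    """Return a dict mapping parameter name -> expected shape tuple."""
    if cal_type not in ("gain", "delay", "unknown"):
        raise ValueError("cal_type must be 'gain', 'delay' or 'unknown'")

    full = (Nants_data, Nspws, Nfreqs, Ntimes, Njones)
    # Delay-type quality arrays use a length-1 frequency axis.
    nfreq_quality = 1 if cal_type == "delay" else Nfreqs

    shapes = {
        "ant_array": (Nants_data,),
        "freq_array": (Nspws, Nfreqs),
        "jones_array": (Njones,),
        "spw_array": (Nspws,),
        "time_array": (Ntimes,),
        "flag_array": full,
        "quality_array": (Nants_data, Nspws, nfreq_quality, Ntimes, Njones),
    }
    if cal_type == "gain":
        shapes["gain_array"] = full
    if cal_type == "delay":
        shapes["delay_array"] = (Nants_data, Nspws, 1, Ntimes, Njones)
    if has_input_flags:
        shapes["input_flag_array"] = full
    if has_total_quality:
        shapes["total_quality_array"] = (Nspws, nfreq_quality, Ntimes, Njones)
    return shapes


# Made-up dimensions for a delay-type calibration.
delay_shapes = expected_shapes("delay", 3, 1, 64, 10, 2,
                               has_total_quality=True)
```

A table like this could be compared against the actual array shapes before attempting to write a file.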

Once these parameters are set in the UVCal object, a \textit{calfits} file may
be written out.

\section{Reading and Writing a \textit{calfits} File}
Writing out the UVCal object to a file is very simple: just run
UVCal.write\_calfits(filename). That will write a fits file called
``filename''. Note that a filename check is done first, and an existing file
with the same name will not be overwritten. You can override this behavior
with the clobber keyword.

Reading in a calfits file is also straightforward. First instantiate the UVCal object and
then run UVCal.read\_calfits(filename). This updates the UVCal object with all
the parameters from the fits file.
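
The write and read calls described above can be sketched as a small round-trip helper. The helper name roundtrip\_calfits and the file name are hypothetical; the only pyuvdata calls assumed are write\_calfits (with the clobber keyword) and read\_calfits, as described in this section. The helper accepts any objects providing those two methods, so it does not import pyuvdata itself.

```python
# Sketch of a write-then-read round trip. `roundtrip_calfits` is a
# hypothetical helper; it assumes only the two methods named in this memo:
# write_calfits(filename, clobber=...) and read_calfits(filename).

def roundtrip_calfits(cal, empty_cal, filename, overwrite=False):
    """Write `cal` to `filename`, then read the file into `empty_cal`."""
    # clobber=True allows an existing file of the same name to be replaced.
    cal.write_calfits(filename, clobber=overwrite)
    empty_cal.read_calfits(filename)
    return empty_cal


# Intended use, assuming pyuvdata is installed and `cal` is a UVCal object
# with all required parameters set:
#
#     from pyuvdata import UVCal
#     cal_copy = roundtrip_calfits(cal, UVCal(), "example.calfits",
#                                  overwrite=True)
```

Leaving overwrite at its default keeps the protective filename check in place.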

There are examples of working with pyuvdata UVCal objects and \textit{calfits} files in
the tutorial (\url{http://pyuvdata.readthedocs.io/en/latest/tutorial.html}).

\subsection{The FITS file}
Depending on the calibration type (gain vs delay), the \textit{calfits} file
format can consist of up to 4 HDUs. The primary header in either case is the
same and consists of the meta information needed for a UVCal object to be
instantiated. The second HDU is also the same in either case and is the ANTENNAS
HDU. This HDU is a BinaryTable and consists of ANTNAME, ANTINDEX, and ANTARR,
corresponding to antenna\_names, antenna\_numbers, and ant\_array in the above
list, respectively.
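
The relationship between the three antenna parameters stored in the ANTENNAS HDU can be illustrated with a short lookup. The helper name names\_for\_data\_antennas and the sample antenna names are hypothetical; the sketch only relies on the rule stated earlier that every entry in ant\_array must also appear in antenna\_numbers.

```python
# Hypothetical helper: find the name of each antenna that has data,
# using antenna_numbers as the key that links ant_array to antenna_names.

def names_for_data_antennas(antenna_names, antenna_numbers, ant_array):
    """Return the antenna name for each entry of ant_array, in order."""
    if len(antenna_names) != len(antenna_numbers):
        raise ValueError("antenna_names and antenna_numbers differ in length")

    number_to_name = dict(zip(antenna_numbers, antenna_names))

    # Every antenna with data must be listed among the telescope antennas.
    missing = sorted(set(ant_array) - set(number_to_name))
    if missing:
        raise ValueError("ant_array entries not in antenna_numbers: %r"
                         % (missing,))
    return [number_to_name[num] for num in ant_array]


# Made-up telescope with four antennas, two of which have data.
antenna_names = ["ant0", "ant1", "ant2", "ant3"]
antenna_numbers = [0, 1, 2, 3]
ant_array = [3, 1]

data_names = names_for_data_antennas(antenna_names, antenna_numbers,
                                     ant_array)
```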

When the calibration type is ``gain'', the essential data are contained in only
these 2 HDUs. In this case, the image data in the primary HDU is a 6
dimensional array, whose dimensions correspond to (Nants, Nspws, Nfreqs,
Ntimes, Njones, number of arrays in image array), respectively. In other words,
the primary data HDU contains the 5 axes of the data given in the list above,
and then a sixth axis corresponding to the individual quantities being
saved. For instance, if there is an input\_flag\_array, the image array consists
of [real(gain\_array), imag(gain\_array), flag\_array, input\_flag\_array,
quality\_array], concatenated along the last axis, so the last
dimension is equal to 5. However, if no input\_flag\_array is given, the
input\_flag\_array is left out of the above array and the last dimension is
equal to 4.

When the calibration type is ``delay'', there are 3 data HDUs. The image data in
the primary HDU is still a 6 dimensional array as before (dimensions are Nants,
Nspws, Nfreqs, Ntimes, Njones, number of arrays in image array), but with
Nfreqs = 1 as a placeholder axis. This axis is added to keep the data arrays the
same number of dimensions in the delay-type and gain-type formats. In this case
the image data is [delay\_array, quality\_array], concatenated along the last
axis. The flag arrays are stored in the third HDU (ImageHDU), which has the
flag\_array and may have an input\_flag\_array.
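
The ordering of quantities along the last axis of the primary image array can be summarized in a few lines. The function primary\_image\_layout below is a hypothetical illustration, not pyuvdata code; it returns the list of stacked quantities and the resulting 6 dimensional shape for a given calibration type.

```python
# Hypothetical sketch of the primary HDU image layout described above.
# It does not touch any FITS file; it only computes names and shapes.

def primary_image_layout(cal_type, data_shape, has_input_flags=False):
    """Return (stacked quantity names, shape of the primary image array).

    data_shape is (Nants, Nspws, Nfreqs, Ntimes, Njones).
    """
    data_shape = tuple(data_shape)
    if len(data_shape) != 5:
        raise ValueError("data_shape must have 5 entries")

    if cal_type == "gain":
        planes = ["real(gain_array)", "imag(gain_array)", "flag_array"]
        if has_input_flags:
            planes.append("input_flag_array")
        planes.append("quality_array")
    elif cal_type == "delay":
        # Flags live in a separate HDU for delay-type files, and the
        # frequency axis is a length-1 placeholder.
        planes = ["delay_array", "quality_array"]
        data_shape = data_shape[:2] + (1,) + data_shape[3:]
    else:
        raise ValueError("cal_type must be 'gain' or 'delay'")

    return planes, data_shape + (len(planes),)


# Made-up dimensions: 3 antennas, 1 spw, 64 channels, 10 times, 2 Jones.
gain_planes, gain_shape = primary_image_layout(
    "gain", (3, 1, 64, 10, 2), has_input_flags=True)
delay_planes, delay_shape = primary_image_layout(
    "delay", (3, 1, 64, 10, 2))
```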

For both calibration types, there is also an optional total\_quality\_array HDU, which
contains information about the overall $\chi^2$ value of the whole array. The
size of the array is (Nspws, Nfreqs, Ntimes, Njones). For delay-type
calibrations, Nfreqs = 1 as above. If present, there will be 3 total HDUs for
gain-type files, and 4 total HDUs for delay-type files. Note that self-consistency
checks are run when reading and writing calfits files to ensure that arrays have
the proper size across different HDUs.
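
The HDU counts given in this subsection can be restated as a short function. The name expected\_hdus is hypothetical, and apart from ANTENNAS the labels it returns are descriptive roles chosen for this example rather than the extension names stored in the file.

```python
# Hypothetical summary of which HDUs a calfits file is expected to hold.
# Only "ANTENNAS" is a name taken from this memo; the other labels are
# descriptive placeholders.

def expected_hdus(cal_type, has_total_quality=False):
    """Return the ordered list of HDU roles for a calfits file."""
    if cal_type == "gain":
        hdus = ["primary", "ANTENNAS"]
    elif cal_type == "delay":
        # Delay-type files keep their flag arrays in a third HDU.
        hdus = ["primary", "ANTENNAS", "flags"]
    else:
        raise ValueError("cal_type must be 'gain' or 'delay'")

    if has_total_quality:
        hdus.append("total_quality")
    return hdus


n_gain = len(expected_hdus("gain", has_total_quality=True))
n_delay = len(expected_hdus("delay", has_total_quality=True))
```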

\end{document}