
The Data Set



Data Set

The challenge is to identify simulated signals buried in the simulated data set that we provide. The data is saved in a binary format called frame files, which is the format used by gravitational-wave (GW) observatories worldwide. The data set consists of two hours' worth of simulated data from a future GW observatory called IndIGO, split into two frame files. Each frame file contains a channel called I1:INDIGO-STRAIN storing 3600 seconds of "strain" data. The frame file also contains related information, such as the sampling rate of the data, the GPS time stamps, etc.

http://www.gw-indigo.org/mdc-2011/data/frames_4k_v2/I-INDIGO-977875220-3600.gwf (113 MB)
http://www.gw-indigo.org/mdc-2011/data/frames_4k_v2/I-INDIGO-977878820-3600.gwf (113 MB)

Above, 977875220 and 977878820 are the starting GPS times of the two frames (corresponding to UTC times 2011-01-01 00:00:06 and 2011-01-01 01:00:06), and 3600 is the length of the data in each frame file (3600 seconds).
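The naming convention just described can also be unpacked programmatically. The following is a minimal Python sketch and not part of the challenge tools: the helper name parse_frame_name is hypothetical, and it assumes that file names always end in <GPS start>-<duration> followed by an extension.

```python
import os

def parse_frame_name(path):
    """Split a frame file name into (GPS start time, duration in seconds).

    Hypothetical helper: assumes names of the form
    <site>-<description>-<gps_start>-<duration>.<extension>.
    """
    base = os.path.basename(path)
    # Drop the extension(s), e.g. ".gwf" or ".txt.gz".
    stem = base.split(".", 1)[0]
    parts = stem.split("-")
    gps_start = int(parts[-2])
    duration = int(parts[-1])
    return gps_start, duration

start, duration = parse_frame_name("I-INDIGO-977875220-3600.gwf")
# The end of the first frame is the start of the second one.
print(start, duration, start + duration)
```

The same helper works on the names of the ASCII files, since only the part before the first dot is inspected.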

A convenient way to download the data on a Unix-like operating system is to use the GNU wget tool, which comes pre-installed on most Unix-like systems. In the shell, type:
wget http://www.gw-indigo.org/mdc-2011/data/frames_4k_v2/I-INDIGO-977875220-3600.gwf
wget http://www.gw-indigo.org/mdc-2011/data/frames_4k_v2/I-INDIGO-977878820-3600.gwf

This should download the frame files to your computer.

Data in ASCII text format

For those who have trouble dealing with the frame library, we have also created ASCII/text files of the mock data. These can be downloaded from here:

http://www.gw-indigo.org/mdc-2011/data/frames_4k_v2/I-INDIGO-977875220-3600.txt.gz (158 MB)
http://www.gw-indigo.org/mdc-2011/data/frames_4k_v2/I-INDIGO-977878820-3600.txt.gz (158 MB)

The text files are of the form:

#start time 977875220
#stop time 977878820
#sampling rate 4096.0 Hz
-1.779599161844960676e-17
-1.811919829796309542e-17
-1.812375465960586710e-17
....
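A text file in this layout can be read with a few lines of Python. The sketch below is only an illustration under stated assumptions: the function name read_ascii_strain is hypothetical, it assumes that every header line starts with "#" followed by a two-word key and a number, and the file path in the final comment is a guess at where you saved the download. It uses only the standard library; the samples are stored in a compact array of doubles.

```python
import gzip
import io
from array import array

def read_ascii_strain(fileobj):
    """Read header fields and strain samples from an open text file.

    Assumes '#'-prefixed header lines such as '#sampling rate 4096.0 Hz',
    followed by one floating-point sample per line.
    """
    header = {}
    samples = array("d")  # compact storage for double-precision values
    for line in fileobj:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            words = line[1:].split()
            # ['sampling', 'rate', '4096.0', 'Hz'] -> key 'sampling rate'
            header[" ".join(words[:2])] = float(words[2])
        else:
            samples.append(float(line))
    return header, samples

# Small in-memory example built from the first lines of the listing.
example = io.StringIO(
    "#start time 977875220\n"
    "#stop time 977878820\n"
    "#sampling rate 4096.0 Hz\n"
    "-1.779599161844960676e-17\n"
    "-1.811919829796309542e-17\n"
    "-1.812375465960586710e-17\n"
)
header, strain = read_ascii_strain(example)
print(header["sampling rate"], len(strain))

# For a downloaded file (the path is an assumption):
# with gzip.open("I-INDIGO-977875220-3600.txt.gz", "rt") as f:
#     header, strain = read_ascii_strain(f)
```

Reading a full one-hour file line by line in this way is simple but slow; if numpy is installed, the resulting array can be converted with numpy.array(strain) for further analysis.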

Software tools

We recommend using high-level programming languages such as Matlab/GNU-Octave or Python. For all practical purposes, "high-level" just means they are easy to learn and use! Additionally, Octave and Python are free software, and come preinstalled with many Unix-like operating systems. Participants are, of course, free to use any language they prefer, but reading the binary files containing the GW data (frame files) is particularly easy with these languages. We will also assume that the participants are using a Unix-like operating system, such as GNU/Linux, Solaris, Mac OS X, BSD, etc. Some useful resources for Matlab/Octave/Python are given below:

Matlab


Octave


Python

If you are using a Unix-like operating system, there is a high chance that your OS already has Python installed. To check, type python in the shell terminal. For this primer we will assume you are using a version of Python >= 2.4. If your OS does not have Python, you can install the appropriate version from the Python homepage.
Python by itself is fairly light-weight. To do a wide variety of numerical manipulations on arrays we will need the numpy package. Similarly, the scipy package provides the scientific subroutines needed for data analysis. Sub-packages needed for Python: numpy and scipy.
You might also need several other sub-packages of Python. There is a company, Enthought scientific computing solutions, which bundles the latest Python distribution with all the different sub-packages. For academic use (using an academic email address) it will let you download this bundled Python distribution free of cost.
There are several learning resources for Python on the web. We list a few here for your reference.

Frame library

The data collected by GW observatories worldwide is stored in a particular binary format, called frame format. A frame is a unit of information from the interferometer data over a finite time interval. The frame library is a collection of tools to read and manipulate frame files. To install the Frame library, first download the latest version from: http://lappweb.in2p3.fr/virgo/FrameL/.

The example below shows how to download, compile and install the frame library. Executing the following commands on a Unix-type system should work:
wget http://lappweb.in2p3.fr/virgo/FrameL/libframe-8.15.tar.gz
tar xzvf libframe-8.15.tar.gz
pushd libframe-8.15
./configure --prefix=SUITABLE_FOLDER_WHERE_YOU_WANT_TO_INSTALL
make
make install
popd

To link the frame library with other programs, it is useful to have the pkg-config program. Most likely your OS has pkg-config pre-installed. If not, you can download and install it from:
http://www.freedesktop.org/wiki/Software/pkg-config.
The following example demonstrates the usefulness of pkg-config.
$ export PKG_CONFIG_PATH=FOLDER_WHERE_YOU_INSTALLED_FRAMELIB/lib/pkgconfig:$PKG_CONFIG_PATH
$ pkg-config --libs --cflags-only-I libframe

The output of the above command will be similar to what is shown below.
-I/opt/local/include -L/opt/local/lib -lFrame
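To make explicit what each part of such an output line means, here is a small Python sketch that sorts the flags into include directories, library directories and library names. The function name split_pkg_config_flags is hypothetical and not part of pkg-config or the frame library; the input string is simply the sample output above, and the paths on your system will differ.

```python
import shlex

def split_pkg_config_flags(output):
    """Sort pkg-config output into include dirs, library dirs and libraries."""
    flags = {"include_dirs": [], "library_dirs": [], "libraries": []}
    for token in shlex.split(output):
        if token.startswith("-I"):
            flags["include_dirs"].append(token[2:])   # header search path
        elif token.startswith("-L"):
            flags["library_dirs"].append(token[2:])   # library search path
        elif token.startswith("-l"):
            flags["libraries"].append(token[2:])      # library to link against
    return flags

flags = split_pkg_config_flags("-I/opt/local/include -L/opt/local/lib -lFrame")
print(flags)
```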

These header and library locations are what other programs need in order to link with the frame library. There are wrapper codes that use the frame library to read/write frame files in Matlab/Octave/Python.

Frame library interface to Matlab/Octave/Python

This section describes how to read data into Matlab/Octave/Python. We mostly assume that the platform is Linux, but these instructions should work on any Unix-type system with minor changes.

Matlab

The Frame library also provides a "mex" file which can be used to read the data stored in the frame files into Matlab. First we need to compile the source code to create the mex file. Start Matlab, and go to the following directory: ~/libframe-8.15/matlab
You should be able to see the file frgetvect.c here. Now compile the mex file in the Matlab command window:
>> mex frgetvect.c FOLDER_WHERE_YOU_INSTALLED_FRAMELIB/lib/libFrame.so -I../src

This will create a mex file frgetvect.mexa64 (the extension depends on the system architecture). Copy frgetvect.mexa64 and frgetvect.m to your working directory or add ~/libframe-8.15/matlab to your Matlab path. Make sure that Matlab can access the mex file by typing:
>> which frgetvect

in the Matlab command window. Matlab should return the location of the newly compiled mex file. Type help frgetvect to see how to use this function. Here is an example:
>> [d, t] = frgetvect('I-INDIGO-977875220-3600.gwf', 'I1:INDIGO-STRAIN', 977875220, 16, 0);
Warning: frgetvect:info:Opening I-INDIGO-977875220-3600.gwf for channel I1:INDIGO-STRAIN 
(t0=977875220.00, duration=16.00)
>> figure; plot(t, d); xlabel('Seconds starting from 977875220'); ylabel('d(t)');

This will plot 16 seconds of time-series data starting from GPS seconds 977875220, as shown in the Figure below.

An example of the time series data retrieved from frame files.

Octave

Go to the directory ~/libframe-8.15/octave, and open the Makefile using a text editor. In the Makefile, change the following variables to point to the location of your newly installed frame library:
FRAME_INC = FOLDER_WHERE_YOU_INSTALLED_FRAMELIB/include
FRAME_LIB = FOLDER_WHERE_YOU_INSTALLED_FRAMELIB/lib

Now type make in the shell. This will create, among other files, a binary file called loadframe.oct. Copy loadframe.oct and loadframe.o to your working directory or add ~/libframe-8.15/octave to your Octave path. Type help loadframe at the Octave prompt to see how to use this function. Here is an example:
octave:1> [d, fs] = loadframe("I-INDIGO-977875220-3600.gwf", "I1:INDIGO-STRAIN", 1, 977875220);
octave:2> length(d)/fs
ans = 3600

Note that, unlike the case of the Matlab mex file, here we cannot specify the length of data to be read in a single call: loadframe loads all the data in the frame file (3600 seconds) into the vector d, and hence is rather slow. Also, fs = 1/dt is the sampling frequency of the data, where dt is the spacing between samples.
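When a whole frame has been loaded in one go, a shorter stretch can be cut out afterwards by converting GPS times to sample indices. The Python sketch below illustrates the arithmetic only; the function name extract_segment is hypothetical, and a plain range of integers stands in for one hour of strain samples at 4096 Hz.

```python
def extract_segment(data, fs, frame_start, seg_start, seg_duration):
    """Return the samples covering [seg_start, seg_start + seg_duration).

    data        : full time series loaded from one frame file
    fs          : sampling frequency in Hz
    frame_start : GPS time of the first sample in data
    """
    first = int(round((seg_start - frame_start) * fs))
    last = first + int(round(seg_duration * fs))
    if first < 0 or last > len(data):
        raise ValueError("requested segment lies outside the loaded data")
    return data[first:last]

# Stand-in for one frame's worth of data: 3600 s at 4096 Hz.
fs = 4096
full = range(3600 * fs)

# Cut out 16 s starting 100 s after the beginning of the frame.
segment = extract_segment(full, fs, 977875220, 977875220 + 100, 16)
print(len(segment))
```

The slicing works the same way on a numpy array or an Octave vector (keeping in mind that Octave indices start at 1).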

Python

Pylal is a package which offers several Python-based routines for GW data analysis. Here we will use a stripped-down version of pylal, used only to read the frame data. Assuming you have numpy, pkg-config and wget, you can just download the following script and run it. This shell script will install both the frame library and the pylal Python wrapper for you on any Unix-like OS:
http://gw-indigo.org/mdc-2011/tools/install_python_tools.sh
By default this installs in the following directory: ~/indigo_python/install. To use the Python frame reading wrapper, do:
source ~/indigo_python/install/pylal/etc/pylal-user-env.sh

Then you can start Python. At the Python prompt, type:
>>> from pylal import Fr

You can load a gwf file into a Python array using the frgetvect function, e.g.
>>> frame = Fr.frgetvect("I-INDIGO-977875220-3600.gwf","I1:INDIGO-STRAIN")

Now let's look at this array:
>>> frame
(array([ 3.85241898e-17, 3.70307674e-17, 3.55302314e-17, ...,
-3.90825860e-17, -3.74059633e-17, -3.56520304e-17]), 977875220.0, ('',), '')

The first element holds the time-series strain data (Note: unlike MATLAB/Octave, Python starts counting array indices from 0):
>>> frame[0]
array([ 3.85241898e-17, 3.70307674e-17, 3.55302314e-17, ...,
-3.90825860e-17, -3.74059633e-17, -3.56520304e-17])

The second element holds the GPS start time of the frame.
>>> frame[1]
977875220.0

The fourth element holds the inverse of the sampling rate, i.e. the time spacing between successive strain samples.
>>> frame[3]
(0.000244140625,)
>>> frame[3][0]
0.000244140625
>>> 1/frame[3][0]
4096.0

This particular frame file has 3600 s of data, which we can verify with:
>>> len(frame[0])*frame[3][0]
3600.0
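With the start time and the sample spacing in hand, a GPS time stamp can be attached to every sample. The following is a minimal sketch assuming numpy is installed; the function name gps_times is hypothetical, and an array of zeros stands in for 16 seconds of strain data so that the example runs without pylal or the frame files.

```python
import numpy as np

def gps_times(n_samples, gps_start, dt):
    """GPS time stamp of every sample, given the start time and spacing."""
    return gps_start + dt * np.arange(n_samples)

# Synthetic stand-ins for the values read from the frame (not real data).
gps_start = 977875220.0      # plays the role of frame[1]
dt = 1.0 / 4096.0            # plays the role of frame[3][0]
strain = np.zeros(16 * 4096) # plays the role of a slice of frame[0]

t = gps_times(len(strain), gps_start, dt)
print(t[0], t[-1] - t[0])
```

The arrays t and strain then have the same length and can be passed together to a plotting routine, much like the t and d vectors in the Matlab example.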


Support

If you have any issues with the software or data, you can use the mailing list of the MDC participants to discuss such issues: mdc-participants AT gw-indigo DOT org