pyPhotometry data¶

pyPhotometry can save data either as binary files with a .ppd file extension or as comma separated value files with a .csv file extension. File names are determined by the subject ID, the date and time the recording started, and the file type, e.g. m1-2018-08-30-103945.ppd
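As an illustration of the naming scheme, the sketch below splits a file name into its parts. The function name parse_ppd_filename is hypothetical and not part of pyPhotometry, and the sketch assumes the timestamp always has the form YYYY-MM-DD-HHMMSS as in the example above.

```python
from datetime import datetime
from pathlib import Path

def parse_ppd_filename(file_name):
    """Split a pyPhotometry file name into subject ID, start time and file type.

    Hypothetical helper, not part of the pyPhotometry code.
    """
    path = Path(file_name)
    # The timestamp is the last four hyphen-separated fields (YYYY-MM-DD-HHMMSS).
    # Everything before it is the subject ID, which may itself contain hyphens.
    subject_id, year, month, day, clock = path.stem.rsplit("-", 4)
    start = datetime.strptime(f"{year}-{month}-{day}-{clock}", "%Y-%m-%d-%H%M%S")
    return subject_id, start, path.suffix

subject_id, start, file_type = parse_ppd_filename("m1-2018-08-30-103945.ppd")
print(subject_id, start.isoformat(), file_type)
```

Splitting from the right keeps subject IDs that contain hyphens intact.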

Saving the data as a binary .ppd file generates a single file per recording which contains both the acquisition settings and the data.

Saving the data as a .csv file generates two files per recording: a .csv file containing the data and a .json file containing the acquisition settings.

The binary data files are more compact than the .csv files. For example, a 1 hour recording at a 130 Hz sampling rate yields a .ppd file of ~3.6MB and a .csv file of ~8MB.

Importing data¶

Compatibility

The .ppd file format changed in version 1.1 of the pyPhotometry software such that in pulsed acquisition modes, rather than saving the analog signals after baseline subtraction, the raw LED-on signal and LED-off baseline are saved separately. The data import code from version 1.1 or later, for both Python and Matlab, supports both old and new .ppd files, but data import code from versions before 1.1 will not correctly open files generated with pyPhotometry version 1.1 or later.

If you are using Python for analysis you can import .ppd files using the import_ppd function in the data_import module:

from data_import import import_ppd

data = import_ppd('path\\to\\data_file.ppd', low_pass=20, high_pass=0.001)

The import_ppd function returns a dictionary with the following entries:

'subject_ID'    # Subject ID
'date_time'     # Recording start date and time (ISO 8601 format string)
'mode'          # Acquisition mode
'sampling_rate' # Sampling rate (Hz)
'LED_current'   # Current for LEDs 1 and 2 (mA)
'version'       # Version number of pyPhotometry
'n_analog_signals'  # Number of analog signals
'n_digital_signals' # Number of digital signals
# For each analog signal x in [1, n_analog_signals]
'analog_x'      # Analog signal (volts).  This is the baseline subtracted signal if in pulsed modes.
'analog_x_filt' # Filtered analog signal (volts).
'analog_x_raw_LED_on' # Analog signal before baseline subtraction (volts), pulsed modes only.
'analog_x_raw_baseline' # Baseline signal with LED off (volts), pulsed modes only.
'analog_x_clipping' # Samples where analog signal was clipping (bool), i.e. input voltage >= 3.3V.
# For each digital signal x in [1, n_digital_signals]
'digital_x'     # Digital signal (bool).
'pulse_inds_x'  # Locations of rising edges on digital input (samples).
'pulse_times_x' # Times of rising edges on digital input (ms).
'time'          # Time of each sample relative to start of recording (ms)
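To make the relationship between the 'digital_x', 'pulse_inds_x', 'pulse_times_x' and 'time' entries concrete, the sketch below derives them from a short made-up digital signal. It assumes that a rising edge is reported at the first sample where the input is high. This is an illustration of the definitions above, not the import_ppd source code.

```python
import numpy as np

sampling_rate = 100  # Hz, example value only.
digital_1 = np.array([0, 0, 1, 1, 0, 0, 0, 1, 0, 0], dtype=bool)

# Time of each sample relative to the start of the recording (ms).
time = np.arange(len(digital_1)) * 1000 / sampling_rate

# A rising edge is a sample that is high where the previous sample was low.
# Cast to int first, because numpy.diff on a bool array cannot give -1 or +1.
pulse_inds_1 = np.where(np.diff(digital_1.astype(int)) == 1)[0] + 1

# Convert sample indices to times in ms.
pulse_times_1 = pulse_inds_1 * 1000 / sampling_rate

print(pulse_inds_1, pulse_times_1)
```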

The high_pass and low_pass arguments provided to the import_ppd function determine the frequency in Hz of the highpass and lowpass filtering applied to the filtered analog signals. To disable highpass or lowpass filtering, set the respective argument to None. The filtering applies a 2nd order Butterworth filter in the forward and reverse directions to give a 4th order zero phase filter.
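The filtering described above can be sketched with SciPy as follows. The function name zero_phase_filter is hypothetical, and the way the bandpass case is constructed is an assumption. The actual implementation in the data_import module may differ in detail.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def zero_phase_filter(signal, sampling_rate, low_pass=None, high_pass=None):
    """Apply a 2nd order Butterworth filter forwards and backwards.

    Hypothetical helper illustrating the approach, not the pyPhotometry code.
    """
    if low_pass and high_pass:
        b, a = butter(2, [high_pass, low_pass], btype="bandpass", fs=sampling_rate)
    elif low_pass:
        b, a = butter(2, low_pass, btype="lowpass", fs=sampling_rate)
    elif high_pass:
        b, a = butter(2, high_pass, btype="highpass", fs=sampling_rate)
    else:
        return np.asarray(signal, dtype=float)  # No filtering requested.
    # filtfilt runs the filter in both directions, which cancels the phase shift.
    return filtfilt(b, a, signal)

# Example: a slow 1 Hz signal contaminated with 40 Hz noise, sampled at 130 Hz.
sampling_rate = 130
t = np.arange(0, 10, 1 / sampling_rate)
slow = np.sin(2 * np.pi * 1 * t)
noisy = slow + 0.5 * np.sin(2 * np.pi * 40 * t)
filtered = zero_phase_filter(noisy, sampling_rate, low_pass=20)
```

Because the filter is applied in both directions, the filtered signal is not delayed relative to the original, so event times read from the filtered and unfiltered signals remain aligned.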

If you are using Matlab for analysis you can import data with the function import_ppd.m in the tools folder:

data = import_ppd('path\\to\\data_file.ppd')

The Matlab import function returns a struct with the same fields as the dictionary returned by the Python import function, but without the filtered versions of the analog signals or digital input pulse times.

Data preprocessing¶

Photometry data typically needs preprocessing to remove noise, and correct for photobleaching and movement artifacts. Some photometry data preprocessing methods are shown in this notebook. A Python function implementing the preprocessing method shown in the notebook is provided in the data import module. To use it do:

from data_import import import_ppd, preprocess_data

data = import_ppd('path\\to\\data_file.ppd')

processed_signal = preprocess_data(data_dict=data, 
                                   signal="analog_1", 
                                   control="analog_2", 
                                   low_pass=10,
                                   normalisation="dF/F",
                                   plot=True)

For more information see the function docstring.

Binary data format¶

The binary .ppd files generated by pyPhotometry have the following structure:

Start byte	End byte	Content	Format
1	2	header size	2 byte little endian integer
3	2+header size	header	JSON object encoded as UTF-8 string
3+header size	file end	data	see below

The first two bytes of the file indicate the size of the header. The header is a UTF-8 encoded string which represents a JSON object with the following entries:

'subject_ID'         # Subject ID
'date_time'          # Recording start date and time (ISO 8601 format string)
'end_time'           # Recording end date and time (ISO 8601 format string)
'mode'               # Acquisition mode
'sampling_rate'      # Sampling rate (Hz).
'version'            # Version number of pyPhotometry
'volts_per_division' # Volts per division of the analog signals.
'LED_current'        # Current for LEDs 1 and 2 (mA)
'n_analog_channels'  # Number of analog channels
'n_digital_channels' # Number of digital channels
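A minimal sketch of reading this structure is shown below. The function name split_ppd_bytes and the header values are made up for illustration. The example builds a tiny synthetic file in memory rather than reading a real recording.

```python
import json

def split_ppd_bytes(file_bytes):
    """Split the bytes of a .ppd file into a header dictionary and data bytes.

    Hypothetical helper, not part of the pyPhotometry code.
    """
    # The first two bytes give the header size as a little endian integer.
    header_size = int.from_bytes(file_bytes[:2], "little")
    # The header is a UTF-8 encoded JSON object.
    header = json.loads(file_bytes[2:2 + header_size].decode("utf-8"))
    # Everything after the header is signal data.
    data_bytes = file_bytes[2 + header_size:]
    return header, data_bytes

# Build a small synthetic file: size prefix, header, then two 16 bit samples.
example_header = {"subject_ID": "m1", "date_time": "2018-08-30T10:39:45", "sampling_rate": 130}
header_bytes = json.dumps(example_header).encode("utf-8")
file_bytes = len(header_bytes).to_bytes(2, "little") + header_bytes + bytes([2, 0, 5, 0])

header, data_bytes = split_ppd_bytes(file_bytes)
print(header["subject_ID"], len(data_bytes))
```

For a real recording, file_bytes would be obtained by opening the .ppd file in binary mode ('rb') and reading its contents.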

The remainder of the data file contains the analog and digital signals. Each two-byte chunk of data encodes a 16 bit little endian unsigned integer. The most significant 15 bits of each integer encode one sample of analog signal and the least significant bit encodes one sample of digital signal. To convert the analog samples to volts, multiply them by the 'volts_per_division' value from the header.

In continuous acquisition modes and pulsed modes prior to version 1.1 (where only the baseline-subtracted signal is saved) the sequence of samples is:

Sample number modulo n_analog_channels	Analog data (15 most significant bits)	Digital data (least significant bit)
1	Analog channel 1	Digital channel 1
2	Analog channel 2	Digital channel 2

In Python the steps to convert these data into signals are:

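import numpy

# data_bytes is the content of the file after the header and
# volts_per_division is taken from the header.
# Convert the data bytes into an array of 16 bit unsigned integers.
data = numpy.frombuffer(data_bytes, dtype=numpy.dtype('<u2'))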

# Analog signals are most significant 15 bits of each integer,
# extract them by bit shifting 1 to the right.
analog = data >> 1

# Digital signals are least significant bit of each integer,
# extract them by bitwise AND with integer 1.
digital = data & 1

# Channels 1 and 2 are alternating samples:
analog_1  =  analog[0::2] * volts_per_division
analog_2  =  analog[1::2] * volts_per_division
digital_1 = digital[0::2]
digital_2 = digital[1::2]

In pulsed acquisition modes in pyPhotometry version 1.1 and later, the LED-on signal and LED-off baseline for each channel are saved separately in sequential samples yielding the following sequence of samples:

Sample number modulo 2*n_analog_channels	Analog data (15 most significant bits)	Digital data (least significant bit)
1	Analog channel 1 LED-on signal	Digital channel 1
2	Analog channel 1 LED-off baseline
3	Analog channel 2 LED-on signal	Digital channel 2
4	Analog channel 2 LED-off baseline

In Python the steps to convert these data into signals are:

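import numpy

# data_bytes is the content of the file after the header and
# volts_per_division is taken from the header.
# Convert the data bytes into an array of 16 bit unsigned integers.
data = numpy.frombuffer(data_bytes, dtype=numpy.dtype('<u2'))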

# Analog signals are most significant 15 bits of each integer,
# extract them by bit shifting 1 to the right.
analog = data >> 1

# Digital signals are least significant bit of each integer,
# extract them by bitwise AND with integer 1.
digital = data & 1

# Extract signals and baselines and compute baseline subtracted signal.
analog_1_LED_on_sig = analog[0::4] * volts_per_division # LED-on signal
analog_1_baseline = analog[1::4] * volts_per_division   # LED-off baseline
analog_1 = analog_1_LED_on_sig - analog_1_baseline # Baseline subtracted signal.
analog_2_LED_on_sig = analog[2::4] * volts_per_division # LED-on signal
analog_2_baseline = analog[3::4] * volts_per_division   # LED-off baseline
analog_2 = analog_2_LED_on_sig - analog_2_baseline # Baseline subtracted signal.

digital_1 = digital[0::4]
digital_2 = digital[2::4]
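The pulsed mode layout can be checked end to end on synthetic data, as in the sketch below. The sample values and the volts_per_division value are made up, and volts_per_division is treated as a single number as in the code above. Reshaping to four columns is an equivalent alternative to the strided slicing and assumes the data contain a whole number of sampling periods.

```python
import numpy as np

volts_per_division = 0.0001  # Placeholder value, read this from the header in practice.

# One row per sampling period:
# [channel 1 LED-on, channel 1 baseline, channel 2 LED-on, channel 2 baseline]
analog_counts = np.array([[1000, 200, 3000, 400],
                          [1100, 210, 3100, 410]], dtype="<u2")
digital_bits = np.array([[1, 0, 0, 0],
                         [0, 0, 1, 0]], dtype="<u2")

# Pack each sample: analog value in the upper 15 bits, digital value in the lowest bit.
data_bytes = ((analog_counts << 1) | digital_bits).tobytes()

# Decode: each row is one sampling period, each column one sample slot.
data = np.frombuffer(data_bytes, dtype=np.dtype("<u2"))
analog = (data >> 1).reshape(-1, 4) * volts_per_division
digital = (data & 1).reshape(-1, 4)

analog_1 = analog[:, 0] - analog[:, 1]  # Baseline subtracted channel 1 (volts).
analog_2 = analog[:, 2] - analog[:, 3]  # Baseline subtracted channel 2 (volts).
digital_1 = digital[:, 0]
digital_2 = digital[:, 2]
```

Multiplying by volts_per_division before subtracting converts the unsigned integers to floats, which avoids wrap-around if the baseline is ever larger than the LED-on sample.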

Comma separated value data format¶

The .csv files generated by pyPhotometry are UTF-8 encoded text files with 4 entries per line, separated by commas. Each line contains one sample each from the two analog inputs and two digital inputs, in the order:

Analog_1, Analog_2, Digital_1, Digital_2

Each analog sample is an integer between 0 and 32768 and each digital sample is an integer 0 or 1. The first line of the file contains the column names separated by commas, such that the start of a file might read:

Analog1, Analog2, Digital1, Digital2
25443,13364,0,0
25435,13563,0,1
25442,13759,1,0

The .json file containing the acquisition settings is a UTF-8 encoded text file which represents a JSON object containing the same information as the header of the binary data files. The .json files are human readable if opened in a text editor. The analog signal values in the .csv file can be converted into volts using the volts_per_division information in the .json file.
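A minimal sketch of loading the two files and converting to volts is shown below. It uses in-memory text in place of real files. The settings values are made up, and volts_per_division is assumed to be a single number. If your settings file stores one value per channel, index it accordingly.

```python
import io
import json
import numpy as np

# Stand-ins for the contents of the .csv and .json files.
csv_text = (
    "Analog1, Analog2, Digital1, Digital2\n"
    "25443,13364,0,0\n"
    "25435,13563,0,1\n"
    "25442,13759,1,0\n"
)
json_text = '{"sampling_rate": 130, "volts_per_division": 0.0001}'  # Hypothetical values.

settings = json.loads(json_text)

# Skip the line of column names and read the samples as integers.
samples = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1, dtype=int)

analog_1 = samples[:, 0] * settings["volts_per_division"]  # Volts.
analog_2 = samples[:, 1] * settings["volts_per_division"]  # Volts.
digital_1 = samples[:, 2].astype(bool)
digital_2 = samples[:, 3].astype(bool)

# Time of each sample relative to the start of the recording (ms).
time = np.arange(len(samples)) * 1000 / settings["sampling_rate"]
```

With real recordings, pass the path of the .csv file to numpy.loadtxt and load the settings with json.load on the opened .json file.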

Synchronisation¶

To synchronise pyPhotometry data with behavioural data we typically send sync pulses from the behavioural hardware to a pyPhotometry digital input. For more information see the synchronisation page of the pyControl docs.

An example analysis showing how to synchronise pyControl behavioural data with neural activity recorded using pyPhotometry is provided in this data synchronisation Jupyter notebook.
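As a rough illustration of the idea, and not the method used in the notebook or the pyControl docs, the sketch below fits a straight line between the times of the same sync pulses recorded on the two systems and uses it to convert behavioural event times into pyPhotometry time. The pulse times are made up, and the sketch assumes the pulses have already been matched one to one.

```python
import numpy as np

# Hypothetical times (ms) of the same four sync pulses on each system's clock.
pulse_times_behaviour = np.array([1000.0, 2500.0, 4200.0, 7000.0])
pulse_times_photometry = np.array([350.1, 1850.25, 3550.42, 6350.7])

# Fit photometry_time = slope * behaviour_time + intercept.
# The slope absorbs clock drift and the intercept the different start times.
slope, intercept = np.polyfit(pulse_times_behaviour, pulse_times_photometry, 1)

def behaviour_to_photometry_time(t):
    """Convert behavioural times (ms) to pyPhotometry times (ms)."""
    return slope * np.asarray(t) + intercept

print(behaviour_to_photometry_time([5000.0]))
```

Sending pulses at irregular intervals makes it easier to match pulses between the two recordings when one system starts recording after the other.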