Introduction: Getting started

In this section, we explain how to get started with the dataset and modeling. For convenience, all the necessary elements and code are packaged into one Python library called phyaat. Here we explain the functionality the PhyAAt library provides, along with the options for tuning the preprocessing and feature-extraction steps. For a quick example of predictive modeling, check the Predictive Modeling tab.

For a quick start with predictive modeling, check the EXAMPLE CODE.

Table of Contents

1. Install Library
2. Download dataset
3. Preprocessing
4. Extract X, y for a task (Rhythmic Features)
5. Predictive Modeling
6. Extracting LWR segments for external processing

1. Install Library

First, install the Python library:

pip install phyaat

2. Download dataset

Once the PhyAAt library is installed, the dataset can be downloaded using it. You can download the entire dataset at once, or only the data of one particular subject for testing and quick runs.

import phyaat
print('Version :', phyaat.__version__)
import phyaat as ph

2.1 To download dataset of subject #1

To download the dataset of only one subject, with subject id = 1 (subject=1), use the following code. Here baseDir='../PhyAAt_Data' is the path where the data will be downloaded and stored. Make sure you have permission to write to the given path.

dirPath = ph.download_data(baseDir='../PhyAAt_Data', subject=1,verbose=0,overwrite=False)
#returns a dictionary containing file names of all the subjects available in baseDir
SubID = ph.ReadFilesPath(dirPath)

# list of all the subjects in the dataset directory
print(SubID.keys())

2.2 To download dataset of all the subjects

dirPath = ph.download_data(baseDir='../PhyAAt_Data', subject=-1,verbose=0,overwrite=False)

# Check the number of subjects in the directory - read the file paths of all the subjects available

baseDir = '../PhyAAt_Data'   # or use dirPath returned above

#returns a dictionary containing file names of all the subjects available in baseDir
SubID = ph.ReadFilesPath(baseDir)
# list of all the subjects in the dataset directory
print(SubID.keys())

3. Preprocessing

# Create an object holding the data of one subject

Subj = ph.Subject(SubID[1])

3.1. Filtering

Highpass filter with a cut-off frequency of 0.5 Hz

This is a standard step to remove any DC component and drift in the signal.

#filtering with highpass filter of cutoff frequency 0.5Hz
Subj.filter_EEG(band =[0.5],btype='highpass',order=5)

Filtering with a custom frequency range: the range should be within 0-64 Hz (half of the 128 Hz sampling rate). To analyse the EEG in a particular frequency band, such as for ERP analysis, you may need to apply a custom frequency band, as in the examples below.

Lowpass filter

#filtering with lowpass filter
Subj.filter_EEG(band =[30],btype='lowpass',order=5)

Bandpass filter

#filtering with bandpass filter Theta
Subj.filter_EEG(band =[4,8],btype='bandpass',order=5)

Filter settings

# method = 'lfilter'   # other options: 'filtfilt', 'SOS'
# useRaw = False       # if True, the raw EEG is used and the previously processed EEG is overwritten

Subj.filter_EEG(band =[0.5],btype='highpass',order=5,method='lfilter',fs=128.0,verbose=0,use_joblib=False,n_jobs=-1,useRaw=False)
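
For example, to re-filter directly from the raw EEG with a zero-phase filter, the options listed in the comments above can be combined (a minimal sketch using the same parameters):

# zero-phase highpass filtering, applied to the raw EEG (overwrites the previously processed EEG)
Subj.filter_EEG(band=[0.5], btype='highpass', order=5, method='filtfilt', useRaw=True)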

3.2 Applying an Artifact Removal Algorithm to the EEG

ATAR Algorithm - wavelet-based approach (available in version > 0.0.2)

A tunable, wavelet-based artifact removal algorithm.

# with window size = 128 (1 sec, recommended). To save time, use winsize=128*10 for a 10 sec window

Subj.correct(method='ATAR',verbose=1,winsize=128, wv='db3', thr_method='ipr',  OptMode='soft',beta=0.1)

# check all the parameters for ATAR
help(ph.Subject.correct)

Tuning the parameters of the ATAR algorithm: mostly, we tune beta and OptMode, for example:

OptMode='elim'
beta=0.2

Subj.correct(method='ATAR',verbose=1,winsize=128, wv='db3', thr_method='ipr',  OptMode=OptMode,beta=beta)

#check all the parameters here
help(ph.Subject.correct)

ICA based approach

# with window size = 128 (1 sec, recommended). To save time, use winsize=128*10 for a 10 sec window

Subj.correct(method='ICA',verbose=1,winsize=128)

Changing the parameters of the ICA-based artifact removal

KurThr = 2
Corr   = 0.8
ICAMed = 'extended-infomax'   # other options: picard, fastICA

Subj.correct(method='ICA',winsize=128,hopesize=None,Corr=Corr,KurThr=KurThr,
             ICAMed=ICAMed,verbose=0, window=['hamming',True],
             winMeth='custom')

#check all the parameters here
help(ph.Subject.correct)

4. Extract X, y for a task (Rhythmic Features)

4.1 Extracting Features Segment-wise

import numpy as np

# Task 4: LWR classification
X_train, y_train, X_test, y_test = Subj.getXy_eeg(task=4)

print('DataShape: ', X_train.shape, y_train.shape, X_test.shape, y_test.shape)
print('\nClass labels :', np.unique(y_train))

# Task 1: Attention Score Prediction

X_train,y_train, X_test,y_test = Subj.getXy_eeg(task=1)

print('DataShape: ',X_train.shape,y_train.shape,X_test.shape, y_test.shape)
print('\nlabels :',np.unique(y_train))


# Task 2: Noise Level Prediction
X_train,y_train, X_test,y_test = Subj.getXy_eeg(task=2)

print('DataShape: ',X_train.shape,y_train.shape,X_test.shape, y_test.shape)
print('\nlabels :',np.unique(y_train))


# Task 3: Semanticity Classification
X_train,y_train, X_test,y_test = Subj.getXy_eeg(task=3)

print('DataShape: ',X_train.shape,y_train.shape,X_test.shape, y_test.shape)
print('\nClass labels :',np.unique(y_train))

# If features have already been extracted for task 1, 2 or 3 (listening segments),
# they won't be recomputed the next time, unless redo=True
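
For example, after changing the filtering or artifact-removal settings, recomputation of the listening-segment features can be forced with redo=True (a minimal sketch using the parameters shown in this section):

# force the features to be recomputed (e.g., after changing the preprocessing)
X_train, y_train, X_test, y_test = Subj.getXy_eeg(task=1, redo=True)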

4.2 Extracting Features Window-wise

winsize = 128   # 1 sec window
hopesize = 32   # 0.25 sec shift for the next window; if None, the overlap is half of the window size

X_train,y_train, X_test,y_test = Subj.getXy_eeg(task=1, features='rhythmic',
                            winsize=winsize, hopesize=hopesize)

print('DataShape: ',X_train.shape,y_train.shape,X_test.shape, y_test.shape)
print('\nClass labels :',np.unique(y_train))

4.3 Random split for train-test


X_train,y_train, X_test,y_test = Subj.getXy_eeg(task=1, features='rhythmic',
                           winsize=winsize, hopesize=hopesize,split='random')

print('DataShape: ',X_train.shape,y_train.shape,X_test.shape, y_test.shape)
print('\nClass labels :',np.unique(y_train))

4.4 Hyperparameters of the feature extraction method


X_train,y_train, X_test,y_test = Subj.getXy_eeg(task=1, features='rhythmic', eSample=[0, 0],
               verbose=1, redo=False, split='serial', splitAt=100, normalize=False,
               log10p1=True, flat=True, filter_order=5, method='welch', window='hann',
               scaling='density', detrend='constant', period_average='mean',
               winsize=-1, hopesize=None)

# Check help
help(ph.Subject.getXy_eeg)
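
For example, to keep the same task but normalize the features and skip the log10(x+1) transform, only the relevant arguments need to be changed (a sketch using parameters from the call above):

# rhythmic features for task 1, normalized and without the log10(x+1) transform
X_train, y_train, X_test, y_test = Subj.getXy_eeg(task=1, features='rhythmic',
                                                  normalize=True, log10p1=False)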

4.5 Extracting EEG Features with custom frequency bands

fBands = [[None, 8], [8, 24], [24, 32]]   # three custom bands: up to 8 Hz, 8-24 Hz, 24-32 Hz

X_train,y_train, X_test,y_test = Subj.getXy_eeg(task=1, redo=True,normalize=False, log10p1=True,
                               flat=False, filter_order=5, filter_method='SOS', method='welch', window='hann',
                               scaling='density', detrend='constant', period_average='mean',
                               fBands=fBands, Sum=True, Mean=False, SD=False,verbose=0,
                               useRaw=False,redo_warn=True,use_v0=False)

# Check help
help(ph.Subject.getXy_eeg)

5. Predictive Modeling

Once you have X_train, y_train, X_test, y_test, it is easy to apply any ML or DL model for training and testing. Here is a simple example with an SVM. For more details on other models, check here - Predictive Modeling Examples.

from sklearn import svm

# Normalization - SVM works well with normalized features
means = X_train.mean(0)
std   = X_train.std(0)
X_train = (X_train - means)/std
X_test  = (X_test - means)/std
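
Equivalently, the same z-score normalization can be done with scikit-learn's StandardScaler (a minimal sketch; scikit-learn is assumed to be installed, as it is also needed for the SVM below):

from sklearn.preprocessing import StandardScaler

# fit the scaler on the training features only, then apply it to both sets
scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test  = scaler.transform(X_test)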


# Training
clf = svm.SVC(kernel='rbf', C=1,gamma='auto')
clf.fit(X_train,y_train)

# Prediction
ytp = clf.predict(X_train)
ysp = clf.predict(X_test)

# Evaluation

print('Training Accuracy:',np.mean(y_train==ytp))
print('Testing  Accuracy:',np.mean(y_test==ysp))
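
For classification tasks (e.g., LWR, noise level, or semanticity), a more detailed evaluation can be obtained with scikit-learn's metrics (a minimal sketch):

from sklearn.metrics import classification_report, confusion_matrix

# confusion matrix and per-class precision, recall and F1 on the test set
print(confusion_matrix(y_test, ysp))
print(classification_report(y_test, ysp))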

6. Extracting LWR segments for external processing

L,W,R, Scores, Cols = Subj.getLWR()

Check here - code for extracting the signals and processing them with external libraries.
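
The exact structure of the returned objects depends on the library version; as a purely illustrative sketch, assuming L, W and R are lists of per-segment arrays whose columns are described by Cols:

import numpy as np

# hypothetical inspection of the returned segments (structure assumed, not guaranteed)
print('Number of segments  L:', len(L), ' W:', len(W), ' R:', len(R))
print('Columns :', Cols)
print('First listening segment shape:', np.asarray(L[0]).shape)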