Compare commits: 770576fb6c...master
24 commits
| Author | SHA1 | Date |
|---|---|---|
| | be4489dc73 | |
| | b621dbf9b0 | |
| | f554a5822f | |
| | dd2df8f1a8 | |
| | fc8cde3a88 | |
| | 5f5e7ffcac | |
| | d09480c5bb | |
| | 6161e088c6 | |
| | 02f604cdeb | |
| | 0d513bfb96 | |
| | 6c1a555179 | |
| | a5b97a3a65 | |
| | 7f616a4a34 | |
| | 106a3ad434 | |
| | 26dd78a410 | |
| | cd20b9b776 | |
| | c6a6fd7012 | |
| | cfa950b85b | |
| | 86044cc7c7 | |
| | 4875cf29cb | |
| | c44cb7cb13 | |
| | 6f8c1af6c6 | |
| | bb3406bd88 | |
| | 1629441d5f | |
.gitignore vendored (2 changes)
```diff
@@ -2,5 +2,5 @@
 /Dataset/n170
 
 # Ignore unnecessary files
-utils/__pycache__
+/utils/__pycache__
 .idea/
```
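Worth noting: both the old and the new pattern contain a slash, so Git already treated `utils/__pycache__` as relative to the repository root; the added leading slash in `/utils/__pycache__` only makes that anchoring explicit.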
Binary file not shown. (×3)
README.md (60 changes)
````diff
@@ -1,21 +1,36 @@
-## Semesterproject of the lecture "Semesterproject Signal processing and Analysis
-of human brain potentials (eeg) WS 2020/21
+## Semesterproject of the lecture "Semesterproject Signal processing and Analysis of human brain potentials (eeg)" WS 2020/21
 
-This repository holds the code of the semesterproject as well as the report.
-The main files are 'preprocessing_and_cleaning.py', 'erp_analysis.py' and 'decoding_tf_analyis.py'.
-The files hold:
-- preprocessing_and_cleaning.py : Holds the pre-processing pipeline of the project. By executing the file all subjects are pre-processed. Subjects 001, 003, 014 are pre-processed with manually selected pre-processing information, all other subjects are pre-processed with the given pre-processing information. Pre-processed cleaned data is saved in the BIDS file structure as 'sub-XXX_task-N170_cleaned.fif' where XXX is the subject number.
-Details can be found in the comments of the code.
-- erp_analysis.py : Holds the code for the erp-analysis. Computes the peak-differences and t-tests for several experimental contrasts. Details can be found in the comments of the code.
-- decoding_tf_analysis.py : Holds the code for the decoding and time-frequency analysis. Details can be found in the comments of the code.
+This repository holds the code of the semesterproject as well as the report, created by Julius Voggesberger.
+As the dataset for the project, the N170-dataset was chosen.
+As the three subjects, to be manually pre-processed, the subjects 001, 003 and 014 were chosen.
+The rest of the subjects were pre-processed with provided pre-processing information.
+
+The folder 'utils' holds helper functions for some plots needed for the analysis and to load data, generate strings etc. and holds the code given in the lecture.
+The folder 'test' holds mostly unittests that test helper functions and one function which visually checks if N170 peaks are extracted correctly.
+### Structure
+```
+├── Dataset: The dataset of the project as well as the manually selected bad segments are stored here.
+|   ├── n170: Store the dataset here.
+|   └── preprocessed: Bad segments are stored here.
+├── cached_data: Data that is generated in the analysis part is stored here.
+|   ├── decoding_data: Results of the classifiers.
+|   ├── erp_peaks: ERP peaks needed for the ERP analysis.
+|   └── tf_data: Time-frequency data needed for the tf-analysis.
+├── test: Contains unittests and one visual check.
+├── utils: Contains helper methods
+|   ├── ccs_eeg_semesterproject: Methods given in the lecture.
+|   ├── ccs_eeg_utils_reduced: Method for reading in BIDS provided in the lecture.
+|   ├── file_utils.py: Methods for reading in files and getting epochs.
+|   └── plot_utils.py: Methods for manually created plots.
+├── preprocessing_and_cleaning.py: The preprocessing pipeline.
+├── erp_analysis.py: The ERP-Analysis and computation of ERP peaks.
+├── decoding_tf_analysis.py: Decoding and time-frequency analysis.
+└── semesterproject_report_voggesberger: The report of the project.
+```
 
-For the code to work properly, the N170 dataset needs to be provided.
-When first running the analysis, it may take a while. After running it one time the data is cached, so that it can be reused if the analysis should be executed again. Be careful though, as a parameter has to be explicitly set in the code, so that the already computed data is used. This parameter is a boolean given to each analysis function which caches data.
-
-This code was created using Python 3.7 and the following libraries:
+### Running the project
+To run the project python 3.7 is required and anaconda recommended.\
+To ensure reproducability, randomstates were used for methods which are non-deterministic.
+The randomstates used are either '123' or '1234'.\
+The following libraries are needed:
 - Matplotlib 3.3.3
 - MNE 0.22.0
 - MNE-Bids 0.6
@@ -23,3 +38,18 @@ This code was created using Python 3.7 and the following libraries:
 - Scikit-Learn 0.23.2
 - Pandas 1.2.0
 - Scipy 1.5.4
+
+For the code to work, the N170 dataset needs to be provided and put into the folder 'Dataset/n170/', so that the file structure 'Dataset/n170/sub-001', etc. exists.
+The pre-processed raw objects are saved in their respective subject folder, in 'Dataset/n170/'.
+When first running the analysis, it may take a while.
+After running it one time the data is cached, so that it can be reused if the analysis should be executed again at a later time.
+For the cached data to be used, a boolean parameter has to be set in the respective analysis method.
+
+It may be necessary to set the parent directory 'semesterproject_lecture_eeg' as 'Sources Root' for the project, if pycharm is used as an IDE.
+
+### Parameters
+Parameters have to be changed manually in the code, if different settings want to be tried.
+
+### Visualisation
+The visualisation methods that were used to generate the visualisations in the report, are contained in the code, if they were created manually.
+If a visualisation method from mne was used to create the visualisation, it may exist in the code or not.
````
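The caching behaviour the README describes — compute once, then flip a boolean so later runs reuse the cached results — is a plain compute-or-load pattern. A minimal sketch with numpy; the function and path names here are illustrative, not taken from the repository:

```python
import numpy as np

def run_analysis(filename, compute=True):
    """Compute results on the first run; reload the cached copy afterwards."""
    if compute:
        results = expensive_analysis()  # placeholder for the real analysis step
        np.save('cached_data/' + filename, results)
    else:
        results = np.load('cached_data/' + filename + '.npy', allow_pickle=True)
    return results
```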
```diff
@@ -46,6 +46,7 @@ def events_to_labels(evts, events_dict, mask=None): # TODO Test schreiben
 def permutation_test(baseline, score, n_iter):
     """
     An implementation of a permutation test for classification scores.
+
     :param baseline: The classification scores of the baseline, i.e. selection by chance
     :param score: The classification scores which are tested for significance
     :param n_iter: number of permutations
```
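The hunk above only touches the docstring; the repository's own implementation is not shown in this diff. For readers unfamiliar with the technique, a minimal sketch of a permutation test over classification scores, with every detail here assumed rather than taken from the project:

```python
import numpy as np

def permutation_test_sketch(baseline, score, n_iter, seed=123):
    """Estimate a one-sided p-value for score exceeding the baseline."""
    rng = np.random.default_rng(seed)
    observed = score.mean() - baseline.mean()
    pooled = np.concatenate([baseline, score])  # null hypothesis: one common pool
    count = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        # Reassign the pooled scores to two groups of the original sizes
        perm_diff = pooled[len(baseline):].mean() - pooled[:len(baseline)].mean()
        if perm_diff >= observed:
            count += 1
    # Add-one correction so the p-value is never exactly zero
    return (count + 1) / (n_iter + 1)
```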
```diff
@@ -120,6 +121,8 @@ def decoding(dataset, filename, compute_metric=True, mask=None):
+    # Compute index of time point 0
     index = math.floor((len(metric[0]) / time_scale) * 100)
     baseline = np.array(metric[:index]).flatten()
 
+    # Plot the result
     plt.plot(np.linspace(-200, 1000, 1127), np.mean(metric, axis=0))
     plt.ylabel('Accuracy (%)')
     plt.xlabel('Time (ms)')
```
```diff
@@ -129,11 +132,12 @@ def decoding(dataset, filename, compute_metric=True, mask=None):
     # Compute the permutation tests
     for t in range(len(metric[0][index:])):
         score_t = np.asarray(metric[:, t + index])
-        p = permutation_test(baseline, score_t, 100)
+        p = permutation_test(baseline, score_t, 1000)
         p_values.append(p)
         if t % 50 == 0:
             print(str(t) + " Out of " + str(len(metric[0][index:])))
 
+    # Plot the result
     plt.plot(times[index:], p_values)
     plt.ylabel('P-Value')
     plt.xlabel('Time (ms)')
```
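The jump from 100 to 1000 permutations is not cosmetic: a permutation test cannot report a p-value below the resolution set by the number of permutations. With the add-one convention the smallest attainable p-value is 1/(n_iter + 1), i.e. about 0.0099 for 100 permutations but about 0.001 for 1000, which matters once p-values near conventional thresholds are compared or corrected across many time points.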
```diff
@@ -143,14 +147,17 @@ def decoding(dataset, filename, compute_metric=True, mask=None):
 
 def create_tfr(raw, condition, freqs, n_cycles, response='induced', baseline=None, plot=False):
     """
-    Compute the time frequency representation (TFR) of data for a given condition via morlet wavelets
+    Compute the time frequency representation (TFR) of data for a given condition via Morlet wavelets
 
     :param raw: the data
-    :param condition: the condition for which to compute the TFR. Given as a list of tuples of the form (stimulus, condition) # TODO ambiguous use of condition
+    :param condition: the condition for which to compute the TFR. Given as a list of tuples of the form (stimulus, texture)
     :param freqs: the frequencies for which to compute the TFR
-    :param n_cycles: the number of cycles used by the morlet wavelets
-    :param response: type of expected TFR. Can be total, induced or evoked. Default is induced
+    :param n_cycles: the number of cycles used by the Morlet wavelets
+    :param response: type of expected TFR. Can be total, induced or evoked. Default is induced,
+                     the others were not used for the report, only for exploration
     :param baseline: baseline used to correct the power. A tuple of the form (start, end).
-                     Default is None and no baseline correction will be applid
+                     Default is None and no baseline correction will be applied
     :param plot: True if results should be plotted, else false.
     :return: The TFR or the given data for a given condition. Has type AverageTFR
     """
     epochs, _ = get_epochs(raw, condition, tmin=-0.2, tmax=1)
```
```diff
@@ -168,6 +175,7 @@ def create_tfr(raw, condition, freqs, n_cycles, response='induced', baseline=Non
         power_induced = tfr_morlet(epochs.subtract_evoked(), freqs=freqs, n_cycles=n_cycles, return_itc=False, n_jobs=4)
         power = mne.combine_evoked([power_total, power_induced], weights=[1, -1])
     if plot: power.plot(picks='P7')
+    # Apply a baseline correction to the power data
     power.apply_baseline(mode='ratio', baseline=baseline)
     if plot:
         plot_oscillation_bands(power)
```
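The arithmetic in this hunk follows the usual decomposition total = evoked + induced: the TFR of the raw epochs gives the total power, the TFR of the epochs with the evoked response subtracted gives the induced power, and combining the two with weights [1, -1] isolates the evoked part. A self-contained sketch of the same idea, mirroring the hunk above (`epochs`, `freqs` and `n_cycles` are assumed inputs):

```python
import mne
from mne.time_frequency import tfr_morlet

# Assumed inputs: `epochs` (mne.Epochs), `freqs` and `n_cycles` as defined in the diff
power_total = tfr_morlet(epochs, freqs=freqs, n_cycles=n_cycles, return_itc=False)
power_induced = tfr_morlet(epochs.copy().subtract_evoked(), freqs=freqs,
                           n_cycles=n_cycles, return_itc=False)
# total = evoked + induced, so weights [1, -1] leave the evoked contribution
power_evoked = mne.combine_evoked([power_total, power_induced], weights=[1, -1])
```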
```diff
@@ -175,7 +183,7 @@ def create_tfr(raw, condition, freqs, n_cycles, response='induced', baseline=Non
     return power
 
 
-def time_frequency(dataset, filename, compute_tfr=True):
+def time_frequency(dataset, filename, scaling='lin', compute_tfr=True):
     """
     Runs time frequency analysis
 
```
```diff
@@ -183,15 +191,14 @@ def time_frequency(dataset, filename, compute_tfr=True):
     :param filename: Filename of either the file from which the TFRs will be loaded
                      or to which they will be saved
     :param compute_tfr: If True the TFRs will be created, else the TFRs will be loaded from a precomputed file
+    :param scaling: default 'lin' for linear scaling, else can be 'log' for logarithmic scaling
     """
     # Parameters
-    # Frequency space (from, to, steps) -> Control frequency resolution : Between num=50-80 good for 1-50Hz
-    # freqs = np.linspace(0.1, 50, num=50)  # Use this for linear space scaling
-    freqs = np.logspace(*np.log10([0.1, 50]), num=50)
-    # Number of cycles -> Controls time resolution ? At ~freqs/2 good for high frequency resolution
-    n_cycles = freqs / 2  # 1 for high time resolution & freq smoothing, freqs/2 for high freq resolution & time smooth
-    # Baseline -> Should not go post-stimulus, i.e. > 0 -> Best ist pre-stimulus (e.g. -400 to -200ms)
-    baseline = [-0.5, 0]
+    if scaling == 'lin':
+        freqs = np.linspace(0.1, 50, num=50)  # Use this for linear space scaling
+    else:
+        freqs = np.logspace(*np.log10([0.1, 50]), num=50)
+    n_cycles = freqs / 2
     cond1 = []
     cond2 = []
     times = None
```
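The new `scaling` switch trades even coverage for low-frequency detail: a linear grid spends equal resolution everywhere, while the log grid packs most of its 50 points below 10 Hz. A quick comparison of the two grids from the diff (printed values approximate):

```python
import numpy as np

lin = np.linspace(0.1, 50, num=50)
log = np.logspace(*np.log10([0.1, 50]), num=50)
print(lin[:3])  # [0.1, 1.12, 2.14] -> constant steps of ~1.02 Hz
print(log[:3])  # [0.1, 0.11, 0.13] -> each step multiplies by ~1.14
```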
```diff
@@ -209,6 +216,8 @@ def time_frequency(dataset, filename, compute_tfr=True):
         raw.set_montage('standard_1020', match_case=False)
 
         # Create the two conditions we want to compare
+        # IMPORTANT: If different conditions should be compared you have to change them here, by altering the second
+        # argument passed to create_tfr
         power_cond1 = create_tfr(raw, [('face', 'intact')], freqs, n_cycles, 'induced', (-0.2, 0))
         print(' CONDITION 1 LOADED')
         cond1.append(power_cond1)
```
```diff
@@ -219,25 +228,32 @@ def time_frequency(dataset, filename, compute_tfr=True):
             cond2.append(power_cond2)
             print(' DONE')
 
+        # Save the data so we can access the results more easily
         np.save('cached_data/tf_data/' + filename + '_cond1', cond1)
         np.save('cached_data/tf_data/' + filename + '_cond2', cond2)
     else:
+        # If the data should not be recomputed, load the given filename
         cond1 = np.load('cached_data/tf_data/' + filename + '_cond1.npy', allow_pickle=True).tolist()
         cond2 = np.load('cached_data/tf_data/' + filename + '_cond2.npy', allow_pickle=True).tolist()
         if times is None:
             times = cond1[0].times
 
     # Some plots
+    mne.grand_average(cond1).plot(picks=['P7'], vmin=-3, vmax=3, title='Grand Average P7')
+    mne.grand_average(cond2).plot(picks=['P7'], vmin=-3, vmax=3, title='Grand Average P7')
     plot_oscillation_bands(mne.grand_average(cond1))
     plot_oscillation_bands(mne.grand_average(cond2))
 
     # Compute the cluster permutation
     F, clusters, cluster_p_values, h0 = mne.stats.permutation_cluster_test(
         [mne.grand_average(cond1).data, mne.grand_average(cond2).data], n_jobs=4, verbose='INFO',
         seed=123)
-    plot_tf_cluster(F, clusters, cluster_p_values, freqs, times)
+    plot_tf_cluster(F, clusters, cluster_p_values, freqs, times, scaling)
 
 
 if __name__ == '__main__':
     mne.set_log_level(verbose=VERBOSE_LEVEL)
     ds = 'N170'
     decoding(ds, 'faces_vs_cars', True)
-    time_frequency(ds, 'face_intact_vs_all_0.1_50hz_ncf2', True)
+    time_frequency(ds, 'face_intact_vs_all_0.1_50hz_ncf2', 'log', True)
```
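One detail worth noting in the updated `__main__` block: because `scaling` was inserted as the third positional parameter of `time_frequency`, the call now passes `'log'` as the scaling and `True` as `compute_tfr`. Keyword arguments would make that explicit; a restatement of the call from the diff:

```python
# Positional form, as in the diff above
time_frequency(ds, 'face_intact_vs_all_0.1_50hz_ncf2', 'log', True)
# Equivalent keyword form
time_frequency(ds, 'face_intact_vs_all_0.1_50hz_ncf2', scaling='log', compute_tfr=True)
```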
|
||||
@@ -93,16 +93,18 @@ def create_peak_difference_feature(df, max_subj=40):
|
||||
|
||||
def analyze_erp(channels, precompute=True):
|
||||
"""
|
||||
Execute several statistical tests for different hypothesis, to analyze ERPs
|
||||
Execute several statistical tests for different hypothesis, to analyse ERPs
|
||||
:param channels: The channels for which the tests are executed
|
||||
:param precompute: If true, the peak-difference data will be computed. Else it will be loaded from a precomputed file,
|
||||
if it exists. This should only be set 'False' if the method was already executed once!
|
||||
"""
|
||||
if precompute:
|
||||
# Precompute the erp peaks
|
||||
precompute_erp_df('N170')
|
||||
|
||||
for c in channels:
|
||||
print("CHANNEL: " + c)
|
||||
# Load the erp peak data and create the features for the t-tests
|
||||
erp_df = pd.read_csv('cached_data/erp_peaks/erp_peaks_' + c + '.csv', index_col=0)
|
||||
feature_df = create_peak_difference_feature(erp_df)
|
||||
# 1. H_a : There is a difference between the N170 peak of recognizing faces and cars
|
||||
|
||||
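The hypothesis sketched in the trailing comment (a face-vs-car difference in the N170 peak) is the kind of contrast a one-sample t-test on per-subject peak differences can address. A minimal sketch with scipy; the numbers are made up for illustration and are not project results:

```python
import numpy as np
from scipy import stats

# Hypothetical per-subject N170 peak differences (faces - cars), in microvolts
peak_diff = np.array([-2.1, -1.4, -3.0, -0.8, -1.9, -2.5])
# H_a: the mean difference is non-zero
t_stat, p_value = stats.ttest_1samp(peak_diff, popmean=0)
print(t_stat, p_value)
```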
```diff
@@ -52,7 +52,7 @@ def filter_data(raw):
     """
     Filter the data of a single subject with a bandpass filter.
     The lower bound ist 0.5Hz to compensate the slow drifts.
-    The upper bound is 50Hz to compensate the high frequencies, including the power line spike at 60Hz
+    The upper bound is 48Hz to compensate the high frequencies, including the power line spike at 60Hz
     :param raw: The data to be filtered
     :return: The filtered data
    """
```
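For reference, the bandpass described in this docstring corresponds to a single MNE call; a minimal sketch assuming `raw` is an already-loaded `mne.io.Raw` object (the project's actual function body is not shown in this hunk):

```python
# 0.5 Hz high-pass against slow drifts, 48 Hz low-pass against line noise and high frequencies
raw_filtered = raw.copy().filter(l_freq=0.5, h_freq=48)
```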
BIN semesterproject_report_voggesberger.pdf (new file)
Binary file not shown. (×7)
```diff
@@ -18,7 +18,7 @@ def load_bad_annotations(filepath, fileending="badSegments.csv"):
 
 def load_preprocessed_data(subject, dataset):
     """
-    Load the raw object as well as the annotations of the preprocessed file
+    Load the raw object of the preprocessed file
     :param subject: The subject, for which we want to load the raw object
     :param dataset: The currently viewed dataset
     :param selected_subjects: The manually preprocessed subjects
```
```diff
@@ -56,15 +56,17 @@ def plot_grand_average(dataset):
                linestyles=['solid', 'solid', 'dotted', 'dotted'])
 
 
-def plot_tf_cluster(F, clusters, cluster_p_values, freqs, times):
+def plot_tf_cluster(F, clusters, cluster_p_values, freqs, times, scaling='lin'):
     """
-    Plot teh F-Statistic values of permutation clusters with p-values <= 0.05 in color and > 0.05 in grey.
+    Plot the F-Statistic values of permutation clusters with p-values <= 0.05 in color and > 0.05 in grey.
+    Currently only works well for the linear scaling. For the logarithmic scaling a different x-axis has to be chosen
 
     :param F: F-Statistics of the permutation clusters
     :param clusters: all permutation clusters
     :param cluster_p_values: p-values of the clusters
     :param freqs: frequency domain
     :param times: time domain
+    :param scaling: default 'lin' for linear scaling, else can be 'log' for logarithmic scaling
     """
     good_c = np.nan * np.ones_like(F)
     for clu, p_val in zip(clusters, cluster_p_values):
```
```diff
@@ -74,16 +76,33 @@ def plot_tf_cluster(F, clusters, cluster_p_values, freqs, times):
     bbox = [times[0], times[-1], freqs[0], freqs[-1]]
     plt.imshow(F, aspect='auto', origin='lower', cmap=cm.gray, extent=bbox, interpolation='None')
     a = plt.imshow(good_c, cmap=cm.RdBu_r, aspect='auto', origin='lower', extent=bbox, interpolation='None')
 
+    if scaling == 'log':
+        ticks = [1, 4, 8, 12, 14, 18, 22, 26, 30, 34, 38, 42, 46, 50]
+        labels = [round(freqs[i], 2) for i in range(len(freqs)) if i + 1 in ticks]
+        plt.yticks(ticks, labels)
+
     plt.colorbar(a)
     plt.xlabel('Time (s)')
     plt.ylabel('Frequency (Hz)')
     plt.show()
 
 
 def plot_oscillation_bands(condition):
     """
     Plot the oscillation bands for a given condition in the time from 130ms to 200ms
     :param condition: the condition to plot the oscillation bands for
     """
     fig, axis = plt.subplots(1, 5, figsize=(25, 5))
-    condition.plot_topomap(baseline=(-0.2, 0), fmin=0, fmax=4, title='Delta', axes=axis[0], show=False, vmin=0, vmax=1.5, tmin=0, tmax=1)
-    condition.plot_topomap(baseline=(-0.2, 0), fmin=4, fmax=8, title='Theta', axes=axis[1], show=False, vmin=0, vmax=0.7, tmin=0, tmax=1)
-    condition.plot_topomap(baseline=(-0.2, 0), fmin=8, fmax=12, title='Alpha', axes=axis[2], show=False, vmin=-0.15, vmax=0.2, tmin=0, tmax=1)
-    condition.plot_topomap(baseline=(-0.2, 0), fmin=13, fmax=30, title='Beta', axes=axis[3], show=False, vmin=-0.18, vmax=0.2, tmin=0, tmax=1)
-    condition.plot_topomap(baseline=(-0.2, 0), fmin=30, fmax=45, title='Gamma', axes=axis[4], vmin=0, vmax=0.2, tmin=0, tmax=1)
+    condition.plot_topomap(baseline=(-0.2, 0), fmin=0, fmax=4, title='Delta', axes=axis[0], show=False, vmin=0,
+                           vmax=1.5, tmin=0.13, tmax=0.2)
+    condition.plot_topomap(baseline=(-0.2, 0), fmin=4, fmax=8, title='Theta', axes=axis[1], show=False, vmin=0,
+                           vmax=0.7, tmin=0.13, tmax=0.2)
+    condition.plot_topomap(baseline=(-0.2, 0), fmin=8, fmax=12, title='Alpha', axes=axis[2], show=False, vmin=-0.25,
+                           vmax=0.2, tmin=0.13, tmax=0.2)
+    condition.plot_topomap(baseline=(-0.2, 0), fmin=13, fmax=30, title='Beta', axes=axis[3], show=False, vmin=-0.21,
+                           vmax=0.2, tmin=0.13, tmax=0.2)
+    condition.plot_topomap(baseline=(-0.2, 0), fmin=30, fmax=45, title='Gamma', axes=axis[4], vmin=-0.05, vmax=0.2,
+                           tmin=0.13,
+                           tmax=0.2)
```
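The added `yticks` block exists because `imshow` draws the 50 frequency rows at evenly spaced positions, so with a log-spaced grid the row position no longer maps linearly to Hz and the chosen rows have to be relabelled with the actual frequency values. A small check of what those labels work out to, using the log grid from the diff (printed values approximate):

```python
import numpy as np

freqs = np.logspace(*np.log10([0.1, 50]), num=50)
ticks = [1, 4, 8, 12, 14, 18, 22, 26, 30, 34, 38, 42, 46, 50]
labels = [round(freqs[i], 2) for i in range(len(freqs)) if i + 1 in ticks]
print(labels)  # starts ~[0.1, 0.15, 0.24, ...] and ends at 50.0
```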