Compare commits


16 Commits

4 changed files with 63 additions and 20 deletions

README.md

@@ -1,20 +1,36 @@
-## Semesterproject of the lecture "Semesterproject Signal processing and Analysis of human brain potentials (eeg) WS 2020/21
-This repository holds the code of the semesterproject as well as the report.
-The main files are 'preprocessing_and_cleaning.py', 'erp_analysis.py' and 'decoding_tf_analyis.py'.
-The files hold:
-- preprocessing_and_cleaning.py : Holds the pre-processing pipeline of the project. By executing the file all subjects are pre-processed. Subjects 001, 003, 014 are pre-processed with manually selected pre-processing information, all other subjects are pre-processed with the given pre-processing information. Pre-processed cleaned data is saved in the BIDS file structure as 'sub-XXX_task-N170_cleaned.fif' where XXX is the subject number.
-- erp_analysis.py : Holds the code for the erp-analysis. Computes the peak-differences and t-tests for several experimental contrasts. Details can be found in the comments of the code.
-- decoding_tf_analysis.py : Holds the code for the decoding and time-frequency analysis. Details can be found in the comments of the code.
-The folder 'utils' holds helper functions for some plots needed for the analysis and to load data, generate strings etc. and holds the code given in the lecture.
-The folder 'test' holds mostly unittests that test helper functions and one function which visually checks if N170 peaks are extracted correctly.
-For the code to work properly, the N170 dataset needs to be provided.
-When first running the analysis, it may take a while. After running it one time the data is cached, so that it can be reused if the analysis should be executed again. Be careful though, as a parameter has to be explicitly set in the code, so that the already computed data is used. This parameter is a boolean given to each analysis function which caches data.
-This code was created using Python 3.7 and the following libraries:
+## Semesterproject of the lecture "Semesterproject Signal processing and Analysis of human brain potentials (eeg)" WS 2020/21
+This repository holds the code of the semesterproject as well as the report, created by Julius Voggesberger.
+As the dataset for the project, the N170 dataset was chosen.
+Subjects 001, 003 and 014 were chosen as the three subjects to be pre-processed manually.
+The remaining subjects were pre-processed with the provided pre-processing information.
+Details can be found in the comments of the code.
+### Structure
+```
+├── Dataset: The dataset of the project as well as the manually selected bad segments are stored here.
+| ├── n170: Store the dataset here.
+| └── preprocessed: Bad segments are stored here.
+├── cached_data: Data that is generated in the analysis part is stored here.
+| ├── decoding_data: Results of the classifiers.
+| ├── erp_peaks: ERP peaks needed for the ERP analysis.
+| └── tf_data: Time-frequency data needed for the tf-analysis.
+├── test: Contains unittests and one visual check.
+├── utils: Contains helper methods.
+| ├── ccs_eeg_semesterproject: Methods given in the lecture.
+| ├── ccs_eeg_utils_reduced: Method for reading in BIDS provided in the lecture.
+| ├── file_utils.py: Methods for reading in files and getting epochs.
+| └── plot_utils.py: Methods for manually created plots.
+├── preprocessing_and_cleaning.py: The preprocessing pipeline.
+├── erp_analysis.py: The ERP analysis and computation of ERP peaks.
+├── decoding_tf_analysis.py: Decoding and time-frequency analysis.
+└── semesterproject_report_voggesberger: The report of the project.
+```
+### Running the project
+To run the project, Python 3.7 is required and Anaconda is recommended.\
+To ensure reproducibility, random states were used for non-deterministic methods.
+The random states used are either '123' or '1234'.\
+The following libraries are needed:
 - Matplotlib 3.3.3
 - MNE 0.22.0
 - MNE-Bids 0.6
@@ -22,3 +38,18 @@ This code was created using Python 3.7 and the following libraries:
 - Scikit-Learn 0.23.2
 - Pandas 1.2.0
 - Scipy 1.5.4
+For the code to work, the N170 dataset needs to be provided and put into the folder 'Dataset/n170/', so that the file structure 'Dataset/n170/sub-001', etc. exists.
+The pre-processed raw objects are saved in their respective subject folders in 'Dataset/n170/'.
+When first running the analysis, it may take a while.
+After running it once, the data is cached so that it can be reused if the analysis is executed again later.
+For the cached data to be used, a boolean parameter has to be set in the respective analysis method.
+If PyCharm is used as the IDE, it may be necessary to mark the parent directory 'semesterproject_lecture_eeg' as 'Sources Root' for the project.
+### Parameters
+Parameters have to be changed manually in the code if different settings are to be tried.
+### Visualisation
+The visualisation methods used to generate the manually created plots in the report are contained in the code.
+Visualisations created directly with an MNE plotting method may or may not appear in the code.
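
The cache flag the new README describes is the trailing boolean of each analysis entry point. A minimal sketch of a warm re-run, assuming the project's decoding_tf_analysis module is importable from the repository root; the call itself is copied from the diff below, where the docstring states that compute_tfr=True computes the TFRs and False loads them from a precomputed file:

```
from decoding_tf_analysis import time_frequency

ds = 'N170'
# First run: compute the TFRs and cache them (per the README, under 'cached_data/').
time_frequency(ds, 'face_intact_vs_all_0.1_50hz_ncf2', 'log', True)
# Later runs: set the flag to False to reuse the cached TFRs instead of recomputing.
time_frequency(ds, 'face_intact_vs_all_0.1_50hz_ncf2', 'log', False)
```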

decoding_tf_analysis.py

@@ -183,7 +183,7 @@ def create_tfr(raw, condition, freqs, n_cycles, response='induced', baseline=None):
     return power

-def time_frequency(dataset, filename, compute_tfr=True):
+def time_frequency(dataset, filename, scaling='lin', compute_tfr=True):
     """
     Runs time frequency analysis
@@ -191,10 +191,13 @@ def time_frequency(dataset, filename, compute_tfr=True):
     :param filename: Filename of either the file from which the TFRs will be loaded
                      or to which they will be saved
     :param compute_tfr: If True the TFRs will be created, else the TFRs will be loaded from a precomputed file
+    :param scaling: default 'lin' for linear scaling, else can be 'log' for logarithmic scaling
     """
     # Parameters
-    # freqs = np.linspace(0.1, 50, num=50)  # Use this for linear space scaling
-    freqs = np.logspace(*np.log10([0.1, 50]), num=50)
+    if scaling == 'lin':
+        freqs = np.linspace(0.1, 50, num=50)  # Use this for linear space scaling
+    else:
+        freqs = np.logspace(*np.log10([0.1, 50]), num=50)
     n_cycles = freqs / 2
     cond1 = []
     cond2 = []
@@ -245,11 +248,12 @@ def time_frequency(dataset, filename, compute_tfr=True):
     F, clusters, cluster_p_values, h0 = mne.stats.permutation_cluster_test(
         [mne.grand_average(cond1).data, mne.grand_average(cond2).data], n_jobs=4, verbose='INFO',
         seed=123)
-    plot_tf_cluster(F, clusters, cluster_p_values, freqs, times)
+    plot_tf_cluster(F, clusters, cluster_p_values, freqs, times, scaling)

 if __name__ == '__main__':
     mne.set_log_level(verbose=VERBOSE_LEVEL)
     ds = 'N170'
     decoding(ds, 'faces_vs_cars', True)
-    time_frequency(ds, 'face_intact_vs_all_0.1_50hz_ncf2', True)
+    time_frequency(ds, 'face_intact_vs_all_0.1_50hz_ncf2', 'log', True)
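
For illustration, a standalone sketch of the two frequency grids the new scaling branch chooses between, with the values copied from the diff above; the printed numbers are simply what NumPy returns for these calls:

```
import numpy as np

# Linear grid: 50 evenly spaced frequencies from 0.1 to 50 Hz (~1.02 Hz apart).
lin_freqs = np.linspace(0.1, 50, num=50)
# Log grid: 50 frequencies from 0.1 to 50 Hz, dense at the low end.
log_freqs = np.logspace(*np.log10([0.1, 50]), num=50)

print(lin_freqs[:3])  # ~[0.1, 1.118, 2.137]
print(log_freqs[:3])  # ~[0.1, 0.1135, 0.1289]
```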

Binary file not shown.

utils/plot_utils.py

@@ -56,7 +56,7 @@ def plot_grand_average(dataset):
                        linestyles=['solid', 'solid', 'dotted', 'dotted'])

-def plot_tf_cluster(F, clusters, cluster_p_values, freqs, times):
+def plot_tf_cluster(F, clusters, cluster_p_values, freqs, times, scaling='lin'):
     """
     Plot the F-Statistic values of permutation clusters with p-values <= 0.05 in color and > 0.05 in grey.
     Currently only works well for the linear scaling. For the logarithmic scaling a different x-axis has to be chosen
@@ -66,6 +66,7 @@ def plot_tf_cluster(F, clusters, cluster_p_values, freqs, times):
     :param cluster_p_values: p-values of the clusters
     :param freqs: frequency domain
     :param times: time domain
+    :param scaling: default 'lin' for linear scaling, else can be 'log' for logarithmic scaling
     """
     good_c = np.nan * np.ones_like(F)
     for clu, p_val in zip(clusters, cluster_p_values):
@@ -75,12 +76,19 @@ def plot_tf_cluster(F, clusters, cluster_p_values, freqs, times):
     bbox = [times[0], times[-1], freqs[0], freqs[-1]]
     plt.imshow(F, aspect='auto', origin='lower', cmap=cm.gray, extent=bbox, interpolation='None')
     a = plt.imshow(good_c, cmap=cm.RdBu_r, aspect='auto', origin='lower', extent=bbox, interpolation='None')
+    if scaling == 'log':
+        ticks = [1, 4, 8, 12, 14, 18, 22, 26, 30, 34, 38, 42, 46, 50]
+        labels = [round(freqs[i], 2) for i in range(len(freqs)) if i + 1 in ticks]
+        plt.yticks(ticks, labels)
     plt.colorbar(a)
     plt.xlabel('Time (s)')
     plt.ylabel('Frequency (Hz)')
     plt.show()

 def plot_oscillation_bands(condition):
     """
     Plot the oscillation bands for a given condition in the time from 130ms to 200ms
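
Because extent=bbox stretches the 50 log-spaced frequency rows uniformly over a linear y-axis, the default tick labels would be wrong for scaling='log'; the added branch pins fixed tick positions and relabels them with the frequencies of the underlying image rows. A standalone sketch of that index-to-label mapping, using the values from the diff:

```
import numpy as np

freqs = np.logspace(*np.log10([0.1, 50]), num=50)
ticks = [1, 4, 8, 12, 14, 18, 22, 26, 30, 34, 38, 42, 46, 50]

# Tick position t on the linear axis sits roughly at image row t - 1,
# so it gets relabelled with that row's actual (log-spaced) frequency.
labels = [round(freqs[i], 2) for i in range(len(freqs)) if i + 1 in ticks]
for t, lab in zip(ticks, labels):
    print(f'y={t:2d} -> {lab} Hz')  # endpoints: y= 1 -> 0.1 Hz, y=50 -> 50.0 Hz
```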