sappelhoff's Blog

Second week of GSoC: Description of two exemplary work projects

sappelhoff
Published: 06/09/2019

In today's blog post I will describe two example projects that I worked on during the last week. Finally, I will describe how these two examples relate to my overall goal in this GSoC.

Conversion of MNE-somato-data

This week I spent some time converting a dataset to comply with the Brain Imaging Data Structure (BIDS): the MNE-somato-data, which is used in several code examples and tutorials in the MNE-Python documentation.

The Brain Imaging Data Structure is an emerging standard for organizing and structuring neuroimaging recordings such as MRI, EEG, MEG, or iEEG data. Such a standard is invaluable for sharing data, performing quality analysis, and building automated pipelines.
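To give a feeling for what "organized and structured" means in practice, here is a minimal sketch of how BIDS-style file paths are composed from key-value pairs such as `sub-01` or `task-somato`. The helper `bids_path` is a hypothetical function written for this illustration, not part of any library, and it only covers a handful of entities. The BIDS specification is the authoritative source for the full naming rules.

```python
from pathlib import PurePosixPath


def bids_path(subject, task, suffix, extension, datatype, session=None, run=None):
    """Build a BIDS-style relative path (hypothetical helper, illustration only)."""
    # Entities appear in a fixed order; optional ones are skipped when not given.
    entities = [("sub", subject), ("ses", session), ("task", task), ("run", run)]
    parts = [f"{key}-{value}" for key, value in entities if value is not None]
    filename = "_".join(parts + [suffix]) + extension

    # Files live in sub-<label>/[ses-<label>/]<datatype>/
    folder = PurePosixPath(f"sub-{subject}")
    if session is not None:
        folder = folder / f"ses-{session}"
    return folder / datatype / filename


path = bids_path(subject="01", task="somato", suffix="meg",
                 extension=".fif", datatype="meg")
print(path)  # sub-01/meg/sub-01_task-somato_meg.fif
```

Because every file name follows the same pattern, a pipeline can locate the data for any subject without dataset-specific code.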

Converting existing datasets to this new standard allows us to reap all of these benefits and build on them in the future.

However, the conversion is often not straightforward. In the particular case of the MNE-somato-data, I faced a severe lack of documentation. Thus, the conversion from an arbitrary data structure to the BIDS standard was slower than expected. Yet the somato dataset now has much better documentation, on top of being organized according to a sensible standard.


Autoreject documentation

The autoreject package is Python software to "clean" electrophysiology data such as EEG and MEG. It uses cross-validation to automatically find thresholds that can be used to reject or retain parts of the data. In addition, there is an algorithm to repair data that might otherwise be rejected (because it exceeds the cross-validated threshold).
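The following is a strongly simplified sketch of the cross-validation idea for a single channel. It is not autoreject's actual implementation. The function names, the simulated data, and the candidate thresholds are all assumptions made for this illustration. For each candidate peak-to-peak threshold, the trials that pass the threshold in the training folds are averaged and compared against the median of the held-out trials, and the threshold with the lowest error wins.

```python
import random
from statistics import mean, median


def peak_to_peak(trial):
    """Peak-to-peak amplitude of one trial (a list of samples)."""
    return max(trial) - min(trial)


def cv_error(trials, threshold, n_folds=5):
    """Average validation error for one candidate threshold."""
    n_times = len(trials[0])
    errors = []
    for fold in range(n_folds):
        val = [t for i, t in enumerate(trials) if i % n_folds == fold]
        train = [t for i, t in enumerate(trials) if i % n_folds != fold]
        # Keep only training trials that pass the candidate threshold.
        kept = [t for t in train if peak_to_peak(t) <= threshold]
        if not kept:
            return float("inf")  # threshold rejects everything: unusable
        train_mean = [mean(t[k] for t in kept) for k in range(n_times)]
        # The median is robust to artifacts left in the validation fold.
        val_median = [median(t[k] for t in val) for k in range(n_times)]
        squared = [(a - b) ** 2 for a, b in zip(train_mean, val_median)]
        errors.append((sum(squared) / n_times) ** 0.5)
    return mean(errors)


def find_threshold(trials, candidates, n_folds=5):
    """Pick the candidate threshold with the lowest cross-validated error."""
    return min(candidates, key=lambda thr: cv_error(trials, thr, n_folds))


# Simulated data (assumption): 40 trials of a small signal plus noise,
# five of which carry a large artifact at one sample.
rng = random.Random(0)
base = [0.0, 1.0, 0.0, -1.0] * 5
trials = [[b + rng.uniform(-0.1, 0.1) for b in base] for _ in range(40)]
artifact_trials = [3, 11, 17, 24, 32]
for i in artifact_trials:
    trials[i][10] += 50.0

candidates = [1.0, 2.0, 2.5, 5.0, 60.0]
best = find_threshold(trials, candidates)
rejected = [i for i, t in enumerate(trials) if peak_to_peak(t) > best]
print("chosen threshold:", best)
print("rejected trials:", rejected)
```

In this toy example, a threshold that is too low rejects every trial, and one that is too high lets the artifacts contaminate the average, so the cross-validated error favors a value in between.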

When using a software package such as autoreject, the documentation of its inner workings is almost as important as the functionality of the software itself. This is especially true when scientific data are analyzed by researchers, who are often not trained to read source code and work out the inner workings themselves.

The autoreject package has some documentation in the form of examples that show off the basic functionality. On top of that, there is a small FAQ section that addresses user needs beyond the basic functionality.

This week, I added a section that explains the algorithm in general terms, without direct reference to code. This intuitive explanation is meant as a starting point for approaching the more mathematical explanations in the associated scientific publication.

Throughout this process, I have tried to follow the guidelines on "good documentation", which is always split into four parts: "Tutorials", "How-to guides", "Explanation", and "Reference".

[Image: good documentation picture]

How does this relate to my overall project?

My overall project goal is to enable or enhance automatic processing of neurophysiology datasets organized using BIDS. The conversion of the MNE-somato-data to BIDS provides me with a test case for analysis pipelines. And, as is evident from its name, the autoreject package is a prime candidate for automatic processing of neurophysiology data. It is also a good idea to improve the documentation of software that you want other people to use.


First week of GSoC: Going down several rabbit holes

sappelhoff
Published: 06/02/2019

1. What did you do this week?

This week was characterized by many different smaller tasks, such as:

  1. improvements to documentation,
  2. fixing of bugs (typo-bugs),
  3. speeding up continuous integration through caching,
  4. opening issues to discuss potential APIs for analysis pipelines,
  5. ... and some more


To track my progress in GSoC, I have made a repository, github.com/sappelhoff/gsoc2019, where I host a changelog file that lists each issue/PR/task that I have worked on, divided by weeks and days.

For my overall project, the most important work was probably opening an issue to discuss potential APIs for analysis pipelines. I suggested a JSON-file centered approach, and Mainak (my mentor) pointed me to several existing solutions. In our chat on Gitter we later agreed to target the mne-study-template and improve it before attempting to program a new pipeline from scratch.
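To illustrate what a JSON-file centered approach could look like in general, here is a small sketch. It is not the API proposed in the issue and not how the mne-study-template works. The configuration keys, step names, and functions are all hypothetical, and the "processing" functions only return a description of what they would do.

```python
import json

# Hypothetical configuration; in practice this would live in a JSON file.
CONFIG_TEXT = """
{
  "subjects": ["01", "02"],
  "steps": [
    {"name": "filter", "params": {"l_freq": 1.0, "h_freq": 40.0}},
    {"name": "epoch", "params": {"tmin": -0.2, "tmax": 0.5}}
  ]
}
"""


def run_filter(subject, l_freq, h_freq):
    """Placeholder step: a real pipeline would filter the data here."""
    return f"sub-{subject}: filter {l_freq}-{h_freq} Hz"


def run_epoch(subject, tmin, tmax):
    """Placeholder step: a real pipeline would cut the data into epochs here."""
    return f"sub-{subject}: epoch from {tmin} to {tmax} s"


# Maps the step names used in the JSON file to Python functions.
STEP_REGISTRY = {"filter": run_filter, "epoch": run_epoch}


def run_pipeline(config):
    """Run every configured step for every subject, in the order given."""
    log = []
    for subject in config["subjects"]:
        for step in config["steps"]:
            func = STEP_REGISTRY[step["name"]]
            log.append(func(subject, **step["params"]))
    return log


config = json.loads(CONFIG_TEXT)
log = run_pipeline(config)
for line in log:
    print(line)
```

The appeal of such a design is that the whole analysis is described in one human-readable file, which can be shared alongside a BIDS dataset.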


2. What is coming up next?

Next week I will travel to Rome for the OHBM conference, where I will meet Mainak and Alex, who are mentoring me during this GSoC.


3. Did you get stuck anywhere?

I do not feel like I "got stuck" with anything in particular, but neither did I make a good "first step" with my project. As indicated in the title of this post, I always started to do something, and then got sidetracked by minor issues that I first wanted to fix. This ended up being a big time investment for each fix. It feels good to fix minor issues, but it should not distract me from the overall goal of the GSoC :-)


Google Summer of Code (GSoC) 2019: Analysis Pipelines and BIDS

sappelhoff
Published: 05/28/2019

Dear Python and GSoC Community,

This year I will take part in the Google Summer of Code (GSoC) to 
dedicate three months of coding towards improving neuro-data analysis 
with MNE-Python.

I am a PhD student in my second year, mostly working with EEG data in 
the domain of human decision making. In my free time I contribute to 
open source software and, more recently, I have become a maintainer for 
the "Brain Imaging Data Structure" (BIDS), an emerging standard for 
organizing neuroimaging data.

In my GSoC project with MNE-Python, I will be drawing on the "Brain Imaging 
Data Structure" to build automated, standardized analysis pipelines. 
Stay tuned! :-)

If you want to get in touch, feel free to reach out via
Gitter or GitHub (@sappelhoff).


Cheers,

Stefan