sappelhoff's Blog

Eighth week of GSoC: Mixed tasks and progress

Published: 07/21/2019

Two thirds of the GSoC program are already over - time is passing very quickly. This past week, we made progress on the mne-study-template and on making it usable with BIDS-formatted data.

Alex has improved the flow substantially, with Mainak serving as the "Continuous Integration service", regularly running different datasets through the pipeline and reporting where they get stuck. My own tasks were very diverse this week:

MNE-BIDS maintenance

I fixed several bugs in MNE-BIDS that we found while working on the study template.

Reviewing and user support

Furthermore, I was very happy to see many issues raised on MNE-BIDS. The issues showed that more and more people are picking up MNE-BIDS and using it in their data analysis pipelines. However, that also meant that in the last week, I did more user support and reviewing of pull requests than usual.

For example, I reviewed a nice pull request by Marijn (who is also an MNE-Python contributor). He improved MNE-BIDS' find_matching_sidecar function by introducing a "race for the best candidate" among the potential matching sidecar files.

Work on mne-study-template

Finally, I also worked on the mne-study-template myself; however, my contributions were rather modest. I mostly cleaned up the configuration files, formatted testing data, and unblocked workflows where they got stuck.

See for example here.

Next week, I want to work more on the mne-study-template.

View Blog Post

Seventh week of GSoC: Just a status report

Published: 07/14/2019

1. What did you do this week?

Main work:

  • I improved MNE-BIDS's "read_raw_bids" function in PR #219 by allowing it to automatically set channel types in a "raw" object by parsing an accompanying BIDS channels.tsv file
  • I worked on the MNE-STUDY-TEMPLATE, making it work for the first step "loading and filtering" for a different dataset and modality than it was intended for
  • I started a PR (#221) to expose the MNE-BIDS "copyfile functions" to the command line interface
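The core of the channel-type feature from PR #219 is parsing the channels.tsv file. A minimal sketch of just that parsing step (the function name is hypothetical; the real read_raw_bids additionally maps the BIDS type strings onto MNE-Python channel types and applies them to the "raw" object):

```python
import csv
import io

def read_channel_types(channels_tsv_text):
    """Return a mapping of channel name -> BIDS channel type,
    given the text content of a BIDS channels.tsv file."""
    reader = csv.DictReader(io.StringIO(channels_tsv_text), delimiter="\t")
    return {row["name"]: row["type"] for row in reader}
```

Applied to a two-channel channels.tsv, this would tell us that, say, `EOG061` should be typed as an EOG rather than an EEG channel.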

Other work:

  • User support in MNE-BIDS and MNE-Python via the issue trackers
  • Bugfix in MNE-BIDS (PR #217)
  • Made dev docs accessible for MNE-BIDS via the CircleCI API (PR #216)
  • ... and lots of other stuff. As usual, I keep my log on my GSoC repository

2. What is coming up next?

I will ...

  • finish the PR on improving the MNE-BIDS command line (including docs)
  • go back to the MNE-STUDY-TEMPLATE and try to make it work for the EEG data beyond the filtering
  • make a dedicated example for MNE-BIDS' read_raw_bids function
  • allow read_raw_bids to read the digitization files accompanying a raw data file

3. Did you get stuck anywhere?

As usual, there were many minor points where I got stuck. And as usual, I got lots of support from my mentoring team. This week, however, there was nothing serious :-)

Some examples:

  • Trying to kill the warnings MNE-BIDS currently throws when running the tests
  • Fighting with FreeSurfer and the MNE bindings to FreeSurfer

View Blog Post

Sixth week of GSoC: On taking breaks and abstaining from rewriting old code

Published: 07/07/2019

This past week I finally finished the big Pull Request on coordinate systems and writing T1 MRI data for BIDS. Right after the PR was done, I felt like taking a break rather than immediately starting to code again. Usually, that works quite well in my life as a PhD student: there is a large diversity of non-coding tasks, ranging from reading and writing to simply recording data (very practical work). As a student in the Google Summer of Code, however, I perceive much less diversity of tasks: there are lots of features to be implemented ... and as soon as one feature gets done, the next one should be tackled. This made me wonder:

  • How do other software developers take breaks (or when)?
  • Do they even feel like taking a break after finishing a certain feature?
  • Or is this an issue too individual to be generally answered?
  • ... or could it be that my perception of the different coding tasks was a bit too coarse last week and that "implementing features" is more diverse than it sounds?

Anyhow, I overcame my short period of lower motivation and started to work again on the mne-study-template, just to face the next challenge: Rather than iteratively improve the codebase, I felt the urge to completely rewrite it from scratch.

Some background: I did not design or implement the codebase so far, so everything is rather new to me. I quickly realized that the study template is very biased towards specific types of data to be processed ... and also very biased towards a specific structure that the data should be set up in. Since my job is to make the mne-study-template more general and have it rely on a data standard rather than an arbitrary data structure, a rewrite seemed most efficient to me. Fortunately, I remembered this quote by Joel Spolsky from his blog entry Things you should never do, Part 1:

"Programmers are, in their hearts, architects, and the first thing they want to do when they get to a site is to bulldoze the place flat and build something grand."

The takeaway of the post is that rewriting an existing codebase from scratch is rarely a good idea.

So next week, I'll dig further into the study template and start with iterative improvements.

View Blog Post

Fifth week of GSoC: Coordinate Systems and Transformations

Published: 06/30/2019

1. What did you do this week?

This week's primary work related to a task I started in my second week of GSoC, when I began converting MNE-Python's "somato" dataset to the BIDS standard so that I could use it as an easy test case for improving BIDS-MNE pipelines.

That work was halted because we realized that some files could not yet be saved according to the BIDS standard: the specification does not cover them (as of yet).

Thanks to an idea by Alex, this week was dedicated to implementing code that can quickly recalculate all such files, without having to save them; thus, we achieve full BIDS compatibility.

Let's have a concrete summary in bullet points:

  • When handling MEG data, we are often dealing with three different coordinate systems
    • One system to specify the head of the study participant
    • One system to specify the sensors of the MEG machine
    • One system to specify the MRI scan of the study participant's head
  • For source-space analyses, we need to align these coordinate systems in a process called coregistration
  • The coregistration is achieved through transformation matrices, which specify how points have to be rotated and translated to fit from one system into the other
  • In MNE-Python, these transformation matrices are called `trans` ... and currently, BIDS specifies no fixed way how/where to save these as files
  • In BIDS, however, we DO know how to save anatomical landmarks such as the nasion and the left and right preauricular points
  • Thus, we simply save all of the points in their respective coordinate systems, and then call a function that calculates the `trans` by fitting the points to each other
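The point-fitting step in the last bullet can be illustrated with the classic Kabsch algorithm, which estimates the best rigid rotation and translation between two matched point sets. This is a sketch with an invented function name, not the MNE-Python implementation (which has more checks, options, and optional scaling):

```python
import numpy as np

def fit_trans(src_pts, dest_pts):
    """Estimate the 4x4 rigid transform (rotation + translation) that
    maps src_pts onto dest_pts, via the Kabsch algorithm."""
    src = np.asarray(src_pts, float)
    dest = np.asarray(dest_pts, float)
    src_c = src - src.mean(axis=0)          # center both point clouds
    dest_c = dest - dest.mean(axis=0)
    u, _, vt = np.linalg.svd(src_c.T @ dest_c)
    d = np.sign(np.linalg.det(vt.T @ u.T))  # guard against reflections
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    trans = np.eye(4)                       # homogeneous transform matrix
    trans[:3, :3] = rot
    trans[:3, 3] = dest.mean(axis=0) - rot @ src.mean(axis=0)
    return trans
```

Feeding it the three anatomical landmarks in, say, head coordinates and MRI coordinates would yield a head-to-MRI `trans` without ever having to store the matrix itself in the dataset.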

This sounds more straightforward than it turned out to be, and I spent nearly the whole week wrapping my head around concepts and private functions, improving docs, and implementing the necessary code.

It's still not completely finished, but we are getting close.

2. What is coming up next?

Next week, I hope to finalize the work on coordinate systems and transformations. Then I will finally start to make a BIDS version of the mne-study-template, perhaps starting with the first steps instead of tackling all steps (including source localization) at the same time.

I realized that the mne-study-template is currently quite MEG-centric. I will see whether that makes me run into problems.

3. Did you get stuck anywhere?

I got stuck at several points while working out the coregistration, as documented in several posts I made.

Luckily, I received lots of helpful comments from Mainak, Eric, and Alex ... so I made some progress regardless of the challenges. :-)

View Blog Post

Fourth week of GSoC: Second half of "short GSoC pause" (due to summer school), and getting back to work - Issues with the BIDS-validator

Published: 06/24/2019

The past week I finished the summer school that I had been running from the 11th until the 19th of June. It was a success, judging by the reactions from all participants1. However, I was very happy when I could return to my GSoC project and coding last Thursday.

I started by adding a feature to MNE-BIDS: when reading a raw file into a Python object, we want to automatically scan an accompanying metadata file (channels.tsv) to populate the Python object with information about bad channels in the data. While implementing the feature, most of the problems I encountered were due to an interaction with the BIDS-validator, to which I want to dedicate today's blog post.
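The parsing half of that feature is simple: channels.tsv has an optional "status" column marking each channel as good or bad. A minimal sketch (the function name is hypothetical; the real feature then stores the result in the raw object's list of bad channels, raw.info['bads']):

```python
import csv
import io

def bads_from_channels_tsv(channels_tsv_text):
    """List the channels whose BIDS 'status' column marks them as bad.
    Channels without a status entry are treated as good."""
    reader = csv.DictReader(io.StringIO(channels_tsv_text), delimiter="\t")
    return [row["name"] for row in reader
            if row.get("status", "good").lower() == "bad"]
```

The fiddly part was not this parsing, but making the written files and the validator agree, as described below.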

The BIDS-validator

I have written about BIDS before: it's a standard for organizing neuroimaging data. A standard can only be a standard when there is a set of testable rules to follow. The BIDS-validator is software that automatically checks a dataset for its compliance with the BIDS set of testable rules. The current BIDS-validator is written in JavaScript, which offers a unique advantage: it can be run locally inside a browser (see here), that is, no files are uploaded. This way, users of BIDS can employ the BIDS-validator without having to download software. Yet, for users with some programming experience, the BIDS-validator can also be downloaded as a command line tool to run on Node.js.

Alas, the big advantage of having the BIDS-validator implemented in JavaScript also comes at a cost: the programming language itself. With BIDS being a standard for scientific data, most of the user base consists of researchers. Only a fraction of researchers in the field of neuroscience is well versed in JavaScript, the lingua francas of the field being MATLAB and Python (in my experience, increasingly Python and less and less MATLAB). This means that open source contributions from researchers to the BIDS-validator are limited, and that BIDS-validator development relies on a small core of contributors and, to some extent, on contributions from a commercial company, funded through grants given to BIDS. The resulting problem is that the BIDS-validator often lags behind the development of the standard ... or that not all rules are tested to an appropriate extent.

Some rules of BIDS are implemented in the form of regular expressions (see here), and are thus "programming language agnostic". This is a great starting point for BIDS-validators implemented in other languages, and there is some limited Python support.
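To illustrate what such a language-agnostic rule looks like when reused from Python, here is a simplified stand-in pattern I wrote for illustration (not one of the validator's actual regular expressions) that checks an EEG raw-data filename for the required and optional BIDS entities:

```python
import re

# Simplified, illustrative pattern for a BIDS EEG raw-data filename:
# required sub- and task- entities, optional ses-, acq-, run-.
EEG_RAW = re.compile(
    r"^sub-[0-9a-zA-Z]+"
    r"(_ses-[0-9a-zA-Z]+)?"
    r"_task-[0-9a-zA-Z]+"
    r"(_acq-[0-9a-zA-Z]+)?"
    r"(_run-[0-9]+)?"
    r"_eeg\.(edf|vhdr|set|bdf)$")

def is_valid_eeg_filename(fname):
    """Check a bare filename against the (simplified) EEG naming rule."""
    return EEG_RAW.match(fname) is not None
```

Because the rule lives in a plain regular expression, a JavaScript validator and a Python tool can share it verbatim; only the thin checking code around it needs to be rewritten per language.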

Thinking about this, I often find myself going down the road of wanting to write a complete BIDS-validator in Python. The advantage would be obvious: much easier development! And the drawback that it wouldn't be as easily available from the browser as the current JavaScript implementation would be negligible for anyone who can install a Python package. Yet, as soon as we have more than one validator, we need to ensure that they produce exactly the same results ... and that could soon lead to another set of problems.

It seems there is no easy way out of this situation. For me, that means that while I develop features in Python during my GSoC, I will occasionally have to spend disproportionate amounts of time debugging and enhancing the BIDS-validator, with its codebase in JavaScript.


1 Although Eric from MNE-Python told me that "Success is measured by how many people you convinced to use and contribute to MNE-Python :)"

View Blog Post