Hello everyone,
In this post, I would like to tell you a little bit more about my Google Summer of Code (GSoC) Project and give you a quick summary of the progress I’ve made so far.
About my project.
I’m a PhD student from Germany. In my PhD work I focus on the analysis of brain activity patterns and how these are influenced by individuals' personality and other situational factors. Thus, some of the ideas for my GSoC project are, at least to some extent, rooted in issues I've come across while analyzing data and looking for ways to describe the relationship between a set of variables and patterns of brain activity.
In short, the core of my project consists of developing a set of tools and tutorials that extend the capabilities for regression analysis in MNE-Python, the premier toolbox for analyzing neural time series in Python. In statistics, linear regression is typically used for describing the relationship between predictors and response variables or targets. In particular, by determining the strength of the relationship between these variables, linear regression algorithms can help identify variables and/or subsets of data that contain relevant information about the things we would like to predict (e.g., in my case, patterns of brain activity).
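To make the idea a bit more concrete, here is a minimal sketch of an ordinary least squares fit on simulated data. It uses plain NumPy rather than MNE-Python's own API, and all variable names and the simulated values (an intercept of 1 and a slope of 2) are assumptions made purely for illustration.

```python
import numpy as np

# Simulate one continuous predictor and a noisy target that depends on it.
# The "true" intercept (1.0) and slope (2.0) are illustrative assumptions.
rng = np.random.default_rng(0)
n_obs = 200
predictor = rng.normal(size=n_obs)
noise = rng.normal(scale=0.5, size=n_obs)
target = 1.0 + 2.0 * predictor + noise

# Design matrix: a column of ones (intercept) plus the predictor.
design = np.column_stack([np.ones(n_obs), predictor])

# Ordinary least squares: find the betas that minimize the squared error.
betas, _, _, _ = np.linalg.lstsq(design, target, rcond=None)
intercept, slope = betas

print(f"estimated intercept: {intercept:.2f}, estimated slope: {slope:.2f}")
```

The estimated slope expresses how strongly the target changes with the predictor, which is the kind of "strength of relationship" mentioned above.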
To date, the linear regression functionality in MNE-Python mostly handles regression designs with categorical predictors, fitted by ordinary least squares estimation. Even though this approach can be used to inspect relationships between a wide variety of predictors and targets, the options for specifying more complex regression models, such as those based on robust and hierarchical estimation algorithms (see for instance here), are limited. This currently prevents users from applying MNE’s linear regression at a larger scale and from using the more elaborate regression tools common in many of the scientific fields for which MNE is relevant.
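As a rough illustration of what a design with a categorical predictor looks like, the sketch below dummy-codes two hypothetical experimental conditions and fits them to simulated multi-channel data with ordinary least squares. Again, this is plain NumPy and not MNE-Python's implementation; the number of channels, the condition coding, and the effect sizes are all made-up assumptions.

```python
import numpy as np

# Simulated data: 50 observations per condition, 3 hypothetical channels.
rng = np.random.default_rng(42)
n_per_condition, n_channels = 50, 3

# Categorical predictor, dummy-coded: 0 = condition A, 1 = condition B.
condition = np.repeat([0, 1], n_per_condition)

# Assumed effect of condition B on each channel (illustrative values only).
true_effect = np.array([0.0, 1.0, 2.0])
data = rng.normal(size=(2 * n_per_condition, n_channels))
data += condition[:, np.newaxis] * true_effect

# Design matrix: intercept plus the dummy-coded condition.
design = np.column_stack([np.ones(condition.size), condition])

# One OLS fit per channel (lstsq handles all target columns at once).
betas, *_ = np.linalg.lstsq(design, data, rcond=None)

# With this coding, the condition beta is the difference in condition means.
mean_difference = (data[condition == 1].mean(axis=0)
                   - data[condition == 0].mean(axis=0))
print("condition betas:  ", np.round(betas[1], 2))
print("mean differences: ", np.round(mean_difference, 2))
```

With a single dummy-coded predictor, the OLS coefficient for the condition is simply the difference between the two condition means. Robust or hierarchical estimators would replace the least squares step with a different fitting procedure.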
The major goal of my GSoC project is to provide a greater degree of flexibility and allow users to fit different types of models in accordance with their research questions. Feel free to visit the wiki page on GitHub if you’d like to learn more about the project.
What I've done so far.
During the first week of GSoC, I worked on integrating open data resources into MNE-Python and wrote a set of functions that allow for easy handling of these resources (see previous post and this PR on GitHub for further details). Such open data sets are fundamental to my project, since I plan to validate new implementations of the linear regression framework on them. Furthermore, I added some initial example code to explain the linear regression functionality on the basis of this newly integrated data set (see here).
There were some issues along the way, especially when it came to integrating my code into MNE's API. Thus, a big chunk of last week's work went into fixing errors and improving my code. However, I believe I learned a lot during the process, and I'm looking forward to further consolidating my proposal for the statistical modeling API in MNE-Python next week.
Stay tuned!