epassaro's Blog

Blog post: Week 2

Published: 06/02/2019

Pandas + regex = ♥

I usually avoid working with regular expressions, but sometimes they are the right tool for the job.

I had to write a parser for a variety of files which are almost identical in format. These files are the output of Fortran routines dating from 1995 to the present, and they contain atomic measurements made by physicists. The subtle differences between them make it impossible to use whitespace as the separator.

Fortunately, pandas allows you to pass a regular expression as the 'sep' argument of the pandas.read_csv function.
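Here is a minimal sketch of the idea, not the actual parser. The sample data, the column names and the separator pattern are all made up for illustration: fields are separated by two or more spaces, while a single space can appear inside a field, so splitting on plain whitespace would break the second column.

```python
import io

import pandas as pd

# Made-up sample (an assumption, not real data): columns are separated by
# two or more spaces, but the configuration label contains a single space.
SAMPLE = (
    "1  2s2 2p  0.000\n"
    "2  2s 2p2     35.500\n"
)

df = pd.read_csv(
    io.StringIO(SAMPLE),
    sep=r"\s{2,}",      # regular expression: two or more whitespace characters
    engine="python",    # regex separators are handled by the Python engine
    header=None,
    names=["level", "configuration", "energy"],  # hypothetical column names
)

print(df)
```

Passing engine="python" explicitly avoids the warning pandas emits when it falls back from the C engine for a regex separator.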

Also, one of my mentors is really good at regular expressions, so after a few tries we had our perfect parser.

See an example

Now we're capable of extracting data from more than 300 files in a simple and homogeneous way!

In Week 2 I had to move from these Jupyter Notebooks to the actual codebase. This was a challenge for me because I'm not so confident about my object-oriented programming skills, but it worked out! I successfully wrote new classes for parsers which can read files and dump data in the HDF5 format.
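To give a rough idea of what such a class can look like, here is a small sketch. It is not the code from the actual codebase: the class name LevelsParser, its column names and the separator pattern are hypothetical, and the to_hdf method relies on the optional PyTables package being installed.

```python
import io

import pandas as pd


class LevelsParser:
    """Hypothetical parser: reads a whitespace-irregular text file into a
    DataFrame and can dump it to HDF5."""

    # Assumed column layout, for illustration only.
    COLUMNS = ["level", "configuration", "energy"]

    def __init__(self, source):
        # `source` can be a file path or any file-like object.
        self.source = source
        self.data = self._read()

    def _read(self):
        # Split on runs of two or more whitespace characters.
        return pd.read_csv(
            self.source,
            sep=r"\s{2,}",
            engine="python",
            header=None,
            names=self.COLUMNS,
        )

    def to_hdf(self, path, key="levels"):
        # DataFrame.to_hdf needs the optional PyTables dependency.
        self.data.to_hdf(path, key=key)


# Usage with an in-memory sample instead of a real file.
parser = LevelsParser(io.StringIO("1  2s2 2p  0.000\n2  2s 2p2  35.500\n"))
print(parser.data)
```

Calling parser.to_hdf("levels.h5") would then write the table to disk, assuming PyTables is available.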

See an example


Moving to Python 3, continuous integration and more:

When I decided to learn Python I went for 3.5, so I skipped Python 2. The only thing I knew about "legacy" Python was the use of the print statement without parentheses.

At the beginning of the coding period I was told to get Travis CI to work again. Unit testing and continuous integration were things I had heard about but never had the chance to use. Porting our codebase to Python 3 was absolutely necessary in order to move on.

A few things I've learned in the process:

  • Look for calls to range(), zip(), and map() and wrap them in list() wherever the code expects a list: in Python 3 they return lazy objects instead of lists.
  • Sometimes it is good to pin package versions close to the ones that worked when the package was built.
  • The itertools functions izip(), imap(), and ifilter() no longer exist in Python 3 (the built-in zip(), map(), and filter() replace them), so look for them!
  • Of course, print is a function now, so use parentheses in the print calls.
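A few of these points can be sketched in plain Python 3. The values below are made up for illustration and nothing here comes from the actual codebase:

```python
from itertools import zip_longest  # named izip_longest in Python 2

# In Python 3, range() returns a lazy range object, not a list.
numbers = range(3)
assert not isinstance(numbers, list)

# Wrap the call in list() when the code really needs a list.
as_list = list(range(3))

# zip() and map() return one-shot iterators, so wrap them as well
# if the result is indexed or consumed more than once.
pairs = list(zip("ab", [1, 2]))
squares = list(map(lambda x: x * x, range(4)))

# itertools.izip and itertools.imap are gone: the built-in zip() and
# map() are already lazy. izip_longest was renamed to zip_longest.
padded = list(zip_longest("abc", [1]))

# print is a function, so parentheses are mandatory.
print("squares:", squares)
```

Wrapping everything in list() is the safe choice while porting. Where the result is only iterated over once, the lazy version can stay as it is.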

Fortunately, Travis CI is "easy" to configure, especially if you have experience with Bash.



This entry can also be found at dev.to/epassaro


Check in: Week 1

Published: 05/31/2019

1. What did you do this week?

In the first week of the coding period I wrote the energy levels and oscillator strengths parser, as stipulated in my proposal. I also had to port the Carsus package from Python 2 to Python 3 and build it with Travis CI.

2. What is coming up next?

Next week I will work on the collisional energies parser.

3. Did you get stuck anywhere?

No, I didn't.


Winter is coding

Published: 05/21/2019

Hi! I'm Ezequiel from Argentina, and during the next 12 weeks (southern hemisphere winter) I will be working with the TARDIS sub-organization on the project called "Expansion of the TARDIS Atomic Database" as part of the Google Summer of Code 2019 program.


TARDIS is a Monte Carlo radiative transfer code whose primary goal is the calculation of theoretical spectra for supernovae based on a number of input parameters, such as the supernova brightness and the abundances of the different chemical elements present in the ejecta. The main idea for this procedure is that by finding a close match between theoretical and observed spectra the parameters that actually describe the supernovae can be identified.

The objective of this proposal is to incorporate new atomic data into the TARDIS database. Several tasks are required to accomplish this: writing parsers for different file types, adding unit tests, integrating everything with the TARDIS codebase, and more. Finally, it will be crucial to determine how the new atomic data affects the synthetic spectra.

The result of this work will not only be of great value for TARDIS, but also for many researchers who require atomic measurements.


The coding period starts on May 27th, so now we are in the middle of something called the "bonding period", where organizations and students do some preliminary work. Stay tuned for more updates!
