Scientific Software
The software packages I have authored address pertinent needs of the structural biology research field and leverage my ten-year background in studying protein structure and dynamics experimentally. My software mainly focuses on modeling intrinsically disordered proteins, designing the architecture of modular docking software, handling protein structure data, assembling data analysis pipelines, and establishing best practices for code development. Projects are listed below.
Opinion and technical articles
I enjoy writing about coding solutions to advanced problems I face at work. My article “Rethinking Python Decorators” on how to rewrite decorators to apply them in multiprocessing and pickling exemplifies my approach toward programming and advanced problem-solving. My articles on design choices and implementation strategies in Python are on DEV.to.
Talks on software development
- “From modeling multidomain proteins with HADDOCK2 to building HADDOCK3.”
  20 Years of HADDOCK, Huizen, The Netherlands, 7-10th November, 2023
  Twitter | LinkedIn | Slides
- “Modular code for a modular software: developing HADDOCK3”.
  Instruct Software Developers Exchange Webinar 5, June 10th, 2022, invited.
  Website | Twitter | LinkedIn | Youtube | Slides
- “Introducing HADDOCK3: Enabling modular integrative modelling pipelines”
  BioExcel Webinar, June 7th, 2022, invited
  Website | Twitter | LinkedIn | Youtube | Slides
- “Introducing HADDOCK3, Enabling modular integrative modelling pipelines”
  NMR Meeting at Utrecht University, 1st June, 2022
  Slides
- “Effective open-science practices for organizing a scientific software repository - extended”
  International Symposium on Grids & Clouds (ISGC), 20-25th March, 2022, Virtual Conference.
  Abstract | Slides
- “Effective open-science practices for organizing a scientific software repository - short”
  OSCU Open Science Symposium - Faculty of Science, October 10th, 2021, Utrecht University, Netherlands.
  Slides on OSF | Slides on GDrive
Scientific Software
Python project skeleton
Python-project-skeleton (PPS) is a template repository where I explain how to configure a Python package using the latest best practices in open-source software. PPS has an efficient package structure, documentation, tests, and continuous integration actions covering all needs for automatic deployment. Others can use PPS as a template for their projects or browse it for educational purposes, because I thoroughly document every strategy I adopted.
IDPConformerGenerator
IDPConformerGenerator (IDPCG) is a flexible, highly competitive, and modular open-source software platform for sampling the conformational space of intrinsically disordered proteins. From an input amino acid sequence, IDPCG can generate large and diverse ensembles of all-atom disordered protein states that obey geometric, steric, and other physical restraints. It also models loops and terminal tails with post-translational modifications and accounts for bound protein partners and lipid bilayers!
HADDOCK3
“HADDOCK3 is the next generation integrative modelling software in the long-lasting HADDOCK project. It represents a complete rethinking and rewriting of the HADDOCK2.X series, implementing a new way to interact with HADDOCK and offering new features to users who can now define custom workflows … read more”
pdb-tools
pdb-tools is a Swiss Army knife for manipulating and editing PDB files. It is an ecosystem of more than 40 commands that users can chain freely for maximum versatility. pdb-tools is written entirely with the Python standard library and requires no dependencies.
SPyCi-PDB
SPyCi-PDB is a user-friendly Python interface to back-calculate experimental data for single PDB structures or ensembles of them. SPyCi-PDB can calculate NMR chemical shifts, NOE, J-couplings, PRE, RDCs, SAXS, and hydrodynamic radius.
taurenmd
taurenmd is a command-line ecosystem to analyze and manipulate Molecular Dynamics trajectories. I wrote taurenmd to have a centralized platform with a common API that leverages the potential of several software packages, such as MDAnalysis and MDTraj. taurenmd also serves as a hub to implement standardized protocols for MD analysis. taurenmd can be used as a Python library, but it was designed to work mainly from the command line.
FarSeer-NMR
FarSeer-NMR is a software suite for automatic treatment, analysis and plotting of large and multi-variable data sets of NMR peaklists of proteins. Users can provide peaklists from complex titration schemes, for example, the same protein against multiple ligands, a temperature range, and a pH range, and FarSeer-NMR will calculate the spectral changes for every combination of conditions.
mdacli
mdacli is a command-line interface for MDAnalysis. The aim is to provide access to all MDAnalysis operations from the command line. The command-line interfaces are created automatically from the MDAnalysis analysis classes, avoiding the need to manually update the codebase when new functionality is added to the main MDAnalysis project.
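The general idea of deriving a command line from a class can be sketched as follows. This is only an illustration of the technique, not mdacli's actual implementation: `Rescale`, `build_parser`, and `run_from_cli` are hypothetical names, and `Rescale` is a stand-in class that is not part of MDAnalysis. The sketch reads the constructor signature with `inspect` and turns each parameter into an `argparse` option.

```python
import argparse
import inspect


class Rescale:
    """Hypothetical stand-in for an analysis class (not part of MDAnalysis)."""

    def __init__(self, factor: float = 1.0, steps: int = 10):
        self.factor = factor
        self.steps = steps

    def run(self):
        # Return a small list so the result is easy to inspect.
        return [self.factor * i for i in range(self.steps)]


def build_parser(cls):
    """Create an argparse parser whose options mirror ``cls.__init__``."""
    parser = argparse.ArgumentParser(
        prog=cls.__name__.lower(),
        description=cls.__doc__,
    )
    for name, param in inspect.signature(cls.__init__).parameters.items():
        if name == "self":
            continue
        # Use the type annotation as the converter; fall back to str.
        if param.annotation is inspect.Parameter.empty:
            arg_type = str
        else:
            arg_type = param.annotation
        if param.default is inspect.Parameter.empty:
            # Parameters without a default become required options.
            parser.add_argument(f"--{name}", type=arg_type, required=True)
        else:
            parser.add_argument(f"--{name}", type=arg_type, default=param.default)
    return parser


def run_from_cli(cls, argv):
    """Parse ``argv``, instantiate ``cls`` with the parsed values, and run it."""
    args = build_parser(cls).parse_args(argv)
    return cls(**vars(args)).run()


result = run_from_cli(Rescale, ["--factor", "2.5", "--steps", "3"])
print(result)  # [0.0, 2.5, 5.0]
```

Because the options come from the signature, adding a parameter to the class adds the matching command-line option without touching the parser code.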