I have written before on this blog about the need to standardise data science and its processes. Microsoft has made progress on that front by releasing a methodology called TDSP (Team Data Science Process), an attempt to formalize the way that data scientists work and collaborate.

To support this methodology further, Microsoft recently released two valuable tools for TDSP, which you can find in this repository: https://github.com/Azure/Azure-TDSP-Utilities/tree/master/DataScienceUtilities. The Modeling tool automates the testing of different algorithms, and IDEAR helps with data exploration and reporting.

Both tools are written in R. The Modeling tool uses YAML to specify experiments, which makes it pretty convenient: you describe what to run in a configuration file rather than in code. The cases covered so far are binary classification and regression, but I would expect to see more added in the future.
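To give a feel for why a declarative experiment spec is convenient, here is a minimal sketch of the general idea in plain Python. It is not the Modeling tool's code, and the keys in `experiment` are made up for illustration; they are not the tool's actual YAML schema. The dictionary stands in for what a YAML file might parse into, and the two toy "algorithms" and the tiny dataset are likewise assumptions.

```python
from statistics import mean

# Hypothetical spec: roughly what a YAML experiment file might parse into.
# These keys are illustrative only, NOT the TDSP Modeling tool's schema.
experiment = {
    "task": "regression",
    "target": "y",
    "algorithms": ["mean_baseline", "linear"],
    "metric": "rmse",
}

# Toy data, invented for this example.
data = [
    {"x": 1.0, "y": 2.1},
    {"x": 2.0, "y": 3.9},
    {"x": 3.0, "y": 6.2},
    {"x": 4.0, "y": 7.8},
]


def fit_mean_baseline(rows, target):
    """Predict the mean of the target for every row."""
    avg = mean(r[target] for r in rows)
    return lambda row: avg


def fit_linear(rows, target):
    """One-feature ordinary least squares on the column 'x'."""
    xs = [r["x"] for r in rows]
    ys = [r[target] for r in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return lambda row: intercept + slope * row["x"]


# Map the names used in the spec to fitting functions.
FITTERS = {"mean_baseline": fit_mean_baseline, "linear": fit_linear}


def rmse(model, rows, target):
    """Root mean squared error of a fitted model on the given rows."""
    return mean((model(r) - r[target]) ** 2 for r in rows) ** 0.5


def run_experiment(spec, rows):
    """Fit every algorithm named in the spec and score it.

    For brevity the score is computed on the training rows themselves;
    a real comparison would use held-out data.
    """
    results = {}
    for name in spec["algorithms"]:
        model = FITTERS[name](rows, spec["target"])
        results[name] = rmse(model, rows, spec["target"])
    return results


results = run_experiment(experiment, data)
best = min(results, key=results.get)
print(results, "best:", best)
```

Adding another algorithm to the comparison then means adding one name to the spec and one entry to `FITTERS`, without touching the loop that runs the experiment.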

Hopefully, we’ll see more tools like these coming out.