Or at least a big part of it should be…

TL;DR
Data scientists should become MLOps Engineers or Analytics Translators.

We keep doing the same things over and over again

In my experience as a data science consultant, I see many companies doing the same things over and over again. That is odd, because data scientists, as partly technical people, can be expected to automate repetitive tasks. Indeed, we have many tools available to automate common tasks. Hardly any data scientist I know still programs their own machine learning algorithms. We all use Scikit-Learn and other libraries to train decision trees and the like. Many of us use MLflow or similar tools to keep track of the experiments…


This blog post provides some context for my more extensive blog post titled "Data science is boring".

What comprises a data science project? I like to think of a data science project as a three-layered cake: Business, Analysis, Data.

Three-layered cake of a data science project. (Here’s how to bake your own at Odlums.)

Generally speaking, during a data science project we move from top to bottom, and back up again, although in reality the path tends to be more mercurial.


By Erik Jan de Vries for BigData Republic

Many popular machine learning algorithms run much faster on a GPU. In this blog post I will discuss setting up your Linux system for GPU-powered machine learning using Nvidia-Docker. Hopefully it will help you avoid some of the frustrations I faced while setting up my system.

This blog post is part of a larger series:

My laptop is an MSI GS63VR, with an Nvidia GeForce 1070 GPU and an…


By Erik Jan de Vries for BigData Republic

Is Linux your preferred operating system for machine learning, especially for GPU-accelerated algorithms? But do you like to use Microsoft Office for word processing, spreadsheets and presentations? Or perhaps you like to use your GPU for gaming on Microsoft Windows? In this blog post I will discuss my experiences with setting up a dual-boot system. Hopefully it will help you avoid some of the frustrations I faced while setting up my system.

Update (2020–10): Microsoft has created Windows Subsystem for Linux (WSL), which offers most of the benefits previously only possible…


By Erik Jan de Vries for BigData Republic

As a Data Scientist, I prefer Linux as my operating system for machine learning, especially for GPU-accelerated algorithms. On the other hand, I like to use Microsoft Office for word processing, spreadsheets and presentations. This blog post is the introduction to a series in which I discuss my experiences with setting up my laptop for Data Science.

As a Data Scientist, there are several very different aspects to my job. The most obvious one is probably…

Erik Jan de Vries

Consultant and Lead Data Scientist at BigData Republic — https://linkedin.com/in/erikjandevries
