TL;DR
Data scientists should become either MLOps Engineers or Analytics Translators.
In my experience as a data science consultant, I see many companies doing the same things over and over again. That is odd, because data scientists (as partly technical people) can be expected to automate repetitive tasks. Indeed, we have many tools available to automate common tasks. Hardly any data scientist I know still programs their own machine learning algorithms. We all use Scikit-Learn and other libraries to train decision trees and the like. Many of us use MLflow or similar tools to keep track of the experiments we run and the models we train. …
This blog post provides some context for my more extensive blog post titled
Data science is boring.
What comprises a data science project? I like to think of a data science project as a three-layered cake: Business, Analysis, Data.
Generally speaking, during a data science project we move from top to bottom, and back up again, although in reality the path tends to be more mercurial.
By Erik Jan de Vries for BigData Republic
Many popular machine learning algorithms run much faster on a GPU. In this blog I will discuss setting up your Linux system for GPU-powered machine learning using Nvidia-Docker. Hopefully it will help you avoid some of the frustrations I faced while setting up my system.
This blog post is part of a larger series:
My laptop is an MSI GS63VR, with an Nvidia GeForce GTX 1070 GPU and an Intel GPU integrated into the CPU. My goal is to use the Nvidia GPU for machine learning, while using the integrated Intel GPU for displaying the desktop. In addition, I would like to use Docker containers for running my machine learning algorithms, so as the foundation for my machine learning workstation I am going to set up Nvidia-Docker to allow my Docker containers to run algorithms on the GPU using CUDA. …
By Erik Jan de Vries for BigData Republic
Is Linux your preferred operating system for machine learning, especially for GPU-accelerated algorithms? Do you also like to use Microsoft Office for word processing, spreadsheets and presentations? Or perhaps you like to use your GPU for gaming on Microsoft Windows? In this blog I will discuss my experiences with setting up a dual-boot system. Hopefully it will help you avoid some of the frustrations I faced while setting up my system.
Update (2020–10): Microsoft has created Windows Subsystem for Linux (WSL), which offers most of the benefits previously only possible with a dual-boot system. Currently GPU acceleration is not available for the Linux kernel, but Microsoft appears to be working on that for a future release. Make sure to check out WSL!
Getting started: https://www.sitepoint.com/wsl2/ …
By Erik Jan de Vries for BigData Republic
As a Data Scientist, I prefer Linux as my operating system for machine learning, especially for GPU-accelerated algorithms. On the other hand, I like to use Microsoft Office for word processing, spreadsheets and presentations. This blog is the introduction to a series in which I discuss my experiences with setting up my laptop for Data Science.
As a Data Scientist, there are several very different aspects to my job. The most obvious one is probably analysing data and making predictions. On the other hand, the most important aspect of my job may well be interacting with the business: if I don’t fully understand how the business works, I cannot create the best solutions for them. Unfortunately, these two worlds — analysing data and communicating with the business — have very different technological requirements. …