Experiment design for (marketing) campaigns Nov 2, 2022 In a marketing campaign we always need to measure performance. This is typically done by looking at the difference in performance between the target and control groups. To measure success effectively, the business stakeholder will probably tell you about a few metrics they are interested in. However, it is up to the data scientist/analyst to set up the experiment in a way that allows a statistically meaningful measurement. This needs to be done before launching the campaign. ...
Getting started in Data Science - For the people in a completely different domain Jul 21, 2020 You might be sitting at a pivot point in life where you have decided to try something completely new, and someone told you data science is interesting. You want to get started, but the commonly available material is too technical for you to figure out where to begin. And you might be after some structured steps that will walk you through the basics and set you on the right track. ...
Serverless data ingestion using AWS Fargate Jul 21, 2020 This page documents the steps I undertook to automate the daily data ingestion into a database we use on a daily basis (Postgres in RDS). It can serve as a template for future application containerization. While this may be trivial for the DevOps team, it is certainly a bit outside of the usual data science skill set. While it is not uncommon for data scientists to run things via Docker when they need complete isolation, the Fargate part is something most data scientists will not touch in their usual workflow. ...
Hosting your own password protected Jupyter Notebook server on AWS Nov 4, 2018 It’s very common to need to share your interactive Jupyter notebook with your colleagues. Maybe you are working on a project where you depend on interactivity to present information easily, or maybe your spooky stakeholders want to see ‘something’ on hover in your carefully crafted plots, forcing you to avoid a GitHub or nbviewer solution. Google Colab notebooks are an option, but for some reason you don’t want to use those either. ...
Realtime Sentiment Analysis using Apache Ni-Fi, R and Shiny web app Feb 14, 2018 Imagine doing a sentiment analysis on an ongoing event based on a realtime Twitter feed. Twitter will give you a lot of tweets, and you can use R, Python or any other language to collect them. But what happens if you want to do that for a really long time by running an R/Python function in the background, i.e. how do you handle streaming? The open source world has provided us with a couple of alternatives which can handle terabytes of streaming data. ...