# Dr. Dror

Foo is not just bar

# Articles

#### Hedge Yourself From a Risky Data Science Job

I recently came across a wonderful post by Talia Borodin titled "Think Your Company Needs a Data Scientist? You're Probably Wrong". If you haven't read it yet, make sure you do! It contains a wonderful collection of truths that every individual who has anything to do with data science …

• Mon 19 February 2018
• Stats

#### Comparing $z$ and $\chi^2$ tests

Showing that, in at least one specific case, the two tests are equivalent.

• Fri 05 January 2018
• HowTo

#### Moving from local machine to Dask cluster using Terraform

A tutorial on how to start a cluster of Dask instances on AWS (EC2) and use it to run an expensive grid search.

• Tue 14 November 2017
• HowTo

#### The importance of order

When grouping a DataFrame, the order matters and may be surprising.

• Fri 22 September 2017
• DS

#### Messages taken home from "AI in Data Science presented by RecdoTech"

Some remarks and highlights taken from a meetup I attended.

• Sun 10 September 2017
• Stats

#### Why do we need to divide by n-1?

An attempt to give an intuitive understanding of the difference between the biased and unbiased estimators of the variance of a sample.
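To make the difference concrete, here is a minimal sketch, not taken from the article itself. It uses a hypothetical three-value population and enumerates every sample of size 2 drawn with replacement, then averages both estimators over all of those samples. Dividing by n underestimates the population variance by a factor of (n-1)/n on average, while dividing by n-1 recovers it.

```python
from itertools import product
from statistics import mean, pvariance, variance

# Hypothetical toy population, chosen only for illustration.
population = [1, 2, 3]
n = 2

# Every possible sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

# Variance of the whole population (divides by the population size).
true_variance = pvariance(population)

# Biased estimator: divide by n (pvariance applied to each sample).
mean_biased = mean(pvariance(s) for s in samples)

# Unbiased estimator: divide by n - 1 (variance applied to each sample).
mean_unbiased = mean(variance(s) for s in samples)

print(true_variance, mean_biased, mean_unbiased)
```

Here `mean_unbiased` matches `true_variance`, while `mean_biased` comes out at half of it, which is (n-1)/n for n = 2.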

• Mon 28 August 2017
• HowTo

#### Averages, sums and counts when grouping by

A gotcha when aggregating time series data involving hourly counts.

#### Group by date from a column

Assume you have a data set with the columns ID, Date and Value, where each row contains an ID, a date (given as a pandas datetime) and a value. The objective is to count how many rows fall on each day. In [1]: import pandas as pd import numpy as np …
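One possible way to do this, as a minimal sketch rather than the article's own solution: the rows below are made up for illustration, and grouping on `df["Date"].dt.date` is just one of several options (resampling or `pd.Grouper` would also work).

```python
import pandas as pd

# Hypothetical data in the ID / Date / Value layout described above.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Date": pd.to_datetime([
        "2017-06-27 08:15",
        "2017-06-27 17:40",
        "2017-06-28 09:00",
        "2017-06-30 12:30",
    ]),
    "Value": [0.5, 1.2, 0.7, 2.1],
})

# Group on the calendar day of each timestamp and count the rows per group.
rows_per_day = df.groupby(df["Date"].dt.date).size()

print(rows_per_day)
```

Note that days without any rows (2017-06-29 in this made-up data) do not appear in the result at all; they are not reported with a count of zero.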

• Tue 27 June 2017
• DS

#### Benchmarking Columns Operations

Benchmarking different ways to process two columns simultaneously.

• Thu 22 June 2017
• ML

#### Some learnings from implementing a transformer

I had to (or at least I thought I had to) implement a transformer to be used in a sklearn.pipeline.Pipeline. In a nutshell, I implemented the transform method badly. The original version can be found in this gist. In the following version I fixed it. Furthermore, I left …