Motivation Consider the case where you implement some logic somewhere, and this logic should be used from within several different places. Think of a logic and two apps using it. The logic should be yielding logging messages that would be visible as part of the running of the different apps …
You will build a RESTful API exposing a trained prediction model.
I recently came across a wonderful post by Talia Borodin titled "Think Your Company Needs a Data Scientist? You're Probably Wrong". If you didn't ready it yet, make sure you do it! It contains a wonderful collection of truths that every individual who has anything to do with data science …
Showing that at least in one certain case the two tests are the same
Tutorial on how to start a cluster of dask instances on AWS (EC2). Using this cluster execute an expansive grid search.
When grouping by DataFrame the order does matter and may be surprising.
Some remarks and highlights from taken from a meetup I attended.
Trying to give an intuitive understanding what's the difference between a biased and unbiased estimators of variance of a sample.
A gotcha when aggregated time series data involving hourly based counts.
Assume you have data set as follows: ID Date Value x x x where each row contains an ID, a date (given as pd.Datetime) and a value. The objective is to count how many rows occur in each day. In : import pandas as pd import numpy as np …