# Dr. Dror

Foo is not just a "Bar"

# Group by date from a column

(Original notebooks can be found in this gist)

Assume you have data set as follows:

ID Date Value
x x x

where each row contains an ID, a date (given as `pd.Datetime`) and a value. The objective is to count how many rows occur in each day.

``````import pandas as pd
import numpy as np
import datetime
import random
``````
``````def random_date(start, end):
"""Generate a random datetime between `start` and `end`

Thanks to https://stackoverflow.com/a/8170651/671013
"""
return start + datetime.timedelta(
# Get a random amount of seconds between `start` and `end`
seconds=random.randint(0, int((end - start).total_seconds())),
)
``````

Let's generate a random data set:

``````N = 100
df = pd.DataFrame(
{
"date": [random_date(datetime.datetime(2017,5,1), datetime.datetime(2017,7,1)) for x in range(N)],
"val": np.random.choice([0,1], size=N)
}
)
``````

Magic:

``````df.groupby(
[df['date'].map(lambda x: x.month),
df['date'].map(lambda x: x.day)]
).size()
``````
``````    date  date
5     1       4
2       2
3       2
4       1
5       4
6       2
7       1
9       1
10      2
11      1
12      1
13      2
14      2
15      1
16      1
17      4
19      4
20      2
21      1
22      1
24      1
25      1
26      3
27      1
28      2
29      1
31      1
6     1       2
2       2
3       2
4       3
5       2
6       1
7       1
9       4
10      2
11      1
12      1
13      1
15      1
16      2
18      1
19      5
20      3
21      2
23      3
24      3
26      2
27      3
29      1
30      3
dtype: int64
``````