python - Pandas: How to resample dataframe such that each combination is present?
Assume I have the following data frame:
    import pandas as pd

    # data
    t = pd.to_datetime(pd.Series(['2015-01-01', '2015-02-01', '2015-03-01',
                                  '2015-04-01', '2015-01-01', '2015-02-01']))
    g = pd.Series(['a', 'a', 'a', 'a', 'b', 'b'])
    v = pd.Series([12.1, 14.2, 15.3, 16.2, 12.2, 13.7])
    df = pd.DataFrame({'time': t, 'group': g, 'value': v})

    # show data
    >>> df
            time group  value
    0 2015-01-01     a   12.1
    1 2015-02-01     a   14.2
    2 2015-03-01     a   15.3
    3 2015-04-01     a   16.2
    4 2015-01-01     b   12.2
    5 2015-02-01     b   13.7
What I would like to have in the end is the following data frame:
    >>> df
            time group  value
    0 2015-01-01     a   12.1
    1 2015-02-01     a   14.2
    2 2015-03-01     a   15.3
    3 2015-04-01     a   16.2
    4 2015-01-01     b   12.2
    5 2015-02-01     b   13.7
    6 2015-03-01     b   13.7
    7 2015-04-01     b   13.7
That is, the missing observations in group b should be added, and the missing values should default to the last observed value.

How can I achieve this? Thanks in advance!
You can use pivot for reshaping, then ffill to fill the NaN values (the same as fillna with method ffill), and finally reshape back to the original layout with unstack and reset_index:
    print (df.pivot(index='time', columns='group', values='value')
             .ffill()
             .unstack()
             .reset_index(name='value'))

      group       time  value
    0     a 2015-01-01   12.1
    1     a 2015-02-01   14.2
    2     a 2015-03-01   15.3
    3     a 2015-04-01   16.2
    4     b 2015-01-01   12.2
    5     b 2015-02-01   13.7
    6     b 2015-03-01   13.7
    7     b 2015-04-01   13.7
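To make the chain above easier to follow, here is a minimal sketch that runs the same three steps one at a time on the question's data. The intermediate names `wide`, `filled`, and `result` are only illustrative and are not part of the original answer.

```python
import pandas as pd

# Sample data from the question (group labels 'a' and 'b').
df = pd.DataFrame({
    'time': pd.to_datetime(['2015-01-01', '2015-02-01', '2015-03-01',
                            '2015-04-01', '2015-01-01', '2015-02-01']),
    'group': ['a', 'a', 'a', 'a', 'b', 'b'],
    'value': [12.1, 14.2, 15.3, 16.2, 12.2, 13.7],
})

# Step 1: one row per time, one column per group.
# Combinations that are missing in df show up as NaN.
wide = df.pivot(index='time', columns='group', values='value')

# Step 2: forward fill down each column, so every group
# carries its last observed value into the gaps.
filled = wide.ffill()

# Step 3: back to long format, one row per (group, time) pair.
result = filled.unstack().reset_index(name='value')
print(result)
```

Note that ffill can only fill gaps that come after an observation. If a group started later than the earliest time, its leading rows would stay NaN.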
Another solution is to first build a date_range from the min and max values of time. Then use groupby and conform each group to that daily (D) index, here with reindex, followed by ffill:
Notice: I think you forgot the parameter format='%Y-%d-%m' in to_datetime, if the last number is the month:
    # parse the strings as year-day-month, so the last number is the month
    t = pd.to_datetime(pd.Series(['2015-01-01', '2015-02-01', '2015-03-01',
                                  '2015-04-01', '2015-01-01', '2015-02-01']),
                       format='%Y-%d-%m')
    # rebuild df so that it uses the re-parsed dates
    df = pd.DataFrame({'time': t, 'group': g, 'value': v})

    # full daily range between the first and last observation
    idx = pd.date_range(df.time.min(), df.time.max())
    print (idx)
    DatetimeIndex(['2015-01-01', '2015-01-02', '2015-01-03', '2015-01-04'],
                  dtype='datetime64[ns]', freq='D')

    # reindex every group to the full range, then forward fill the gaps
    df1 = (df.groupby('group')
             .apply(lambda x: x.set_index('time')
                               .reindex(idx))
             .ffill()
             .reset_index(level=0, drop=True)
             .reset_index()
             .rename(columns={'index':'time'}))
    print (df1)
            time group  value
    0 2015-01-01     a   12.1
    1 2015-01-02     a   14.2
    2 2015-01-03     a   15.3
    3 2015-01-04     a   16.2
    4 2015-01-01     b   12.2
    5 2015-01-02     b   13.7
    6 2015-01-03     b   13.7
    7 2015-01-04     b   13.7
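If the dates are really monthly, as the question's data suggests, the same reindex idea can be sketched as follows. This is a variation and not the code from the answer above. It assumes the default year-month-day parsing and month-start dates (hence freq='MS'), it builds every (group, time) pair with MultiIndex.from_product, and it forward fills inside each group so a value can never carry over from one group into the next. The names `full` and `out` are only illustrative.

```python
import pandas as pd

# Sample data from the question, parsed as year-month-day (an assumption).
df = pd.DataFrame({
    'time': pd.to_datetime(['2015-01-01', '2015-02-01', '2015-03-01',
                            '2015-04-01', '2015-01-01', '2015-02-01']),
    'group': ['a', 'a', 'a', 'a', 'b', 'b'],
    'value': [12.1, 14.2, 15.3, 16.2, 12.2, 13.7],
})

# Every month start between the first and the last observation.
idx = pd.date_range(df['time'].min(), df['time'].max(), freq='MS')

# Every (group, time) combination that should exist in the result.
full = pd.MultiIndex.from_product([df['group'].unique(), idx],
                                  names=['group', 'time'])

out = (df.set_index(['group', 'time'])
         .reindex(full)            # missing combinations become NaN rows
         .groupby(level='group')   # fill within each group only
         .ffill()
         .reset_index())
print(out)
```

Grouping before ffill matters when a group has no observation at the first date: a plain ffill over the stacked rows would then copy the last value of the previous group.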