python - Loop over groups of a pandas DataFrame and get sum/count
I am using a pandas DataFrame to process my data. This is my DataFrame:
And this is the code that enabled me to get this DataFrame:
    (data[['time_bucket', 'beginning_time', 'bitrate', 2, 3]].groupby(['time_bucket', 'beginning_time', 2, 3])).aggregate(np.mean)
Now I want to have the sum (ideally, the sum and the count) of the 'bitrate' values grouped in the same time_bucket. For example, for the first time_bucket, (2016-07-08 02:00:00, 2016-07-08 02:05:00), it must be 93750000 as the sum and 25 as the count for 'bitrate'.
I did this:
    data[['time_bucket', 'bitrate']].groupby(['time_bucket']).agg(['sum', 'count'])
And this is the result:
But I really want to have all of my data in one DataFrame.
Can I do a simple loop over 'time_bucket' and apply a function that calculates the sum of all bitrates? Any ideas? Thanks!
I think you need merge, but merging on the index requires the same index levels in both DataFrames, so use reset_index first. Last, restore the original MultiIndex with set_index:
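    import numpy as np
    import pandas as pd

    data = pd.DataFrame({'a':[1,1,1,1,1,1],
                         'b':[4,4,4,5,5,5],
                         'c':[3,3,3,1,1,1],
                         'd':[1,3,1,3,1,3],
                         'e':[5,3,6,5,7,1]})
    print (data)
       a  b  c  d  e
    0  1  4  3  1  5
    1  1  4  3  3  3
    2  1  4  3  1  6
    3  1  5  1  3  5
    4  1  5  1  1  7
    5  1  5  1  3  1

    # mean of 'e' for each (a, b, c, d) group
    df1 = data[['a', 'b', 'c', 'd','e']].groupby(['a', 'b', 'c', 'd']).aggregate(np.mean)
    print (df1)
               e
    a b c d
    1 4 3 1  5.5
          3  3.0
      5 1 1  7.0
          3  3.0

    # sum and count of 'c' for each 'a', computed from the original rows
    df2 = data[['a', 'c']].groupby(['a'])['c'].agg(['sum', 'count'])
    print (df2)
       sum  count
    a
    1   12      6

    # move b, c, d out of the index so both frames are indexed by 'a' only,
    # merge on that index, then append b, c, d back to the index
    print (pd.merge(df1.reset_index(['b','c','d']), df2, left_index=True, right_index=True)
             .set_index(['b','c','d'], append=True))
               e  sum  count
    a b c d
    1 4 3 1  5.5   12      6
          3  3.0   12      6
      5 1 1  7.0   12      6
          3  3.0   12      6

I also tried a solution that works from the output df1, but that data is already aggregated, so it is impossible to get the right values from it. If you sum level c of df1, you get 8 instead of 12.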
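The same reset_index / merge / set_index pattern can be sketched with the column names from the question. This is only an illustration: the sample values below are made up, the time buckets are plain strings rather than the intervals in the question, and the extra grouping columns 2 and 3 are left out.

```python
import pandas as pd

# Hypothetical sample data: column names follow the question, values are invented.
data = pd.DataFrame({
    'time_bucket': ['02:00-02:05', '02:00-02:05', '02:00-02:05',
                    '02:05-02:10', '02:05-02:10'],
    'beginning_time': ['02:00:10', '02:00:10', '02:03:00',
                       '02:06:00', '02:07:30'],
    'bitrate': [1000, 3000, 2000, 4000, 6000],
})

# Mean bitrate per (time_bucket, beginning_time) group.
df1 = (data.groupby(['time_bucket', 'beginning_time'])[['bitrate']]
           .mean()
           .rename(columns={'bitrate': 'bitrate_mean'}))

# Sum and count per time_bucket, computed from the original rows,
# not from the already aggregated df1.
df2 = data.groupby('time_bucket')['bitrate'].agg(['sum', 'count'])

# Leave only time_bucket in df1's index, merge on the index,
# then put beginning_time back as the second index level.
result = (pd.merge(df1.reset_index('beginning_time'), df2,
                   left_index=True, right_index=True)
            .set_index('beginning_time', append=True))

print(result)
```

Every row of a bucket carries that bucket's sum and count next to its own mean. In this sample the group means of the first bucket add up to 4000 while the true sum is 6000, which is why the totals are taken from the original rows.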

