pytz - Pandas: convert datetime with a separate time zone column


I have a dataframe with a column for the time zone and a column for the datetime. I would like to convert these to UTC first so I can join them with other data, and then I'll have some calculations that convert from UTC to the viewer's local time zone eventually.

    datetime              time_zone
    2016-09-19 01:29:13   America/Bogota
    2016-09-19 02:16:04   America/New_York
    2016-09-19 01:57:54   Africa/Cairo

    def create_utc(df, column, time_format='%Y-%m-%d %H:%M:%S'):
        timezone = df['tz']
        df[column + '_utc'] = df[column].dt.tz_localize(timezone).dt.tz_convert('UTC').dt.strftime(time_format)
        df[column + '_utc'].replace('NaT', np.nan, inplace=True)
        df[column + '_utc'] = pd.to_datetime(df[column + '_utc'])
        return df

That was my flawed attempt. The error says the truth value is ambiguous, which makes sense because the 'timezone' variable refers to a whole column. How do I refer to the value in the same row?

Edit: here are some results from the answers below on 1 day of data (394,000 rows and 22 unique time zones).

Edit2: I added a groupby example in case someone wants to see the results. It is the fastest, by far.

    %%timeit
    # .loc stands in for the .ix indexer used in the original post (.ix was removed in pandas 1.0)
    for tz in df['tz'].unique():
        df.loc[df['tz'] == tz, 'datetime_utc2'] = df.loc[df['tz'] == tz, 'datetime'].dt.tz_localize(tz).dt.tz_convert('UTC')
    df['datetime_utc2'] = df['datetime_utc2'].dt.tz_localize(None)

1 loops, best of 3: 1.27 s per loop

    %%timeit
    df['datetime_utc'] = [d['datetime'].tz_localize(d['tz']).tz_convert('UTC') for i, d in df.iterrows()]
    df['datetime_utc'] = df['datetime_utc'].dt.tz_localize(None)

1 loops, best of 3: 50.3 s per loop

    df['datetime_utc'] = pd.concat([d['datetime'].dt.tz_localize(tz).dt.tz_convert('UTC') for tz, d in df.groupby('tz')])

**1 loops, best of 3: 249 ms per loop**
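For anyone who wants to try the groupby variant on a small frame, here is a minimal self-contained sketch. The toy data and the column names `datetime`, `tz` and `datetime_utc` are assumptions modeled on the question, not the 394,000-row dataset that was timed above.

```python
import pandas as pd

# Toy data modeled on the question; column names are assumptions.
df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2016-09-19 01:29:13",
        "2016-09-19 02:16:04",
        "2016-09-19 01:57:54",
        "2016-09-19 11:00:00",
    ]),
    "tz": ["America/Bogota", "America/New_York", "Africa/Cairo", "America/Bogota"],
})

# Localize each group with its own zone, then convert that group to UTC.
pieces = [
    group["datetime"].dt.tz_localize(tz).dt.tz_convert("UTC")
    for tz, group in df.groupby("tz")
]

# pd.concat returns the rows in group order; the column assignment
# realigns them on the original index. tz_localize(None) drops the
# UTC marker and leaves naive UTC timestamps.
df["datetime_utc"] = pd.concat(pieces).dt.tz_localize(None)
print(df)
```

Each group is converted in one vectorized call, so the Python-level loop runs once per unique time zone rather than once per row, which is presumably why this scales better than the `iterrows` version.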

Here is a vectorized approach (it will loop df.time_zone.nunique() times):

    In [2]: t
    Out[2]:
                 datetime         time_zone
    0 2016-09-19 01:29:13    America/Bogota
    1 2016-09-19 02:16:04  America/New_York
    2 2016-09-19 01:57:54      Africa/Cairo
    3 2016-09-19 11:00:00    America/Bogota
    4 2016-09-19 12:00:00  America/New_York
    5 2016-09-19 13:00:00      Africa/Cairo

    In [3]: for tz in t.time_zone.unique():
       ...:     mask = (t.time_zone == tz)
       ...:     t.loc[mask, 'datetime'] = \
       ...:         t.loc[mask, 'datetime'].dt.tz_localize(tz).dt.tz_convert('UTC')
       ...:

    In [4]: t
    Out[4]:
                 datetime         time_zone
    0 2016-09-19 06:29:13    America/Bogota
    1 2016-09-19 06:16:04  America/New_York
    2 2016-09-18 23:57:54      Africa/Cairo
    3 2016-09-19 16:00:00    America/Bogota
    4 2016-09-19 16:00:00  America/New_York
    5 2016-09-19 11:00:00      Africa/Cairo
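The question also mentions converting from UTC back to each viewer's local time zone later on. A rough sketch of that reverse step, using the same one-pass-per-zone idea, might look like the following. The column names `datetime_utc` and `datetime_local` are hypothetical, and the result is stored as naive local wall-clock time, since a single datetime column cannot carry a different zone per row.

```python
import pandas as pd

# Naive UTC timestamps plus each viewer's zone; names are illustrative assumptions.
t = pd.DataFrame({
    "datetime_utc": pd.to_datetime([
        "2016-09-19 06:29:13",
        "2016-09-19 06:16:04",
        "2016-09-18 23:57:54",
    ]),
    "time_zone": ["America/Bogota", "America/New_York", "Africa/Cairo"],
})

local_pieces = []
for tz, group in t.groupby("time_zone"):
    local = (
        group["datetime_utc"]
        .dt.tz_localize("UTC")   # mark the naive values as UTC
        .dt.tz_convert(tz)       # shift to the viewer's zone
        .dt.tz_localize(None)    # drop the zone, keep local wall-clock time
    )
    local_pieces.append(local)

# Assignment aligns on the index, so the original row order is preserved.
t["datetime_local"] = pd.concat(local_pieces)
print(t)
```

If the zone information needs to stay attached to each value, one alternative is an object column of tz-aware `Timestamp` objects, at the cost of losing the vectorized `.dt` accessor on that column.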
