H2O Python - How to convert an H2OFrame back to a DataFrame with correct characters and datetimes

I have a CSV file and want to use H2O for deep learning. The data contains Chinese text and datetime values, and when I finish training and save the output to CSV, I can't get the original values back.

Here is a small example that shows the problem.

 In[1]: df = pd.DataFrame({'datetime':['2016-12-17 00:00:00'],'time':['00:00:30'],'month':['月'], 'weekend':['周六']})
        print(df.dtypes)
        df
out[1]: datetime    object
        time        object
        month       object
        weekend     object
        dtype: object
                      datetime      time month weekend
        0  2016-12-17 00:00:00  00:00:30     月      周六

In[2]: h2o_frame = h2o.H2OFrame(df);h2o_frame ;h2o_frame.types ;h2o_frame

C:\Users\thi\Anaconda3\lib\site-packages\h2o\utils\shared_utils.py:170: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead. data = _handle_python_lists(python_obj.as_matrix().tolist(), -1)[1]

out[2]: Parse progress: |█████████████████████████████████████████████████████████| 100%
                  datetime                time       month        weekend
        2016-12-17 00:00:00  1970-01-01 00:00:30    <0xA4EB>    <0xA9>P<0xA4BB>

For the time column I only want 00:00:30. Is there any way to fix this?

I haven't found any way to make month and weekend display the Chinese characters, but I was still able to finish my deep learning.

But when I convert the H2OFrame back to a DataFrame and save it to a CSV file, it saves <0xA4EB> instead of 月, and the datetime changes to an int.

 In[3]: dff = h2o_frame.as_data_frame();dff
out[3]:         datetime     time     month        weekend
        0   1481932800000   30000   <0xA4EB>    <0xA9>P<0xA4BB>
  • How can I correctly convert character columns from an H2OFrame back to a DataFrame?
  • How can I correctly convert datetime columns from an H2OFrame back to a DataFrame?


Solution 1:[1]

The simplest way to solve this is to pass the column_types argument when you convert the pandas DataFrame to an H2OFrame, as below:

In [69]: col_types
Out[69]: ['categorical', 'categorical', 'categorical', 'categorical']

In [70]: h2o_frame = h2o.H2OFrame(df,column_types=col_types);h2o_frame ;h2o_frame.types ;h2o_frame
Parse progress: |█████████████████████████████████████████████████████████████████████████████| 100%
Out[70]: 
datetime             month    time      weekend
-------------------  -------  --------  ---------
2016-12-17 00:00:00  月       00:00:30  周六

[1 row x 4 columns]


In [71]: dff = h2o_frame.as_data_frame();dff
Out[71]: 
              datetime month      time weekend
0  2016-12-17 00:00:00     月  00:00:30      周六
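
The col_types list above is shown without the line that builds it. Below is a minimal sketch of one way to derive it from the pandas frame, assuming every non-numeric column should be read as categorical. The helper name text_column_types is hypothetical and not part of H2O; it only uses pandas.

```python
import pandas as pd


def text_column_types(frame, h2o_type="categorical"):
    """Map each non-numeric column name to one H2O column type (hypothetical helper)."""
    text_cols = frame.select_dtypes(exclude="number").columns
    return {col: h2o_type for col in text_cols}


# Same sample data as in the question.
df = pd.DataFrame({
    "datetime": ["2016-12-17 00:00:00"],
    "time": ["00:00:30"],
    "month": ["月"],
    "weekend": ["周六"],
})

col_types = text_column_types(df)
print(col_types)
```

As far as I know, column_types also accepts a dict keyed by column name, so this could probably be passed as h2o.H2OFrame(df, column_types=col_types). A plain list such as ['categorical'] * len(df.columns) would match Out[69] exactly.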

Solution 2:[2]

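import h2o
import pandas as pd
allfiles = h2o.import_file(path='data/', pattern=".csv")
df = allfiles.as_data_frame()
# as_data_frame() returns time columns as epoch milliseconds, so convert them back
df['datetime'] = pd.to_datetime(df["datetime"], unit='ms')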
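
The same idea can be tried without a running H2O cluster. This is a rough sketch, assuming the frame returned by as_data_frame() looks like Out[3] in the question (epoch milliseconds in datetime and time) and that the text columns already came back intact. The frame dff is built by hand here as a stand-in, and the output file name is made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Hand-built stand-in for h2o_frame.as_data_frame() (assumption, see Out[3]).
dff = pd.DataFrame({
    "datetime": [1481932800000],  # epoch milliseconds
    "time": [30000],              # milliseconds since midnight
    "month": ["月"],
    "weekend": ["周六"],
})

restored = dff.copy()
# Epoch milliseconds back to a pandas Timestamp.
restored["datetime"] = pd.to_datetime(restored["datetime"], unit="ms")
# Milliseconds since midnight back to an HH:MM:SS string (valid below 24 hours).
restored["time"] = pd.to_datetime(restored["time"], unit="ms").dt.strftime("%H:%M:%S")

# Write with an explicit UTF-8 encoding so the Chinese text survives the round trip.
out_path = os.path.join(tempfile.mkdtemp(), "output.csv")
restored.to_csv(out_path, index=False, encoding="utf-8-sig")
reloaded = pd.read_csv(out_path, encoding="utf-8-sig")
print(reloaded)
```

The utf-8-sig encoding is a choice made here, not something from the answer: it writes a byte order mark, which tends to help spreadsheet programs on Windows detect UTF-8. Plain utf-8 should work just as well when the file is only read back with pandas.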

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Henry Ecker
Solution 2 user1098761