Network Flow Dataframe - Merging Memory Error - Unable to allocate array with shape and data type
I have 3 big CSV files, and they all share the same 76 columns. The row counts differ: 17,809 rows, 124,262 rows, and 108,779 rows. I am trying to merge these 3 data frames, but I am getting a memory error. Can I solve this issue, or is it impossible on my hardware? 16 GB RAM, i5 11th gen.
I found this code to merge them, but it raises an error. I want all of them in one dataframe.
import pandas as pd
from functools import reduce

# a, b, c are the three dataframes loaded from the CSV files
data_frames = [a, b, c]
df_merged = reduce(lambda left, right: pd.merge(left, right, on=['Intrusion'], how='outer'), data_frames)
df_merged
MemoryError: Unable to allocate 101. GiB for an array with shape (13517346950,) and data type int64
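The shape in the error message explains what went wrong: an outer merge on the non-unique 'Intrusion' column pairs every matching row on the left with every matching row on the right, so the result approaches a cross product of roughly 13.5 billion rows (about 101 GiB for a single int64 column). Since the three files share the same 76 columns and the goal is simply to stack their rows, a row-wise concatenation avoids the blow-up. A minimal sketch, assuming the three files are read with pd.read_csv (file names are placeholders):

import pandas as pd

# Placeholder paths; substitute the three actual CSV files.
files = ['file1.csv', 'file2.csv', 'file3.csv']

# Stack the rows of the three frames; they share the same 76 columns,
# so the result has 17,809 + 124,262 + 108,779 = 250,850 rows.
frames = [pd.read_csv(f) for f in files]
df_all = pd.concat(frames, ignore_index=True)
print(df_all.shape)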
Solution 1
The answer was in Linux, and I loved it:

awk 'FNR > 1' file1.csv file2.csv > output.csv

That is all. https://predictivehacks.com/?all-tips=how-to-concatenate-multiple-csv-files-in-linux
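Note that FNR > 1 skips the first line of every input file, so the command as written drops the CSV header entirely; a common variant also prints when NR == 1 to keep the header from the first file. For environments without awk, the same streaming concatenation can be sketched in plain Python, which never holds more than one line in memory (file names are placeholders):

# Placeholder paths; substitute the three actual CSV files.
files = ['file1.csv', 'file2.csv', 'file3.csv']

with open('output.csv', 'w', encoding='utf-8') as out:
    for i, path in enumerate(files):
        with open(path, encoding='utf-8') as f:
            header = f.readline()
            if i == 0:
                out.write(header)   # keep the header from the first file only
            for line in f:          # copy the remaining rows as-is
                out.write(line)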
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow