How to parse this custom log file in Python 3
The log file is generated by a program written in C++.
Here is the demo log:
|Frame:0|NUMBER:0
|Frame:1|NUMBER:1|{INDEX:0|RECT:[11,24][31,43]}
|Frame:2|NUMBER:2|{INDEX:0|RECT:[11,24][31,43]}|{INDEX:1|RECT:[11,24][31,43]}
|Frame:3|NUMBER:0
I am trying to read these log files into a list, a dict, or a similar structure.
Here is the information that I hope to capture from the demo log above:
#frame, number, index, rect
[0, 0]
[1, 1, 0, 11,24,31,43]
[2, 2, 0, 11,24,31,43, 1, 11,24,31,43]
[3, 0]
Solution 1:[1]
Thanks to @Juan Facundo Peña.
This answer is based on his answer, with an improvement to how duplicate keys are handled.
    import re

    code_list = []
    with open("2.log", "r") as f:
        logs = f.readlines()

    for line in logs:
        if line.startswith("|Frame:"):
            parsed_line = line.split("|")
            code_dict = {}
            next_rect_idx_key = ""
            for parse in parsed_line:
                split_line = parse.strip("{}").split(":")
                key = split_line[0]
                if not key:
                    continue
                data_as_strings = re.findall(r"\d+", split_line[-1])
                data_as_integers = [int(s) for s in data_as_strings]
                if next_rect_idx_key:
                    # This field is the RECT that follows an INDEX field:
                    # store it under the key built from that INDEX.
                    code_dict[next_rect_idx_key] = data_as_integers
                    next_rect_idx_key = ""
                elif key == "INDEX":
                    # Build a unique key such as "INDEX[0]" so that repeated
                    # INDEX/RECT pairs do not overwrite each other.
                    next_rect_idx_key = key + str(data_as_integers)
                else:
                    code_dict[key] = data_as_integers
            print(code_dict)
            code_list.append(code_dict)
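The dictionaries built above are not yet in the flat `[frame, number, index, rect]` shape asked for in the question. As a rough sketch only, they could be flattened as shown below. The helper name `flatten` and the `sample` dictionary are made up for this illustration, and the sketch relies on dictionaries preserving insertion order (Python 3.7+).

```python
import re

def flatten(code_dict):
    """Hypothetical helper: turn one dictionary from the loop above into a flat row."""
    row = []
    for key, values in code_dict.items():
        # Keys such as "INDEX[0]" carry the index inside the key itself.
        match = re.fullmatch(r"INDEX\[(\d+)\]", key)
        if match:
            row.append(int(match.group(1)))
        row.extend(values)
    return row

# A dictionary in the shape the loop above builds for the third demo line.
sample = {
    "Frame": [2],
    "NUMBER": [2],
    "INDEX[0]": [11, 24, 31, 43],
    "INDEX[1]": [11, 24, 31, 43],
}
print(flatten(sample))  # [2, 2, 0, 11, 24, 31, 43, 1, 11, 24, 31, 43]
```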
Solution 2:[2]
This can be solved using the re library.
    import re

    code_list = []
    with open("log_file.log", "r") as f:
        logs = f.readlines()

    for line in logs:
        parsed_line = line.split("|")
        code_dict = {}
        for parse in parsed_line:
            # Drop the trailing newline and the braces around "{INDEX:..." fields
            split_line = parse.strip().strip("{}").split(":")
            key = split_line[0]
            if not key:
                continue
            # Every run of digits in the field, kept as strings
            value = re.findall(r"\d+", split_line[-1])
            code_dict[key] = value
        code_list.append(code_dict)
You will end up with a list of dictionaries (code_list), each of which maps the keys on one line to their values.
Line 3 of the demo log contains two INDEX/RECT pairs. Because both pairs use the same keys, the later pair overwrites the earlier one in that line's dictionary (Solution 1 addresses this). If needed, you can still split the whole log by "Frame" to work out which codes belong to which line.
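If the repeated INDEX/RECT pairs on one line need to stay separate, one possible alternative is to match each `{INDEX:...|RECT:...}` group with its own pattern and collect the groups in a list per frame. The following is only a sketch written against the demo format; `parse_frame`, `HEAD_RE`, and `RECT_RE` are names invented here, not part of either answer.

```python
import re

# Patterns for the demo format only (an assumption about the real logs).
HEAD_RE = re.compile(r"\|Frame:(\d+)\|NUMBER:(\d+)")
RECT_RE = re.compile(r"\{INDEX:(\d+)\|RECT:\[(\d+),(\d+)\]\[(\d+),(\d+)\]\}")

def parse_frame(line):
    """Hypothetical helper: parse one log line, or return None if it does not match."""
    head = HEAD_RE.match(line)
    if head is None:
        return None
    frame, number = map(int, head.groups())
    # findall returns one 5-tuple of strings per INDEX/RECT group on the line.
    rects = [
        {"index": int(i), "rect": [int(a), int(b), int(c), int(d)]}
        for i, a, b, c, d in RECT_RE.findall(line)
    ]
    return {"frame": frame, "number": number, "rects": rects}

line = "|Frame:2|NUMBER:2|{INDEX:0|RECT:[11,24][31,43]}|{INDEX:1|RECT:[11,24][31,43]}"
print(parse_frame(line))
```

Keeping the rectangles in a list avoids the key collision altogether, at the cost of a pattern that is tied to this exact field layout.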
If you only wish for the numbers, you can also try:
    import re

    code_list = []
    with open("log_file.log", "r") as f:
        logs = f.readlines()

    for line in logs:
        codes = re.findall(r"\d+", line)
        code_list.append(codes)
This approach will give you a list of lists, each of which contains the numbers (as strings) found on a single line.
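Since `re.findall` returns strings, the rows still need converting to integers to match the output requested in the question. A minimal sketch, assuming the log text is already in memory (`numbers_per_line` and `demo` are hypothetical names):

```python
import re

def numbers_per_line(text):
    """Hypothetical helper: one list of ints per non-empty line."""
    return [
        [int(n) for n in re.findall(r"\d+", line)]
        for line in text.splitlines()
        if line.strip()
    ]

demo = "|Frame:0|NUMBER:0\n|Frame:1|NUMBER:1|{INDEX:0|RECT:[11,24][31,43]}"
print(numbers_per_line(demo))  # [[0, 0], [1, 1, 0, 11, 24, 31, 43]]
```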
Edit: if you are looping through a single string rather than a file, try:
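    import re

    code_list = []
    # log_string is assumed to hold the whole log as a single string
    logs = log_string.split("\n")
    for line in logs:
        # <<<business as usual>>>, e.g. the numbers-only variant:
        code_list.append(re.findall(r"\d+", line))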
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | John |
Solution 2 |