How do I extract data from a string into a 2d array?
Using axios, I am fetching data from a website. Unfortunately, the response is in HTML format. The fetched data looks like this:
1 Agartala VEAT 120830Z 23004KT 5000 HZ SCT018 SCT025 34/27 Q1004 NOSIG= 2 Ahmedabad VAAH 120830Z 23008KT 6000 NSC 44/21 Q1001 NOSIG= 3 Allahabad VEAB 120800Z 03006KT 6000 FEW025 39/26 Q0999 NOSIG= 4 Amritsar VIAR 120830Z VRB02KT 2800 DU NSC 42/13 Q1000 BECMG 3000= 5 Bangalore VOBL 120830Z 28014KT 6000 -DZ BKN008 SCT012 OVC080 24/21 Q1009 NOSIG= 6 Baroda VABO 120830Z 20008KT 6000 NSC 41/19 Q1001 NOSIG= 7 Bhaunagar VABV 120830Z 14016KT 5000 DU NSC 39/22 Q1002 NOSIG= 8 Bhopal VABP 120830Z 31010KT 6000 SCT030 43/03 Q1002 NOSIG= 9 Bhubaneswar ...
I have removed the HTML tags using some for loops. The original HTML data is:
...
<tr>
  <td align="left" style="padding:3px; border-style:solid; border-width:1px; border-collapse:collapse; border-color:#3366aa;"><font style="font-family:verdana; font-size:11px; color:#000000;">1</font></td>
  <td align="left" style="padding:3px; border-style:solid; border-width:1px; border-collapse:collapse; border-color:#3366aa;"><font style="font-family:verdana; font-size:11px; color:#000000;">Agartala </font></td>
  <td align="left" style="padding:3px; border-style:solid; border-width:1px; border-collapse:collapse; border-color:#3366aa;"><font style="font-family:verdana; font-size:11px; color:#000000;">VEAT 120830Z 23004KT 5000 HZ SCT018 SCT025 34/27 Q1004 NOSIG=
  </font></td>
</tr>
...
I want to extract the above data and store it in a 2d array like this:
[['1', 'Agartala', 'VEAT 120830Z 23004KT 5000 HZ SCT018 SCT025 34/27 Q1004 NOSIG='], [...], ...]
I have tried extracting the data with a simple for loop, but it does not work. This is the function I tried:
let extractData = () => {
  let str1 = cleanHTML(), count = 0;
  let tempList = [], str2 = '';
  for (let i = 0; i < str1.length; i++) {
    if (count == 0) {
      if (str1[i] != ' ') {
        str2 += str1[i];
      }
      else {
        //console.log(str2);
        tempList.push(parseInt(str2));
        count = 1;
        str2 = '';
      }
    }
    else if (count == 1) {
      //console.log(str1[i]);
      if (str1[i] != ' ') {
        str2 += str1[i];
      }
      else {
        tempList.push(str2);
        count = 2;
        str2 = '';
      }
    }
    else if (count == 2 && i+2<str1.length) {
      if (str1[i] != ' ' && str1[i+1] != ' ' && str1[i+2] != ' ') {
        str2 += str1[i];
      }
      else {
        tempList.push(str2);
        //console.log(tempList);
        count = 0;
        str2 = '';
        dataList.push(tempList);
        tempList = [];
      }
    }
  }
  console.log(dataList);
}
I have also tried checking for \t, \r and \n instead of a space, but the result still comes out different from what I expect. What am I doing wrong?
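A note on the loop above: its third branch ends the field as soon as a space appears within the next two characters, but the report text itself contains single spaces, so the field is cut off after a couple of characters. As a minimal sketch of one alternative for the tag-stripped string, a regular expression can pick out each record. This assumes every record has the shape "number, city, four-letter station code, six digits followed by Z, ..., =" as in the sample data; `RECORD` and `parseReports` are hypothetical names, not part of the original post.

```javascript
// Assumption: each record looks like
// "<number> <city> <4-letter station code> <6 digits>Z ... =".
const RECORD = /(\d+)\s+(.+?)\s+([A-Z]{4}\s+\d{6}Z[^=]*=)/g;

// Hypothetical helper: turn the flattened text into [number, city, report] rows.
function parseReports(text) {
  return Array.from(text.matchAll(RECORD), (m) => [
    m[1],
    m[2].trim(),
    m[3].trim(),
  ]);
}

// Sample taken from the data shown in the question.
const sample =
  '1 Agartala VEAT 120830Z 23004KT 5000 HZ SCT018 SCT025 34/27 Q1004 NOSIG= ' +
  '2 Ahmedabad VAAH 120830Z 23008KT 6000 NSC 44/21 Q1001 NOSIG=';

console.log(parseReports(sample));
```

If the real data contains records that do not end in "=" or that use a different station-code format, the pattern would need adjusting.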
Solution 1:[1]
As Chris G suggested, you can achieve that simply by setting the HTML as the innerHTML of a div and then using querySelectorAll() to grab the <font> elements and read their .innerText.
Demo:
const axiosResponse = `<table><tr>
<td align="left" style="padding:3px; border-style:solid; border-width:1px; border-collapse:collapse; border-color:#3366aa;"><font style="font-family:verdana; font-size:11px; color:#000000;">1</font></td>
<td align="left" style="padding:3px; border-style:solid; border-width:1px; border-collapse:collapse; border-color:#3366aa;"><font style="font-family:verdana; font-size:11px; color:#000000;">Agartala </font></td>
<td align="left" style="padding:3px; border-style:solid; border-width:1px; border-collapse:collapse; border-color:#3366aa;"><font style="font-family:verdana; font-size:11px; color:#000000;">VEAT 120830Z 23004KT 5000 HZ SCT018 SCT025 34/27 Q1004 NOSIG=
</font></td>
</tr></table>`;
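// Let the browser parse the HTML by injecting it into a container element.
const container = document.getElementById('showContent');
container.innerHTML = axiosResponse;
// Collect the text of every <font> element (one per table cell).
const fontHTML = container.querySelectorAll('font');
const res = [];
for (let i = 0; i < fontHTML.length; i++) {
  res.push(fontHTML[i].innerText);
}
console.log(res);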
<div id="showContent">
</div>
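The demo above yields one flat list of cell texts, while the question asks for a 2d array. A minimal sketch of the remaining step, assuming every table row has exactly three cells as in the sample HTML; `toRows` is a hypothetical helper name and the sample values are copied from the question:

```javascript
// Hypothetical helper: group a flat list of cell texts into rows of `size`
// cells, trimming stray whitespace and newlines from each cell.
function toRows(cells, size = 3) {
  const rows = [];
  for (let i = 0; i < cells.length; i += size) {
    rows.push(cells.slice(i, i + size).map((text) => text.trim()));
  }
  return rows;
}

// Flat list in the shape the demo produces (one entry per <font> element).
const flat = [
  '1', 'Agartala ', 'VEAT 120830Z 23004KT 5000 HZ SCT018 SCT025 34/27 Q1004 NOSIG=\n',
  '2', 'Ahmedabad', 'VAAH 120830Z 23008KT 6000 NSC 44/21 Q1001 NOSIG=',
];

console.log(toRows(flat));
```

In the browser, the same grouping could probably be obtained more robustly by looping over `querySelectorAll('tr')` and collecting the cells of each row, which would not depend on a fixed cell count.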
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Rohìt Jíndal |