Read .csv file data and convert that data to JSON format row-wise
I am using a button to upload a .csv file. After uploading, I want the data in JSON format. As a first step, I tried to print the .csv file data to the console. I want to access the result key, highlighted in orange in the attached image, but I get the output highlighted in purple instead. I also tried to access readyState, but it does not show the expected value either. The code and image are attached below.
export default {
  data() {
    return {
      isSelecting: false,
      selectedFile: null,
      finalResult: [],
    }
  },
  methods: {
    handleFileImport() {
      this.isSelecting = true;
      // After obtaining the focus when closing the FilePicker, return the button state to normal
      window.addEventListener('focus', () => {
        this.isSelecting = false
      }, { once: true });
      // Trigger click on the FileInput
      this.$refs.uploader.click();
    },
    onFileChanged(e) {
      this.selectedFile = e.target.files[0];
      let reader = new FileReader();
      reader.onload = e => this.$emit("load", e.target.result);
      reader.readAsText(this.selectedFile);
      this.finalResult.push(reader)
      console.log("file reader-->", this.finalResult)
      console.log("ReadyState-->", this.finalResult[0].readyState)
      console.log("result--->", this.finalResult[0].result);
      const timeout = setTimeout(this.finalResult[0].readyState, 3000)
      console.log("After using timeout-->", timeout)
    }
  }
}
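For context, FileReader.readAsText is asynchronous: immediately after the call, result is still null and readyState is 1 (LOADING), so the file content only becomes available once the load event fires. Below is a minimal sketch of the row-wise conversion described above. It assumes a comma-separated file with a header row and no quoted fields, and csvToRows is a hypothetical helper name that is not part of the original code.

```javascript
// Hypothetical helper (not from the original post): turns CSV text into
// an array of row objects keyed by the header line.
// Assumes comma-separated values, a header row, and no quoted fields.
function csvToRows(text) {
  // Split into lines and drop blank ones (handles both \n and \r\n).
  const lines = text.split(/\r?\n/).filter(line => line.trim() !== "");
  if (lines.length === 0) return [];
  const headers = lines[0].split(",").map(h => h.trim());
  return lines.slice(1).map(line => {
    const cells = line.split(",");
    const row = {};
    headers.forEach((header, i) => {
      // Missing trailing cells become empty strings.
      row[header] = (cells[i] ?? "").trim();
    });
    return row;
  });
}

// In the component, the text is only available inside the load handler:
//   reader.onload = e => {
//     this.finalResult = csvToRows(e.target.result);
//     console.log(JSON.stringify(this.finalResult));
//   };
//   reader.readAsText(this.selectedFile);

const sample = "name,age\nAda,36\nLinus,54\n";
console.log(JSON.stringify(csvToRows(sample)));
// prints [{"name":"Ada","age":"36"},{"name":"Linus","age":"54"}]
```

All values stay strings in this sketch. Quoted fields or commas inside values are not handled and would need a proper CSV parser.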
Solution 1:[1]
There is no need for any loops or apply; what we want here is three different group counts and some formatting. Assuming the data looks like the sample data, there is no need for a split.
my_data = structure(list(Sector = c("AAA", "BBB", "AAA", "CCC", "AAA",
"BBB", "AAA", "CCC"), Sub_Sector = c("AAA1", "BBB1", "AAA1",
"CCC1", "AAA1", "BBB2", "AAA1", "CCC2"), count = c(1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L), type = c("Actual", "Actual", "Actual", "Actual",
"Actual", "Actual", "Actual", "Actual")), class = "data.frame", row.names = c(NA,
-8L))
library(data.table)
setDT(my_data)
expand_collapse_compliance <- function(x) {
  x <- rbindlist(list(
    x[, .(Sector1 = Sector, Actual = .N), by = Sector],
    setnames(x[, .(Actual = .N), by = .(Sector, Sub_Sector)], "Sector", "Sector1"),
    x[, .(Sector = "Total", Actual = .N)]
  ), fill = TRUE)
  setcolorder(x, c("Sector1", "Sector", "Sub_Sector", "Actual"))
  setorder(x, Sector1, Sector, na.last = TRUE)
  x
}
expand_collapse_compliance(my_data)
#    Sector1 Sector Sub_Sector Actual
# 1:     AAA    AAA       <NA>      4
# 2:     AAA   <NA>       AAA1      4
# 3:     BBB    BBB       <NA>      2
# 4:     BBB   <NA>       BBB1      1
# 5:     BBB   <NA>       BBB2      1
# 6:     CCC    CCC       <NA>      2
# 7:     CCC   <NA>       CCC1      1
# 8:     CCC   <NA>       CCC2      1
# 9:    <NA>  Total       <NA>      8
Sidenote
There is no need to convert NA to "", as DT in shiny will show blanks for it.
Speedtest
As I mentioned, a for loop is often faster on very small data sets, since library solutions use some functions that take some time to load once...
my_data_small = structure(list(Sector = c("AAA", "BBB", "AAA", "CCC", "AAA",
"BBB", "AAA", "CCC"), Sub_Sector = c("AAA1", "BBB1", "AAA1",
"CCC1", "AAA1", "BBB2", "AAA1", "CCC2"), count = c(1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L), type = c("Actual", "Actual", "Actual", "Actual",
"Actual", "Actual", "Actual", "Actual")), class = "data.frame", row.names = c(NA,
-8L))
library(data.table)
setDT(my_data_small)
   test replications elapsed relative
2  eccB          150    0.32     1.00
1 eccDT          150    0.72     2.25
# well, just make it a million times bigger :D
my_data_large <- rbindlist(rep(list(my_data_small), 1000000L))
   test replications elapsed relative
2  eccB           50   79.30    5.146
1 eccDT           50   15.41    1.000
Solution 2:[2]
The appropriate function of the *apply family could be tapply, using a split-apply-combine approach. Since we need tapply only when there are multiple Sub_Sector values, we handle that case separately for the sake of speed.
expand_collapse_complianceA <- \(data) {
  r <- do.call(rbind, c(by(data, data$Sector, \(x) {
    if (length(unique(x$Sub_Sector)) != 1L) {
      tt <- t(unname(with(x, tapply(count, list(Sector, Sub_Sector), sum))))
      tt <- cbind(x[!duplicated(x$Sub_Sector), 1:2], foo='', Actual=tt)
    } else {
      tt <- as.data.frame(t(c(unlist(x[!duplicated(x$Sub_Sector), 1:2]), foo='',
                              Actual=sum(x$count))))
    }
    rbind(c(tt[1, 1], '', tt[1, 1], sum(as.numeric(tt[, 4]))), tt)[c(1, 3, 2, 4)]
  }), make.row.names=FALSE))
  rbind(r, c('', 'Total', '', sum(as.numeric(r$Actual[!r$foo %in% ''])))) |>
    setNames(c('Sector1', 'Sector', 'Sub_Sector', 'Actual'))
}
Note: R version 4.1.2 (2021-11-01).
Gives
expand_collapse_compliance(my_data)
#   Sector1 Sector Sub_Sector Actual
# 1     AAA    AAA                 4
# 2     AAA              AAA1      4
# 3     BBB    BBB                 2
# 4     BBB              BBB1      1
# 5     BBB              BBB2      1
# 6     CCC    CCC                 2
# 7     CCC              CCC1      1
# 8     CCC              CCC2      1
# 9          Total                 8
expand_collapse_complianceA(my_data) |>
  (\(x) DT::datatable(
    x, rownames=FALSE, escape=FALSE, selection=list(mode="single", target="row"),
    options=list(pageLength=50, scrollX=TRUE, dom='tp', ordering=FALSE,
                 columnDefs=list(list(visible=FALSE, targets=0),
                                 list(className='dt-left', targets='_all'))),
    class='hover cell-border stripe'))()
expand_collapse_complianceA now needs just 1/10 of the time of the original for loop. Here is a benchmark (tested on 1080 rows).
# Unit: milliseconds
#        expr        min         lq       mean     median         uq       max neval cld
#     ecc_for 304.723781 305.426934 346.878188 308.208294 335.944407 598.94351    10   c
#  ecc_tapply  29.768177  29.851975  31.083977  30.611982  32.058980  34.50901    10 a
#    ecc_tidy 135.326594 135.952068 143.967550 138.475437 149.352409 164.94652    10  b
#      ecc_DT   3.267969   3.611711   4.610916   3.664493   3.707528  13.48797    10 a
Of course data.table is faster. However, I'd like to see the performance when the data is about to exhaust the RAM.
Benchmark Code:
microbenchmark::microbenchmark(
  ecc_for=expand_collapse_compliance(dat),
  ecc_tapply=expand_collapse_complianceA(dat),
  ecc_tidy={library(dplyr);library(tidyr);expand_collapse_compliance1(dat)},
  ecc_DT={library(data.table);expand_collapse_complianceDT(as.data.table(dat))},
  times=10L)
Note that the "tidy" version has some flaws so far (at least with the new data).
res_for <- expand_collapse_compliance(dat)
res_tapply <- expand_collapse_complianceA(dat)
res_tidy <- {library(dplyr);library(tidyr);expand_collapse_compliance1(dat)}
all.equal(res_for, res_tapply, check.attributes=FALSE)
# [1] TRUE
all.equal(res_for, res_tidy, check.attributes=FALSE)
# [1] "Component “Sub_Sector”: 1053 string mismatches"
# [2] "Component “Actual”: target is character, current is numeric"
Data
dat <- expand.grid(Sector=c("AA", "AB", "AC", "AD", "AE", "AF", "AG", "AH", "AI", "AJ",
                            "AK", "AL", "AM", "AN", "AO", "AP", "AQ", "AR", "AS", "AT", "AU",
                            "AV", "AW", "AX", "AY", "AZ", "BA"),
                   Sub_Sector=1:40, stringsAsFactors=F)
dat <- transform(dat, Sub_Sector=Reduce(paste0, dat[1:2]), count=1, type='Actual')
dat <- dat[order(dat$Sector), ]
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | |