Read .csv file data and convert that data to JSON format row-wise

I am using a button to upload a .csv file. After uploading, I want the data in JSON format. As a first step, I tried to print the .csv file data to the console. I want to access the result key, which is marked in the attached image with the orange highlighted line, but what I get is the output marked with the purple highlighted line. I also tried to access readyState, but it does not show the expected value either. The code and the image are attached below.

export default {
data(){
  return{
    isSelecting: false,
    selectedFile: null,
    finalResult : [],
     }
},
methods:{
    handleFileImport() {
      this.isSelecting = true;

      // After obtaining the focus when closing the FilePicker, return the button state to normal
      window.addEventListener('focus', () => {
        this.isSelecting = false
      }, { once: true });

      // Trigger click on the FileInput
      this.$refs.uploader.click();
    },
    onFileChanged(e) {
      this.selectedFile = e.target.files[0];
      let reader = new FileReader();
      reader.onload = e => this.$emit("load", e.target.result);
      reader.readAsText(this.selectedFile);
      this.finalResult.push(reader)
      console.log("file reader-->",this.finalResult)
      console.log("ReadyState-->",this.finalResult[0].readyState)
      console.log("result--->",this.finalResult[0].result);
      const timeout = setTimeout(this.finalResult[0].readyState,3000)
      console.log("After using timeout-->",timeout)
     }
  }
}

Image attached
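
A note on the behaviour described above: `readAsText` is asynchronous, so `reader.result` is still empty and `readyState` is still "loading" on the lines that run immediately after the call. The value only becomes available inside the `onload` handler. The following is a minimal sketch of one way to get row-wise JSON, not a definitive implementation. The helper names `csvTextToRows` and `readCsvFile` are hypothetical (they are not part of the original post), and the parser assumes a simple comma-separated file with a header line and no quoted fields.

```javascript
// Hypothetical helper: turns CSV text into an array of row objects
// keyed by the header line.
// Assumption: plain comma-separated values, no quoted fields,
// no commas or line breaks inside a cell.
function csvTextToRows(text) {
  // Split on Unix or Windows line endings and drop empty lines.
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== "");
  if (lines.length === 0) return [];

  const headers = lines[0].split(",").map((h) => h.trim());

  return lines.slice(1).map((line) => {
    const cells = line.split(",");
    const row = {};
    headers.forEach((header, i) => {
      // Missing trailing cells become empty strings.
      row[header] = cells[i] !== undefined ? cells[i].trim() : "";
    });
    return row;
  });
}

// Hypothetical wrapper: resolves with the parsed rows once the
// FileReader has finished loading. Browser-only, because FileReader
// is a browser API.
function readCsvFile(file) {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => resolve(csvTextToRows(reader.result));
    reader.onerror = () => reject(reader.error);
    reader.readAsText(file);
  });
}

// Possible use inside the component's onFileChanged method:
//   readCsvFile(e.target.files[0]).then((rows) => {
//     this.finalResult = rows;
//     console.log(JSON.stringify(rows));
//   });
```

With this shape, the text "name,age\nAda,36" is converted to `[{"name":"Ada","age":"36"}]`. All values stay strings; converting numeric columns would be a separate step. For files with quoted fields, a dedicated CSV parser is a safer choice than splitting on commas.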



Solution 1:[1]

There is no need for any loops or apply; what we want here is three different group counts and some formatting. This assumes, as the sample data suggests, that no split is needed.

my_data = structure(list(Sector = c("AAA", "BBB", "AAA", "CCC", "AAA",
    "BBB", "AAA", "CCC"), Sub_Sector = c("AAA1", "BBB1", "AAA1",
    "CCC1", "AAA1", "BBB2", "AAA1", "CCC2"), count = c(1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L), type = c("Actual", "Actual", "Actual", "Actual",
    "Actual", "Actual", "Actual", "Actual")), class = "data.frame", row.names = c(NA,
    -8L))

library(data.table)
setDT(my_data)

expand_collapse_compliance <- function(x) {
  x <- rbindlist(list(
    x[, .(Sector1 = Sector, Actual = .N), by = Sector], 
    setnames(x[, .(Actual = .N), by = .(Sector, Sub_Sector)], "Sector", "Sector1"),
    x[, .(Sector = "Total", Actual = .N)]
  ), fill = TRUE)
  setcolorder(x, c("Sector1", "Sector", "Sub_Sector", "Actual"))
  setorder(x, Sector1, Sector, na.last = TRUE)
  x
}

expand_collapse_compliance(my_data)

#    Sector1 Sector Sub_Sector Actual
# 1:     AAA    AAA       <NA>      4
# 2:     AAA   <NA>       AAA1      4
# 3:     BBB    BBB       <NA>      2
# 4:     BBB   <NA>       BBB1      1
# 5:     BBB   <NA>       BBB2      1
# 6:     CCC    CCC       <NA>      2
# 7:     CCC   <NA>       CCC1      1
# 8:     CCC   <NA>       CCC2      1
# 9:    <NA>  Total       <NA>      8

Side note: there is no need to convert NA to "", because DT in Shiny displays NA as blank cells.

Speedtest

As I mentioned, a for loop is often faster on very small data sets, because library solutions call functions that take some time to load once.

my_data_small = structure(list(Sector = c("AAA", "BBB", "AAA", "CCC", "AAA",
    "BBB", "AAA", "CCC"), Sub_Sector = c("AAA1", "BBB1", "AAA1",
    "CCC1", "AAA1", "BBB2", "AAA1", "CCC2"), count = c(1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L), type = c("Actual", "Actual", "Actual", "Actual",
    "Actual", "Actual", "Actual", "Actual")), class = "data.frame", row.names = c(NA,
    -8L))

library(data.table)
setDT(my_data_small)

   test replications elapsed relative
2  eccB          150    0.32     1.00
1 eccDT          150    0.72     2.25

# well, just make it a million times bigger :D
my_data_large <- rbindlist(rep(list(my_data_small), 1000000L))

   test replications elapsed relative
2  eccB           50   79.30    5.146
1 eccDT           50   15.41    1.000

Solution 2:[2]

The appropriate function of the *apply family could be tapply, using a split-apply-combine approach. Since we need tapply only when a Sector has multiple Sub_Sectors, we handle the single-Sub_Sector case separately for the sake of speed.

expand_collapse_complianceA <- \(data) {
  r <- do.call(rbind, c(by(data, data$Sector, \(x) {
    if (length(unique(x$Sub_Sector)) != 1L) {
      tt <- t(unname(with(x, tapply(count, list(Sector, Sub_Sector), sum))))
      tt <- cbind(x[!duplicated(x$Sub_Sector), 1:2], foo='', Actual=tt)
    } else {
      tt <- as.data.frame(t(c(unlist(x[!duplicated(x$Sub_Sector), 1:2]), foo='',
                              Actual=sum(x$count))))
    }
    rbind(c(tt[1, 1], '', tt[1, 1], sum(as.numeric(tt[, 4]))), tt)[c(1, 3, 2, 4)]
  }), make.row.names=FALSE))
  rbind(r, c('', 'Total', '', sum(as.numeric(r$Actual[!r$foo %in% ''])))) |>
    setNames(c('Sector1', 'Sector', 'Sub_Sector', 'Actual'))
}

Note: R version 4.1.2 (2021-11-01).

Gives

expand_collapse_compliance(my_data)
#   Sector1 Sector Sub_Sector Actual
# 1     AAA    AAA                 4
# 2     AAA              AAA1      4
# 3     BBB    BBB                 2
# 4     BBB              BBB1      1
# 5     BBB              BBB2      1
# 6     CCC    CCC                 2
# 7     CCC              CCC1      1
# 8     CCC              CCC2      1
# 9          Total                 8


expand_collapse_complianceA(my_data) |> 
  (\(x) DT::datatable(
    x, rownames=F, escape=FALSE, selection=list(mode="single", target="row"), 
    options=list(pageLength=50, scrollX=TRUE, dom='tp', ordering=F, 
                 columnDefs=list(list(visible=FALSE, targets=0),
                                 list(className='dt-left', targets='_all'))), 
    class='hover cell-border stripe'))()


expand_collapse_complianceA now needs just about 1/10 of the time of the original for loop. Here is a benchmark (tested on 1080 rows).

# Unit: milliseconds
#       expr        min         lq       mean     median         uq       max neval cld
#    ecc_for 304.723781 305.426934 346.878188 308.208294 335.944407 598.94351    10   c
# ecc_tapply  29.768177  29.851975  31.083977  30.611982  32.058980  34.50901    10 a  
#   ecc_tidy 135.326594 135.952068 143.967550 138.475437 149.352409 164.94652    10  b 
#     ecc_DT   3.267969   3.611711   4.610916   3.664493   3.707528  13.48797    10 a  

Of course data.table is faster. However, I'd like to see the performance when the data is about to exhaust the RAM.

Benchmark Code:

microbenchmark::microbenchmark(
  ecc_for=expand_collapse_compliance(dat),
  ecc_tapply=expand_collapse_complianceA(dat),
  ecc_tidy={library(dplyr);library(tidyr);expand_collapse_compliance1(dat)},
  ecc_DT={library(data.table);expand_collapse_complianceDT(as.data.table(dat))},
  times=10L)

Note that the "tidy" version has some flaws so far (at least with the new data).

res_for <- expand_collapse_compliance(dat)
res_tapply <- expand_collapse_complianceA(dat)
res_tidy <- {library(dplyr);library(tidyr);expand_collapse_compliance1(dat)}


all.equal(res_for, res_tapply, check.attributes=FALSE)
# [1] TRUE
all.equal(res_for, res_tidy, check.attributes=FALSE)
# [1] "Component “Sub_Sector”: 1053 string mismatches"             
# [2] "Component “Actual”: target is character, current is numeric"

Data

dat <- expand.grid(Sector=c("AA", "AB", "AC", "AD", "AE", "AF", "AG", "AH", "AI", "AJ", 
                     "AK", "AL", "AM", "AN", "AO", "AP", "AQ", "AR", "AS", "AT", "AU", 
                     "AV", "AW", "AX", "AY", "AZ", "BA"),
            Sub_Sector=1:40, stringsAsFactors=F)
dat <- transform(dat, Sub_Sector=Reduce(paste0, dat[1:2]), count=1, type='Actual')
dat <- dat[order(dat$Sector), ]

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2