cSpade mining frequent sequences
I'm new to cSpade. I'm trying to mine frequent sequences, but the summary I get suggests that my code is not correct. Here is the code I used:
library(arulesSequences)  # provides read_baskets() and cspade()
transactions = read.csv("SequenceOnly.csv")
names(transactions) = c("sequenceID", "eventID", "SIZE", "items")
transactions <- data.frame(lapply(transactions, as.factor))
transactions <- transactions[order(transactions$sequenceID, transactions$eventID),]
write.table(transactions, "mytxtout.txt", sep=";", row.names = TRUE, col.names = TRUE, quote = TRUE)
trans_matrix <- read_baskets("mytxtout.txt", sep = ";", info = c("sequenceID","eventID","SIZE", "items"))
s1 <- cspade(trans_matrix, parameter = list(support = 0.3), control = list(verbose = TRUE))
s1.df <- as(s1, "data.frame")
summary(s1)
Then the summary looks like this:
set of 0 sequences with
most frequent items: integer(0)
most frequent elements: integer(0)
element (sequence) size distribution: < table of extent 0 >
sequence length distribution: < table of extent 0 >
summary of quality measures: < table of extent 0 >
What went wrong? Thanks for any advice!
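One likely cause, offered as a sketch rather than a confirmed diagnosis: write.table() is called with row.names = TRUE, col.names = TRUE and quote = TRUE, so the file gains a header line, a leading row-name field and quoted values. read_baskets() treats the first length(info) fields of each line as transaction information and everything after them as items, so the extra row-name field shifts the columns, and naming "items" in info claims one more leading field than intended. The version below writes a bare file and lists only the three ID columns in info. The toy data frame is hypothetical and only stands in for SequenceOnly.csv, whose contents are not shown in the question.

```r
library(arulesSequences)

# Hypothetical toy data standing in for SequenceOnly.csv (an assumption,
# not the asker's real file): three sequences, one item per event.
transactions <- data.frame(
  sequenceID = c(1, 1, 2, 2, 3),
  eventID    = c(1, 2, 1, 2, 1),
  SIZE       = c(1, 1, 1, 1, 1),
  items      = c("A", "B", "A", "B", "A")
)

# cspade() expects rows ordered by sequenceID, then eventID.
transactions <- transactions[order(transactions$sequenceID,
                                   transactions$eventID), ]

# Write a bare file: no row names, no header line, no quotes.
out_file <- tempfile(fileext = ".txt")
write.table(transactions, out_file, sep = ";",
            row.names = FALSE, col.names = FALSE, quote = FALSE)

# Name only the leading ID columns in 'info'; the remaining
# fields on each line are read as the items of that event.
trans_matrix <- read_baskets(out_file, sep = ";",
                             info = c("sequenceID", "eventID", "SIZE"))

s1 <- cspade(trans_matrix, parameter = list(support = 0.3),
             control = list(verbose = TRUE))
summary(s1)
s1.df <- as(s1, "data.frame")
```

With the real data, the toy data frame would be replaced by the result of read.csv("SequenceOnly.csv"). If an event holds several items, they would presumably also need to be separated by ";" in the written file, since read_baskets() splits each line on sep. If the result is still empty after these changes, lowering support is worth trying, because 0.3 requires a pattern to occur in at least 30% of all sequences.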
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow