'Find and replace duplicate rows with the newest data in a defined range using Google Apps Script
Google sheet using columns A:M. Each day a new set of raw data (around 30 rows) is pasted on the next available row in column B manually. This raw data contains duplicate rows that have some updated column contents since the previous day. Column C contains the Job Number which is used to determine duplicates, and column A contains a sequential ID from 1 to n.
I need to create a function that finds duplicate rows based on column C, then uses the row with the largest Column A value to overwrite the data in the lowest Column A value so that the order of jobs numbers added to the sheet never changes once a job is in there.Image of sheet currently In the image; job 22,484 on row 4 would have due date updated to 10/5/22 and job data on row 5 would be removed. For each day's raw data there will be 20-25 duplicate entries.
The maximum rows used will be 5000 so the script can be inefficient if necessary.
Thanks! Connor
Here's a link to dummy sheet! (Thanks Mike)
Solution 1:[1]
Solution:
function myFunction() {
const sheet = SpreadsheetApp.getActiveSpreadsheet()
.getSheetByName(`YOUR_SHEET`)
let data = sheet.getRange(`B2:M`)
.getDisplayValues()
.filter(row => row.every(cell => cell.length))
const duplicates = [...new Set(data.flatMap(i => i[1]))].map(i => data.filter(item => item[1] === i))
.filter(i => i.length > 1)
.map(i => i[i.length-1])
duplicates.forEach(i => {
const removeRow = data.splice(data.findIndex(item => item === i), 1)
data[data.findIndex(item => item[1] === i[1])] = removeRow.flat()
})
sheet.getRange(`B2:M`).clear()
sheet.getRange(2, 2, data.length, data[0].length).setValues(data)
}
This will get all non-empty rows of table data, find the duplicates, and replace the previous(old) entries with the new data.
Let me know if this works for you!
Commented:
function myFunction() {
const sheet = SpreadsheetApp.getActiveSpreadsheet()
.getSheetByName(`YOUR_SHEET`)
// Get the target range..
let data = sheet.getRange(`B2:M`)
// As text.. (Avoids date issues)
.getDisplayValues()
// ...and ignore blank rows.
.filter(row => row.every(cell => cell.length))
// Get all unique `Job No`'s and replace their value in this array with the relevant rows from 'data'...
const duplicates = [...new Set(data.flatMap(i => i[1]))].map(i => data.filter(item => item[1] === i))
// Keep all `Job No`'s rows with more than 1 entry..
.filter(i => i.length > 1)
// ...And keep only the most recent.
.map(i => i[i.length-1])
// For each of these duplicates..
duplicates.forEach(i => {
// Remove the 'new' row..
const removeRow = data.splice(data.findIndex(item => item === i), 1)
// Replace the 'old' row..
data[data.findIndex(item => item[1] === i[1])] = removeRow.flat()
})
sheet.getRange(`B2:M`).clear()
sheet.getRange(2, 2, data.length, data[0].length).setValues(data)
}
Solution 2:[2]
Remove Dupes
function removeDupsOnColC() {
const ss = SpreadsheetApp.getActive();
const sh = ss.getSheetByName(`DEBUG`)
let data = sh.getRange(`B2:M` + sh.getLastRow()).getDisplayValues();
let ua = [];
let oA = [];
data.forEach((r,i) => {
if(!~ua.indexOf(r[2])) {
ua.push(r[2]);
oA.push(r);
}
})
sh.getRange(2, 2, oa.length, oa[0].length).setValues(oa)
}
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | Cooper |