'Newbie - need script to remove duplicates based on multiple criteria
So I'm super new to Google Sheets - as I mostly use Excel and Smartsheet. I was trying to accomplish this by just recording a macro, so this is VERY messy.
I have a doc that we're uploading data into on a regular basis, and need to be able to remove duplicated items.
The items that have already been sorted will have a category applied, so my thought was that we could remove items duplicated in column 'H' (opp id) and then with a blank cell in Column 'B' (Category)
The way I was doing it at first was to record the macro with me manually doing the conditional formatting to highlight duplicates on column H, and then pulled a script from here to delete rows that were blank in column B. However, the script removes ALL items that are blank in column B and not just the items I had filtered based on the conditional formatting.
I know this is because the script is pulling the whole sheet, but I just can't figure out how to fix it.
function DeleteDuplicates() {
var spreadsheet = SpreadsheetApp.getActive();
spreadsheet.getRange('J9').activate();
spreadsheet.setActiveSheet(spreadsheet.getSheetByName('Report'), true);
spreadsheet.getRange('H:H').activate();
var conditionalFormatRules = spreadsheet.getActiveSheet().getConditionalFormatRules();
conditionalFormatRules.splice(0, 1, SpreadsheetApp.newConditionalFormatRule()
.setRanges([spreadsheet.getRange('H1:H981')])
.whenFormulaSatisfied('=countif(H:H,H1)>1')
.setBackground('#FF00FF')
.setFontColor('#000000')
.build());
spreadsheet.getActiveSheet().setConditionalFormatRules(conditionalFormatRules);
spreadsheet.getRange('J15').activate();
var spreadsheet = SpreadsheetApp.getActive();
spreadsheet.getRange('D15').activate();
spreadsheet.setActiveSheet(spreadsheet.getSheetByName('Report'), true);
spreadsheet.getRange('A2').activate();
spreadsheet.getActiveSheet().getFilter().remove();
spreadsheet.getRange('A2:AC112').activate();
spreadsheet.getRange('A2:AC112').createFilter();
spreadsheet.getRange('H2').activate();
var criteria = SpreadsheetApp.newFilterCriteria()
.build();
spreadsheet.getActiveSheet().getFilter().setColumnFilterCriteria(8, criteria);
var sheet = SpreadsheetApp.getActiveSheet();
var rows = sheet.getDataRange();
var numRows = rows.getNumRows();
var values = rows.getValues();
var rowsDeleted = 0;
for (var i = 0; i <= numRows - 1; i++) {
var row = values[i];
if (row[1] == '') {
sheet.deleteRow((parseInt(i)+1) - rowsDeleted);
rowsDeleted++;}
spreadsheet.getActiveSheet().getFilter().removeColumnFilterCriteria(8);
spreadsheet.getRange('J13').activate();
}
}
};
Solution 1:[1]
so my thought was that we could remove items duplicated in column 'H' (opp id) and then with a blank cell in Column 'B' (Category)
Essentially, you are looking to remove all rows with duplicate Column H values AND with blank cells in Column B?
If that's the case,
Try:
function myFunction() {
const sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName(`Report`)
const data = sheet.getDataRange().getValues()
const uniqueItems = [... new Set(data.map(i => i[7]))]
const updatedData = []
for (let item of uniqueItems) {
const itemData = data.filter(i => i[7] === item)
if (itemData.length > 1) {
itemData.filter(i => i[1] !== ``)
.forEach(i => updatedData.push(i))
} else { updatedData.push(...itemData) }
}
sheet.getRange(1, 1, updatedData.length, updatedData[0].length).setValues(updatedData)
}
Commented:
function myFunction() {
const sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName(`Report`)
const data = sheet.getDataRange().getValues()
// Get all unique items in the sheet.
const uniqueItems = [... new Set(data.map(i => i[7]))]
const updatedData = []
// For each unique item..
for (let item of uniqueItems) {
// Get all rows containing the item.
const itemData = data.filter(i => i[7] === item)
// If there's more than one row..
if (itemData.length > 1) {
// Remove all rows with blank cells in Column B...
itemData.filter(i => i[1] !== ``)
// ... and add to our updatedData.
.forEach(i => updatedData.push(i))
// If there's only one row for this item, add to updatedData
} else { updatedData.push(...itemData) }
}
sheet.getRange(1, 1, updatedData.length, updatedData[0].length).setValues(updatedData)
}
Solution 2:[2]
Description
Based on your original script it seems your programming experience is limited. Although NEWAZA's methods may work I propose a simpilier script that may be easier for you to follow. Much of original script is from moving the cursor around while recording a macro and is redundant. Instead I propose a simple macro that should be easy for you to follow.
I'm assuming new data is added and so I first perform a sort of column H and B. Then I loop through backwards to find duplicates and blank cells and delete rows that match the criteria. I loop backwards so I don't have to keep track of row number as rows are deleted. I'm also assuming rows to be deleted are not contiguous which may result in a slight performance hit.
Script
function deleteDuplicates() {
try {
var spreadsheet = SpreadsheetApp.getActive();
let sheet = spreadsheet.getSheetByName('Report');
let range = sheet.getDataRange();
range.sort([{column: 8, ascending: true},{column: 2, ascending: false}]);
let values = range.getValues();
// assuming the conditional format rule has already been applied to this sheet
for( let i=values.length; i>2; i-- ) {
let j = i-1;
let k = j-1;
if( ( values[j][7] === values[k][7] ) && ( values[j][1] === "" ) ) {
sheet.deleteRow(i);
}
}
}
catch(err) {
console.log(err);
}
}
Reference
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | NEWAZA |
Solution 2 | TheWizEd |