Re: [Apps-Script] compare existing data to new data

16 views
Skip to first unread message

Yogesh Chhetri

unread,
Nov 18, 2024, 2:31:38 AMNov 18
to google-apps-sc...@googlegroups.com
Hello Michael

Here are two approaches to address this:

Solution 1: Ignore the last column during the duplicate check:

const dataRange = monthSheet.getDataRange();
const values = dataRange.getValues();

const uniqueRows = [];
const uniqueRowsSet = new Set();

values.forEach(row => {
  const newRow = [...row]; // Copy the row
  newRow.splice(lastCol - 1, 1); // Remove element at last column index (ignoring the cleaned names)
  const rowString = newRow.join();
  if (!uniqueRowsSet.has(rowString)) {
    uniqueRows.push(row);
    uniqueRowsSet.add(rowString);
  }
});

This approach creates a copy of each row (newRow) and removes the element at the index corresponding to the last column before creating the rowString. This ensures that the comparison happens based on the original data without the cleaned account names.

Solution 2: Move account name cleaning after duplicate check:  

const dataRange = monthSheet.getDataRange();
const values = dataRange.getValues();

const uniqueRows = [];
const uniqueRowsSet = new Set();

values.forEach(row => {
  const rowString = row.join();
  if (!uniqueRowsSet.has(rowString)) {
    uniqueRows.push(row);
    uniqueRowsSet.add(rowString);
  }
});

// Update account names with cleaned values after identifying unique rows
const accountTypesRange = monthSheet.getRange(1, accountsCol, uniqueRows.length, 1);
const accountTypes = accountTypesRange.getValues();
const updatedAccountTypes = accountTypes.map(row => [cleanAccountName_(row[0])]);
monthSheet.getRange(1, lastCol, updatedAccountTypes.length, 1).setValues(updatedAccountTypes);

This approach identifies unique rows based on the original data first. After identifying unique rows, it updates the cleaned account names only for those unique rows. Choose the solution that best suits your workflow. Both approaches will ensure that duplicate checks happen based on the original data before any cleaning takes place.


On Sun, Nov 17, 2024 at 5:46 AM Michael Timpano <maicol....@gmail.com> wrote:
Hello,

I am trying to append only unique rows to the sheet, by comparing existing data in the sheet with the new data being collected. It appends the same duplicate rows. Not sure what types of checks I can use along the way to ensure the comparison is in fact the same for the existing and new data. Thanks in advance for this concern. Take care.

function cleanAndProcessMonthSheet_(monthSheet, accountsLastCol, accountsCol, lastCol, sheetUniqueValues) {
  console.log('Checking to see if there are any duplicate rows.');

  // **Clean account names and update the month sheet**
  // Get all account names starting from row 1 (no header in monthly sheets)
  const lastRow = monthSheet.getLastRow();
  if (lastRow < 1) {
    console.log("No data rows to process in the month sheet.");
    return;
  }

  // Get the account types and clean them
  const accountTypesRange = monthSheet.getRange(1, accountsCol, lastRow, 1); // Start from row 1
  const accountTypes = accountTypesRange.getValues();
  const updatedAccountTypes = accountTypes.map(row => [cleanAccountName_(row[0])]); // Clean and convert to uppercase

  // Update the specified last column (lastCol) with the cleaned values, starting from row 1
  monthSheet.getRange(1, lastCol, updatedAccountTypes.length, 1).setValues(updatedAccountTypes);
  console.log("Cleaned account names updated in the last column.");

  // **Retrieve all unique account names and store them in the provided Set**
  const accountNames = monthSheet.getRange(1, accountsLastCol, lastRow, 1).getValues().flat().filter(Boolean);
  accountNames.forEach(value => sheetUniqueValues.add(value));

  // **Retrieve all the data in the sheet and check for duplicate rows**
  const dataRange = monthSheet.getDataRange();
  const values = dataRange.getValues();

  const uniqueRows = [];
  const uniqueRowsSet = new Set();

  values.forEach(row => {
    const rowString = row.join(); // Create a simple unique string representation of the row
    if (!uniqueRowsSet.has(rowString)) {
      uniqueRows.push(row); // Add only unique rows
      uniqueRowsSet.add(rowString);
    }
  });

  // **Overwrite the sheet with unique rows if duplicates were found**
  if (uniqueRows.length < values.length) {
    console.log("Duplicate rows found. Removing duplicates.");
    monthSheet.getRange(1, 1, uniqueRows.length, uniqueRows[0].length).setValues(uniqueRows);
    console.log("Successfully removed duplicates and updated the sheet.");
  } else {
    console.log("No duplicates found, no changes made.");
  }
}

For a bit more information I don't know if writing to the sheet -   monthSheet.getRange(1, lastCol, updatedAccountTypes.length, 1).setValues(updatedAccountTypes);
  console.log("Cleaned account names updated in the last column."); is happening at the wrong time and affecting the comparison? Basically the last column in the sheet has been processed to be in uppercase. 

Not sure if this is already the case for the new data when the comparison happens? Or should I ignore the last column in the existing data and new data for the comparison.

I have another approach that writes the new data and then a function runs to remove the duplicate rows, but that seems counterintuitive. Any help or suggestions are appreciated. Take care.

--
You received this message because you are subscribed to the Google Groups "Google Apps Script Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to google-apps-script-c...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/google-apps-script-community/b0576bcc-17fa-46ae-bb70-f028068ae322n%40googlegroups.com.
Reply all
Reply to author
Forward
0 new messages