remove-duplicate-rows

Finds duplicated rows in a datatable and moves the extra copies into a separate duplicates datatable.

If a row has one or more exact duplicates, only one copy is kept in the data output, and all the extra copies are moved to a second output, the duplicates datatable. For example, if the input datatable has five rows and they are all identical, the data output will contain one row and the duplicates output will contain four rows.
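
The split described above can be sketched in plain Python. This is an illustration only, not the component's implementation: the function name `split_duplicates`, the representation of rows as tuples, and the choice to keep the first occurrence are all assumptions made for the example.

```python
def split_duplicates(rows):
    """Split rows into (data, duplicates), keeping one copy of each row.

    Hypothetical helper: rows are assumed to be tuples of hashable values,
    and the first occurrence of each row is the one that is kept.
    """
    seen = set()
    data = []        # one copy of every distinct row
    duplicates = []  # every extra copy of a row already seen
    for row in rows:
        key = tuple(row)
        if key in seen:
            duplicates.append(row)
        else:
            seen.add(key)
            data.append(row)
    return data, duplicates


# Five identical rows, as in the example above.
rows = [("a", 1)] * 5
data, duplicates = split_duplicates(rows)
print(len(data), len(duplicates))  # prints: 1 4
```

Because the rows are exact duplicates, which copy is kept makes no difference to the content of either output.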

Arguments

Inputs:

data: The input datatable.

Outputs:

data: The datatable with only a single copy of each row.

duplicates: All extra copies of duplicated rows.

Possible use cases

  • Removing duplicated rows after using merge-datatables on two partially-overlapping datasets.

  • Determining which rows are duplicated, and how many times they are duplicated.
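
For the second use case, the duplicates output can be tallied to recover how often each row occurred. The sketch below is a hedged illustration in plain Python: the sample rows are invented, and representing the duplicates output as a list of tuples is an assumption.

```python
from collections import Counter

# Hypothetical contents of the duplicates output: every extra copy of a row.
duplicates = [("a", 1), ("a", 1), ("b", 2)]

# Count how many extra copies each row has.
extra_copies = Counter(duplicates)

# A row with n extra copies appeared n + 1 times in the original input,
# because one copy stays in the data output.
total_occurrences = {row: n + 1 for row, n in extra_copies.items()}
print(total_occurrences)  # prints: {('a', 1): 3, ('b', 2): 2}
```

Rows that never appear in the duplicates output occurred exactly once in the input.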
