Data-flo

join-datatables

Joins two datatables based on a common column between them.

This is a commonly used adaptor for combining two separate data sources. As long as the two datatables share a common column, their rows can be matched and joined.

Arguments

Inputs:

main data: The main (left) datatable. When inner join is "false", all rows from this datatable will be included.

main column: The column in main data containing values shared by other column in other data.

other data: The other (right) datatable, to be joined to main datatable.

other column: The column in other data containing values shared by main column in main data. If more than one row matches a main column value, only the first matching row will be joined. If unspecified, the name of main column will be used.

inner join: Specifies whether to include only rows that have matching values in both tables. If unspecified, defaults to "false" (all rows from main data will be included, and rows from other data will be included when other column matches main column).

columns: Specifies which columns of other data to include. If unspecified, all columns in other data will be included. NOTE: This is a map. Key = column name in other data, Value = desired column name (this makes it possible to rename columns from other data).

overwrite: Specifies whether to overwrite columns which exist in both tables. If unspecified, defaults to "false" (columns will be included twice if they exist in both tables).

distance: Specifies how closely values must match when joining the rows, which allows fuzzy matching. If unspecified, defaults to 100 (full match).
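The columns map can be pictured as a select-and-rename step applied to each row of other data. The sketch below is a plain-Python illustration only, not Data-flo's implementation; the helper name `select_and_rename` and the sample row are hypothetical.

```python
def select_and_rename(row, columns=None):
    """Pick columns from one row of other data, renaming them per the map.

    `columns` maps a column name in other data (key) to the desired
    column name in the output (value). None means "keep everything".
    """
    if columns is None:
        return dict(row)
    return {
        new_name: row[old_name]
        for old_name, new_name in columns.items()
        if old_name in row
    }


# Hypothetical row from other data.
other_row = {"country": "uk", "capital": "London", "population": "67000000"}

# Keep only "capital" and rename it to "capital city".
renamed = select_and_rename(other_row, {"capital": "capital city"})

# With no map, every column is kept under its original name.
unchanged = select_and_rename(other_row)
```

Listing a column with the same name as both key and value would keep that column without renaming it.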

Output:

data: A datatable containing joined rows and columns.

skipped: A datatable containing rows from main data that have not been joined. If inner join is "false", then there are no skipped rows from main data.

TIP: keep in mind that unmatched rows from other data will not be returned in either data or skipped.
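To make the two outputs concrete, here is a minimal sketch of the join behaviour described above, written in plain Python with rows as dictionaries. It is an illustration under stated assumptions rather than the adaptor's actual code: the function name `join_first_match` and the sample tables are hypothetical, and the columns, overwrite, and distance arguments are left out.

```python
def join_first_match(main_data, main_column, other_data,
                     other_column=None, inner_join=False):
    """Return (data, skipped) for a left join, or an inner join on request."""
    # If other column is unspecified, the name of main column is used.
    other_column = other_column or main_column

    # Only the first matching row of other data is joined.
    first_match = {}
    for row in other_data:
        first_match.setdefault(row[other_column], row)

    data, skipped = [], []
    for row in main_data:
        match = first_match.get(row[main_column])
        if match is not None:
            merged = dict(row)
            # Assumption: columns already present in main data are left
            # untouched; the real adaptor controls this with `overwrite`.
            for name, value in match.items():
                merged.setdefault(name, value)
            data.append(merged)
        elif inner_join:
            skipped.append(dict(row))   # unmatched main rows go to skipped
        else:
            data.append(dict(row))      # left join keeps unmatched main rows
    return data, skipped


# Hypothetical tables: lab results (main) and ward metadata (other).
lab = [
    {"id": "A", "species": "E. coli"},
    {"id": "B", "species": "K. pneumoniae"},
]
meta = [
    {"id": "A", "ward": "ICU"},
    {"id": "A", "ward": "ER"},        # second match for A, ignored
    {"id": "C", "ward": "Surgery"},   # no match in main, appears nowhere
]

left_data, left_skipped = join_first_match(lab, "id", meta)
inner_data, inner_skipped = join_first_match(lab, "id", meta, inner_join=True)
```

In this sketch the row for id C never reaches either output, which mirrors the tip above about unmatched rows from other data.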

Example

The following images compare simple configurations of the join-datatables adaptor.

  • The first output shows that when inner join is false (default), all rows from main data are included, even when there is no additional data to join from other data. Note that when the datatables are swapped, the output is different: the third output keeps unmatched rows from the new main data, which was the other data input for the first output.

  • The second output excludes rows from main data that lack a counterpart in other data. In this case, the skipped output will be a single row (sample, country, capital: 2, us, <empty>).
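The three outputs can be approximated in a few lines of plain Python. This is a sketch under assumptions, not the data-flo shown in the images: the function `join`, the table contents, and the choice to fill unmatched cells with empty strings are all hypothetical, chosen only so that the skipped row matches the one described above.

```python
def join(main, other, key, inner_join=False):
    """Left join by default, inner join on request. Returns (data, skipped)."""
    lookup = {}
    for row in other:
        lookup.setdefault(row[key], row)  # first matching row wins

    # Columns that exist only in other data; left blank when nothing matches.
    main_columns = main[0] if main else {}
    other_columns = other[0] if other else {}
    extra = [name for name in other_columns if name not in main_columns]

    data, skipped = [], []
    for row in main:
        match = lookup.get(row[key])
        joined = dict(row)
        for name in extra:
            joined[name] = match[name] if match else ""
        if match is None and inner_join:
            skipped.append(joined)
        else:
            data.append(joined)
    return data, skipped


# Hypothetical input tables.
samples = [
    {"sample": "1", "country": "uk"},
    {"sample": "2", "country": "us"},
]
capitals = [
    {"country": "uk", "capital": "London"},
    {"country": "fr", "capital": "Paris"},
]

first, _ = join(samples, capitals, "country")                     # inner join false
second, skipped = join(samples, capitals, "country", inner_join=True)
third, _ = join(capitals, samples, "country")                     # tables swapped
```

Here `first` keeps sample 2 with a blank capital, `second` drops it into `skipped`, and `third` keeps the "fr" row because the capitals table is now main data.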

Possible use cases

  • Combine lab and sequencing data with epidemiological metadata.

  • Use after split-column.

Last updated 2 years ago

Example data-flo, available to copy and explore: Adaptor Demo: Join datatables

Figure: inputs, data-flo, and outputs showing the major functionality of the join-datatables adaptor