Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
79 changes: 79 additions & 0 deletions concepts/mapping.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,79 @@
---
title: "Mapping"
description: "Mapping"
icon: "map"
---

In Flatfile, _mapping_ describes the process of getting data from an uploaded file
into a destination worksheet in your application.

## The Ghost of Mapping Past

Historically mapping in Flatfile has consisted of two pieces:

1. A _field mapping_ which describes which columns in the source data
corresponds to which fields in the destination worksheet. For example, maybe the "fname"
column in the source data should be mapped to the "First Name" field in the destination worksheet.
2. For every destination field of type [enum](/blueprint/field-types#enum),
an _enum mapping_ describing which values in the source column
correspond to which allowable enum values in the destination column. For example, you could have an enum
column of country codes, and you might need the source value "United States" to map to the enum value "US".

Flatfile has developed machine-learning models trained on millions of historical mappings that can suggest
both field and enum mappings based on the column names and values in the source data. You can of course
override these suggestions. The Flatfile platform remembers your previously chosen mappings and will
suggest them when it sees similar schemas / data in the future.

## The Ghost of Mapping Present

Many users of Flatfile need more control over the mapping process than
just assigning source fields to destination fields. For example, you might need to
concatenate multiple source fields into a single destination field, or you might need to
choose the first non-empty value from a set of source fields.

To this end, we've recently introduced a wider variety of
[mapping rules](https://github.com/FlatFilers/flatfile-mapping/blob/main/README.md#mapping-rules)
that specify these more complex mapping-related transformations on your data,
and the notion of a _mapping program_ that selectively applies these rules to your dataset.

As of right now the only rule type supported by the Flatfile application is the `assign` rule,
as described above.

However, we have introduced a [Python SDK](developer-tools/mapping-libraries/flatfile-mapping-python)
that allows you to create your own mapping rules and mapping programs and apply them to your
own datasets outside of Flatfile (e.g. in an ETL pipeline).
The SDK is currently in extreme beta, but we would love to hear
your feedback on it. It supports the full complement of mapping rules, as described on its website.

If you have a Flatfile secret key, the SDK also exposes an "automapping" feature that uses
the previously-described machine learning models to suggest mapping programs for a given
source and destination schema. As with the Flatfile app, this feature currently only provides `assign` rules
(and also `ignore` rules that don't do anything but just communicate that a source field was not mapped).

## The Ghost of Mapping Future

In the near future we will be expanding mapping in several ways:

### Richer Mapping Rules in the Flatfile application

We are working on adding support for the full set of mapping rules
in the Flatfile application. This will allow a user mapping an incoming file
to create complex mapping programs, save them, and reuse them.

### Richer AI Assist for Mapping Rules

Currently our machine learning algorithms only suggest "assign" rules.
In the future they will also be able to suggest more complex mapping rules;
for instance, if your incoming data had a "first name" and "last name" field,
and your destination sheet only a "full name" field, the AI might suggest a
"concatenate" rule that combines the two source fields into the destination field.

In addition, we are working on a natural-language interface for creating mapping rules;
for instance, you could type "concatenate first name and last name into full name"
and the AI would produce the appropriate rule.

### JavaScript SDK

We are working on a JavaScript SDK that has similar functionality to the Python SDK,
the most notable difference being that it can't interact with Pandas dataframes.
It will be available soon.
9 changes: 9 additions & 0 deletions developer-tools/mapping-libraries/flatfile-mapping-python.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
---
title: "flatfile-mapping (Python)"
description: "Python library for running mapping programs over your own data"
icon: "github"
---

See documentation on GitHub:

https://github.com/flatfilers/flatfile-mapping
5 changes: 5 additions & 0 deletions mint.json
Original file line number Diff line number Diff line change
Expand Up @@ -113,6 +113,7 @@
"concepts/workbooks",
"concepts/actions",
"concepts/jobs",
"concepts/mapping",
"concepts/events",
"concepts/spaces",
"concepts/listeners"
Expand Down Expand Up @@ -238,6 +239,10 @@
"developer-tools/core-libraries/wrappers/vanilla"
]
},
{
"group": "Mapping libraries",
"pages": ["developer-tools/mapping-libraries/flatfile-mapping-python"]
},
{
"group": "Plugins",
"pages": ["plugins/overview"]
Expand Down