# Managing Salesforce data
OrgFlow's data features let you move record data into and out of your Salesforce orgs quickly and easily. Powered by Salesforce's Bulk API and Pub/Sub API, these features are designed for speed and scale while remaining simple to use.
## Data operations
There are currently two data operations in OrgFlow:
| Operation | Description |
|---|---|
| Export data from Salesforce | Query records from a Salesforce org and download them as files in your choice of format |
| Import data into Salesforce | Upload records from a source file or URL and ingest them into a Salesforce org using insert, update, upsert or delete operations |
## Data vs. DevOps
OrgFlow has two main product areas: DevOps and Data. The DevOps features deal with flowing changes to Salesforce metadata (the structure and configuration of your org) using Git-based version control, while the data features deal with Salesforce data (the actual records stored in your org) and operate independently of any Git repository.
The two areas share underlying infrastructure and common functionality such as workspaces, stacks, jobs and schedules, but they serve different purposes and are operated independently.
## Targeting any Salesforce org
Unlike DevOps operations, which operate on environments in your stack, data operations can target any Salesforce org that you can authenticate to; the org does not have to be configured as an environment in your stack. This means you can use OrgFlow's data features against production orgs, sandboxes, scratch orgs and developer editions, regardless of whether those orgs participate in your DevOps workflow.
You can of course also use OrgFlow's data features on Salesforce orgs that form environments in your stack, and OrgFlow provides convenient shortcuts to start data operations from environments in your stack.
To be used with OrgFlow's data features, a Salesforce org must support:
- Salesforce REST API
- Salesforce Bulk API
For best performance in query operations, also make sure that query job platform events (BulkApi2JobEvent) are enabled in the Salesforce org (see below for details).
## File formats
Both import and export support multiple file formats:
| Format | Extension | Description |
|---|---|---|
| CSV | .csv | Comma-separated values — widely supported and easy to work with |
| JSON | .json | JavaScript Object Notation — useful for structured data pipelines |
| Parquet | .parquet | Columnar format — efficient for large datasets and analytics |
| Excel | .xlsx | Microsoft Excel workbook — convenient for manual review and edits |
When exporting, you can select one or more formats and OrgFlow will produce a downloadable file for each. When importing, the source file can be in any of these formats.
> **Tip:** The files produced by an export operation can be used directly by an import operation. This makes it easy to export data from one Salesforce org and import it into another.
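To illustrate why this works, here is a minimal Python sketch (not OrgFlow's implementation) of an export/import round trip through CSV: records serialized to a supported format parse back into the same records. The record values are hypothetical.

```python
import csv
import io

# Hypothetical records, shaped like the rows an export operation might produce
records = [
    {"Id": "001xx0000001", "Name": "Acme", "AnnualRevenue": "50000"},
    {"Id": "001xx0000002", "Name": "Globex", "AnnualRevenue": "75000"},
]

# "Export": serialize the records to CSV
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=records[0].keys())
writer.writeheader()
writer.writerows(records)
exported = buf.getvalue()

# "Import": parse the same file back; the rows match the original records
imported = list(csv.DictReader(io.StringIO(exported)))
print(imported == records)  # True
```

The same property holds for the other supported formats, which is what makes exported files usable as import sources without intermediate conversion.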
### Choosing a format
Each format has trade-offs in terms of processing speed, file size and compatibility:
- Parquet is the fastest format for OrgFlow to process and produces compact files. If speed is your priority, Parquet is the best choice. It is well-suited for large datasets and integrates easily with data analytics tools, but is not human-readable and requires specialized software to open.
- CSV offers a good balance of speed, compatibility and simplicity. CSV files are fast to process, easy to open in any text editor or spreadsheet application, and are widely supported across tools and platforms. This makes CSV a good general-purpose default and a lightweight alternative to Excel.
- JSON is useful when you need structured data for programmatic consumption or data pipelines. However, because field names are repeated for every record, JSON files can grow much larger than the other formats for wide or large datasets.
- Excel is convenient for manual review and editing in spreadsheet applications, but is significantly slower for OrgFlow to process than the other formats. If you don't specifically need an Excel file, consider using CSV instead for better performance. Excel files are also limited to a maximum of 1,048,576 records (see below).
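The JSON size overhead mentioned above is easy to see with a small sketch (hypothetical records; an illustration of the format trade-off, not a benchmark of OrgFlow):

```python
import csv
import io
import json

# 1,000 hypothetical records with three fields each
records = [
    {"Id": f"001xx{i:07d}", "Name": f"Account {i}", "Industry": "Tech"}
    for i in range(1000)
]

# CSV writes the field names once, in the header row
csv_buf = io.StringIO()
writer = csv.DictWriter(csv_buf, fieldnames=records[0].keys())
writer.writeheader()
writer.writerows(records)

# JSON repeats every field name in every record
json_text = json.dumps(records)

print(len(csv_buf.getvalue()) < len(json_text))  # True
```

The gap widens as datasets get wider (more fields) or longer (more records), which is why CSV or Parquet is usually the better choice for bulk data.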
> **Warning: Date/Time fields cannot be round-tripped in JSON files**
>
> Due to a bug in one of our upstream dependencies, Date/Time fields in exported JSON files cannot be used directly in a subsequent import operation. Every record in the import source that has a set value for one or more Date/Time fields will be rejected by the Bulk API and fail to ingest. (Date fields and Time fields are not affected by this issue.)
>
> Instead we recommend using Parquet or CSV formats for round-tripping Salesforce data. These formats round-trip Date/Time fields reliably, and are generally faster and more efficient than JSON.
> **Warning: Excel is limited to 1,048,576 records**
>
> Excel files are limited to a maximum of 1,048,576 records due to the 1,048,576-row worksheet limit in Excel and most other tools that open .xlsx files. If the dataset for a data operation (import or export) contains more than 1,048,576 records, no Excel result file will be created.
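If you select Excel for large exports, it can be worth checking the record count up front. A trivial sketch of such a pre-check (the `excel_can_hold` helper is hypothetical, not part of OrgFlow):

```python
# Maximum number of records an Excel result file can carry, per the limit above
EXCEL_MAX_RECORDS = 1_048_576

def excel_can_hold(record_count: int) -> bool:
    # Datasets beyond this size produce no Excel result file,
    # so fall back to CSV or Parquet for larger exports.
    return record_count <= EXCEL_MAX_RECORDS

print(excel_can_hold(1_048_576))  # True
print(excel_can_hold(1_048_577))  # False
```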
## Data operations and jobs
Data operations run as jobs in OrgFlow, just like DevOps operations. This means they benefit from the same job infrastructure and features, such as real-time progress monitoring, the job timeline, artifacts and job history.
Data operations are metered separately from DevOps operations and their active worker minutes are categorized as active worker minutes (data). Passive worker minutes are the same for both DevOps and data operations. See the billing topic for more information about worker minutes and billing.
## Salesforce Bulk API and Pub/Sub API
Under the hood, OrgFlow uses the Salesforce Bulk API to query and ingest data. The Bulk API is optimized for processing large volumes of records efficiently. OrgFlow automatically handles the details of creating Bulk API jobs, splitting data into batches, and retrieving results — so you can focus on the data itself rather than the API mechanics. OrgFlow uses Bulk API V2 for query operations, and Bulk API V1 for anything else.
For query operations, OrgFlow takes performance a step further by combining the Bulk API with the Salesforce Pub/Sub API. Rather than waiting until the whole Bulk API job has completed in Salesforce, OrgFlow subscribes to BulkApi2JobEvent platform event notifications via the Pub/Sub API, which allows it to process batches as soon as they are ready.
This translates into significant performance gains, since OrgFlow can download, transform and process early batches while the Bulk API job continues to produce subsequent batches in Salesforce.
At the time of writing, BulkApi2JobEvent platform events are still in preview and must be enabled in your Salesforce org. If this feature is not enabled, or if no BulkApi2JobEvent platform events are detected through the Pub/Sub API for any other reason, OrgFlow gracefully falls back to continuously polling for Bulk API job completion. This typically makes the data operation slower overall, but produces the same result in the end.
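The polling fallback described above can be sketched as follows. This is an illustrative Python sketch, not OrgFlow's actual code: `get_state` stands in for a Bulk API job-status request, and the state names (`InProgress`, `JobComplete`, and so on) are the ones Bulk API 2.0 jobs report.

```python
import time

def poll_until_complete(get_state, interval: float = 0.01, timeout: float = 5.0) -> str:
    # Fallback strategy: repeatedly ask for the Bulk API job state until
    # it reaches a terminal state, then fetch all result batches at once.
    # Slower than reacting to BulkApi2JobEvent notifications as batches
    # become ready, but it produces the same final result.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = get_state()
        if state in ("JobComplete", "Failed", "Aborted"):
            return state
        time.sleep(interval)
    raise TimeoutError("Bulk API job did not finish in time")

# Stub standing in for a sequence of job-status responses from Salesforce
states = iter(["UploadComplete", "InProgress", "JobComplete"])
print(poll_until_complete(lambda: next(states)))  # JobComplete
```

With the Pub/Sub path, the loop above is replaced by an event subscription, so batch processing starts while the job is still running instead of after it finishes.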