Limitations and Considerations for Semi and Anti Joins
Semi and anti joins are powerful tools for filtering data based on the existence of related records. To ensure accurate results and optimal performance, consider these specific behaviors when getting started.
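To make the existence-based filtering concrete, here is a minimal sketch in plain Python rather than in the product's query language. The `semi_join` and `anti_join` helpers, the `accounts` and `opportunities` sample rows, and the `AccountId` key are all hypothetical names chosen for illustration; the sketch only shows the general semantics, not how the product implements them.

```python
def semi_join(primary, secondary, key):
    """Keep primary rows whose key value appears in the secondary dataset."""
    secondary_keys = {row[key] for row in secondary}
    return [row for row in primary if row[key] in secondary_keys]


def anti_join(primary, secondary, key):
    """Keep primary rows whose key value does NOT appear in the secondary dataset."""
    secondary_keys = {row[key] for row in secondary}
    return [row for row in primary if row[key] not in secondary_keys]


# Hypothetical sample data: three accounts, two of which have opportunities.
accounts = [
    {"AccountId": "A1", "Name": "Acme"},
    {"AccountId": "A2", "Name": "Globex"},
    {"AccountId": "A3", "Name": "Initech"},
]
opportunities = [
    {"AccountId": "A1", "Amount": 500},
    {"AccountId": "A1", "Amount": 900},  # second match for A1
    {"AccountId": "A3", "Amount": 250},
]

with_opps = semi_join(accounts, opportunities, "AccountId")
without_opps = anti_join(accounts, opportunities, "AccountId")

print([row["Name"] for row in with_opps])     # ['Acme', 'Initech']
print([row["Name"] for row in without_opps])  # ['Globex']
```

In this sketch, only rows and fields from the primary dataset are returned, and a primary row appears once even when it has several matches in the secondary dataset.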
Unsupported Operations:
- `cogroup` and joins can't be used in the same query.
- Totals and subtotals aren't supported.
- Boolean filter logic isn't supported.
- Blends and joins can't be combined in the same query.
Optimizing Performance:
- Filter the second dataset before running a join. Join performance is directly proportional to the amount of data returned by the second dataset.
- Execute join statements before any projections on the query results. For example, if your query includes a `foreach` statement, such as `q = foreach q generate count(q1) as 'A';`, run it after the join.
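The ordering recommended above (filter the second dataset, then join, then project) can be sketched in plain Python. This is an illustration of the order of steps only, not product query syntax; the `semi_join` helper, the `customers` and `orders` rows, and the `CustomerId` and `Year` fields are hypothetical.

```python
def semi_join(primary, secondary, key):
    """Keep primary rows whose key value appears in the secondary dataset."""
    secondary_keys = {row[key] for row in secondary}
    return [row for row in primary if row[key] in secondary_keys]


# Hypothetical sample data.
customers = [{"CustomerId": c} for c in ("C1", "C2", "C3", "C4")]
orders = [
    {"CustomerId": "C1", "Year": 2023},
    {"CustomerId": "C2", "Year": 2024},
    {"CustomerId": "C2", "Year": 2024},
    {"CustomerId": "C4", "Year": 2022},
]

# Step 1: filter the second dataset first, so the join scans fewer rows.
orders_2024 = [o for o in orders if o["Year"] == 2024]

# Step 2: run the join against the reduced second dataset.
matched = semi_join(customers, orders_2024, "CustomerId")

# Step 3: project last (analogous to a foreach that computes a count as 'A').
summary = {"A": len(matched)}

print(len(orders), len(orders_2024), summary)  # 4 2 {'A': 1}
```

Here the join works with two secondary rows instead of four, and the count is computed only after the join has produced its result.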
Supported Filters
- Filters can be applied to both the primary and secondary datasets.
Join Limitations
| Limitation | Details |
|---|---|
| Dataset Limit | You can join a maximum of two datasets. For combining more datasets (up to six), consider using a blend. |
| Field Pairings | Up to five field pairings are allowed between datasets. |
| Order of Operations | Datasets must be joined before exploration. Join Data Source becomes unavailable if groupings, measures, or filters are added first. |
| Self-Joins | Combining rows within the same dataset is supported. When a filter is applied in a self-join, only one dataset is visible. Global filters are applied to the primary dataset in the join. |
| Primary Dataset Focus | Faceting and record-level actions are applied exclusively to the primary dataset. |

