Problem/Motivation
The migration system keeps track of what has been migrated already from a source by writing a record for each migrated source row to a map table.
The SqlBase source plugin base class makes use of the map table by doing an SQL JOIN to the map table, so that the source query is filtered to only those source records that haven't already been imported.
This means that if you do an incremental migration, the migration process doesn't have to go through lots of source records that have already been imported, because they are simply eliminated from the query result.
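The filtering described above can be sketched outside Drupal. This is a minimal illustration in Python with SQLite, not the SqlBase implementation. The table names `source` and `migrate_map`, their simplified columns, and the helper `pending_source_ids()` are assumptions made for this example. The status values follow the convention that 0 means imported and 1 means needs update.

```python
import sqlite3

# Assumed status values for this sketch: 0 = imported, 1 = needs update.
STATUS_IMPORTED = 0
STATUS_NEEDS_UPDATE = 1


def pending_source_ids(conn):
    """Return source IDs that are missing from the map or flagged for update."""
    rows = conn.execute(
        """
        SELECT s.id
        FROM source s
        LEFT JOIN migrate_map m ON m.sourceid1 = s.id
        WHERE m.sourceid1 IS NULL OR m.source_row_status = ?
        ORDER BY s.id
        """,
        (STATUS_NEEDS_UPDATE,),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE source (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE migrate_map (
        sourceid1 INTEGER PRIMARY KEY,
        source_row_status INTEGER NOT NULL
    );
    """
)
# Five source rows; rows 1 and 2 are imported, row 3 needs an update.
conn.executemany(
    "INSERT INTO source VALUES (?, ?)",
    [(i, f"row {i}") for i in range(1, 6)],
)
conn.executemany(
    "INSERT INTO migrate_map VALUES (?, ?)",
    [(1, STATUS_IMPORTED), (2, STATUS_IMPORTED), (3, STATUS_NEEDS_UPDATE)],
)

print(pending_source_ids(conn))  # [3, 4, 5]
```

Rows 1 and 2 never leave the database, so the migration only iterates over the three rows that still need work.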
However, the ContentEntity migration source, which provides entities from the current Drupal site as the source rows, doesn't consider the map.
This means that if you do an incremental migration, or do your migration in batches, either for performance or during development, ALL the entities that have already been migrated are iterated over, loaded, and checked against the map.
This makes incremental migrations very slow, as they have to go over all the already migrated entities before they get to entities that need to be migrated.
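To make the cost concrete, here is a toy model in Python of the iterate, load, then check pattern. It is not Drupal code: `import_pending()`, the `load_entity` callback, and the figures of 10,000 entities with 9,990 already imported are all assumptions chosen for illustration.

```python
def import_pending(entity_ids, already_imported, load_entity):
    """Iterate every source entity, load it, then skip it if the map has it."""
    loaded = 0
    migrated = []
    for entity_id in entity_ids:
        load_entity(entity_id)  # the load cost is paid even for skipped rows
        loaded += 1
        if entity_id in already_imported:
            continue
        migrated.append(entity_id)
    return loaded, migrated


entity_ids = range(1, 10_001)            # 10,000 source entities
already_imported = set(range(1, 9_991))  # 9,990 of them are already in the map
loaded, migrated = import_pending(entity_ids, already_imported, lambda _id: None)

print(loaded, len(migrated))  # 10000 10
```

In this model, 10,000 entities are loaded in order to migrate 10. A query that excludes mapped rows up front would load only those 10.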
Steps to reproduce
Run an incremental migration that uses the ContentEntity source and has a large number of source records (around 10,000).
Proposed resolution
Move the addMapJoin() method to a new MapJoinTrait.
Use the new trait in both SqlBase and ContentEntity.
In ContentEntity, get the SQL query from the source entity query, and JOIN to the map table.
Remaining tasks
User interface changes
None.
API changes
None.
Data model changes
None.
Release notes snippet
TBD