Problem/Motivation
The package_manager module that will be added to Drupal core for both https://www.drupal.org/project/automatic_updates and https://www.drupal.org/project/project_browser has a dependency on "php-tuf/composer-stager": "^2" (see the composer.json).
When Automatic Updates or Project Browser needs to run Composer commands that will modify any of the site's code (e.g., composer require, composer update, or composer remove), the Composer Stager library is what copies the site's entire codebase to a separate directory, runs the composer commands in that separate directory, and then syncs the changes back to the site's real codebase. In other words, this library is what ends up modifying the site's actual code files, so it's essential that we have confidence in its reliability and security, both when we initially add it to Drupal core, and ongoing after that.
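The copy, run, and sync-back flow described above can be sketched roughly as follows. This is a minimal Python illustration and not Composer Stager's code or API: the function name run_in_staging is hypothetical, and the sync-back step only copies files over the active directory (a real sync would also remove files that were deleted in staging).

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def run_in_staging(active_dir: Path, command: list[str]) -> None:
    """Hypothetical helper: run a command on a copy, then copy the result back."""
    staging_parent = Path(tempfile.mkdtemp(prefix="staging-"))
    staging_dir = staging_parent / "codebase"
    try:
        # Begin: copy the whole codebase to a separate directory.
        shutil.copytree(active_dir, staging_dir)
        # Stage: run the command in the copy. If it fails, an exception is
        # raised here and the active directory is never touched.
        subprocess.run(command, cwd=staging_dir, check=True)
        # Commit: copy the results back over the active directory.
        # Simplification: files deleted in staging are not deleted here.
        shutil.copytree(staging_dir, active_dir, dirs_exist_ok=True)
    finally:
        # Clean: remove the staging copy whether or not the command succeeded.
        shutil.rmtree(staging_parent, ignore_errors=True)
```

The point of the pattern is that a failing command leaves the live codebase exactly as it was, because nothing is written to it until the command has succeeded in the copy.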
So far, 99% of Composer Stager has been written by @TravisCarden, a Drupal contributor working for Acquia as part of Acquia's investment in the Automatic Updates initiative. The Composer Stager library does not make Drupal-specific assumptions and was written with the hope that other Composer-managed PHP CMSes and applications could someday benefit from it and contribute to it.
Proposed resolution
Place the Composer Stager library into core governance. This would mean, for example, all Drupal core committers having commit access to it, following Drupal core review processes for committing to it, and having a security policy in the repo that states that security issues should be reported according to Drupal's security reporting process.
Existing maintainers would retain commit access to the repository, but should coordinate any new releases or major changes with core committers.
Dependency evaluation
Per https://www.drupal.org/about/core/policies/core-dependency-policies/depe... :
- Repository
- https://github.com/php-tuf/composer-stager
- Code quality
- The project has extensive unit and functional test coverage, strict static analysis, and automated coding standards.
- Maintainership of the package
- Currently by @TravisCarden, but the Proposed Resolution is to place this under Drupal core maintenance.
- Security policies of the package
- The current GitHub repo does not have a security policy, but the Proposed Resolution is to place this under Drupal core maintenance and therefore under Drupal core security policies. See the Add SECURITY.md pull request.
- Expected release and support cycles
- To be determined by Drupal core release managers as part of placing this under Drupal core governance.
- Other dependencies it'd add, if any
Runtime dependencies
- symfony/process: Already a direct Drupal core runtime dependency.
- symfony/translation-contracts: Already an indirect Drupal core runtime dependency via symfony/validator.
- symfony/filesystem: Currently a Drupal core dev dependency, but @catch approved it being promoted to a runtime dependency in https://github.com/php-tuf/composer-stager/issues/60#issuecomment-154401....
Dev dependencies
Composer Stager has about 20 dev dependencies. See https://github.com/php-tuf/composer-stager/issues/78 for links to the child issues to evaluate each one.
Remaining tasks
- Since the repo will be under core governance, perform a dependency evaluation of all dev dependencies, as we would if we were adding them as core dev dependencies (https://github.com/php-tuf/composer-stager/issues/78).
- Review the code that's in Composer Stager for core quality standards.
- Add needed governance documents (security policy, links to Drupal core governance documents, etc.) to the repo. (https://github.com/php-tuf/composer-stager/issues/79)
Code overview for reviewers
Composer Stager's README and Wiki (which includes a glossary) are a good place to start. Read at least the README or the glossary first, in order to understand the purpose of Composer Stager's four core operations (Begin, Stage, Commit, and Clean). In addition to what's written there, the following is a quick orientation to the project's codebase.
API vs Internal
The repository has two top-level directories within /src: API and Internal. All files within the API directory are annotated with @api. All files within the Internal directory are annotated with @internal. Almost all interfaces are part of the API and almost all concrete classes are Internal. There are a few exceptions, and it's hopefully clear when reviewing the code why a few interfaces are internal and why a few concrete classes are part of the API.
Preconditions
For each of the four core operations (Begin, Stage, Commit, and Clean), there are preconditions that must all pass. For example, the Begin operation has a precondition that the staging directory must not already exist. The constructors of BeginnerPreconditions, StagerPreconditions, CommitterPreconditions, and CleanerPreconditions list the required preconditions of the corresponding operation.
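As a rough illustration of "a set of preconditions that must all pass", here is a minimal Python sketch. It is not Composer Stager's PHP API: the names Precondition, PreconditionError, assert_all, and BEGIN_PRECONDITIONS are hypothetical, and only the "staging directory must not already exist" check comes from the description above (the "active directory must exist" check is an assumption added for the example).

```python
from pathlib import Path
from typing import Callable, NamedTuple


class Precondition(NamedTuple):
    """A named check over the active and staging directories (hypothetical)."""
    description: str
    is_fulfilled: Callable[[Path, Path], bool]


class PreconditionError(Exception):
    """Raised with the description of the first unfulfilled precondition."""


def assert_all(preconditions, active_dir: Path, staging_dir: Path) -> None:
    # Every precondition must pass; stop at the first one that does not.
    for precondition in preconditions:
        if not precondition.is_fulfilled(active_dir, staging_dir):
            raise PreconditionError(precondition.description)


BEGIN_PRECONDITIONS = [
    # Assumed for illustration; not taken from the text above.
    Precondition("The active directory must exist.",
                 lambda active, staging: active.is_dir()),
    # Taken from the example given for the Begin operation.
    Precondition("The staging directory must not already exist.",
                 lambda active, staging: not staging.exists()),
]
```

An operation would call assert_all() before doing any work, so that it either starts from a known-good state or fails with a message naming the unmet requirement.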
Several preconditions check for the presence of hard links or unsupported symlinks. The premise of Composer Stager is that once the active codebase is copied to staging, Composer operations in staging won't affect the live site. However, if the codebase includes, for example, symlinks to absolute file paths, then the staging copy will point to the same place that the active codebase points to, so anything that writes through that symlink in staging could end up affecting the live site. Relative symlinks that don't traverse outside of the project root are allowed (except on Windows, which adds other challenges to working with symlinks that we decided to defer), because they are common in codebases that include Drush, local path repositories, or other use cases.
Several preconditions, such as the ones that check for hard links and symlinks per above, require iterating every directory and file in the codebase being operated on. This iteration is performed by AbstractFileIteratingPrecondition, which uses FileFinder, which in turn uses PHP's RecursiveDirectoryIterator.
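The kind of check described above can be sketched in Python as follows. This is an illustration of the idea only, not a port of AbstractFileIteratingPrecondition or FileFinder: the function name find_unsupported_links and the labels it returns are hypothetical, symlink targets are compared lexically (without following further links), and the Windows case is ignored.

```python
import os
from pathlib import Path


def find_unsupported_links(project_root: Path) -> list[tuple[str, str]]:
    """Return (problem, relative path) pairs for links the text calls unsafe."""
    root = os.path.realpath(project_root)
    problems = []
    # os.walk does not descend into symlinked directories by default, but it
    # still lists them, so every entry in the tree is visited exactly once.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            relative = path.relative_to(root).as_posix()
            if path.is_symlink():
                target = os.readlink(path)
                if os.path.isabs(target):
                    problems.append(("absolute symlink", relative))
                else:
                    # Lexical resolution of the relative target.
                    resolved = os.path.normpath(os.path.join(dirpath, target))
                    if os.path.commonpath([root, resolved]) != root:
                        problems.append(("symlink leaves project root", relative))
            elif path.is_file() and path.lstat().st_nlink > 1:
                # More than one directory entry refers to this file's data.
                problems.append(("hard link", relative))
    return sorted(problems)
```

Note that a hard-linked file is reported once for each of its names inside the tree, since every name has a link count greater than one.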
Because the codebase could be quite large (e.g., a site with a lot of contrib modules) and iterating each file in PHP could be slow, Composer Stager performs automated benchmarking of these operations as of version 2.0.0-beta1. See https://github.com/php-tuf/composer-stager/pull/230 for details. Current results indicate that, for a codebase that includes Drupal core plus the ~130 most installed contrib modules, each of the expensive operations (Begin and Commit) takes on the order of 15-30 seconds on GitHub Actions containers. That time includes the combined time of both running the preconditions and then performing the file sync operation itself.
File syncers: Rsync and PHP
The Begin operation copies the codebase (the entire Drupal application) from its live location (the "active" directory) to a temporary location (the "staging" directory). The Commit operation (which is called after all of the desired Composer commands have finished operating on the staging directory) copies back. On systems that have rsync available, RsyncFileSyncer is used. Otherwise, PhpFileSyncer is used. We expect RsyncFileSyncer to be significantly faster than PhpFileSyncer, especially for the Commit operation which only needs to sync back the files that are different (i.e., the ones that changed as a result of the Composer commands that ran in the staging directory), which rsync is very efficient at doing. The speed of the Commit operation matters, since Package Manager puts the Drupal site into maintenance mode during that time so as not to serve requests during the time that files are being copied from staging to active (i.e., when the live codebase is in a partially updated state).
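The "use rsync when it is available, otherwise fall back" decision can be sketched like this. It is a hedged Python illustration, not how Composer Stager wires up its syncers: the function name choose_file_syncer and its injectable which parameter are hypothetical, and the returned strings simply reuse the class names mentioned above as labels.

```python
import shutil
from typing import Callable, Optional


def choose_file_syncer(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Pick a syncer label based on whether an rsync executable is on PATH."""
    # shutil.which returns the executable's path, or None if it is not found.
    if which("rsync") is not None:
        return "RsyncFileSyncer"
    return "PhpFileSyncer"
```

Passing the lookup function in as a parameter is only a convenience for the sketch, so that both branches can be exercised without depending on what is installed on the machine.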
Exclusions
One of the parameters to Beginner::begin() and Committer::commit() is $exclusions. This is the set of paths to exclude from copying. Composer Stager is unopinionated about what codebase it's being used on, so it defaults this to NULL, but Package Manager passes several directories to it, including sites/default/files. This lets the Drupal site keep serving requests (and content editors keep uploading files) while Composer commands are running in the staging directory. Without these exclusions, the Commit operation would delete or undo changes that were made on the live site since the Begin operation ran.
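To make the effect of exclusions concrete, here is a minimal Python sketch of a one-way sync that copies new or changed files, deletes files missing from the source, and skips excluded paths on both sides. It is an assumption-laden illustration rather than Composer Stager's syncer: the names mirror, _is_excluded, and _relative_files are hypothetical, and empty directories and symlinks are not handled.

```python
import filecmp
import shutil
from pathlib import Path


def _is_excluded(relative: Path, exclusions: list[Path]) -> bool:
    # A path is excluded if it is an excluded path or lies beneath one.
    return any(relative == ex or ex in relative.parents for ex in exclusions)


def _relative_files(root: Path, exclusions: list[Path]) -> set[Path]:
    files = (p.relative_to(root) for p in root.rglob("*") if p.is_file())
    return {rel for rel in files if not _is_excluded(rel, exclusions)}


def mirror(source: Path, destination: Path, exclusions=()) -> list[str]:
    """Make destination match source, ignoring excluded paths on both sides."""
    excluded = [Path(e) for e in exclusions]
    destination.mkdir(parents=True, exist_ok=True)
    source_files = _relative_files(source, excluded)
    destination_files = _relative_files(destination, excluded)
    changed = []
    for rel in sorted(source_files):
        src, dst = source / rel, destination / rel
        # Copy only files that are new or whose contents differ.
        if rel not in destination_files or not filecmp.cmp(src, dst, shallow=False):
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
            changed.append(rel.as_posix())
    for rel in sorted(destination_files - source_files):
        # Files absent from the source are removed, unless they are excluded.
        (destination / rel).unlink()
        changed.append(rel.as_posix())
    return changed
```

In this sketch, calling mirror(staging, active) without exclusions would delete any file uploaded to sites/default/files after the initial copy, because that file does not exist in staging. Passing exclusions=["sites/default/files"] to both the initial copy and the sync back leaves that directory untouched.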
String translations
All potentially user-facing strings within Composer Stager's code files are translated via TranslatableAwareTrait::t(). This should hopefully make it easy for Drupal's POTX to find them. The implementation of this t() method uses a Composer Stager translation API that's modeled on Symfony's translation contracts component (not on Drupal's translation API, since Composer Stager is intended to be Drupal-agnostic). Drupal's Package Manager module overrides Composer Stager's translation factory service to return a Drupal-compatible TranslatableMarkup object. See #3368808: Override Composer Stager's TranslatableFactory to return Drupal's TranslatableMarkup for details.
Release notes snippet
php-tuf/composer-stager is now a dependency, which enables the experimental Automatic Updates and Project Browser functionality.