Problem/Motivation
Drupal 7 introduced multiple loading of entities and the controller classes. This means that, especially when building lists of content, the best case can be a single multiget from cache to get all the fully populated entity objects (when using the entitycache module), or all nodes being loaded with a fixed number of database queries regardless of the number of nodes. This improved on the situation in Drupal 6, where each node was loaded individually and each node_load() could execute a fair number of database queries.
For this to work, there is a general pattern of "get the IDs, multiple load the nodes based on the IDs, do stuff with them". This is great for content listings, but it does not work if, for example, you need to load seven different nodes in seven different blocks - those still get loaded one at a time.
For the WSCCI initiative (especially after the IRC meeting last night and discussions in #1177246: Context values: Keys or objects?), there is a desire to have the publicly accessible context properties be either literals or objects. In the case of entities, this would most likely be $context['node'] or $context['user'].
There is also a desire to improve block caching, so that from just context, you can build a cache key and load a block from cache (core already allows you to do this via drupal_render(), #cache, and pre_render callbacks although it is only partly formalized).
This means that if we just get a $nid from somewhere, use the $nid as part of a cache key, and then get a cache hit, we won't actually need to load the node (or a 'parent' page won't need to load the node, but an ESI callback might on a cache miss, in a different PHP process altogether).
But... passing around $node objects, and yet only needing $nid to generate a cache key seem mutually exclusive I hear you say!
Proposed resolution
There has been a lot of discussion about making entities into proper classes (rather than stdClass) with methods etc. Additionally, there is a view that we should keep the Load controller (and other associated controllers as they come into play) as classes separate from the actual entity class (so $entity->save() calls a method from an EntitySave class; the code wouldn't be baked into the Entity class). Details are not fully worked out yet, but that is the general direction.
This means I think we should be able to construct an entity object like this:
$entity = new Entity($type, $id); // $id likely optional so that mock entities can be created, potentially new ones.
This means I can do $node = new Entity('node', $nid); $node->id (or $node->id() or whatever), and that will work fine.
What occurred to me yesterday is that there may well be a way to reconcile this with multiple load. It would look something like this:
* When you instantiate the class with an $id, the $id gets added to the EntityLoad controller - which maintains a list of $ids of that type that are in use.
* There are two options here: when you call $entity->load(), or when you just access a property that is missing (triggering __get() or similar), the $entity object calls back to the EntityLoad controller to fetch the actual loaded node. I am not yet tied to the magic method vs. the explicit ->load() method for this; the internal implementation in terms of this issue could be very similar.
* When ->load() (either directly called or via magic) asks the EntityLoad controller, it inspects the list of entity IDs that it has, vs. the ones that are already loaded. At this point, it is able to multiple load all of the nodes that weren't loaded already - so they're ready for later when requested. (we could also add a method or property to disable this behaviour, allow the default to be overridden or similar).
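The three steps above can be sketched roughly as follows. This is only an illustration under assumptions, not a patch: the class names EntityLoadController and Entity, the static register()/load() methods, the fake loadMultiple() storage and the $queries counter are all hypothetical and chosen just to make the idea runnable on its own.

```php
<?php
// Hypothetical sketch only: none of these classes or methods exist in core.

class EntityLoadController {
  // IDs registered by instantiated entities, keyed by entity type, then ID.
  protected static $registered = array();
  // Loaded records, keyed by entity type, then ID.
  protected static $loaded = array();
  // Number of storage round trips, tracked for demonstration only.
  public static $queries = 0;

  // Step 1: remember that an entity with this ID is in use.
  public static function register($type, $id) {
    self::$registered[$type][$id] = $id;
  }

  // Step 3: on the first real request, load every registered ID of this
  // type that has not been loaded yet, in a single multiple load.
  public static function load($type, $id) {
    if (!isset(self::$loaded[$type][$id])) {
      $known = isset(self::$registered[$type]) ? self::$registered[$type] : array();
      $done = isset(self::$loaded[$type]) ? self::$loaded[$type] : array();
      $pending = array_diff_key($known, $done);
      $pending[$id] = $id;
      self::$loaded[$type] = $done + self::loadMultiple($type, $pending);
    }
    return self::$loaded[$type][$id];
  }

  // Stand-in for the real storage query (assumption: one query per call).
  protected static function loadMultiple($type, array $ids) {
    self::$queries++;
    $records = array();
    foreach ($ids as $id) {
      $records[$id] = array('title' => "$type $id");
    }
    return $records;
  }
}

class Entity {
  protected $type;
  protected $id;
  protected $values = NULL;

  public function __construct($type, $id = NULL) {
    $this->type = $type;
    $this->id = $id;
    if ($id !== NULL) {
      EntityLoadController::register($type, $id);
    }
  }

  // Available without loading anything.
  public function id() {
    return $this->id;
  }

  // Step 2, explicit variant.
  public function load() {
    if ($this->values === NULL && $this->id !== NULL) {
      $this->values = EntityLoadController::load($this->type, $this->id);
    }
    return $this;
  }

  // Step 2, magic variant: accessing a missing property triggers the load.
  public function __get($name) {
    $this->load();
    return isset($this->values[$name]) ? $this->values[$name] : NULL;
  }
}

// Three entities are instantiated; nothing is loaded yet.
$nodes = array();
foreach (array(1, 2, 3) as $nid) {
  $nodes[$nid] = new Entity('node', $nid);
}
$title = $nodes[2]->title;  // First access: one multiple load for 1, 2 and 3.
$other = $nodes[3]->title;  // Already loaded, no further storage call.
```

In this sketch both the explicit ->load() and the __get() route end up in the same controller call, which is why the choice between them does not change the front-loading logic.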
This would give us the following:
- for lists, multiple load works more or less the same as now, except you could just foreach over $nids, instantiate the class and load it - no get $nids, load, then foreach pattern. So it should be simpler for people who just want to load nodes and do stuff with them. The front-loading would be encapsulated in the class logic rather than a pattern that has to be copy/pasted around.
- for non lists (like multiple blocks on a page with different nodes based on different relationships), if we build the context for the blocks, then go through and execute the block handlers in sequence (either building the actual render array or rendering the string, doesn't matter for this), multiple load will work for this case too, whereas it currently doesn't.
- For sites using ESI/big pipe or similar, processes that don't need to load entities won't do so just to generate cache keys or similar.
- There are advantages for doing things like mocking the context object too (i.e. a block callback never has to call $node = node_load($nid); it just acts on a class that was passed into it), so dependency injection, consistency etc. It keeps us closer to menu_get_object() and passing parameters vs. node_load(arg(1));
- This approach should be very compatible with adding an LRU cache for entities, see #375494: Restrict the number of nodes held in the node_load() static cache and #1199866: Add an in-memory LRU cache.
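To illustrate the cache key point, here is a minimal, self-contained sketch of a block callback that receives an entity object in its context but only touches the entity's data on a cache miss. Everything in it is an assumption made for illustration: LazyEntity is a stripped-down stand-in for the proposed entity class, build_block() and the plain $cache array are hypothetical, and the real thing would presumably go through drupal_render() and #cache instead.

```php
<?php
// Hypothetical sketch only: LazyEntity, build_block() and $cache are stand-ins.

class LazyEntity {
  // Counts how often entity data was actually fetched (demonstration only).
  public static $loads = 0;
  protected $type;
  protected $id;
  protected $values = NULL;

  public function __construct($type, $id) {
    $this->type = $type;
    $this->id = $id;
  }

  // Type and ID are known from construction, so no load is needed.
  public function type() {
    return $this->type;
  }

  public function id() {
    return $this->id;
  }

  // Any other property triggers the (fake) load on first access.
  public function __get($name) {
    if ($this->values === NULL) {
      self::$loads++;
      $this->values = array('title' => $this->type . ' ' . $this->id);
    }
    return isset($this->values[$name]) ? $this->values[$name] : NULL;
  }
}

// Builds the cache key from context alone; the entity is only loaded on a miss.
function build_block(array $context, array &$cache) {
  $node = $context['node'];
  $cid = 'block:' . $node->type() . ':' . $node->id();
  if (!isset($cache[$cid])) {
    $cache[$cid] = '<h2>' . $node->title . '</h2>';
  }
  return $cache[$cid];
}

$cache = array();
// Cache miss: the entity data is fetched once.
$miss = build_block(array('node' => new LazyEntity('node', 7)), $cache);
// Cache hit: a fresh entity object is passed in, but never loaded.
$hit = build_block(array('node' => new LazyEntity('node', 7)), $cache);
```

The second call passes a full entity object around, yet the only thing read from it is the type and ID, which is the combination the issue is trying to make possible.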
Remaining tasks
No patch yet, we really need to get #1018602: Move entity system to a module in.
User interface changes
None.
API changes
Probably.