Channel: Issues for Drupal core

Refactor file.inc to make it simpler and prevent PDO exceptions

Problem/Motivation

There is a unique constraint on the uri column in {file_managed}. When a new row is inserted, the code only checks that the URI is unique on the filesystem, not in the database. If the file has been deleted from the filesystem but its file entity still exists, the insert fails with a PDO exception.

Proposed resolution

The code is already too complicated and buggy, and previous patches posted in this issue made it more complicated. Instead, refactor the code related to creating unique URIs.

Managed file operations, e.g. file_copy() / file_move(), should not have the option to "take over" an existing file entity. They should be hardcoded to make sure the new URI is unique in both the database and the filesystem. A file entity's URI can change, but the contents of a file entity's file must never change.

Replace the constants
FILE_EXISTS_RENAME, FILE_EXISTS_REPLACE, FILE_EXISTS_ERROR
with
FILE_CHECK_EXISTS_FALSE, FILE_CHECK_EXISTS_FS, FILE_CHECK_EXISTS_DB_FS

We do not need FILE_EXISTS_ERROR: code that needs this functionality can perform the check itself.

The database should be checked for an existing URI before the filesystem, because communicating with the filesystem can be very expensive, e.g. it could be in the cloud.

There is no situation where it is necessary to check the database for an existing URI but not the filesystem.

Combine file_destination() and file_create_filename() into file_uri_prepare(), a single function that ensures a URI is suitable for a new file.
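As a rough illustration of the check order described above (database first, then filesystem), here is a minimal PHP sketch. It is not the code from the patch: the SKETCH_* constants, sketch_uri_exists(), sketch_uri_prepare() and the two callables standing in for the {file_managed} lookup and the filesystem lookup are all hypothetical names, and the "_N" renaming scheme is an assumption made for the example.

```php
<?php
// Hypothetical constants mirroring the proposed names; the values are arbitrary.
const SKETCH_CHECK_EXISTS_FALSE = 0;
const SKETCH_CHECK_EXISTS_FS = 1;
const SKETCH_CHECK_EXISTS_DB_FS = 2;

/**
 * Returns TRUE if $uri is already taken under the given check mode.
 *
 * $in_db and $on_fs are stand-ins (assumptions) for a {file_managed} query
 * and a filesystem lookup; each takes a URI and returns a boolean.
 */
function sketch_uri_exists(string $uri, int $check_exists, callable $in_db, callable $on_fs): bool {
  if ($check_exists === SKETCH_CHECK_EXISTS_FALSE) {
    return false;
  }
  // Ask the database first: if it already knows the URI, the possibly
  // expensive filesystem call is skipped entirely.
  if ($check_exists === SKETCH_CHECK_EXISTS_DB_FS && $in_db($uri)) {
    return true;
  }
  return (bool) $on_fs($uri);
}

/**
 * Returns $uri, or a renamed variant that is free under the given check mode.
 */
function sketch_uri_prepare(string $uri, int $check_exists, callable $in_db, callable $on_fs): string {
  if (!sketch_uri_exists($uri, $check_exists, $in_db, $on_fs)) {
    return $uri;
  }
  // Split "scheme://dir/name.ext" into directory part, name and extension.
  $slash = strrpos($uri, '/');
  $dir = $slash === false ? '' : substr($uri, 0, $slash + 1);
  $basename = $slash === false ? $uri : substr($uri, $slash + 1);
  $dot = strrpos($basename, '.');
  if ($dot === false || $dot === 0) {
    $name = $basename;
    $ext = '';
  }
  else {
    $name = substr($basename, 0, $dot);
    $ext = substr($basename, $dot);
  }
  // Try name_0.ext, name_1.ext, ... until a free candidate is found.
  $counter = 0;
  do {
    $candidate = $dir . $name . '_' . $counter++ . $ext;
  } while (sketch_uri_exists($candidate, $check_exists, $in_db, $on_fs));
  return $candidate;
}

// Usage: the URI is recorded in the "database" but is missing on "disk".
$in_db = fn(string $uri): bool => $uri === 'public://a.txt';
$on_fs = fn(string $uri): bool => false;
// Prints public://a_0.txt
echo sketch_uri_prepare('public://a.txt', SKETCH_CHECK_EXISTS_DB_FS, $in_db, $on_fs), "\n";
```

With SKETCH_CHECK_EXISTS_FS the same call would return public://a.txt unchanged, which is the situation that leads to the unique-constraint failure for managed files.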

Calls to file_valid_uri() just get in the way as it is perfectly valid to pass a filepath instead of a URI.

file_move() shouldn't return an entity as it is just modifying the entity that is passed to it.

file_copy() / file_move() / file_save_data() should set the filename of the resultant entity to the basename of the desired destination (provided it isn't a directory), not the basename of the resultant URI.

Remaining tasks

Review patch.

User interface changes

none

API changes

file_copy(File $source, $destination = NULL, $replace = FILE_EXISTS_RENAME) =>
file_copy(File $file, $destination = NULL)

file_unmanaged_copy($source, $destination = NULL, $replace = FILE_EXISTS_RENAME) =>
file_unmanaged_copy($source, $destination = NULL, $check_exists = FILE_CHECK_EXISTS_FS)

file_destination($destination, $replace) and file_create_filename($basename, $directory) =>
file_uri_prepare($uri, $check_exists = FILE_CHECK_EXISTS_FS)

file_move(File $source, $destination = NULL, $replace = FILE_EXISTS_RENAME) =>
file_move(File $file, $destination = NULL)

file_unmanaged_move($source, $destination = NULL, $replace = FILE_EXISTS_RENAME) =>
file_unmanaged_move($source, $destination = NULL, $check_exists = FILE_CHECK_EXISTS_FS)

drupal_file_exists($uri, $check_db = FALSE)

file_save_upload($source, $validators = array(), $destination = FALSE, $replace = FILE_EXISTS_RENAME) =>
file_save_upload($source, $validators = array(), $destination = NULL)

file_save_data($data, $destination = NULL, $replace = FILE_EXISTS_RENAME) =>
file_save_data($data, $destination = NULL)

file_unmanaged_save_data($data, $destination = NULL, $replace = FILE_EXISTS_RENAME) =>
file_unmanaged_save_data($data, $destination = NULL, $check_exists = FILE_CHECK_EXISTS_FS)

system_retrieve_file($url, $destination = NULL, $managed = FALSE, $replace = FILE_EXISTS_RENAME) =>
system_retrieve_file($url, $destination = NULL, $check_exists = FILE_CHECK_EXISTS_FS)

Original report by pwolanin

Found this in Drupal 7, but the fix should be the same for both 7 and 8 (and maybe 6?).

Discussed the problem with aaronwinborn and effulgensia, and it seems pretty clear we have a bug in file_destination(): for a managed file it only checks whether the file exists on disk.

This can lead to an SQL error if the URI already exists in the {file_managed} table. For example, with two web nodes that each have a local /tmp directory, a file being saved to temporary:// may not exist on the local disk but may already exist in the database. The same can happen if database entries are not correctly purged when a file is deleted from disk.

There is a second bug, a race condition in renaming. If two users upload a file at about the same time, the simple incrementing logic may give both users the same file name.

Proposed fixes:

#1: when saving a managed file, check the URI in the database as well as on disk.

#2: when generating a suffix in the case of a duplicate file to be renamed, use a random 4-6 character string, rather than just incrementing.
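To make proposed fix #2 concrete, here is a minimal PHP sketch of a random suffix. The function name sketch_random_suffix(), the hexadecimal alphabet and the "_" separator are assumptions for illustration only, not anything taken from Drupal core or from a patch.

```php
<?php
/**
 * Inserts a random hexadecimal suffix before the file extension.
 *
 * Hypothetical helper: the name, separator and alphabet are illustrative.
 */
function sketch_random_suffix(string $basename, int $length = 6): string {
  if ($length < 1) {
    throw new InvalidArgumentException('Suffix length must be at least 1.');
  }
  // random_bytes(n) yields 2n hex characters after bin2hex(); round up and
  // trim so that odd lengths work as well.
  $suffix = substr(bin2hex(random_bytes(intdiv($length + 1, 2))), 0, $length);
  $dot = strrpos($basename, '.');
  if ($dot === false || $dot === 0) {
    // No extension (or a dotfile): append the suffix at the end.
    return $basename . '_' . $suffix;
  }
  return substr($basename, 0, $dot) . '_' . $suffix . substr($basename, $dot);
}

// Usage: prints something like photo_3fa91c.jpg (the suffix differs per call).
echo sketch_random_suffix('photo.jpg'), "\n";
```

A random suffix makes it much less likely that two concurrent uploads pick the same name, but it does not rule it out, so the database check from fix #1 is still needed.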

