Page MenuHomePhabricator

Security review for FileImporter extension
Closed, ResolvedPublic

Description

This extension is currently still in development but the team thinks it is ready for a first security review.

Project Information

Description of the tool/project

An extension that provides the ability to import a file from another wiki including all file and description history.
It will also make some predefined modifications to the description text.
It will also maintain previous revisions attachment to the correct user account on the new wiki

Description of how the tool will be used at WMF

It will be deployed on Commons.

Dependencies

  • This also depends on the deployment the FileExporter extension

Has this project been reviewed before?

Only review inside the WMDE-TechWish team.

Working test environment

  • None setup on labs, but the extension is easy to install (see instructions on mw.org)
  • After enabling the extension, navigate to Special:ImportFile

Post-deployment

WMDE-TechWish / WMDE-TechWish

Related Objects

Event Timeline

We are still actively working on this extension but it is currently blocked on a minor refactoring in core.
We should have an estimated date that it would be ready for review in the coming weeks.

The team believes this is ready for it's initial review now.

We are holding off on work on T161012 for now which was the last thing that we wanted to get in before a review.
If we do decide to continue with that task we can always request an individual review of the patch once ready.

The team believes this is ready for it's initial review now.

We are holding off on work on T161012 for now which was the last thing that we wanted to get in before a review.
If we do decide to continue with that task we can always request an individual review of the patch once ready.

Addshore renamed this task from WIP Security review for FileImporter extension to Security review for FileImporter extension.Aug 7 2017, 9:47 AM
Addshore updated the task description. (Show Details)

FileImporterHooks is kinda empty

So... A few little things from some testing (not security)

	"fileimporter-desc": "Easy importing of files from other sites",

This should say other MediaWiki sites, as it doesn't import files from any web "site"

Screen Shot 2017-11-02 at 18.22.40.png (314×1 px, 37 KB)

Maybe mention something on here too... The example suggests it can be a wiki page, but it's vague that it *has* to be a (media)wiki page

Also, why doesn't the file example have an extension?

Importing https://test.wikipedia.org/w/index.php?title=File:00000218.gif&action=history ....

Screen Shot 2017-11-02 at 18.29.12.png (168×656 px, 18 KB)

^ No, it's not 5 revisions. There is only one version of the file. There are 5 revisions of the file page. Maybe be specific and differentiate between number of revisions of the file history and the number of page revisions

Trying to import https://commons.wikimedia.org/wiki/File:30th_Street_at_5th_Avenue_-_Stra%C3%9Fe_in_New_York.jpg fails. I think it may be related to the original revision of the file being AWOL. Should be some more options than just failing (badly) like this below

[4e4f1c377fe077133b676185] /wiki/Special:ImportFile FileImporter\Exceptions\HttpRequestException from line 111 of /vagrant/mediawiki/extensions/FileImporter/src/Services/Http/HttpRequestExecutor.php:

Backtrace:

#0 /vagrant/mediawiki/extensions/FileImporter/src/Services/Http/HttpRequestExecutor.php(78): FileImporter\Services\Http\HttpRequestExecutor->executeWithCallback(string, array)
#1 /vagrant/mediawiki/extensions/FileImporter/src/Operations/FileRevisionFromRemoteUrl.php(76): FileImporter\Services\Http\HttpRequestExecutor->executeAndSave(string, string)
#2 /vagrant/mediawiki/extensions/FileImporter/src/Data/ImportOperations.php(47): FileImporter\Operations\FileRevisionFromRemoteUrl->prepare()
#3 /vagrant/mediawiki/extensions/FileImporter/src/Services/Importer.php(104): FileImporter\Data\ImportOperations->prepare()
#4 /vagrant/mediawiki/extensions/FileImporter/src/SpecialImportFile.php(227): FileImporter\Services\Importer->import(User, FileImporter\Data\ImportPlan)
#5 /vagrant/mediawiki/extensions/FileImporter/src/SpecialImportFile.php(133): FileImporter\SpecialImportFile->doImport(FileImporter\Data\ImportPlan)
#6 /vagrant/mediawiki/includes/specialpage/SpecialPage.php(522): FileImporter\SpecialImportFile->execute(NULL)
#7 /vagrant/mediawiki/includes/specialpage/SpecialPageFactory.php(578): SpecialPage->run(NULL)
#8 /vagrant/mediawiki/includes/MediaWiki.php(287): SpecialPageFactory::executePath(Title, RequestContext)
#9 /vagrant/mediawiki/includes/MediaWiki.php(851): MediaWiki->performRequest()
#10 /vagrant/mediawiki/includes/MediaWiki.php(523): MediaWiki->main()
#11 /vagrant/mediawiki/index.php(43): MediaWiki->run()
#12 /var/www/w/index.php(5): include(string)
#13 {main}

Future/feature request(s)

  • Better duplicate handling when a file exists locally - when there are more page revisions, it should allow a reimport/import of the missing revisions. Not tested when there's new revisions of a file, but I'm guessing this handles the same...
  • Make limit of importable number revisions (file/file page) into a $wg. Likely useful for WMF wikis when we're not collecting across the interwebs, so we more isn't necessarily gonna be much slower
  • If a page hits the limit, maybe import (or offer to) the newest $number rather than just stopping completely? Gonna be useful for WMF when we still want to import. Obviously, we can just Special:Export the file page history and import it...
Reedy triaged this task as Medium priority.Nov 2 2017, 6:46 PM

So... A few little things from some testing (not security)

	"fileimporter-desc": "Easy importing of files from other sites",

This should say other MediaWiki sites, as it doesn't import files from any web "site"

Screen Shot 2017-11-02 at 18.22.40.png (314×1 px, 37 KB)

Maybe mention something on here too... The example suggests it can be a wiki page, but it's vague that it *has* to be a (media)wiki page

I'm getting the feeling looking at the code structure, there's a plan to support other "remotes" than MediaWiki... Eventually

Still, there should be some better wording that we can use here.

Or, even at least list all the supported "remotes", especially on the Special:ImportFile page

Hell, just being able to select which file versions and page history versions to import would be very nice... Sometimes you might only want the newest, or one of the old file revisions

I should probably split various of these off into seperate bugs/feature requests

So... A few little things from some testing (not security)

	"fileimporter-desc": "Easy importing of files from other sites",

This should say other MediaWiki sites, as it doesn't import files from any web "site"

As you discovered below the plan would be to leave this open to other sites (flickr perhaps).

Screen Shot 2017-11-02 at 18.22.40.png (314×1 px, 37 KB)

Maybe mention something on here too... The example suggests it can be a wiki page, but it's vague that it *has* to be a (media)wiki page

Also, why doesn't the file example have an extension?

Yep this should include a file extension! T179643

Importing https://test.wikipedia.org/w/index.php?title=File:00000218.gif&action=history ....

Screen Shot 2017-11-02 at 18.29.12.png (168×656 px, 18 KB)

^ No, it's not 5 revisions. There is only one version of the file. There are 5 revisions of the file page. Maybe be specific and differentiate between number of revisions of the file history and the number of page revisions

I believe this is covered by T179281.
We want to go back to stating how many text revisions and how many image revisions will be imported.

Trying to import https://commons.wikimedia.org/wiki/File:30th_Street_at_5th_Avenue_-_Stra%C3%9Fe_in_New_York.jpg fails. I think it may be related to the original revision of the file being AWOL. Should be some more options than just failing (badly) like this below

[4e4f1c377fe077133b676185] /wiki/Special:ImportFile FileImporter\Exceptions\HttpRequestException from line 111 of /vagrant/mediawiki/extensions/FileImporter/src/Services/Http/HttpRequestExecutor.php:

Backtrace:

#0 /vagrant/mediawiki/extensions/FileImporter/src/Services/Http/HttpRequestExecutor.php(78): FileImporter\Services\Http\HttpRequestExecutor->executeWithCallback(string, array)
#1 /vagrant/mediawiki/extensions/FileImporter/src/Operations/FileRevisionFromRemoteUrl.php(76): FileImporter\Services\Http\HttpRequestExecutor->executeAndSave(string, string)
#2 /vagrant/mediawiki/extensions/FileImporter/src/Data/ImportOperations.php(47): FileImporter\Operations\FileRevisionFromRemoteUrl->prepare()
#3 /vagrant/mediawiki/extensions/FileImporter/src/Services/Importer.php(104): FileImporter\Data\ImportOperations->prepare()
#4 /vagrant/mediawiki/extensions/FileImporter/src/SpecialImportFile.php(227): FileImporter\Services\Importer->import(User, FileImporter\Data\ImportPlan)
#5 /vagrant/mediawiki/extensions/FileImporter/src/SpecialImportFile.php(133): FileImporter\SpecialImportFile->doImport(FileImporter\Data\ImportPlan)
#6 /vagrant/mediawiki/includes/specialpage/SpecialPage.php(522): FileImporter\SpecialImportFile->execute(NULL)
#7 /vagrant/mediawiki/includes/specialpage/SpecialPageFactory.php(578): SpecialPage->run(NULL)
#8 /vagrant/mediawiki/includes/MediaWiki.php(287): SpecialPageFactory::executePath(Title, RequestContext)
#9 /vagrant/mediawiki/includes/MediaWiki.php(851): MediaWiki->performRequest()
#10 /vagrant/mediawiki/includes/MediaWiki.php(523): MediaWiki->main()
#11 /vagrant/mediawiki/index.php(43): MediaWiki->run()
#12 /var/www/w/index.php(5): include(string)
#13 {main}

Looks like this is reproducible on our test system, I imagine there are more edge cases like this that we have not covered.
I have filed T179645 for this.

Future/feature request(s)

  • Better duplicate handling when a file exists locally - when there are more page revisions, it should allow a reimport/import of the missing revisions. Not tested when there's new revisions of a file, but I'm guessing this handles the same...

T179646

  • Make limit of importable number revisions (file/file page) into a $wg. Likely useful for WMF wikis when we're not collecting across the interwebs, so we more isn't necessarily gonna be much slower
  • If a page hits the limit, maybe import (or offer to) the newest $number rather than just stopping completely? Gonna be useful for WMF when we still want to import. Obviously, we can just Special:Export the file page history and import it...

T179647

So... A few little things from some testing (not security)

	"fileimporter-desc": "Easy importing of files from other sites",

This should say other MediaWiki sites, as it doesn't import files from any web "site"

Screen Shot 2017-11-02 at 18.22.40.png (314×1 px, 37 KB)

Maybe mention something on here too... The example suggests it can be a wiki page, but it's vague that it *has* to be a (media)wiki page

I'm getting the feeling looking at the code structure, there's a plan to support other "remotes" than MediaWiki... Eventually

Still, there should be some better wording that we can use here.

Or, even at least list all the supported "remotes", especially on the Special:ImportFile page

So, Special:ImportFile should only really thought about right now as an easy way to test the feature.
When deployed this page will mostly be hidden / the form could be disabled.
The primary route through the extension right now will be the link from the FileExporter extension on file pages on the import sources. Clicking this will take users straight to the preview page.
@Lea_WMDE any other thoughts here?

The primary route through the extension right now will be the link from the FileExporter extension on file pages on the import sources. Clicking this will take users straight to the preview page.

Yes, that's right. It is not finally decided if this page will exist in the end, but if it did, we would definitely put more thought into making it understandable for users. Right now it just helps us test.

@Reedy Do you know when you will start looking into the security side of things?

@Reedy Do you know when you will start looking into the security side of things?

I already have (I should probably move it on the workboard). Just looking at security issues is harder, and it's useful to have an idea how extensions work. Which often means finding other non security issues as you go through the code, some are big, some are small.... T179645 is a good example that the extension isn't quite ready for production yet ;)

There's a few parts I want to have another look over, but nothing serious.

Like I say, I would propose that T179645 is fixed before looking at any deployments, as the missing revision of a file is common enough that it will break soon enough due to the issue. And as Adam said, there's probably similar edge cases missing too, but might not find those until it gets wider usage.

The rest are pretty much feature requests, nothing that should probably block deployment anyway

So... A few little things from some testing (not security)

	"fileimporter-desc": "Easy importing of files from other sites",

This should say other MediaWiki sites, as it doesn't import files from any web "site"

As you discovered below the plan would be to leave this open to other sites (flickr perhaps).

Maybe include a list here -- other sites (such as MediaWiki) or something. New sites can be added later... Maybe.

In T160982#3732180, @Lea_WMDE wrote:

@Reedy Do you know when you will start looking into the security side of things?

I already have (I should probably move it on the workboard).

That's great news! And we appreciate you filing the bugs, too - T179645 is on our TODO-list :)

Hey. Sorry. The security releases ended up ruining most of last week for me, both on the day (took 8 hours dealing with CI starting at 8PM finishing after 4am..), and time taken on the few days before preparing for it after timescales being brought forward.

Neither of those bugs are security review dependant, but the latter is definitely required before deployment onto the main wikis (I don't see it as a reason to block deployment to beta), purely because a lot of users would likely end up falling over it.

Will look over the last few bits I wanted to check over again this afternoon, and hopefully sign it off today

Reedy claimed this task.
Reedy moved this task from In Progress to Waiting on the deprecated-security-team-reviews board.

Ok, yup, I'm happy.

As above, there's a couple of other tasks that need completing before wider rollout (these aren't security blockers), but it's fine to go to the beta cluster wikis, and potentially even to the test wikis in the meantime.