Commons:Requests for comment/Technical needs survey/TimedText: Difference between revisions

From Wikimedia Commons, the free media repository
→‎Discussion: tight integrations are better than build our own

Revision as of 10:16, 2 January 2024

TimedText

Description of the problems

  • Problem description:
  1. need an easy/user-friendly way to categorise timedtext. beneficial for categorising based on languages, quality of transcript, etc.
  2. need an easy way to check all timedtext pages associated with a file. something similar to https://commons.wikimedia.org/w/index.php?oldid=828200732#L-166 .
  3. need a more intuitive way of going to the associated file from a timedtext page. currently it's done by ctrl+clicking the file (the media player box), or by opening the popup and clicking the circled "i". i needed this so much that i wrote a script before i learnt the ctrl+click trick: https://commons.wikimedia.org/w/index.php?oldid=828200732#L-159 .
  4. a way to assess the quality of timedtext (similar to wikisource?). incomplete, transcribed, non-synchronised, proofread, verified...?--RZuo (talk) 23:59, 31 December 2023 (UTC)
  5. a tool/interface that helps transcription, something like https://www.nikse.dk/subtitleedit/online .--RZuo (talk) 07:07, 2 January 2024 (UTC)
  • Proposal type: feature request
  • Proposed solution:
  • Phabricator ticket:
  • Further remarks:
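
As a rough illustration of items 2 and 3 above, the sketch below splits a TimedText title into its associated file, language and format, and builds a MediaWiki API `list=allpages` query that would list every TimedText page sharing a file's name as prefix. It is not an existing Commons tool. The function names are hypothetical, the title pattern `TimedText:<file>.<lang>.<format>` is inferred from the example TimedText:Sandbox.webm.en.srt in the discussion, and the namespace number 102 is an assumption that should be checked against the wiki's namespace list.

```python
from urllib.parse import urlencode

API_URL = "https://commons.wikimedia.org/w/api.php"
TIMEDTEXT_NS = 102  # assumed TimedText namespace number on Commons; verify before use


def parse_timedtext_title(title):
    """Split 'TimedText:<file>.<lang>.<format>' into its parts (hypothetical helper)."""
    prefix = "TimedText:"
    if not title.startswith(prefix):
        raise ValueError("not a TimedText title: " + title)
    # The last two dot-separated pieces are language and format;
    # everything before them is the media file name (which has its own extension).
    parts = title[len(prefix):].rsplit(".", 2)
    if len(parts) != 3:
        raise ValueError("expected <file>.<lang>.<format>: " + title)
    file_name, lang, fmt = parts
    return {"file": "File:" + file_name, "lang": lang, "format": fmt}


def timedtext_listing_url(file_name):
    """Build an API URL listing all TimedText pages for one file (item 2)."""
    params = {
        "action": "query",
        "list": "allpages",
        "apnamespace": TIMEDTEXT_NS,
        "apprefix": file_name + ".",  # e.g. 'Sandbox.webm.' matches every language
        "aplimit": "max",
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(parse_timedtext_title("TimedText:Sandbox.webm.en.srt"))
    print(timedtext_listing_url("Sandbox.webm"))
```

A gadget or user script could use the parsed `file` value to render a plain link back to the file page, which is the navigation that item 3 asks for.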

Discussion

  • Oppose: You did not explain why this would be useful or why these needs exist. Also, 4 can already be done via file categories. Opposing for now, since this doesn't seem to be anywhere near the most important issues and can to a large degree already be done; very many other issues would be more important and haven't been listed here. --Prototyperspective (talk) 11:17, 1 January 2024 (UTC)
    can you point me to an english timedtext that's incomplete, and an english timedtext that's been proofread, based on your claim that "4 can already be done via file categories"? RZuo (talk) 11:46, 1 January 2024 (UTC)
    I said it can already be done, not that it is already being done, and I would encourage it to be done, especially if machine translation / auto-caption tools are leveraged for WMC multilingualism (which could be very impactful). However, I can also point you to an example: Category:Videos by Terra X with English subtitle file unchecked; these need proofreading (see the cats above for more). I think people usually just upload timedtexts that are already complete, but a new category for incomplete ones would be useful.
    Item 1 is also already being done with cats like "…with subtitles in English". Prototyperspective (talk) 16:13, 1 January 2024 (UTC)
        as i've tested at TimedText:Sandbox.webm.en.srt, timedtext pages can be categorised in the same way as other pages, but hotcat doesn't work on tt pages, so it's cumbersome. which is why i said we "need an easy/user-friendly way to categorise timedtext". the most basic solution is to make hotcat work on tt pages.
        but the traditional categorisation method is inferior to the assessment structure in wikisource, which i think is a lot easier to use (just clicking the coloured dots) and provides a standard classification.
        then this reminded me of the need to have a transcription tool, because transcribing audio/video is different from transcribing a text. transcribing audio/video requires pausing the playback and setting timestamps.--RZuo (talk) 07:07, 2 January 2024 (UTC)
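
To make the "pausing the playback and setting timestamps" step concrete, here is a minimal sketch of what a transcription helper would have to emit for each line of a .srt file. It assumes the playback position is available as seconds and uses the usual SubRip cue layout (index, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, text). The function names are hypothetical and not part of any existing tool.

```python
def srt_timestamp(seconds):
    """Format a playback position in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    if seconds < 0:
        raise ValueError("playback position cannot be negative")
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def srt_cue(index, start, end, text):
    """Build one SRT cue from the positions where the transcriber paused."""
    if end <= start:
        raise ValueError("cue must end after it starts")
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"


if __name__ == "__main__":
    # Cues are separated by a blank line in the final file.
    cues = [srt_cue(1, 0, 2.5, "hello"), srt_cue(2, 2.5, 5, "world")]
    print("\n".join(cues))
```

An editor interface would capture `start` and `end` from the player when the user pauses, which is the part that plain wikitext editing does not offer.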
  • Regarding 5. Strongly suggest that this is a case of "external specialised services are better than building and maintaining our own service". We used to have Amara integration, and for the few years that it worked, it was pretty ok. Finding a good online editor, hosting it on Toolforge and adding a few integrations is going to be way more maintainable than trying to ram yet another component into MediaWiki. --TheDJ (talk • contribs) 10:16, 2 January 2024 (UTC)