User:Frostly/Fortuna

Fortuna

The all-in-one file management tool

Fortuna is an upcoming tool designed to help editors who patrol image copyright detect violations of copyright policy in a more streamlined way. Like CopyPatrol, Fortuna analyses files for copyright violations. The tool will also include other features: duplicate file detection, a colour-based visual search engine, a reverse image search engine, and a notification system that tells file contributors when their files are being used on other sites. Fortuna will also provide an API for other tools to use.

Motivation


Wikimedia Commons currently has many copyright violations that go undetected. Copyright patrollers have comparatively few tools supporting them, and existing solutions such as OgreBot's new user upload log have been deprecated or removed.

Movement Strategy


Fortuna is aligned with two of the Movement Strategy goals:

Increase the Sustainability of Our Movement

Systematic approach to improve satisfaction and productivity

Assessing the needs of groups and volunteers, taking into account their local contexts for effective support and recognition of efforts.

I've already reached out to some contributors working in copyright, and will continue to do so. I also work in copyright myself!

Continuously engaging and supporting publicly diverse types of online and offline contributors.

People working in copyright are often underserved by the technology available to them; there are definitely fewer tools for copyright than for other areas.

Increased Wikimedia Awareness (prioritized initiative)

to secure the attention, trust, and interest of knowledge consumers

I think that my tool would help remove many copyright violations and increase trust in the licensing of the Movement's content.

Identify Topics for Impact

Identify Wikimedia's Impact (prioritized initiative)

Understand how our projects can be misused or abused by detecting threats with significant potential for harm

I certainly believe that copyright violations are a way that the projects can be abused!

Previous feature requests


2015 and 2022 Community Wishlist Survey, Phabricator task.

Features


Patrol


Web app (all features) and gadget (one-by-one info)

Fortuna Patrol is an interface, similar to CopyPatrol, for reviewing detected copyright violations. Each detected file will be given a score indicating how likely Fortuna believes it is to be a copyright violation.

  • Earwig's Copyvio Detector
  • CopyPatrol

Implementation notes

  • Use copyviobot-esque system, with an extension?
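
One way the likelihood score could be produced is by combining several detection signals into a single value and sorting the review queue by it. The sketch below is only an illustration: the signal names, the weights, and the `Candidate` type are all assumptions, since Fortuna's actual inputs have not been decided.

```python
from dataclasses import dataclass

# Hypothetical signals and weights; Fortuna's real inputs are not yet defined.
WEIGHTS = {
    "reverse_search_match": 0.6,  # image was found elsewhere on the web
    "new_uploader": 0.2,          # uploader has few prior contributions
    "missing_exif": 0.2,          # file carries no camera metadata
}


@dataclass
class Candidate:
    title: str
    signals: dict  # signal name -> strength in [0, 1]


def score(candidate: Candidate) -> float:
    """Weighted sum of signal strengths, clamped to [0, 1]."""
    total = sum(
        WEIGHTS.get(name, 0.0) * min(max(value, 0.0), 1.0)
        for name, value in candidate.signals.items()
    )
    return min(total, 1.0)


def review_queue(candidates):
    """Return candidates ordered from most to least suspicious."""
    return sorted(candidates, key=score, reverse=True)
```

A weighted sum keeps the score easy to explain to patrollers; whether it is accurate enough would need to be checked against real review outcomes.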

Curio


Extension

Fortuna Curio is an interface to review images that may be duplicated across Commons and other projects. The name comes from the word "curate".

Implementation notes
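
A possible starting point for finding duplicated images is a perceptual hash such as the average hash, compared by Hamming distance. This is a minimal sketch, not a decided design: it assumes each image has already been downscaled to a small grayscale grid by an image library, and the function names and the `max_distance` threshold are hypothetical.

```python
def average_hash(pixels):
    """Return a bit-string hash for a small grayscale image.

    `pixels` is a list of rows of brightness values (0-255), assumed to be
    already downscaled (for example to 8x8) by an image library.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p > mean else "0" for p in flat)


def hamming(a, b):
    """Number of positions where two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))


def likely_duplicates(hashes, max_distance=2):
    """Yield pairs of titles whose hashes differ by at most max_distance bits."""
    titles = sorted(hashes)
    for i, first in enumerate(titles):
        for second in titles[i + 1:]:
            if hamming(hashes[first], hashes[second]) <= max_distance:
                yield first, second
```

The pairwise comparison here is quadratic in the number of files, so at the scale of Commons the hashes would need to be indexed rather than compared one by one.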


Colorways


Extension

Fortuna Colorways allows for analysis of the colors in an image and searching images by color.

  • K-means clustering to find the dominant colors in an image; 10 should probably be the maximum number of dominant colors, similar to TinEye's recommendation
  • Use Elasticsearch or another data store (e.g. Redis) to store images and retrieve similar results; inspiration from TinEye
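
The k-means step could look roughly like the following. This is a plain-Python sketch for illustration only: it assumes the image has already been decoded into a list of RGB tuples, and the function names, the default `k`, and the fixed iteration count are assumptions rather than Fortuna's implementation.

```python
import random

MAX_COLORS = 10  # upper bound suggested in the note above


def squared_distance(a, b):
    """Squared Euclidean distance between two RGB tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dominant_colors(pixels, k=5, iterations=10, seed=0):
    """Cluster RGB tuples with k-means; return centroids, largest cluster first."""
    distinct = sorted(set(pixels))
    k = min(k, MAX_COLORS, len(distinct))
    # Start from k distinct pixel values so no cluster begins empty.
    centroids = random.Random(seed).sample(distinct, k)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach every pixel to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in pixels:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(ch) / len(cluster) for ch in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    order = sorted(range(k), key=lambda i: len(clusters[i]), reverse=True)
    return [centroids[i] for i in order]
```

The returned centroids could then be stored alongside each file so that a search by color becomes a nearest-neighbour lookup over those values. A production version would more likely use an optimized library and a perceptual color space instead of raw RGB.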

Discover


Mobile app integration and extension (with web interface and API in MW)

Fortuna Discover acts as a search engine, allowing users to find ("discover") content on Wikimedia projects by taking or uploading photos.

  • Needs an open-source implementation; the only solutions right now are proprietary services such as TinEye or Cloud Vision

Charisto


Extension

Fortuna Charisto allows contributors to subscribe to notifications of when their content is used online. The name Charisto comes from the Greek word for "thank you", "ef̱charistó̱".

  • Would it be cheaper to self-monitor using TinEye's general API?

Merido


Extension and FastAPI server

Fortuna will have an API named Merido that allows other tools to search for copyright violations without having to combine a variety of APIs and without needing additional funding. The name comes from the Greek word for "share", "merídio".

  • Plug in to GraphQL extension for GraphQL API support

Grafeas


Extension integration with Wikisource OR gadget OR separate extension

Identify text in images through OCR, and link to Earwig's copyright tool to check whether that text is a copyright violation.

Aspis


Gadget, extension and mobile integration

Hide potentially explicit images on pages and file description pages.

  • Google API currently; any other solutions?
  • The name is meant to evoke "safe" in Greek

Syntomo


Extension integration

Generate file captions/descriptions/depicts statements/categories.

Implementation


Technology


Fortuna will utilize a variety of APIs for detection in order to maximize results.

Services used:

Fortuna will be written in Elm for safety and quality assurance. The frontend will use OOUI to match with the interfaces of Wikimedia projects and to allow for easier user onboarding. The tool will be hosted on Toolforge and use ToolsDB for database storage and caching.

Funding
