Toolhub/Progress reports/2021-09-24
Report on activities in the Toolhub project for the week ending 2021-09-24.
Minor code cleanups
- ui[AuditLog]: link list events to the list's page
- ui: Use named routes with router.push
- ui[CommonsImage]: delay rendering until metadata is available
- ui[ListInfo]: Move data reset to beforeMount
- ui[CrawlerHistory]: remove unnecessary string interpolation
- ui[router]: consistent route names and urls
- ui[AddRemoveTools]: Track active tab in query string
- ui[theme]: Update 'info' theme color
- ui[RegisterToolUrl]: Remove help sidebar
Kubernetes deployment progress
Work has been slowly progressing over the last couple of months on the configuration needed to deploy Toolhub in the Wikimedia Foundation's production Kubernetes clusters. For most services currently hosted in the Kubernetes (k8s) cluster this is a relatively easy process. The journey for Toolhub has been more difficult, primarily because Toolhub does more complicated things than most of the existing services. A typical k8s hosted service in our production cluster today is a node.js microservice which interacts only with MediaWiki and possibly one other microservice. Toolhub is one of the first full applications that needs to connect to a larger number of internal and external services:
- a MariaDB database
- an Elasticsearch full text index
- a Memcached key-value store
- the OAuth 2.0 API endpoints on metawiki
- arbitrary URLs to toolinfo.json files submitted to the application
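Most of the connections listed above map onto standard Django settings. As a rough illustration only, a configuration covering them might look something like the sketch below; the hostnames, ports, and setting names here are invented placeholders, not Toolhub's actual production values:

```python
# Hypothetical Django-style settings sketch for the service connections
# listed above. All hostnames and ports are illustrative placeholders.

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",  # MariaDB via Django's MySQL backend
        "HOST": "mariadb.example.internal",
        "PORT": 3306,
        "NAME": "toolhub",
    }
}

CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
        "LOCATION": "memcached.example.internal:11211",
    }
}

# Elasticsearch full text index, consumed by a search integration layer.
ELASTICSEARCH_HOSTS = ["http://elasticsearch.example.internal:9200"]

# OAuth 2.0 endpoints on metawiki used for login.
OAUTH_AUTHORIZATION_URL = "https://meta.wikimedia.org/w/rest.php/oauth2/authorize"
OAUTH_TOKEN_URL = "https://meta.wikimedia.org/w/rest.php/oauth2/access_token"

# Outbound HTTP proxy used when the crawler fetches arbitrary
# user-submitted toolinfo.json URLs.
OUTBOUND_HTTP_PROXY = "http://url-downloader.example.internal:8080"
```

Each of these endpoints has to be reachable from inside the k8s cluster's network policy, which is what makes Toolhub's deployment more involved than a typical single-dependency microservice.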
Three weeks ago we managed to get the application deployed into the "staging" k8s cluster, which is used to practice deployments before moving into the "eqiad" and "codfw" clusters which serve production traffic. This week we moved forward by configuring the initial database schema that will be used by the production application. Once this was done, users with the correct level of production access could set up HTTP proxy access to test the application.
Testing via the proxy allowed us to find a new set of missing configuration for the Kubernetes deployment. Specifically, we found that Toolhub's Django backend was not able to communicate with metawiki. Bryan had thought that the configuration to use the url-downloader HTTP proxy for web requests would handle this need. It turns out, however, that url-downloader blocks access to other services hosted inside the Foundation's production network. Giuseppe Lavagetto explained that the ideal solution for Toolhub to communicate with metawiki's API would be a service proxy. This, however, requires custom configuration of the user agent used to access the API so that it sends a Host: meta.wikimedia.org header in the request that differs from the hostname used in the proxy URL. Toolhub uses a third-party library (Python Social Auth) for its OAuth 2.0 implementation, which makes this type of custom HTTP request manipulation tricky. Giuseppe agreed that it would be reasonable to make a different configuration change instead: allowing Toolhub to connect directly to URLs on hosts which resolve to the "text-lb.{codfw,eqiad}.wikimedia.org" varnish clusters. This configuration has now been added to Toolhub's deployment configuration, along with changes to the existing HTTP proxy configuration that tell the application which URLs should be fetched directly rather than through the url-downloader proxy.
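The resulting routing rule — connect directly to Wikimedia hosts served by the text-lb varnish clusters, but send everything else through url-downloader — can be sketched as a small helper. The host allowlist, proxy URL, and function name below are illustrative assumptions, not the actual deployment configuration:

```python
from urllib.parse import urlparse

# Illustrative allowlist; the real deployment derives this from which
# hostnames resolve to the text-lb.{codfw,eqiad}.wikimedia.org clusters.
DIRECT_HOSTS = {"meta.wikimedia.org", "www.wikidata.org"}

# Hypothetical url-downloader proxy address for everything else.
URL_DOWNLOADER_PROXY = "http://url-downloader.example.internal:8080"


def proxies_for(url: str) -> dict:
    """Return a requests-style proxies mapping for the given URL.

    An empty mapping means "connect directly"; otherwise both schemes
    are routed through the url-downloader proxy.
    """
    host = urlparse(url).hostname
    if host in DIRECT_HOSTS:
        return {}
    return {"http": URL_DOWNLOADER_PROXY, "https": URL_DOWNLOADER_PROXY}
```

A client could then pass `proxies=proxies_for(url)` on each outbound request, keeping OAuth traffic to metawiki off the proxy while still shielding crawler fetches of arbitrary toolinfo.json URLs behind url-downloader.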
Deploying into the "eqiad" k8s cluster was the next step in our testing plan. The first attempt at this failed due to database server configuration (phab:T271480#7375578). Once the database permissions were fixed by Manuel Aróstegui with gerrit:723329, the deployment into "eqiad" completed with no errors.
Next week we will continue this work by setting up the necessary LVS load balancer, DNS records, and Varnish configuration to make https://toolhub.wikimedia.org route to the Toolhub deployment in the eqiad cluster. This will allow us to proceed with the final testing steps and initial configuration of the application.
Wrap up
Bryan's goal for the week was to complete the technical tasks of creating the initial database tables and testing the crawler process from inside the Kubernetes staging cluster. The database tables have been created and some testing done, but that testing revealed missing configuration which took the rest of the week to design and implement. With that configuration now in place, work will continue next week on setting up LVS, DNS, and Varnish for https://toolhub.wikimedia.org. Once that is done we will be able to continue testing OAuth authentication and the crawler. If all goes well we may be ready to announce Toolhub to the movement in the following week!