User:MPopov (WMF)/Notes/Android app analytics
(These notes are an ongoing work-in-progress.)
Metrics
Google Play Store reports
The following monthly reports are available from the Play Console:
- Acquisition channels in acquisition reports include Play Store (users find the app by browsing or searching on the Play Store app), Google Search, third-party referrers (users find the app via an untagged deep link to the Play Store), and AdWords (Google’s advertising service).
- App ratings over time are calculated from users’ 1-5 star ratings.
App stickiness (DAU/MAU)
We currently have overall (global) daily and monthly active user counts (DAU/MAU). T186828 is about calculating those at a per-country level. Since these queries rely on the wmfuuid field (which contains the appInstallId) in the X-Analytics column of webrequests, the raw DAU and MAU counts are lower than they should be: web requests from the app do not contain that field if the user has turned off "Send usage reports" in the app settings.[1]
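As a rough illustration of the stickiness ratio, here is a minimal Python sketch. It assumes stickiness is defined as the mean DAU over the days with activity divided by the MAU for the same month (definitions vary), and that the input is a hypothetical list of (date, install ID) pairs already extracted from webrequests. Rows without an install ID stand in for opted-out users and are skipped, which is why the counts are underestimates.

```python
from collections import defaultdict
from datetime import date


def stickiness(rows):
    """Return mean DAU / MAU for (day, install_id) pairs from one month.

    `rows` is a hypothetical input shape, not the actual webrequest schema.
    Rows whose install_id is None (no wmfuuid sent) are ignored.
    """
    daily = defaultdict(set)  # day -> distinct install IDs seen that day
    monthly = set()           # distinct install IDs seen in the month
    for day, install_id in rows:
        if install_id is None:
            continue
        daily[day].add(install_id)
        monthly.add(install_id)
    if not monthly:
        return 0.0
    mean_dau = sum(len(ids) for ids in daily.values()) / len(daily)
    return mean_dau / len(monthly)


example_rows = [
    (date(2018, 2, 1), "a"),
    (date(2018, 2, 1), "b"),
    (date(2018, 2, 2), "a"),
    (date(2018, 2, 2), None),  # opted-out user, not counted
]
print(stickiness(example_rows))
```

Note that the mean is taken over days that have at least one counted user; a production query would more likely average over every calendar day in the month.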
Build Variants
There are four main build variants with the following differences:
- Dev: sends events to the beta EL cluster (deployment-eventlog05.eqiad.wmflabs) and sampling is disabled, same as Alpha. Developer Settings in the UI are enabled by default.
- Alpha: sends events to the beta EL cluster (deployment-eventlog05.eqiad.wmflabs) and sampling is disabled, same as Dev.
- Beta: sends events to production EL and sampling is enabled, same as Prod.
- Prod: sends events to production EL and sampling is enabled, same as Beta.
Developer Settings
This button is visible by default in the Dev build and hidden otherwise, but can be revealed by going to the About screen and tapping the circular W icon 7 times. The app install ID can then be found under readingAppInstallID.
EventLogging
Sampling Rates
Per T187239#4025260, when a user has opted in (or not opted out, as the case may be) to sending us usage reports, the amount of data we receive from that user varies by funnel (each feature or metric has its own funnel), because different analytics funnels have different activation rates. If N is the number of users who are opted in, then if a funnel is configured to have:
- SAMPLE_LOG_ALL (default), that funnel activates for 100% of those N users
- SAMPLE_LOG_10, that funnel activates for ~10% of those N users
- SAMPLE_LOG_100, that funnel activates for ~1% of those N users
- SAMPLE_LOG_1K, that funnel activates for ~0.1% of those N users
Also, because of how modulo works (any value divisible by 100 is also divisible by 10), if a user's appInstallId (unique, randomly generated on first launch) activates the funnels with the SAMPLE_LOG_100 rate, then all the funnels with the SAMPLE_LOG_10 rate are activated as well.
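The nesting of sampling groups can be sketched in Python as follows. This is an illustration under stated assumptions, not the app's actual code: it assumes the constants have the values 1, 10, 100 and 1000, that an integer is derived from the last four hex digits of the install ID, and that a funnel activates when that integer modulo the rate is zero. The helper names and the install IDs are hypothetical.

```python
# Assumed values for the sampling constants (inferred from the percentages).
SAMPLE_LOG_ALL = 1
SAMPLE_LOG_10 = 10
SAMPLE_LOG_100 = 100
SAMPLE_LOG_1K = 1000


def sample_id(app_install_id: str) -> int:
    """Hypothetical: turn the last 4 hex digits of the ID into an integer."""
    return int(app_install_id[-4:], 16)


def funnel_active(app_install_id: str, sample_rate: int) -> bool:
    """Assumed rule: the funnel is active when the ID is divisible by the rate."""
    return sample_id(app_install_id) % sample_rate == 0


# Made-up install IDs whose last four hex digits are 0x0064 (100) and 0x000a (10).
in_100_group = "00000000-0000-4000-8000-000000000064"
in_10_group_only = "00000000-0000-4000-8000-00000000000a"

for rate in (SAMPLE_LOG_ALL, SAMPLE_LOG_10, SAMPLE_LOG_100, SAMPLE_LOG_1K):
    print(rate, funnel_active(in_100_group, rate), funnel_active(in_10_group_only, rate))
```

Under these assumptions the groups are nested: every user in the SAMPLE_LOG_1K group is also in the SAMPLE_LOG_100 and SAMPLE_LOG_10 groups, so data from differently sampled funnels does not come from independent sets of users.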
Filtering
Older versions of the app (2.4.184 and below) had a bug wherein, instead of the wiki ID such as "enwiki", the app would send the version (see T188557 for more details). Therefore, when calculating statistics on feature usage by wiki, the following conditions are necessary:
SELECT *
FROM log.MobileWikiAppLinkPreview_15730939
WHERE timestamp >= '20180201' AND timestamp < '20180301'
AND RIGHT(LEFT(userAgent, 28), 7) > '2.4.184' -- non-bugged version
AND INSTR(userAgent, '-r-') > 0 -- release version
LIMIT 10;
Here's the equivalent HiveQL version if working with the event logs in Hadoop:
SELECT *
FROM event.mobilewikiapplinkpreview
WHERE revision = 15730939
AND year = 2018
AND month = 2
AND useragent.wmf_app_version > '2.4.184' -- non-bugged version
AND INSTR(useragent.wmf_app_version, '-r-') > 0 -- release version
LIMIT 10;
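Both queries compare versions as strings, which is a lexicographic comparison: "2.4.99" sorts after "2.4.184" and "2.10.0" sorts before it. If the events are post-processed outside SQL, a numeric comparison avoids that pitfall. The following Python sketch assumes version strings of the form major.minor.build followed by a flavor marker such as "-r-"; the function names and the example version strings are hypothetical.

```python
import re

LAST_BUGGED_VERSION = (2, 4, 184)  # versions up to and including this are affected


def parse_version(wmf_app_version):
    """Return (major, minor, build) as integers, or None if the prefix is not numeric."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", wmf_app_version)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def is_usable_version(wmf_app_version):
    """True for release builds ('-r-') newer than the last bugged version."""
    version = parse_version(wmf_app_version)
    if version is None:
        return False
    return version > LAST_BUGGED_VERSION and "-r-" in wmf_app_version


# Made-up version strings, for illustration only.
for candidate in ("2.5.190-r-2018-02-12", "2.4.184-r-2017-01-01",
                  "2.4.99-r-2017-01-01", "2.5.190-beta-2018-02-12"):
    print(candidate, is_usable_version(candidate))
```

Tuples of integers compare element by element, so (2, 4, 99) is correctly treated as older than (2, 4, 184).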
Debugging
Setting the build variant to devDebug will:
- cause all schemas to become unsampled
- make events show up in Logcat
- send events to the beta instance in Labs
For example:
02-26 11:16:04.838 10715-10715/org.wikipedia.dev D/org.wikipedia.analytics.Funnel: log():137: SearchFunnel: Sending event, event_action = start
02-26 11:16:04.843 10715-10787/org.wikipedia.dev D/OkHttp: --> POST https://deployment.wikimedia.beta.wmflabs.org/beacon/event?%7B%22schema%22%3A%22MobileWikiAppSearch%22%2C%22revision%22%3A15729321%2C%22wiki%22%3A%22enwiki%22%2C%22event%22%3A%7B%22action%22%3A%22start%22%2C%22source%22%3A0%2C%22appInstallID%22%3A%226449295c-34c3-4a9f-8a8d-4750479bf808%22%2C%22searchSessionToken%22%3A%221c74d24d-1684-4ce5-bcdb-183f07b6357b%22%7D%7D (0-byte body)
so in that case the following event data is POSTed:
{
  "schema": "MobileWikiAppSearch",
  "revision": 15729321,
  "wiki": "enwiki",
  "event": {
    "action": "start",
    "source": 0,
    "appInstallID": "6449295c-34c3-4a9f-8a8d-4750479bf808",
    "searchSessionToken": "1c74d24d-1684-4ce5-bcdb-183f07b6357b"
  }
}
Note: source is included in the event data but not in the debug log because it is added by the second call to preprocessData, which happens after the first set of calls to preprocessData.
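The event payload can be recovered from the Logcat line by percent-decoding the query string of the POST URL and parsing it as JSON. A minimal Python sketch using only the standard library follows; the helper name decode_beacon_url is hypothetical, and the URL is the one from the log line above.

```python
import json
from urllib.parse import unquote, urlsplit


def decode_beacon_url(url):
    """Percent-decode the query string of a beacon URL and parse it as JSON."""
    query = urlsplit(url).query
    return json.loads(unquote(query))


# The URL from the OkHttp log line, split across string literals for readability.
beacon_url = (
    "https://deployment.wikimedia.beta.wmflabs.org/beacon/event?"
    "%7B%22schema%22%3A%22MobileWikiAppSearch%22%2C%22revision%22%3A15729321"
    "%2C%22wiki%22%3A%22enwiki%22%2C%22event%22%3A%7B%22action%22%3A%22start%22"
    "%2C%22source%22%3A0"
    "%2C%22appInstallID%22%3A%226449295c-34c3-4a9f-8a8d-4750479bf808%22"
    "%2C%22searchSessionToken%22%3A%221c74d24d-1684-4ce5-bcdb-183f07b6357b%22"
    "%7D%7D"
)

payload = decode_beacon_url(beacon_url)
print(json.dumps(payload, indent=2))
```

This is handy when checking many Logcat lines at once, for example to confirm that a field such as source is present in what is actually sent.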
Verifying
Thoroughly verifying events from the Alpha and Dev builds requires having multiple SSH connections open to deployment-eventlog05.eqiad.wmflabs to monitor 4 logs simultaneously:
tail -f /srv/log/eventlogging/client-side-events.log | grep "<app install id>"
tail -f /srv/log/eventlogging/all-events.log | grep "<app install id>"
tail -f /srv/log/eventlogging/systemd/eventlogging-processor@client-side-00.log | grep "<app install id>"
tail -f /srv/log/eventlogging/systemd/eventlogging-processor@client-side-01.log | grep "<app install id>"
client-side-events.log has all incoming events (as raw, encoded URI query strings) regardless of their validity, while all-events.log only has events which have been validated against the appropriate schemas. If there are any issues with the incoming events or their validation, there will be detailed messages in the two eventlogging-processor@client-side-XX logs.
Refer to AE's documentation for more information on EL testing & verification, and see Developer Settings above about obtaining the app install ID.
Miscellaneous
- See T189756#4054802 regarding the values of source in feed customization events.
- Article language switching (if an article is available in multiple languages):
  - In the MobileWikiAppLinkPreview schema, source maps to the History Entry enumeration, where language link has an ID of 6.
  - So Link Previews events with source = 6 are interpreted as the user switching between languages the article is available in.