User Details
- User Since
- Mar 6 2015, 10:50 PM (507 w, 5 d)
- Availability
- Available
- LDAP User
- MNeisler
- MediaWiki User
- MNeisler (WMF)
Mon, Nov 25
@kostajh
No. If JavaScript was enabled at the time of the edit, this should still be counted as a JS edit event even if an ad blocker extension was installed at the time. We identified this issue in an earlier analysis and added instrumentation in T263505 to help distinguish, for all wikitext edits, between people who have JS disabled and people who are using ad blocker extensions.
Fri, Nov 22
@ppelberg Please see the analysis results summarized below for review:
Tue, Nov 19
Fri, Nov 8
Question: when do you think we should plan to prioritize this measurement (T379285)?
I've confirmed that instrumentation is available in event.mediawiki_web_ui_actions to calculate the metrics identified for this task. This includes the ability to track clicks on the "hide" button and clicks on the menu options for both the appearance and tools menus.
Thu, Nov 7
Wed, Nov 6
Here is the pre and post analysis report reviewing the impact of the change in Reference Check logic on identified metrics.
Fri, Nov 1
Progress update
- Updated the exploratory analysis to include blocks on other wikis, historical blocks, and a quick rate of editing from new accounts, especially when the editing starts a while after account creation (e.g. sleeper accounts). Based on initial results, a quick rate of editing seems strongly associated with blocked accounts.
- Gathered dataset with all identified key features and completed preprocessing steps needed to run the model.
- Started work on modeling the data to identify the importance of the currently reviewed data points.
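The update above does not say which model is being used. As a rough illustration of how the importance of data points can be compared, here is a minimal sketch that fits a logistic regression by gradient descent on standardized features and ranks them by coefficient size. The feature names (edits_per_hour, days_dormant) and every value in the toy dataset are made up for illustration and are not taken from the analysis.

```python
import math

# Hypothetical feature names and made-up toy rows (not real analysis data).
FEATURES = ["edits_per_hour", "days_dormant"]
X = [[30, 5], [25, 40], [40, 10], [35, 45], [2, 5], [5, 40], [3, 10], [4, 45]]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = account was blocked, 0 = not blocked


def standardize(rows):
    """Rescale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
        for c, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.1, steps=2000):
    """Plain batch gradient descent on the logistic log-loss."""
    n, d = len(rows), len(rows[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(rows, labels):
            pred = sigmoid(bias + sum(w * v for w, v in zip(weights, row)))
            error = pred - label
            grad_b += error
            for j, v in enumerate(row):
                grad_w[j] += error * v
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


weights, bias = fit_logistic(standardize(X), y)

# With standardized inputs, a larger absolute coefficient suggests a
# stronger association with the outcome in this toy dataset.
ranked = sorted(zip(FEATURES, weights), key=lambda fw: -abs(fw[1]))
for name, weight in ranked:
    print(f"{name}: {weight:+.2f}")
```

Standardizing first is what makes the coefficients comparable across features measured in different units. On real data, correlated features and class imbalance would need separate handling.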
Wed, Oct 30
Oct 24 2024
Oct 23 2024
@ppelberg Below is a summary of the initial results for review. This is based on all new content edits published between May and June 2024 in the English Wikipedia main namespace by newcomers, Junior Contributors, or unregistered users.
Oct 18 2024
Progress update for this week:
Oct 15 2024
Oct 11 2024
Progress update for this week:
Oct 4 2024
Oct 3 2024
Update: I reached out to @diego, who clarified that he does not currently have a list of keywords. Instead, he has a TF-IDF approach that uses words to classify articles. While the model itself doesn't output the list, additional steps could be completed to output a list of keywords related to peacock behavior and promotional tone. Diego indicated he might be able to find time to do this during the next week or so, depending on task urgency and priorities.
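For context, here is one possible way a keyword list could be derived from TF-IDF weights. This is a minimal sketch under stated assumptions, not Diego's model or his planned steps: the stopword list, the toy sentences, the labels, and the function names are all hypothetical. It ranks terms by how much higher their mean TF-IDF weight is in texts flagged as promotional than in the rest.

```python
import math
from collections import Counter

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"a", "an", "and", "the", "in", "of", "by", "with", "is"}


def tokenize(text):
    words = (w.strip(".,;:!?\"'()") for w in text.lower().split())
    return [w for w in words if w and w not in STOPWORDS]


def tfidf_vectors(docs):
    """Return one {term: tf * idf} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def mean_weight(vectors, term):
    return sum(v.get(term, 0.0) for v in vectors) / len(vectors)


def top_keywords(docs, flags, k=3):
    """Rank terms by mean TF-IDF in flagged docs minus mean in the others."""
    vectors = tfidf_vectors(docs)
    flagged = [v for v, f in zip(vectors, flags) if f]
    other = [v for v, f in zip(vectors, flags) if not f]
    vocab = set().union(*vectors)
    scores = {
        term: mean_weight(flagged, term) - mean_weight(other, term)
        for term in vocab
    }
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Made-up example sentences; True marks text treated as promotional.
docs = [
    "a legendary and visionary artist",
    "the legendary artist gave a stunning performance",
    "the artist released a record in 2010",
    "the artist gave a performance in 2012",
]
flags = [True, True, False, False]

keywords = top_keywords(docs, flags)
for term, score in keywords:
    print(f"{term}: {score:.3f}")
```

Terms that appear in every text (here "artist") get an IDF of zero and drop out of the ranking, which is why TF-IDF tends to surface distinctive wording rather than common vocabulary.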
Oct 2 2024
Thanks @MNeisler. I'm a little confused by the language "network error" and "no results", with network error being 20% and no results being so low. From the service end of things, ~20% is the right percentage of "no results found" - I don't see how that would translate into such a tiny percentage? What do "network error" and "no results" here mean in terms of status codes?
Sep 20 2024
@MNeisler I had some discussion with @cwylo, in which she suggested that we consider looking at blocked accounts on eswiki. The reason is that administrative activity looks quite different on enwiki than on eswiki. However, for this quarter, I think it would be fine if we restrict the scope to looking at enwiki only, so this is just a note that we might want to consider it for the future.
Sep 18 2024
Of the times when people open Citoid and do NOT follow through to insert a reference, what reasons explain why this might be the case?
Sep 13 2024
Update:
- Ran query and collected initial sample of blocked users on English Wikipedia for the month of July 2024.
- Limited dataset to users that created accounts less than 90 days prior to the time they were blocked.
- Included data on block reason and block duration, which can be used to segment different user groups during data exploration.
- Began exploratory analysis to understand the distribution of accounts across different data points. I'm starting with editing data such as number of edits, articles created, and reverts. I will share the initial results and propose some additional data points that I believe would be useful to explore next week.
@kostajh - Are there any specific data points from the project overview doc you'd like me to prioritize for the exploratory analysis?
- Also identified a possible approach to determine the relative importance of different variables on block rates, which relies on random forest (a classification algorithm) and logistic regression models. This was inspired by work completed in T356765 to understand variables that impact the deletion rate of content translated articles. I'll learn more about what types of approaches might be needed based on the patterns discovered during the exploratory analysis.
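The account-age restriction described in the update above (accounts created less than 90 days before the block) can be sketched as a simple date filter. This is an illustrative sketch only: the record fields, usernames, dates, and block reasons are hypothetical and do not reflect the actual query or dataset.

```python
from collections import Counter
from datetime import datetime, timedelta

MAX_ACCOUNT_AGE = timedelta(days=90)

# Made-up sample records; field names are assumptions for illustration.
blocks = [
    {"user": "ExampleA", "registered": datetime(2024, 6, 20),
     "blocked": datetime(2024, 7, 5), "reason": "spam"},
    {"user": "ExampleB", "registered": datetime(2023, 1, 10),
     "blocked": datetime(2024, 7, 12), "reason": "vandalism"},
    {"user": "ExampleC", "registered": datetime(2024, 5, 1),
     "blocked": datetime(2024, 7, 20), "reason": "spam"},
]


def is_new_account_block(record, max_age=MAX_ACCOUNT_AGE):
    """True if the account was created less than max_age before the block."""
    age_at_block = record["blocked"] - record["registered"]
    return timedelta(0) <= age_at_block < max_age


recent = [r for r in blocks if is_new_account_block(r)]

# Segment the remaining accounts by block reason for exploration.
by_reason = Counter(r["reason"] for r in recent)

print([r["user"] for r in recent])
print(dict(by_reason))
```

The strict comparison keeps an account that is exactly 90 days old at block time out of the sample, matching the "less than 90 days" wording.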
Sep 12 2024
Thanks @DLynch for checking and updating the documentation. I've confirmed I can find the automatic-generate-fail-searchResults and automatic-generate-fail-network events under the CitoidInspector feature.
Sep 10 2024
What percentage of edits to the Wikipedia main namespace are reverted because of the presence of peacock behavior?
This task was blocked on the deployment of temp accounts. While we verified that the new user_is_temp field now exists in EditAttemptStep, we wanted to wait for temp accounts to be deployed to confirm that data was logging for the new field as expected.
Sep 5 2024
@ppelberg On August 3rd, overall citoid requests from MyBib dropped substantially. I re-ran this analysis to include 30 days post August 3rd and did not identify any significant changes to Citoid feature use due to this change in request volume.
Sep 4 2024
Sep 3 2024
Aug 29 2024
Here is an update on the per-platform data currently available on the Superset dashboard:
Aug 28 2024
I've completed a review of all identified past interventions that have sought to impact constructive activation. Some insights and limitations of these past interventions are noted below:
Aug 27 2024
I reviewed constructive activation rates across recent registration months as well as YoY trends to confirm how this metric as defined [i] fluctuates over time. See the findings documented below.
Aug 23 2024
Aug 22 2024
@ppelberg I'm documenting some of the additional data exploration ideas that were discussed in today's "KR WE1.2 Steering Committee monthly sync" meeting here as they seem related to the effort to identify edit checks that will be impactful. These can be used to refine this task or moved to a separate task as needed.
Assuming doing so would not require a great deal of effort, I think it would be worthwhile to re-run this analysis for the 30 days that followed August 3, 2024 to see how – if at all – the decrease in Citoid request volume impacts these numbers.
Aug 21 2024
Aug 20 2024
Resolving this task as I believe the patch deployed in T368495 resolved any outstanding issues.
Aug 19 2024
Here are the initial results exploring Citoid feature use by user experience level and platform. Please let me know if you have any questions or if any further breakdowns would be useful. cc @ppelberg
Aug 15 2024
Aug 13 2024
I've completed an initial review of the third-party platforms from a product analytics perspective. Here is the current draft summary. Please feel free to add any questions or comments in the doc.
Jul 31 2024
The proposed new event (automatic-generate-manual-fallback) and the sequence of events noted in T370561#10023765 look good to me. No changes.