fix: return only unique Author ID <=> Post Author (ID) pairings. #22065

Open
wants to merge 1 commit into
base: trunk

eddiesshop

Because the query joins against the wp_posts table, a single author reassignment can match many posts. Currently, each of those rows is returned, but the per-post data is discarded because the SELECT statement only grabs author_id <=> post_author pairings. For large data sets, this leads to an extremely inflated list of duplicated author_id <=> post_author (ID) pairings.

With the GROUP BY clause, we ensure that only unique author_id <=> post_author (ID) pairings are returned, which makes the operation much more efficient (from a code standpoint).

Context

See this issue: #22064

This PR makes the update_indexables_author_to_reassigned operation, and the wp yoast cleanup operation, much more efficient, by tackling only the (Old) Author ID <=> (New) Post Author (ID) pairings that need to be processed.
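A rough sketch of why fewer pairings means less work. It assumes the cleanup step issues one UPDATE per returned pairing; that shape is a guess for illustration and is not taken from the plugin source. The `indexable` table, `make_indexables`, and `reassign_authors` are hypothetical names, and SQLite stands in for MySQL.

```python
import sqlite3


def make_indexables(n_posts, author_id):
    """Build an in-memory stand-in for the indexable table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE indexable (object_id INTEGER, object_type TEXT, author_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO indexable VALUES (?, 'post', ?)",
        [(i, author_id) for i in range(1, n_posts + 1)],
    )
    return conn


def reassign_authors(conn, pairs):
    """Assumed processing loop: one UPDATE per (old_author, new_author) pairing."""
    statements = 0
    for old_author, new_author in pairs:
        conn.execute(
            "UPDATE indexable SET author_id = ? "
            "WHERE author_id = ? AND object_type = 'post'",
            (new_author, old_author),
        )
        statements += 1
    return statements


def authors(conn):
    """Return the distinct author IDs left in the indexable table."""
    return [row[0] for row in conn.execute("SELECT DISTINCT author_id FROM indexable")]


# Without GROUP BY: one (old, new) row per reassigned post.
duplicated_pairs = [(1, 2)] * 1000
# With GROUP BY: each pairing appears once.
unique_pairs = sorted(set(duplicated_pairs))

conn_a = make_indexables(1000, author_id=1)
conn_b = make_indexables(1000, author_id=1)

statements_dup = reassign_authors(conn_a, duplicated_pairs)
statements_unique = reassign_authors(conn_b, unique_pairs)

print(statements_dup, statements_unique)
print(authors(conn_a), authors(conn_b))
```

Under that assumption, both runs end in the same state, but the deduplicated list needs a single UPDATE where the duplicated list repeats it once per post.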

Summary

This PR can be summarized in the following changelog entry:
changelog: enhancement

Fixes an issue where running the wp yoast cleanup CLI command would hang at the update_indexables_author_to_reassigned step on very large data sets. Previously, the only workaround was to manually delete the rows returned by this query from the wp_yoast_indexable table.

Relevant technical choices:

Test instructions

Test instructions for the acceptance test before the PR gets merged

This PR can be acceptance tested by following these steps:

  1. Create an arbitrarily large number of posts and ensure they are assigned to the same author.
  2. Run wp yoast index
  3. Create a new author in the DB.
  4. Assign all posts to this new author.
  5. Run wp yoast cleanup. Notice that the command hangs at the update_indexables_author_to_reassigned step.
  6. Run the following query: SELECT wp_yoast_indexable.author_id, wp_posts.post_author FROM wp_yoast_indexable JOIN wp_posts ON wp_yoast_indexable.object_id = wp_posts.id WHERE object_type='post' AND wp_yoast_indexable.author_id <> wp_posts.post_author ORDER BY wp_yoast_indexable.author_id. Notice that this query returns as many rows as the posts you created.
  7. Now run the following query: SELECT wp_yoast_indexable.author_id, wp_posts.post_author FROM wp_yoast_indexable JOIN wp_posts ON wp_yoast_indexable.object_id = wp_posts.id WHERE object_type='post' AND wp_yoast_indexable.author_id <> wp_posts.post_author GROUP BY wp_yoast_indexable.author_id, wp_posts.post_author ORDER BY wp_yoast_indexable.author_id. Notice that only 1 row is returned: the old author ID paired with the new post author ID.
  8. Check out this branch.
  9. Run wp yoast cleanup. Notice that the command runs without hanging.
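
The effect of steps 1 through 7 can be tried out without a WordPress install. The following is a minimal sketch that uses an in-memory SQLite database as a stand-in for MySQL, with simplified two- and three-column versions of wp_posts and wp_yoast_indexable (the real tables have many more columns). The author IDs 1 and 2 and the `N_POSTS` value are arbitrary choices for illustration.

```python
import sqlite3

N_POSTS = 500  # scaled-down stand-in for "an arbitrarily large number of posts"

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE wp_posts (id INTEGER PRIMARY KEY, post_author INTEGER);
    CREATE TABLE wp_yoast_indexable (
        object_id INTEGER, object_type TEXT, author_id INTEGER
    );
    """
)

# Steps 1-2: create posts for author 1 and "index" them.
conn.executemany(
    "INSERT INTO wp_posts (id, post_author) VALUES (?, 1)",
    [(i,) for i in range(1, N_POSTS + 1)],
)
conn.execute(
    "INSERT INTO wp_yoast_indexable SELECT id, 'post', post_author FROM wp_posts"
)

# Steps 3-4: reassign every post to a new author (ID 2).
conn.execute("UPDATE wp_posts SET post_author = 2")

BASE = """
SELECT wp_yoast_indexable.author_id, wp_posts.post_author
FROM wp_yoast_indexable
JOIN wp_posts ON wp_yoast_indexable.object_id = wp_posts.id
WHERE object_type = 'post'
  AND wp_yoast_indexable.author_id <> wp_posts.post_author
"""
GROUP = " GROUP BY wp_yoast_indexable.author_id, wp_posts.post_author"
ORDER = " ORDER BY wp_yoast_indexable.author_id"

# Step 6: the query without GROUP BY returns one row per reassigned post.
rows_without_group = conn.execute(BASE + ORDER).fetchall()
# Step 7: the query with GROUP BY returns one row per unique pairing.
rows_with_group = conn.execute(BASE + GROUP + ORDER).fetchall()

print(len(rows_without_group))
print(rows_with_group)
```

This only demonstrates the difference in row counts between the two queries. It says nothing about MySQL query plans or the runtime of wp yoast cleanup itself.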

Relevant test scenarios

  • Changes should be tested with the browser console open
  • Changes should be tested on different posts/pages/taxonomies/custom post types/custom taxonomies
  • Changes should be tested on different editors (Default Block/Gutenberg/Classic/Elementor/other)
  • Changes should be tested on different browsers
  • Changes should be tested on multisite
    Please see test steps above for reasoning on selecting second choice.

Test instructions for QA when the code is in the RC

N/A

  • QA should use the same steps as above.

QA can test this PR by following these steps:
N/A

Impact check

This PR affects the following parts of the plugin, which may require extra testing:
N/A

UI changes

  • This PR changes the UI in the plugin. I have added the 'UI change' label to this PR.

Other environments

  • This PR also affects Shopify. I have added a changelog entry starting with [shopify-seo], added test instructions for Shopify and attached the Shopify label to this PR.

Documentation

  • I have written documentation for this change. For example, comments in the Relevant technical choices, comments in the code, documentation on Confluence / shared Google Drive / Yoast developer portal, or other.

Quality assurance

  • I have tested this code to the best of my abilities.
  • During testing, I had activated all plugins that Yoast SEO provides integrations for.
  • I have added unit tests to verify the code works as intended.
  • If any part of the code is behind a feature flag, my test instructions also cover cases where the feature flag is switched off.
  • I have written this PR in accordance with my team's definition of done.
  • I have checked that the base branch is correctly set.

Innovation

  • No innovation project is applicable for this PR.
  • This PR falls under an innovation project. I have attached the innovation label.
  • I have added my hours to the WBSO document.

Fixes #22064

@enricobattocchi
Member

Hey @eddiesshop, thanks for the suggestion! We'll try to schedule a review and test in the upcoming weeks (we can't commit to a date since we are currently working on a large project).

Successfully merging this pull request may close these issues.

update_indexables_author_to_reassigned frequently hangs due to inefficient query