[wp-trac] [WordPress Trac] #52900: Instantly index WordPress web sites content in Search Engines
WordPress Trac
noreply at wordpress.org
Tue Oct 19 00:38:05 UTC 2021
#52900: Instantly index WordPress web sites content in Search Engines
-------------------------------------------------+-------------------------
Reporter: fabricecanel | Owner: (none)
Type: feature request | Status: new
Priority: normal | Milestone: Awaiting
| Review
Component: General | Version:
Severity: normal | Resolution:
Keywords: reporter-feedback has-patch has- | Focuses:
unit-tests |
-------------------------------------------------+-------------------------
Comment (by fabricecanel):
Replying to [comment:22 dd32]:
> Replying to [comment:21 fabricecanel]:
> > We did our first pull request
>
> Hi @fabricecanel,
>
> To follow up on some earlier comments here - have you looked into
integrating with either http://pingomatic.com/ or http://blo.gs/cloud.php
?
>
> They're admittedly not very modern APIs, but they benefit from millions
of existing sites already making use of them. Combined with existing
standards such as Sitemaps, they can provide what's needed without
additional code on the client's side.
> > Replying to [comment:22 dd32]: As shared in this feature request,
today Microsoft Bing and Yandex released https://www.indexnow.org/, a
search-industry-wide specification open to all major search engines and
already supported by Microsoft Bing, Yandex, and a few other actors in the
industry. We need a service that is secure (the key is provided by the
site), easy to integrate, and able to scale to the whole industry and to
all scenarios (web sites, CMSs, CDNs, SEO companies); that is targeted at
search engines so as to support add, update, and delete; and that helps
search engines minimize crawl load. So, a broader scope. One key scenario
for WordPress sites is that most site owners expect to see their content
indexed quickly (except in the case of a noindex tag) without having to do
anything. The ability to be indexed fast should be built into the search
engines; not all webmasters want to adopt a ping service only to see their
content stolen and duplicated all over the internet.
>
> There might also be room in the middle to act as a middleman - consuming
those APIs and relaying the pings on to Bing and others using the API, or
having Pingomatic or blo.gs relay them onwards to those too.
>
> Before this proposal is really viable to consider for WordPress
inclusion (IMHO) there needs to be industry support on it being a
generalised system that allows for all players (small and large) to be
supported without additional need from site authors or software vendors. A
standard is only truly open if multiple vendors support it, otherwise
it's just a proprietary format that so happens to be documented publicly.
>
> To me, it seems that having client websites actively "pinging" select
search engines added in WordPress core is not exactly open, I would want
anyone interested in the data being able to access a stream of the changes
- and having them get their crawler added to WordPress seems like a high
barrier to entry.
>
> This seems like one of the major benefits of centralised open relay
services like those mentioned above.
>
> I'm assuming that one of the reasons for this approach, based on the
inclusion of a per-site key that can be validated through an HTTP callback,
is that the existing methods include a lot of spam and lack any way to
verify that whoever sent the request is actually the author of it.
Monitoring the Blo.gs feed definitely shows a LOT of spam. While the key
verification will allow verifying that senders are who they say they are,
it won't prevent spam being pushed into the system.
>
> ----
>
> To throw some ideas in here:
> - What would need to be done to improve the existing pingback services
in place?
> - Do they ''need'' to be replaced?
> - Do they need to supply extra details to clients to improve the
service?
>
> Looking at the output from blo.gs feed:
> {{{
> <weblog name="My Site" url="https://example.org/" service="ping"
ts="20210928T08:00:00Z" />
> }}}
> > Replying to [comment:22 dd32]: Existing ping services are not open.
Users of these ping systems generally ping only a few dominant players.
https://www.indexnow.org/ is open: it shares submitted URLs among all
search engines that have adopted it. You ping one, you in fact ping all.
>
> That's not super useful as-is, since it doesn't say what changed, but it
would benefit greatly from the addition of a link to a) the sitemap and
b) the page changed, which would provide a lot of what this proposal adds.
> > Replying to [comment:22 dd32]: a) Sitemaps are a great way to tell
search engines about all the relevant URLs on your site, but search engines
attempt to look at sitemaps once a day. Would you like to wait 1+ days to
see your content indexed? IndexNow https://www.indexnow.org/ allows you to
have your content indexed now, not in a few days. b) Detecting page changes
is not a great solution: we would often have to poll millions of sites to
discover whether the content has changed. The right model is IndexNow +
Sitemaps: IndexNow to get indexing done fast, and sitemaps to catch up if
a ping is missed.
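
To make the "add, update and delete" ping with a site-provided key more
concrete, here is a minimal sketch in Python of how a client might build
such a notification. It is an illustration only, not the patch attached to
this ticket. The endpoint path, the `url`/`key` query parameters, the
`host`/`key`/`urlList` JSON fields, and the `<key>.txt` key-file location
are assumptions taken from the public description at
https://www.indexnow.org/ and should be checked there. The function names
and the example key are hypothetical. Nothing is sent over the network.

```python
import json
from typing import List
from urllib.parse import urlencode

# Assumption: a participating search engine exposes an /indexnow endpoint,
# as described at https://www.indexnow.org/. Verify before relying on it.
DEFAULT_ENDPOINT = "https://www.bing.com/indexnow"


def build_single_ping(changed_url: str, key: str,
                      endpoint: str = DEFAULT_ENDPOINT) -> str:
    """Return the GET URL announcing one added, updated, or deleted page."""
    query = urlencode({"url": changed_url, "key": key})
    return endpoint + "?" + query


def build_batch_payload(host: str, key: str, changed_urls: List[str]) -> str:
    """Return a JSON body announcing several changed URLs of one host."""
    return json.dumps({"host": host, "key": key, "urlList": changed_urls})


def key_file_url(host: str, key: str) -> str:
    """Return where the site would host its key so ownership can be checked.

    Assumption: the key is served as a text file at the site root.
    """
    return "https://" + host + "/" + key + ".txt"


if __name__ == "__main__":
    # "abc123" is a placeholder key, not a real one.
    print(build_single_ping("https://example.org/hello-world/", "abc123"))
    print(build_batch_payload("example.org", "abc123",
                              ["https://example.org/hello-world/"]))
    print(key_file_url("example.org", "abc123"))
```

The site only has to publish its key once; each later ping carries the
changed URL plus that key, so the receiving search engine can fetch the key
file and confirm that the ping comes from the site itself.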
--
Ticket URL: <https://core.trac.wordpress.org/ticket/52900#comment:28>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform