[wp-trac] [WordPress Trac] #62521: HTML Processor virtual token seek does not seek to correct location
WordPress Trac
noreply at wordpress.org
Fri Nov 22 10:14:01 UTC 2024
#62521: HTML Processor virtual token seek does not seek to correct location
--------------------------+------------------------------
Reporter: jonsurrell | Owner: (none)
Type: defect (bug) | Status: new
Priority: normal | Milestone: Awaiting Review
Component: HTML API | Version: 6.7
Severity: normal | Resolution:
Keywords: | Focuses:
--------------------------+------------------------------
Description changed by jonsurrell:
Old description:
> {{{#!php
> <?php
> $processor = WP_HTML_Processor::create_full_parser( 'text only' );
>
> $advance_and_log_tag = function () use ( $processor ) {
> assert( $processor->next_tag( array( 'tag_closers' => 'visit' ) )
> );
> echo str_repeat( ' ', $processor->get_current_depth() ) .
> ( $processor->is_tag_closer() ? ' /' : '' ) .
> $processor->get_token_name() .
> "\n";
> };
>
> $advance_and_log_tag();
> $advance_and_log_tag();
> $advance_and_log_tag();
> $advance_and_log_tag();
> // Now at `<BODY>` virtual token, not present in the HTML string.
> assert( 'BODY' === $processor->get_token_name() && !
> $processor->is_tag_closer() );
> assert( $processor->set_bookmark( 'apparently <BODY> open tag' ) );
> $advance_and_log_tag();
> $advance_and_log_tag();
> // Now at `</HTML>` virtual token, not present in the HTML string.
> assert( $processor->seek( 'apparently <BODY> open tag' ) );
> // Expected to return to `<BODY>` open tag.
> echo $processor->get_token_name() . "\n";
> // prints: #text
> assert( 'BODY' === $processor->get_token_name() );
> // AssertionError!
> }}}
>
> The above prints:
>
> {{{
> HTML
> HEAD
> /HEAD
> BODY
> /BODY
> /HTML
> #text
>
> Fatal error: Uncaught AssertionError: assert('BODY' ===
> $processor->get_token_name()) …
> }}}
New description:
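When a WP_HTML_Processor bookmark is set at a virtual token (a node in the
resulting document that does not correspond to an HTML token present in
the input string), seeking to that bookmark does not return to the virtual
token. In the example below, a bookmark set at the virtual `<BODY>` open
tag seeks to the `#text` node instead.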
For example:
{{{#!php
<?php
$processor = WP_HTML_Processor::create_full_parser( 'text only' );

// Advance to the next tag (openers and closers) and print it, indented by depth.
$advance_and_log_tag = function () use ( $processor ) {
    assert( $processor->next_tag( array( 'tag_closers' => 'visit' ) ) );
    echo str_repeat( ' ', $processor->get_current_depth() ) .
        ( $processor->is_tag_closer() ? ' /' : '' ) .
        $processor->get_token_name() .
        "\n";
};

$advance_and_log_tag();
$advance_and_log_tag();
$advance_and_log_tag();
$advance_and_log_tag();
// Now at `<BODY>` virtual token, not present in the HTML string.
assert( 'BODY' === $processor->get_token_name() && ! $processor->is_tag_closer() );
assert( $processor->set_bookmark( 'apparently <BODY> open tag' ) );
$advance_and_log_tag();
$advance_and_log_tag();
// Now at `</HTML>` virtual token, not present in the HTML string.
assert( $processor->seek( 'apparently <BODY> open tag' ) );
// Expected to return to `<BODY>` open tag.
echo $processor->get_token_name() . "\n";
// prints: #text
assert( 'BODY' === $processor->get_token_name() );
// AssertionError!
}}}
The above prints:
{{{
HTML
HEAD
/HEAD
BODY
/BODY
/HTML
#text
Fatal error: Uncaught AssertionError: assert('BODY' === $processor->get_token_name()) …
}}}
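Until the seek behavior is corrected, calling code can at least detect when a seek lands on the wrong token. The following is only a sketch of that idea: `set_checked_bookmark()` and `seek_checked()` are hypothetical helpers, not part of WordPress, and the sketch assumes WordPress 6.7 is loaded so that `WP_HTML_Processor` is available. It records what the processor reports when the bookmark is set and compares it after seeking.

{{{#!php
<?php
// Hypothetical helpers (not WordPress APIs). Assumes WordPress 6.7+ is loaded.

// Set a bookmark and remember which token the processor was on at that moment.
function set_checked_bookmark( WP_HTML_Processor $processor, string $name, array &$expected ): bool {
    $expected[ $name ] = array(
        'token'  => $processor->get_token_name(),
        'closer' => $processor->is_tag_closer(),
    );
    return $processor->set_bookmark( $name );
}

// Seek to a bookmark and report whether the processor is on the remembered token.
function seek_checked( WP_HTML_Processor $processor, string $name, array $expected ): bool {
    if ( ! isset( $expected[ $name ] ) || ! $processor->seek( $name ) ) {
        return false;
    }
    return $processor->get_token_name() === $expected[ $name ]['token']
        && $processor->is_tag_closer() === $expected[ $name ]['closer'];
}
}}}

With the script above, replacing `set_bookmark()` and `seek()` with these helpers would be expected to make `seek_checked()` return `false` for the `<BODY>` bookmark, since the processor reports `#text` after the seek. This only detects the mismatch; it does not move the processor to the intended token.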
--
Ticket URL: <https://core.trac.wordpress.org/ticket/62521#comment:3>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform