[localize] Fix "&", "<" and ">" getting replaced with html escape sequences. by IIIMADDINIII · Pull Request #5058 · lit/lit

IIIMADDINIII

Fixes #5012

"&", "<" and ">" do not need to be escaped in template literal strings.
Escaping them causes invalid translations when these symbols are used in a non-HTML context.
I think it is the code author's or translator's responsibility to properly escape special characters depending on the context.
lit/localize cannot know in which context the strings are used.

google-cla bot commented Aug 22, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@justinfagnani justinfagnani changed the title Fix "&", "<" and ">" getting replaced with html escape sequences. [localize] Fix "&", "<" and ">" getting replaced with html escape sequences. Aug 22, 2025
@justinfagnani
Collaborator

@aomarks any thoughts?

@IIIMADDINIII is there some kind of test you can add so this won't regress if it's a good fix?

changeset-bot bot commented Aug 24, 2025

🦋 Changeset detected

Latest commit: a366b9b

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages
Name                     Type
@lit/localize-tools      Minor
@lit-labs/cli-localize   Patch
@lit-labs/cli            Patch


@IIIMADDINIII
Author
IIIMADDINIII commented Aug 24, 2025

After digging a little deeper, I figured out the following:

@justinfagnani There is already a test for this, but the expected value included the escape sequences:
https://github.com/lit/lit/pull/5058/files#diff-f69680f4d4214898d3d6247b7ad6a30a08aec9f7ff0f83fb699c59668cfe07efL347-L351

There was also already a test case for keeping escape sequences in html translations:
https://github.com/lit/lit/blob/main/packages/localize-tools/testdata/build-runtime-xliff/input/foo.ts#L59-L60

The previous logic worked as follows:

  1. Extract would parse html templates with html5 -> removing escape sequences, but only from html templates
  2. Write the translations to the XLIFF file -> adding escape sequences as needed for it to be valid XML (all types)
  3. Build would read the XLIFF file from disk -> removing the previously added escape sequences (all types)
  4. escapeTextContentToEmbedInTemplateLiteral -> always adding escape sequences for some characters (all types)

This pull request changes this by no longer removing the escape sequences during extract (it copies the source string instead of using the parser result). The escape sequences therefore no longer need to be added back by escapeTextContentToEmbedInTemplateLiteral, which is flawed.
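
The four steps listed above, and the change proposed here, can be modelled with a small sketch. This is not the lit/localize implementation: all names below are hypothetical stand-ins, only the text content of a message is considered (no markup or placeholders), and only `&`, `<` and `>` are handled.

```typescript
// Hypothetical, simplified model of the extract/build round trip.
type TemplateKind = 'string' | 'str' | 'html';

// Stand-in for "an HTML or XML parser decodes entities".
const decodeEntities = (s: string): string =>
  s.replace(/&lt;/g, '<').replace(/&gt;/g, '>').replace(/&amp;/g, '&');

// Stand-in for "a serializer encodes special characters".
const encodeEntities = (s: string): string =>
  s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');

// Previous flow: steps 1 to 4.
function oldRoundTrip(source: string, kind: TemplateKind): string {
  // 1. Extract: only html templates go through the HTML parser.
  const extracted = kind === 'html' ? decodeEntities(source) : source;
  // 2. Write XLIFF: escape so the file is valid XML.
  const onDisk = encodeEntities(extracted);
  // 3. Build: reading the XLIFF file undoes step 2.
  const read = decodeEntities(onDisk);
  // 4. Final step: always escape again, whatever the template kind.
  return encodeEntities(read);
}

// Proposed flow: keep the source text as written and drop step 4.
function newRoundTrip(source: string, _kind: TemplateKind): string {
  const onDisk = encodeEntities(source);
  return decodeEntities(onDisk);
}

console.log(oldRoundTrip('2 > 1', 'string')); // 2 &gt; 1
console.log(newRoundTrip('2 > 1', 'string')); // 2 > 1
console.log(newRoundTrip('&lt;Hello', 'html')); // &lt;Hello
```

In this model the old flow changes a plain string that never contained an escape sequence, while the proposed flow returns every kind of message exactly as it was written.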

Advantages of this solution:

  • Escape sequences that are in the source are not removed
  • Escape sequences that are not in the source are not added
  • I think it is expected behavior that translations do not mess with escape sequences
  • All types ("", str``, html``) have the same behavior

Disadvantages of this solution:

  • HTML escape sequences in the source are double-escaped in XLIFF files (the & sign needs to be escaped in XLIFF)
  • Changing XLIFF source translations for existing projects might be a breaking change

I would consider this a breaking change, because a project with existing translations would need to fix them by adding back the escape sequences that were previously removed.
If a source string currently contains a &lt; and was already translated, then the target is not double-escaped.
So even if the sources are extracted again, the translations do not contain the double escapes, and during build the escaping would be removed, creating invalid html.
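
A minimal sketch of that breaking case, assuming build emits whatever the XML parser returns once the final re-escaping step is gone. `readXmlText` is a hypothetical stand-in for the XML parser, and the two target strings are made-up examples for a source template html`&lt;Hello`.

```typescript
// Hypothetical stand-in for what an XML parser does to text on read.
const readXmlText = (s: string): string =>
  s.replace(/&lt;/g, '<').replace(/&gt;/g, '>').replace(/&amp;/g, '&');

// Target as stored by an earlier extract/translate cycle (single-escaped on disk).
const legacyTargetOnDisk = '&lt;Hola';
// Target as it would have to be stored after this change (double-escaped on disk).
const updatedTargetOnDisk = '&amp;lt;Hola';

// What would end up inside the generated html template in each case.
const legacyEmitted = readXmlText(legacyTargetOnDisk);
const updatedEmitted = readXmlText(updatedTargetOnDisk);

console.log(legacyEmitted); // <Hola  (a raw "<" in the html template)
console.log(updatedEmitted); // &lt;Hola  (the escape sequence survives)
```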

I think this pull request implements the correct behavior, but if the breaking change is not desired, it might be a good idea to instead apply the escaping in escapeTextContentToEmbedInTemplateLiteral only for html templates.
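
That alternative could look roughly like the sketch below. `MessageKind` and `escapeForKind` are hypothetical names, not part of lit/localize, and only `&`, `<` and `>` are handled here; the real escapeTextContentToEmbedInTemplateLiteral may do more than this.

```typescript
// Hypothetical: which kind of message the text will be embedded in.
type MessageKind = 'string' | 'str' | 'html';

// Apply HTML escaping only when the text is emitted into an html template.
function escapeForKind(text: string, kind: MessageKind): string {
  if (kind !== 'html') {
    // Plain strings and str templates are not HTML, so leave them alone.
    return text;
  }
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

console.log(escapeForKind('2 > 1', 'string')); // 2 > 1
console.log(escapeForKind('2 > 1', 'html')); // 2 &gt; 1
```

This would keep existing XLIFF files valid, at the cost of html templates still not round-tripping exactly as written.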

  await checkTransform(
    'msg(str`Hello <b>${msg("World", {id: "bar"})}</b>!`, {id: "foo"});',
-   '`Hola &lt;b&gt;Mundo&lt;/b&gt;!`;',
+   '`Hola <b>Mundo</b>!`;',
Collaborator

This test does look more correct to me. We should not be doing HTML escaping of expressions that only contain msg() calls, string literals, and str templates.

We do need to be careful about HTML that we emit into html templates, because that is a security boundary for Lit (a maliciously crafted html template can execute arbitrary code; this is fine because html templates are themselves source code).

<trans-unit id="h02c268d9b1fcb031">
<source>&lt;Hello<ph id="0">&lt;b></ph>&lt;World &amp; Friends><ph id="1">&lt;/b></ph>!></source>
<target>&lt;Hola<ph id="0">&lt;b></ph>&lt;Mundo &amp; Amigos><ph id="1">&lt;/b></ph>!></target>
<source>&amp;lt;Hello<ph id="0">&lt;b></ph>&amp;lt;World &amp;amp; Friends&amp;gt;<ph id="1">&lt;/b></ph>!&amp;gt;</source>
Collaborator

This looks less correct. Why is it double-escaped?

Author

TL;DR
The original template for this translation is html`&lt;Hello<b>&lt;World &amp; Friends&gt;</b>!&gt;` which already includes escape sequences. To write this in an XML file, the & symbols need to be escaped.

Long Version:
I am of the opinion that localize should not change how I write my HTML in templates. Let's say I want to put the cent symbol (¢) in a template, but for some reason I need to escape it. I would write: html`&cent;`.

The old version would convert all escape sequences back, so the &cent; would become a ¢ symbol. This is then written to the translation file as a ¢ symbol.
During build it would read the ¢ symbol and output html`¢` to the source. So it is not the same as how I wrote the template.

To fix this, we need to preserve the original escape sequences. So instead of writing the ¢ symbol to the translation file, we need to write &cent;. Doing so will cause the XML serializer to escape the & sign to make it valid XML. This results in &amp;cent; being written to the translation file. If you open this with an XML viewer, you will not see the double escape.

The XML parser will convert the &amp;cent; back to &cent; while reading the file during build. So localize can see the original content and will emit html`&cent;` to the source. This is the behavior I would expect.
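
The &cent; round trip can be traced with a short sketch. `serializeXmlText` and `parseXmlText` are hypothetical stand-ins for the XML serializer and parser, limited to the characters that matter for this example.

```typescript
// Hypothetical stand-in for the XML serializer (write side).
const serializeXmlText = (s: string): string =>
  s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');

// Hypothetical stand-in for the XML parser (read side).
const parseXmlText = (s: string): string =>
  s.replace(/&lt;/g, '<').replace(/&gt;/g, '>').replace(/&amp;/g, '&');

// Text as written inside html`&cent;`.
const written = '&cent;';

// Extract keeps the escape sequence, so the serializer escapes the "&".
const onDisk = serializeXmlText(written);
// Build reads the file; the parser undoes only the XML-level escaping.
const readBack = parseXmlText(onDisk);

console.log(onDisk); // &amp;cent;
console.log(readBack); // &cent;
```

Only the XML-level escaping is added and removed, so the HTML entity chosen by the template author is untouched at both ends.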

Collaborator

That explanation sounds reasonable to me

Collaborator

That makes sense to me. I'm surprised that no other tests needed changing here, just these .xlf files. Do we not use them in any other parts of the flow?

Author

Because the behavior is mostly the same and most of the tests do not include escape sequences, not a lot needed to be changed. If it helps, I could add some tests to check that escape sequences are preserved.

Regarding the question: I do not know exactly. My understanding is that the translation files are updated when extract is run. When build is run, it does an extract first (without updating translations) to check for missing translations and then reads the translations from the files.

@IIIMADDINIII
Author

Is there anything I need to do to get this merged?
I am just asking because I am not so familiar with open source development.

Development

Successfully merging this pull request may close these issues.

[@lit/localize] A translation of "2 > 1" results in "2 &gt; 1" when using msg("2 > 1")
