How to display raw html code in PRE or something like it but without escaping it










56















I'd like to display raw HTML. We all know one has to escape each "<" and ">" like this



 <PRE> this is a test &lt;DIV&gt; </PRE>


However, I do not want to do this. I'd like a way to keep the HTML code as is, since it is easier to read inside the editor, and I might want to copy it and use it again myself as actual HTML code without having to change it again or keep two versions of the same code, one escaped and one not.



Is there any other environment that is more "raw" than PRE that might allow this, maybe in HTML5, so that one does not have to keep editing the HTML and changing everything each time one wants to show some raw HTML code?



Something like <REALLY_REALLY_VERBATIM> ...... </REALLY_REALLY_VERBATIM>



The JavaScript solution does not work on FF 21 (screenshot 1, image not included here).

The first solution still does not work on Firefox (screenshot 2, image not included here).





























  • 3





    Am I the only one to think that it's incredible that we need to be so hacky just to perform such a common task as showing code? I really think that a solution to this problem should be addressed earlier than other new, upcoming but not as useful HTML tags.

    – Nobita
    Nov 26 '14 at 9:39

















html pre














asked May 28 '13 at 4:09 by Robert H
edited May 28 '13 at 5:53 by Robert H



















7 Answers


















89














You can use the xmp element; see What was the <XMP> tag used for?. It has been in HTML since the beginning and is supported by all browsers. Specifications frown upon it, but HTML5 CR still describes it and requires browsers to support it (it also tells authors not to use it, but it cannot really prevent you).



Everything inside xmp is taken as is: no markup (tags or character references) is recognized there, except, for obvious reasons, the end tag of the element itself, </xmp>.



Otherwise xmp is rendered like pre.



When using “real XHTML”, i.e. XHTML served with an XML media type (which is rare), the special parsing rules do not apply, so xmp is treated like pre. But in “real XHTML”, you can use a CDATA section, which implies similar parsing rules. It has no special formatting, so you would probably want to wrap it inside a pre element:



<pre><![CDATA[
This is a demo, tags like <p> will
appear literally.
]]></pre>


I don’t see how you could combine xmp and a CDATA section to achieve so-called polyglot markup.
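A CDATA section has its own version of this restriction: it cannot contain the sequence ]]>, because that sequence ends the section. As a minimal sketch (not part of the original answer; the function name toCdataSection is made up for illustration), a snippet could be wrapped in JavaScript like this, splitting any ]]> across two adjacent sections so that an XML parser reads the original text back:

```javascript
// Hypothetical helper: wraps text in a CDATA section for "real XHTML".
// Any "]]>" inside the text is split across two adjacent sections,
// so the XML parser never sees the terminator inside the content.
function toCdataSection(text) {
  return "<![CDATA[" + text.split("]]>").join("]]]]><![CDATA[>") + "]]>";
}

console.log(toCdataSection("tags like <p> appear literally"));
```

The result would still need to be placed inside a pre element to get preformatted rendering, as described above.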


























  • 3





    +1 excellent! Would you know if this <xmp> is also supported in polyglot etc? Also, is there a (x)(ht)ml version where <![CDATA[<tag>bla & bla</tag>]]> could be used?

    – GitaarLAB
    May 28 '13 at 9:25











  • @GitaarLAB, good questions, I’ll extend my answer.

    – Jukka K. Korpela
    May 28 '13 at 10:00






  • 3





    xmp is an obsolete tag.

    – jlguenego
    Oct 17 '14 at 14:31






  • 2





    @jlguenego, Surely with such a statement you know you need [citation needed]?

    – Pacerier
    Apr 30 '15 at 6:30












  • @JukkaK.Korpela, What does <xmp> stand for?

    – Pacerier
    Apr 30 '15 at 6:34


















22
















Essentially the original question can be broken down in 2 parts:



  • Main objective/challenge: embedding(/transporting) a raw formatted code-snippet
    (any kind of code) in a web-page's markup (for simple copy/paste/edit due to no
    encoding/escaping)

  • correctly displaying/rendering that code-snippet (possibly edit it) in the
    browser

The short (but) ambiguous answer is: you can't, ...but you can (get very close).

(I know, those are 3 contradicting answers, so read on...)



(polyglot)(x)(ht)ml Markup-languages rely on wrapping (almost) everything between begin/opening and end/closing tags/character(sequences).

So, to embed any kind of raw code/snippet inside your markup-language, one will always have to escape/encode every instance (inside that snippet) that resembles the character(-sequence) that would close the wrapping 'container' element in the markup. (During this post I'll refer to this as rule no 1.)

Think of "some "data" here" or <i>..close italics with '</i>'-tag</i>, where it is obvious one should escape/encode (something in) </i and " (or change container's quote-character from " to ').



So, because of rule no 1, you can't 'just' embed 'any' unknown raw code-snippet inside markup.

Because, if one has to escape/encode even one character inside the raw snippet, then that snippet would no longer be the same original 'pure raw code' that anyone can copy/paste/edit in the document's markup without further thought. It would lead to malformed/illegal markup and Mojibake (mainly) because of entities.

Also, should that snippet contain such characters, you'd still need some javascript to 'translate' that character(sequence) from (and to) its escaped/encoded representation to display the snippet correctly in the 'webpage' (for copy/paste/edit).



That brings us to (some of) the datatypes that markup-languages specify. These datatypes essentially define what are considered 'valid characters' and their meaning (per tag, property, etc.):



  • PCDATA (Parsed Character DATA): will expand entities and one must
    escape <, & (and > depending on markup language/version).

    Most tags like body, div, pre, etc, but also textarea (until
    HTML5) fall under this type.

    So not only do you need to encode all the container's closing character-sequences
    inside the snippet, you also have to encode all <, & (,>) characters
    (at minimum).

    Needless to say, encoding/escaping this many characters falls outside this
    objective's scope of embedding a raw snippet in the markup.

    '..But a textarea seems to work...', yes, either because of the browser's
    error-engine trying to make something out of it, or because of HTML5:



  • RCDATA (Replaceable Character DATA): will not treat tags inside the
    text as markup (but is still governed by rule 1), so one doesn't need to
    encode < (>). BUT entities are still expanded, so they and 'ambiguous
    ampersands' (&) need special care.

    The current HTML5 spec says the textarea is now an RCDATA field and (quote):




    The text in raw text and RCDATA elements must not contain any
    occurrences of the string "</" (U+003C LESS-THAN SIGN, U+002F SOLIDUS)
    followed by characters that case-insensitively match the tag name of
    the element followed by one of U+0009 CHARACTER TABULATION (tab),
    U+000A LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN
    (CR), U+0020 SPACE, U+003E GREATER-THAN SIGN (>), or U+002F SOLIDUS (/).




    Thus no matter what, textarea needs a hefty entity translation handler or
    it will eventually Mojibake on entities!



  • CDATA (Character Data): will not treat tags inside the text as
    markup and will not expand entities.

    So as long as the raw snippet code does not violate rule 1 (that one can't
    have the container's closing character(sequence) inside the snippet), this
    requires no other escaping/encoding.
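The restriction quoted from the HTML5 spec above (rule 1 for raw text and RCDATA elements) can be checked mechanically. The following is only a sketch under that reading of the quote; the function name findClosingSequence is hypothetical and not taken from the answer or its fiddles:

```javascript
// Hypothetical helper: returns the index of the first sequence that would
// end a raw text or RCDATA element named tagName, or -1 if there is none.
// Per the quoted rule: "</" + tag name (case-insensitive), followed by
// tab, LF, FF, CR, space, ">" or "/".
function findClosingSequence(snippet, tagName) {
  // Escape regex metacharacters in case the tag name contains any.
  const name = tagName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp("</" + name + "[\\t\\n\\f\\r >/]", "i");
  return snippet.search(pattern);
}

console.log(findClosingSequence("<p>hello</p>", "xmp"));      // no violation
console.log(findClosingSequence("see </XMP> here", "xmp"));   // violation
```

A snippet for which this returns -1 can be embedded in that container without touching it; anything else needs some form of escaping first.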


Clearly this boils down to: how can we minimize the number of characters/character-sequences that still need to be encoded in the snippet's raw source and the number of times that character(sequence) might appear in an average snippet; something that is also of importance for the javascript that handles the translation of these characters (if they occur).



So what 'containers' have this CDATA context?



Most value properties of tags are CDATA, so one could (ab)use a hidden input's value property (proof of concept jsfiddle here).

However (per rule 1) this creates an encoding/escaping problem with nested quotes (" and ') in the raw snippet, and one needs some javascript to get/translate the snippet and set it in another (visible) element (or simply set it as a textarea's value). Somehow this gave me problems with entities in FF (just like in a textarea). But it doesn't really matter, since the 'price' of having to escape/encode nested quotes is higher than that of a (HTML5) textarea (quotes are quite common in source code..).



What about trying to (ab)use <![CDATA[<tag>bla & bla</tag>]]>?

As Jukka points out in his extended answer, this would only work in (rare) 'real xhtml'.

I thought of using a script-tag (with or without such a CDATA wrapper inside the script-tag) together with a multi-line comment /* */ that wraps the raw snippet (script-tags can have an id and you can access them by count). But since this obviously introduces an escaping problem with */, ]]> and </script in the raw snippet, this doesn't seem like a solution either.



Please post other viable 'containers' in the comments to this answer.



By the way, encoding or counting the number of - characters and balancing them out inside a comment tag <!-- --> is just insane for this purpose (apart from rule 1).




That leaves us with Jukka K. Korpela's excellent answer: the <xmp> tag seems the best option!



The 'forgotten' <xmp> holds CDATA, is intended for this purpose AND is indeed still in the current HTML 5 spec (and has been at least since HTML3.2); exactly what we need! It's also widely supported, even in IE6 (that is.. until it suffers from the same regression as the scrolling table-body).

Note: as Jukka pointed out, this will not work in true xhtml or polyglot (that will treat it as a pre) and the xmp tag must still adhere to rule no 1. But that's the 'only' rule.



Consider the following markup:



<!-- ATTENTION: replace any occurrence of &lt;/xmp with </xmp -->
<xmp id="snippet-container">
<div>
<div>this is an example div &amp; holds an xmp tag:<br />
<xmp>
<html><head> <!-- indentation col 0!! -->
<title>My Title</title>
</head><body>
<p>hello world !!</p>
</body></html>
&lt;/xmp> <!-- note this encoded/escaped tag -->
</div>
This line is also part of the snippet
</div>
</xmp>


The above code block illustrates a raw piece of markup where <xmp id="snippet-container"> contains an (almost raw) code-snippet (containing div>div>xmp>html-document).

Notice the encoded closing tag in this markup? To comply with rule no 1, this was encoded/escaped.



So embedding/transporting the (sometimes almost) raw code is/seems solved.



What about displaying/rendering the snippet (and that encoded &lt;/xmp>)?



The browser will (or it should) render the snippet (the contents inside snippet-container) exactly the way you see it in the codeblock above (with some discrepancy amongst browsers whether or not the snippet starts with a blank line).

That includes the formatting/indentation, entities (like the string &amp;), full tags, comments AND the encoded closing tag &lt;/xmp> (just like it was encoded in the markup). And depending on browser(version) one could even try to use the property contenteditable="true" to edit this snippet (all that without javascript enabled). Doing something like textarea.value=xmp.innerHTML is also a breeze.



So you can... if the snippet doesn't contain the container's closing character-sequence.



However, should a raw snippet contain the closing character-sequence </xmp (because it is an example of xmp itself or it contains some regex, etc), you must accept that you have to encode/escape that sequence in the raw snippet AND need a javascript handler to translate that encoding to display/render the encoded &lt;/xmp> like </xmp> inside a textarea (for editing/posting) or (for example) a pre just to correctly render the snippet's code (or so it seems).



A very rudimentary jsfiddle example of this here. Note that getting/embedding/displaying/retrieving-to-textarea worked perfectly even in IE6. But setting the xmp's innerHTML revealed some interesting 'would-be-intelligent' behavior on IE's part. There is a more extensive note and workaround on that in the fiddle.



But now comes the important kicker (another reason why you only get very close):
Just as an over-simplified example, imagine this rabbit-hole:



Intended raw code-snippet:



<!-- remember to translate between </xmp> and &lt;/xmp> -->
<xmp>
<p>a paragraph</p>
</xmp>


Well, to comply with rule 1, we 'only' need to encode those </xmp[> \n\r\t\f/] sequences, right?



So that gives us the following markup (using just a possible encoding):



<xmp id="container">
<!-- remember to translate between &lt;/xmp> and &lt;/xmp> -->
<xmp>
<p>a paragraph</p>
&lt;/xmp>
</xmp>


Hmm.. shalt I get my crystal ball or flip a coin? No, let the computer look at its system-clock and state that a derived number is 'random'. Yes, that should do it..



Using a regex like: xmp.innerHTML.replace(/&lt;(?=\/xmp[> \n\r\t\f\/])/gi, '<');, would translate 'back' to this:



<!-- remember to translate between </xmp> and </xmp> -->
<xmp>
<p>a paragraph</p>
</xmp>


Hmm.. seems this random generator is broken... Houston..?

Should you have missed the joke/problem, read again starting at the 'intended raw code-snippet'.



Wait, I know, we (also) need to encode .... to ....

Ok, rewind to 'intended raw code-snippet' and read again.

Somehow this all begins to smell like the famous hilarious-but-true regex answer on SO, a good read for people fluent in mojibake.



Maybe someone knows a clever algorithm or solution to fix this problem, but I assume that the embedded raw code will get more and more obscure to the point where you'd be better off properly escaping/encoding just your <, & (and >), just like the rest of the world.
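One possible way out of the rabbit-hole, offered here purely as an illustrative sketch and not as something from the answer or its fiddles: make the encoding reversible by pushing every already-escaped-looking sequence one level deeper before escaping the real closing tags. The names encodeForXmp and decodeFromXmp are hypothetical, and the sketch conservatively treats every </xmp as a closing sequence regardless of the character that follows it:

```javascript
// Matches "&lt;", "&amp;lt;", "&amp;amp;lt;", ... directly before "/xmp"
// (the tag name is matched case-insensitively, the entities are not).
const ESCAPED_CLOSE = /&((?:amp;)*)lt;(?=\/[xX][mM][pP])/g;

// Raw snippet -> text that is safe to place inside <xmp> ... </xmp>.
function encodeForXmp(raw) {
  return raw
    // 1. Push sequences that already look escaped one level deeper.
    .replace(ESCAPED_CLOSE, "&amp;$1lt;")
    // 2. Escape every literal "</xmp".
    .replace(/<(?=\/[xX][mM][pP])/g, "&lt;");
}

// Text read back from the xmp element -> original raw snippet.
function decodeFromXmp(encoded) {
  return encoded.replace(ESCAPED_CLOSE, (match, amps) =>
    amps === "" ? "<" : "&" + amps.slice("amp;".length) + "lt;");
}

const raw = "<!-- translate between </xmp> and &lt;/xmp> -->\n<xmp>\n<p>a paragraph</p>\n</xmp>\n";
console.log(decodeFromXmp(encodeForXmp(raw)) === raw);
```

This keeps the round trip intact for the over-simplified example above, but the embedded text is then no longer the untouched raw snippet, which is exactly the trade-off this answer describes.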



Conclusion: (using the xmp tag)



  • it can be done with known snippets that do not contain the container's closing character-sequence,

  • we can get very close to the original objective with known snippets that only use 'basic first-level' escaping/encoding so we don't fall in the rabbithole,

  • but ultimately it seems that one can't do this reliably in a 'production-environment' where people can/should copy/paste/edit 'any unknown' raw snippets while not knowing/understanding the implications/rules/rabbithole (depending on your implementation of handling/translating for rule 1 and the rabbit-hole).

Hope this helps!



PS:
Whilst I would appreciate an upvote if you find this explanation useful, I kind of think Jukka's answer should be the accepted answer (should no better option/answer come along), since he was the one who remembered the xmp tag (that I forgot about over the years and got 'distracted' by the commonly advocated PCDATA elements like pre, textarea, etc.).

This answer originated in explaining why you can't do it (with any unknown raw snippet) and explain some obvious pitfalls that some other (now deleted) answers overlooked when advising a textarea for embedding/transport. I've expanded my existing explanation to also support and further explain Jukka's answer (since all that entity and *CDATA stuff is almost harder than code-pages).































  • What you write is very true and according to the spec, but at the end of the day OP is after a solution which will allow him to copy the text out of an element and use it again. I have tested on Chrome, Firefox and IE, putting all the special characters you mention into the HTML source inside the textarea, and it doesn't want to break. When I copy the value out of the textarea it is always exactly what was in the HTML source originally.

    – Mathijs Flietstra
    May 28 '13 at 6:33






  • 1





    I interpreted the original question as: 'how to have a formatted raw code-snippet inside an element inside a valid html-source' (you also start your answer with: <textarea readonly> <REALLY_REALLY_VERBATIM> ...... </<REALLY_REALLY_VERBATIM> </textarea>). Even without that restriction (so it doesn't matter how the (correct) raw source gets into an element) one still needs an escaping routine, if only to safeguard against </textarea[ >/] (which is rather obvious when you think about that) for example.

    – GitaarLAB
    May 28 '13 at 6:54












  • PS: I am looking into <![CDATA[<tag>bla & bla</tag>]]>, but I'm currently unsure at the moment about the exact rules across markup-languages (html,xhtml,xml,polyglot,etc) and serving-methods.

    – GitaarLAB
    May 28 '13 at 7:08











  • A truly shakespearian presentation! When does the movie come out?

    – Kebman
    Aug 20 '17 at 23:19


















6














Cheap and cheerful answer:



<textarea>Some raw content</textarea>


The textarea will handle tabs, multiple spaces, newlines, line wrapping all verbatim.
It copies and pastes nicely and it's valid HTML all the way. It also allows the user to resize the code box.
You don't need any CSS, JS, escaping, encoding.



You can alter the appearance and behaviour as well.
Here's a monospace font, editing disabled, smaller font, no border:



<textarea
style="width:100%; font-family: Monospace; font-size:10px; border:0;"
rows="30" disabled
>Some raw content</textarea>


This solution is probably not semantically correct. So if you need that, it might be best to choose a more sophisticated answer.





























  • Simpler solution & it does the job!

    – RousseauAlexandre
    Oct 27 '17 at 11:08


















3














echo '<pre>' . htmlspecialchars("<div><b>raw HTML</b></div>") . '</pre>';


I think that's what you're looking for?



In other words, use htmlspecialchars() in PHP
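For pages that build their markup in JavaScript rather than PHP, a rough counterpart could look like the sketch below. It is not the PHP function itself: escapeHtml is a hypothetical name, and the set of five escaped characters is an assumption made for this illustration (which characters htmlspecialchars() escapes depends on its flags).

```javascript
// Hypothetical helper: escapes the characters that are significant in
// HTML text and attribute values so the string displays literally.
function escapeHtml(text) {
  const entities = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#039;",
  };
  return text.replace(/[&<>"']/g, (ch) => entities[ch]);
}

console.log("<pre>" + escapeHtml("<div><b>raw HTML</b></div>") + "</pre>");
```

When the snippet is inserted through the DOM in a browser, assigning it to an element's textContent achieves the same display without any manual escaping.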




































    3














    @GitaarLAB and @Jukka elaborate that <xmp> tag is obsolete, but still the best. When I use it like this



    <xmp>
    <div>Lorem ipsum</div>
    <p>Hello</p>
    </xmp>


    then the first EOL is inserted in the code, and it looks awful.



    It can be solved by removing that EOL



    <xmp><div>Lorem ipsum</div>
    <p>Hello</p>
    </xmp>


    but then it looks bad in the source. I used to solve it with wrapping <div>, but recently I figured out a nice CSS3 rule, I hope it also helps somebody:



    xmp { margin: 5px 0; padding: 0 5px 5px 5px; background: #CCC; }
    xmp:before { content: ""; display: block; height: 1em; margin: 0 -5px -2em -5px; }


    This looks better.




































      2














      xmp is the way to go, i.e.:



      <xmp>
      # your code...
      </xmp>



































        1














        If you have jQuery enabled you can use an escapeXml function and not have to worry about escaping arrows or special characters.



        <pre>
        ${fn:escapeXml('
        <!-- all your code -->
        ')};
        </pre>


























          protected by Community Sep 16 '15 at 14:35



          Thank you for your interest in this question.
          Because it has attracted low-quality or spam answers that had to be removed, posting an answer now requires 10 reputation on this site (the association bonus does not count).



          Would you like to answer one of these unanswered questions instead?














          7 Answers
          7






          active

          oldest

          votes








          7 Answers
          7






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          89














          You can use the xmp element, see What was the <XMP> tag used for?. It has been in HTML since the beginning and is supported by all browsers. Specifications frown upon it, but HTML5 CR still describes it and requires browsers to support it (though it also tells authors not to use it, but it cannot really prevent you).



          Everything inside xmp is taken as such, no markup (tags or character references) is recognized there, except, for apparent reason, the end tag of the element itself, </xmp>.



          Otherwise xmp is rendered like pre.



          When using “real XHTML”, i.e. XHTML served with an XML media type (which is rare), the special parsing rules do not apply, so xmp is treated like pre. But in “real XHTML”, you can use a CDATA section, which implies similar parsing rules. It has no special formatting, so you would probably want to wrap it inside a pre element:



          <pre><![CDATA[
          This is a demo, tags like <p> will
          appear literally.
          ]]></pre>


          I don’t see how you could combine xmp and CDATA section to achieve so-called polyglot markup






          share|improve this answer




















          • 3





            +1 excellent! Would you know if this <xmp> is also supported in polyglot etc? Also, is there a (x)(ht)ml version where <![CDATA[<tag>bla & bla</tag>]]> could be used?

            – GitaarLAB
            May 28 '13 at 9:25











          • @GitaarLAB, good questions, I’ll extend my answer.

            – Jukka K. Korpela
            May 28 '13 at 10:00






          • 3





            xmp is an obsolete tag.

            – jlguenego
            Oct 17 '14 at 14:31






          • 2





            @jlguenego, Surely with such a statement you know you need [citation needed]?

            – Pacerier
            Apr 30 '15 at 6:30












          • @JukkaK.Korpela, What does <xmp> stand for?

            – Pacerier
            Apr 30 '15 at 6:34















          89














          You can use the xmp element, see What was the <XMP> tag used for?. It has been in HTML since the beginning and is supported by all browsers. Specifications frown upon it, but HTML5 CR still describes it and requires browsers to support it (though it also tells authors not to use it, but it cannot really prevent you).



          Everything inside xmp is taken as such, no markup (tags or character references) is recognized there, except, for apparent reason, the end tag of the element itself, </xmp>.



          Otherwise xmp is rendered like pre.



          When using “real XHTML”, i.e. XHTML served with an XML media type (which is rare), the special parsing rules do not apply, so xmp is treated like pre. But in “real XHTML”, you can use a CDATA section, which implies similar parsing rules. It has no special formatting, so you would probably want to wrap it inside a pre element:



          <pre><![CDATA[
          This is a demo, tags like <p> will
          appear literally.
          ]]></pre>


          I don’t see how you could combine xmp and CDATA section to achieve so-called polyglot markup






          share|improve this answer




















          • 3





            +1 excellent! Would you know if this <xmp> is also supported in polyglot etc? Also, is there a (x)(ht)ml version where <![CDATA[<tag>bla & bla</tag>]]> could be used?

            – GitaarLAB
            May 28 '13 at 9:25











          • @GitaarLAB, good questions, I’ll extend my answer.

            – Jukka K. Korpela
            May 28 '13 at 10:00






          • 3





            xmp is an obsolete tag.

            – jlguenego
            Oct 17 '14 at 14:31






          • 2





            @jlguenego, Surely with such a statement you know you need [citation needed]?

            – Pacerier
            Apr 30 '15 at 6:30












          • @JukkaK.Korpela, What does <xmp> stand for?

            – Pacerier
            Apr 30 '15 at 6:34















          edited May 23 '17 at 12:10 by Community

          answered May 28 '13 at 7:17 by Jukka K. Korpela




















          22
















          Essentially the original question can be broken down into two parts:



          • Main objective/challenge: embedding(/transporting) a raw formatted code-snippet
            (any kind of code) in a web-page's markup (for simple copy/paste/edit due to no
            encoding/escaping)

          • correctly displaying/rendering that code-snippet (possibly edit it) in the
            browser

          The short (but ambiguous) answer is: you can't, ...but you can (get very close).

          (I know, those are three contradictory answers, so read on...)



          (polyglot)(x)(ht)ml Markup-languages rely on wrapping (almost) everything between begin/opening and end/closing tags/character(sequences).

          So, to embed any kind of raw code/snippet inside your markup-language, one will always have to escape/encode every instance (inside that snippet) that resembles the character(-sequence) that would close the wrapping 'container' element in the markup. (Throughout this post I'll refer to this as rule no 1.)

          Think of "some "data" here" or <i>..close italics with '</i>'-tag</i>, where it is obvious one should escape/encode (something in) </i and " (or change container's quote-character from " to ').



          So, because of rule no 1, you can't 'just' embed 'any' unknown raw code-snippet inside markup.

          Because, if one has to escape/encode even one character inside the raw snippet, then that snippet would no longer be the same original 'pure raw code' that anyone can copy/paste/edit in the document's markup without further thought. It would lead to malformed/illegal markup and Mojibake (mainly) because of entities.

          Also, should that snippet contain such characters, you'd still need some javascript to 'translate' that character(sequence) from (and to) its escaped/encoded representation to display the snippet correctly in the 'webpage' (for copy/paste/edit).
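
Rule no 1 can be checked mechanically before a snippet is embedded. The sketch below is only an illustration: the function name closesContainer is hypothetical, tagName is assumed to be a plain tag name without regex metacharacters, and the set of terminating characters follows the HTML5 rule for raw text and RCDATA elements (tab, LF, FF, CR, space, > and /).

```javascript
// Hypothetical helper: reports whether `snippet` contains a sequence that
// would close a container element such as <xmp> or <textarea>.
function closesContainer(snippet, tagName) {
  // "</" + tag name (case-insensitive) + tab, LF, FF, CR, space, ">" or "/"
  const pattern = new RegExp("</" + tagName + "[\\t\\n\\f\\r >/]", "i");
  return pattern.test(snippet);
}

console.log(closesContainer("<p>hello</p>", "xmp"));     // false
console.log(closesContainer("<xmp>demo</XMP >", "xmp")); // true
console.log(closesContainer("see </xmpfoo>", "xmp"));    // false
```

A snippet for which this returns false can go into that container untouched, provided the container does not interpret anything else inside it.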



          That brings us to (some of) the datatypes that markup-languages specify. These datatypes essentially define what are considered 'valid characters' and their meaning (per tag, property, etc.):



          • PCDATA (Parsed Character DATA): will expand entities and one must
            escape <, & (and > depending on markup language/version).

            Most tags like body, div, pre, etc, but also textarea (until
            HTML5) fall under this type.

            So not only do you need to encode all the container's closing character-sequences
            inside the snippet, you also have to encode all <, & (,>) characters
            (at minimum).

            Needless to say, encoding/escaping this many characters falls outside this
            objective's scope of embedding a raw snippet in the markup.

            '..But a textarea seems to work...', yes, either because of the browser's
            error handling trying to make something out of it, or because of HTML5:



          • RCDATA (Replaceable Character DATA): will not treat tags inside the
            text as markup (but is still governed by rule 1), so one doesn't need to
            encode < (>). BUT entities are still expanded, so they and 'ambiguous
            ampersands' (&) need special care.

            The current HTML5 spec says the textarea is now an RCDATA field and (quote):




            The text in raw text and RCDATA elements must not contain any
            occurrences of the string "</" (U+003C LESS-THAN SIGN, U+002F SOLIDUS)
            followed by characters that case-insensitively match the tag name of
            the element followed by one of U+0009 CHARACTER TABULATION (tab),
            U+000A LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN
            (CR), U+0020 SPACE, U+003E GREATER-THAN SIGN (>), or U+002F SOLIDUS (/).




            Thus no matter what, textarea needs a hefty entity translation handler or
            it will eventually Mojibake on entities!



          • CDATA (Character Data): will not treat tags inside the text as
            markup and will not expand entities.

            So as long as the raw snippet code does not violate rule 1 (that one can't
            have the container's closing character(sequence) inside the snippet), this
            requires no other escaping/encoding.
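
To make the difference between the three data types concrete, here is a rough sketch of the escaping each one demands. All three function names are hypothetical, tagName is assumed to be a plain tag name, and the RCDATA variant simply encodes every ampersand, which is more than strictly necessary but safe.

```javascript
// PCDATA containers (div, pre, ...): tags and entities are both live,
// so &, < and > all have to be encoded.
function escapePcdata(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// RCDATA containers (HTML5 textarea): entities are live, tags are not,
// except for the container's own end tag (rule 1).
function escapeRcdata(text, tagName) {
  const endTag = new RegExp("<(?=/" + tagName + "[\\t\\n\\f\\r >/])", "gi");
  return text.replace(/&/g, "&amp;").replace(endTag, "&lt;");
}

// Raw text containers (xmp): nothing is interpreted, so the text goes in
// unchanged, unless it breaks rule 1, in which case this sketch gives up.
function rawTextOrThrow(text, tagName) {
  const endTag = new RegExp("</" + tagName + "[\\t\\n\\f\\r >/]", "i");
  if (endTag.test(text)) {
    throw new Error("snippet contains the end tag of <" + tagName + ">");
  }
  return text;
}

const sample = "if (a < b && c > d) {}";
console.log(escapePcdata(sample));
console.log(escapeRcdata(sample, "textarea"));
console.log(rawTextOrThrow(sample, "xmp"));
```

The further down this list a container sits, the closer the embedded source stays to the raw snippet.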


          Clearly this boils down to minimizing two things: the number of characters/character-sequences that still need to be encoded in the snippet's raw source, and how often those sequences appear in an average snippet. Both also matter for the javascript that handles the translation of these characters (if they occur).



          So what 'containers' have this CDATA context?



          Most value properties of tags are CDATA, so one could (ab)use a hidden input's value property (proof of concept jsfiddle here).

          However (per rule 1) this creates an encoding/escape problem with nested quotes (" and ') in the raw snippet, and one needs some javascript to get/translate and set the snippet in another (visible) element (or simply set it as a textarea's value). This also gave me problems with entities in FF (just like in a textarea), presumably because character references are expanded in attribute values as well. But it doesn't really matter, since the 'price' of having to escape/encode nested quotes is higher than that of a (HTML5) textarea (quotes are quite common in source code..).
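
As a rough illustration of that price, the sketch below builds such a hidden input. The function name toHiddenInput is hypothetical, the attribute is assumed to be double-quoted, and id is assumed to be a safe identifier. Ampersands are encoded as well, because character references are expanded inside attribute values.

```javascript
// Hypothetical helper: stores a snippet in the value attribute of a hidden
// input. Every & and every " in the snippet has to be encoded.
function toHiddenInput(snippet, id) {
  const value = snippet.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
  return '<input type="hidden" id="' + id + '" value="' + value + '">';
}

console.log(toHiddenInput('<a href="x.html">A & B</a>', "snippet"));
```

Single quotes, < and > survive untouched here, but any snippet with quoted attributes or string literals gets encoded in many places.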



          What about trying to (ab)use <![CDATA[<tag>bla & bla</tag>]]>?

          As Jukka points out in his extended answer, this would only work in (rare) 'real xhtml'.

          I thought of using a script-tag (with or without such a CDATA wrapper inside the script-tag) together with a multi-line comment /* */ that wraps the raw snippet (script-tags can have an id and you can access them by count). But since this obviously introduces an escaping problem with */, ]]> and </script in the raw snippet, this doesn't seem like a solution either.



          Please post other viable 'containers' in the comments to this answer.



          By the way, encoding or counting the number of - characters and balancing them out inside a comment tag <!-- --> is just insane for this purpose (apart from rule 1).




          That leaves us with Jukka K. Korpela's excellent answer: the <xmp> tag seems the best option!



          The 'forgotten' <xmp> holds CDATA, is intended for this purpose AND is indeed still in the current HTML 5 spec (and has been at least since HTML3.2); exactly what we need! It's also widely supported, even in IE6 (that is.. until it suffers from the same regression as the scrolling table-body).

          Note: as Jukka pointed out, this will not work in true xhtml or polyglot (that will treat it as a pre) and the xmp tag must still adhere to rule no 1. But that's the 'only' rule.



          Consider the following markup:



          <!-- ATTENTION: replace any occurrence of &lt;/xmp with </xmp -->
          <xmp id="snippet-container">
          <div>
          <div>this is an example div &amp; holds an xmp tag:<br />
          <xmp>
          <html><head> <!-- indentation col 0!! -->
          <title>My Title</title>
          </head><body>
          <p>hello world !!</p>
          </body></html>
          &lt;/xmp> <!-- note this encoded/escaped tag -->
          </div>
          This line is also part of the snippet
          </div>
          </xmp>


          The above code block illustrates a raw piece of markup where <xmp id="snippet-container"> contains an (almost raw) code-snippet (containing div>div>xmp>html-document).

          Notice the encoded closing tag in this markup? To comply with rule no 1, this was encoded/escaped.



          So embedding/transporting the (sometimes almost) raw code is/seems solved.



          What about displaying/rendering the snippet (and that encoded &lt;/xmp>)?



          The browser will (or it should) render the snippet (the contents inside snippet-container) exactly the way you see it in the codeblock above (with some discrepancy amongst browsers whether or not the snippet starts with a blank line).

          That includes the formatting/indentation, entities (like the string &amp;), full tags, comments AND the encoded closing tag &lt;/xmp> (just like it was encoded in the markup). And depending on browser(version) one could even try to use the property contenteditable="true" to edit this snippet (all that without javascript enabled). Doing something like textarea.value=xmp.innerHTML is also a breeze.



          So you can... if the snippet doesn't contain the container's closing character-sequence.



          However, should a raw snippet contain the closing character-sequence </xmp (because it is an example of xmp itself or it contains some regex, etc), you must accept that you have to encode/escape that sequence in the raw snippet AND need a javascript handler to translate that encoding to display/render the encoded &lt;/xmp> like </xmp> inside a textarea (for editing/posting) or (for example) a pre just to correctly render the snippet's code (or so it seems).
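
Such a handler can be very small. The following is a minimal sketch, not the code from the fiddle: the function name decodeXmpSnippet is hypothetical, and the input is assumed to be the text read from the container (for example via xmp.innerHTML in a browser).

```javascript
// Matches "&lt;" only when it is directly followed by "/xmp" and one of the
// characters that can complete an end tag (tab, LF, FF, CR, space, ">", "/").
const ENCODED_END_TAG = /&lt;(?=\/xmp[\t\n\f\r >\/])/gi;

// Hypothetical helper: turns every encoded "&lt;/xmp" back into "</xmp".
function decodeXmpSnippet(encoded) {
  return encoded.replace(ENCODED_END_TAG, "<");
}

const stored = "<xmp>\n<p>hello &amp; goodbye</p>\n&lt;/xmp>\n&lt;p>";
console.log(decodeXmpSnippet(stored));
```

Only the encoded end tag is translated. Other entities in the snippet, like &amp; or the trailing &lt;p> in the sample, are left exactly as they were written.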



          A very rudimentary jsfiddle example of this here. Note that getting/embedding/displaying/retrieving-to-textarea worked perfectly even in IE6. But setting the xmp's innerHTML revealed some interesting 'would-be-intelligent' behavior on IE's part. There is a more extensive note and workaround on that in the fiddle.



          But now comes the important kicker (another reason why you only get very close):
          Just as an over-simplified example, imagine this rabbit-hole:



          Intended raw code-snippet:



          <!-- remember to translate between </xmp> and &lt;/xmp> -->
          <xmp>
          <p>a paragraph</p>
          </xmp>


          Well, to comply with rule 1, we 'only' need to encode those </xmp[> \n\r\t\f\/] sequences, right?



          So that gives us the following markup (using just a possible encoding):



          <xmp id="container">
          <!-- remember to translate between &lt;/xmp> and &lt;/xmp> -->
          <xmp>
          <p>a paragraph</p>
          &lt;/xmp>
          </xmp>


          Hmm.. shall I get my crystal ball or flip a coin? No, let the computer look at its system-clock and state that a derived number is 'random'. Yes, that should do it..



          Using a regex like: xmp.innerHTML.replace(/&lt;(?=\/xmp[> \n\r\t\f\/])/gi, '<');, would translate 'back' to this:



          <!-- remember to translate between </xmp> and </xmp> -->
          <xmp>
          <p>a paragraph</p>
          </xmp>


          Hmm.. seems this random generator is broken... Houston..?

          Should you have missed the joke/problem, read again starting at the 'intended raw code-snippet'.



          Wait, I know, we (also) need to encode .... to ....

          Ok, rewind to 'intended raw code-snippet' and read again.

          Somehow this all begins to smell like the famous hilarious-but-true regex answer on SO, a good read for people fluent in mojibake.



          Maybe someone knows a clever algorithm or solution to fix this problem, but I assume that the embedded raw code will get more and more obscure to the point where you'd be better off properly escaping/encoding just your <, & (and >), just like the rest of the world.
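
The rabbit-hole can be reproduced in a few lines, and the sketch below also shows one possible way out. This is an illustration only: all function names are hypothetical, and the reversible scheme (adding one extra amp; level to anything that already looks like an encoded end tag) is just one option, assumed here for demonstration.

```javascript
// Lookahead for "/xmp" followed by tab, LF, FF, CR, space, ">" or "/".
const END = "(?=\\/xmp[\\t\\n\\f\\r >\\/])";

// Naive scheme: encode "</xmp" as "&lt;/xmp" and translate it back.
function naiveEncode(raw) {
  return raw.replace(new RegExp("<" + END, "gi"), "&lt;");
}
function naiveDecode(enc) {
  return enc.replace(new RegExp("&lt;" + END, "gi"), "<");
}

// Reversible sketch: first push existing "&lt;/xmp", "&amp;lt;/xmp", ...
// one level deeper, then encode the real end tags.
function encode(raw) {
  return raw
    .replace(new RegExp("&((?:amp;)*lt;)" + END, "gi"), "&amp;$1")
    .replace(new RegExp("<" + END, "gi"), "&lt;");
}
// Decoding undoes the two steps in reverse order.
function decode(enc) {
  return enc
    .replace(new RegExp("&lt;" + END, "gi"), "<")
    .replace(new RegExp("&amp;((?:amp;)*lt;)" + END, "gi"), "&$1");
}

const raw =
  "<!-- translate between </xmp> and &lt;/xmp> -->\n" +
  "<xmp>\n<p>a paragraph</p>\n</xmp>";

console.log(naiveDecode(naiveEncode(raw)) === raw); // false
console.log(decode(encode(raw)) === raw);           // true
```

Even when the round trip works, the stored markup drifts further from the raw snippet with every nesting level, so this does not change the conclusion below.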



          Conclusion: (using the xmp tag)



          • it can be done with known snippets that do not contain the container's closing character-sequence,

          • we can get very close to the original objective with known snippets that only use 'basic first-level' escaping/encoding so we don't fall in the rabbithole,

          • but ultimately it seems that one can't do this reliably in a 'production-environment' where people can/should copy/paste/edit 'any unknown' raw snippets while not knowing/understanding the implications/rules/rabbithole (depending on your implementation of handling/translating for rule 1 and the rabbit-hole).

          Hope this helps!



          PS:
          Whilst I would appreciate an upvote if you find this explanation useful, I kind of think Jukka's answer should be the accepted answer (should no better option/answer come along), since he was the one who remembered the xmp tag (that I forgot about over the years and got 'distracted' by the commonly advocated PCDATA elements like pre, textarea, etc.).

          This answer originated in explaining why you can't do it (with any unknown raw snippet) and in explaining some obvious pitfalls that some other (now deleted) answers overlooked when advising a textarea for embedding/transport. I've expanded my existing explanation to also support and further explain Jukka's answer (since all that entity and *CDATA stuff is almost harder than code-pages).































          • What you write is very true and according to the spec, but at the end of the day OP is after a solution which will allow him to copy the text out of an element and use it again. I have tested on Chrome, Firefox and IE, putting all the special characters you mention into the HTML source inside the textarea, and it doesn't want to break. When I copy the value out of the textarea it is always exactly what was in the HTML source originally.

            – Mathijs Flietstra
            May 28 '13 at 6:33






          • 1





            I interpreted the original question as: 'how to have a formatted raw code-snippet inside an element inside a valid html-source' (you also start your answer with: <textarea readonly> <REALLY_REALLY_VERBATIM> ...... </<REALLY_REALLY_VERBATIM> </textarea>). Even without that restriction (so it doesn't matter how the (correct) raw source gets into an element) one still needs an escaping routine, if only to safeguard against </textarea[ >/] (which is rather obvious when you think about that) for example.

            – GitaarLAB
            May 28 '13 at 6:54












          • PS: I am looking into <![CDATA[<tag>bla & bla</tag>]]>, but I'm currently unsure at the moment about the exact rules across markup-languages (html,xhtml,xml,polyglot,etc) and serving-methods.

            – GitaarLAB
            May 28 '13 at 7:08











          • A truly shakespearian presentation! When does the movie come out?

            – Kebman
            Aug 20 '17 at 23:19















          22
















          Essentially the original question can be broken down in 2 parts:



          • Main objective/challenge: embedding(/transporting) a raw formatted code-snippet
            (any kind of code) in a web-page's markup (for simple copy/paste/edit due to no
            encoding/escaping)

          • correctly displaying/rendering that code-snippet (possibly edit it) in the
            browser

          The short (but) ambiguous answer is: you can't, ...but you can (get very close).

          (I know, that are 3 contradicting answers, so read on...)



          (polyglot)(x)(ht)ml Markup-languages rely on wrapping (almost) everything between begin/opening and end/closing tags/character(sequences).

          So, to embed any kind of raw code/snippet inside your markup-language, one will always have to escape/encode every instance (inside that snippet) that resembles the character(-sequence) that would close the wrapping 'container' element in the markup. (During this post I'll refer to this as rule no 1.)

          Think of "some "data" here" or <i>..close italics with '</i>'-tag</i>, where it is obvious one should escape/encode (something in) </i and " (or change container's quote-character from " to ').



          So, because of rule no 1, you can't 'just' embed 'any' unknown raw code-snippet inside markup.

          Because, if one has to escape/encode even one character inside the raw snippet, then that snippet would no longer be the same original 'pure raw code' that anyone can copy/paste/edit in the document's markup without further thought. It would lead to malformed/illegal markup and Mojibake (mainly) because of entities.

          Also, should that snippet contain such characters, you'd still need some javascript to 'translate' that character(sequence) from (and to) it's escaped/encoded representation to display the snippet correctly in the 'webpage' (for copy/paste/edit).



          That brings us to (some of) the datatypes that markup-languages specify. These datatypes essentially define what are considered 'valid characters' and their meaning (per tag, property, etc.):



          • PCDATA (Parsed Character DATA): will expand entities and one must
            escape <, & (and > depending on markup language/version).

            Most tags like body, div, pre, etc, but also textarea (until
            HTML5) fall under this type.

            So not only do you need to encode all the container's closing character-sequences
            inside the snippet, you also have to encode all <, & (,>) characters
            (at minimum).

            Needless to say, encoding/escaping this many characters falls outside this
            objective's scope of embedding a raw snippet in the markup.

            '..But a textarea seems to work...', yes, either because of the browsers
            error-engine trying to make something out of it, or because HTML5:



          • RCDATA (Replaceable Character DATA): will not not treat tags inside the
            text as markup (but are still governed by rule 1), so one doesn't need to
            encode < (>). BUT entities are still expanded, so they and 'ambiguous
            ampersands' (&) need special care.

            The current HTML5 spec says the textarea is now a RCDATA field and (quote):




            The text in raw text and RCDATA elements must not contain any
            occurrences of the string "</" (U+003C LESS-THAN SIGN, U+002F SOLIDUS)
            followed by characters that case-insensitively match the tag name of
            the element followed by one of U+0009 CHARACTER TABULATION (tab),
            U+000A LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN
            (CR), U+0020 SPACE, U+003E GREATER-THAN SIGN (>), or U+002F SOLIDUS (/).




            Thus no matter what, textarea needs a hefty entity translation handler or
            it will eventually Mojibake on entities!



          • CDATA (Character Data) will not treat tags inside the text as
            markup and will not expand entities
            .

            So as long as the raw snippet code does not violate rule 1 (that one can't
            have the containers closing character(sequence) inside the snippet), this
            requires no other escaping/encoding.


          Clearly this boils down to: how can we minimize the number of characters/character-sequences that still need to be encoded in the snippet's raw source and the number of times that character(sequence) might appear in an average snippet; something that is also of importance for the javascript that handles the translation of these characters (if they occur).



          So what 'containers' have this CDATA context?



          Most value properties of tags are CDATA, so one could (ab)use a hidden input's value property (proof of concept jsfiddle here).

          However (conform rule 1) this creates an encoding/escape problem with nested quotes (" and ') in the raw snippet and one needs some javascript to get/translate and set the snippet in another (visible) element (or simply setting it as a text-area's value). Somehow this gave me problems with entities in FF (just like in a textarea). But it doesn't really matter, since the 'price' of having to escape/encode nested quotes is higher then a (HTML5) textarea (quotes are quite common in source code..).



          What about trying to (ab)use <![CDATA[<tag>bla & bla</tag>]]>?

          As Jukka points out in his extended answer, this would only work in (rare) 'real xhtml'.

          I thought of using a script-tag (with or without such a CDATA wrapper inside the script-tag) together with a multi-line comment /* */ that wraps the raw snippet (script-tags can have an id and you can access them by count). But since this obviously introduces a escaping problem with */, ]]> and </script in the raw snippet, this doesn't seem like a solution either.



          Please post other viable 'containers' in the comments to this answer.



          By the way, encoding or counting the number of - characters and balancing them out inside a comment tag <!-- --> is just insane for this purpose (apart from rule 1).




          That leaves us with Jukka K. Korpela's excellent answer: the <xmp> tag seems the best option!



          The 'forgotten' <xmp> holds CDATA, is intended for this purpose AND is indeed still in the current HTML 5 spec (and has been at least since HTML3.2); exactly what we need! It's also widely supported, even in IE6 (that is.. until it suffers from the same regression as the scrolling table-body).

          Note: as Jukka pointed out, this will not work in true xhtml or polyglot (that will treat it as a pre) and the xmp tag must still adhere to rule no 1. But that's the 'only' rule.



          Consider the following markup:



          <!-- ATTENTION: replace any occurrence of &lt;/xmp with </xmp -->
          <xmp id="snippet-container">
          <div>
          <div>this is an example div &amp; holds an xmp tag:<br />
          <xmp>
          <html><head> <!-- indentation col 0!! -->
          <title>My Title</title>
          </head><body>
          <p>hello world !!</p>
          </body></html>
          &lt;/xmp> <!-- note this encoded/escaped tag -->
          </div>
          This line is also part of the snippet
          </div>
          </xmp>


          The above codeblok illustrates a raw piece of markup where <xmp id="snippet-container"> contains an (almost raw) code-snippet (containing div>div>xmp>html-document).

          Notice the encoded closing tag in this markup? To comply with rule no 1, this was encoded/escaped).



          So embedding/transporting the (sometimes almost) raw code is/seems solved.



          What about displaying/rendering the snippet (and that encoded &lt;/xmp>)?



          The browser will (or it should) render the snippet (the contents inside snippet-container) exactly the way you see it in the codeblock above (with some discrepancy amongst browsers whether or not the snippet starts with a blank line).

          That includes the formatting/indentation, entities (like the string &amp;), full tags, comments AND the encoded closing tag &lt;/xmp> (just like it was encoded in the markup). And depending on browser(version) one could even try use the property contenteditable="true" to edit this snippet (all that without javascript enabled). Doing something like textarea.value=xmp.innerHTML is also a breeze.



          So you can... if the snippet doesn't contain the containers closing character-sequence.



          However, should a raw snippet contain the closing character-sequence </xmp (because it is an example of xmp itself or it contains some regex, etc), you must accept that you have to encode/escape that sequence in the raw snippet AND need a javascript handler to translate that encoding to display/render the encoded &lt;/xmp> like </xmp> inside a textarea (for editing/posting) or (for example) a pre just to correctly render the snippet's code (or so it seems).



          A very rudimentary jsfiddle example of this here. Note that getting/embedding/displaying/retrieving-to-textarea worked perfect even in IE6. But setting the xmp's innerHTML revealed some interesting 'would-be-intelligent' behavior on IE's part. There is a more extensive note and workaround on that in the fiddle.



          But now comes the important kicker (another reason why you only get very close):
          Just as an over-simplified example, imagine this rabbit-hole:



          Intended raw code-snippet:



          <!-- remember to translate between </xmp> and &lt;/xmp> -->
          <xmp>
          <p>a paragraph</p>
          </xmp>


          Well, to comply with rule 1, we 'only' need to encode those </xmp[> nrtf/] sequences, right?



          So that gives us the following markup (using just a possible encoding):



          <xmp id="container">
          <!-- remember to translate between &lt;/xmp> and &lt;/xmp> -->
          <xmp>
          <p>a paragraph</p>
          &lt;/xmp>
          </xmp>


          Hmm.. shalt I get my crystal ball or flip a coin? No, let the computer look at its system-clock and state that a derived number is 'random'. Yes, that should do it..



          Using a regex like: xmp.innerHTML.replace(/&lt;(?=/xmp[> nrtf/])/gi, '<');, would translate 'back' to this:



          <!-- remember to translate between </xmp> and </xmp> -->
          <xmp>
          <p>a paragraph</p>
          </xmp>


          Hmm.. seems this random generator is broken... Houston..?

          Should you have missed the joke/problem, read again starting at the 'intended raw code-snippet'.



          Wait, I know, we (also) need to encode .... to ....

          Ok, rewind to 'intended raw code-snippet' and read again.

          Somehow this all begins to smell like the famous hilarious-but-true rexgex-answer on SO, a good read for people fluent in mojibake.



          Maybe someone knows a clever algorithm or solution to fix this problem, but I assume that the embedded raw code will get more and more obscure to the point where you'd be better of properly escaping/encoding just your <, & (and >), just like the rest of the world.



          Conclusion: (using the xmp tag)



          • it can be done with known snippets that do not contain the container's closing character-sequence,

          • we can get very close to the original objective with known snippets that only use 'basic first-level' escaping/encoding, so we don't fall into the rabbit-hole,

          • but ultimately it seems that one can't do this reliably in a 'production-environment' where people can/should copy/paste/edit 'any unknown' raw snippets without knowing/understanding the implications/rules/rabbit-hole (depending on your implementation of handling/translating for rule 1 and the rabbit-hole).
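The first bullet can be checked mechanically before a snippet is embedded. A minimal sketch, assuming the same delimiter set as the regex above; `canEmbedVerbatimInXmp` is a hypothetical name, not something from the fiddle:

```javascript
// Hypothetical helper: true when the snippet contains no "</xmp" followed by
// tab, LF, FF, CR, space, ">" or "/", i.e. when rule 1 is not violated.
function canEmbedVerbatimInXmp(snippet) {
  return !/<\/xmp[> \n\r\t\f\/]/i.test(snippet);
}

console.log(canEmbedVerbatimInXmp('<p>a paragraph</p>')); // true
console.log(canEmbedVerbatimInXmp('<xmp>demo</XMP >'));   // false
```

A snippet that fails this check is one of the cases where you end up choosing between first-level encoding and plain old escaping.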

          Hope this helps!



          PS:
          Whilst I would appreciate an upvote if you find this explanation useful, I kind of think Jukka's answer should be the accepted answer (should no better option/answer come along), since he was the one who remembered the xmp tag (that I forgot about over the years and got 'distracted' by the commonly advocated PCDATA elements like pre, textarea, etc.).

          This answer originated as an explanation of why you can't do it (with any unknown raw snippet) and of some obvious pitfalls that some other (now deleted) answers overlooked when advising a textarea for embedding/transport. I've expanded my existing explanation to also support and further explain Jukka's answer (since all that entity and *CDATA stuff is almost harder than code-pages).






          • What you write is very true and according to the spec, but at the end of the day OP is after a solution which will allow him to copy the text out of an element and use it again. I have tested on Chrome, Firefox and IE, putting all the special characters you mention into the HTML source inside the textarea, and it doesn't want to break. When I copy the value out of the textarea it is always exactly what was in the HTML source originally.

            – Mathijs Flietstra
            May 28 '13 at 6:33






          • I interpreted the original question as: 'how to have a formatted raw code-snippet inside an element inside a valid html-source' (you also start your answer with: <textarea readonly> <REALLY_REALLY_VERBATIM> ...... </<REALLY_REALLY_VERBATIM> </textarea>). Even without that restriction (so it doesn't matter how the (correct) raw source gets into an element) one still needs an escaping routine, if only to safeguard against </textarea[ >/] (which is rather obvious when you think about that) for example.

            – GitaarLAB
            May 28 '13 at 6:54












          • PS: I am looking into <![CDATA[<tag>bla & bla</tag>]]>, but I'm currently unsure about the exact rules across markup-languages (html,xhtml,xml,polyglot,etc) and serving-methods.

            – GitaarLAB
            May 28 '13 at 7:08











          • A truly shakespearian presentation! When does the movie come out?

            – Kebman
            Aug 20 '17 at 23:19





























































          edited May 23 '17 at 12:34 by Community

          answered May 28 '13 at 6:04 by GitaarLAB












          • What you write is very true and according to the spec, but at the end of the day OP is after a solution which will allow him to copy the text out of an element and use it again. I have tested on Chrome, Firefox and IE, putting all the special characters you mention into the HTML source inside the textarea, and it doesn't want to break. When I copy the value out of the textarea it is always exactly what was in the HTML source originally.

            – Mathijs Flietstra
            May 28 '13 at 6:33






          • 1





            I interpreted the original question as: 'how to have a formatted raw code-snippet inside an element inside a valid html-source' (you also start your answer with: <textarea readonly> <REALLY_REALLY_VERBATIM> ...... </<REALLY_REALLY_VERBATIM> </textarea>). Even without that restriction (so it doesn't matter how the (correct) raw source gets into an element) one still needs an escaping routine, if only to safeguard against </textarea[ >/] (which is rather obvious when you think about that) for example.

            – GitaarLAB
            May 28 '13 at 6:54












          • PS: I am looking into <![CDATA[<tag>bla & bla</tag>]]>, but I'm currently unsure at the moment about the exact rules across markup-languages (html,xhtml,xml,polyglot,etc) and serving-methods.

            – GitaarLAB
            May 28 '13 at 7:08











          • A truly shakespearian presentation! When does the movie come out?

            – Kebman
            Aug 20 '17 at 23:19

















          • What you write is very true and according to the spec, but at the end of the day OP is after a solution which will allow him to copy the text out of an element and use it again. I have tested on Chrome, Firefox and IE, putting all the special characters you mention into the HTML source inside the textarea, and it doesn't want to break. When I copy the value out of the textarea it is always exactly what was in the HTML source originally.

            – Mathijs Flietstra
            May 28 '13 at 6:33






          • 1





            I interpreted the original question as: 'how to have a formatted raw code-snippet inside an element inside a valid html-source' (you also start your answer with: <textarea readonly> <REALLY_REALLY_VERBATIM> ...... </<REALLY_REALLY_VERBATIM> </textarea>). Even without that restriction (so it doesn't matter how the (correct) raw source gets into an element) one still needs an escaping routine, if only to safeguard against </textarea[ >/] (which is rather obvious when you think about that) for example.

            – GitaarLAB
            May 28 '13 at 6:54












          • PS: I am looking into <![CDATA[<tag>bla & bla</tag>]]>, but I'm currently unsure at the moment about the exact rules across markup-languages (html,xhtml,xml,polyglot,etc) and serving-methods.

            – GitaarLAB
            May 28 '13 at 7:08











          • A truly shakespearian presentation! When does the movie come out?

            – Kebman
            Aug 20 '17 at 23:19



























          6














          Cheap and cheerful answer:



          <textarea>Some raw content</textarea>


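          The textarea will handle tabs, multiple spaces, newlines, and line wrapping, all verbatim.
          It copies and pastes nicely and it's valid HTML all the way. It also allows the user to resize the code box.
          You don't need any CSS or JS, and most content needs no escaping or encoding (the exceptions are a literal </textarea and character references such as &amp;, which the browser decodes).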



          You can alter the appearance and behaviour as well.
          Here's a monospace font, editing disabled, smaller font, no border:



          <textarea
          style="width:100%; font-family: Monospace; font-size:10px; border:0;"
          rows="30" disabled
          >Some raw content</textarea>


          This solution is probably not semantically correct. So if you need that, it might be best to choose a more sophisticated answer.
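One caveat, as a comment on another answer in this thread points out: a literal </textarea followed by a space, / or > inside the content ends the element early. Below is a minimal sketch of a check for that, in JavaScript. The function name closesTextareaEarly is made up for illustration and is not part of any library.

```javascript
// Hypothetical helper (not a library function): reports whether `source`
// contains a sequence that the HTML parser would treat as the end tag
// of an enclosing <textarea>.
function closesTextareaEarly(source) {
  // "</textarea" followed by whitespace, "/" or ">" ends the element.
  // Tag names are matched case-insensitively, hence the "i" flag.
  return /<\/textarea[\t\n\f\r \/>]/i.test(source);
}

console.log(closesTextareaEarly("<div><b>raw HTML</b></div>"));          // false
console.log(closesTextareaEarly("<form><textarea>x</textarea></form>")); // true
```

If the check returns true, the content has to be escaped (for example by replacing < with &lt;) before it is placed inside the textarea.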
















          answered Feb 8 '17 at 3:41 by Henry












          • Simpler solution & it does the job!

            – RousseauAlexandre
            Oct 27 '17 at 11:08




























          3














          echo '<pre>' . htmlspecialchars("<div><b>raw HTML</b></div>") . '</pre>';


          I think that's what you're looking for?



          In other words, use htmlspecialchars() in PHP
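If the page is built in JavaScript rather than PHP, the same idea can be sketched as follows. This is only an approximation of what htmlspecialchars() does, under the assumption that escaping the five characters &, <, >, " and ' is enough for the use case. The name escapeHtml is hypothetical and is not a built-in function.

```javascript
// Hypothetical helper (not a built-in): replaces the characters that have
// special meaning in HTML with character references, so the browser shows
// the markup as text instead of rendering it.
function escapeHtml(text) {
  const entities = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#039;",
  };
  // A single pass over the string, so "&" is never escaped twice.
  return text.replace(/[&<>"']/g, (ch) => entities[ch]);
}

const raw = "<div><b>raw HTML</b></div>";
console.log("<pre>" + escapeHtml(raw) + "</pre>");
// prints: <pre>&lt;div&gt;&lt;b&gt;raw HTML&lt;/b&gt;&lt;/div&gt;</pre>
```

Doing the replacement in one pass with a lookup table avoids the usual ordering pitfall of chained replacements, where an ampersand produced by an earlier step gets escaped again.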
















              answered Apr 7 '15 at 21:18 by tribulant





















                  3














                  @GitaarLAB and @Jukka explain that the <xmp> tag is obsolete, but still the best option. When I use it like this



                  <xmp>
                  <div>Lorem ipsum</div>
                  <p>Hello</p>
                  </xmp>


                  then the first EOL is inserted in the code, and it looks awful.



                  It can be solved by removing that EOL



                  <xmp><div>Lorem ipsum</div>
                  <p>Hello</p>
                  </xmp>


                  but then it looks bad in the source. I used to solve it with a wrapping <div>, but recently I figured out a nice CSS3 rule; I hope it also helps somebody:



                  xmp { margin: 5px 0; padding: 0 5px 5px 5px; background: #CCC; }
                  xmp:before { content: ""; display: block; height: 1em; margin: 0 -5px -2em -5px; }


                  This looks better.
















                      answered Oct 4 '15 at 16:18 by Jan Turoň





















                          2














                          xmp is the way to go, e.g.:



                          <xmp>
                          # your code...
                          </xmp>















                              answered May 28 '17 at 10:18 by Pedro Lobito





















                                  1














                                  If you are using JSP with the JSTL functions tag library (fn:escapeXml comes from JSTL, not jQuery), you can use the escapeXml function and not have to worry about escaping angle brackets or special characters.

                                  <pre>
                                  ${fn:escapeXml('
                                  <!-- all your code -->
                                  ')}
                                  </pre>















                                      answered Dec 17 '14 at 23:22 by PanicBus















                                          protected by Community Sep 16 '15 at 14:35


