Forum:Meta: Markdown bug when text content has HTML tags
7.9 years ago
Ram 43k

Hello all,

I just noticed this: Markdown-style hyperlinks don't work when the content in the main text area contains HTML. For example, when you close a post (moderate->close) and then edit the moderation comment to add links, the edit page comes up with the content already populated with HTML tags (such as the <p> tag).

I got the links working once the tags were removed. I think the Markdown parser gets into a conflict with a different HTML parser and that's what causes this. Any thoughts on how we can address this?

Thank you!

-- Ram

meta bug • 1.9k views

Issues of this type should also be opened on the GitHub issue page and cross-linked. While it is fine to talk about them here, I might occasionally miss some of these topics.

I did think GitHub, but I'm going to use my usual "run-it-by-community" excuse for the last time now. The next time I am in a similar situation, GitHub will be my go-to spot.

"Missing content in old posts?" is one such case that I faced recently.

7.9 years ago

We have posts from three different versions of Biostar: an old Markdown, a CKEditor-based HTML, and now a new-style "CommonMark" type of Markdown.

Markdown does allow mixing HTML into the text, but it will not apply Markdown-specific instructions to content placed inside HTML tags.

For a few years posts were created directly in HTML rather than Markdown. These were not automatically converted to Markdown, since it was not clear that an automatic conversion would handle all cases correctly, and the goal was to protect the existing content as is.

When editing an older post that started out as HTML, one also needs to remove the simple formatting tags such as <p> (especially the root tag that makes the whole post HTML) in order to make use of Markdown instructions.
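
As a rough illustration of that manual cleanup, here is a minimal Python sketch that strips wrapper tags so the remaining text can be read as Markdown. The function name `strip_simple_tags` and the choice of which tags count as "simple formatting" (`p`, `div`, `span`, `br`) are assumptions made for this example, not something Biostar itself does.

```python
import re

# Assumption: these are the "simple formatting" wrappers worth dropping.
_SIMPLE_TAGS = re.compile(r"</?(?:p|div|span)(?:\s[^>]*)?>", re.IGNORECASE)
_BREAK = re.compile(r"<br\s*/?>", re.IGNORECASE)


def strip_simple_tags(html: str) -> str:
    """Hypothetical helper: remove wrapper tags, keep the text between them."""
    # A closing </p> marks a paragraph boundary, so turn it into a blank line.
    text = re.sub(r"</p\s*>", "\n\n", html, flags=re.IGNORECASE)
    # Line breaks become plain newlines.
    text = _BREAK.sub("\n", text)
    # Drop the remaining opening/closing wrapper tags.
    text = _SIMPLE_TAGS.sub("", text)
    # Collapse runs of three or more newlines and trim both ends.
    return re.sub(r"\n{3,}", "\n\n", text).strip()


example = "<p>Closed as duplicate.</p><p>See [this post](https://example.org/p/1).</p>"
print(strip_simple_tags(example))
```

Once the `<p>` wrappers are gone, the `[this post](...)` link is ordinary Markdown text again. Tags outside the short list above (for example `<pre>`) are deliberately left untouched.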

This makes total sense. The content of posts and comments is stored in a DB, correct? Would it be possible to use an HTML-to-MD converter that would replace formatting and anchor tags, perhaps? I am sure you have considered this. What are the major challenges?

There is a pretty handy library for this. I tested it out and it seems to work:

https://github.com/aaronsw/html2text

But that alone would not fix a good number of cases, especially those with iframes, which would require a different parser to be written. Since this was not a full solution, we have never applied it.
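
To make the gap concrete, here is a small sketch of the kind of conversion being discussed. It does not use html2text; it relies only on Python's standard `html.parser`, and the class `SimpleHtmlToMarkdown` and function `convert` are hypothetical names invented for this example. It handles only `<p>` and `<a>`, and it merely records iframes rather than converting them, which is where a dedicated parser would still be needed.

```python
from html.parser import HTMLParser


class SimpleHtmlToMarkdown(HTMLParser):
    """Hypothetical converter: handles <p> and <a> only, and records iframes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []      # output fragments, joined at the end
        self.href = None     # href of the currently open <a>, if any
        self.link_text = []  # text collected inside the open <a>
        self.iframes = []    # src values of iframes left unconverted

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            self.href = attrs.get("href") or ""
            self.link_text = []
        elif tag == "iframe":
            # Not converted: just remember it so it can be reviewed by hand.
            self.iframes.append(attrs.get("src") or "")

    def handle_endtag(self, tag):
        if tag == "a" and self.href is not None:
            self.parts.append(f"[{''.join(self.link_text)}]({self.href})")
            self.href = None
        elif tag == "p":
            # End of paragraph becomes a blank line in Markdown.
            self.parts.append("\n\n")

    def handle_data(self, data):
        if self.href is not None:
            self.link_text.append(data)
        else:
            self.parts.append(data)


def convert(html):
    """Return (markdown_text, list_of_unconverted_iframe_srcs)."""
    parser = SimpleHtmlToMarkdown()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip(), parser.iframes
```

A call such as `convert('<p>See <a href="https://example.org/x">this</a>.</p>')` returns the Markdown text together with an empty iframe list. A post with embedded content would come back with a non-empty list, flagging it as a case the simple conversion cannot finish.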

There are iframes in posts? Wow!

Only when content is embedded: gist, YouTube, Twitter.

Ah. I thought embedding used <embed> or <object> tags. I just noticed that W3Schools now recommends <iframe>s. This is challenging stuff!
