Revert unnecessary "fix handling of unicode when counting codeblock lines" + document (#30368)

Michael Sloan created 9 months ago

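After merging #30364 I realized the fix was unnecessary and that the
original code was more efficient. In UTF-8, every byte of a multi-byte
character has its high bit set, so bytes in the 0-127 ASCII range only
ever encode ASCII characters. Counting newline bytes therefore gives the
same result as counting newline chars. This reverts that change and
documents why the byte-based code is valid.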
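
The reasoning can be sanity-checked with a small standalone sketch. It is
not part of this change: the function names and the sample string are
hypothetical, chosen only to compare the byte-based count against the
char-based count on text containing multi-byte characters.

```rust
/// Counts newlines by scanning raw bytes (the approach this commit restores).
fn count_newlines_bytes(text: &str) -> usize {
    text.bytes().filter(|b| *b == b'\n').count()
}

/// Counts newlines by decoding chars (the approach being reverted).
fn count_newlines_chars(text: &str) -> usize {
    text.chars().filter(|c| *c == '\n').count()
}

/// Returns true if every byte of every multi-byte char is >= 0x80,
/// i.e. no multi-byte sequence contains a byte from the ASCII range.
fn multibyte_bytes_are_non_ascii(text: &str) -> bool {
    text.chars().all(|c| {
        let mut buf = [0u8; 4];
        let encoded = c.encode_utf8(&mut buf);
        encoded.len() == 1 || encoded.bytes().all(|b| b >= 0x80)
    })
}

fn main() {
    // Hypothetical code block content mixing ASCII with 2-, 3-, and 4-byte chars.
    let sample = "fn main() {\n    // héllo, 世界 🦀\n}\n";

    // Both strategies agree because b'\n' (0x0A) can never appear
    // inside a multi-byte UTF-8 sequence.
    assert_eq!(count_newlines_bytes(sample), count_newlines_chars(sample));
    assert!(multibyte_bytes_are_non_ascii(sample));

    println!("{} newlines", count_newlines_bytes(sample));
}
```

The byte version avoids decoding each code point, which is presumably the
efficiency difference mentioned above.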

Release Notes:

- N/A

Change summary

crates/markdown/src/markdown.rs | 1 +
crates/markdown/src/parser.rs   | 5 +++--
2 files changed, 4 insertions(+), 2 deletions(-)

Detailed changes

crates/markdown/src/markdown.rs

@@ -223,6 +223,7 @@ impl Markdown {
     }
 
     pub fn escape(s: &str) -> Cow<str> {
+        // Valid to use bytes since multi-byte UTF-8 doesn't use ASCII chars.
         let count = s
             .bytes()
             .filter(|c| *c == b'\n' || c.is_ascii_punctuation())

crates/markdown/src/parser.rs

@@ -79,9 +79,10 @@ pub fn parse_markdown(
                         let content_range =
                             content_range.start + range.start..content_range.end + range.start;
 
+                        // Valid to use bytes since multi-byte UTF-8 doesn't use ASCII chars.
                         let line_count = text[content_range.clone()]
-                            .chars()
-                            .filter(|c| *c == '\n')
+                            .bytes()
+                            .filter(|c| *c == b'\n')
                             .count();
                         let metadata = CodeBlockMetadata {
                             content_range,