Org parser fails with "/italics in quotes/" #2513

Closed
conklech opened this Issue · 3 comments

2 participants

@conklech

The following text is valid Org markup: "/text/". It represents a quotation mark in roman, the word "text" in italics, and a quotation mark in roman. Pandoc parses that literal string correctly on its own, but fails once it appears inside a larger context: for example, the text X "/text/" X, or even just the quoted word surrounded by spaces, produces output with slashes rather than italics.

Let me know if this report isn't sufficiently clear. I may take some time to poke at the parser to see if there's a simple fix. Org-mode's own pretty-printing parser is kind of touchy with markup and special characters; for example the text /"text"/, i.e. with the quotation marks italicized, is apparently not valid Org markup.

@tarleb

Pandoc mostly tries to follow Emacs' Org-Mode parser in what it recognizes as markup. As you noted, this has its downsides: Org-Mode is a little weird in that regard.

As to the italics in quotes issue: I'll need a few more details, as I haven't been able to reproduce this yet. Do you have some example code I could use? Also, just to be on the safe side: Did you make sure that pandoc knows that the input file is in Org format (e.g. by specifying --from org on the command line)?

@conklech

Ah, I missed a necessary condition. This only happens with the --smart flag. I don't have a library build of pandoc convenient, so here are some command-line test cases:

$ pandoc --version
pandoc 1.15.1

$ echo "\"/test/\"" | pandoc --from org --to native --smart
[Para [Quoted DoubleQuote [Emph [Str "test"]]]]

$ echo "X\"/test/\"" | pandoc --from org --to native --smart
[Para [Str "X",Quoted DoubleQuote [Emph [Str "test"]]]]

$ echo " \"/test/\"" | pandoc --from org --to native --smart
[Para [Quoted DoubleQuote [Str "/test/"]]]

$ echo "\"/test/\" X" | pandoc --from org --to native --smart
[Para [Quoted DoubleQuote [Emph [Str "test"]],Space,Str "X"]]

So the necessary condition seems to be whitespace before the opening quotation mark (example 3). As a result, the bug is usually triggered when parsing whole files.
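For anyone who wants to re-check the four cases mechanically, here is a minimal Python sketch. The inputs and native outputs are copied from the session above; the helper names (`has_emphasis`, `failing_inputs`, `run_pandoc`) are made up for illustration. `run_pandoc` is only defined, not called, and it assumes a pandoc on the PATH that still accepts the `--smart` flag used in this thread.

```python
import subprocess

# Inputs and native outputs copied from the pandoc 1.15.1 session above.
CASES = [
    ('"/test/"', '[Para [Quoted DoubleQuote [Emph [Str "test"]]]]'),
    ('X"/test/"', '[Para [Str "X",Quoted DoubleQuote [Emph [Str "test"]]]]'),
    (' "/test/"', '[Para [Quoted DoubleQuote [Str "/test/"]]]'),
    ('"/test/" X', '[Para [Quoted DoubleQuote [Emph [Str "test"]],Space,Str "X"]]'),
]


def has_emphasis(native: str) -> bool:
    """True if the native output contains an Emph node."""
    return "Emph [" in native


def failing_inputs(cases):
    """Return the inputs whose output kept the slashes instead of italics."""
    return [text for text, native in cases if not has_emphasis(native)]


def run_pandoc(text: str) -> str:
    """Hypothetical helper: feed text to pandoc as the echo commands above do.

    Assumes pandoc is installed and accepts --smart, as in the thread.
    """
    result = subprocess.run(
        ["pandoc", "--from", "org", "--to", "native", "--smart"],
        input=text + "\n",  # echo appends a trailing newline
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Calling `failing_inputs(CASES)` singles out the one input that starts with a space, which matches the condition described above.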

As I mentioned, the org parser (or at least whatever powers the pretty-printing in emacs) accepts this format. It's occasionally necessary, and there's apparently no workaround.

(For the record: [/test/] is not accepted by either org-mode or pandoc. You need spaces on both sides. So an org footnote just containing Id., which is terribly common in legal prose, must be typed as [fn:: /Id./ ], with a space at the end. Pandoc will disregard the extra space at the end.)

@tarleb

Now I see it. There is an error in the way the parser state is updated, closely related to #2504. Should be easy to fix.

Thanks for the report!

@tarleb added a commit to tarleb/pandoc that referenced this issue
Org reader: Fix emphasis rules for smart parsing
Smart quotes, ellipses, and dashes should behave like normal quotes,
single dashes, and dots with respect to text markup parsing.  The parser
state was not updated properly in all cases, which has been fixed.

Thanks to @conklech for reporting this issue.

This fixes #2513.
220f3d1
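To illustrate how a missed state update could produce exactly the pattern reported above, here is a toy model in Python. It is not pandoc's code. It assumes the parser remembers the position just after the last character that may precede markup (`last_pre_char_pos`, a hypothetical name), that emphasis may open only when no such position has been recorded yet or when it equals the current position, and that the smart-quote handler in the buggy variant forgets to record it.

```python
from typing import Optional


def parses_emphasis(text: str, update_on_quote: bool) -> bool:
    """Toy scanner: report whether /.../ emphasis would be recognized.

    update_on_quote=False models the assumed bug (the quote handler does
    not touch the state); True models the assumed fix.
    """
    last_pre_char_pos: Optional[int] = None  # None until whitespace/quote sets it
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "/":
            # Emphasis may open only directly after a recorded pre-character,
            # or when nothing has been recorded yet.
            allowed = last_pre_char_pos is None or last_pre_char_pos == i
            close = text.find("/", i + 1)
            if allowed and close > i + 1:
                return True
            i += 1
        elif ch.isspace():
            i += 1
            last_pre_char_pos = i  # whitespace always updates the state
        elif ch == '"':
            i += 1
            if update_on_quote:
                last_pre_char_pos = i  # the assumed fix: quotes update it too
        else:
            i += 1
    return False
```

Under these assumptions, `parses_emphasis(' "/test/"', False)` fails while the other three inputs from the report succeed, and passing `True` makes the leading-space case succeed as well. The real fix is in commit 220f3d1 and may differ in detail.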
@jgm closed this in #2525