
disable conversion of straight quotes to curly quotes in latex to wiki conversion #2541

Closed
jlrn7 opened this Issue · 4 comments

2 participants

@jlrn7

Hello!

I am using LaTeX with the wiki.sty package. I'd like to prevent Pandoc from converting the wiki markup ''italics'' or '''boldface''' into ”utf8” ”’quotes”’ when converting my LaTeX file to wiki markup for further processing. According to the manual, the `--smart` (`-S`) option applies when converting to LaTeX, not from it. And if UTF-8 input is required, it should be fairly easy to type the curly quotes directly, so I see no reason to encourage the use of those TeX ligatures anymore.

I think this issue fits in the category of feature request.

@jlrn7 jlrn7 changed the title from disable conversion of quotes to smart quotes from latex to wiki to disable conversion of straight quotes to curly quotes in latex to wiki conversion
@jgm
Owner
@jlrn7

I fear I didn't explain myself well: with the wiki.sty package in LaTeX, I normally type ''emphasis'' using two single quotation marks. Now I want to convert the LaTeX file to ''true'' wiki markup, but the processor converts each pair of single quotes into a Unicode closing curly double quote, and I want to avoid that. The --no-tex-ligatures switch also works only when converting to LaTeX, not from it.

@jgm
Owner

OK, I understand:

% pandoc -t mediawiki -f latex 
''hi''
^D
”hi”

and you want ''hi'' instead. I think this is really a bug. The LaTeX reader shouldn't be parsing these as a Quoted inline when --no-tex-ligatures is set. The fix is fairly easy.
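
To make the behavior above concrete, here is a minimal Python sketch of ligature-style quote parsing. It is a toy model, not pandoc's actual LaTeX reader: the function name `read_tex_quotes` and its `ligatures` parameter are hypothetical, it handles only quote characters (no dashes), and it substitutes characters directly rather than building a Quoted inline.

```python
import re

# Toy mapping from TeX quote ligatures to Unicode characters.
# This is an illustration only, not pandoc's implementation.
LIGATURES = {
    "``": "\u201c",  # left double quotation mark
    "''": "\u201d",  # right double quotation mark
    "`": "\u2018",   # left single quotation mark
    "'": "\u2019",   # right single quotation mark
}

# Two-character ligatures come first so they win over single characters.
_PATTERN = re.compile("``|''|`|'")


def read_tex_quotes(text, ligatures=True):
    """Return text with TeX quote ligatures replaced by Unicode quotes.

    With ligatures=False (standing in for --no-tex-ligatures), the
    characters are passed through literally.
    """
    if not ligatures:
        return text
    return _PATTERN.sub(lambda m: LIGATURES[m.group(0)], text)


print(read_tex_quotes("''hi''"))
print(read_tex_quotes("''hi''", ligatures=False))
```

With ligatures enabled, each `''` pair becomes a single closing double quote, which is why wiki-style ''hi'' comes out as ”hi”. With ligatures disabled, the input passes through unchanged, which is the behavior requested in this issue.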

@jgm
Owner

On second look: the current situation with --no-tex-ligatures is confusing, since in addition to affecting the LaTeX writer it actually causes smart quote parsing to be turned off in the reader, but only when the output format is LaTeX or ConTeXt.

It would make more sense for --no-tex-ligatures to affect the LaTeX reader, no matter what the output format. (We'd still need to have it affect other readers when the output format is LaTeX or ConTeXt, because otherwise we get Quoted elements in the parse tree and the LaTeX writer would have no idea how to reconstruct the original delimiter characters. This is a bit ugly but I don't see a good way around it.)

@jgm jgm added a commit that closed this issue
@jgm Rationalized behavior of --no-tex-ligatures and --smart.
This change makes `--no-tex-ligatures` affect the LaTeX reader
as well as the LaTeX and ConTeXt writers.  If it is used,
the LaTeX reader will parse characters `` ` ``, `'`, and `-`
literally, rather than parsing ligatures for quotation marks
and dashes.  And the LaTeX writer will print unicode quotation
mark and dash characters literally, rather than converting
them to the standard ASCII ligatures.

Note that `--smart` has no effect on the LaTeX reader.

`--smart` is still the default for all input formats when
LaTeX or ConTeXt is the output format, *unless* `--no-tex-ligatures`
is used.

Some examples to illustrate the logic:

```
% echo "'hi'" | pandoc -t latex
`hi'
% echo "'hi'" | pandoc -t latex --no-tex-ligatures
'hi'
% echo "'hi'" | pandoc -t latex --no-tex-ligatures --smart
‘hi’
% echo "'hi'" | pandoc -f latex --no-tex-ligatures
<p>'hi'</p>
% echo "'hi'" | pandoc -f latex
<p>’hi’</p>
```
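
The rules in this commit message can also be read as a small decision table. The Python sketch below is one reading of that logic, not code from pandoc: the function names and the `TEX_OUTPUTS` set are hypothetical, and it assumes pandoc's defaults of Markdown input and HTML output for the examples that omit `-f` or `-t`.

```python
# Output formats whose writers use TeX ligatures (assumed set, per the text above).
TEX_OUTPUTS = {"latex", "context"}


def reader_parses_quotes(input_format, output_format,
                         smart=False, no_tex_ligatures=False):
    """Does the reader turn straight quotes into quotation elements?"""
    if input_format == "latex":
        # The LaTeX reader ignores --smart; only --no-tex-ligatures matters.
        return not no_tex_ligatures
    if smart:
        return True
    # For other readers, --smart is the default when writing LaTeX or
    # ConTeXt, unless --no-tex-ligatures is given.
    return output_format in TEX_OUTPUTS and not no_tex_ligatures


def writer_emits_ligatures(output_format, no_tex_ligatures=False):
    """Does the writer print ASCII ligatures instead of Unicode characters?"""
    return output_format in TEX_OUTPUTS and not no_tex_ligatures


# The five examples above, in order: (input, output, smart, no_tex_ligatures).
examples = [
    ("markdown", "latex", False, False),
    ("markdown", "latex", False, True),
    ("markdown", "latex", True, True),
    ("latex", "html", False, True),
    ("latex", "html", False, False),
]
for inp, out, smart, ntl in examples:
    print(inp, out, smart, ntl,
          reader_parses_quotes(inp, out, smart, ntl),
          writer_emits_ligatures(out, ntl))
```

Read this way, the third example parses the quotes (because `--smart` is explicit) but prints them as Unicode characters, since the writer's ligatures are switched off.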

Closes #2541.
ed1173a
@jgm jgm closed this in ed1173a