Greater/lesser-than signs cause problems for DOI resolver #1640

Closed
adunning opened this Issue · 4 comments

@adunning

When citing an article via pandoc-citeproc, I found that Pandoc's method of encoding the URL in the HTML output causes the DOI resolver to fail:

pandoc -t html << EOT
Whitmore, Ian. 1999. ‘Terminologia Anatomica: New Terminology for the
New Anatomist’. *The Anatomical Record* 257 (2): 50–53.
doi:[10.1002/(sici)1097-0185(19990415)257:2\<50::aid-ar4\>3.3.co;2-n](http://dx.doi.org/10.1002/(sici)1097-0185(19990415)257:2<50::aid-ar4>3.3.co;2-n).
EOT

Following the link in Pandoc's output results in an error:

http://dx.doi.org/10.1002/(sici)1097-0185(19990415)257:2&lt;50::aid-ar4&gt;3.3.co;2-n

Following the URL as it appears in the Markdown does not result in any problems; alternatively, Zotero does it like this:

http://dx.doi.org/10.1002%2F(sici)1097-0185(19990415)257%3A2%3C50%3A%3Aaid-ar4%3E3.3.co%3B2-n
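For comparison, here is a minimal Python sketch of the two forms quoted above. It is not Pandoc's or Zotero's code; it only reproduces the strings from this report using the standard library's `html.escape` and `urllib.parse.quote`. The `doi_url` helper and the choice of `safe` characters are assumptions made for illustration.

```python
from html import escape
from urllib.parse import quote

RESOLVER = "http://dx.doi.org/"
DOI = "10.1002/(sici)1097-0185(19990415)257:2<50::aid-ar4>3.3.co;2-n"


def doi_url(doi: str, safe: str = "/():;") -> str:
    """Hypothetical helper: percent-encode a DOI and append it to the resolver URL.

    Characters listed in `safe` are left as they are; everything else outside
    the unreserved set (letters, digits, '_', '.', '-', '~') is percent-encoded.
    """
    return RESOLVER + quote(doi, safe=safe)


# HTML entity escaping only, as in the Pandoc output quoted above.
html_escaped = escape(RESOLVER + DOI)

# Percent-encoding just enough to remove '<' and '>'.
minimal = doi_url(DOI)

# Percent-encoding everything except the parentheses, as in the Zotero URL above.
zotero_style = doi_url(DOI, safe="()")

print(html_escaped)
print(minimal)
print(zotero_style)
```

The `safe` argument decides how aggressive the encoding is: with `safe="()"` the result matches the Zotero URL character for character, while the default here leaves `/`, `:` and `;` readable and encodes only `<` and `>`.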

@mpickering
Collaborator

Details about the correct encoding are here

@mpickering added the bug label

@katrinleinweber

Thanks for the hint about encoding! This problem can also be mitigated by shortening the DOI. However, should this be reported to Zotero or other reference managers, so that they encode correctly during import or export?

@jgm
Owner

Pandoc should probably be URL-encoding its links.
But it would also be good if Zotero encoded correctly during export.

@katrinleinweber

Two opinions over there favour doing the encoding only when DOIs are actually converted into URLs/URIs.

@jgm added a commit that closed this issue
@jgm Percent-encode more special characters in URLs.
HTML, LaTeX writers adjusted.
The special characters are '<','>','|','"','{','}','[',']','^', '`'.

Closes #1640, #2377.
1e8a25a
@jgm closed this in 1e8a25a
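
As a rough illustration of what the commit message describes, the following Python sketch percent-encodes only the ten characters it lists. This is not the Haskell code from 1e8a25a: the `SPECIAL` constant and the `escape_uri_specials` function are hypothetical names, and any characters Pandoc already encoded before this commit are not covered here.

```python
# The ten characters named in the commit message.
SPECIAL = '<>|"{}[]^`'


def escape_uri_specials(url: str) -> str:
    """Hypothetical helper: percent-encode only the characters in SPECIAL.

    All other characters, including '/', ':' and ';', pass through unchanged.
    """
    return "".join(f"%{ord(ch):02X}" if ch in SPECIAL else ch for ch in url)


original = (
    "http://dx.doi.org/10.1002/(sici)1097-0185(19990415)"
    "257:2<50::aid-ar4>3.3.co;2-n"
)
fixed = escape_uri_specials(original)
print(fixed)
```

Applied to the URL from this report, the sketch replaces `<` with `%3C` and `>` with `%3E` and leaves the rest of the DOI untouched.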