RS API escaping XML reserved characters


majstang

Posted
This is a suggestion. Querying the RS API returns reserved XML characters unescaped in the data, so some escaping is needed. Without escaping, the result causes problems when it is opened in third-party XML parsers, which operate strictly according to the XML specification. The character that most commonly causes problems is the ampersand "&".
Example from api/recordings.html:

<recording id="5708" charset="255" start="20141101200000" duration="003000">
<channel>TV6</channel>
<file>
d:\tv\recording service\20141101_19-59-03_tv6_Rizzoli & Isles - S05E12.ts
</file>
<title>Rizzoli & Isles</title>
<info>S05E12</info>
<image>735346743_SM.jpg</image>
</recording>

 

& should be escaped to &amp;

d:\tv\recording service\20141101_19-59-03_tv6_Rizzoli &amp; Isles - S05E12.ts

 

Here's a list of all reserved characters:

http://technet.microsoft.com/en-US/library/ms145315%28v=SQL.90%29.aspx
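
To illustrate the escaping being asked for, here is a minimal Python sketch using the standard library's xml.sax.saxutils.escape. It only shows how the five reserved characters map to their predefined entities; it is not how the RS itself does it, and the helper name escape_xml_text is made up for this example.

```python
from xml.sax.saxutils import escape

# Extra replacements for the two quote characters. escape() already
# handles &, < and > on its own, and does so before applying this map,
# so the ampersands introduced here are not escaped a second time.
_QUOTES = {'"': "&quot;", "'": "&apos;"}


def escape_xml_text(value: str) -> str:
    """Replace the five XML reserved characters with predefined entities."""
    return escape(value, _QUOTES)


if __name__ == "__main__":
    print(escape_xml_text("Rizzoli & Isles"))
```

Applied to the title from the example above, this yields "Rizzoli &amp; Isles", which a strict XML parser accepts.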

Posted
& should be escaped to &amp;

 

Of course. I wonder why it doesn't happen, because the content between <file> and </file> is processed by a function named SimpleXMLEncodeUTF8, and besides " = &quot;, ' = &apos;, < = &lt; and > = &gt; it does handle the ampersand.

Are you sure that there is no XML decoding or un-escaping going on after the XML has been received from the RS?

Posted

Ok, that's what I did:

 

(1) Created a short recording in the web interface via Timer -> New Timer and forced the filename to be "Das Erste & Morgenmagazin.ts"

(2) Used Firefox to download /api/recordings.html from the RS directly to my hard disk

(3) Opened the file with Notepad++ and got (line breaks added):

 

<?xml version="1.0" encoding="utf-8" ?>

<recordings Ver="1">

<rev>79</rev>

<serverURL>http://127.0.0.1:8090/upnp/recordings/</serverURL>

<recording id="1" charset="255" content="32" start="20141104053000" duration="033000">

<channel>Das Erste</channel>

<file>d:\videos\DVBViewer pro\das erste &amp; morgenmagazin.ts</file>

etc.

 

So where is the problem?

 

The full processing path is missing in your report. See my enumeration above. Probably you didn't consider everything happening along this path. Did you countercheck and verify your assumption "the RS API returns reserved XML characters" by receiving the recordings.html in a different way?
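
One way to countercheck is to feed the response, exactly as received, to a strict XML parser and see whether it is accepted. The following is a minimal Python sketch under that assumption; is_well_formed is a hypothetical helper name, and the sample strings only mimic the <title> element from the first post.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_source) -> bool:
    """Return True if a strict parser accepts the text or bytes as XML."""
    try:
        ET.fromstring(xml_source)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    # A bare ampersand is rejected, the escaped form is accepted.
    print(is_well_formed("<title>Rizzoli & Isles</title>"))
    print(is_well_formed("<title>Rizzoli &amp; Isles</title>"))
```

If the untouched response passes this check, the unescaped ampersand was introduced somewhere later along the processing path.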

Posted

I can't confirm this; for me it's okay, like Griga said.

The XML parser I am using in the Android app wouldn't work if the special chars weren't escaped.

Posted

Probably you didn't consider everything happening along this path. Did you countercheck and verify your assumption "the RS API returns reserved XML characters" by receiving the recordings.html in a different way?

 

Well, no! The ways I have tried so far are Chrome and a script using the ASP.NET WebRequest class, and both seem to return a non-escaped ampersand. Right now I'm forced to do the escaping myself in my scripts before loading the result with MSXMLDOM. Yes, I will have to do some more testing before getting back on the matter.
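
A stopgap of this kind could look like the following Python sketch. It is not the script used here; it assumes that only the five predefined entities and numeric character references occur in the data, and escape_bare_ampersands is a made-up name. Any ampersand that does not already start such a reference is escaped, so text that is already correctly escaped passes through unchanged.

```python
import re
import xml.etree.ElementTree as ET

# An ampersand that is NOT followed by a predefined entity name or a
# numeric character reference terminated by ";".
_BARE_AMP = re.compile(r"&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);)")


def escape_bare_ampersands(xml_text: str) -> str:
    """Escape stray ampersands, leaving existing references alone."""
    return _BARE_AMP.sub("&amp;", xml_text)


if __name__ == "__main__":
    fixed = escape_bare_ampersands("<title>Rizzoli & Isles</title>")
    print(fixed)
    print(ET.fromstring(fixed).text)
```

This is a heuristic: it does not know about CDATA sections or custom entities, so it is only a workaround for text that should have been escaped at the source.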

Posted (edited)

 

Of course. I wonder why it doesn't happen, because the content between <file> and </file> is processed by a function named SimpleXMLEncodeUTF8, and besides " = &quot;, ' = &apos;, < = &lt; and > = &gt; it does handle the ampersand.

Are you sure that there is no XML decoding or un-escaping going on after the XML has been received from the RS?

 

AHA, mystery solved! It is only when, for example, recordings.html is downloaded and saved as an XML file that the SimpleXMLEncodeUTF8 function kicks in and escapes the ampersand. I had hoped that could be done directly on the requested HTML page when it is queried via the API. More precisely, I hoped the escaped ampersand would be visible in the browser output right away (no download to file or saving to the hard drive). You see, I'm skipping the download-to-file part and downloading the entire string into a variable. The variable's content is then loaded directly into the XML parser. Before loading it into the parser I'm now forced to do the escaping myself. I'm not sure whether doing the escaping directly in the HTML would have bad effects on other things in the RS. After all, the HTML contains mostly XML. If it's not possible, I will happily continue with my own solution :)

Edited by majstang
Posted

I am sure the problem is in your script again (un-escaping in your script). ;)

Posted (edited)

Hi nuts!

 

No, this time I don't think so, because the WebRequest returns exactly what all the browsers I have tried (IE, Firefox and Chrome) return. Or do you get the ampersand escaped in the data when bringing up EPG.html or Recordings.html in a browser? I only get the escape when saving the browser content to an XML file, as Griga described in steps (2) and (3). I do think that is how the SimpleXMLEncodeUTF8 function works. Or, if that is not the only intended use of the function, perhaps the browsers un-escape too?

Edited by majstang
Posted

perhaps the browsers un-escapes too?

Of course they do!

In Firefox you can switch the view to source code, and suddenly "&" changes to "&amp;" ;)

You have to be careful with high-level stuff like "WebRequest".
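
The same effect can be reproduced outside a browser. In this small Python sketch (the sample element is made up, modelled on the first post), the raw source contains the entity, while the text handed back by the parser contains the plain character, just like the rendered browser view.

```python
import xml.etree.ElementTree as ET

# What the server sends: the ampersand is escaped in the source.
RAW_SOURCE = "<title>Rizzoli &amp; Isles</title>"


def decoded_text(xml_source: str) -> str:
    """Return the element text as a parser (or browser) presents it."""
    return ET.fromstring(xml_source).text


def reserialized(xml_source: str) -> str:
    """Parse and write the element back out; the entity reappears."""
    return ET.tostring(ET.fromstring(xml_source), encoding="unicode")


if __name__ == "__main__":
    print(decoded_text(RAW_SOURCE))
    print(reserialized(RAW_SOURCE))
```

So seeing a plain "&" in a rendered view or in parsed text says nothing about whether the source was escaped; only the raw source does.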

Posted

Huh, yes, you are definitely correct, once again :) Chrome has a source code view as well, and there the ampersand is escaped. Hmm, looks like I have to do some reading up on getting WebRequest to return the source code string instead. Thanks for the enlightenment :)

Posted

haha :D Yes, AHK is the "little brother", but it has gone through huge developments lately, making it a full-fledged programming language, not just automation and hotkeys. Really interesting, and I'm trying to learn all the new stuff. I don't know which direction AutoIt has taken, though. Yes, AutoHotkey has several similar functions for downloading a URL to a variable or file, but it's a constant struggle because nobody has assembled all the variations of the function for you. I have to ask around and read up on MSDN to figure out how to tackle the issues I'm encountering. The earlier issue you gave me pointers on was that the URLdownloadToVar function called the server synchronously, causing a timeout, whereas doing it asynchronously was the correct way to go.

Regarding this issue, I'm pretty sure it should somehow be possible to GetWebHtmlSourceCode using the HTTPWebRequest. I have found something like it, but it's in C#.
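
For comparison, here is a minimal Python sketch of reading a response body as raw source text, without any rendering or entity decoding in between. It is not AutoHotkey or C# code; fetch_raw_source is a hypothetical helper name, and the host and port in the usage comment are placeholders, not values taken from this thread.

```python
from urllib.request import urlopen


def fetch_raw_source(url: str, timeout: float = 10.0) -> str:
    """Return the response body as text, exactly as the server sent it."""
    with urlopen(url, timeout=timeout) as response:
        body = response.read()
        # Use the declared charset if there is one; UTF-8 is an assumption
        # based on the XML declaration shown earlier in the thread.
        charset = response.headers.get_content_charset() or "utf-8"
    return body.decode(charset)


# Hypothetical usage (replace host and port with those of your RS):
# source = fetch_raw_source("http://<rs-host>:<port>/api/recordings.html")
# print("&amp;" in source)
```

Only bytes-to-text decoding happens here, so any "&amp;" the server sent is still present in the returned string and can be passed to an XML parser as is.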
