Re: String character range

From: Doug Tody <dtody-at-nrao.edu>
Date: Fri, 1 Aug 2008 07:58:53 -0600 (MDT)


On Fri, 1 Aug 2008, Mark Taylor wrote:
> What this means is that there are legal SAMP strings (ones containing
> any character in the ranges 0x01-0x08, 0x0B, 0x0C, 0x0E-0x1F) which
> cannot be transmitted as an XML-RPC <string> element. This means
> that either the definition of a SAMP string, or the prescription for
> transmitting SAMP strings in XML-RPC messages in the Standard Profile,
> must be modified to avoid inconsistency.
>
> I think the possibilities are as follows:
>
> 1. Encode all SAMP strings as <base64> elements when transmitting
> over XML-RPC.
>
> 2. Allow SAMP strings to be transmitted as either <string> or
> <base64> elements when transmitting over XML-RPC (the latter
> case being required only if the string contains un-XML
> characters).
>
> 3. Define some escaping convention for un-XML characters, e.g.
> \u001f for character 31.
>
> 4. Change the SAMP string definition so that only XML-friendly
> characters are allowed.
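
For reference, options 2 and 3 in the quoted list can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names `to_xmlrpc_value` and `escape_un_xml` are hypothetical, the character ranges are copied from the quoted message, and Python's standard `xmlrpc.client` module stands in for whatever XML-RPC library an implementation would actually use.

```python
import re
import xmlrpc.client

# Characters the quoted message lists as not transmittable in an
# XML-RPC <string> element: 0x01-0x08, 0x0B, 0x0C, 0x0E-0x1F.
UN_XML = re.compile(r"[\x01-\x08\x0b\x0c\x0e-\x1f]")


def to_xmlrpc_value(s):
    """Option 2 (hypothetical helper): send a plain <string> when safe,
    otherwise wrap the UTF-8 bytes so they are sent as <base64>."""
    if UN_XML.search(s):
        return xmlrpc.client.Binary(s.encode("utf-8"))
    return s


def escape_un_xml(s):
    """Option 3 (hypothetical convention): replace each un-XML character
    with a \\uXXXX escape. A real convention would also have to escape
    the backslash itself to stay reversible; that is omitted here."""
    return UN_XML.sub(lambda m: "\\u%04x" % ord(m.group()), s)
```

With option 2 the receiver has to accept either element type and decode the base64 payload back to a string; with option 3 both sides have to agree on the escape syntax.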

Why not just enclose character data in CDATA sections when it passes through XML? This is the mechanism XML provides for passing arbitrary data through (it also allows individual characters to be encoded, e.g. &lt;).

If message content is separated from transport encoding, this sort of thing would happen automatically at the encode/decode level. If the transport is XML and one uses the XML mechanisms and an XML library, this might happen automatically in any case.
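
As a rough illustration of encoding happening automatically at the encode/decode level, the sketch below round-trips a string through Python's standard `xmlrpc.client` module. The method name `demo.echo` and the payload are made up for the example. Note that this particular library uses entity escaping rather than CDATA sections, and the sketch only exercises markup-significant characters such as < and &; it makes no claim about the control characters discussed in the quoted message.

```python
import xmlrpc.client

# Message content, written with no knowledge of the transport encoding.
payload = "a < b && c > d"

# Encode: the library escapes markup-significant characters on its own.
wire = xmlrpc.client.dumps((payload,), methodname="demo.echo")

# Decode: the receiver gets the original string back unchanged.
params, method = xmlrpc.client.loads(wire)
```

The application code never touches the escaping; it only sees `payload` on one side and `params[0]` on the other.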

Received on 2008-08-01Z16:00:30