Re: String character range

From: Mark Taylor <m.b.taylor-at-bristol.ac.uk>
Date: Fri, 1 Aug 2008 16:15:21 +0100 (BST)


On Fri, 1 Aug 2008, Carlos Rodrigo Blanco wrote:

> Hi
>
> I'm sorry that I don't know much about unicode encoding and I feel quite
> ashamed of showing this ignorance, but I wonder what happens with latin
> characters and so.
>
> If I have to write, for instance, some author name in a xml document that
> includes some latin character (like ñ), is that allowed?

Writing it in an XML document - no problem. XML, and the Unicode on which it is based, are very capable of representing almost any character from almost any language you can think of (and a lot more).
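As a rough illustration of the XML side only (nothing to do with SAMP itself), here is a minimal Python sketch using the standard xml.etree.ElementTree module. The element name "author" and the name "Muñoz" are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical author element; the element name and the author name
# are made up for illustration.
author = ET.Element("author")
author.text = "Muñoz"

# Serialised as UTF-8, the character is written directly into the document.
utf8_bytes = ET.tostring(author, encoding="utf-8")

# Serialised as 7-bit ASCII, the character cannot be written directly,
# so ElementTree emits a numeric character reference instead.
ascii_bytes = ET.tostring(author, encoding="us-ascii")

# Either form parses back to the same text.
roundtrip_utf8 = ET.fromstring(utf8_bytes).text
roundtrip_ascii = ET.fromstring(ascii_bytes).text
```

Either way the document is well-formed XML and a parser recovers the original character.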

As far as SAMP goes: that character looks to me like code point 0xf1, from the Latin-1 Supplement code block. So you could not send it using either the existing definition for a SAMP string or the proposal (4) that I am suggesting. If we used a variant of my suggestion (3):

   3. Define some escaping convention for un-XML characters, e.g. \u001f
      for character 31.

with the intention that this escaping mechanism could be used for any 8-bit character, it would be possible to transmit this kind of non-7-bit Latin character. However, characters with the 8th bit set might cause problems for certain other transports and language environments. I must admit that, apart from RFC-822 mail-type contexts, I can't think of what these might be, but I'd be inclined to steer clear of non-7-bit characters just in case. However, if others (e.g. those with fewer Anglo-Saxon prejudices) think that it's an important requirement to permit transmission of characters like this within SAMP, we could take that on board.

We could even in principle say that this escaping mechanism could be used to specify any Unicode character, but I think that would definitely be a bad idea, as it would effectively restrict use of the protocol to languages with Unicode support, which excludes quite a lot.
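To make the 8-bit variant of suggestion (3) concrete, here is a minimal Python sketch of what such an escaping convention might look like. None of this is defined anywhere in SAMP: the function names, the choice to pass tab, LF, CR and printable ASCII through unchanged, and the choice to escape the backslash itself (so that decoding is unambiguous) are all assumptions made for the example.

```python
import re

def samp_escape(text: str) -> str:
    """Escape a string to 7-bit text using \\uXXXX (hypothetical convention)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp > 0xFF:
            # The 8-bit variant deliberately stops short of full Unicode.
            raise ValueError(f"code point U+{cp:04X} is outside the 8-bit range")
        # Assumption: tab, LF, CR and printable ASCII pass through unchanged.
        # The backslash is always escaped so that decoding is unambiguous.
        if ch != "\\" and (0x20 <= cp <= 0x7E or ch in "\t\n\r"):
            out.append(ch)
        else:
            out.append(f"\\u{cp:04x}")
    return "".join(out)

def samp_unescape(text: str) -> str:
    """Reverse samp_escape by decoding every \\uXXXX sequence."""
    return re.sub(
        r"\\u([0-9a-fA-F]{4})",
        lambda m: chr(int(m.group(1), 16)),
        text,
    )

escaped_name = samp_escape("Muñoz")      # invented example name
escaped_ctrl = samp_escape("\x1f")       # character 31 from suggestion (3)
restored_name = samp_unescape(escaped_name)
```

The escaped form contains only 7-bit characters, so it should survive any transport that handles plain ASCII, at the cost of every client having to apply the same convention on receipt.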

Mark

-- 
Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
m.b.taylor@bris.ac.uk +44-117-928-8776 http://www.star.bris.ac.uk/~mbt/
Received on 2008-08-01Z17:15:30