
Teletext with Cyrillic


QBox User


New channel Sjónvarpið (RUV) on 0.8W, 11403V uses Icelandic language, though most programmes seem to be in English. Programme details can be found in the Teletext pages, but they all display in Cyrillic, when (I expect) they should be using the Latin-based characters of the Icelandic alphabet. Is this a bug or a config error? (4.9) I haven't changed Options for Teletext from default.

 

Displays OK in other apps, including GE.

post-107761-0-92437800-1325894653_thumb.png

Link to comment

I hope the attached sample is sufficient. In Pro on my Win7 PC it displays in Cyrillic. I also tried it on a Demo version (4.8.1) on my Vista machine, where it shows in the Latin alphabet (with the same Teletext Options), as it did in GE on Win7.

RUV Teletext sample.zip

As you can see, there's a "language" indicator in the PMT

<?xml version="1.0" encoding="utf-8"?>
<Thor_3_5___Intelsat_10-02_0.8_W_11403_V>
  <ES PID="581" Name="Teletext">
    <StreamType HValue="0x06" Name="Private Data PES (ITU-T Rec. H.222.0 | ISO/IEC 13818-1)"/>
    <Descriptor HValue="0x56" Name="Teletext">
      <DescriptorLength Value="5"/>
      <PageType Value="1" Name="Initial Page">
        <Language String="ice"/>
        <Magazin Value="1"/>
        <Tens HValue="0x00"/>
        <Units HValue="0x00"/>
        <Number Value="100"/>
      </PageType>
    </Descriptor>
  </ES>
</Thor_3_5___Intelsat_10-02_0.8_W_11403_V>
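For anyone who wants to pull the language indicator out of such an export automatically, here is a minimal Python sketch. It assumes the element and attribute names are exactly those shown in the excerpt above (ES, Descriptor, PageType, Language); the function name `teletext_languages` and the shortened `SAMPLE` string are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the PMT excerpt above (XML declaration omitted).
SAMPLE = """<Thor_3_5___Intelsat_10-02_0.8_W_11403_V>
  <ES PID="581" Name="Teletext">
    <StreamType HValue="0x06" Name="Private Data PES"/>
    <Descriptor HValue="0x56" Name="Teletext">
      <PageType Value="1" Name="Initial Page">
        <Language String="ice"/>
        <Number Value="100"/>
      </PageType>
    </Descriptor>
  </ES>
</Thor_3_5___Intelsat_10-02_0.8_W_11403_V>"""


def teletext_languages(xml_text):
    """Return (PID, language code) pairs for every teletext descriptor (0x56)."""
    root = ET.fromstring(xml_text)
    found = []
    for es in root.iter("ES"):
        for desc in es.iter("Descriptor"):
            if desc.get("HValue", "").lower() != "0x56":
                continue  # not a teletext descriptor
            for page in desc.iter("PageType"):
                lang = page.find("Language")
                if lang is not None:
                    found.append((int(es.get("PID")), lang.get("String")))
    return found


print(teletext_languages(SAMPLE))  # [(581, 'ice')]
```

Note that this only reads the PMT descriptor. As the replies below explain, the character set actually used for display is driven by the page headers, not by this field.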

Edited by QBox User
Link to comment

Well, I could find something out by trial and error.

 

Here DVBViewer Pro doesn't show Cyrillic characters with the default settings. Evidently the (ambiguous) language information in the teletext page headers (national option) indicates the following language group (according to the ETSI specifications): Spanish/Portuguese, Serbian/Croatian (Latin), Ukrainian, Hebrew. Icelandic is not covered by the specifications.

 

I'm getting Cyrillic letters in DVBViewer Pro and GE when Ukrainian is ticked under Options -> EPG -> Teletext -> Character Set Preferences. That is not the default, so I guess it has been changed. Spanish/Portuguese looks most appropriate to me, but we would need someone from Iceland to confirm it.
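To illustrate why the preference setting matters, here is a rough Python sketch of the ambiguity. The table contains only the language group named in this post for national option 5; the names `CANDIDATES` and `resolve_script` are hypothetical, and this is not how DVBViewer is actually implemented.

```python
# Language group that shares one national option value, as listed in this post.
# Each candidate language maps to the script the decoder would render.
CANDIDATES = {
    5: {
        "Spanish/Portuguese": "Latin",
        "Serbian/Croatian": "Latin",
        "Ukrainian": "Cyrillic",
        "Hebrew": "Hebrew",
    },
}


def resolve_script(national_option, preference):
    """Pick the script for an ambiguous national option using the user's preference."""
    group = CANDIDATES.get(national_option)
    if group is None:
        raise KeyError(f"national option {national_option} is not in this sketch")
    if preference not in group:
        raise ValueError(f"{preference!r} is not a candidate for option {national_option}")
    return group[preference]


print(resolve_script(5, "Ukrainian"))           # Cyrillic
print(resolve_script(5, "Spanish/Portuguese"))  # Latin
```

The same header value yields Cyrillic or Latin output depending solely on the ticked preference, which matches the symptom reported above.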

Link to comment

P.S. Derrick's sample contains PAT and PMT, but I noticed it too late ;)

 

It confirms my findings above. The page headers indicate national option 5 (Spanish/Portuguese etc.). Additionally the teletext pages contain location enhancement packets 26/0 with special character assignments, which means, on certain page positions characters from the Latin G0 Primary Set are replaced by characters from the Latin G2 Supplementary Set, provided the language preference in the DVBViewer options is set correctly.

 

For more information refer to ETSI EN 300 706.

Link to comment


Yes, that seems to have fixed it. I changed to Spanish/Portuguese and re-started. It's now showing in Latin. I can't find any Cyrillic teletext, but it's possible I could have changed it while searching for subtitles.

 

A config issue then, not a bug. Maybe you can move this thread to General.

Link to comment

I found a Ukrainian station with teletext. The pages display in Latin with Spanish/Portuguese or Ukrainian selected in option 5. Only selection of Russian/Bulgarian instead of French for option 4 makes it show Cyrillic, which I presume is intended, but I'm not able to read packet headers. French teletext still shows Latin with this option.

Inter Plus teletext.zip

Link to comment
Only selection of Russian/Bulgarian instead of French for option 4 makes it show Cyrillic

The page headers indicate National Option 4, which is French or Russian/Bulgarian (Cyrillic).

 

French teletext still shows Latin with this option.

Which one? There may be Teletext Level 2.5 packets that specify the language more precisely.

Link to comment
  • 3 months later...

Hi all,

 

Yes, the language code is definitely received as 5 (Spanish/Portuguese). But when I compare the teletext processed by my TV (VBI) with the one processed by my digital decoder (OSD), I find some special characters missing, which I think is to be expected because the Spanish and Icelandic alphabets are not completely the same.

 

The decoder handles the DVB teletext as an independent stream, which means that the teletext you are watching is missing some characters too.

 

On the other hand, I cannot find the Icelandic characters in ETSI EN 300 706.

 

What would help me fix this?

 

PS: I can modify the software of the decoder.

Link to comment
On the other hand, I cannot find the Icelandic characters in ETSI EN 300 706.

As far as I can see on Wikipedia all Icelandic characters are included.

 

Ð, ð, Þ, þ, Æ, æ and the diacritical marks are part of the Latin G2 Supplementary Set (see ETSI EN 300 706, 15.6.3). In teletext level 1.5 they are selected by packets X/26. Each such packet specifies a position on the teletext page and either a character from the G2 set that replaces the level 1.0 character at this position, or a diacritical mark from the G2 set that is added to a character from the Latin G0 set (see 9.4.1 and 12.3).

 

Basically you will have to handle X/26 packets in your software, specifically the row address triplets "Set Active Position", "Address Display Row 0", "Termination Marker", and the column address triplets "Character from the G2 Supplementary Set" and "Characters Including Diacritical Marks".
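As a rough illustration of that handling, here is a Python sketch that applies already error-corrected X/26 triplets to a page buffer. The mode values and the address/mode/data bit layout are written down as I understand them from EN 300 706, clause 12.3, so please verify them against the specification. The names `Triplet`, `split_triplet` and `apply_triplets` are hypothetical, the G2 look-up table is passed in by the caller, and Hamming 24/18 decoding is not shown.

```python
from dataclasses import dataclass

# Mode values as I read them in EN 300 706, clause 12.3 (assumption: verify).
SET_ACTIVE_POSITION = 0x04    # row address group
ADDRESS_DISPLAY_ROW_0 = 0x07  # row address group
TERMINATION_MARKER = 0x1F     # row address group
G2_CHARACTER = 0x0F           # column address group


@dataclass
class Triplet:
    address: int  # 0..39 = column address, 40..63 = row address group
    mode: int     # 5 bits
    data: int     # 7 bits


def split_triplet(value):
    """Split an 18-bit decoded triplet (assumed layout: address, mode, data)."""
    return Triplet(value & 0x3F, (value >> 6) & 0x1F, (value >> 11) & 0x7F)


def apply_triplets(page, triplets, g2_table):
    """Overwrite page cells with G2 characters addressed by X/26 triplets."""
    row = 0
    for t in triplets:
        if t.address >= 40:  # row address group
            if t.mode == TERMINATION_MARKER:
                break  # nothing after the marker is processed
            if t.mode == SET_ACTIVE_POSITION:
                row = 24 if t.address == 40 else t.address - 40
            elif t.mode == ADDRESS_DISPLAY_ROW_0:
                row = 0
        elif t.mode == G2_CHARACTER:
            ch = g2_table.get(t.data)
            if ch is not None:
                page[row][t.address] = ch  # replace the level 1.0 character
    return page
```

A caller would build `page` as 25 rows of 40 characters from the level 1.0 data and supply a complete G2 table taken from clause 15.6.3. The "Characters Including Diacritical Marks" modes are deliberately left out here because they need composition.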

 

Since characters with diacritical marks defined by packets X/26 (e.g. Ý) are not directly included in the G2 character set, your decoder will have to compose them, which makes things a bit difficult. DVBViewer internally translates the teletext characters to Unicode by using UTF-16 look-up tables. Creating them was pretty lengthy, and I had to write a special tool for it ;)
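If the target is Unicode, one possible shortcut for the composition step is to let Unicode normalization do the work instead of hand-built tables. The following Python sketch is only an illustration of that idea, not DVBViewer's approach (which uses look-up tables as described above). The `COMBINING_MARKS` table is keyed by mark name rather than by G2 code, and its contents and the function name `compose` are assumptions for the example.

```python
import unicodedata

# Hypothetical, partial mapping from diacritical mark names to Unicode
# combining characters. A real decoder would key this by the G2 mark code.
COMBINING_MARKS = {
    "grave": "\u0300",
    "acute": "\u0301",
    "diaeresis": "\u0308",
}


def compose(base, mark):
    """Combine a G0 base letter with a diacritical mark into one code point."""
    # NFC normalization merges base + combining mark when a precomposed
    # character exists; otherwise the two code points are kept as they are.
    return unicodedata.normalize("NFC", base + COMBINING_MARKS[mark])


print(compose("Y", "acute"))  # Ý
```

On a decoder with a fixed OSD font, the composed code point would still have to be mapped to an available glyph.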

Link to comment

P.S. The following screenshot from Derrick's sample (see above) shows clearly which characters originate from which source:

 

- White characters are from the Latin G0 Character Set (identical for all countries with a Latin alphabet).

 

- Red characters are from the Spanish/Portuguese National Option Subset.

 

- Green characters added by packets X/26 are from the Latin G2 Supplementary Set.

Zwischenablage01.png

Link to comment

Thank you for your interest Griga :)

 

I can see the National Options characters (in red) without any problem, but I cannot find the special characters in green (packet 26) that you mentioned in the Latin G2 Supplementary Set (15.6.3) of ETS 300 706. Do I need an extension for this set?

Link to comment
Do I need an extension for this set?

No. They are all included in ETSI EN 300 706 V1.2.1 (2003-04), clause 15.6.3. Please take note of my remarks about characters with diacritical marks. They are not directly present in the G2 set, but must be composed by the decoder. The X/26 packet tells the decoder that, for example, Y from the G0 set shall be combined with the diacritical mark ´ from the G2 set in order to create Ý.

Link to comment
