diacritics problem on epg guide

bogdan1980

Wrong diacritics appear in the EPG in DVBViewer.

Windows 10, locale set to Romanian.

Untitled.jpg

Griga

It depends on the language and character table indicated in the EPG data. Both are displayed by the TransEdit Analyzer:

 

Zwischenablage01.png

 

Please check it for the channels in question. Use "Find" for searching the tree data if you want to find an EPG entry. Does TransEdit display the EPG correctly?

bogdan1980

The same.

On the Cinemax channel we also have Hungarian audio and subtitles. Years ago we sometimes got the Hungarian EPG instead of the Romanian one (and Hungarian audio as well).

But that problem is fixed now. This one is a diacritics problem.

Maybe it's because we have both rum and ron in the EPG settings. I need to test them both.

 

 

EDIT: The same happens in ProgDVB.

 

 

Capture.JPG

Capture2.JPG

Capture3.JPG

Griga

There is no character table indication, which means, the ISO 6937 diacritics character set applies as default according to the DVB specifications. However, DVBViewer can't handle it this way throughout because several Western European broadcasters don't comply with this rule (shame on them), in contrast to most Eastern European broadcasters. That's why DVBViewer checks if it's an Eastern European language.

 

The following languages are assumed to use ISO 6937 if no other character table is specified: hun, hrv, cze, slo, ces, slv, plk, rom, ron

 

rum is missing here. Do you think it should be added?
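
The rule described above can be sketched roughly as follows. This is only an illustration of the idea, not DVBViewer's actual code: the function name `assumed_table` and the placeholder returned for other languages are hypothetical, and the language list is simply the one quoted in this post.

```python
from typing import Optional

# Language codes quoted in this thread as "assumed to use ISO 6937".
# Note that "rum" is not in the list.
ISO6937_DEFAULT_LANGS = {"hun", "hrv", "cze", "slo", "ces", "slv", "plk", "rom", "ron"}


def assumed_table(language: str, indicated_table: Optional[str] = None) -> str:
    """Pick a character table for an EPG text (hypothetical helper)."""
    # An explicit character table indication in the EPG data always wins.
    if indicated_table is not None:
        return indicated_table
    # Without an indication, only the listed languages get ISO 6937.
    if language.lower() in ISO6937_DEFAULT_LANGS:
        return "ISO 6937"
    # What is used otherwise is not stated in this thread.
    return "other (not specified here)"


print(assumed_table("ron"))  # ISO 6937
print(assumed_table("rum"))  # falls through, because "rum" is missing
```

With such a rule, an EPG entry tagged "rum" without a character table indication would not be treated as ISO 6937, which is why the question of adding it matters.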

Griga

P.S. DVBViewer provides a "Convert EPG data to the ISO 6937 character set" tweak (-> launch Tweaker.exe) that you may want to try. It lets DVBViewer convert all EPG data without character set information to the ISO 6937 character set. However, it will only take effect on new EPG data, so it may be necessary to close DVBViewer and delete the already received EPG data in the file epg.dat (see configuration folder).

bogdan1980

I enabled "Convert EPG data to the ISO 6937 character set". I need more time to see if it's working.

 

bogdan1980

It's not working. The result is the same. It doesn't matter too much.

Griga

In your screenshot above you have marked an item "Pe campuri de cap?uni". The question mark replaces a character that can't be handled. Which character should appear there?

 

Unfortunately the screenshot doesn't show the corresponding language and character table indication because you didn't expand the item. What was it? That's what I need to know.

bogdan1980

It should be ş instead of ?.

And here, instead of "?coala de rock", it should be "şcoala de rock".

Our language is old Latin with some Slavic in it. We have ş, ţ, ă, î.

What is curious is that some of our channels report rum subtitles, while others report ron or rom.

Or even md (Moldavian), which is also the Romanian language, the same people, etc.

Maybe rum, ron, rom and md cause these problems?

I see RON in the language field.

Capture.JPG

Griga

Hmm, "ron" lets DVBViewer apply the ISO 6937 character table. Maybe the issue is caused by

 

Quote

Also, some diacritics used with the Latin alphabet like the Romanian comma are not included, using cedilla instead as no distinction between cedilla and comma below was made at the time.

 

https://en.wikipedia.org/wiki/ISO/IEC_6937

 

Is it always ş that appears as ? or are other characters also affected?

 

DVBViewer is able to handle Ş and ş (S with cedilla), which means it translates the ISO 6937 two byte character codes 0xCB 0x53 and 0xCB 0x73 to the corresponding Unicode character codes 0x015E and 0x015F (see here).
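
As an illustration of this two-byte scheme, here is a minimal sketch in Python. It is not DVBViewer's implementation. It assumes that an ISO 6937 diacritic byte comes before its base letter and covers only three diacritics (the cedilla 0xCB mentioned above, plus circumflex 0xC3 and breve 0xC6 as far as I know the table); the function name is made up.

```python
import unicodedata

# Partial table: ISO 6937 non-spacing diacritic byte -> Unicode combining mark.
# Only 0xCB (cedilla) is taken from this thread; the other two are assumptions.
COMBINING = {
    0xC3: "\u0302",  # circumflex, e.g. for î
    0xC6: "\u0306",  # breve, e.g. for ă
    0xCB: "\u0327",  # cedilla, e.g. for ş and ţ
}


def decode_iso6937_subset(data: bytes) -> str:
    """Decode ASCII plus the few two-byte sequences listed in COMBINING."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b in COMBINING and i + 1 < len(data):
            # Diacritic byte first, base letter second; NFC composes them
            # into a single precomposed character where one exists.
            base = chr(data[i + 1])
            out.append(unicodedata.normalize("NFC", base + COMBINING[b]))
            i += 2
        elif b < 0x80:
            out.append(chr(b))
            i += 1
        else:
            # Anything else is outside this sketch.
            out.append("\ufffd")
            i += 1
    return "".join(out)


print(decode_iso6937_subset(b"\xcb\x73coala de rock"))
```

Here the bytes 0xCB 0x73 become U+015F (ş) and 0xCB 0x53 become U+015E (Ş), matching the codes given above.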

 

It would be interesting to know which hex codes the broadcaster is using. This requires right-clicking the EIT stream on the right side of the Analyzer Window -> Hex View -> Set Packets to a high value like 5000 -> Restart -> wait a bit -> use the Find function -> Text String to find the string (an ASCII part of it or a sequence of ASCII characters nearby) in order to examine the corresponding bytes as hex codes.

bogdan1980

ţ and ş, from what I can see.

 

ş is BA.

ţ is FE.

 

 

Capture.JPG

Griga
4 hours ago, bogdan1980 said:

ş is BA.

ţ is FE.

 

No, it doesn't work this way. The Hex Viewer knows nothing about DVB character coding. On the right side it just uses the local Windows character set for displaying bytes as characters, so you can see if there is some text in the TS packets. What you see there as ş or ţ is meaningless.
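
A small sketch of why the text column is misleading. It assumes that a Romanian Windows locale uses code page 1250 for this kind of display and compares it with the Western code page 1252; the helper name is made up.

```python
def as_text(byte_value: int, code_page: str) -> str:
    """Show a single byte the way a viewer using this code page would."""
    return bytes([byte_value]).decode(code_page)


# The same two bytes, displayed with two different Windows code pages.
for value in (0xBA, 0xFE):
    print(f"{value:02X}: cp1250 -> {as_text(value, 'cp1250')!r}, "
          f"cp1252 -> {as_text(value, 'cp1252')!r}")
```

The byte 0xBA shows up as ş under cp1250 but as º under cp1252, so the character seen in the viewer says more about the PC's locale than about the DVB coding.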

 

That's why I have written:

 

On 4.1.2018 at 3:58 PM, Griga said:

an ASCII part of it or a sequence of ASCII characters nearby

 

because that's all you can find with the Hex Viewer. What you need to do is

  1. Find a word or sentence with a ? in the EPG, e.g. "Pe campuri de cap?uni"
  2. Search in the Hex Viewer for a part nearby that only contains ASCII characters (no special characters!), e.g. "campuri de cap"
  3. Check how the part where the ? came in has been coded by the broadcaster as hexadecimal bytes. The two (!) byte hex code "CB 73" for ş would be correct ISO 6937.
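
The same steps can be sketched in Python for anyone who has the raw section bytes in a file or buffer. The function name and the sample payloads are made up for illustration; they are not real captured data.

```python
from typing import Optional


def bytes_after(payload: bytes, ascii_part: str, count: int = 2) -> Optional[str]:
    """Return the bytes following an ASCII search string as hex, or None."""
    pos = payload.find(ascii_part.encode("ascii"))
    if pos < 0:
        return None
    start = pos + len(ascii_part)
    return payload[start:start + count].hex(" ").upper()


# Made-up payloads: correct ISO 6937 coding vs. a literal question mark.
good = b"Pe campuri de cap\xcb\x73uni"
bad = b"Pe campuri de cap?uni"

print(bytes_after(good, "campuri de cap"))  # CB 73
print(bytes_after(bad, "campuri de cap"))   # 3F 75
```

If the first byte after the ASCII part is 3F, the question mark is already in the transmitted data.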

bogdan1980

Is this OK?

Capture.JPG

Griga

Well, secret disclosed. :D The ? (ASCII code 3F or decimal 63) is already in the broadcast EPG data. The broadcaster is in trouble with character translation, not DVBViewer.

 

It should be "Mărturii şi evocări", but ş is replaced by ?. This usually happens if Unicode characters are translated to another character table. If the Unicode character is not present in this table, a question mark is inserted. At least Windows does it this way.
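
The effect can be reproduced in a few lines of Python. This is only a sketch: Latin-1 stands in for "a table that lacks ş", and the broadcaster's real conversion chain is unknown. The function name is made up.

```python
def to_table(text: str, codec: str) -> bytes:
    """Convert Unicode text to a byte table, replacing unknown characters."""
    # errors="replace" substitutes "?" for every character the codec lacks.
    return text.encode(codec, errors="replace")


converted = to_table("\u015fcoala de rock", "latin-1")
print(converted)     # the ş has turned into a plain question mark
print(converted[0])  # 63, i.e. hex 3F
```

Once the question mark has been written this way, the original character is lost, so no setting on the receiving side can bring it back.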

 

So all you can do is to write a nice letter to the broadcaster and ask him to fix his Unicode -> ISO 6937 character translation.

bogdan1980

So, thank you for your trouble.
