
Chinese EPG not correctly displayed


allanlee


I'm using DVBViewer Pro 5.3.1.0 with a TBS5881 DVB-C tuner to watch Singapore DVB-C FTA channels.

 

Some of the channels provide both English and Chinese EPG, while others only have English EPG.

 

The problem is that all English EPG data displays perfectly, but Chinese is displayed as unreadable characters.

 

I've tried using the Tweaker to (1) switch "Convert EPG data to local character set" on/off; (2) switch "Convert EPG data to ISO6937" on/off; (3) switch "Use UTF16 instead of Big5" on/off. Unfortunately no combination works.

 

On the STB provided by the carrier, the EPG displays correctly in Simplified Chinese.

 

I have attached the exported EPG HTML files and support.zip.

 

P.S. TransEdit has the same problem.

 

post-152470-0-66453600-1456588686_thumb.png

 

I would very much appreciate it if someone could help. Let me know if more info is required to fix this problem.

 

Thanks a lot!

 

 

 

 

support.zip

 

 

Link to comment

If you have any problems you should try the current DVBViewer version.

(updates are available in the Members Section)

 

The Tweaker only affects newly received EPG data. So you should delete epg.dat in the configuration folder after each change.

 

If there is a problem, TransEdit is the best tool to investigate it. But you should use the version 4.x beta from the Members Section.

Try "Assume Unicode character coding in case of Big5" in the settings there. Does it help in TransEdit?

http://www.DVBViewer.tv/forum/topic/2745-transedit/page-4#entry396436

Link to comment

Thanks!

 

Updated to 5.5.2.0, same problem.

 

I repeated the tweaking, deleting epg.dat every time; no luck.

 

Tried TE 4.1.0 beta with Unicode on/off and ISO6937 on/off; all combinations showed unreadable characters (same as in DVBViewer).

 

 

 

 

 

 

 

Is there anything else I can do to solve the issue, or help to fix the problem?

 

Many thanks again!

Link to comment

In TransEdit, open the Analyzer window on a transponder with channels that have EPG problems.

Right-click in the PID part and select "Select Main SI PIDs"; this will select some PIDs (including EIT).

Then "Start Recording" those for 30 sec. and post the .ts file. Maybe that helps to determine what is going on.

Link to comment

Is it possible to manually set the decoding character set in DVBViewer / TransEdit? Then I could use trial and error to find the right one.

 

It's not likely to be Big5 because Big5 is for Traditional Chinese while the STB is displaying Simplified Chinese.

Link to comment

I've examined the sample. The broadcaster specifies no character set whatsoever in the EPG data. So according to the DVB specifications the default latin character coding has to be assumed (ISO 6937). That's causing the issue.

 

The broadcaster must flag his content correctly, there is no real good way around it. Every character string that is coded as Simplified Chinese must begin with a #19 control character (hex 0x13), see DVB specifications, ETSI EN 300 468, Annex A.2.
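
For readers unfamiliar with the rule referenced here, the following is a minimal Python sketch of how a receiver might pick a decoder from the first byte of a DVB text string. The table is only a partial excerpt of Annex A. The function name `decode_dvb_text`, the Python codec names, the big-endian assumption for 0x11, and the latin-1 stand-in for ISO 6937 (Python ships no ISO 6937 codec) are assumptions of this example, not DVBViewer code.

```python
# Partial first-byte table in the spirit of ETSI EN 300 468 Annex A.
# The codec names on the right are Python's, chosen for this sketch.
FIRST_BYTE_CODECS = {
    0x11: "utf-16-be",  # ISO/IEC 10646 BMP (big-endian assumed here)
    0x13: "gb2312",     # GB-2312-1980, Simplified Chinese
    0x14: "big5",       # Big5, Traditional Chinese
    0x15: "utf-8",      # UTF-8 encoding of ISO/IEC 10646
}


def decode_dvb_text(raw: bytes) -> str:
    """Decode a DVB text field according to its leading selector byte."""
    if not raw:
        return ""
    first = raw[0]
    if first >= 0x20:
        # No selector byte: the default Latin table (ISO 6937) applies.
        # latin-1 is only a rough stand-in for ISO 6937 in this sketch.
        return raw.decode("latin-1")
    codec = FIRST_BYTE_CODECS.get(first)
    if codec is None:
        raise ValueError(f"selector byte 0x{first:02X} not covered by this sketch")
    # Skip the selector byte and decode the rest with the selected table.
    return raw[1:].decode(codec, errors="replace")


if __name__ == "__main__":
    title = "新闻"
    # Correctly flagged Simplified Chinese decodes as expected.
    print(decode_dvb_text(b"\x13" + title.encode("gb2312")))
    # Unflagged UTF-8 falls through to the Latin default and comes out garbled.
    print(repr(decode_dvb_text(title.encode("utf-8"))))
```

The second call illustrates the situation in this thread: the bytes are fine, but without a selector byte a standards-conforming decoder has to treat them as Latin text.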

 

DVBViewer/TransEdit could try some guessing, e.g. assume Simplified Chinese if the ISO language code is specified as "chi". But that may have unwanted side effects in other situations.

 

Is it possible to manually set decoding character set in DVBViewer / TransEdit?

 

Sorry, no. It can be considered for future DVBViewer versions, but not as a short-term solution.

Link to comment

Thanks for your reply, Griga.

 

Some Linux-based receivers are able to decode the Chinese EPG correctly (e.g. DM800Se SR4). So is VLC: when I stream the channel via LAN, the embedded EPG info is decoded correctly.

 

Some further study and checking showed that the characters seem to be UTF-8. Could you (or Tjod) help to confirm? If this is the case, could there be a temporary solution to display the EPG info correctly until a future version allows specifying the character set manually (e.g. another option in the Tweaker)?

Edited by allanlee
Link to comment
Some further study & check showed that the character seems to be UTF-8. Could you (or Tjod) help to confirm?

 

No UTF-8, as far as I can see. Does this look correct? It's from Channel 8 HD:

Zwischenablage01.png

Link to comment

You are right, it is UTF-8.

 

I tried an automatic UTF-8 detection, but got it wrong on the first attempt. What I've added now is

 

IF no_character_set_specified AND (language = chi) AND utf-8_auto_detected THEN assume_utf-8

 

However, the automatic UTF-8 detection is a kind of educated guess, which means it's not 100% fail-safe. Other character codings may be mistaken for UTF-8, causing trouble in other cases, though that is not very likely. We can try to use it in the next DVBViewer release (which will come soon), but if half of China complains about garbled characters we have done the wrong thing ;)
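
To make the rule above concrete, here is a minimal Python sketch of such a guarded guess. It is not the DVBViewer implementation: the function names `looks_like_utf8` and `assume_utf8` are hypothetical, and the detection shown (strict UTF-8 validity plus at least one non-ASCII byte) is just one plausible way to "auto-detect" UTF-8.

```python
def looks_like_utf8(raw: bytes) -> bool:
    """Heuristic: the bytes are valid UTF-8 and contain non-ASCII data."""
    try:
        raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    # Pure ASCII is valid UTF-8 too, but tells us nothing about the coding.
    return any(b >= 0x80 for b in raw)


def assume_utf8(raw: bytes, language: str, charset_specified: bool) -> bool:
    """Apply the three conditions of the rule quoted in the post."""
    return (not charset_specified) and language == "chi" and looks_like_utf8(raw)


if __name__ == "__main__":
    utf8_bytes = "新闻".encode("utf-8")
    gb_bytes = "新闻".encode("gb2312")
    print(assume_utf8(utf8_bytes, "chi", charset_specified=False))
    print(assume_utf8(gb_bytes, "chi", charset_specified=False))
    print(assume_utf8(utf8_bytes, "eng", charset_specified=False))
```

The language check narrows the guess to streams where it is most likely to be right. Some byte sequences from other codings can still happen to be valid UTF-8, which is the residual risk described above.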

Link to comment

Thank you soooo much! That seems to be a good solution so far.

 

Will this guessing also apply to the Recording Service (which I'm also using)?

 

As far as I know, most broadcasters in mainland China use GB2312, tagged as "zho", and the current version of DVBViewer works fine with them.

 

If necessary I can get you some samples from my friends in various Chinese cities.

Edited by allanlee
Link to comment
Will this guessing also apply to recording service?

 

Yes, in the next RS release.

 

As far as I know, most broadcaster in mainland China use GB2312, tag as "zho"

 

In my samples they use "chi" plus character set designation (e.g. Shen Zhen Cable, UTF-16). Same applies to Taiwan DVB-T.

Link to comment
  • 2 weeks later...

Tested with v5.6.0

 

The detection algorithm partially works. The text in the red boxes is garbled; it seems not to be detected as UTF-8.

 

Would it be a good idea to assume UTF-8 for all EPG data on the same channel (or even for the whole broadcaster) if, let's say, >80% of the EPG data fields are detected as UTF-8?
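
The idea could be sketched in Python roughly as follows. This is only an illustration of the proposal, not something DVBViewer is known to do; the function name `channel_is_utf8`, ignoring pure-ASCII fields, and the 0.8 default threshold are assumptions of this example.

```python
from typing import Sequence


def _valid_utf8(raw: bytes) -> bool:
    """True if the bytes decode as strict UTF-8."""
    try:
        raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


def channel_is_utf8(fields: Sequence[bytes], threshold: float = 0.8) -> bool:
    """Vote over all text fields of a channel that carry non-ASCII bytes."""
    # ASCII-only fields are valid in many codings, so they do not vote.
    voters = [f for f in fields if any(b >= 0x80 for b in f)]
    if not voters:
        return False
    hits = sum(1 for f in voters if _valid_utf8(f))
    return hits / len(voters) > threshold


if __name__ == "__main__":
    good = "中国新闻".encode("utf-8")
    broken = good[:-1]  # last character cut off mid-sequence
    # Nine clean fields and one truncated field: 90% of voters are valid.
    print(channel_is_utf8([good] * 9 + [broken]))
```

With a vote like this, a few damaged strings would no longer switch individual fields back to the Latin default, at the price of applying the guess to a whole channel at once.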

 

Sample ts here and here in case you need the raw EIT stream.

 

Thanks for the efforts again!

 

 

iJUusZq.png

 

mBjjo81.png

Link to comment

There are strings that are not valid UTF-8, e.g. the following picked from the debugger and handled as UTF-8 by Notepad++:

 

《中国新闻》是以海外华人、港澳台同胞、留学生、驻外使领馆及中资机构人员为目标的新闻节目。节目由国内外要闻、内地经济和社会新闻、对国内外重要新闻xE4xBA

 

with invalid codes displayed as hexadecimal numbers. This makes the UTF-8 detection fail because it checks for valid UTF-8.
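
In the quoted string the invalid bytes are E4 BA at the very end. These are a lead byte and one continuation byte of a three-byte UTF-8 sequence, which suggests the text was cut off in the middle of a character (an inference from this one sample only). A detection that tolerates such a truncated tail could look like the following Python sketch. It uses the standard library's incremental UTF-8 decoder; the function names are hypothetical and this is not how DVBViewer is known to work.

```python
import codecs


def is_utf8_allowing_truncated_tail(raw: bytes) -> bool:
    """Accept valid UTF-8 even if the last character is cut off."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
    try:
        # final=False keeps an incomplete trailing sequence buffered
        # instead of reporting it as an error.
        decoder.decode(raw, final=False)
    except UnicodeDecodeError:
        return False
    return True


def decode_dropping_truncated_tail(raw: bytes) -> str:
    """Decode UTF-8 and silently drop an incomplete last character."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
    return decoder.decode(raw, final=False)


if __name__ == "__main__":
    sample = "对国内外重要新闻".encode("utf-8") + b"\xe4\xba"
    print(is_utf8_allowing_truncated_tail(sample))
    print(decode_dropping_truncated_tail(sample))
```

Bytes that are invalid in the middle of the string still raise an error, so only the cut-off tail is forgiven.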

Link to comment
