Wrong characters


Hi,

I have a small problem, but I do not think it is a DVBViewer bug...

 

Some accented Czech characters do not appear correctly in the EPG.
A letter with a caron (hook) is displayed as O + the base character.
A letter with an acute accent is displayed as B + the base character.
Examples:
The correct characters: ě, š, č, ř, ž, ý, á, í, é, ú. Incorrectly displayed characters: Oe, Os, Oc, Or, Oz, By, Ba, Bi, Be, Bu
Příliš žluťoučký kůň úpěl... POiliOs kůň BupOel..OzluťouOckBy BupOel. The only substitute characters are O and B!

 

The wrong characters appear randomly; a substantial part of the EPG text is displayed correctly. The previous and next programs on the same station are typically displayed correctly. Errors appear in the program title only in exceptional cases.

 

The problem with incorrect display of characters is reflected in DVBViewer Pro 5.0 (I think in earlier versions too), as well as the Recording Service 1.25. The characters are also displayed wrongly (in the same places) in the EPG exported to EPGhtml.
Problem observed on a PC with W7 Pro 64bit SP1 and on a PC with W7 Home Prem. 32bit SP1.

Displaying the characters on the TV (Samsung) is no problem, and DVBViewer GE (v3.3) also displays the EPG text correctly.

The DVB-C provider is UPC Czech Republic. The code page used is UTF-8, details of.
UPC cable broadcasts programs in DVB-T as well. The EPG display problem also occurs in DVB-T.

The EPG information in the stream is flawless; all characters are displayed correctly (PID 18 > SID > EID > Extended Event > Text).

 

I know that encoding the Czech language is problematic, and I also know that historically there have been at least five code pages for Czech. I have used DVBViewer since 2005, and the encoding problem had already been practically solved.

Even so, I ask, does anyone have any ideas?

 

 

Link to comment
  • 4 months later...

The best way to check EPG data and the embedded character set information is the TransEdit Analyzer.

 

Download TransEdit from the members area, read in the ReadMe file how to install it, launch it, select a suitable transponder list for DVB-C / UPC scanning on the left side, select a frequency where the problem occurs on the right side and click Analyze.

 

In the treeview on the left side of the Analyzer Window expand the "EIT - Actual TS" node (Event Information Table for the actual transport stream, that's where EPG data is located) and the sub-nodes until you can look inside a Short Event Descriptor or Extended Event Descriptor. These data structures contain character set information and EPG text. Read more about the Service Information Tables displayed by the Analyzer here, if you like.

 

Does TransEdit display the text correctly? You may post a screenshot of what you see, or right-click one of the descriptor nodes containing text, select "Copy as XML" from the context menu and paste the Analyzer output here. Maybe I can see from it what's wrong.

Link to comment

Griga, I tried all of this, the problem is somewhere else...

Anyway thanks for the detailed walkthrough

 

Quote (me): Problem observed on a PC with W7 Pro 64bit SP1 and on a PC with W7 Home Prem. 32bit SP1.

Displaying the characters on the TV (Samsung) is no problem, and DVBViewer GE (v3.3) also displays the EPG text correctly.

...
UPC cable broadcasts programs in DVB-T as well. The EPG display problem also occurs in DVB-T.

The EPG information in the stream is flawless; all characters are displayed correctly (PID 18 > SID > EID > Extended Event > Text).

 

 

Link to comment
The problem with incorrect display of characters is reflected in DVBViewer Pro 5.0 (I think in earlier versions too), as well as the Recording Service 1.25.

 

What about the current DVBViewer Pro 5.2.7 and RS 1.26? You may need to delete the file epg.dat in the configuration folder (containing previously received EPG data) before checking a different version.

 

Without insight into the broadcast data I'm not able to find out what's wrong. It would be good if you could record a sample with TransEdit:

 

- Start the Analyzer, wait some seconds

 

- Right-click the PID list on the right side and click "Select main SI pids" in the context menu.

 

- Add the EIT to the selection by holding down the Ctrl key and clicking the EIT entry in the PID list

 

- Click "Start Recording" and let TransEdit record for a minute or so. The file is written to the directory that is specified on Settings -> Analyzer -> Output Directory.

 

- Upload the resulting TS file to some hoster.
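
Before uploading, the recording can be checked for EIT packets. The following is only a minimal sketch, assuming the file consists of plain, aligned 188-byte transport stream packets; the function names are made up for illustration and have nothing to do with TransEdit. The EIT travels on PID 0x12, which is the decimal "PID 18" mentioned in the first post.

```python
from collections import Counter

TS_PACKET_SIZE = 188   # bytes per MPEG transport stream packet
SYNC_BYTE = 0x47       # first byte of every TS packet
EIT_PID = 0x12         # PID 18, where DVB carries the EIT (EPG data)


def count_pids(data: bytes) -> Counter:
    """Count packets per PID in a buffer of back-to-back 188-byte packets."""
    counts = Counter()
    for offset in range(0, len(data) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = data[offset:offset + TS_PACKET_SIZE]
        if packet[0] != SYNC_BYTE:
            continue  # sketch only: misaligned packets are skipped, no resync
        # The PID is 13 bits: low 5 bits of byte 1 plus all of byte 2.
        pid = ((packet[1] & 0x1F) << 8) | packet[2]
        counts[pid] += 1
    return counts


def contains_eit(data: bytes) -> bool:
    """True if at least one packet with the EIT PID is present."""
    return count_pids(data)[EIT_PID] > 0
```

For a real recording, something like `contains_eit(open("sample.ts", "rb").read())` would do, where `sample.ts` is a placeholder name.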

Link to comment

Hi Griga,

I started this topic on April 11th; after that I had problems with the TBS6680 card. Since then I have reinstalled and upgraded DVBViewer many times (including the unpublished version 5.2.5).
I have used DVBViewer Pro 5.2.7 since its release, including HbbTV + MPEG5. The Recording Service does not work with the TBS6680 card (that is, it works, but not with two tuners, see here).

I have used TransEdit for a long time, but still it wanders … I am not sure whether the attached file also contains the EIT text descriptors, so I am also attaching the XML file (the file is large, so search for "Derren Brown"). For comparison, I enclose the EPG text copied from DVBV/Windows/EPG Details and the same text in DOC format (MS Word 2003), slightly edited for a better overview. I also enclose a copy of the same text from another computer, Derren_Asrock.txt.
For completeness, I also enclose htmlEPG. There you will find the station CT 2 (cze), 7.9.2013, 15:05, Derren Brown:… .

From a query on another forum I learned that the ISO 6937 standard applies (or should apply) to DVB encoding in the Czech Republic. Until now I thought that different code tables of the ASCII type were used ...

The package WrongChar.zip at http://ulozto.cz/xXb4cFeW/wrongchar-zip contains these files:

derren.txt
dereen.doc
Derren_Asrock.txt
Kabel UPC CR 658 09-07 14-22-20.PIDs.txt
Kabel UPC CR 658 09-07 14-22-20.ts
Kabel UPC CR 658.xml
Folder htmlEPG
EPG.html

Regards,
Retiree

Edited by Tony
Link to comment
Kabel UPC CR 658 09-07 14-22-20.ts

 

For reproducing the problem I only need a TS sample, no other files. Unfortunately your recording does not contain the EIT, so it is unusable for this purpose. Please try again and follow the instructions, particularly

 

- Add the EIT to the selection by holding down the Ctrl key and clicking the EIT entry in the PID list
Link to comment

This file is ok. It enables scanning and receiving the EPG in DVBViewer Pro as if broadcasted live.

 

The EPG contains no character set information, so in consideration of the language information "cze" it's most likely ISO 6937 like most Eastern European EPGs. As far as I can see DVBViewer Pro handles it correctly. I've tried two different versions (5.1.0 and 5.2.7). The EPG window output (see screenshot) equals the TransEdit Analyzer output.
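
How a receiver decides which table applies can be sketched as follows. This is a rough illustration of the first-byte rule as I understand it from Annex A of the DVB SI specification (EN 300 468), not code from DVBViewer or TransEdit; the function name is hypothetical and only the common selectors are covered. A text field that starts with an ordinary character (0x20 or above) carries no character set information and falls back to the default table, which is based on ISO 6937.

```python
from typing import Tuple

# Single-byte selectors and the tables they announce (0x08 is reserved).
SINGLE_BYTE_TABLES = {
    0x01: "ISO 8859-5", 0x02: "ISO 8859-6", 0x03: "ISO 8859-7",
    0x04: "ISO 8859-8", 0x05: "ISO 8859-9", 0x06: "ISO 8859-10",
    0x07: "ISO 8859-11", 0x09: "ISO 8859-13", 0x0A: "ISO 8859-14",
    0x0B: "ISO 8859-15",
}


def dvb_text_table(text: bytes) -> Tuple[str, bytes]:
    """Return (table name, remaining payload) for one DVB SI text field."""
    if not text:
        return ("empty", b"")
    first = text[0]
    if first >= 0x20:
        # No selector byte at all: the default table applies.
        return ("ISO 6937 (default table)", text)
    if first in SINGLE_BYTE_TABLES:
        return (SINGLE_BYTE_TABLES[first], text[1:])
    if first == 0x10 and len(text) >= 3:
        # 0x10 is followed by a 16-bit number n selecting ISO 8859-n.
        n = (text[1] << 8) | text[2]
        return ("ISO 8859-%d" % n, text[3:])
    if first == 0x11:
        return ("ISO 10646 BMP (UCS-2)", text[1:])
    if first == 0x15:
        return ("UTF-8", text[1:])
    return ("reserved or not handled in this sketch", text[1:])
```

A provider sending ISO 8859-2 correctly would therefore prefix each text with the bytes 0x10 0x00 0x02; without that prefix the receiver has to assume the default table or guess.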

 

So I guess something is wrong with your settings. Please close DVBViewer and launch Tweaker.exe in the DVBViewer installation directory (where DVBViewer.exe is), scroll down until you find the following checkboxes:

 

Convert EPG data to the local character set

Convert EPG data to the ISO 6937 character set

 

If they are ticked, untick them, click Save, delete the file epg.dat in the DVBViewer configuration folder, relaunch DVBViewer and check the EPG.

Zwischenablage01.png

Link to comment

P.S. I've just found a text where DVBViewer Pro obviously goes wrong from a certain point on in the long description of "Malá Velká Británie" on CT2, Sunday 00:45, e.g.

 

"pOriOcinOenBi"

 

should be

 

"přičinění"

 

It's not so easy to see if one can't read Czech, because the bug only seems to show up on certain occasions. I found it by searching for capitals within words, however. Looks like some event kicks the DVBViewer ISO 6937 handling "out of sync" when reading the text. TransEdit is not affected. Now I'll try to find out why...
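
One pattern in the examples is worth noting, although nothing in this thread confirms it as the cause: in ISO 6937 the caron, acute and diaeresis marks are the bytes 0xCF, 0xC2 and 0xC8, and clearing the top bit of those bytes gives the ASCII letters O, B and H. The sketch below encodes a word with a hand-written helper (`encode_6937_subset`, a hypothetical name that covers only four marks) and then masks every byte with 0x7F, which reproduces the garbled word quoted above. It illustrates the symptom only and is not DVBViewer's code.

```python
import unicodedata

# ISO 6937 non-spacing diacritic bytes for a few Unicode combining marks.
DIACRITIC_BYTE = {
    "\u0301": 0xC2,  # acute
    "\u0308": 0xC8,  # diaeresis
    "\u030A": 0xCA,  # ring above
    "\u030C": 0xCF,  # caron
}


def encode_6937_subset(text: str) -> bytes:
    """Encode ASCII letters plus the marks above the ISO 6937 way."""
    out = bytearray()
    for ch in unicodedata.normalize("NFC", text):
        decomposed = unicodedata.normalize("NFD", ch)
        base, marks = decomposed[0], decomposed[1:]
        for mark in marks:
            out.append(DIACRITIC_BYTE[mark])  # the mark byte precedes the letter
        out.append(ord(base))
    return bytes(out)


def strip_high_bit(data: bytes) -> str:
    """Clear bit 7 of every byte and read the result as ASCII."""
    return bytes(b & 0x7F for b in data).decode("ascii")


print(strip_high_bit(encode_6937_subset("přičinění")))
print(strip_high_bit(encode_6937_subset("könnte")))
```

If this reading is right, a diaeresis would show up as an H before the base letter, in the same way that the caron shows up as O and the acute as B.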

Link to comment

HI,
I mostly use RS only to monitor the data rate, so I had the items sorted by Data Rate. That is why I got the EIT selection wrong in the first recording…

Your attached picture shows the next Derren Brown programme, whose EPG shows only the first few lines, which are without errors. The errors occur in the other lines…

Although I am far from mastering the RS program, I have checked almost all of the options discussed so far, not so elegantly, but I verified them. I also verified the "Convert EPG data to …" options: no effect. ………………….

I can't respond quickly to your posts because my English is not good. It is also difficult for me to find garbled text; the easiest way to find it is in htmlEPG, and I think that will be easier for you too.
DVBV Pro and DVBV RS both show the same text incorrectly; by contrast, the same text appears correctly on the Samsung TV and in DVBV GE…
Some quick tips:
HBO, Mon 9.9.13, 4:20 Nezvratný osud
FilmBox, Sun 8.9.13 9:55 BugOFF!
CT2 + CT2 HD (different channels) Sun 8.8.13 18:50 Pravěk útočí

Link to comment

ZDF, Mon 9.9.13, 20:15 Spuren des Bösen Mit den Schmiergeldern kHonnte die SANDAG sich auch.... ???

 

From the htmlEPG saved 8.9.13 8:11, not the current EPG ???

Edited by Tony
Link to comment

When I "discovered" the existence of ISO/IEC 6937 and became very superficially acquainted with its contents (it's the classic ASCII table with leading codes for accented characters), a good (or stupid) idea occurred to me.

If the problem is caused by wrong control character handling and it is going to be addressed, please consider an option to turn off the display of diacritics. In some cases this option could be useful (if I understand ISO 6937 correctly).
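
The "leading codes" idea, and the optional switch to drop diacritics, can be sketched like this. It is a minimal illustration, assuming only ASCII letters and the non-spacing marks 0xC1 to 0xCF; the other high characters of ISO 6937 are not handled, and the function and parameter names are hypothetical rather than anything DVBViewer offers.

```python
import unicodedata

# ISO 6937 non-spacing diacritic bytes mapped to Unicode combining marks.
COMBINING = {
    0xC1: "\u0300",  # grave
    0xC2: "\u0301",  # acute
    0xC3: "\u0302",  # circumflex
    0xC4: "\u0303",  # tilde
    0xC5: "\u0304",  # macron
    0xC6: "\u0306",  # breve
    0xC7: "\u0307",  # dot above
    0xC8: "\u0308",  # diaeresis
    0xCA: "\u030A",  # ring above
    0xCB: "\u0327",  # cedilla
    0xCD: "\u030B",  # double acute
    0xCE: "\u0328",  # ogonek
    0xCF: "\u030C",  # caron
}


def decode_6937_subset(data: bytes, keep_diacritics: bool = True) -> str:
    """Decode ASCII plus prefixed diacritics; optionally drop the marks."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b in COMBINING and i + 1 < len(data):
            base = chr(data[i + 1])
            if keep_diacritics:
                # Unicode puts the mark after the letter; NFC composes the pair.
                out.append(unicodedata.normalize("NFC", base + COMBINING[b]))
            else:
                out.append(base)
            i += 2
        elif b < 0x80:
            out.append(chr(b))
            i += 1
        else:
            out.append("\ufffd")  # byte outside the subset handled here
            i += 1
    return "".join(out)
```

Because the mark is a separate byte in front of the letter, leaving diacritics out amounts to skipping one byte, which is why such an option would be cheap for this character set.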

Link to comment

Hi,

 

The biggest problem is finding corrupted text. After the update to version 4.2.8 I could not find any such text :D

 

Thanks Griga and Christian

 

 

BTW, how can I obtain the EPG overview shown in your attached picture from the attached TS file?

 

Retiree

Link to comment

I do not understand. I have a thirty-second recording of the main PIDs & EIT from the Recording Service (specimen.ts).

In the Recording Service I use Scan > Scan TS File…

In DVBViewer I use DVBViewer > Open Video… Although the file specimen.ts is open, the current EPG is always displayed. ???

I can live without it, but I will sleep better if I'm going to be able to

Link to comment
In DVBViewer I use DVBViewer > Open Video… Although the file specimen.ts is open, the current EPG is always displayed. ???

 

Ok, now I understand what you mean. You want to know how a file containing EPG data can be loaded so the data gets displayed in DVBViewer.

 

I did it by using a file device:

 

- Click the "+" button on Options -> Hardware.

 

- Set the number of virtual file devices to 1, click OK. DVBViewer will add a file device to the device list.

 

- Set the status of the file device to "Normal" and the tuner type to "Cable". Make sure that all other DVB-C devices are set to "Don't use" so DVBViewer must use the file device.

 

- Select the file device and click the Settings button. It opens a "Stream Setup" window.

 

- In this window click Add -> File. In the File Dialog Window select the TS file that shall be associated with the file device.

 

- Pick a suitable frequency from the list by clicking the frequency column (here I've selected 658000, which is the original frequency of the recorded transponder).

 

- Click OK in the Stream Setup and in the Options window.

 

Now the file device can be used like a DVB hardware. Scanning the 658000 frequency will let DVBViewer read PAT, PMT and SDT from the file and store the channels in the channellist. Tuning one of the channels will let DVBViewer read the EPG data in the file. The file device simulates cable reception. The only difference is that the data source is a TS file, not a real DVB hardware. If a TV channel was recorded in the file, you could watch it, record it... as if broadcasted live.

 

That's how I made the screenshot. However, you can't reproduce it anymore in this way, because DVBViewer will discard EPG data that is older than one day (except if you put your system clock back to the past).

Link to comment

What about the Romanian characters ș Ș ț Ț? DVBViewer doesn't display them correctly in the EPG data (TransEdit doesn't either), and this also affects how Tuesday (marți) is displayed in the Auto Popup EPG.

 

DVBViewer 5.2.8.1, Windows 7 x64

post-122439-0-73421600-1379354573_thumb.jpg

Edited by highyield
Link to comment
What about the Romanian characters ș Ș ț Ț?

 

What about a sample file? :) See above.

 

You may also try TransEdit 4.0.3 Beta from the beta section of the members area. If it displays the characters correctly I know what's going on.

Link to comment

Thanks.

 

As far as I can see the provider uses the ISO/IEC 8859-2 character set (or something similar) for Central/East European languages, but without flagging the EPG data accordingly. So, since TransEdit can't know what it actually is, it uses your local Windows character set, which obviously is Windows Western, thus converting Cireaşa to Cireaºa and marți to marþi.
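
The substitution can be reproduced with Python's standard codecs. In this small sketch `misread` is a made-up helper name, and the strings use the cedilla forms ş and ţ (U+015F, U+0163), which are the ones ISO 8859-2 contains.

```python
def misread(text: str, actual: str, assumed: str) -> str:
    """Encode text in its actual charset, then decode with the assumed one."""
    return text.encode(actual).decode(assumed)


for word in ("Cirea\u015fa", "mar\u0163i"):
    # Broadcast as ISO 8859-2 but read as Windows Western (cp1252).
    print(word, "->", misread(word, "iso8859_2", "cp1252"))
    # Read as Windows-1250 instead: these two letters come out intact.
    print(word, "->", misread(word, "iso8859_2", "cp1250"))
```

The letters ş and ţ sit at 0xBA and 0xFE in ISO 8859-2, the same positions they have in Windows-1250, while Windows-1252 has º and þ there.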

 

I guess most of it would be displayed correctly in TransEdit if your PC (or more precisely the language for non-unicode programs) was configured for the Windows-1250 character set.

Link to comment

I use the US system locale for non-unicode programs, because some programs use it to determine the language for their menus/options and I hate Romanian translations of menus. One example: when installing the Realtek Gigabit NIC driver while the character set is set to Romanian, you have to read the advanced options for that network controller in Romanian. For me, that's unbearable. Also, I think most non-unicode programs are designed to work and look better with the US character set.

 

Maybe there's room for improvement. An option in Tweaker to force the character set used by DVBViewer?

Thank you for clearing up this issue.

Link to comment

There is no clean solution for this problem. However, a dirty work-around may do...

 

According to the DVB specifications an EPG (particularly of Eastern European provenance) without character set information has to be regarded as ISO 6937, which is wrong in your case, because it's ISO 8859-2. Since your provider is not the only one violating the specifications in this way, we have implemented a heuristic method for distinguishing these two character sets. It works mostly, but not always, which means, with limited reliability. Whether it is applied or not depends on the language identifier, which is "ron" in your case. So we will put it in the list of languages that trigger the heuristic detection and then we'll see...
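
The thread does not describe the actual heuristic, so the following is only a guess at what such a test could look like, with a hypothetical function name. It relies on one structural difference: in ISO 6937 a byte in the range 0xC1 to 0xCF is a diacritic that is followed by a plain letter, whereas ISO 8859-2 text uses single high bytes for its accented letters.

```python
def guess_table(data: bytes) -> str:
    """Guess between ISO 6937 and ISO 8859-2 for unflagged EPG text."""
    prefix_hits = 0   # diacritic byte followed by an ASCII letter (6937 style)
    other_high = 0    # any other byte >= 0xA0 (more typical of 8859-2)
    i = 0
    while i < len(data):
        b = data[i]
        nxt = data[i + 1] if i + 1 < len(data) else 0
        is_letter = 0x41 <= nxt <= 0x5A or 0x61 <= nxt <= 0x7A
        if 0xC1 <= b <= 0xCF and is_letter:
            prefix_hits += 1
            i += 2
        else:
            if b >= 0xA0:
                other_high += 1
            i += 1
    # On a tie, stay with ISO 6937, the default for unflagged text.
    return "ISO 8859-2" if other_high > prefix_hits else "ISO 6937"
```

A test like this is easy to fool: in ISO 8859-2 the bytes 0xC1 to 0xCF are capital letters such as Á or Č, so a capitalised word followed by a lowercase letter looks exactly like an ISO 6937 diacritic pair. That matches the "limited reliability" mentioned above.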

Link to comment
