
ffdshow H.264 decoder with DVBViewer


CiNcH


We finally succeeded in convincing ffdshow developers that the incompatibility between DVBSource and ffdshow, when it comes to H.264 decoding, is actually due to a limitation within ffdshow. ffdshow uses ffmpeg/libavcodec as the actual H.264 decoder and has to hand a complete H.264 Access Unit to the library. ffdshow, acting as a DirectShow wrapper for the libavcodec library, takes this responsibility a level higher and also expects a whole H.264 Access Unit within one DirectShow sample provided by the splitter/demuxer (e.g. DVBSource). That makes ffdshow not only incompatible with DVBSource, but also with a lot of other demuxers (e.g. Elecard, MS MPEG-2 Demultiplexer, Sonic HD Demux,...).

Delivering a whole H.264 Access Unit means that the demuxer already has to know a lot about the H.264 specs and parse the bitstream, which is, in our opinion, the business of the decoder. So parsing the H.264 bitstream within the demuxer is more or less a design flaw.

 

We worked together with ffdshow developer Haruhiko Yamagata and finally succeeded in implementing the H.264 Access Unit parser within the decoder (or, more precisely, within the ffdshow DirectShow wrapper):

 

ffdshow Revision 2146

 

Changelog:

Improved compatibility with certain MPEG2 Transport Stream demultiplexers (e.g. DVBSource, Elecard) for files that contain H.264 video. Patch by Haruhiko Yamagata and CiNcH.
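To illustrate what such an Access Unit parser on the decoder side has to do, here is a minimal Python sketch. It is not the actual ffdshow patch. It assumes an Annex B byte stream in which every Access Unit begins with an Access Unit Delimiter NAL unit (nal_unit_type 9), which is the usual case for H.264 in an MPEG-2 Transport Stream. The class and method names are made up for illustration. Samples of arbitrary size are buffered, and a complete Access Unit is only handed on once the next delimiter has arrived.

```python
import re

# 3-byte Annex B start code followed by a NAL header byte whose
# nal_unit_type (low 5 bits) is 9 = Access Unit Delimiter (AUD).
AUD_START = re.compile(rb"\x00\x00\x01[\x09\x29\x49\x69]")


class AccessUnitAssembler:
    """Hypothetical helper: turns arbitrarily split samples into whole Access Units."""

    def __init__(self):
        self._buf = bytearray()
        self._synced = False  # True once the buffer starts at an AUD

    def push(self, sample: bytes) -> list:
        """Feed one sample; return the list of Access Units completed by it."""
        self._buf += sample
        units = []
        while True:
            if not self._synced:
                m = AUD_START.search(self._buf)
                if m is None:
                    # Keep the last 3 bytes, they may be the start of a start code.
                    del self._buf[:-3]
                    return units
                start = m.start()
                if start > 0 and self._buf[start - 1] == 0:
                    start -= 1  # 4-byte start code: the zero byte belongs to it
                del self._buf[:start]
                self._synced = True
            # The buffer starts with an AUD; the next AUD closes this Access Unit.
            first = AUD_START.search(self._buf)
            nxt = AUD_START.search(self._buf, first.end())
            if nxt is None:
                return units  # Access Unit not complete yet, wait for more data
            end = nxt.start()
            if self._buf[end - 1] == 0:
                end -= 1
            units.append(bytes(self._buf[:end]))
            del self._buf[:end]
```

The last Access Unit of a stream stays in the buffer, because without a following delimiter the sketch cannot know that it is complete. A real implementation would also need a flush at end of stream and a fallback for streams without delimiters.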

 

orf_1_hd.jpg

 

Known limitations:

  • libavcodec is extremely CPU intensive, especially with content that is not encoded using the slice concept; for such content the decoder is not threaded, so a high clock speed per core matters more than the number of cores
  • for better performance you should disable H.264 deblocking within ffdshow

Next step:

The next step is to take this idea to MPC Video Decoder, which is also based on ffdshow code but uses the DXVA/DXVA 2.0 capabilities of the graphics unit for decoding H.264, which should bring CPU usage down towards zero. My goal is to finally have a replacement for the CyberLink decoder when it comes to DXVA.

Edited by CiNcH
Link to comment

Please wait until one of the guys provides a build with an installer. I think your CPU won't do 1080i with libavcodec anyway. Better disable H.264 deblocking...

 

Builds will be available here!

 

Reclock

I will look into that more thoroughly as soon as it has its own forum section at S**Soft. The 100-page thread they currently have is not very friendly to work with...

Edited by CiNcH
Link to comment

Here is some more experience with ffdshow H.264 and DVB:

  • no multithreading with non-sliced material: for Anixe HD, for example, 100% CPU load on core 2 of a C2D 2.13 GHz -> audio out of sync, freezes and so on (not much performance is missing, though; on a C2D with 2.67 GHz it already worked flawlessly, however without deinterlacing...)
  • 1080i works perfectly with a CPU load of 60-80% when H.264 deblocking is disabled, again without deinterlacing being applied (so if you experience performance problems do that as well)
  • 720p50 ORF 1 HD works perfectly, load is equally distributed on both cores (40-50% on the C2D 2.13 GHz), their encoder seems to take advantage of the slice concept
  • switching from 720p50 to 1080i on the fly without a graph rebuild is not a good idea, as ffdshow may crash; to work around it, enable video format detection within DVBViewer
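Whether a channel uses the slice concept can be estimated by counting the slice NAL units per Access Unit. The following Python sketch does that for a raw Annex B H.264 elementary stream. It assumes that Access Unit Delimiters (nal_unit_type 9) are present and only counts NAL types 1 and 5 (slices without data partitioning); the function name is made up for illustration.

```python
import re

# Annex B start code plus the NAL header byte that follows it.
START_CODE = re.compile(rb"\x00\x00\x01(.)", re.DOTALL)


def slices_per_access_unit(stream: bytes) -> list:
    """Return the number of slice NAL units found in each Access Unit."""
    counts = []
    current = None  # None until the first Access Unit Delimiter is seen
    for match in START_CODE.finditer(stream):
        nal_unit_type = match.group(1)[0] & 0x1F
        if nal_unit_type == 9:  # Access Unit Delimiter: a new Access Unit begins
            if current is not None:
                counts.append(current)
            current = 0
        elif nal_unit_type in (1, 5) and current is not None:
            current += 1  # non-IDR slice (1) or IDR slice (5)
    if current is not None:
        counts.append(current)  # last Access Unit, possibly truncated
    return counts
```

A stream that always reports one slice per Access Unit gives a slice-threaded decoder nothing to distribute, which would match the single-core load seen on Anixe HD. Several slices per Access Unit would match the evenly distributed load on ORF 1 HD.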

And a quote by Dark Shikari (x264 developer) from yesterday:

libavcodec is now actually significantly faster decoding in singlethreaded mode (and the mt branch provides multithreading, which will be merged soon)

So the future may not look too bad.

Edited by CiNcH
Link to comment
  • 3 weeks later...

It is getting better and better :lol:

 

Now with ffmpeg-mt, 1080i Anixe HD with H.264 deblocking enabled:

 

ffmpeg_mt.jpg

 

This is live with DVBViewer on Anixe HD. It somehow crashes on 720p channels like arte HD and ORF 1 HD. Recorded files from the same channels work fine.

Edited by CiNcH
Link to comment

OK, it works now with 720p live streams. You'll have to enable video format detection. ffdshow seems to use the format information propagated on the video pin.

Link to comment
  • 2 weeks later...

This is really nice. I have just tested build 2227 and I get a really good picture with BBC HD and Luxe TV with deinterlace selected in the ffdshow options. I get about 74% CPU usage (using a 64x2 5000+) on BBC HD (1080i H.264, about 16 Mbps) and about 68% on Luxe TV HD (1080i H.264, about 10 Mbps). I do get slight jaggedness on diagonal edges, but I presume this is because I need to set ffdshow to resize (from 1080i to 720i for my 1280x768 16:10 monitor)?

 

It would be brilliant if you could get similar changes made to MPC-HC as this seems to play all my .ts DVBViewer recordings (and just about anything else) with DXVA no problem as long as VMR9 (windowed) or VMR9 (renderless) with Direct 3D Fullscreen and YUV mixing is selected.

 

Screenshots (click for fullsize)

 

BBC HD

th_bbchd1080iusingffdshow2227.jpg

 

BBC HD

th_bbchd11080iusingffdshow2227.jpg

 

Luxe TV HD

th_luxetvhd1080iusingffdshow2227.jpg

 

Luxe TV HD

th_luxetvhd11080iusingffdshow2227.jpg

Edited by dvbrewer
Link to comment
[*] for better performance you should disable H.264 deblocking within ffdshow

OK, I followed your tip and selected "skip deblocking when safe" in ffdshow h.264 options and processor use dropped by about 10%. I now get about 56% for Luxe TV HD and 64% for BBC HD.

Edited by dvbrewer
Link to comment

Just to add, using the ffdshow resize option sorted out the jaggies problem. I used the settings "Resize to screen resolution", "No aspect ratio correction" and "Luma method: Fast bilinear", checked "Interlaced" and unchecked "Accurate rounding". This only adds about 2-3% to the CPU usage.

Link to comment

I managed to get a better picture by changing the deinterlace method to linear blending (from linear interpolation) and the chroma method of resizing to fast bilinear. It does not seem to be necessary to select "Interlaced" in the resize settings, presumably because the deinterlacing is applied before the resizing.

 

Some more screencaps:

 

BBC 2 576i upscaled to 720

th_BBC2EnglandengGardenersWorld10-3120.jpg

 

BBC HD 1080i downscaled to 720

th_BBCHDNARFrancescosMediterraneanV-2.jpg

 

BBC HD 1080i downscaled to 720

th_BBCHDNARFrancescosMediterraneanV-1.jpg

 

BBC HD 1080i downscaled to 720

th_BBCHDNARFrancescosMediterraneanVoya.jpg

 

Luxe TV HD 1080i

http://i535.photobucket.com/albums/ee357/j...-3015-05-40.jpg

 

Luxe TV HD 1080i

http://i535.photobucket.com/albums/ee357/j...-3015-05-22.jpg

 

Luxe TV HD 1080i

http://i535.photobucket.com/albums/ee357/j...-3015-03-15.jpg

 

Luxe TV HD 1080i

http://i535.photobucket.com/albums/ee357/j...-3015-03-03.jpg

Link to comment

I have just tried the latest standalone filters from MPC (svn 866) and it looks like it is partly working as it decodes some of the picture correctly and uses DXVA -

 

th_BBCHDNARMutualFriends11-0622-43-43.jpg

 

At first it shows a blank screen then after a few seconds the above kind of image is shown- then it usually crashes after about a minute. If any more information would be useful i.e. from the renderer or graphstudio I can post that.

Edited by dvbrewer
Link to comment
I have just tried the latest standalone filters from MPC (svn 866) and it looks like it is partly working as it decodes some of the picture correctly and uses DXVA

The H.264 Access Unit parser has not been committed yet. So what MPC Video Decoder expects on its input is a whole Access Unit. If this is not the case (like with almost every demuxer except Haali and MPC) the decoder fails and acts unpredictably, like in your case by showing weird colors.

Edited by CiNcH
Link to comment

Now my DVBViewer always starts with those colors after testing out this ffdshow version. After a regraph, things return to normal.

 

Regarding this ffdshow version: I tested it on two computers (Core2 Duo 6320 and T5500) and both could barely show the HD channels due to maxed-out CPUs. But besides that it worked.

Link to comment
The H.264 Access Unit parser has not been committed yet. So what MPC Video Decoder expects on its input is a whole Access Unit. If this is not the case (like with almost every demuxer except Haali and MPC) the decoder fails and acts unpredictably, like in your case by showing weird colors.

Thanks for the feedback. It looks like they broke something on MPC svn 852/853 as all revisions since then no longer deinterlace some BBC HD recordings correctly. MPC svn 845 decoder does seem to work OK with DXVA in DVBViewer for BBC HD h.264 1080i .ts recordings if I change them to .mp4 or .mkv, progressive h.264 and VC-1 also seem to work.

Edited by dvbrewer
Link to comment

EDIT: OOOPS! I didn't realize this Thread was in English. No time to translate right now, maybe I will later. I suspect a lot of you will understand it anyway.

 

Wow, there is real progress with ffdshow. I just tried ffdshow_rev2227_20081017_mt.exe from this thread; here are my findings.

 

-As already noted, it crashes DVBV when you switch from a 1080i channel to a 720p channel. Pre-format detection reduces the likelihood of a crash but does not eliminate it. It is out of the question for me anyway, because it makes channel switching take much longer.

 

-Compared to CoreAVC it needs almost twice as much CPU power. I just tried it with the Champions League rerun on Premiere HD, and ffdshow frequently hits 100% (stuttering) on my AMD X2 @ 2.7 GHz.

 

-When I reduce the CPU load by switching off deblocking, however, the football match is 100% smooth! So ffdshow can now do what CoreAVC still cannot. CoreAVC stutters on Premiere HD football even though quite a few customers have been complaining about it on doom9 for years. Unfortunately, switching off deblocking is not an option, since the AVC standard does not permit it and it results in ugly artifacts. Example: the grass in the following screenshot:

alwayscm4.th.jpg

 

-There are no field order problems, unlike with CoreAVC, again despite many bug reports.

 

-In tests with many HDTV samples from my hard disk, ffdshow does slightly worse than Cyberlink and CoreAVC. Some files show small stutters that cannot be explained by CPU overload.

 

-A few questions: what exactly is an "mt" version of ffdshow? Where can I get the latest versions? Can the CPU load be expected to improve in the future?

 

-And one more important tip: nobody in this thread has mentioned yet how to use the graphics card's high-quality (and CPU-relieving) hardware deinterlacing with ffdshow. To do so, in ffdshow under "Output" turn off all colorspaces except NV12, turn on "set interlace flag" and set "Method" to "bob". To be safe, you can then also select the best deinterlacing method in the graphics driver, e.g. "Vector Adaptive" on ATI. This also gets rid of the jagged edges that were mentioned in this thread.

 

 

 

-Conclusion: if you have a very fast dual-core CPU (roughly 3 GHz for AMD, roughly 2.7 GHz for Intel), there is now a CPU decoder that lets you watch HDTV with DVBV, provided you accept a few crashes. However, anyone with such a machine most likely also has a graphics card with DXVA2 and can just as well use Cyberlink, which handles everything perfectly and is also free (just install the PDVD8 demo). If ffdshow gets a bit faster, though, people with older machines (like me: AMDX2@2.7GHz/X1950PRO) could finally watch HDTV with DVBV without problems and without having to buy new hardware. (And before anyone asks: no, the X1950PRO and all other DXVA1 graphics cards cannot do AVC, despite the manufacturers' lies, because they do not perform the deblocking mentioned above.)

Edited by J.B.
Link to comment
EDIT: OOOPS! I didn't realize this Thread was in English. No time to translate right now, maybe I will later. I suspect a lot of you will understand it anyway.

OK, babelfish helped out... It is weird that you are getting 100%; the highest I get is in the 70s (with 64X2 5000+ 2.66 2GB XP SP3), and that is with the ffdshow linear blending deinterlace and bilinear resize filters selected. I don't see any image deterioration when selecting "skip deblocking when safe", and it reduces usage by about 10%. The latest builds of ffdshow mt (multithread) are here.

Edited by dvbrewer
Link to comment
-A few questions: what exactly is an "mt" version of ffdshow? Where can I get the latest versions? Can the CPU load be expected to improve in the future?

 

mt means "multithreaded". This release is able to use more than one CPU (or core) for decoding. That isn't possible with the normal versions of ffdshow, which makes it more or less impossible to decode HD H.264 content with them. So the mt version is a big step forward. ;)

Link to comment
OK, babelfish helped out... It is weird that you are getting 100%; the highest I get is in the 70s (with 64X2 5000+ 2.66 2GB XP SP3), and that is with the ffdshow linear blending deinterlace and bilinear resize filters selected. I don't see any image deterioration when selecting "skip deblocking when safe", and it reduces usage by about 10%. The latest builds of ffdshow mt (multithread) are here.

 

Well, if I skip deblocking my CPU usage is similar to yours. It also depends strongly on the type of content. High-bitrate, high-motion, interlaced HD sports is MUCH harder to decode than a typical Premiere 6 Mbit/s "HD" movie. Whether disabling deblocking deteriorates the image quality depends on whether the stream you are watching uses deblocking or not. It's an encoder-side setting, and from my experience it's often used in sports broadcasts and less often in movies. Since you're using blend deinterlacing, I assume you don't watch sports broadcasts. BTW, I also tested the "skip deblocking when safe" setting, and it also decodes the video incorrectly.

 

Thanks for the info on the ffdshow mt project, I'll be watching its development.

Edited by J.B.
Link to comment

Here's a short translation of my post from yesterday.

 

-ffdshow-mt needs roughly double the amount of CPU power as CoreAVC, for smooth live HD football my AMDX2@2.7GHz is not quite enough.

 

-If the CPU is fast enough it's 100% smooth in DVBV which is not the case for CoreAVC.

 

-Disabling deblocking will save CPU power but this shouldn't be done as it's not AVC spec compliant and causes artifacts as seen in the screenshot in the above post (blocks on the grass).

 

-No field order problems, unlike CoreAVC.

 

-Important hint: to enable high quality GPU hardware deinterlacing, instead of using the CPU hungry ffdshow deinterlacers, you need to make the following settings in the ffdshow "output" tab: disable all colorspaces except NV12. Enable "set interlace flag" and set the "Method" to "bob" or "auto". This will also take care of the jaggies some people were mentioning.

 

-Conclusion. If your CPU is fast enough and you don't have a GPU that supports full AVC decode, ffdshow-mt enables you to finally watch HDTV on DVBV without any problems (except a few crashes). But the chances are that if you do have such a CPU you will also have a capable GPU so you can just as well use the Cyberlink decoder, which is free with the PDVD8 demo, and deals with HDTV just fine. So to be really useful I think ffdshow needs to improve CPU performance a bit more.

Edited by J.B.
Link to comment

OK, thanks for the superior translation! I tried your hint about disabling all colorspaces except NV12, but it actually increased my CPU usage: it gave about 76%, versus a 67% average using the ffdshow linear blend deinterlacer, on BBC HD (16 Mbps 1080i H.264). Regarding sports, I can't see any problem with some recordings of the Olympics that I have from BBC HD, which has a high bitrate. There is some football tomorrow on ITV HD, so I will try to take some screenshots to see whether I get the same effect you encountered. If you do try any other builds of ffdshow-mt, avoid svn 2303, as it caused instability on my system; 2307 seems to work OK.

Link to comment

OK, did a few CPU tests with this sample I recorded earlier: http://rapidshare.com/files/161961525/bundesliga.ts

 

First with all ffdshow settings on default --> CPUmax 98%

1defdl9.png

 

Then with deblocking disabled --> CPUmax 80%

2nodeblockmb8.png

 

Then with Linear Blend deinterlace enabled --> CPUmax 88%

3linearblendct5.png

 

Then with hardware deinterlacing --> CPUmax 80%

4hwdeintao8.png

 

And then with NV12 and hardware deinterlacing --> CPUmax 77%

http://img519.imageshack.us/img519/3875/5nv12rv9.png

 

So basically it looks like GPU deinterlacing does save some CPU time, and NV12 is in fact faster than leaving the default colorspace settings.

 

I also tested yadif, which is obviously way better than linear blend, but still not as good as hardware deinterlacing, and the clip ran at about 25 fps instead of 50 so I guess there's no hope for decent 1080i software deinterlacing any time soon.

 

And for laughs, another example of why disabling deblocking is a bad idea:

http://img146.imageshack.us/img146/5969/mats00000vp9.jpg

 

Sorry for not linking the last two images properly, but the forum software says I'm not allowed to post so many images.

Edited by J.B.
Link to comment

Unfortunately I cannot open your .ts file correctly in DVBViewer using ffdshow mt 2307 or mpc svn 845- if I change it to .mp4 or .mkv it plays for a short while then freezes? It plays in MPC-HC svn 849 OK with DXVA but I can't run the CPU tests for comparison.

Edited by dvbrewer
Link to comment

Is EVR the only right renderer to get DXVA under Vista? Skin tones are almost orange on HD videos.

Link to comment
Is EVR the only right renderer to get DXVA under Vista? Skin tones are almost orange on HD videos.

I think EVR Custom is the only other renderer that will work in Vista with DXVA- a discussion on Doom9 suggests your problem might be something to do with color conversion.

Link to comment
Oh, sorry. I don't use DVBV for file playback so I didn't know it can't handle TS files that have been cut using Tsremux. I'll upload the original uncut TS file, which can be played by DVBV. The perfmon graphs I posted are just the first 30 seconds of that file.

 

http://rapidshare.com/files/162003325/bund..._full.part1.rar

http://rapidshare.com/files/162003327/bund..._full.part2.rar

There is a bit of a jinx going on here :blush: When I try to unzip these with 7-zip, I get a "file is broken" error.

Link to comment
so I didn't know it can't handle TS files that have been cut using Tsremux.

Had a look at the file. It seems TSRemux messes up the PCR (Program Clock Reference) values in the TS. They start at 00:00, but the PTS (original Presentation Time Stamps) start at 20:35, as I can see on the DVBSource property page. Both should have approximately the same value. Such output is not DVB compliant. In effect, TSRemux indicates that video/audio should be presented 20 hours and 35 minutes after the data has arrived. Well, no problem, if you have enough time to wait... :blush: I could provide a work-around in the DVBViewer Filter, but to be honest, I don't feel like supporting such nonsense.
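For anyone who wants to check a file for this kind of PCR/PTS mismatch, here is a minimal Python sketch of how both clocks are read. It is not code from DVBSource. It assumes 188-byte TS packets and a video or audio PES header with the usual optional header fields; the function names are made up for illustration. The PCR is a 33-bit base at 90 kHz plus a 9-bit extension at 27 MHz, while the PTS is a 33-bit value at 90 kHz.

```python
def read_pcr_seconds(packet: bytes):
    """Return the PCR of one 188-byte TS packet in seconds, or None if it has none."""
    if len(packet) != 188 or packet[0] != 0x47:
        raise ValueError("not a 188-byte TS packet")
    has_adaptation_field = packet[3] & 0x20
    # packet[4] = adaptation_field_length, packet[5] = flags (0x10 = PCR_flag)
    if not has_adaptation_field or packet[4] < 7 or not packet[5] & 0x10:
        return None
    b = packet[6:12]
    base = (b[0] << 25) | (b[1] << 17) | (b[2] << 9) | (b[3] << 1) | (b[4] >> 7)
    extension = ((b[4] & 0x01) << 8) | b[5]
    return (base * 300 + extension) / 27_000_000


def read_pts_seconds(pes: bytes):
    """Return the PTS of a PES packet (starting at 00 00 01) in seconds, or None."""
    if pes[:3] != b"\x00\x00\x01":
        raise ValueError("not the start of a PES packet")
    if not pes[7] & 0x80:  # PTS_DTS_flags: top bit set means a PTS is present
        return None
    p = pes[9:14]
    pts = ((((p[0] >> 1) & 0x07) << 30) | (p[1] << 22)
           | ((p[2] >> 1) << 15) | (p[3] << 7) | (p[4] >> 1))
    return pts / 90_000


def format_clock(seconds: float) -> str:
    """Format seconds as hh:mm:ss."""
    s = int(seconds)
    return f"{s // 3600:02d}:{s % 3600 // 60:02d}:{s % 60:02d}"
```

In a compliant stream, the first PCR and the first PTS formatted with `format_clock` should be close to each other, with the PTS slightly ahead. A gap of many hours, as described above, points to rewritten PCR values.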

Link to comment
There is a bit of a jinx going on here :blush: when I try to unzip these I am getting a file is broken error using 7-zip?

 

I just downloaded and tested the files. They work fine with Winrar. I don't have 7zip so I don't know how it deals with split archives but you definitely need to download both files before you begin unzipping.

Link to comment
Had a look at the file. Seems TSRemux messes the PCR (Program Clock Reference) values in the TS up. They start with 00:00, but the PTS (original Presentation Time Stamps) with 20:35, as I can see on the DVBSource property page. Should be approximately the same value. Such an output is not DVB compliant. In this way TSRemux indicates that video/audio should be presented 20:35 hours after the data has arrived. Well, no problem, if you have enough time to wait...:blush: I could provide a work-around in the DVBViewer Filter, but to be true, I don't feel like supporting such a nonsense.

 

I see. Any advice on other cutting tools that do a better job?

Link to comment
I just downloaded and tested the files. They work fine with Winrar. I don't have 7zip so I don't know how it deals with split archives but you definitely need to download both files before you begin unzipping.

OK, WinRAR worked fine, thanks. The first part of the file seems to be a high bitrate - 19Mbps+ and yes it uses 90+% on my 64x2 5000+. From 1:10 the bitrate drops and I get 60s-70s%. So basically the usage is bitrate dependent- the highest bitrate I encounter for FTA UK HD is only 16Mbps (BBC HD) and I usually get around 70% average for that.

Edited by dvbrewer
Link to comment
Had a look at the file. Seems TSRemux messes the PCR (Program Clock Reference) values in the TS up. They start with 00:00, but the PTS (original Presentation Time Stamps) with 20:35, as I can see on the DVBSource property page. Should be approximately the same value. Such an output is not DVB compliant. In this way TSRemux indicates that video/audio should be presented 20:35 hours after the data has arrived. Well, no problem, if you have enough time to wait... I could provide a work-around in the DVBViewer Filter, but to be true, I don't feel like supporting such a nonsense.

I see. Any advice on other cutting tools that do a better job?

TSRemux does no cutting, only trimming, as far as I can see. Already tried TSPlayer?

 

Anyway, I've attached a tool that makes the TSRemux output playable in DVBViewer by searching for the first PMT and patching the PCR PID to 0x1FFF (= not present), thus preventing DVBViewer from using the faulty PCR. Applying the patch a second time restores the previous state (provided TSRemux still uses 0x1001 as PCR PID). Just drag & drop the TS file onto the program icon or the program window.
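For readers curious about what such a patch involves, here is a rough Python sketch of the core step. It is not the source of the attached tool. It works on one complete PMT section (starting at table_id, including the trailing CRC_32) that has already been located in the file. The function names are made up, and recomputing the CRC_32 is an assumption of this sketch; the post does not say whether the tool does so.

```python
PCR_PID_ORIGINAL = 0x1001  # PCR PID used by TSRemux according to the post
PCR_PID_ABSENT = 0x1FFF    # signals "no PCR present"


def crc32_mpeg2(data: bytes) -> int:
    """CRC-32/MPEG-2 as used for PSI sections (poly 0x04C11DB7, init 0xFFFFFFFF)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ 0x04C11DB7) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    return crc


def toggle_pcr_pid(section: bytes) -> bytes:
    """Switch the PCR PID of a PMT section between 0x1FFF and 0x1001."""
    if len(section) < 16 or section[0] != 0x02:
        raise ValueError("not a PMT section (table_id 0x02 expected)")
    patched = bytearray(section)
    # Bytes 8-9 hold 3 reserved bits followed by the 13-bit PCR_PID.
    current = ((patched[8] & 0x1F) << 8) | patched[9]
    new_pid = PCR_PID_ORIGINAL if current == PCR_PID_ABSENT else PCR_PID_ABSENT
    patched[8] = (patched[8] & 0xE0) | (new_pid >> 8)
    patched[9] = new_pid & 0xFF
    # The last 4 bytes are the CRC_32 over everything before them.
    patched[-4:] = crc32_mpeg2(bytes(patched[:-4])).to_bytes(4, "big")
    return bytes(patched)
```

Calling `toggle_pcr_pid` twice on a section whose PCR PID is 0x1001 gives back the original bytes, which mirrors the "apply the patch a second time" behaviour described above.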

TSRemux_Fixer.zip

Link to comment
