Author: geo Date: Oct 16, 2007 10:32
Hello everyone,
I've been working on some code to extract the audio from a multimedia
file and save the audio as a PCM WAV file. This is a pretty
straightforward problem and my solution was to render a filtergraph
for the multimedia file in question (mpg, mpa, mp2, mp3, aif, wav,
wma, wmv, whatever), then modify the graph to ditch the video and
hook-in the wavedest filter and a filewriter.
This worked like a charm, sort of.
Before I explain the problem, I think it's important to point out the
media files in question are usually close to 2 hours in duration, and
the encoding of these files is outside my control. For this reason, I
see quite an interesting spectrum of goofy bitrates, codecs, dropped
frames, etc. in the files I am trying to process.
Because of the specifics of my application, it is very important the
extracted PCM audio matches (as closely as possible) the original
audio stream within the source multimedia file. Or to put it in
simpler terms, the duration of the extracted PCM audio should very
closely match the original multimedia duration.
Author: Alessandro Angeli Date: Oct 16, 2007 12:18
> My ultimate goal is to identify (1) Why the durations are
> so dramatically different and (2) adapt the WavDest to
> 'insert' or 'delete' samples when appropriate to keep the
> timebase between my source and destination the same.
If your WAV files contain PCM data and they are not broken
(that is, you finished writing them without errors and they
are <= 2 GiB), then their duration can not be wrong (it is
just calculated from the number of samples in the data chunk
so I can't imagine anything that can go wrong there). It is
more likely that, if some of your source files use VBR
encoding, the duration of those files is reported
incorrectly, because there is no generic way to calculate it
without additional metadata, which depends on the file
format and is usually optional and sometimes unreliable.
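As a rough illustration of the point about PCM WAV duration, the sketch below derives the duration purely from the frame count and the sample rate stored in the file header. It uses Python's standard `wave` module; the helper names `wav_duration_seconds` and `make_silent_wav` are made up for this example and are not part of the original poster's code.

```python
import io
import wave


def wav_duration_seconds(source) -> float:
    """Duration of a PCM WAV: frames in the data chunk / sample rate."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def make_silent_wav(seconds: float, rate: int = 11025, channels: int = 1) -> io.BytesIO:
    """Build an in-memory 16-bit PCM WAV of silence (demo fixture only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * channels * int(seconds * rate))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(wav_duration_seconds(make_silent_wav(2.0)))
```

Because nothing but the header fields and the data length enters the calculation, a mismatch with the source file's reported duration has to come from the source side or from the samples that were actually delivered.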
Author: geo Date: Oct 16, 2007 14:31
Thanks for the reply Alessandro...
>If your WAV files contain PCM data and they are not broken
>(that is, you finished writing them without errors and they
>are <= 2 GiB), then their duration can not be wrong (it is
>just calculated from the number of samples in the data chunk
>so I can't imagine anything that can go wrong there).
Yes, that is true. I realize the duration of my output PCM WAV can be
calculated quite easily from the number of channels, bits per sample,
and sample rate, and that there is no way to dispute that calculation.
It simply is what it is...
But the relationship between the duration of the output PCM and the
original multimedia file (as reported by DirectShow) does not seem to
bear any resemblance...
For instance, after extracting the audio from a 2 hour MPEG-1 stream,
the original stream may report a duration of 2 hours, 0 minutes, 0
seconds, and 0 milliseconds but the PCM might contain only 1 hour, 58
minutes, 13 seconds, and 200 milliseconds of actual samples. What
happened to the difference?
Author: Mark Salsbery [MVP] Date: Oct 16, 2007 15:17
How are you getting the duration of the original file?
Unless it was clocked off the audio, then rarely (if ever) are they going to
match.
30 seconds in 2 hours sounds like normal clock drift. If the source stream
is reporting duration based on wall clock time (or any other timer besides
the audio sample count) then there won't be a match. Even on the same
machine, same clock, there can be differences.
The only real duration comparison is comparing audio sample count. Even
then, any compression on the audio stream(s) needs to be completely lossless
as far as sample count is concerned.
Mark
--
Mark Salsbery
Microsoft MVP - Visual C++
Author: Alessandro Angeli Date: Oct 16, 2007 16:07
> Anyway, if I took graphedit, and rendered my MPEG1
> stream, and disconnected only the video portion of the
> graph, and clicked play would the resulting playback time
> equal the reported duration? (I have to try that
> experiment) When the filtergraph is actually rendering
> the audio in real time (not just processing it as a
> collection of bytes) will the audio be sped up or slowed
> down to keep time with the reference clock? If so,
> couldn't my wavedest be smart enough to do the same by
> introducing or subtracting 'spacer' samples into my
> output stream?
Author: Michel Roujansky - DirectShow Consultant and Trainer Date: Oct 17, 2007 03:24
On Oct 17, 12:17 am, "Mark Salsbery [MVP]"
wrote:
> How are you getting the duration of the original file?
>
> Unless it was clocked off the audio, then rarely (if ever) are they going to
> match.
>
> 30 seconds in 2 hours sounds like normal clock drift. If the source stream
> is reporting duration based on wall clock time (or any other timer besides
> the audio sample count) then there won't be a match. Even on the same
> machine, same clock, there can be differences.
>
To reinforce this point, we found on one project an 8/1000 (nearly 1%)
difference between source and sink audio clocks when streaming between
different machines.
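To put the 8/1000 figure in perspective, here is a small arithmetic sketch. It applies that ratio to a 2 hour recording and compares it with the example given earlier in the thread (a reported 2:00:00.000 against 1:58:13.200 of extracted PCM). The function names are made up for illustration.

```python
def drift_seconds(duration_s: float, clock_error_ratio: float) -> float:
    """Time accumulated between two clocks that disagree by a fixed ratio."""
    return duration_s * clock_error_ratio


def relative_difference(reported_s: float, measured_s: float) -> float:
    """Signed difference between two durations, as a fraction of the reported one."""
    return (measured_s - reported_s) / reported_s


if __name__ == "__main__":
    two_hours = 2 * 60 * 60
    extracted = 1 * 3600 + 58 * 60 + 13.2  # 1:58:13.200 from the earlier example
    print(f"8/1000 drift over 2 h: {drift_seconds(two_hours, 0.008):.1f} s")
    print(f"example mismatch: {relative_difference(two_hours, extracted):+.2%}")
```

An 8/1000 disagreement comes to about 57.6 seconds over two hours, while the earlier example is off by about 106.8 seconds (roughly 1.5%), so that particular gap is larger than this drift figure alone would explain.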
Author: geo Date: Oct 17, 2007 10:33
Hi Mark,
Thanks for the help. In response to your question:
>How are you getting the duration of the original file?
Nothing fancy, just rendering the file and querying the Duration
property.
In an attempt to better wrap my brain around all this I've been
looking at two example files...
The first is an MPEG with a reported duration of 6000.901. After I
extract the audio I'm left with a WAV file reporting a duration of
5997.597.
The interesting thing here is I wrote a quick VB hack to render both
files and then seek within both to just a few minutes before the end.
If I play them back the audio is surprisingly close, so apparently my
'missing time' within the wav file consists mostly of samples that
'pad' the end of the audio stream. It's like the last 3 seconds of
video have no corresponding audio.
Author: Chris P. Date: Oct 17, 2007 11:12
> The second file is an MP3 with a duration of 3775.044. After
> extraction the Wav file reports a duration of 3792.822, 17 seconds
> longer! When I try my little 'sync playback hack' these files aren't
> even close. However, this is where it gets really neat. I know the
> sample rate of my output wav is 11 kHz, and I can therefore calculate
> my output wav is 195,580 'samples longer' than my original mp3 (I know
> I'm mixing up apples and oranges here, but stay with me a sec). If I
> do some math, I discover that had I dropped every 214th sample when
> creating my wave file the durations would have been a spot on match,
> and more importantly seeking within the MP3 would exactly match
> seeking within the WAV.
Are you doing any sample rate conversion when converting to a WAV file?
The sample rate converters that come with Windows are not sample time
accurate and will cause huge problems.
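The arithmetic in the quoted passage can be reproduced with a short sketch. It assumes "11 kHz" means exactly 11000 Hz (the real rate may well be 11025 Hz) and uses the two durations quoted above; `drop_interval` is a made-up helper name.

```python
def drop_interval(source_s: float, wav_s: float, rate_hz: float):
    """Return (surplus samples in the WAV, how often one sample would be dropped)."""
    surplus = (wav_s - source_s) * rate_hz
    total = wav_s * rate_hz
    return surplus, total / surplus


if __name__ == "__main__":
    # Assumption: 11 kHz taken as exactly 11000 Hz.
    surplus, interval = drop_interval(3775.044, 3792.822, 11000)
    print(f"surplus samples: {surplus:.0f}")
    print(f"drop one sample in every {interval:.2f}")
```

With these inputs the surplus is about 195,558 samples (the quoted 195,580 corresponds to rounding the difference to 17.78 s), and the interval is a little over 213, in line with the "every 214th sample" estimate. The interval depends only on the two durations, not on the sample rate.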
Author: geo Date: Oct 17, 2007 11:37
>>Are you doing any sample rate conversion when converting to a WAV file?
Nope. The filtergraph is simply the rendered graph with the video
removed and the wavdest and filewriter inserted into the end of the
audio chain.
Since the output is a PCM WAV, I'm going to try writing a quick
post-process method to copy the data but insert/drop samples as
appropriate. I will then do some spot check QC against problem files
to see where I am.
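One possible shape for such a post-process step is sketched below. It is only an assumption about how the copy could work, not the poster's actual code: `retime_frames` is a hypothetical helper that picks the nearest earlier source frame for each output frame, which drops frames evenly when the target is shorter and repeats them evenly when it is longer.

```python
def retime_frames(pcm: bytes, frame_size: int, target_frames: int) -> bytes:
    """Stretch or shrink raw PCM to target_frames by repeating or dropping frames.

    frame_size is bytes per frame (channels * bytes per sample).
    """
    source_frames = len(pcm) // frame_size
    if source_frames == 0 or target_frames <= 0:
        return b""
    out = bytearray()
    for i in range(target_frames):
        # Map output frame i back onto the source timeline (integer maths only).
        src = i * source_frames // target_frames
        out += pcm[src * frame_size:(src + 1) * frame_size]
    return bytes(out)


if __name__ == "__main__":
    print(retime_frames(b"abcdef", 1, 3))  # shrink: every other frame is kept
    print(retime_frames(b"abcd", 1, 8))    # stretch: every frame is repeated
```

Dropping or repeating whole samples is audible on some material, and a per-frame Python loop is slow on two hours of audio, so this is only meant to show the bookkeeping. An interpolating resampler would be the more careful choice where quality matters.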
I am curious however about the IMediaSample GetTime and GetMediaTime
methods. The only documentation I can find seems pretty sketchy about
what they actually mean, and what types of values to expect. I thought
I might be able to compare the StartTime against the number of samples
I've previously processed to see if the numbers line up...
:-) Geo...
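The comparison described above could be sketched roughly as follows. DirectShow reference times are expressed in 100-nanosecond units; everything else here is an assumption for illustration: the input is taken to be a list of (start time, byte count) pairs logged per buffer, and `timestamp_drift` is a made-up name, not a DirectShow call.

```python
UNITS_PER_SECOND = 10_000_000  # reference times are in 100-ns units


def timestamp_drift(buffers, rate_hz: int, frame_size: int):
    """For each (start_time_100ns, byte_count) pair, return the stamped start
    time minus the time implied by the frames counted so far, in seconds."""
    frames_seen = 0
    drifts = []
    for start_100ns, byte_count in buffers:
        expected_s = frames_seen / rate_hz
        stamped_s = start_100ns / UNITS_PER_SECOND
        drifts.append(stamped_s - expected_s)
        frames_seen += byte_count // frame_size
    return drifts


if __name__ == "__main__":
    # Hypothetical log: three 1-second buffers of 16-bit mono audio at 11025 Hz,
    # where the third buffer is stamped 50 ms later than the sample count implies.
    log = [(0, 22050), (10_000_000, 22050), (20_500_000, 22050)]
    for drift in timestamp_drift(log, rate_hz=11025, frame_size=2):
        print(f"{drift:+.3f} s")
```

A drift that grows steadily would suggest a rate mismatch, while sudden jumps would point to gaps or dropped data in the source stream.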
Author: Alessandro Angeli Date: Oct 17, 2007 12:11
> I am curious however about the IMediaSample GetTime and
> GetMediaTime methods. The only documentation I can find
> seems pretty sketchy about what they actually mean, and
> what types of values to expect. I thought I might be able
> to compare the StartTime against the number of samples
> I've previously processed to see if the numbers line
> up...