Does digital audio work like digital images... i.e. more bits for the high end?

Discussion in 'Audio Hardware' started by Kustom 250, Oct 16, 2008.

Thread Status:
Not open for further replies.
  1. Key

    Key New Member

    Location:
    , USA
    I worked in audio first and then used my knowledge of digital audio to help me with Photoshop and digital images.

    While I was figuring everything out I came up with a bunch of analogies between the two different fields.

    I basically equate excessive contrast, where white details are washed out, with digital clipping. Details being washed out in the blacks I associate with undersaturation of a digital signal - the dynamics in the softest part of the file.

    I equated resolution or the size of the picture with the sample rate of a music file.

    And well, bit-depth is bit-depth. The biggest problem that audio engineers say they hear with 16-bit audio can actually be seen in a 16-bit image: gradient stepping. That is, fades or gradient changes are stepped instead of smooth. Go into Photoshop and make a gradient from absolute white to absolute black - you will see jumps in the color instead of smooth transitions.
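
    A minimal sketch of the stepping effect, not something from the original post: it quantizes an ideal linear white-to-black ramp at a few bit depths and reports how many distinct code values appear and how wide the widest flat band is. The 4096-pixel width and the function names are assumptions chosen for illustration, and a real editor adds dithering, colour management and display conversions on top of this.

```python
def quantize_ramp(num_pixels, bits):
    """Quantize a linear 0..1 ramp spread over num_pixels to the given bit depth."""
    max_code = 2 ** bits - 1
    return [round(i / (num_pixels - 1) * max_code) for i in range(num_pixels)]


def distinct_levels(codes):
    """Number of different code values actually used by the ramp."""
    return len(set(codes))


def widest_band(codes):
    """Longest run of identical neighbouring code values, in pixels."""
    longest = run = 1
    for prev, cur in zip(codes, codes[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest


if __name__ == "__main__":
    width = 4096  # assumed gradient width in pixels
    for bits in (4, 8, 16):
        codes = quantize_ramp(width, bits)
        print(bits, "bits:", distinct_levels(codes), "levels,",
              "widest band", widest_band(codes), "px")
```

    At this width the 16-bit ramp has no repeated neighbouring values at all, so any visible steps in a 16-bit gradient would have to come from somewhere later in the chain.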

    There were other analogies I came up with. Like contrast and color correction are similar to EQ and dynamic range compression.
     
  2. Natt

    Natt Forum Resident

    Location:
    Acton, Canada
    Clipping is clipping in both audio and images.

    Resolution is equivalent to sample rate.

    Bit depth in both is the degree of precision in storing the dynamic range. The problem with Photoshop is that you're viewing the image through a variety of conversions and colour space calibrations, and then probably on an 8-bit display, which means you'll see banding in gradients that just isn't there in the data, but is an artifact of how the data is displayed.

    The biggest difference between the two is that audio data is usually stored referenced to linear voltage, whereas the linear voltage from the image sensor in the camera is gamma encoded. A CRT display has an inherent inverse gamma curve, which cancels out the gamma in the record path, allowing you to view a nice looking image that has been perceptually compressed in the gamma curve to allow the same tonal range / dynamic range to be stored using fewer code values in the digital signal. LCDs are inherently linear light devices, hence they simulate the inverse gamma curve digitally.
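
    To make the "fewer code values" point concrete, here is a rough sketch that counts how many 8-bit codes land in the darkest 1% of linear light with and without gamma encoding. It assumes a plain power law with gamma 2.2; real standards such as sRGB use a piecewise curve, and the names below are illustrative only.

```python
GAMMA = 2.2  # assumed simple power-law gamma, not the exact sRGB curve


def encode(linear):
    """Linear light (0..1) to gamma-encoded signal (0..1)."""
    return linear ** (1.0 / GAMMA)


def decode(encoded):
    """Gamma-encoded signal (0..1) back to linear light (0..1)."""
    return encoded ** GAMMA


def codes_below(linear_threshold, bits, gamma_encoded):
    """Count code values whose linear-light meaning is below the threshold."""
    max_code = 2 ** bits - 1
    count = 0
    for code in range(max_code + 1):
        value = code / max_code
        linear = decode(value) if gamma_encoded else value
        if linear < linear_threshold:
            count += 1
    return count


if __name__ == "__main__":
    print("linear coding:", codes_below(0.01, 8, gamma_encoded=False))
    print("gamma coding: ", codes_below(0.01, 8, gamma_encoded=True))
```

    Under these assumptions the gamma-encoded signal spends roughly ten times as many codes on the shadows, which is the perceptual compression described above.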

    Audio is recorded, amplified and played back linearly, but often the dynamic range is compressed so as to make the music "louder" or allow it to be heard easier in noisy environments. Images go through pretty strong compression curves to make them nicely viewable too.
     
  3. Key

    Key New Member

    Location:
    , USA
    Yes, you must always consider the signal chain.

    I am still trying to figure out how to practically use ICC calibrations in certain workflows.
     
  4. Natt

    Natt Forum Resident

    Location:
    Acton, Canada
    There are certainly a lot more potential problems with video and images due to the gamma curves (i.e. non-linearities) involved. Also, display technology still lacks the full dynamic range to display images properly. Many display technologies lack good black levels.
     
  5. Key

    Key New Member

    Location:
    , USA
    Yeah, I found in the lab when printing that ICC referencing helped me save a little bit of paper here and there, but it still came down to trial and error in a lot of respects.
     
  6. Sorry if you don't like it, but those are K.G.'s words. On this forum I am compelled to give them more weight than yours or mine for that matter.
     
  7. bdiament

    bdiament Producer, Engineer, Soundkeeper

    Location:
    New York
    Hi Lee,

    As an opinion, it is no sillier than any other, no sillier than anything you or I would say. It is an opinion. One can agree or disagree.

    In fact, if you listen to a mic feed and compare it to a DSD version and a 16/44 version, listening in particular to the top end, one could make the argument that it is the deficiencies of DSD that are pretty obvious.

    There appear to be some audio phenomena that some folks hear and others do not. Some folks are not bothered by the top end of DSD. I find it fatiguing. With other phenomena, there are folks that don't like the sound of things I do. I don't know how to explain this but it is something I've observed. We all have different sensitivities to different aspects of sound.

    Best regards,
    Barry
    www.soundkeeperrecordings.com
    www.barrydiamentaudio.com
     
  8. LeeS

    LeeS Music Fan

    Location:
    Atlanta
    I should have been more diplomatic here. I respect Kevin quite a bit but I find the top end is one of the improvements of using the faster sampling of DSD. The noise in the upper band is really pushed out. Paul Stubblebine, John Atkinson and others also don't hear it so there is not a consensus on this issue. I think playback gear may play a role perhaps.

    If you listen to Channel Classics La Stravaganza SACD, you will hear a very natural violin sound. I've yet to hear a CD that comes close to the sweetness of simply recorded violin in DSD.

    I did a violin and piano recording today in 24/176. Listening back it really captured the sweetness of tone in the instrument. Morinix doesn't believe in the value of higher rez recordings but I find it is the only way to really get good sound from digital. 16/44 has come a long way but it misses quite a bit.
     
  9. Metoo

    Metoo Forum Hall Of Fame

    Location:
    Spain (EU)
    LeeS, are you sure that when you are referring to the top end you are not really talking about the midrange? Violins only go up to about 4kHz (see chart here: http://www.psbspeakers.com/audio-topics/The-Frequencies-of-Music). Perhaps you are mistaking midrange clarity/resolution for the high end.

    I remember that when I started listening to SACD there was this 'noise' in the high frequencies that I didn't like. It was softened by a change of cables, but I wouldn't doubt that I also learned to tune it out. I remember this specifically from my first listen to Dark Side of the Moon on SACD.

    IMHO, the definition and clarity that makes SACD sound very 'natural' is in the mid range.

    There is something that I also notice in SACDs: they do not handle overcompressed (not to mention clipped) sound too well. There is some kind of distortion I hear in the loud parts of the Elton John SACDs (where you get more of a block waveform) that is not present in the parts where the peaks seldom - if at all - come near the top. I have tried 'softening' this with the use of a declipper and the most that I have achieved is a little better sound.

    There was a paper published on the Internet a long time ago that said that SACD has more resolution than CD in the mid to low frequencies, but at around 10kHz its sound is worse than CD. The reason given, IIRC, was that they chose a lower sampling frequency than would have been optimum. Perhaps someone who has read this information can point us to it.

    There are, obviously, beautiful recordings of violin on SACD. Julia Fischer's "J.S. Bach Sonatas and Partitas" comes to mind.

    I agree with this.
     
  10. LeeS

    LeeS Music Fan

    Location:
    Atlanta
    I used the violin example because I find that violin has a "sweet" tone that is not captured by 44.1kHz. When we record the violin we hear most of the tone and string sound very clearly and naturally at 24/176. We then downrez the file to 16/44.1 for our musicians. When we do that several things stick out:

    **We don't hear the spatial cues of the hall so the ambiance goes MIA.
    **The sweetness goes away. I'm not sure why this happens but it does.

    The top end air and energy I believe is only captured by higher resolution formats.
     
  11. Publius

    Publius Forum Resident

    Location:
    Austin, TX
    I'd argue that resolution analogies between digital audio and digital images are mostly bogus.

    Visual images in perfect focus have sharp discontinuities between objects. That is, ignoring things like the Rayleigh criterion, the bandwidth of a visual image is naturally infinite. So increasing the imaging resolution can directly and obviously lead to improved visual quality. And moreover, spatially lowpassing the visual image leads to quite obvious artifacts (like what happens with overcompressed JPEGs).

    Audio samples, in comparison, have relatively "soft" discontinuities between audio "objects" - the attack time of most instruments is far longer than the sample periods in question. Many instruments don't even have harmonics at or over 20kHz (unlike in the visual field where virtually every sharp outline will cause high-order harmonics). And it's still quite controversial (to put it mildly) as to whether or not higher bit depth or sampling rate causes audible improvements, unlike the relatively universal agreement in the video world that even 1080p is well below the limits of visual acuity.
     
  12. bdiament

    bdiament Producer, Engineer, Soundkeeper

    Location:
    New York
    Hi Lee,


    I agree regarding hi res over CD "standard". I've run 24/96 for a long time and the differences are significant and very positive. Lately, after completing a long period of experimentation, I've switched to 24/192 and again, the sonic leap upward is quite appreciable.
    (I understand there are some folks who debate this. Then again, this is audio. There are folks who will debate anything. If the Sun was an audio component, they'd debate whether it really rises or whether that is just some scam invented by cable companies. ;-} Let them debate. I'm too busy having fun and enjoying the sonic benefits of high res.)

    However, when you "downrez" you are using an SRC algorithm. In my experience, the symptoms you mark with asterisks are easily attributable to most of the SRC I've heard. The results are not the same as a well made original 16/44 recording would be.

    Best regards,
    Barry
    www.soundkeeperrecordings.com
    www.barrydiamentaudio.com
     
  13. Natt

    Natt Forum Resident

    Location:
    Acton, Canada
    1080p is indeed below the limits of visual acuity, but only if you're sitting close to the television. Move further away, and you reach the point where it meets the limits, and move further away still and you go beyond the limits of what the eye can see. The eye itself is a sampled system, and of course, the lens in the eye is not perfect and will act as an optical low pass filter, along with the diffraction effects of the iris.
     
  14. Stefan

    Stefan Senior Member

    Location:
    Montreal, Canada
    So Barry, are you saying these problems don't occur (at least to the same extent) if one records directly to 44.1kHz instead of recording in hirez and then downsampling? In other words, if the final target is 16/44.1 only, are you advocating recording to 44.1 directly instead of hirez > downsample?
     
  15. LeeS

    LeeS Music Fan

    Location:
    Atlanta
    Yet we also daisy chain two Sound Devices 722s, and when we record on one 722 in 16/44.1 the same thing happens, so I don't believe it's isolated to just downrez algorithm artifacts. I believe it has to do with the value of the extra information in a hirez file.
     
  16. Natt

    Natt Forum Resident

    Location:
    Acton, Canada
    Downsampling is a two part process: a low pass filter, then decimation. In that respect downsampling is a sampling process. The key stage is the low pass filter, and it's the design of that filter that I think is causing the problems we hear with digital audio. The big advantage of recording at a high sample rate for later downsampling is that you could, in theory, try different downsampling filters with the music and figure out which sounds the best. You could even try an adaptive approach where different filters are used for different sections of the music.

    But an initial sampling at a high sample rate allows you to use milder filters, still keep out the aliases, and protect the frequency response of the audio band.

    Now what I've not studied, but it certainly could be interesting to look into, is the possible interaction between the different low pass filters used in a chain that downsamples to 44.1kHz from a high sample rate source, and how that interacts with the reconstruction filter (assuming the engineer remembered to put one in) in the DAC.
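
    The two-part process described in this post (low pass filter, then decimation) can be sketched roughly as follows. This is only an illustration under stated assumptions: an integer ratio of 4 (176.4kHz to 44.1kHz), a 101-tap Hamming-windowed sinc filter, and a cutoff placed a little below the new Nyquist frequency. None of these choices come from the thread, and commercial SRC tools use far more elaborate filters.

```python
import math


def design_lowpass(num_taps, cutoff):
    """Hamming-windowed sinc FIR; cutoff is a fraction of the input sample rate."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)
    return [t / gain for t in taps]  # normalise to unity gain at DC


def fir_filter(signal, taps):
    """Direct convolution, keeping only outputs where the filter fully overlaps."""
    n_taps = len(taps)
    return [sum(taps[k] * signal[i + n_taps - 1 - k] for k in range(n_taps))
            for i in range(len(signal) - n_taps + 1)]


def downsample(signal, factor, num_taps=101):
    """Step 1: low pass below the new Nyquist. Step 2: keep every factor-th sample."""
    cutoff = 0.45 / factor  # assumed margin below 0.5 / factor
    filtered = fir_filter(signal, design_lowpass(num_taps, cutoff))
    return filtered[::factor]


def tone(freq_hz, rate_hz, count):
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz) for n in range(count)]


def rms(samples):
    return math.sqrt(sum(v * v for v in samples) / len(samples))


if __name__ == "__main__":
    rate = 176400
    in_band = tone(1000, rate, 4000)    # should survive downsampling
    too_high = tone(30000, rate, 4000)  # above the 22.05kHz target Nyquist
    print("1kHz after filter + decimate: ", rms(downsample(in_band, 4)))
    print("30kHz after filter + decimate:", rms(downsample(too_high, 4)))
    print("30kHz decimated with no filter:", rms(too_high[::4]))
```

    The last line is the case the filter exists to prevent: without it the 30kHz tone does not disappear, it folds back into the audio band at full level. Swapping `design_lowpass` for a different design is where the "try different filters" experiment would happen.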
     
  17. bdiament

    bdiament Producer, Engineer, Soundkeeper

    Location:
    New York
    Hi Stefan,

    I'm only pointing out that most of the SRC I've heard adds artifacts of its own. This must be considered when evaluating hi res vs. 16/44, if the 16/44 in such a comparison is the product of one of these.

    As to recording to 44.1 directly if that is the target, it depends on the tools used to make the recording. If I was absolutely sure the recording was not going to require any changes at all and if I was limited to using an SRC other than my choice, I would consider recording directly to 44.1.

    However, since I leave lots of headroom when I record, making final level adjustments in the mastering room and since I can use the SRC of my choice, my approach is to record at the highest resolution I can (currently 24/192) and use my chosen SRC to create the CD version. I prefer to make the original recording as good as it can possibly be and not compromise it for the sake of a less than optimal format. With the best tools, this ends up making the best CD anyway.

    Best regards,
    Barry
    www.soundkeeperrecordings.com
    www.barrydiamentaudio.com
     
  18. bdiament

    bdiament Producer, Engineer, Soundkeeper

    Location:
    New York
    Hi Lee,

    I agree. I did not mean to imply that 16/44 is not missing anything. I was only saying that once you add (most, not all) SRC, it leaves a clearly audible mark on the sound.

    Best regards,
    Barry
    www.soundkeeperrecordings.com
    www.barrydiamentaudio.com
     
  19. LeeS

    LeeS Music Fan

    Location:
    Atlanta
    I agree; that's an excellent point.
     
  20. Key

    Key New Member

    Location:
    , USA
    edit brb
     
  21. Key

    Key New Member

    Location:
    , USA
    10kHz sinewave @ 192kHz
    [two images not preserved]

    10kHz sinewave @ 96kHz
    [two images not preserved]

    10kHz sinewave @ 44.1kHz
    [two images not preserved]
     
  22. Publius

    Publius Forum Resident

    Location:
    Austin, TX
    That's not actually what happens in the analog domain, Key. Those sharp edges don't exist.
     
  23. Key

    Key New Member

    Location:
    , USA
    You mean after or before playback, Publius? I just tend to think that the Nyquist theorem is only for satisfactory results, not for perfection.
     
  24. Publius

    Publius Forum Resident

    Location:
    Austin, TX
    Neither. Linear interpolation is never, ever involved in the signal chain. Bandlimited interpolation is always involved (either by upsampling or oversampling or with analog filtering or all three or whatever).
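
    A rough numerical sketch of the difference, added for illustration: it samples a 10kHz sine at 44.1kHz, then rebuilds values between the samples two ways, with straight lines (what a waveform display that joins the dots shows) and with sinc interpolation (the idealised form of bandlimited reconstruction). The sinc sum is truncated to the 2001 samples available, which is an assumption of this sketch; a real DAC approximates the same thing with oversampling and analog filters.

```python
import math


def sample_sine(freq_hz, rate_hz, count):
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz) for n in range(count)]


def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def sinc_interpolate(samples, t):
    """Bandlimited (Whittaker-Shannon) value at fractional sample index t."""
    return sum(s * sinc(t - n) for n, s in enumerate(samples))


def linear_interpolate(samples, t):
    """Join-the-dots value at fractional sample index t."""
    i = int(math.floor(t))
    frac = t - i
    return samples[i] * (1 - frac) + samples[i + 1] * frac


def max_errors(freq_hz=10000, rate_hz=44100, count=2001):
    """Worst-case error of each method against the true sine, near the middle."""
    samples = sample_sine(freq_hz, rate_hz, count)
    worst_linear = worst_sinc = 0.0
    for k in range(160):  # 20 sample intervals, 8 points per interval
        t = count // 2 + k / 8
        true_value = math.sin(2 * math.pi * freq_hz * t / rate_hz)
        worst_linear = max(worst_linear,
                           abs(linear_interpolate(samples, t) - true_value))
        worst_sinc = max(worst_sinc,
                         abs(sinc_interpolate(samples, t) - true_value))
    return worst_linear, worst_sinc


if __name__ == "__main__":
    linear_err, sinc_err = max_errors()
    print("max error, linear interpolation:", linear_err)
    print("max error, sinc interpolation:  ", sinc_err)
```

    The straight-line version is off by a large fraction of full scale at 10kHz, while the bandlimited version tracks the original sine closely even with the truncated sum, which is why the jagged plots do not represent what comes out of the converter.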
     
  25. LeeS

    LeeS Music Fan

    Location:
    Atlanta
    Merging Technologies describes the value of hirez this way:

    [image not preserved]
     