comment by firethief

firethief · 3567 days ago · post: MIT can now eavesdrop through soundproof glass by watching the vibrations of a bag of chips | ExtremeTech

(if you want to reconstruct human speech at 300Hz, you preferably want to capture at 300 fps or higher)

Interesting, if true. A naive application of the described approach (assuming no rolling shutter trickery) would sample one point on the edge of the visual reactor, and interpret the deviation of its position in each frame as a (scalar) amplitude. Clearly under such circumstances Nyquist's Theorem would apply, and the highest frequency that could be captured faithfully would be half the framerate.
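As a rough illustration of that limit, here is a minimal sketch assuming one scalar sample per frame and a pure tone as input. The function names are made up for this example.

```python
def nyquist_limit(fps: float) -> float:
    """Highest frequency captured faithfully at one sample per frame."""
    return fps / 2.0


def aliased_frequency(f_signal: float, fps: float) -> float:
    """Apparent frequency of a pure tone after sampling at fps samples/s."""
    folded = f_signal % fps
    # Frequencies above fps / 2 fold back into the 0..fps/2 band.
    return min(folded, fps - folded)


print(nyquist_limit(300.0))             # 150.0
print(aliased_frequency(200.0, 300.0))  # 100.0
print(aliased_frequency(300.0, 300.0))  # 0.0
```

Under these assumptions a 300 Hz tone sampled at 300 fps lands on the same phase in every frame, so it looks like a constant offset rather than a tone.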

Doubling that would require getting more data out of each frame, which seems like it would be easy under just the right circumstances but nigh impossible otherwise.

One approach would be to sample two visual reactors, yielding two samples per frame with their effective times differing by the amount of time it takes sound to get from one to the other. This would be easy to do, but you would need sample sources at the right relative distances. A path difference of about 57 cm (taking the speed of sound as roughly 343 m/s) would turn a 300 fps framerate into 600 evenly spaced samples/s. Any other difference in distance between the visual reactors and the sound source, taken modulo about 114 cm (the distance sound travels in one frame at 300 fps), would yield lower-quality results, with times between samples alternating between two different values. You'd want to normalize the two sample sets to the same volume to avoid artifacts at the frequency of their offset.
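The spacing arithmetic can be sketched as follows. The speed of sound used here (343 m/s, dry air at about 20 °C) is an assumption, and the result shifts by a few centimetres with temperature; the function names are hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at about 20 degrees C


def ideal_path_difference_m(fps: float, c: float = SPEED_OF_SOUND_M_S) -> float:
    """Path difference that delays the second object by half a frame period."""
    return c / (2.0 * fps)


def sample_intervals_s(path_difference_m: float, fps: float,
                       c: float = SPEED_OF_SOUND_M_S) -> tuple[float, float]:
    """The two alternating gaps between interleaved samples, in seconds."""
    frame_period = 1.0 / fps
    # Only the delay modulo one frame period matters for sample timing.
    delay = (path_difference_m / c) % frame_period
    return delay, frame_period - delay


ideal = ideal_path_difference_m(300.0)
print(f"ideal path difference: {ideal * 100:.1f} cm")
print("gaps at the ideal spacing:", sample_intervals_s(ideal, 300.0))
print("gaps at 40 cm:", sample_intervals_s(0.40, 300.0))
```

At the ideal spacing the two gaps are equal, which gives uniform sampling at twice the framerate. At any other spacing the gaps differ, which is the alternating-interval case described above.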

kleinbl00 · 3567 days ago

You'd run into serious frame lock problems, too: lossy codecs such as H.264 (and pretty much everything else in consumer gear) don't much give a crap about temporal frame length. This doesn't matter when you're recording video with audio, as the frame captures both. When you're syncing two systems, the footage tends to drift after about five minutes. If you're looking for interframe CMOS roll and comparing two different samples in order to get an interpolated waveform, that shit would have to be locked tight to provide anything useful, as the harmonic effects would be completely swamped by the inaccuracies of the frame start and stop times.