As I mentioned in a poll on the LJ side of my blog (maybe I should add poll support to pound?), I don't mind editing out parts of songs that I don't like, from small surrounding bits to entire interludes. If I don't like a drum solo and can remove it while keeping the music flowing, I will, although that's tricky. I understand the argument for artistic integrity, having had that discussion many years ago when I was dating an artist, but understanding the position doesn't mean agreeing with it.

This morning I woke up and realised, after putting on music that I've been listening to a lot in my head recently, how much I want to be able to add in the "missing voices" and other elaborations that songs pick up when I keep them in my head. Apart from simple additions (which I could theoretically do by mixing in my own playing of the "missing bits" on accordion and making a new ogg), what would it take to do more sophisticated editing of music? What would amazingly sophisticated software need to do smoothly to allow this? A few tricky bits:
- Auditory scene analysis - Software would need to be able to split a mixed audio stream cleanly into one stream per instrument type. Ideally, it could also handle several of the same instrument (e.g. human voices) and split those as well, interpolating when interference destroys information. There are all sorts of nice little clues that could be used to do this better. I wonder if, in the end, it would work better to first recognise the instruments with a classifier and then apply premade, custom classifiers for common instruments. Distinguishing two people singing in a band would use different cues than distinguishing two trombones: if the sound quality is fantastic, imperfections in the tuning and shapes of the instruments could be used for the trombones, while the cues for telling singers apart would be a bad approach when artists like Dokaka mix themselves in repeatedly. I also wonder if it would be doable to have the machine understand what's normally done with the separate parts in a composition to aid in this process. I'm sure humans do it partly that way (although the "handing off" of the lead voice in pieces like Bach's Double Violin Concerto might confuse this).
- Understanding when a voice has the lead and, when it doesn't, how close it is to the lead in a kind of "accompaniment space" - the further away a voice is, the more flexibility there is in reworking it to accommodate the addition of other voices for embellishment. I have vague, handwavy ideas on how this might be done, although I suspect that to really do a good job, one would need to treat this either as a quantity that changes fluidly over time for each instrument, or as something that varies with the structure of the music (which would thus need to be recognised).
- Understanding how accompaniment works and automatically generating suitable accompaniment. I've heard there's a Japanese piece of software that manages this task and produces a "band" to "play around" a single player. Has anyone tried this? How well does it work?
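None of the tricky bits above are attempted here, but the "simple additions" case from earlier (mixing a separately recorded part into an existing track) is easy enough to sketch. This is a minimal Python sketch under some loud assumptions: both tracks are already decoded into lists of float samples in the range -1.0 to 1.0, both are mono and at the same sample rate, and reading or writing ogg files is left out entirely. The names `mix_in`, `offset` and `gain` are hypothetical, made up for illustration.

```python
def mix_in(original, addition, offset, gain=0.5):
    """Return a new track with `addition` mixed into `original`.

    `original` and `addition` are lists of float samples in [-1.0, 1.0],
    assumed to be mono and at the same sample rate. `offset` is the sample
    index in `original` where `addition` starts; `gain` scales the added part.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")

    # Pad the original with silence if the added part runs past its end.
    length = max(len(original), offset + len(addition))
    mixed = list(original) + [0.0] * (length - len(original))

    # Mixing is just sample-wise addition of the (scaled) new part.
    for i, sample in enumerate(addition):
        mixed[offset + i] += gain * sample

    # If the sum exceeds full scale, scale the whole track down to avoid clipping.
    peak = max((abs(s) for s in mixed), default=0.0)
    if peak > 1.0:
        mixed = [s / peak for s in mixed]
    return mixed
```

Something like `mix_in(song, accordion, offset=44100 * 30)` would drop the accordion part in thirty seconds into a 44.1 kHz track. Scaling the whole track down when the peak goes over full scale is the crudest possible way to avoid clipping; it also makes everything else quieter, which is part of why the simple approach only goes so far.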
Yesterday's rain was glorious.