Here we are, at the final post of this blog. I remember how young I was four months ago, and now I'm looking at two more credits for the year. How time flies. But do not dismay, for I have quite the post for you today. For my final project, I chose to create an original composition using all the skills I've acquired over this course. However, the idea actually predates this project. When I exposed Deep Purple for blatantly plagiarizing Carlos Lyra, I became interested in copyright, and not just in things that sound similar, but in things that sound exactly the same. This brings us to the copyright debate around AI voice replication. I'm referring to the kind of replication you'd find in a Taylor Swift cover of Frank Sinatra's "My Way": a song she has never sung, in a video that exists nonetheless, a phenomenon made possible by AI voice training.
I began looking into the legality of works that use AI voices like this, and the problems I found largely extend to the greater AI sphere as well. The primary issues are that the material used to train the AI comes from other people's copyrighted works, and that, since the AI voices are created by a third party, the actual voice owner neither consents to having their voice replicated nor is compensated for it. While it is not expressly illegal to publish works using AI voices, it is not exactly legal either. It exists in a legal gray area that I think is best avoided, especially when there are much more ethical alternatives.
Vocal synthesis is nothing new. In fact, it's been around for about as long as computers have, including systems made for the express purpose of singing. One of the most notable modern synthetic voice programs is YAMAHA's 2004 software, Vocaloid, which I’ve briefly discussed on this blog before. It uses the standard piano roll layout familiar to producers, so it's easy to navigate. You program in the notes and then type in the word or syllable (depending on the language) for each note. The software then builds the vocal line by pulling from an extensive library of pre-recorded sounds. For reference, Japanese has roughly 100 distinct syllables that form words. Each syllable must be recorded once for every note and octave within the voicebank’s intended range. It’s a lot. But the result is a very convincing string of words and sentences, which can be further enhanced by editing other aspects of the voice, such as vibrato.
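To get a feel for the scale, here is a back-of-the-envelope sketch in Python that takes the description above at face value: one recording per syllable per note. The range I plugged in (MIDI note numbers 48 through 72, about two octaves) is purely my own assumption for illustration, not a figure from YAMAHA, and a real voicebank may well need fewer recordings than this.

```python
# Rough estimate of how many recordings a voicebank would need if every
# syllable were recorded once per note, as described above.
# ASSUMPTION: the note range below is hypothetical, chosen only for illustration.

SYLLABLE_COUNT = 100   # syllable count quoted in the post
LOWEST_NOTE = 48       # assumed bottom of the range (MIDI note number)
HIGHEST_NOTE = 72      # assumed top of the range (MIDI note number)


def recordings_needed(syllables: int, lowest: int, highest: int) -> int:
    """Return syllables multiplied by the number of semitones in the range."""
    if highest < lowest:
        raise ValueError("highest note must not be below lowest note")
    notes_in_range = highest - lowest + 1  # inclusive of both endpoints
    return syllables * notes_in_range


if __name__ == "__main__":
    total = recordings_needed(SYLLABLE_COUNT, LOWEST_NOTE, HIGHEST_NOTE)
    print(f"{total} recordings")
```

Even with this modest assumed range, the count lands in the thousands, which is why "it's a lot" feels like an understatement.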
To get to the point, programs like Vocaloid are ethical where AI voices are not because they are made in collaboration with the voice provider. All recording is done specifically for the Vocaloid software, and the provider licenses the recordings to YAMAHA. Whenever a voicebank is purchased, both YAMAHA and the voice provider take a cut of that sale. As a result, people who purchase the software also gain full authorization to use the program for anything, including commercial purposes, without having to split royalties with the creators.
With that out of the way, I was naturally very intrigued by this program and how it worked with composition. So when proposals for the final project came around, I jumped at the chance to incorporate what I had found into my work. I decided to create a Vocaloid song, specifically based on the style common in the traditional Japanese side of the program (since English Vocaloid music is much different from what I normally make). I first did some research into what makes a typical Vocaloid song, which was hard to pin down since Vocaloid isn't a genre. But after listening to more songs than I care to admit, I determined that the most common characteristics of Vocaloid music are emotional and symbolic lyrics contrasted with a somewhat upbeat instrumental, with the lyrics, of course, being in Japanese. The tracks that stood out to me were on the more Dreamcore side of things, which is the style I wanted to emulate in my song. Below is a notable example of the sound I was going for:
As you can see, I was very intent on matching songs in this style. The transition into the chorus and the energy in that section are almost a mirror image of both the examples I've shown. The chorus itself looks more complex than it is: most of the additional tracks are layers or variations of the initial chords or the arp, the latter of which is introduced in the second half of the chorus section to maintain interest before moving into the verse. The drums use the same sample as the first example I showed, the "Amen Break," which is a very popular sample in Breakcore and Drum and Bass music. I only made minor alterations to the rhythm before layering the sample with my own drums for more punch.
The second chorus, which is really the first chorus because it's the first section with the actual chorus vocals, is almost the same as the first. The only minor differences are that the initial arp variation is present throughout the whole section, and that the arp introduced in the second half is actually a new melody.
The bridge vocals serve as a transition toward the chorus, and they lead into a sort of third part of the bridge section, which has similar instrumentation to the chorus. It's a bit hard to explain, but you will hear what I'm talking about when you listen to the piece. This part then transitions into the chorus with a quick drum fill. Because the instrumentation of these sections is so similar, switching between them would otherwise have less impact, so for the final chorus section I layered yet another set of chords on top to fill in even more space.
That does it for the basic composition side of the project, but there is still the whole question of vocals. I have a somewhat odd process for writing lyrics that I don't think most people share. I write the lyrics first, and then write the melody so that it lines up exactly with the syllables in each line. In practice, that meant playing the melody on an instrument over the backing track while audiating the lyrics in my head. As a result, I treat the vocals as an instrument, which bleeds into how I mix them as well. But I think it works for the dreamscape feel of the tracks I'm replicating. On top of that, since the vocals are literally synthesized, it makes more sense to treat them that way.
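The "one note per syllable" idea can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not how Vocaloid itself works internally. It assumes the lyric line is already written in kana (no kanji), it treats each kana as one singable unit (a simplification), and the pitches are made-up MIDI note numbers rather than the melody from my song. The names `split_units` and `assign_notes` are hypothetical.

```python
# Sketch of the "one note per syllable" idea: split a kana lyric line into
# singable units and pair each unit with one melody note.
# ASSUMPTIONS: input is already written in kana (no kanji), and the pitches
# are made-up MIDI note numbers, not the melody from the actual song.

# Small kana that merge with the preceding character (e.g. きょ is one unit).
SMALL_KANA = set("ゃゅょぁぃぅぇぉャュョァィゥェォ")


def split_units(kana_line: str) -> list[str]:
    """Split a kana string into units, attaching small kana to the previous one."""
    units: list[str] = []
    for char in kana_line:
        if char.isspace():
            continue  # spaces in the lyric sheet are not sung
        if char in SMALL_KANA and units:
            units[-1] += char
        else:
            units.append(char)
    return units


def assign_notes(kana_line: str, pitches: list[int]) -> list[tuple[str, int]]:
    """Pair each unit with a pitch; refuse if the counts do not match."""
    units = split_units(kana_line)
    if len(units) != len(pitches):
        raise ValueError(f"{len(units)} units but {len(pitches)} notes")
    return list(zip(units, pitches))


if __name__ == "__main__":
    line = "そのままでいいんです"  # a line from the song's lyrics, all kana
    melody = [64, 64, 67, 67, 69, 67, 64, 62, 60, 60]  # hypothetical pitches
    for unit, pitch in assign_notes(line, melody):
        print(unit, pitch)
```

The length check is the useful part: if the melody has more or fewer notes than the line has units, something has to change before the line can be typed into the piano roll.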
I used two different voice banks in this track: a more mellow one for the verse sections, and a louder, more outspoken one for the choruses. Each portion of the vocals has its own set of effects, mostly just reverb and chorus, but depending on where they appear, they may have distortion or compression as well. The chorus vocals, I believe, have the most effects of any one sound in the project, with all of the effects mentioned above plus delay and bit crush. Later, in the third portion of the bridge section, I added heavy distortion to the vocals to make them sound like they were yelling. I also tried altering the delivery settings in the Vocaloid software itself to produce this effect, and the results were fine. I probably would have gotten a better result with a different voice bank, but I think it fits well enough.
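For anyone curious what distortion and bit crush do to a signal, here is a minimal Python sketch of generic textbook versions of both. These are stand-ins I wrote for illustration, not the plugins or settings used in the project, and the gain and bit depth values are arbitrary assumptions.

```python
# Generic textbook versions of two effects named above, applied to samples
# in the range -1.0 to 1.0. ASSUMPTION: these are illustrative stand-ins,
# not the plugins or settings used in the actual project.
import math


def hard_clip(samples: list[float], gain: float, ceiling: float = 1.0) -> list[float]:
    """Distortion: boost the signal, then flatten anything past the ceiling."""
    return [max(-ceiling, min(ceiling, s * gain)) for s in samples]


def bit_crush(samples: list[float], bits: int) -> list[float]:
    """Bit crush: snap each sample to one of a small number of levels."""
    if bits < 1:
        raise ValueError("bits must be at least 1")
    steps = 2 ** (bits - 1)  # number of levels on each side of zero
    return [round(s * steps) / steps for s in samples]


if __name__ == "__main__":
    # One cycle of a quiet sine wave standing in for a vocal sample.
    wave = [0.5 * math.sin(2 * math.pi * n / 16) for n in range(16)]
    processed = bit_crush(hard_clip(wave, gain=4.0), bits=3)
    print([round(s, 2) for s in processed])
```

Hard clipping squares off the peaks of the wave, which is where the harsh, shouted quality comes from, while bit crushing reduces the signal to a handful of coarse steps for a gritty, lo-fi texture.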
The lyrics themselves are in Japanese, by the way, and now is probably a good time to mention that I'm a Japanese language minor. So to clarify, I'm not just pulling the lyrics out of thin air; I do have some grasp of the language. I am actually pretty proud of these lyrics, though they are a little cringey. Here are the lyrics, followed by a rough translation:
Japanese:
あなたはあらゆる努力をしました
寝ても覚めても頑張った
ですが うまくいきません
誰かに認められたかったのですか
あなたが本当にしたかったことは何ですか
自分の価値は他人が決めるものだと思ってた
自分が自分を大切にするより
周りの目が大事だと思ってた
でも そうじゃなかった
光を見たくないですか
でも 本当は見たくないんでしょ
だってもうとっくに気づいてる
私はあなただってことに
そのままでいいんです
そのままでいいんです
そのままでいいんです
そのままでいいんです
(2x)
光を見たくないですか
でも 本当は見たくないんでしょ
だってもうとっくに気づいてる
私はあなただってことに
English:
You made every effort.
You worked hard, whether you were asleep or awake.
But it didn’t go well.
Did you want to be recognized by someone?
What is it that you really wanted to do?
You thought that your worth was something determined by others.
You thought it was more important to care about what others think than to care for yourself.
You thought the opinions of others were important.
But that wasn’t the case.
Don’t you want to see the light?
But really, you don’t want to see it, do you?
Because you already realized it a long time ago.
That I am you.
It’s fine the way it is.
It’s fine the way it is.
It’s fine the way it is.
It’s fine the way it is.
(2x)
Don’t you want to see the light?
But really, you don’t want to see it, do you?
Because you already realized it a long time ago.
That I am you.
Without further ado, here is my composition: