Sing With Me ~The Truth About Triphones~

October 11, 2012 at 6:00 am

Welcome to the third installment in the Sing With Me series!  In my first two posts, I went over the basics of UTAU and Vocaloid.  Today I will be musing over that most mysterious of technologies, triphones!

So what are triphones?

The Basics

Triphones, also known as VCV (vowel-consonant-vowel), are used to make synthesized singing sound more realistic than it does with the older diphone/CV (consonant-vowel) technology. (For clarity, I’m only going to use the terms triphone and diphone.)  Here is what they look like:

Example 1: Diphones vs Triphones

Diphones: na, kyo

Triphones: a na, i kyo

Example 2: Diphone Song Line vs Triphone Song Line*

Diphone song line: [ka] [ze] [ni] [u] [ra] [i] [de] [hi] [ra] [ri] [ma] [i] [chi] [ru]

Triphone song line: [ka] [a ze]  [e ni] [i u] [u ra] [a i] [i de] [e hi] [i ra] [a ri] [ma] [a i] [i chi] [i ru]

*This line is from Akahitoha, sung by Hatsune Miku

The extra vowel at the beginning of each sample helps to smooth out the transition from one phoneme to the next.  So less choppiness, yay!  Even better, both UTAU and Vocaloid 3 have the ability to use triphones.
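The rule behind Example 2 can be sketched in a few lines of Python. This is only my illustration of the naming pattern, not how UTAU or Vocaloid work internally. The function name to_triphones is made up, and it assumes romaji syllables that end in a, i, u, e, or o, sung without a break, so every note after the first gets a leading vowel. (A real arrangement may restart a note as a plain diphone, as [ma] does in the line above.)

```python
VOWELS = "aiueo"

def to_triphones(diphones):
    """Prefix each note after the first with the previous note's final vowel.

    Hypothetical helper for illustration; assumes romaji syllables.
    """
    result = []
    prev_vowel = None
    for syllable in diphones:
        if prev_vowel is None:
            # First note (or a note after a non-vowel ending): plain diphone.
            result.append(syllable)
        else:
            result.append(f"{prev_vowel} {syllable}")
        last = syllable[-1]
        # Simplification: anything not ending in a vowel resets the chain.
        prev_vowel = last if last in VOWELS else None
    return result

line = "ka ze ni u ra i de hi ra ri ma i chi ru".split()
print(" ".join(f"[{s}]" for s in to_triphones(line)))
```

Because the sketch never restarts a phrase, it turns the [ma] of Example 2 into [i ma]; everything else comes out the same as the triphone song line.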

In Example 1 above, I only showed single diphones next to their corresponding triphones.  However, UTAU triphones are usually recorded in strings of five to seven triphones at a time to save space.  The voicebank is then configured so that UTAU knows that each recording contains five to seven different notes. (But don’t ask me how to do it!  I’ve only just started learning about it myself!)  If you don’t think that you’re up to recording a triphone voicebank yet, but you still want to give triphones a try, there are many UTAUloids with either a lite or a full triphone voicebank (like Momo Momone and Ritsu Namine, respectively).
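I can't show the real voicebank configuration, but the idea of getting several samples out of one recording can be sketched like this. Everything here is an assumption for illustration: the function name samples_in_take is made up, the seven-syllable string is invented rather than taken from a real recording list, and the output is just a list of labels, not UTAU's actual configuration format.

```python
def samples_in_take(syllables):
    """List the samples that one recorded string of syllables could be cut into.

    Hypothetical helper: the first syllable has no vowel before it, and each
    later syllable is labelled with the final vowel of the one before it.
    """
    samples = [syllables[0]]
    for prev, cur in zip(syllables, syllables[1:]):
        samples.append(f"{prev[-1]} {cur}")
    return samples

take = ["ka", "ka", "ki", "ka", "ku", "ke", "ka"]  # made-up string
print(samples_in_take(take))
# -> ['ka', 'a ka', 'a ki', 'i ka', 'a ku', 'u ke', 'e ka']
```

So a single seven-syllable recording covers an opening note plus six triphones, which is where the space saving comes from.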

The Pro

  • Triphones sound smoother than diphones.

The Con

  • It’s harder to record a triphone voicebank than a diphone voicebank for an UTAUloid.  (No comment on Vocaloid recording :P.)

Learn More

The UTAU wiki has an explanation of triphones (though not how to record them) and a list of triphone voicebanks here:

This article is where I first heard of triphones.  It sort of goes over the process of recording a triphone voicebank, but a video tutorial would probably be clearer.  However, it has some comparisons of diphone and triphone songs, so you can really hear the difference between the two technologies.


