JCPalmer Posted September 6, 2017

The QueueInterpolation animation system has long had the potential to produce speech using shape keys. The difference between potential & actual ended up being a couple of years, though. The "Talk" button on the Automaton QA scene now says a half dozen sentences.
GameMonetize Posted September 6, 2017

Really cool!
Jim U Posted September 6, 2017

Mr. Bitey rates it 10 of 10
jerome Posted September 6, 2017

so funny!
hunts Posted September 6, 2017

A little bit grumpy
JCPalmer Posted September 6, 2017

On 9/5/2017 at 11:44 PM, jerome said: so funny!

Ah, I think I am going to need to credit Robin Williams for the "I'm melting. I'm melting. / Clean up, Aisle 4." sequence. I was trying to come up with a way to exercise different expressions. Completely random stuff is not as good as some kind of a theme across sentences. Those 2 popped into my head. I cannot prove he said it, but it fits.

Thanks, @hunts. I like negative cheeks high. I have actually updated LAUGH & HAPPY (not yet published) to use this. They really look great now. I could add grumpy as a stock expression after a clean up (no tongue), or may just review my ANGRY to see if it might be improved, influenced by your settings.

@Jim U, glad this got you to register. I have spent another day on this, and got some real improvements (will adjust the topic title when pushed up). A couple of visemes were fine-tuned. The thing that really improved is being able to talk fast without it being just some violent chopping. I found a way to not always deform for every viseme. Having fewer deforms in itself makes it smoother. The trick is knowing what can be discarded. The Arpabet database I converted to JavaScript marks vowel stress (1 = primary, 2 = secondary, none = unstressed) for all of its 10k words. I now discard vowels with no stress. I am now able to say "get me the hell out of here" without it being slow, overly enunciated, & wooden.

Guess I am trapped in a "continuous improvement cycle". Just one more day.
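In rough terms, the filter amounts to something like the sketch below (a simplified illustration, not the actual code; the ARPABET table and function name are invented here). Stressed vowels carry a trailing 1 or 2, so anything else that is a vowel can be dropped before the phoneme-to-viseme mapping runs:

```js
// Simplified sketch of discarding unstressed vowels (illustration only).
const VOWELS = new Set(["AA", "AE", "AH", "AO", "AW", "AY",
                        "EH", "ER", "EY", "IH", "IY",
                        "OW", "OY", "UH", "UW"]);

// hypothetical dictionary entry: stressed vowels end in a 1 or 2
const ARPABET = { "hello": ["HH", "AH", "L", "OW1"] };

function visemePhonemes(word) {
    const phonemes = ARPABET[word.toLowerCase()] || [];
    // keep consonants & stressed vowels; drop vowels with no stress digit
    return phonemes.filter(p => /[12]$/.test(p) || !VOWELS.has(p));
}

console.log(visemePhonemes("hello")); // ["HH", "L", "OW1"]
```

Fewer phonemes surviving the filter means fewer deforms queued, which is where the smoothness at speed comes from.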
JCPalmer Posted September 7, 2017

Ok, the talking is now set. I could run before & now side by side, and it looks much more lifelike. In addition to the speech itself, I improved some of the expressions & added GRUMPY.

Also, there needs to be a talking version of each expression, see dropdown. These are computer-built from the normal one. Before, I just removed all the MOUTH deforms. Now I am retaining more. While MOUTH_OPEN is fully removed, other mouth deforms are just reduced by 50%, as in the sketch below. This gives much better expressive talking.

About half the sentences had a vowel or 2 discarded from before, and they are all less tortured for it. Some of the beginning sentences were re-recorded going much faster. There are some problems with the characters, but I am going to call the talking done.
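The rule for deriving a talking variant boils down to this (a simplified sketch of the idea, not the actual implementation; the expression data layout here is made up):

```js
// Derive a "talking" variant of an expression: MOUTH_OPEN is dropped
// entirely; other MOUTH deforms are halved; everything else is kept.
function buildTalkingVersion(expression) {
    const talking = {};
    for (const [shapeKey, influence] of Object.entries(expression)) {
        if (shapeKey === "MOUTH_OPEN") continue;          // fully removed
        talking[shapeKey] = shapeKey.startsWith("MOUTH")
            ? influence * 0.5                             // reduced by 50%
            : influence;                                  // kept as-is
    }
    return talking;
}

// e.g. { CHEEKS_HIGH: -0.6, MOUTH_OPEN: 0.8, MOUTH_CORNERS_UP: 0.7 }
// becomes { CHEEKS_HIGH: -0.6, MOUTH_CORNERS_UP: 0.35 }
```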
adam Posted September 8, 2017

"Mmmm hmmm"
meteoritool Posted September 12, 2017

OMG this is totally AWESOME! I've been looking around for this feature. It sells for $100 on the Unity store! https://www.assetstore.unity3d.com/en/#!/content/3021

I'm so gonna look into it! Hope it's not too complicated... I'd be glad if you had the time/intention to write a guide for noobs. Also very curious how you tweaked the audio frequencies to interpret facial movement! This is great!
JCPalmer Posted September 12, 2017

That Unity addon you reference is based on bones. That kind of a skeleton in WebGL is not advised. Not sure if you were going to buy it or were just referencing it, but exporting the output to BJS could have issues, especially on iOS. I achieve this without bones, using morphing.

The early workflow from MakeHuman to Blender to export does have a readme.md linked in the References dropdown of the Automaton Test Scene. FYI, there are many additional parts to put into both MakeHuman & Blender currently required, but the extra stuff for MakeHuman will be included in the normal install with the next release (soon, I think).

I actually do not directly take the audio track into account to determine either the deforms to perform or when to do them. The tool I use to write the animation sequence is not going to be public at this time. I am not sure where it is going, but once out I cannot get it back. FYI, all the expressions and visemes are public, so a casual dev could maybe do something by hand.
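For a feel of what bone-free morphing looks like, here is a bare-bones example using stock Babylon.js morph targets (illustration only — this is not the QueueInterpolation system; head, mouthOpenMesh, and scene are assumed to already exist):

```js
// A viseme is just a set of morph-target influences driven over time;
// no skeleton is involved anywhere.
const mgr = new BABYLON.MorphTargetManager(scene);
head.morphTargetManager = mgr;

// each viseme/expression shape is a separate target mesh exported from Blender
const mouthOpen = BABYLON.MorphTarget.FromMesh(mouthOpenMesh, "MOUTH_OPEN", 0);
mgr.addTarget(mouthOpen);

// pulse the influence up and down so the mouth opens and closes
scene.onBeforeRenderObservable.add(() => {
    mouthOpen.influence = 0.5 + 0.5 * Math.sin(performance.now() / 200);
});
```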
meteoritool Posted September 13, 2017

Thx for your answer! I've seen your documentation after posting here, so I see you did share your method! Thank you very much for that! So you mean you animated the lipsync by 'hand'??? That's a lot of work tweaking and fine-tuning!

My first intention was to create a kind of "next gen music video" with BABYLON.js, where a 3D character would sing the song. One solution would be to use a videoTexture of a face/mouth singing, but that's really not ideal because the file size would be huge and the result aesthetically doubtful. Then the morph target feature was released and I thought: that + an audio analyser would be the way to go for automatic lipsync, roughly as sketched below! But that's just the theory, and it would take a scientist/engineer to achieve what I want :-/ I'm just a noob using Babylon end-user methods.

Your scene offers a glimpse of endless possibilities! Love it! (In the meantime, "next gen music video" already exists now: see a video rendition of an Oculus clip.)
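The kind of thing I had in mind was roughly this (totally untested theory on my part, using stock Babylon.js APIs; song playback and the mouthOpen morph target from the previous example are assumed to be set up already):

```js
// Tap the audio engine's output and let loudness drive the mouth shape key.
const analyser = new BABYLON.Analyser(scene);
BABYLON.Engine.audioEngine.connectToAnalyser(analyser);
analyser.FFT_SIZE = 64;
analyser.SMOOTHING = 0.7;

scene.onBeforeRenderObservable.add(() => {
    const freqs = analyser.getByteFrequencyData();   // Uint8Array, 0..255
    // crude loudness of the low/mid band drives the mouth-open influence
    let sum = 0;
    for (let i = 0; i < 8; i++) sum += freqs[i];
    mouthOpen.influence = Math.min(1, (sum / 8) / 255 * 2);
});
```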
JCPalmer Posted September 18, 2017

BTW, for any demo that involves MakeHuman, I usually make a cross post for them. Rarely does this lead to discussion, but in this case someone pointed me to a C-based repo that does speech generation. I ended up coding a 28-line HTML file to test how consistent the speech rate was cross-browser for the Web Speech API. Not very, unfortunately. In case anyone cares.
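The gist of the test was along these lines (a reconstruction of the approach, not the actual 28-line file): speak a fixed sentence at a fixed rate and time it, then compare the numbers across browsers.

```js
// Time how long the Web Speech API takes to speak a fixed sentence.
const utterance = new SpeechSynthesisUtterance(
    "The quick brown fox jumps over the lazy dog.");
utterance.rate = 1.0;

let start;
utterance.onstart = () => { start = performance.now(); };
utterance.onend = () => {
    console.log(`spoke in ${((performance.now() - start) / 1000).toFixed(2)}s`);
};
window.speechSynthesis.speak(utterance);
```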