[UniMRCP] TTS and ASR channels within one MRCPv2 session


Kamil Shakirov

May 21, 2010, 7:40:39 AM
to uni...@googlegroups.com
Hello Arsen,

I am back to my VAD application development. :-)

Previously I used two separate sessions for each active call, each with a
single channel: one for TTS and one for ASR. That worked fine, except that
setup and teardown were slower than with a single session per call
carrying both the TTS and ASR channels.

I decided to switch to one active session per call with both TTS and ASR
channels, and started from the current UniMRCP HEAD because I wanted the
"transparently set header fields as generic name-value pairs"
feature. :-)

Creating and adding the TTS channel (first) and the ASR channel (second)
works. But when I send a request to the ASR channel, the channel's name
appears to be invalid and I get a 405 error. The TTS channel works fine.

I checked the UniMRCP logs and found that the ASR channel's name changes
twice during setup (when the channel is created and when it is added),
but the old name is still used when a request is sent to the ASR channel.

I attached my session logs that reproduce this case: the log from my
application with a custom logging handler (all UniMRCP-related entries are
prefixed with the 'mrcp_unimrcp:' mark), plus SIP and MRCP logs from
Wireshark.

Arsen, thanks a lot for UniMRCP (the best one available) and all your
effort.

--
--wbr.

--
You received this message because you are subscribed to the Google Groups "UniMRCP" group.
To post to this group, send email to uni...@googlegroups.com.
To unsubscribe from this group, send email to unimrcp+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/unimrcp?hl=en.

session-app.log
session-sip.log
session-mrcp.log

Arsen Chaloyan

May 21, 2010, 12:44:34 PM
to uni...@googlegroups.com
Hello Kamil,

On Fri, May 21, 2010 at 4:40 PM, Kamil Shakirov <kami...@gmail.com> wrote:
Hello Arsen,

I am back to my VAD application development. :-)
 

Just a few days ago, discussing the "bypass" application in a parallel thread, I noticed there had been nothing from you for a while. Glad you are back to the UniMRCP development now.


Before I used two separate sessions for each active call with one single
channel: TTS and ASR. That worked fine except it was slower to
setup/tear down than having one single session per call for both TTS and
ASR channels.
 

Clear.


I decided to change this having TTS and ASR channels in one active
session per call and started with UniMRCP from the current HEAD as I
wanted to "transparently set header fields as generic name-value pairs"
feature. :-)
 

Both sound reasonable.


It works when creating and adding (first) TTS and (second) ASR channels.
But when sending a request to ASR channel it appears that the channel's
name is not valid and I get 405 error. TTS channel works fine.

I checked the UniMRCP logs and found that ASR channel's name gets
changed (2 times) during setup (when created and added) but the old name
is still used when a request is sent to ASR channel.
 

There is a question here which still remains wide open.

See the channel identifiers Nuance responds with
a=channel:1@speechsynth
a=channel:2@speechrecog

I'm still not sure whether this is even according to the spec. At least, the behavior suggested by the spec and supported by UniMRCP is:

a=channel:1@speechsynth
a=channel:1@speechrecog

See the typical flow in http://tools.ietf.org/html/draft-ietf-speechsc-mrcpv2-20#section-4.2.

Now, let's analyze a bit. What is the channel identifier?
http://tools.ietf.org/html/draft-ietf-speechsc-mrcpv2-20#section-6.2.1
The first part is an unambiguous string identifying the MRCPv2 session

Now the question is what the MRCPv2 session is. The only more or less clear definition one can find is

A separate MRCPv2 session is needed to control each of the media processing resources
associated with the SIP dialog between the client and server

Another question is whether there can be multiple MRCPv2 sessions in a SIP dialog or not.
According to Nuance, yes it's possible. Why not? Actually, it was not possible in MRCP v1, but v2 conceptually allows it. Anyway, UniMRCP doesn't support this method at the moment. I don't exclude that it'll be supported in the future, though.
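
To make the identifier structure concrete, here is a minimal C sketch, not taken from UniMRCP, that splits a channel identifier such as 1@speechsynth at the '@' sign into its session part and its resource part. The type channel_id_t and the function channel_id_parse are hypothetical names used only for illustration, and the 64-byte buffers are an arbitrary assumption.

```c
#include <string.h>

/* Hypothetical container for the two halves of a channel identifier. */
typedef struct {
    char session_id[64]; /* text before '@', e.g. "1" */
    char resource[64];   /* text after '@', e.g. "speechsynth" */
} channel_id_t;

/* Splits "session@resource" into its two parts.
 * Returns 1 on success, 0 if the identifier is malformed or too long. */
static int channel_id_parse(const char *identifier, channel_id_t *out)
{
    const char *at = strchr(identifier, '@');
    size_t sid_len, res_len;

    if (!at)
        return 0;
    sid_len = (size_t)(at - identifier);
    res_len = strlen(at + 1);
    if (sid_len == 0 || res_len == 0)
        return 0;
    if (sid_len >= sizeof(out->session_id) || res_len >= sizeof(out->resource))
        return 0;

    memcpy(out->session_id, identifier, sid_len);
    out->session_id[sid_len] = '\0';
    memcpy(out->resource, at + 1, res_len + 1); /* includes the terminator */
    return 1;
}
```

With the identifiers quoted above, parsing 1@speechsynth and 2@speechrecog yields different session parts ("1" and "2"), which is exactly the mismatch under discussion.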


I attached my session logs (from my application with custom logging
handler (all UniMRCP related logs are prefixed with 'mrcp_unimrcp:'
mark), SIP and MRCP logs from Wireshark) that reproduce this case.
 

Well, everything seems clear there.


Arsen, thanks a lot for UniMRCP (the best one available) and all your
effort.
 

You're welcome.

--
--wbr.

--
Arsen Chaloyan
The author of UniMRCP
http://www.unimrcp.org

Kamil Shakirov

May 21, 2010, 1:14:55 PM
to uni...@googlegroups.com
Arsen,

Thanks a lot for your detailed explanation. I will take a look closely
and try to provide more feedback on Monday.

Good night.

Arsen Chaloyan

May 22, 2010, 2:04:08 PM
to uni...@googlegroups.com
To those who might be interested in the concepts, I'd suggest following the discussion below.

http://www.ietf.org/mail-archive/web/speechsc/current/msg01855.html

I wish I had found this thread earlier, as it gives almost all the answers I have personally been looking for.
Please read it and let's discuss it later.

Kamil Shakirov

May 24, 2010, 8:57:00 AM
to uni...@googlegroups.com
Hi Arsen,

At Sat, 22 May 2010 23:04:08 +0500,
Arsen Chaloyan wrote:

> I'd suggest to follow up the discussion below to those who might be interested
> in the concepts.
>
> http://www.ietf.org/mail-archive/web/speechsc/current/msg01855.html
>
> I wish I found this thread earlier as it gives almost all the answers
> personally I have looked for a while.
> Please read and let's discuss it later.
>

Thanks a lot.

I would like to comment on some definitions of that discussion:

"1. SIP is used to establish and maintain a SIP dialog with a SIP UA acting
on behalf of an MRCP server."

Clear.

"2. Within that dialog, SDP offer-answer has been used to negotiate one *or
more* MRCP Sessions. Each of these is described with a separate m= line
in the SDP. In SDP terminology, each constitutes a media session."

In my use case, when opening two resources (control channels), ASR and TTS,
within one MRCPv2 session, we get one m-line for each resource (control
channel). Following the MRCPv2 draft, many media resources (control channels)
can be added to one MRCPv2 session, each with its own m-line (created by the
MRCPv2 server). However, in the definition above an m-line is created for
each MRCPv2 session within one SIP dialog.
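
As a rough illustration of the "one m-line per control channel" observation, the following C sketch counts the media lines in an SDP body whose transport mentions MRCPv2. It is only a sketch under assumptions: the helper names line_contains and count_mrcpv2_mlines are hypothetical, and the sample SDP in the note below is hand-written for the example, not taken from the attached logs.

```c
#include <string.h>

/* Returns 1 if token occurs within the first len bytes of line. */
static int line_contains(const char *line, size_t len, const char *token)
{
    size_t tlen = strlen(token), i;

    for (i = 0; i + tlen <= len; i++) {
        if (strncmp(line + i, token, tlen) == 0)
            return 1;
    }
    return 0;
}

/* Counts SDP "m=" lines that describe an MRCPv2 control channel,
 * i.e. media lines whose transport contains "/MRCPv2". */
static int count_mrcpv2_mlines(const char *sdp)
{
    int count = 0;

    while (*sdp) {
        size_t len = strcspn(sdp, "\r\n"); /* length of the current line */
        if (len > 2 && strncmp(sdp, "m=", 2) == 0 &&
            line_contains(sdp, len, "/MRCPv2"))
            count++;
        sdp += len;
        sdp += strspn(sdp, "\r\n");        /* skip the line terminator(s) */
    }
    return count;
}
```

For an offer containing two "m=application 9 TCP/MRCPv2 1" lines (one followed by a=resource:speechsynth, the other by a=resource:speechrecog) and one "m=audio" line, the function counts two control m-lines. The count alone does not say whether they belong to one MRCPv2 session or two, which is the ambiguity raised here.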

"3. Within an MRCP session, there may be one or more *channels*
representing the the way the client refers to a particular resource at
the server."

Ok (according to MRCPv2 draft).

>>> I checked the UniMRCP logs and found that ASR channel's name gets
>>> changed (2 times) during setup (when created and added) but the old
>>> name is still used when a request is sent to ASR channel.
>>
>>
>> There is a question here which still remains wide open.
>>
>> See the channel identifiers Nuance responds with
>> a=channel:1@speechsynth
>> a=channel:2@speechrecog
>>
>> I'm still not sure if this even according to spec or not. At least,
>> the behavior suggested by spec and supported by UniMRCP is.
>>
>> a=channel:1@speechsynth
>> a=channel:1@speechrecog
>>
>>
>> See the typical flow in
>> http://tools.ietf.org/html/draft-ietf-speechsc-mrcpv2-20#section-4.2.
>>
>> Now, lets analyze a bit. What is the channel identifier?
>> http://tools.ietf.org/html/draft-ietf-speechsc-mrcpv2-20#section-6.2.1
>> The first part is an unambiguous string identifying the MRCPv2 session

Agreed. I have read sections 4.2 and 6.2.1. The first part (before '@') of
the identifiers of different channels within one MRCPv2 session should be
the same.

In my use case it seems that either the Nuance MRCP server doesn't follow
the MRCPv2 draft, or it somehow thinks there are two MRCPv2 sessions within
one SIP dialog. I may be wrong. I could contact Nuance support about this
issue (if it is one).

>> Now the question is what the MRCPv2 session is. The only more or less
>> clear definition one can found is
>>
>> A separate MRCPv2 session is needed to control each of the media processing
>> resources associated with the SIP dialog between the client and server

The last definition confuses me. Now we can't have more than one resource
within the same MRCPv2 session. :-)

>> Another question is whether there can be multiple MRCPv2 sessions in a SIP
>> dialog or not.

By the way, how can one distinguish multiple MRCPv2 sessions from a single
session within one SIP dialog in this case? By using m-lines (as mentioned
in the discussion)?

>> According to Nuance, yes it's possible. Why not? Actually, it was not
>> possible in MRCP v1, but v2 conceptually allows it. Anyway, UniMRCP doesn't
>> support this method at the moment. I don't exclude that it'll be supported
>> in the future, though.

I guess in most use-cases one MRCPv2 session per SIP dialog would be
sufficient but it would be nice if it was supported in the future.

Arsen Chaloyan

May 24, 2010, 12:09:50 PM
to uni...@googlegroups.com
Hi Kamil,

Thanks for the comments and see below.


Actually, the statement below should be enough to conclude that the current implementation of UniMRCP complies with the specs.

http://tools.ietf.org/html/draft-ietf-speechsc-mrcpv2-20#section-6.2.1
  The unambiguous string (first part) MUST BE unique among the resource
instances managed by the server and is common to all resource
channels with that server established through a single SIP dialog.
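
The requirement quoted above can be expressed as a small check. This is a minimal C sketch, assuming the identifiers have already been read from the a=channel attributes of the SDP answer; the function names session_part_len and channels_share_session are hypothetical and do not exist in UniMRCP.

```c
#include <string.h>

/* Length of the session part (text before '@'), or 0 if malformed. */
static size_t session_part_len(const char *identifier)
{
    const char *at = strchr(identifier, '@');
    return at ? (size_t)(at - identifier) : 0;
}

/* Returns 1 if every identifier carries the same non-empty session part,
 * as section 6.2.1 requires for channels set up through one SIP dialog. */
static int channels_share_session(const char *const *identifiers, size_t count)
{
    size_t i, first_len;

    if (count == 0)
        return 1;
    first_len = session_part_len(identifiers[0]);
    if (first_len == 0)
        return 0;
    for (i = 1; i < count; i++) {
        size_t len = session_part_len(identifiers[i]);
        if (len != first_len ||
            strncmp(identifiers[0], identifiers[i], len) != 0)
            return 0;
    }
    return 1;
}
```

Applied to the pair 1@speechsynth and 1@speechrecog the check passes; applied to 1@speechsynth and 2@speechrecog, as returned by the Nuance server in this thread, it fails.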

 

In my use-case it seems either Nuance MRCP server doesn't follow MRCPv2 draft
or it somehow thinks there are two MRCPv2 sessions within one SIP dialog. I
may be wrong. I could contact Nuance support on this issue (if there is one).

I'm pretty sure they are aware of this behavior. Moreover, I wouldn't be surprised if it was intentional. If I'm not mistaken, someone from Nuance submitted such a suggestion to the issue tracker of the speechsc group. Unfortunately, that issue tracker doesn't appear to be publicly available now.

Nonetheless, the suggestion they made seems quite reasonable to me. However, it definitely doesn't comply with either the current or any former draft. Still, this doesn't surprise me; remember the RTCP BYE message they require.

>> Now the question is what the MRCPv2 session is. The only more or less
>> clear definition one can found is
>>
>> A separate MRCPv2 session is needed to control each of the media processing
>> resources associated with the SIP dialog between the client and server

The last definition confuses me. Now we can't have more than one resource
within the same MRCPv2 session. :-)

>> Another question is whether there can be multiple MRCPv2 sessions in a SIP
>> dialog or not.

By the way, how to distinguish multiple MRCPv2 sessions from only one session
within one SIP dialog in this case? Using m-lines (as mentioned in the
discussion)?

This is a question for the RFC editors. I wonder why they haven't explicitly defined what the MRCPv2 session is.

>> According to Nuance, yes it's possible. Why not? Actually, it was not
>> possible in MRCP v1, but v2 conceptually allows it. Anyway, UniMRCP doesn't
>> support this method at the moment. I don't exclude that it'll be supported
>> in the future, though.

I guess in most use-cases one MRCPv2 session per SIP dialog would be
sufficient but it would be nice if it was supported in the future.

It's not in my immediate plans, but I don't exclude such a possibility in the future. We'll see...

--wbr.

Best regards


--
Arsen Chaloyan
The author of UniMRCP
http://www.unimrcp.org

Kamil Shakirov

May 24, 2010, 1:43:02 PM
to uni...@googlegroups.com
Hi Arsen,

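At Mon, 24 May 2010 21:09:50 +0500,
Arsen Chaloyan wrote:

> Actually, the statement below should be enough to conclude that the current
> implementation of UniMRCP complies with the specs.

Then I have a convincing argument for reporting this issue to Nuance
support. ;-)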

> http://tools.ietf.org/html/draft-ietf-speechsc-mrcpv2-20#section-6.2.1
>
> The unambiguous string (first part) MUST BE unique among the resource
>
> instances managed by the server and is common to all resource
> channels with that server established through a single SIP dialog.
>
>  
>
> In my use-case it seems either Nuance MRCP server doesn't follow MRCPv2
> draft
> or it somehow thinks there are two MRCPv2 sessions within one SIP dialog. I
> may be wrong. I could contact Nuance support on this issue (if there is
> one).
>
> I'm pretty sure they are aware of this behavior. Moreover, I wouldn't be
> surprised if this was an intentional behavior for them. If I'm not mistaken,
> someone from Nuance submitted such a suggestion to the issue tracker of the
> speechcs group. Unfortunately, that issue tracker doesn't appear to be publicly
> available now.

Anyway, I am going to ask them, as we have paid support.

> Nonetheless, the suggestion they made seems quite reasonable to me. However, it
> definitely doesn't comply with either the current  or any former drafts. And
> still this doesn't surprise me, remember that RTCP BYE message they require.

Well, I started out using OpenMRCP and the first implementation of UniMRCP
without the RTCP BYE message, and it worked fine. I did experience some small
delays during recognition, but didn't take a closer look at that issue. I
will try it as soon as I get my sessions working.

> >> Now the question is what the MRCPv2 session is. The only more or less
> >> clear definition one can found is
> >>
> >> A separate MRCPv2 session is needed to control each of the media
> processing
> >> resources associated with the SIP dialog between the client and server
>
> The last definition confuses me. Now we can't have more than one resource
> within the same MRCPv2 session. :-)
>
> >> Another question is whether there can be multiple MRCPv2 sessions in a
> SIP
> >> dialog or not.
>
> By the way, how to distinguish multiple MRCPv2 sessions from only one
> session
> within one SIP dialog in this case? Using m-lines (as mentioned in the
> discussion)?
>
> This question is to the RFC editors. I wonder why they haven't explicitly
> defined what the MRCPv2 session is.

I wonder about that too. That discussion took place almost ten revisions of
the MRCPv2 draft ago. I guess that, as MRCPv2 is still in the draft state, it
doesn't get much attention or many fixes from other implementers. I may be
wrong.

> >> According to Nuance, yes it's possible. Why not? Actually, it was not
> >> possible in MRCP v1, but v2 conceptually allows it. Anyway, UniMRCP
> doesn't
> >> support this method at the moment. I don't exclude that it'll be
> supported
> >> in the future, though.
>
> I guess in most use-cases one MRCPv2 session per SIP dialog would be
> sufficient but it would be nice if it was supported in the future.
>
> It's not in my immediate plans, but I don't exclude such a possibility in the
> future. We'll see...

UniMRCP works as expected in most use cases, which makes it the best in the
open source world. I am not aware of any other OSS implementations, though. ;-)

--wbr.

Arsen Chaloyan

May 24, 2010, 2:35:24 PM
to uni...@googlegroups.com
Hi Kamil,


Well, give it a try :)

> http://tools.ietf.org/html/draft-ietf-speechsc-mrcpv2-20#section-6.2.1
>
>   The unambiguous string (first part) MUST BE unique among the resource
>
>    instances managed by the server and is common to all resource
>    channels with that server established through a single SIP dialog.
>
>  
>
>     In my use-case it seems either Nuance MRCP server doesn't follow MRCPv2
>     draft
>     or it somehow thinks there are two MRCPv2 sessions within one SIP dialog. I
>     may be wrong. I could contact Nuance support on this issue (if there is
>     one).
>
> I'm pretty sure they are aware of this behavior. Moreover, I wouldn't be
> surprised if this was an intentional behavior for them. If I'm not mistaken,
> someone from Nuance submitted such a suggestion to the issue tracker of the
> speechcs group. Unfortunately, that issue tracker doesn't appear to be publicly
> available now.

Anyway I am gonna ask them as we have a paid support.

Yes, paid support gives you the ability to ask them. I suspect I know what the answer will be, though.
 

> Nonetheless, the suggestion they made seems quite reasonable to me. However, it
> definitely doesn't comply with either the current  or any former drafts. And
> still this doesn't surprise me, remember that RTCP BYE message they require.

Well, I started out using OpenMRCP and the first implementation of UniMRCP
without the RTCP BYE message, and it worked fine. I did experience some small
delays during recognition, but didn't take a closer look at that issue. I
will try it as soon as I get my sessions working.

The ability to indicate the end of an utterance from the client to the server is quite helpful. However, it shouldn't be mandatory. Also, if I were in their place, I'd do it through named events (RFC4733) instead of the RTCP BYE message, which primarily indicates the end of an RTP session...

>     >> Now the question is what the MRCPv2 session is. The only more or less
>     >> clear definition one can found is
>     >>
>     >> A separate MRCPv2 session is needed to control each of the media
>     processing
>     >> resources associated with the SIP dialog between the client and server
>
>     The last definition confuses me. Now we can't have more than one resource
>     within the same MRCPv2 session. :-)
>
>     >> Another question is whether there can be multiple MRCPv2 sessions in a
>     SIP
>     >> dialog or not.
>
>     By the way, how to distinguish multiple MRCPv2 sessions from only one
>     session
>     within one SIP dialog in this case? Using m-lines (as mentioned in the
>     discussion)?
>
> This question is to the RFC editors. I wonder why they haven't explicitly
> defined what the MRCPv2 session is.

I wonder about that too. That discussion took place almost ten revisions of
the MRCPv2 draft ago. I guess that, as MRCPv2 is still in the draft state, it
doesn't get much attention or many fixes from other implementers. I may be
wrong.

Yes, it looks a bit strange, as we all have been waiting a while for this draft to become a working standard. Anyway, even now MRCPv2 has been widely adopted. BTW, if you go through at least the last 10 revisions of this draft and compare the differences (the first page would be enough), you may get some answers, I believe.

>     >> According to Nuance, yes it's possible. Why not? Actually, it was not
>     >> possible in MRCP v1, but v2 conceptually allows it. Anyway, UniMRCP
>     doesn't
>     >> support this method at the moment. I don't exclude that it'll be
>     supported
>     >> in the future, though.
>
>     I guess in most use-cases one MRCPv2 session per SIP dialog would be
>     sufficient but it would be nice if it was supported in the future.
>
> It's not in my immediate plans, but I don't exclude such a possibility in the
> future. We'll see...

UniMRCP works as expected in most use cases and it makes it the best in Open
Source world. I am not aware of any other OSS implementations though. ;-)

Yes, it should keep working as expected, as long as there is someone who cares about it.

--wbr.


--
Arsen Chaloyan
The author of UniMRCP
http://www.unimrcp.org

Kamil Shakirov

May 26, 2010, 7:51:41 AM
to uni...@googlegroups.com

Hi Arsen,

I am still waiting for a response from Nuance support on the reported problem.

In the meantime I made a quick fix that works around the problem for now. With
it I can successfully use ASR and TTS resources in one MRCPv2 session.

The fix:

diff --git a/libs/mrcp-client/src/mrcp_client_session.c b/libs/mrcp-client/src/mrcp_client_session.c
index 6b6a4e7..89e15ff 100644
--- a/libs/mrcp-client/src/mrcp_client_session.c
+++ b/libs/mrcp-client/src/mrcp_client_session.c
@@ -624,6 +624,8 @@ static mrcp_channel_t* mrcp_client_channel_find_by_name(mrcp_client_session_t *s

 static apt_bool_t mrcp_client_message_send(mrcp_client_session_t *session, mrcp_channel_t *channel, mrcp_message_t *message)
 {
+       apt_str_t session_id, resource_name;
+
        if(!session->base.id.length) {
                mrcp_message_t *response = mrcp_response_create(message,message->pool);
                response->start_line.status_code = MRCP_STATUS_CODE_METHOD_FAILED;
@@ -632,7 +634,8 @@ static mrcp_client_message_send(mrcp_client_session_t *session, mrcp_
                return TRUE;
        }

-       message->channel_id.session_id = session->base.id;
+       apt_id_resource_parse(&channel->control_channel->identifier,'@',&session_id,&resource_name,message->pool);
+       message->channel_id.session_id = session_id; /* session->base.id; */
        message->start_line.request_id = ++session->base.last_request_id;
        apt_log(APT_LOG_MARK,APT_PRIO_INFO,"Send MRCP Request "APT_NAMESIDRES_FMT" [%"MRCP_REQUEST_ID_FMT"]",
                                        MRCP_SESSION_NAMESID(session),
--
--wbr.

Arsen Chaloyan

May 26, 2010, 9:08:41 AM
to uni...@googlegroups.com
Hi Kamil,

On Wed, May 26, 2010 at 4:51 PM, Kamil Shakirov <kami...@gmail.com> wrote:

Hi Arsen,

I am still waiting for response from Nuance support on the reported problem.

Feel the difference.


In the meantime I made a quick fix that fixes the problem for now. Now I can
successfully utilize ASR and TTS resources in one MRCPv2 session.

Thanks for the patch. I haven't tried it yet, but I believe it does the trick.
Nevertheless, it's clearly just a workaround. The problem is more conceptual here.

Of course, I could commit this patch, probably #ifdef-ed and disabled by default, if you and others find this reasonable.
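
A compile-time switch of that kind might look roughly like the sketch below. The macro DEMO_PER_CHANNEL_SESSION_ID and the function demo_session_id are hypothetical names chosen for this example; no such option is known to exist in UniMRCP, and the code is standalone rather than a patch.

```c
#include <string.h>

/* Hypothetical build flag: leave it undefined for the default behaviour,
 * define it to take the session id from the channel's own identifier. */
/* #define DEMO_PER_CHANNEL_SESSION_ID */

/* Copies the session id to use for an outgoing request into out.
 * Returns 1 on success, 0 if out is too small. */
static int demo_session_id(const char *dialog_id,
                           const char *channel_identifier,
                           char *out, size_t size)
{
    const char *src = dialog_id;
    size_t len = strlen(dialog_id);
#ifdef DEMO_PER_CHANNEL_SESSION_ID
    const char *at = strchr(channel_identifier, '@');
    if (at && at != channel_identifier) {
        src = channel_identifier;            /* workaround path */
        len = (size_t)(at - channel_identifier);
    }
#else
    (void)channel_identifier;                /* unused in the default build */
#endif
    if (len >= size)
        return 0;
    memcpy(out, src, len);
    out[len] = '\0';
    return 1;
}
```

Built without the flag, the function always returns the dialog-wide id, so spec-compliant servers see no change; building with -DDEMO_PER_CHANNEL_SESSION_ID enables the workaround.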

The fix:

diff --git a/libs/mrcp-client/src/mrcp_client_session.c b/libs/mrcp-client/src/mrcp_client_session.c
index 6b6a4e7..89e15ff 100644
--- a/libs/mrcp-client/src/mrcp_client_session.c
+++ b/libs/mrcp-client/src/mrcp_client_session.c
@@ -624,6 +624,8 @@ static mrcp_channel_t* mrcp_client_channel_find_by_name(mrcp_client_session_t *s

 static apt_bool_t mrcp_client_message_send(mrcp_client_session_t *session, mrcp_channel_t *channel, mrcp_message_t *message)
 {
+       apt_str_t session_id, resource_name;
+
       if(!session->base.id.length) {
               mrcp_message_t *response = mrcp_response_create(message,message->pool);
               response->start_line.status_code = MRCP_STATUS_CODE_METHOD_FAILED;
@@ -632,7 +634,8 @@ static apt_bool_t mrcp_client_message_send(mrcp_client_session_t *session, mrcp_
               return TRUE;
       }

-       message->channel_id.session_id = session->base.id;
+       apt_id_resource_parse(&channel->control_channel->identifier,'@',&session_id,&resource_name,message->pool);
+       message->channel_id.session_id = session_id; /* session->base.id; */
       message->start_line.request_id = ++session->base.last_request_id;
       apt_log(APT_LOG_MARK,APT_PRIO_INFO,"Send MRCP Request "APT_NAMESIDRES_FMT" [%"MRCP_REQUEST_ID_FMT"]",
                                       MRCP_SESSION_NAMESID(session),

--
--wbr.


Kamil Shakirov

May 26, 2010, 11:17:57 AM
to uni...@googlegroups.com
At Wed, 26 May 2010 18:08:41 +0500,
Arsen Chaloyan wrote:

> In the meantime I made a quick fix that fixes the problem for now. Now I
> can successfully utilize ASR and TTS resources in one MRCPv2 session.
>
> Thanks for the patch. I haven't tried it yet, but I believe it does the
> trick. Nevertheless, it's clearly just a workaround. The problem is more
> conceptual here.
>
> Of course, I could commit this patch probably #ifdef-ed and disabled by
> default, if you and others find this reasonable.

No problem if anyone finds it useful (I keep my own branch anyway), but I
hope Nuance will fix this issue.

--
--wbr.
