•   11 months ago

WIT as a transcription service

Hey fellas,
Can we use Wit as a pure transcription service? I am looking for a response something like the one shown below (I need the timestamps of individual words). This would be basic, level-1 transcription without any training. Is this possible with the current Wit speech API? If I understand correctly, right now I only get timestamps for the entities that are extracted from the speech after training.

I'm quite new to the world of natural language interaction, so please let me know if I am missing something here.

Cheers
Prateek

{
  "items": [
    {
      "start_time": "2.23",
      "end_time": "2.78",
      "alternatives": [
        {
          "confidence": "0.9582",
          "content": "morning"
        }
      ],
      "type": "pronunciation"
    },
    {
      "alternatives": [
        {
          "confidence": "0.0",
          "content": "."
        }
      ],
      "type": "punctuation"
    },
    {
      "start_time": "2.79",
      "end_time": "2.91",
      "alternatives": [
        {
          "confidence": "0.861",
          "content": "Who"
        }
      ],
      "type": "pronunciation"
    },
    {
      "start_time": "2.91",
      "end_time": "3.04",
      "alternatives": [
        {
          "confidence": "0.8081",
          "content": "would"
        }
      ],
      "type": "pronunciation"
    },
    {
      "alternatives": [
        {
          "confidence": "0.0",
          "content": "?"
        }
      ],
      "type": "punctuation"
    }
  ]
}
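
To make the requested shape concrete, here is a minimal Python sketch of how a client might pull per-word timestamps out of a response like the sample above. It assumes the sample is wrapped in a top-level object so that it parses as JSON; `word_timings` and `SAMPLE` are hypothetical names for illustration, and this is the format being asked for, not something the Wit API is known to return.

```python
import json

# Trimmed copy of the sample above (the desired format, not a Wit response).
SAMPLE = """
{"items": [
  {"start_time": "2.23", "end_time": "2.78",
   "alternatives": [{"confidence": "0.9582", "content": "morning"}],
   "type": "pronunciation"},
  {"alternatives": [{"confidence": "0.0", "content": "."}],
   "type": "punctuation"},
  {"start_time": "2.79", "end_time": "2.91",
   "alternatives": [{"confidence": "0.861", "content": "Who"}],
   "type": "pronunciation"}
]}
"""


def word_timings(payload):
    """Return (word, start_seconds, end_seconds) for each spoken word."""
    timings = []
    for item in payload["items"]:
        if item.get("type") != "pronunciation":
            continue  # punctuation items carry no timestamps in the sample
        # Pick the alternative with the highest confidence score.
        best = max(item["alternatives"], key=lambda alt: float(alt["confidence"]))
        timings.append(
            (best["content"], float(item["start_time"]), float(item["end_time"]))
        )
    return timings


if __name__ == "__main__":
    for word, start, end in word_timings(json.loads(SAMPLE)):
        print(f"{start:6.2f} {end:6.2f}  {word}")
```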

  • 2 comments

  • Manager   •   11 months ago

    Hey! Here's the answer I got for you:

    You can send an audio file and get back the text that wit.ai transcribes from it, but the response does not break the transcript down into individual words. See https://wit.ai/docs/http/20200513 > "Retrieve the meaning of an audio wave" for more detail.

    Hope that helps!
    Stefanie
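
For anyone landing here later, the audio request described above could be sketched in Python roughly as follows. This is a sketch under assumptions, not official client code: the endpoint and headers follow the linked HTTP docs as I understand them, `build_speech_request` and `transcribe` are hypothetical helper names, the token is a placeholder, and reading the transcript from a `text` key is an assumption about this API version.

```python
import json
import urllib.request

# Speech endpoint for the API version referenced in the docs link above.
WIT_SPEECH_URL = "https://api.wit.ai/speech?v=20200513"


def build_speech_request(audio_bytes, token, content_type="audio/wav"):
    """Build (but do not send) a POST request carrying raw audio bytes."""
    return urllib.request.Request(
        WIT_SPEECH_URL,
        data=audio_bytes,
        headers={
            "Authorization": f"Bearer {token}",  # placeholder token
            "Content-Type": content_type,
        },
        method="POST",
    )


def transcribe(path, token):
    """Send an audio file and return the transcript string, if present."""
    with open(path, "rb") as audio_file:
        request = build_speech_request(audio_file.read(), token)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumption: the transcript sits under "text"; no per-word timing is expected.
    return body.get("text")
```

Calling `transcribe("hello.wav", "YOUR_TOKEN")` would return the whole utterance as a single string at best, which matches the limitation described in the answer.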

  •   •   11 months ago

    Thanks for your time and efforts, Stefanie.
    This is helpful, but unfortunately it solves only half of my problem :) I need the individual words and their timestamps. If that's not available, it can't be helped. I will try to figure something out.

    Happy Weekend.

    Cheers
    Prateek
