CORRECTION: I’m a victim of SpinVox Spin – Sorry!

As some eagle-eyed readers of Mobile Industry Review have spotted, my previous post, entitled ‘SpinVox claims only 2% of messages are referred to humans’, was incorrect.

I read their press release (included in the previous post) and centred on this paragraph:

“This process has helped us improve our accuracy massively. Since its inception in 2007, the technology has improved to the extent that the system requires only two per cent of the input it required just two years ago and can even now predict more than 99 per cent of what most people speaking in English or Spanish will say next. Or to put it another way, in just two years, we have reduced the requirement for human intervention to just a few hundred agents per market compared to the thousands per market when we started. Our world-class speech scientists in the Advanced Speech Group have helped make this system unchallenged in terms of accuracy, speed and reliability.”

My immediate takeaway? Only 2% of their messages are referred to humans.

‘Finally,’ I thought, ‘We can get on with our lives and stop having to wonder about the SpinVox technology.’

2% is good enough, right?

Well, yes.

But I think I was willing this to be the case. I wonder how you read that paragraph? A lot better than I did in the cold light of day I’m sure.

It turns out I got it wholly wrong, as this anonymous poster rightly points out in the comments on the original article:

“The system requires only two per cent of the input it required just two years ago”

That’s not the same as “only 2% of messages are referred to humans”

Sorry to be a pedant, but there are lies, damned lies and statistics!!

Ed Lea joined the discussion thus:

“Since its inception in 2007, the technology has improved to the extent that the system requires only two per cent of the input it required just two years ago.”

That’s certainly 2% of something, but I’m not sure what. There are a lot of different numbers and stats in their statement but it feels like there’s a bit of smoke and a few mirrors left.

PR expert Patrick Smith contributed:

Actually, after reading the release again, they still don’t actually state how many messages are done by computers or humans, or a bit of both. What they state is they need 98% less human interaction – which isn’t quite the same as 2% of messages done by humans.

So, even after all this, they’re trying to hide behind ‘clever’ PR.

Why can’t they just be up front and honest?

Kieran wasn’t impressed:

SPINvox, I honestly don’t think people would mind if they were more honest and less Spin

Ewan Spence then demonstrates my plonker status thus:

Hold on a mo, still think you’re being spun here

“the system requires only two per cent of the input it required just two years ago”

Let’s be evil. Let’s try and work out how many MESSAGES this is, i.e. how BIG can the number be. First up, let’s say at the start ALL messages had to go to humans. 100%. So that means now that the input required is two percent of the workload. Okay, imagining a sample of 1000 SMS messages, each filling 160 characters. That’s 160,000 letters, so taking an average of 5 letters per word (good rule of thumb in code breaking) that’s 32,000 words.

So in year X, 32,000 of 32,000 words are checked by humans.

Now, in year X+2, 32000/50 only 640 words need checked. But how many messages does that represent? Well if the 160-character SMS messages contain, on average, 32 words, then our batch of 1000 messages with an equal spread has one wrong word in two thirds of the messages, and that means we could be looking at an upper limit of 66% of messages requiring human intervention.

That’s a far cry from the PR spin attempt at 2%

…and I’m not even a statistician!
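
Ewan’s back-of-the-envelope sum is easy to check in a few lines of Python. This is only a sketch of his reasoning under his own assumptions: 1000 messages of 160 characters, 5 letters per word, every word checked by humans in year X, and the worst case where each word still needing a human sits in a different message. The function and parameter names are illustrative, and none of the figures come from SpinVox.

```python
def worst_case_messages_needing_humans(
    messages=1000,          # sample size used in the comment
    chars_per_message=160,  # one full SMS
    chars_per_word=5,       # rule-of-thumb average word length
    remaining_percent=2,    # "only two per cent of the input"
):
    """Upper bound on messages touched by a human, per the comment's assumptions."""
    # Year X: every word in the sample is checked by a human.
    total_words = messages * chars_per_message // chars_per_word
    # Year X+2: only remaining_percent of that word-level workload is left.
    words_still_checked = total_words * remaining_percent // 100
    # Worst case: each remaining word falls in a different message,
    # capped at the number of messages in the sample.
    messages_touched = min(messages, words_still_checked)
    share_percent = 100 * messages_touched / messages
    return total_words, words_still_checked, share_percent


total, checked, share = worst_case_messages_needing_humans()
print(f"{total} words, {checked} still checked, up to {share:.0f}% of messages")
```

With those inputs the bound works out at 640 of 1000 messages, or 64%, which is close to the ‘two thirds’ Ewan quotes and still nowhere near 2%.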

Thank you chaps for your contributions and for helping clarify that my original post wasn’t accurate. I’ve updated the headline accordingly and drafted this one to go out across the social media wires quickly to replace the old one.

As for SpinVox, goodness me, I’ll need to read their statements a lot more carefully next time.

Can I get a non-slippy, non-hidden, non-verbal-reasoning-test answer to the following question, please, SpinVox?

How many messages are referred — in any way — to human operators during the transcription process?

It’s great that you need 98% less human interaction. But I don’t know what that means. Maybe humans needed to listen to the entire message and transcribe everything when you launched, and now they only need to listen to parts of the message or particular phrases? I don’t know.

12 COMMENTS

  1. Have they not out-spun themselves?

    The way they've expressed this 2% suggests that more than 2% of messages require human interaction – otherwise why wouldn't they just say 2% (or less).

    If when they started they required 100% intervention then they are now at 2%. Presuming that they never required 100% they must be (after 98% reduction) at less than 2%. Let's call it 1%.

    Well done Porter Novelli. We're all thinking 'more than 2%' but the actuality must be less than 2%

  2. SpinVox Voice Message Conversion System (VMCS™) is now so advanced + efficient it has reduced use of QC agents by 98% in just two years.

    This does not mean 98% automation OR 2% manual. It means they have reduced agents by 98%. I am sure they have a level of automation. However ASR technology even with AI attached will not produce 98% automation. These numbers are only achievable with constrained vocabularies. They also have the added challenge that mobile audio is notoriously noisy so even knowing what 99% of people might say does not mean your engine will understand what they have said.

    This whole debate comes down to less spin and more vox. What percentage of voicemails (not hangups or empty calls) requires human interaction?

  3. If the BBC is allowed to visit their datacentre, this is what they should be looking for:

    If they have 30m customers, each one receiving on average 1 message per day, that results in about 4,500 servers being necessary, with 8 cores each, assuming each 30 second voicemail processes at 3x realtime or 1:30 (which is considered state of the art in modern speech recognition). This calculation is based on 10% of transactions happening during the peak busy hour, which is standard when sizing telecom platforms.

    I bet you will find they do not have the servers necessary to claim automation for these 30m customers.

  4. NOTE – if they are counting hangups or empty calls in these stats, which can be automated as there is no message then this alone would account for almost 50% of all voicemails. the PR does not answer the question asked by BBC it actually stokes the fire more as it just seems there is more spin than vox.
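
The datacentre sizing in the comment above can be worked through the same way. The sketch below takes the inputs stated there (30m customers, 1 message each per day, 10% of traffic in the busy hour, 30 second voicemails processed at 3x realtime, 8 cores per server) and adds two assumptions of its own: one core handles one message at a time, and the busy-hour load is spread evenly across the hour. The function name and parameters are illustrative.

```python
def servers_for_busy_hour(
    customers=30_000_000,
    messages_per_customer_per_day=1,
    busy_hour_percent=10,  # share of daily traffic in the peak hour
    message_seconds=30,    # length of an average voicemail
    realtime_factor=3,     # 3x realtime: 30 s of audio takes 90 s to process
    cores_per_server=8,
):
    """Rough server count to process the busy hour with no backlog."""
    daily_messages = customers * messages_per_customer_per_day
    busy_hour_messages = daily_messages * busy_hour_percent // 100
    # Core-seconds of processing that arrive during the busy hour.
    core_seconds = busy_hour_messages * message_seconds * realtime_factor
    # Assumption: one core works on one message at a time, evenly spread load.
    cores = -(-core_seconds // 3600)        # ceiling division
    servers = -(-cores // cores_per_server)
    return busy_hour_messages, cores, servers


busy, cores, servers = servers_for_busy_hour()
print(f"{busy} busy-hour messages -> {cores} cores -> {servers} servers")
```

On this reading the answer is 75,000 cores, or 9,375 eight-core servers, which does not reproduce the 4,500 quoted in the comment. The commenter presumably used assumptions that are not spelled out, but either way the estimate runs to thousands of servers.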
