Pharma Strategy Blog

Commentary on Pharma & Biotech Oncology / Hematology New Product Development


One of the big challenges of the Web 2.0 world is that, as more services deliver audio through YouTube and other tools, the ability to transcribe and translate it into the written word has yet to catch up.  We've all experienced the vagaries of Google Translate on websites, but what about audio files?

A while back, I enthusiastically tried out some apps on my iPhone, including ReQall and Jott, to test their accuracy for business purposes.  Now, even bearing in mind that I have a British, not American, accent, it was a bit of a surprise to read a translated shopping list and find that I needed an item hilariously called "break your knees" and other such gibberish.  Sadly, I was at the checkout before realising that it actually meant 'frozen garden peas'!

It was therefore with great amusement that I read an interesting article in the New York Times this morning about Google Voice.  Here is one such snippet:

[Image: snippet from the New York Times article]

"Sunday schnitzels cripples"?  Yikes, I'm glad my accent isn't the only one that gets mangled by the computer algorithms!!  

A few years ago, we used to send CDs of physician interviews off for medical translation and, of course, they would come back with some absolute gems, so we had to listen to the whole thing again to fix the quirks and strangulated medical terminology.  After a while, we began to think it might be easier to do it ourselves; at least it would be accurate, if time-consuming.

Having just signed up for Google Voice, I will be interested to see how good the service really is, but raising one's hopes too much is probably not a good idea after reading the hilarious examples in the Times article.

How many of us have started to get those small Flip video cameras for posting content to the audio web via YouTube, Qik, UStream, Viddler, Vimeo, etc.?  At $200 a pop for a point-and-shoot, costs are now so low that a major barrier has been removed, and the masses can take advantage of new technology without needing a PhD to figure out weighty and complicated manuals.

Recently, Fred Wilson of A VC described how the API could be used to get a written translation of a half-hour-long interview from a YouTube video.  Most of us can scan text and mine it for the key points relevant to us much more quickly than we can listen to an audio presentation without written cues.

Many of us listen to Pharma and Biotechnology analyst presentations regularly, but these are time-consuming. There is many a time when I would rather download the presentation and a transcript for easy offline review on a train or plane than sit through a live or recorded webcast for an hour at an inconvenient time. And of course, you don't know where or when the relevant items of interest will pop up, which forces you to sit through the whole thing to get a small nugget of intelligence that might actually be valuable.
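
For the more technically minded, here is a rough sketch of what scanning a transcript for those nuggets might look like. It assumes you already have a transcript split into timestamped chunks; the `Segment` format, the function names and the sample text are all made up for illustration and do not come from any particular transcription service.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One timestamped chunk of a transcript (hypothetical format)."""
    start_seconds: int
    text: str


def format_timestamp(seconds):
    """Format a second count as MM:SS."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes:02d}:{secs:02d}"


def find_mentions(segments, keywords):
    """Return (timestamp, text) pairs for segments mentioning any keyword."""
    lowered = [k.lower() for k in keywords]
    hits = []
    for seg in segments:
        text = seg.text.lower()
        # Simple case-insensitive substring match, nothing cleverer.
        if any(k in text for k in lowered):
            hits.append((format_timestamp(seg.start_seconds), seg.text))
    return hits


# Made-up example data, not taken from any real webcast.
webcast = [
    Segment(0, "Welcome to the quarterly update."),
    Segment(754, "Turning to the pipeline, the phase III trial is fully enrolled."),
    Segment(2130, "We expect to file with the FDA in the second half."),
]

for stamp, text in find_mentions(webcast, ["phase III", "FDA"]):
    print(stamp, text)
```

Because this is plain substring matching, a very short keyword can also match inside a longer word, and of course the results are only as good as the transcript itself.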

Still, in the Pharma world, imagine if you could accurately translate audio from the web that included patient sentiments about brands, diseases or even companies.  That would be very powerful indeed, especially with significant growth in online communication expected to come from video over the next few years.  Being able to analyse the ideas expressed in aggregate would be a really useful tool for social media monitoring.

Watch this space for further developments in the near future.


4 Responses to “The evils of audio translation and its relation to Pharma”

  1. James Whatley

    Drop me a line if you want to try SpinVox… 🙂
    via @mikeashworth on Twitter

  2. Diane Saarinen

    Drop me a line if you want to try good old human audio transcription (and I speak medical-ese!) 😉

  3. Ian R McAllister

    Interesting debate Sally! My dissertation for my first degree was on the subject of artificial voice creation/storage. I was then part of the Directory Enquiries (DQ) team at BT, using it to double staff productivity. No system at present is ever going to be perfect, limited as they are by: the mathematics needed to understand human language; the speed of applying that algorithm; and storage cost. To create the DQ “your number is” part, they employed an actress for four months to record all of the various sound iterations of each letter/combination: that still sounds false. I use the Odiogo plugin on my WordPress blog and, accepting the American accent, its instructions warn that perfect written English/punctuation will often not result in great audio. I think the best we can hope for at present is a possible time saving/greater access device. The number of blind/partially sighted blog readers I have had comments from has very much struck me since installing Odiogo. One last point, and it's still true today: gamblers perceive that talking devices have more intelligence and hence lesser odds over old “dumb” one armed bandits – strange but true!

  4. Clare D

    Yes, that’s the trouble with podcasts I find – when I listen to them I find them fascinating – it’s just grabbing the time that’s the problem. I guess I just don’t do enough ironing these days.

Comments are closed.
