This is an interesting article on the Authors Guild's attempt to get
authors paid audio rights for Kindle's ability to read aloud. Is it
really read aloud if it isn't read by a human?
Alicia
--
eric scoles (esc...@antikoan.com)
An interesting question. My initial impression is that having your device read your book aloud is no different than having a friend read it to you. That seems like a fair use of the product that you have purchased.
On the other hand, it does seem to me that it's a harbinger of things to come. For a few years now I've been expecting the advent of a sort of automated podcast: Blog posts or articles that had been marked-up for automated readers. You could do it with xhtml+css, even -- no need for a new markup language. You'd be able to subscribe to feeds of marked-up text (thus very compact and trivially fast to transfer to your device) that would be interpreted on the fly. It would take some work by patient people to craft ways of describing speech, but the basic work was all done by linguists and phonologists years ago -- it "just" needs to be ported.
[snip]
...
Now to color my initial impressions with Blount's article. First, wow! That was one of the best-written newspaper pieces I have read in a long time. It's almost an essay, though it's been so long since I've read or written one, my specs on essays are a bit rusty.
The automated podcast has been attempted at least once, years ago, and on an unlikely subject: it was an intellectual property law podcast using one of the OS-genre text-to-speech systems. It was awful, and it even used text specifically prepared for it. I couldn't listen to it for more than a couple of minutes. Of course, using a better system would improve that quite a bit, and many improvements have been made in even the OS-powered systems since the podcast was done (I think it was 2003 or 2004). It's very likely that there is another, more listenable, automated podcast out there now.
[snip]
If I understand correctly what you're describing, this would not be that. Sounds like what you describe is a matter of piping text through MacInTalk, or something like that, and saving it as MP3. Plus, they wouldn't have any inflection, as we both noted, so it would be hard to listen to -- especially for something like that, where you need to understand all of it.
[snip]
All my speculations are purely divorced from the IP aspects of it, of course. On the IP level, I still am not sure what to think of it. I think I probably favor a broader reading of Fair Use than is currently accepted. It's my mis-spent Libertarian youth coming back to haunt me.
[snip]
That said, I think it's reasonable to have a separate license for a
performance by a particular reader: human or machine. If Kindle's
reader is just reading the plain text, then it falls under the plain
text license I mentioned above. If it encodes additional information
beyond plain text, then it's a performance that deserves a different
license.
[snip]
Keep in mind that fair use is a defense to copyright infringement, not a right. This is something that is generally not understood by the non-copyright folks (read: 98% or more of the world). You still infringe the copyright and can be sued. You just say that what you did does not warrant any sort of compensation to the copyright holder.
[snip]
On Wed, Feb 25, 2009 at 8:59 AM, Eric Scoles <erics...@gmail.com> wrote:
[snip]
If I understand correctly what you're describing, this would not be that. Sounds like what you describe is a matter of piping text through MacInTalk, or something like that, and saving it as MP3. Plus, they wouldn't have any inflection, as we both noted, so it would be hard to listen to -- especially for something like that, where you need to understand all of it.
[snip]
All my speculations are purely divorced from the IP aspects of it, of course. On the IP level, I still am not sure what to think of it. I think I probably favor a broader reading of Fair Use than is currently accepted. It's my mis-spent Libertarian youth coming back to haunt me.
Keep in mind that fair use is a defense to copyright infringement, not a right. This is something that is generally not understood by the non-copyright folks (read: 98% or more of the world). You still infringe the copyright and can be sued. You just say that what you did does not warrant any sort of compensation to the copyright holder.
I would love to see specific grants of rights for people to read to each other in non-commercial contexts, such as in the car, at bed time, in book clubs, and the like. Specific grants of rights have to be codified into statute or granted by the copyright holders, such as in a notice in the work (hint, hint).
On 2009-02-25, Dave Henn <dave...@gmail.com> wrote:
On Wed, Feb 25, 2009 at 8:59 AM, Eric Scoles <erics...@gmail.com> wrote:
[snip]
If I understand correctly what you're describing, this would not be that. Sounds like what you describe is a matter of piping text through MacInTalk, or something like that, and saving it as MP3. Plus, they wouldn't have any inflection, as we both noted, so it would be hard to listen to -- especially for something like that, where you need to understand all of it.
I don't think we're talking about the same "cues."
Yes, there are periods and commas and paragraphs and quotation marks, and you can code a text-to-speech system to account for that. (MacInTalk does.) But that's a long way from Roy Blount, Jr. Or Tom Bodett. Or Peter Riegert. Imagine Sarah Vowell read by a text-to-speech system. OK, bad example: some people would prefer that, I know. How about David Sedaris?
Consider Blount's point about the accent: IBM has coded that into their voice tree systems, possibly using his own southern accent as one model. I've listened to accented text-to-speech voices, and they're not terrible. But you'd have to know to use them, and there's no cue in plaintext for that. There's also no cue for gender, pitch, timbre, tone, or, really, cadence.
[snip]
To come back to my first question, what's the difference between a
computer reading a text and a human actor reading a text? One's a
separate work and the other isn't? And to Blount's point that computer
speech is becoming more human-like: the line will only become more
blurred as technology improves.
To take it further and farther, if an avatar on SL dramatizes a work
of mine, is it a separate work? If someone erects a virtual mountain
out of a particularly vivid chunk of my text, sets it on their own SL
island and sells tickets to climb it, can I argue that I own the
rights? Someone else put their effort and artistry, their inflection,
into it. With a computer.
The world is changing and we as spec fic writers should be ahead of
the crowd in exploring what new technologies and media do to the old
lines between print, audio, and everything else.
Alicia
> That creates a precedent that would allow everyone to take
> another licensing right from me just because they could.
Does it? What right is being taken away? The right to ensure that
people don't apply some kinds of algorithms to your work? Which kinds
of algorithms?
And what if I buy my own software to read text and connect it somehow
to my Kindle? Should I send you a check if I feed your book via
Kindle into it?
And flip the precedent argument over: should we pursue every possible
licensing right just because we could? (I strongly believe that if
American public libraries were not started when they were, they'd
never have occurred -- corporate interests and many writers alike
would today decry them as theft if the precedent were not already
established and someone tried to start a public library now.)
> Why shouldn't I and every writers group fight that?
Because a machine is reading it. It's not a performance. There's an
algorithm between the purchased text and the reader/listener.
What's the idea here? When people read the book without any
intervening machinery that's OK, but if they apply some machinery to
it then the author should be paid extra? This is a digital reader --
it's code all the way down, it's all machine. What if I plug a Kindle
into my computer so I can read a book on a bigger screen (whether the
Kindle allows this is irrelevant -- suppose a future version did)?
Should the Guild demand the reader should have to pay again, this time
for bigger font rights? What if Kindle came out with an add-on screen
that was more colorful? Should readers be paying more-colorful-
reading rights? And if not, what's the difference? It's code doing
the major work in all these cases -- whether putting it to a screen or
"reading" it.
I'm only going to reply to this part because it's quick. There are LOTS of other cues before a reader, far more than the individual words and the punctuation. There is the context in which each word is employed, which is dependent partly on the words surrounding it, partly on the words in larger portions of the text being processed. There is the flow of the text in a sentence, its rhythm, or, as you say, cadence, which should be parsable using phonemes and syllabic databases (I'm sure I'm mangling the terminology, but you get the idea). How often have you seen someone or yourself read a passage aloud a second time because the first time didn't sound right? Something cued, or didn't, your change in how you read the passage. All of these things are goals for text-to-speech, and context is already being used in many systems. I don't know about rhythm, but that shouldn't be long if it's not already there. As far as gender goes, if a system has a sufficient database of names, it should be able to take a good guess at that, and pitch and timbre would at least partially follow from gender. Tone, I don't know, but context would certainly help there.
Maybe we should do away with copyright altogether and simply rely on contracts alone.
... What if it were a box with a Furby
and a talking Barbie and you could feed any script into it and it
would read the play, alternating between the two robots to simulate
dialog? ...
I stand by my bitter conviction, and proclaim from the rooftops that
THE AUTHORS GUILD LEADERS ARE IDIOTS if they pursue this.
Yeah, maybe I shouldn't say "idiots". "The Craven Depraved and their
Depraved Craven Acts"? "Foot Shooters and the Feet they Shoot"?
"Reactionary Fools and their Foolish Reactions"?
On Wed, Feb 25, 2009 at 7:14 PM, Dave Henn <dave...@gmail.com> wrote:
I'm only going to reply to this part because it's quick. There are LOTS of other cues before a reader, far more than the individual words and the punctuation. There is the context in which each word is employed, which is dependent partly on the words surrounding it, partly on the words in larger portions of the text being processed. There is the flow of the text in a sentence, its rhythm, or, as you say, cadence, which should be parsable using phonemes and syllabic databases (I'm sure I'm mangling the terminology, but you get the idea). How often have you seen someone or yourself read a passage aloud a second time because the first time didn't sound right? Something cued, or didn't, your change in how you read the passage. All of these things are goals for text-to-speech, and context is already being used in many systems. I don't know about rhythm, but that shouldn't be long if it's not already there. As far as gender goes, if a system has a sufficient database of names, it should be able to take a good guess at that, and pitch and timbre would at least partially follow from gender. Tone, I don't know, but context would certainly help there.
....
First: Those are all things that can be inferred, but many of them are choices. Other choices are possible. Those choices are what make a reading a performance.
Second: The value added for a good reading seldom has much to do with those things that are already in the text. It has to do with, yes, the choices that the reader makes in how to interpret or render the stuff that's in the text; but it also has to do with how to render the stuff that's either not in the text (what it makes him/her feel or think, etc.), or that is in the text in ways that no reader will be able to deal with for quite a while. ('Biff's tone oozed. "Oh, but you look wonderful, dear." There was a glint in his eye that I knew well, but Jane did not.') Or consider Rob Sawyer's Kennedy impression in the reading many of us heard him give some months back: Kennedy's not even named in the text, except by implications that only a human would get. Not naming Kennedy has a positive impact on the story-reading experience, because you're allowed to discover who it is that the alien is talking about as you read the words. You could mark that up, though, without having a negative impact on the experience: You discover it by hearing the impression.
Good readings are performances. Performance involves choice. What I'm suggesting is to take the automated reading to a new level -- one analogous to that enabled by MIDI on a good keyboard set, which is absolutely not possible right now or in the truly foreseeable future (i.e., the future that comes before the readers have AI that allows them to interpret the meaning of texts). (To say that it's likely to happen because we've done these things before is not what I mean by 'foreseeable' in this context, because we don't yet have an idea of how it would be done.)
I feel we're speaking at cross-purposes. My broader argument is that there are applications for a speech markup language. The automated reading of blog posts is simply what I see as an early application. This is not something that would take huge research funding to work out -- the markup spec could be mapped out in a weekend by someone familiar with XML specification, critiqued over a period of time. The particulars could be figured out with low-tech by grad students. Hell, I'd be surprised if there hadn't already been Media Lab projects to do exactly this kind of stuff.
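As a rough illustration of the kind of markup being discussed, here is a minimal sketch in Python. The element and attribute names (`speak`, `prosody`, `rate`, `pitch`) are borrowed from the W3C's Speech Synthesis Markup Language purely as an assumption; nothing in this thread specifies them, and the `flatten` helper is hypothetical. It only shows that delivery cues can travel with otherwise plain text and be recovered by a reader program; it does not synthesize any speech.

```python
import xml.etree.ElementTree as ET

# Hypothetical marked-up passage. The tag and attribute names are an
# assumption (SSML-style), not a spec proposed by anyone in the thread.
MARKED_UP = (
    "<speak>"
    "Biff's tone oozed. "
    '<prosody rate="slow" pitch="low">Oh, but you look wonderful, dear.</prosody>'
    " There was a glint in his eye."
    "</speak>"
)


def flatten(element, inherited=None):
    """Yield (text, cues) pairs in reading order.

    Cues are the attributes of the enclosing elements; inner elements
    override outer ones. Text outside any cue-bearing element gets {}.
    """
    cues = {**(inherited or {}), **element.attrib}
    if element.text:
        yield element.text, cues
    for child in element:
        yield from flatten(child, cues)
        # Text after a child's closing tag belongs to the parent's context.
        if child.tail:
            yield child.tail, cues


segments = list(flatten(ET.fromstring(MARKED_UP)))
for text, cues in segments:
    print(repr(text), cues)
```

A speech engine would take each segment's cues as rendering hints, while a device that ignores the markup can still join the segment text back into the plain passage.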
OK, I wish I'd written "infuriatingly counterproductive" in place of
each instance of "idiotic" and other such adjectives.
On Feb 26, 11:57 am, Dana Paxson <dwpax...@acm.org> wrote:
> What a wild thread this is! But it hits at a huge problem any artist or
> creator has to face: How do I get an appropriate reward for my creative
> efforts?
Yep.
But I'm going to throw in another huge topic. The future. Ever since
the 19th century spawned a whole literature of future utopias people
have argued, sometimes convincingly, sometimes not, for running the
world in a fundamentally different way than the present. (They got
this, just as writers always have, from the morning headlines,
specifically discussions of Communism and Socialism replacing
Capitalism, but I'm going to ignore that.)
The one thing that Utopians and their successors always left out was
the messy transition. How did a society, with society's massive
inertia and legacy problems, change over from the present day to the
future that presumed universal change down to its very roots?
You see this in current arguments from those who want to get rid of
our suburban sprawl and replace it with tight-knit cities dependent on
mass transit. How do you get there from here? How do you compensate
the 60% of America whose current choices would be devalued by removing
cars from the equation?
The world in which the argument is that technology exists and
everybody has to let it roll over them is like those. In a future
world content will be produced under a different payment model than
today. It can't be expected that those who developed their business
plans for the current business model will abruptly drop them any more
than Microsoft has dropped all its DOS from the OS I'm using, no
matter how much theoretical sense such a change might make.
Nor are other industries' transitions proof that technology triumphs
over all.
Imusic is a better business model than Napster. Hulu makes
more money than YouTube. People will pay for content, if the price is
right and the delivery mechanism is right. Taking it and putting it out
for free has changed industries but hasn't broken them, because
content has value. Content owners will fight for that value and find
acceptable ways to extract payments.
It's not that fight that is self-defeating, it's the taking for free
fight that is. People will strike back when their possessions are
stolen. In the long run, even on the Internet, the thieves have been
the ones to lose.
The future of books will be different. The transition will be messy.
The result will be a compromise in ways that probably no one is
predicting. What I guarantee it won't be is what would happen if you
wiped the slate clean (what a hoary obsolete cliché that is - and yet
we still understand it!) and created the new system from scratch.
On Feb 26, 2:04 pm, Jonathan Sherwood <jonathan.r.sherw...@gmail.com>
wrote:
> Stross had an interesting take on this in Accelerando. He laid out a world
> where doing something for money became unfashionable; it was preferable to
> do something for free and then reap the rewards of the kudos or reputation
> you received because of it. Essentially, reputation became the new
> barter-esque currency.
Cory Doctorow did the same two years earlier in Down and Out in the
Magic Kingdom. That's how Cory made his name. There was a lot of talk
on the Net at that time or earlier about how important reputations
were becoming and noting that message boards were allowing people to
vote on other people's status and positing all sorts of changes
thereof.
In fact, nothing really changed. Sure, a few authors have become well-
known and even bestselling for marketing themselves on the web with
free content, but a few authors have always stood out because of their
marketing. A few groups have become known for their free music. A few
have become known for their movies. A few have become known for their
blogs. The tools changed but the underlying economic reality still
means that to make money their products get sold in the old-fashioned
way. Google "reputation economy" for discussion of this. Five years
old and already obsolete.
Craig, the Guild is most definitely defending an existing business.
The audio book business is the one part of publishing that's making
money.