Scientists find secret to writing a best-selling novel
Computer scientists have developed an algorithm which can
predict with 84 per cent accuracy whether a book will be
a commercial success - and the secret is to avoid cliches
and excessive use of verbs
By Matthew Sparkes
The Telegraph
January 9, 2014
Scientists have developed an algorithm which can analyse
a book and predict with 84 per cent accuracy whether or
not it will be a commercial success.
A technique called statistical stylometry, which
mathematically examines the use of words and grammar, was
found to be "surprisingly effective" in determining how
popular a book would be.
The group of computer scientists from Stony Brook
University in New York said that a range of factors
determine whether or not a book will enjoy success,
including "interestingness", novelty, style of writing,
and how engaging the storyline is, but admitted that
external factors such as luck can also play a role.
By downloading classic books from the Project Gutenberg
archive they were able to analyse texts with their
algorithm and compare its predictions to historical
information on the success of the work. Everything from
science fiction to classic literature and poetry was
included.
It was found that the predictions matched the actual
popularity of the book 84 per cent of the time.
They identified several trends common to successful
books, including heavy use of conjunctions such as "and"
and "but" and large numbers of nouns and adjectives.
Less successful work tended to include more verbs and
adverbs and relied on words that explicitly describe
actions and emotions such as "wanted", "took" or
"promised", while more successful books favoured verbs
that describe thought processes such as "recognised" or
"remembered".
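
As a rough illustration of the kind of word counting
described above, here is a minimal Python sketch. It is
not the researchers' algorithm: the word lists contain
only the examples quoted in this article, and the names
CATEGORIES and category_rates are hypothetical. It simply
reports what share of a text's words falls into each list.

```python
import re
from collections import Counter

# Illustrative word lists built only from the examples quoted in the
# article; the researchers' actual feature set is not given here.
CATEGORIES = {
    "conjunction": {"and", "but"},
    "action_or_emotion_verb": {"wanted", "took", "promised"},
    "thought_verb": {"recognised", "remembered"},
}


def category_rates(text):
    """Return each category's share of all word tokens in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        for name, words in CATEGORIES.items():
            if token in words:
                counts[name] += 1
    total = len(tokens) or 1  # avoid dividing by zero on empty input
    return {name: counts[name] / total for name in CATEGORIES}


sample = "She remembered the house and the garden, but he wanted to leave."
rates = category_rates(sample)
for name, rate in rates.items():
    print(f"{name}: {rate:.3f}")
```

A real stylometric study would use many more features and
a trained classifier; this sketch only shows how such
word-level rates could be computed for one passage.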
To find "less successful" books for their tests, the
researchers scoured Amazon for low-ranking books in terms
of sales. They also included Dan Brown's The Lost Symbol,
despite its commercial success, because of "negative
critiques it had attracted from media".
"Predicting the success of literary works poses a massive
dilemma for publishers and aspiring writers alike," said
Assistant Professor Yejin Choi, one of the authors of the
paper published by the Association for Computational
Linguistics.
"To the best of our knowledge, our work is the first that
provides quantitative insights into the connection
between the writing style and the success of literary
works.
"Previous work has attempted to gain insights into the
'secret recipe' of successful books. But most of these
studies were qualitative, based on a dozen books, and
focused primarily on high-level content - the
personalities of protagonists and antagonists and the
plots. Our work examines a considerably larger collection
- 800 books - over multiple genres, providing insights
into lexical, syntactic, and discourse patterns that
characterise the writing styles commonly shared among the
successful literature."
Continues at:
http://www.telegraph.co.uk/technology/news/10560533/Scientists-find-secret-to-writing-a-best-selling-novel.html
Jai Maharaj, Jyotishi
Om Shanti
http://groups.google.com/group/alt.fan.jai-maharaj