[0002] Often, the first step in patenting an invention is performing a
search of earlier documents (i.e., prior art) to determine if the invention
is new and non-obvious over what was available publicly prior to the time of
the invention. Similarly, the first step in determining the validity of an
issued patent is usually a prior art search. Typically, a prior art search
is performed in one of two ways. These two methods are often referred to as
classification searching and keyword searching.
[0003] Under the classification method of searching, each document in a
database of documents is associated with one or more classes and/or
subclasses by a person familiar with the art. For example, an invention
related to a web server for hosting thumbnail images generated from uploaded
digital photographs may be associated with class 707/104.1 (as well as
others) in the U.S. patent classification system. The searcher then selects
one or more of the classes and/or subclasses related to the invention he is
searching for, and reviews each of the documents in the chosen
classes/subclasses. The searcher's review of the documents may include
viewing the figures associated with the documents and/or reading some or all
of the text associated with each of the documents. This review process may
be performed with hard copies of the documents and/or on a computer screen.
[0004] The classification searching method has certain drawbacks. In the
classification system, a number of classes/subclasses must be created and
maintained. For example, the U.S. patent classification system has over 400
classes, and most of these classes have several subclasses. For a large
number of documents (e.g., millions of U.S. patents), if the number of
classes/subclasses is too small, there are too many documents in each
class/subclass to review in a timely manner. If the number of
classes/subclasses is too large, determining which classes/subclasses a
particular document belongs to becomes complex, and needing to review
multiple classes/subclasses can also produce an unmanageable number of
documents. Human error potentially plays a role each time a document is
classified and each time that document is sought. The classifier may
misclassify the document and/or the searcher may not search in the correct
class(es). Even if no errors occur, there may be hundreds of legitimate
documents that are highly relevant to the search. Manually reviewing
hundreds of documents is time consuming.
[0005] Under the keyword method of searching, a searcher enters one or more
keywords and Boolean operators into a computer which transmits a query to a
database. For example, if the invention is related to a web server for
hosting thumbnail images generated from uploaded digital photographs, the
searcher may enter:
[0006] SPEC/(server OR host) AND (thumbnail OR "low resolution image") AND
(upload OR transmit) AND ("digital photograph" OR "digital image")
[0007] The database will then return some or all of the documents it holds
that contain at least one occurrence of "server" or "host" and at least one
occurrence of "thumbnail" or "low resolution image" and at least one
occurrence of "upload" or "transmit" and at least one occurrence of "digital
photograph" or "digital image".
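The query logic in this example (OR within each parenthesized group, AND across groups) can be sketched in a few lines of Python. This is only an illustration of the Boolean semantics, not how any real patent database is implemented. The names `QUERY`, `contains`, `matches`, and `search` are hypothetical, and the sketch assumes exact whole-word matching with no stemming (so "servers" would not match "server").

```python
import re

# Each inner list is one query element. Terms inside a list are OR'ed,
# and the lists are AND'ed together, mirroring the example query.
QUERY = [
    ["server", "host"],
    ["thumbnail", "low resolution image"],
    ["upload", "transmit"],
    ["digital photograph", "digital image"],
]


def contains(text, term):
    """True if `term` occurs in `text` as a whole word or phrase (case-insensitive)."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def matches(text, query=QUERY):
    """True if every element has at least one of its terms present in `text`."""
    return all(any(contains(text, term) for term in element) for element in query)


def search(documents, query=QUERY):
    """Return the ids of the documents (a dict of id -> text) that satisfy the query."""
    return [doc_id for doc_id, text in documents.items() if matches(text, query)]
```

A document is returned only when all four groups are satisfied; missing any single group excludes it, no matter how often the other terms occur.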
[0008] The keyword search method also has certain drawbacks. First, the
search iteration cycle is so time consuming that it effectively prohibits
extensive "element scoping." In the example above, the first "element" of
the Boolean search is directed to the web server portion of the invention.
The searcher may prefer to find "web server" over "server," because "web
server" is narrower (i.e., more on point). However, the searcher probably
realizes that "web server" may be harder to find in combination with the
other elements than "server." Similarly, the searcher may prefer "server"
over "host" for essentially the same reasons. "Host" seems more likely to be
found out of context for this search. In other words, the searcher is
typically able to come up with terms that have varying scope from narrow
(more desirable/less likely to find) to broad (less desirable/more likely to
find), but not knowing what is available in the prior art, the searcher does
not know how "greedy" to get with his search terms.
[0009] Performing multiple searches with varying scope may be too
time-consuming. For example, if each of five elements is varied over three
levels of scope, the searcher may have to enter and review 243 separate
searches. If the searcher is going to iterate his search at all, he must
evaluate the results of each search in order to determine if that iteration
is better or worse than iterations that have come before it. Typically,
existing searching systems allow the searcher to review various aspects of
each document (e.g., title, abstract, specification, and drawings) between
search iterations in order to make this determination. However, it is
typically up to the searcher to "skim" the document to determine if it is a
good one. Skimming an unannotated document can be time consuming and error
prone.
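The figure of 243 searches is simply three scope levels raised to the power of five elements. A minimal sketch that enumerates the combinations follows. The element labels and scope names are placeholders, not terms from the application.

```python
from itertools import product

# Placeholder labels: five query elements, each tried at three scope levels.
ELEMENTS = ["A", "B", "C", "D", "E"]
SCOPES = ["narrow", "medium", "broad"]


def scope_combinations(elements=ELEMENTS, scopes=SCOPES):
    """Yield one dict per search, mapping each element to a scope level."""
    for choice in product(scopes, repeat=len(elements)):
        yield dict(zip(elements, choice))


searches = list(scope_combinations())
print(len(searches))  # 3 ** 5 = 243
```

Each added element multiplies the count by three, which is why exhaustive scoping quickly becomes impractical for a human searcher.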
[0010] In order to review search results in a timely fashion, the searcher
typically only reviews the "top" X search results (e.g., the "best" ten).
However, this leads to a second problem with keyword searching for prior
art: what is "better" than something else? Typically, search results are
ranked in some manner before they are displayed to the user. Some systems do
not help the searcher determine which results are "better." For example,
some prior art searching systems will simply rank the search results by
patent number or filing date.
[0011] Other systems will attempt to rank the results based on the number of
occurrences of the search terms. While this approach may work for some
searching applications, it is fundamentally flawed for prior art searching
applications. For example, if a searcher is looking for five different
elements (e.g., A, B, C, D, and E) and one prior art reference has one
hundred occurrences of A, but only one occurrence each of B, C, D, and E (for a
total of 104 occurrences), and a second prior art reference has 20
occurrences of each element A, B, C, D, and E (for a total of 100
occurrences), most patent professionals would rather see the second
reference even though it has fewer total occurrences.
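The arithmetic in this example is easy to check. The sketch below ranks the two references by total occurrences, using the counts from the paragraph above. The function names are illustrative and do not come from any actual search product.

```python
# Occurrence counts per element, taken from the example in the text.
REF_1 = {"A": 100, "B": 1, "C": 1, "D": 1, "E": 1}
REF_2 = {"A": 20, "B": 20, "C": 20, "D": 20, "E": 20}


def total_occurrences(counts):
    """Sum the occurrences of all search terms in one reference."""
    return sum(counts.values())


def rank_by_total(references):
    """Return reference names sorted by total occurrences, highest first."""
    return sorted(
        references,
        key=lambda name: total_occurrences(references[name]),
        reverse=True,
    )


ranking = rank_by_total({"first": REF_1, "second": REF_2})
```

Here `ranking` puts the first reference (104 occurrences) ahead of the second (100), which is the opposite of what the paragraph says a patent professional would want.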
[0012] If instead, the searching system determined the "better" result by
giving each search term a "vote" (e.g., compare occurrences on a term by
term basis), the ranking result would be "correct" for the example above
because the second prior art reference in the example above "wins" on 4 out
of the 5 search terms. However, under a voting system, the ranking result
would be "incorrect" for a search result set where the first prior art
reference had occurrences of (A=10, B=10, C=10, D=1, E=0) and the second
prior art reference had occurrences of (A=9, B=9, C=9, D=100, E=100) because
this result would fail to take into account the difference in patent law
between "102 art" and "103 art" wherein "103 art" is inferior because it
completely lacks an element.
[0013] Even if the "103 art" aspect is taken into account by not considering
references that do not have at least one occurrence of each search term,
there is still a problem with the voting algorithm described above. For
example, the voting algorithm would rank a search result set where the first
prior art reference had occurrences of (A=10, B=10, C=10, D=1, E=1) higher
than a second prior art reference which had occurrences of (A=9, B=9, C=9,
D=100, E=100), because the first prior art reference in this example "wins"
on 3 out of the 5 search terms. However, most patent professionals would
prefer to see the second prior art reference in this example over the first
prior art reference, because it appears to be essentially the same as the
first prior art reference on the first three elements, but far superior on
the last two elements.
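The voting scheme criticized in the last two paragraphs can be sketched as follows. This is one plausible reading of "one vote per search term" and assumes ties award no vote. The function names `votes` and `has_every_term` are hypothetical.

```python
def votes(first, second):
    """Return (wins for first, wins for second), comparing term by term.

    Both arguments map the same search terms to occurrence counts.
    A tie on a term awards no vote to either reference.
    """
    first_wins = sum(1 for term in first if first[term] > second[term])
    second_wins = sum(1 for term in first if second[term] > first[term])
    return first_wins, second_wins


def has_every_term(counts):
    """True if the reference contains at least one occurrence of each term."""
    return all(n > 0 for n in counts.values())


# Example where the first reference lacks element E entirely.
MISSING_E = {"A": 10, "B": 10, "C": 10, "D": 1, "E": 0}
# Example where every element occurs at least once.
WEAK_D_E = {"A": 10, "B": 10, "C": 10, "D": 1, "E": 1}
STRONG_D_E = {"A": 9, "B": 9, "C": 9, "D": 100, "E": 100}

print(votes(MISSING_E, STRONG_D_E))  # (3, 2)
print(votes(WEAK_D_E, STRONG_D_E))   # (3, 2)
```

In both cases the first reference wins 3 votes to 2. Filtering with `has_every_term` removes `MISSING_E` from consideration but leaves `WEAK_D_E` ranked above `STRONG_D_E`, which is the remaining flaw the text describes: voting ignores the size of each margin.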
[0014] A third problem with existing prior art searching systems is that
regardless of what ranking method is used, additional synonyms for the same
claim element (e.g., A or A', B or B' or B", etc.), are not grouped together
by the ranking algorithm. This omission prevents prior art searching systems
from employing the logarithmic based ranking approach described in detail
below.
[0015] A fourth problem with existing prior art searching systems is the
time it takes the patent professional to thoroughly analyze the content of
each document (e.g., read through the "top ten" documents from the search
and determine which one or two of the documents he will use and what
sections he will cite). As a result, some systems may highlight each
occurrence of the search terms in order to aid the searcher in locating the
relevant portions of the document.
[0016] However, the highlighting performed by existing systems suffers from
two drawbacks. First, existing systems use the same color for all search
terms or a different color for every search term. Existing systems do not
group synonyms associated with the same claim element under one color while
using a different color for other groups of synonyms associated with other
claim elements (e.g., A and A'=red, B and B' and B"=blue). Second, existing
systems highlight text versions of the documents. Existing systems do not
highlight hypertext versions of the documents or graphical versions of the
documents with a text layer "underneath" (e.g., a searchable PDF file).
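The grouped highlighting that the text says is missing (one color per claim element, shared by all of its synonyms) might look roughly like the sketch below. It is an assumption-laden illustration only: the `GROUPS` mapping, the `highlight` function, and the choice of HTML `<span>` output are not taken from the application, and the sketch does not handle PDF text layers.

```python
import html
import re

# Hypothetical synonym groups: every term in a group shares one color.
GROUPS = {
    "red": ["server", "host"],
    "blue": ["thumbnail", "low resolution image"],
}


def highlight(text, groups=GROUPS):
    """Return `text` as HTML with each term wrapped in a span colored by its group."""
    color_of = {
        term.lower(): color for color, terms in groups.items() for term in terms
    }
    # Try longer terms first so multi-word phrases win over shorter overlaps.
    terms = sorted(color_of, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b", re.IGNORECASE
    )

    def wrap(match):
        color = color_of[match.group(0).lower()]
        return f'<span style="background-color:{color}">{match.group(0)}</span>'

    # Escape the plain text first so the inserted spans are the only markup.
    return pattern.sub(wrap, html.escape(text, quote=False))
```

With this arrangement a reader can tell at a glance which claim element a highlighted passage relates to, regardless of which synonym actually appeared.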
I should remind everyone of the concerns from maybe a decade ago
when IBM had a patent web site and was possibly retaining search
strings and corresponding IP addresses, with the potential of
utilizing this information against any competitors who utilized
their service.
I would think though that a private inventor might be in better
position to utilize the service as long as he was cautious as to
the detail of information he discloses, since he doesn't have the
professional liability here that patent attorneys and agents do.
Greg O wrote:
> Does this seem feasible?
>
> http://www.freshpatents.com/Methods-and-apparatus-to-search-and-analyze-prior-art-dt20050922ptan20050210042.php?type=description
--
--------------------------------------------------------------------
The preceding was not a legal opinion, and is not my employer's.
Original portions Copyright 2005 Bruce E. Hayden,all rights reserved
My work may be copied in whole or part, with proper attribution,
as long as the copying is not for commercial gain.
--------------------------------------------------------------------
Bruce E. Hayden bha...@ieee.org
Dillon, Colorado bha...@highdown.com
Phoenix, Arizona www.softpates.com
> Does this seem feasible?
sure, it does (seem feasible) - it also seems to add relatively little to
already existing searching algorithms
for example, the use of "keyword search" is actually used to determine the
basis for a "classification search", so they are not quite mutually
independent
moreover, ranking is provided by various search providers, such as Dialog
I am not quite sure what is really new in the "freshpatents" approach
Fran Lorin
www.patent.0catch.com
I couldn't understand how it worked. It's kind of ironic, a patent knocking
what attorneys do now, written by an attorney, in language only attorneys
can understand, to get more business from attorneys.
But can't any search agent potentially disclose information about searches?
Is there any evidence IBM misused this data?
Have there been cases where someone in the US Patent Office has divulged
details of an invention to a competitor? I would have thought there were
lots of workers scanning documents for example who could do this. They might
have a friend who is told about an invention and then tries to improve on
it.
Nothing definite, but extremely suggestive. When in cross
licensing negotiations with IBM, it appeared that they picked
patents to assert partially by what we had previously searched.
In other words, of their thousands of patents, the ones that
they chose to assert against us were ones that had turned up
in searches our attorneys had made using their search tool.
The good thing was that this probably cost IBM money, as
the patents asserted this way were really not relevant to
our product line, and, thus, they ended up asserting irrelevant
patents when they could have asserted some more relevant ones.
I am sure they have, but it is infrequent enough that I don't
know of any cases. The difference here, though, is that a
PTO employee who discloses this sort of information is likely
to go to jail, whereas they most likely wouldn't in this case.
The difference, of course, being that the PTO employees are
constrained by PTO regulations preventing disclosure plus
federal criminal statutes. None of this applies here.
There is some evidence that IBM did make use of some
of this information. In the situation I am aware of, it
didn't seem to have benefited them, and, maybe even harmed
them.
There are really two different, related, issues here. First,
you have actual disclosure. And secondly, you have usage of
the information. I would suggest that most search engines
could be abused and misused in either respect. Note though
that this wouldn't apply to the PTO search engines (see my
next post) due to legal constraints placed on the agency
and its employees.
> Greg O wrote:
> > Have there been cases where someone in the US Patent Office has divulged
> > details of an invention to a competitor? I would have thought there were
> > lots of workers scanning documents for example who could do this. They
> > might have a friend who is told about an invention and then tries to
> > improve on it.
>
> I am sure they have, but it is infrequent enough that I don't
> know of any cases
uh, Mr. Hayden, you state that you are "sure they have", yet you don't know
of any cases? - isn't that inconsistent?
as an ex-examiner myself (nearly 11 years there), I can say that warnings
were frequent and often repeated to examiners in various forums, such as
Patent Academy (the three-phase training program for new recruits),
required periodic ethics classes, etc. to avoid this type of improper
disclosure
however, in the recent past, examiners had file wrappers in their offices
that could be opened and read by unscrupulous attorneys, inventors and other
visitors to the examiners' offices, but all that has changed with the
heightened security at the new facility at Alexandria, VA, and the electronic
file wrappers that can only be accessed by authorized examiners
at the time of Thomas Edison, about 100 years ago, it is quite expected that
he himself would have learned about his competitors in that unscrupulous
manner, considering the type of person he was and the more lax environment
at the Patent Office at that time
Not quite sure of your point there. I really don't know of cases,
but you suggest there are. If so, fine. I really don't know.
Is this something we should be worried about - or shouldn't?
But now that you do bring it up, one place that the USPTO does
differ from the average search site (like the ones we were talking
about) is that there is an expectation of confidentiality. What
this translates to is, at a minimum, sufficiency to legally protect
trade secrets. Thus, if you can show that someone acquired information
in this way, it would be a violation of trade secret law, and open
those involved to liability thereof (including at a minimum the
visitors delving into the file wrappers). You don't have this with
the private search sites - or at least the two discussed.
I remember hearing warnings in the EPO about internet based prior art
searches: we should not give enough information to allow a third party
to guess the subject-matter of the invention from our keywords...
However, the internet searches are aimed to find non patent
literature, less likely to be "surveyed", and we have everything
available in house for patents.
May I remind you that the EPO provides its own patent searching tool
for free at http://ep.espacenet.com
This includes the US patents, and there is no way the EPO is going to
use your search against you :-)
well, Mr. Hayden, you stated in your previous posting that you were "sure"
that such improper/illegal activity occurs, i.e., USPTO personnel improperly
sharing confidential information with others who are unauthorized to receive
that information, yet you stated that you knew of no "cases" - that is what
seems to be an inconsistency - how can you be "sure" it happens?, yet be
unable to cite proof?
I did not suggest that such cases exist at all (at least recently), except
perhaps in the more lax environment of the late 1800s, i.e., at the time of
Edison, about 100 years ago, when security was nowhere near as tight as it
is now
I also stated that warnings to examiners and other employees of the USPTO
of such improper/illegal activity are given quite regularly in such forums
as "Patent Academy" and the regular required ethics training - so, my point
is quite clearly that USPTO personnel are very aware of the seriousness of
such behavior, making it so rare that it probably never happens - if it
ever does surface (assuming it ever even happens), you can be sure it will
become a major issue that everyone will hear about, especially since the
party negatively affected by such disclosure will immediately go public
with the accusations - no such accusations have ever surfaced as you
admitted
so, I really don't think confidentiality within the USPTO should be a
concern at all
Greg O: I have three general observations about this patent application
(2005/0210042 to James Francis Goedken of Palatine, IL)
the first observation I have is that this application is part of an increase
in the number of recent filings of provisional patent applications related
to patent searching methodology - as each of these provisional applications
issues into patents (most seem to), the chances increase that they may
undermine the patent examining process, i.e., the USPTO may infringe a
patent it has issued during its own examination procedures! - alternatively,
these patents show that there is a gaping hole that remains to be filled
regarding patent search methodology in general - the USPTO clearly has no
"lock" on this methodology - actually, the methodology used among examiners
to locate "evidence" (i.e., "prior art") to support their Official positions
regarding patentability, the most critical aspect of the patent examiners'
functions, is not uniform, nor comprehensive - it is also important to note
how some of these applications are filed by past patent examiners
the second observation is that this application contains 44 pages, a
relatively large amount of information - my experience is that such large
disclosures generally include a very wide range of complex interactions
among the key features - I have not reviewed the entire document, so I don't
know whether it contains much "fluff", i.e., repetitive or unrelated
information
the third observation is that the first claim includes numerical variables
that indicate the number of "hits" in a query - this means that the
inventors are probably trying to simplify the linguistic aspects of a patent
search into a numerical system that can probably be manipulated in a more
manageable manner using known numerical manipulations, such as addition,
subtraction, ranking, etc.
> well, Mr. Hayden, you stated in your previous posting that you were "sure"
> that such improper/illegal activity occurs, i.e., USPTO personnel improperly
> sharing confidential information with others who are unauthorized to receive
> that information, yet you stated that you knew of no "cases" - that is what
> seems to be an inconsistency - how can you be "sure" it happens?, yet be
> unable to cite proof?
Please quote where you believe that I said that such improper/illegal
activity occurred - unless it is the post that I discuss below.
If it came across that way, that was not the way I meant it. Rather, I
was trying to suggest the opposite - that misuse of confidential
information was less likely at the USPTO than at private search sites
(and used IBM as an example). See the end of this post for more.
>
> I did not suggest that such cases exist at all (at least recently), except
> perhaps in the more lax environment of the late 1800s, i.e., at the time of
> Edison, about 100 years ago, when security was nowhere near as tight as it
> is now
>
> I also stated that warnings to examiners and other employees of the USPTO
> of such improper/illegal activity are given quite regularly in such forums
> as "Patent Academy" and the regular required ethics training - so, my point
> is quite clearly that USPTO personnel are very aware of the seriousness of
> such behavior, making it so rare that it probably never happens - if it
> ever does surface (assuming it ever even happens), you can be sure it will
> become a major issue that everyone will hear about, especially since the
> party negatively affected by such disclosure will immediately go public
> with the accusations - no such accusations have ever surfaced as you
> admitted
> so, I really don't think confidentiality within the USPTO should be a
> concern at all
Good. We agree on the basics. Let's parse what I did say:
Bruce Hayden wrote:
> Greg O wrote:
>> Have there been cases where someone in the US Patent Office has
>> divulged details of an invention to a competitor? I would have thought
>> there were lots of workers scanning documents for example who could do
>> this. They might have a friend who is told about an invention and then
>> tries to improve on it.
>
> I am sure they have, but it is infrequent enough that I don't
> know of any cases. The difference here, though, is that a
> PTO employee who discloses this sort of information is likely
> to go to jail, whereas they most likely wouldn't in this case.
> The difference, of course, being that the PTO employees are
> constrained by PTO regulations preventing disclosure plus
> federal criminal statutes. None of this applies here.
My first point is that the USPTO is an organization staffed
by humans. And, thus, it is inevitable that some minor amt.
of disclosure is going to happen. It happens in the CIA,
the FBI, etc. And, indeed, you admitted that some minor amt.
has probably happened in your earlier post through slack
adherence to proper procedures, etc. We seem to be in agreement here.
Note though that I am not saying that it has. I have no
evidence that it has. I am just suggesting that if confidential
information can leak out of the FBI, CIA, etc., then there is
a chance that it could (illegally) leak out of the USPTO.
But then I go on to point out that it is infrequent, and,
indeed, so infrequent, that there aren't any cases. You seem
to agree that it is quite infrequent (or, maybe even nonexistent).
Finally, I point out that disclosure by USPTO employees is
criminal in nature. They are prevented from disclosure by
both regulations and criminal statutes. I was attempting
to imply that such would be a big incentive for PTO employees
not to disclose. I think this fits in fairly well with your
last point about the precautions taken by the USPTO.
Also, I note your agreement with my point that cases on
this are rare to nonexistent, - presumably, as you point out,
because they would be made quite public.
snip
> .....disclosure by USPTO employees is
> criminal in nature. They are prevented from disclosure by
> both regulations and criminal statutes. ...
now, Mr. Hayden, I can only agree entirely with all you wrote
so, assuming we are all in agreement that the USPTO is not a very likely
contributor to the problem of improper disclosure AT THIS TIME, and that IBM
is an example of at least suspect activity in using search terms entered in
their Delphion or other databases for possible misuse, the potential problem
of misuse of such database searches is actually going to INCREASE with
OUTSOURCING
getting back to the OP, I note that in the most recently awarded PCT search
contract to two outside companies (i.e., Landon IP of Alexandria, VA and IP
Data Miner of Lakewood, OH, see
http://www.uspto.gov/web/offices/ac/comp/proc/pctsearch/pctsearchhom.html),
the USPTO is publicly announcing that searches on patent applications (which
are "confidential"), that were previously performed by patent examiners,
will be performed outside the "protective sphere" of the USPTO
these contractors may use IBM's databases, or other databases, that may or
may not be encrypted, etc. to protect the applicants' disclosure
> Greg O: I have three general observations about this patent application
> (2005/0210042 to James Francis Goedken of Palatine, IL)
Greg O: here are two more general points:
1 a search in the USPTO patent application database for the search phrase:
abst/("prior art" and search) used in this application, gives 27 additional
references - this means that there are already 27 references that
may be directly on point with the subject matter of this application (i.e.,
2005/0210042)
2 this application discusses and claims use of color in highlighting terms
in the target documents - however, rather than filing colored photographs,
the applicant has only b/w drawings - therefore, the impact of the color
"highlighting" is lost - this may impact on "feasibility"