RFC: Self-Hosted PHP Documentation


Gabor Hojtsy

May 9, 2003, 4:45:02 PM
to php...@lists.php.net, PHP Webmasters, pear...@lists.php.net, php-g...@lists.php.net
Hi!

[I hope this will be a nice reading for those getting back from the
Intl. conference :)]

Problems, preliminary experience:

- The new my php.net feature showed that users would like to see
more personalisation, while this would probably be very hard to
code without one central session server. Things like favorites,
most-used functions, etc.

- The current search features on the php.net site are far from
satisfactory; it is not easy to focus a search on what the
user is really interested in.

- The CHMs are quite good at searching, but customization is limited
to what Microsoft provides, and many ugly JS hacks were needed to
allow, for example, searching the notes separately. Also, proper
updates of the documentation can only be done by hand. Otherwise
the CHM format contains many good ideas we can build on in future
formats.

- We continually receive requests to provide the 'phpweb' format for
download with a stripped-down version of the site's manual handling
code.

- We discussed at the 2002 doc meeting that some CHM-like format would
be nice for non-Windows systems too.

Considering all the above, it would be nice to see a format in which
the CHM features [Tree TOC, Full Text Search, Index, Favorites] are
available, and which provides extreme customizability.

What I have come up with so far, considering the above problems, is
the 'Self-Hosted PHP Documentation' or 'PHP Documentation Central'.
I hope in the end it will also prove to deserve the latter name ;)

The idea is based on the assumption that most of those demanding
advanced help have a web server and PHP installed, at least with
default options. Those who do not have such a server can use the
online documentation anyway.

So the requirements for a user to set up the self-hosted documentation
are: a web server and PHP. It would work without an internet
connection, but it would provide many more features with one.

The basic goal for the first version is to replicate what CHM already
has (mentioned above), with content files probably in HTML or XML, and
the TOC in XML. Full text search should be done without a DB backend,
so file-based native PHP search engines should be taken into account.
There are some quite fast ones out there, see some script sites [I am
not going to name names yet]. The index can be built from the full text
search database, or from page titles as we do it now. Implemented in a
frameset, this would provide what is already provided by the CHM
version, but with platform and browser independence.
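The DB-free, file-based search described above can be sketched as a tiny
inverted index kept in a serialized PHP file. This is purely an
illustration, not any existing engine: the function names, the page IDs
and the index file path are all invented, and a real implementation
would need stemming, ranking and incremental indexing.

```php
<?php
// Minimal sketch of a file-based full text index: no database, the
// index is just a serialized PHP array on disk.

function buildIndex(array $pages): array
{
    $index = [];
    foreach ($pages as $id => $text) {
        // Lowercase and split on non-word characters.
        $words = preg_split('/\W+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
        foreach (array_unique($words) as $word) {
            $index[$word][] = $id;
        }
    }
    return $index;
}

function searchIndex(array $index, string $term): array
{
    return $index[strtolower($term)] ?? [];
}

// Build once, write to disk, reload on each request.
$pages = [
    'function.fopen' => 'fopen opens a file or URL and returns a stream',
    'function.fread' => 'fread reads from an open file pointer',
];
file_put_contents('/tmp/manual-index.dat', serialize(buildIndex($pages)));

$index = unserialize(file_get_contents('/tmp/manual-index.dat'));
print_r(searchIndex($index, 'file'));
```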

BUT we can go much further than this. Some ideas:

- Automatic install and update. This would be easy to install using
some PEAR install tools, if we require PEAR to be on the user's
own server. Automatic updates can also be done using PEAR's update
tool [probably, I have not checked...].
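As a rough sketch of what the PEAR route might look like: the package
name PHP_Manual is hypothetical, while `pear install`/`pear upgrade`/`pear list`
are the standard PEAR installer commands.

```shell
# Hypothetical package name; the pear subcommands themselves are the
# standard PEAR installer commands.
pear install PHP_Manual     # first-time install of the doc app + content
pear upgrade PHP_Manual     # later: pull updated scripts and/or docs
pear list                   # verify which version is installed
```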

- Subset selection. Users would be able to select the subset of the
manual which they use. Searches would be performed in this subset
(or results outside this subset would appear below those from the
selected nodes). Automatic update would only need to update the
nodes the user actually reads in his everyday work, when new
documentation is ready.

- Multiple book support. This system would be able to support multiple
books. Imagine having the PHP, PEAR and PHP-GTK docs all in one place.
PHP-GTK developers would find this extremely useful. Searches would
be made in all books, if asked for. This is the point where my
idea becomes 'PHP Documentation Central' ;).

This also opens up the possibility of third party projects providing
books for this system, e.g. Smarty, ADODB, etc.

- PHP version dependent behaviour. The manual would be able to warn
users with a given PHP version that some functions are not
available in that version. It would be nice to search only those
function pages that are available in the version the user has.
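The version-aware idea above can be sketched like this: each manual page
records the PHP version its function first appeared in, and pages too
new for the reader's PHP are filtered out. The page list and version
data below are invented for the example.

```php
<?php
// Sketch of version-dependent filtering; the $pages data is made up.

function availablePages(array $pages, string $phpVersion): array
{
    // Keep only pages whose function already exists in $phpVersion.
    return array_keys(array_filter(
        $pages,
        fn(string $since): bool => version_compare($phpVersion, $since, '>=')
    ));
}

$pages = [
    'function.fopen'             => '4.0.0',
    'function.file-get-contents' => '4.3.0',
    'function.array-combine'     => '5.0.0',
];

// A reader running PHP 4.3.0 would not see the array_combine page.
print_r(availablePages($pages, '4.3.0'));
```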

- PDF generation. If we provide the documentation pages in some
XML format (presented in HTML on the fly), then we can also provide
the ability for users to create PDF files on the fly (provided
some PDF functions are installed, or using a bundled native PHP PDF
generator lib). At its extreme, this would let the user arrange a
whole big PDF for himself/herself with only those pages [s]he is
interested in (e.g. with only the ODBC docs, if [s]he is not going
to use anything else).

- Tight website integration. We can also warn users that an updated
version of PHP is available, since we know the version in use. Or we
can deliver news to the user's browser as part of this app.

- Skinning. I am perfectly sure that the CHM layout will be annoying
for many developers, as it gives too much space to the navigation
and not enough to the content. So this system needs to be designed
to be skinnable via some templates.
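Template-based skinning could be as simple as the following sketch: a
skin is just a text file with placeholders, and the viewer substitutes
the TOC and page content into it. The placeholder syntax and names here
are invented for illustration.

```php
<?php
// Sketch of template-based skinning; the {name} syntax is made up.

function renderSkin(string $template, array $vars): string
{
    // Replace every {name} placeholder with its value.
    $search = array_map(fn($k) => '{' . $k . '}', array_keys($vars));
    return str_replace($search, array_values($vars), $template);
}

$skin = '<div id="nav">{toc}</div><div id="main">{content}</div>';

echo renderSkin($skin, [
    'toc'     => '<ul><li>fopen</li><li>fread</li></ul>',
    'content' => '<h1>fopen</h1><p>Opens a file or URL.</p>',
]);
```

Swapping skins then only means pointing the viewer at a different
template file.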

- Intranet user handling. Let's imagine some company develops using
PHP. It is obvious that there should be no need for everyone to
install this doc system; it should live in one place on an
intranet server. However, users' preferred skins, favorite pages,
etc. will probably differ, so it would be nice to throw in some
authentication and session handling for this case.

Who would use such a documentation?

- I think this would simply obsolete the CHMs, and we would generate
this format instead of the CHMs because of its advantages.

- Those who now run local mirror sites for browsing the docs
and getting updates ASAP. If we can convince these users that
this solution is better, they won't rsync our server, as
there will be no point in it.

This means less load on rsync.php.net and more load on mirrors,
as every mirror will be a possible 'autoupdate server' for this
format - with the obvious exception of www.php.net itself.

- Those tired of the online search feature of our site. They will
definitely switch to this format ASAP.

I have done some research on the needed components and whether there are
already written libs for this. There are some questions in my mind about
this stuff:

- Should this be fully implemented in PEAR, so it would be available
for installation from there, and be able to update itself
and/or the docs through the PEAR update mechanism? Is PEAR
capable of such a thing?

- Is it OK to use third party tools here (for file-based full text
search indexing or templating, for example)?

It is obvious that all the above stuff cannot be done at once. I think
simulating the CHM interface and features would be the first goal,
writing the app to be as extensible as possible, though of course not
with the CHM system in mind when building the internal structure. Then
adding the above convenience features step by step would be nice.

If 'PHP Documentation Central' becomes reality, we would have an open
version of something like Microsoft's MSDN, but with many more
extensibility and customizability features.

The question is whether you think people would be interested in this,
and whether anyone here can volunteer to start the project. I won't
have much time for this until after my exams. But I thought it would
be nice to start the discussion now, so the picture will be clearer
then.

Waiting for comments,
Goba

Davey

May 9, 2003, 6:47:11 PM
to pear...@lists.php.net, php...@lists.php.net, php-g...@lists.php.net, php-m...@lists.php.net
I have one thing to say on this subject, and that is portability: not
between platforms, but between devices, specifically PDAs and such. The
CHM format works on all the new WinCE machines (iPAQs for example), I
believe. For that reason alone I don't think you should obsolete the
CHM manual... perhaps make a function-reference-only CHM, so there's
less to generate?

As for placing it in PEAR, there is already a CLI version of the manual
being written as a PEAR package, and so I think there is definitely a
place for this...

I like the idea, I just think that you shouldn't concentrate on
obsoleting the CHM so much as offering a viable alternative for other
platforms and for people who don't want the CHM. Or... perhaps you could
obsolete the CHM that is generated for php.net and simply link to the
extended CHM created by a third party (sorry, I can't remember who you
are!)

Also, you might think about using XML as the docs format, because
transformation from XML -> (X)HTML is possible using client-side XSLT
in Mozilla (though there's a JS bug before 1.2.x) and IE6 (and for the
most part IE5.5, which supported the working draft at its time of
publication; little changed from that to the final recommendation).

Could CVS be used to keep the manual itself up-to-date? Rather than
releasing a PEAR package for every major (or minor?) change to the docs
themselves, just release packages for changes in the scripts, and not
the docs.

These are just my opinions, make of them what you will...

- Davey

On a side note, this might make the adoption of PEAR go up too, which is
a good thing!

Gabor Hojtsy

May 10, 2003, 6:09:42 AM
to da...@php.net, pear...@lists.php.net, php...@lists.php.net, php-g...@lists.php.net, php-m...@lists.php.net
> I have one thing to say on this subject, and that is portability: not
> between platforms, but between devices, specifically PDAs and such. The
> CHM format works on all the new WinCE machines (iPAQs for example), I
> believe. For that reason alone I don't think you should obsolete the
> CHM manual... perhaps make a function-reference-only CHM, so there's
> less to generate?

Really? Has anybody used the CHM on WinCE? I have not heard about it.

> I like the idea, I just think that you shouldn't concentrate on
> obsoleting the CHM so much as offering a viable alternative for other
> platforms and for people who don't want the CHM. Or... perhaps you could
> obsolete the CHM that is generated for php.net and simply link to the
> extended CHM created by a third party (sorry, I can't remember who you
> are!)

I am the coordinator of that extended CHM ;) I quite lost focus on
that format after I switched to Linux :) So if it needs regular
updates, a new maintainer is needed for it. I am not sure whether I
will produce any updates for it.

> Also, you might think about using XML as the docs format, because
> transformation from XML -> (X)HTML is possible using client-side XSLT
> in Mozilla (though there's a JS bug before 1.2.x) and IE6 (and for the
> most part IE5.5, which supported the working draft at its time of
> publication; little changed from that to the final recommendation).

Brrr. I thought about XML distribution because it's smaller than the
HTML source and it would enable us to generate PDFs in the long run.
But I don't think we should depend on Mozilla/MSIE rendering the XML.
We would do better to render it with PHP (which would be needed for
searching anyway).
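Rendering the XML with PHP on the server, instead of trusting the
browser's XSLT support, could look like the sketch below. It uses PHP's
bundled SimpleXML extension; the tiny <page> format is invented for the
example and is not the real manual markup.

```php
<?php
// Server-side rendering of a made-up XML page format to HTML.

function renderPage(string $xml): string
{
    $doc = simplexml_load_string($xml);
    return sprintf(
        "<h1>%s</h1>\n<p>%s</p>",
        htmlspecialchars((string) $doc->title),
        htmlspecialchars((string) $doc->desc)
    );
}

echo renderPage(
    '<page><title>fopen</title><desc>Opens a file or URL.</desc></page>'
);
```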

> Could CVS be used to keep the manual itself up-to-date? Rather than
> releasing a PEAR package for every major (or minor?) change to the docs
> themselves, just release packages for changes in the scripts, and not
> the docs.

Well, I don't think that the DocBook XML format is OK for client side
parsing. It would simply mean a really huge XML document to parse. Also
it would not be easy to implement conditional updates (i.e. only update
the pages the user often visits). So IMHO a custom update solution will
be needed, especially if we take into account that this way third party
manuals, which don't have CVS modules, would also be supported well.

> On a side note, this might make the adoption of PEAR go up too, which is
> a good thing!

This is one reason I thought about using PEAR. And because it has some
packages we can use (e.g. HTML_TreeMenu for the CHM-like TOC tree menu).

Goba

Alexander Merz

May 11, 2003, 8:25:07 AM
to Gabor Hojtsy, php...@lists.php.net, PHP Webmasters, pear...@lists.php.net, php-g...@lists.php.net
In general: a good draft! :-)

But: It doesn't really solve the doc problem.

Q: "Why do I need a full-text search?"
A: "To find the information you need."

Q: "In most cases, I need the parameters of a specific function, or to
find a function which expects e.g. a string with a file name and
returns the file contents as an array. Can I do this comfortably with
the full-text search?"
A: "No, a full-text search doesn't know what "function name" means, or
whether the term "file name" refers to a parameter description or a
general function description."

Q: "Isn't this documented in the manual?"
A: "In the DocBook source, function names and parameters are marked
with specific tags, so it is documented."

Q: "So what's the problem? The information exists to answer my request."
A: "This information is lost after transforming the DocBook to HTML,
PDF etc."

Q: "Why must the DocBook be transformed?"
A: "There are no native DocBook readers, so we must do the
transformation to a viewable format like HTML or PDF."

Q: "Why is there no reader?"
A: "Well, ähm, yes..."

That's the point! IMO, instead of writing tons of code and wasting time
on project-specific solutions to the "how do we render DocBook?"
problem, it is time to start writing a native DocBook writer/reader.

With a DB-W/R:
1.) we wouldn't need to build the manual
2.) such an application knows the meaning of the tags, so rendering a
file containing a <refsect> as the root tag is possible; a user
wouldn't have to fetch the whole manual, only the required files for a
specific package/extension.
3.) tons of client-side possibilities for search functions and
navigation based on the logical structure of the document
4.) writing docs is like writing in OpenOffice or Word

For PHP-Manual-specific stuff, like showing functions only if they are
in the installed PHP build, or highlighted sources, such an application
could have book-specific filters or hooks.
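The tag-aware search such a reader makes possible can be sketched as
follows: because function names and parameters carry their own tags, a
query can target those tags instead of the raw text. The fragment below
is a DocBook-like stand-in, not a real manual page, and the query
strings are invented for the example.

```php
<?php
// Sketch of tag-aware search over DocBook-style markup with SimpleXML.

$page = simplexml_load_string('<refentry>
  <refnamediv><refname>file</refname></refnamediv>
  <refsect1>
    <para>Reads an entire <parameter>filename</parameter> into an array.</para>
  </refsect1>
</refentry>');

// Ask "which pages define a function named file?" rather than
// "which pages contain the word file anywhere?".
$functionHits  = $page->xpath('//refname[. = "file"]');
$parameterHits = $page->xpath('//parameter[. = "filename"]');

printf("function matches: %d, parameter matches: %d\n",
       count($functionHits), count($parameterHits));
```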

Gabor Hojtsy

May 11, 2003, 11:00:52 AM
to Alexander Merz, php...@lists.php.net, PHP Webmasters, pear...@lists.php.net, php-g...@lists.php.net
> In general: a good draft! :-)

Thanks.

> That's the point! IMO, instead of writing tons of code and wasting time
> on project-specific solutions to the "how do we render DocBook?"
> problem, it is time to start writing a native DocBook writer/reader.

Well, as I said, I thought about using some XML format. I am not yet
convinced that DocBook XML is the right choice for this. I thought
about some preparsed XML structure with far fewer tags to deal with.
DocBook is so huge that coding a generic viewer would take much more
time than I am willing to invest in this. It would be interesting
indeed ;) What I thought about is having a subset of DocBook, or a
simpler XML format, for this task, and implementing a viewer for that.

> For PHP-Manual-specific stuff, like showing functions only if they are
> in the installed PHP build, or highlighted sources, such an application
> could have book-specific filters or hooks.

What you propose does not seem impossible, and it's quite impressive.
But are there any people here who are willing to join in such a
development?

Goba

Alexander Merz

May 11, 2003, 11:11:48 AM
to Gabor Hojtsy, php...@lists.php.net, PHP Webmasters, pear...@lists.php.net, php-g...@lists.php.net
Gabor Hojtsy wrote:

> ;) What I thought about is having a subset of DocBook, or a simpler
> XML format, for this task, and implementing a viewer for that.

Agreed, I'm waiting for DocBook v10 with stuff like
<sentence><nomen>I</nomen> <verb>am</verb>
<adjective>stupid</adjective></sentence>. ;-)

> What you propose does not seem impossible, and it's quite impressive.
> But are there any people here who are willing to join in such a
> development?

Good question; maybe we should send a note to the other doc teams
around the world, to get a critical mass of developers for such a
project.

Gabor Hojtsy

May 12, 2003, 5:55:01 AM
to Alexander Merz, php...@lists.php.net, PHP Webmasters, pear...@lists.php.net, php-g...@lists.php.net, da...@php.net
>> What you propose does not seem impossible, and it's quite impressive.
>> But are there any people here who are willing to join in such a
>> development?
>
> Good question; maybe we should send a note to the other doc teams
> around the world, to get a critical mass of developers for such a
> project.

Well, what came to mind is that Damien Seguy already has similar
code to render the DocBook tags we use into HTML. Though it is not
designed to work on the fly, we may be able to use the code here.

Read Online: http://dev.nexen.net/docs/php/annotee/
Download: http://dev.nexen.net/docs/php/index.php

I don't remember what he uses for PDF generation, but those PDFs are
not that bad either ;)

Goba

Jirka Kosek

May 12, 2003, 9:18:11 AM
to pear...@lists.php.net, php...@lists.php.net, php-g...@lists.php.net, php-m...@lists.php.net
Gabor Hojtsy wrote:

> Considering all the above, it would be nice to see a format in which
> the CHM features [Tree TOC, Full Text Search, Index, Favorites] are
> available, and which provides extreme customizability.

You can try JavaHelp. It has features similar to HTML Help, but it is
Java based and thus runs everywhere. The DocBook XSL stylesheets
support the JavaHelp output format out of the box.

Jirka

--
-----------------------------------------------------------------
Jirka Kosek
e-mail: ji...@kosek.cz
http://www.kosek.cz

Hans Zaunere

May 13, 2003, 9:41:32 AM
to Gabor Hojtsy, php...@lists.php.net, PHP Webmasters, pear...@lists.php.net, php-g...@lists.php.net

Hi Gabor et al,

--- Gabor Hojtsy <ga...@hojtsy.hu> wrote:
> Hi!
>
> [I hope this will be a nice reading for those getting back from the
> Intl. conference :)]
>
> Problems, preliminary experience:
>
> - The new my php.net feature showed that users would like to see
> more personalisation, while this would probably be very hard to
> code without one central session server. Things like favorites,
> most-used functions, etc.
>
> - The current search features on the php.net site are far from
> satisfactory; it is not easy to focus a search on what the
> user is really interested in.

I agree with these first two points :)

Although noted below as a DB-free site, why would this be the case? I
realize that with mirrors things can get complicated, but I think an
architecture with some sort of DB usage, at least at its core, is
vital. You just won't get the speed and flexibility otherwise, IMHO. I
have the resources to provide some production sites for such an
endeavour, or for testing/development.

Also, as I've mentioned, I'd be happy to do some of the backend/logic
coding; just throw some function/class specs my way and I'll code it
out, SQL or otherwise.

Happy to help,

H

Gabor Hojtsy

May 14, 2003, 4:44:25 AM
to ha...@nyphp.org, php...@lists.php.net, PHP Webmasters, pear...@lists.php.net, php-g...@lists.php.net
> Although noted below as a DB-free site, why would this be the case? I
> realize that with mirrors things can get complicated, but I think an
> architecture with some sort of DB usage, at least at its core, is
> vital. You just won't get the speed and flexibility otherwise, IMHO. I
> have the resources to provide some production sites for such an
> endeavour, or for testing/development.

What speed problems do you think we have currently? I don't know of
any :) Well, we have flexibility problems. But what if we required
some sort of DB to be installed on the mirror sites? Let's say we go
with MySQL. I am sure not all sites would be happy with it. It also
sounds very bad to require one particular engine. So let's say we use
some generic DB, and an abstraction on top of it (PEAR DB, for
example). Then we would need to have PEAR on all mirrors (a working
copy ;). So would we gain speed with this? No.

Then again, we have 100 mirror sites. What if someone sets his
bookmarks at hu.php.net and it goes down the next week? The bookmarks
will be gone (hu.php.net is a real example of a problematic mirror).
OK, so we need to share the data across all 100 mirrors. Wooh. All
mirrors would need a significantly more complex setup (a higher bar
for joining as a mirror, and a higher probability of problems).

Yet we would still not have a great usable offline version for those
without broadband (which is still a very large number of people,
believe me).

So I think we should go on with the offline version, but spice it up
with enough online functions for those who are always online to get
the latest info, still take load off the mirrors, and still have many
more customization options thanks to the local hosting.

Goba

Hans Zaunere

May 16, 2003, 11:34:30 AM
to Gabor Hojtsy, php...@lists.php.net, PHP Webmasters, pear...@lists.php.net, php-g...@lists.php.net

--- Gabor Hojtsy <ga...@hojtsy.hu> wrote:
> > Although noted below as a DB-free site, why would this be the case? I
> > realize that with mirrors things can get complicated, but I think an
> > architecture with some sort of DB usage, at least at its core, is
> > vital. You just won't get the speed and flexibility otherwise, IMHO. I
> > have the resources to provide some production sites for such an
> > endeavour, or for testing/development.
>
> What speed problems do you think we have currently? I don't know of
> any :) Well, we have flexibility problems. But what if we required
> some sort of DB to be installed on the mirror sites? Let's say we go
> with MySQL. I am sure not all sites would be happy with it. It also
> sounds very bad to require one particular engine. So let's say we use
> some generic DB, and an abstraction on top of it (PEAR DB, for
> example). Then we would need to have PEAR on all mirrors (a working
> copy ;). So would we gain speed with this? No.

Well, it's impossible to please everyone all the time; you'll be lucky
to please some of them some of the time. If we were to go to some DB
based functionality, we should use one DB - no abstractions, no
complexities, with queries tailored to that engine.

> Then again, we have 100 mirror sites. What if someone sets his
> bookmarks at hu.php.net and it goes down the next week? The bookmarks
> will be gone (hu.php.net is a real example of a problematic mirror).
> OK, so we need to share the data across all 100 mirrors. Wooh. All
> mirrors would need a significantly more complex setup (a higher bar
> for joining as a mirror, and a higher probability of problems).

I think we'd need to consider the general architecture a bit more.
Mirrors don't have to be simple duplicates of the master site; there
can be download mirrors, image mirrors, primary and secondary. In our
case, there's room for "manual" mirrors and "DB" mirrors.

Using a round-robin DNS scheme to access a replicated set of databases
would be straightforward and in line with James' suggestion to do the
same for the website itself. While some mirrors could mirror
everything, that is, the website plus running a database, others could
do only one or the other.

> Yet we would still not have a great usable offline version for those
> without broadband (which is still a very large number of people,
> believe me).
>
> So I think we should go on with the offline version, but spice it up
> with enough online functions for those who are always online to get
> the latest info, still take load off the mirrors, and still have many
> more customization options thanks to the local hosting.

Personally, I've never used the offline version, but I do see its
value. That said, most professional PHP developers are always online
from what I've seen - developing a website without being online is a
bit of a paradox - and to this end I think providing the richest
resource online is most important. If DB usage is valuable at all, I
think it would be in a searching, filtering capacity; implementing
user personalizations would require more thought, and probably more DB
resources (in regards to server power).

H

Gabor Hojtsy

May 16, 2003, 12:28:22 PM
to ha...@nyphp.org, php...@lists.php.net, PHP Webmasters, pear...@lists.php.net, php-g...@lists.php.net
> Well, it's impossible to please everyone all the time; you'll be lucky
> to please some of them some of the time. If we were to go to some DB
> based functionality, we should use one DB - no abstractions, no
> complexities, with queries tailored to that engine.

I am still not convinced that having a DB on the mirrors would solve
any problems. What do you think it could solve? [update: we have an
sqlite db on php.net, and an option for all mirrors to set it up to
reduce stat() calls]. What else would be sped up, or spiced up with
features, by a db?

> I think we'd need to consider the general architecture a bit more.
> Mirrors don't have to be simple duplicates of the master site; there
> can be download mirrors, image mirrors, primary and secondary. In our
> case, there's room for "manual" mirrors and "DB" mirrors.

Brrr. DB mirrors? Consider that if we reduce the number of current
public all-content mirror sites, people will probably be disappointed,
IMHO.

> Using a round-robin DNS scheme to access a replicated set of databases
> would be straightforward and in line with James' suggestion to do the
> same for the website itself. While some mirrors could mirror
> everything, that is, the website plus running a database, others could
> do only one or the other.

It is not a problem if a site is not updated for a day. It is a
problem if user settings are not synchronized across all mirrors. If
we had 10 mirrors with user settings synced, then the other mirrors
wouldn't be used at all for manual browsing, as they would be much
less useful than those 10. So we would reduce our mirror base to ten
'usable' mirrors. We certainly cannot coordinate all the current 100
mirrors to have a DB installed and always be in sync.

> Personally, I've never used the offline version, but I do see its
> value. That said, most professional PHP developers are always online
> from what I've seen - developing a website without being online is a
> bit of a paradox - and to this end I think providing the richest
> resource online is most important. If DB usage is valuable at all, I
> think it would be in a searching, filtering capacity; implementing
> user personalizations would require more thought, and probably more DB
> resources (in regards to server power).

Searching is already supported. What filtering are you talking about?
Isn't that part of the "personalization which would require more
thought"?

You probably cannot imagine the real value of offline manuals. You
have probably not followed the PDF requests. There are many developers
out there who do not use any digital version, but print out the whole
manual. There are also many developers who simply cannot afford to be
online (plenty of them in Hungary). I was a 56k modem user until the
end of this March, not because I was unable to pay for broadband, but
because there was no other option. My connection bills were very high.
So don't count only the US; we are serving the whole world!

With my proposed solution we would have a
- distributed manual which would
- work offline,
- without any requirement to change the mirrors locally,
- with automatic updates to keep the latest content offline,
- with any user preference one can imagine, based on offline data,
- and it would also reduce the hits on mirrors because of
the offline manual.

And it would also offer a standard integration interface for other
[PHP] projects to plug in their documentation. Many people need the
PEAR or ADODB or Smarty docs alongside the PHP docs to write their
apps, don't you agree? The php site won't integrate with ADODB or any
other third party project, I can assure you.

So why reinvent the whole mirroring structure, add more work for
mirror maintainers, and raise the bar for mirror acceptance, when we
can create a better solution without disturbing the mirrors, while
adding more options for users?

The only negative point in my proposal is that readers need to have a
web server and PHP installed (maybe with a database too, but I am sure
everything can be done without one; there are quite cool binary-file
based native search solutions). But you said your target is
professional PHP programmers. So do they have a web server and PHP
installed? YES ;)

Goba

Hans Zaunere

May 19, 2003, 9:32:02 PM
to Gabor Hojtsy, php...@lists.php.net, PHP Webmasters, pear...@lists.php.net, php-g...@lists.php.net

--- Gabor Hojtsy <ga...@hojtsy.hu> wrote:
> > Well, it's impossible to please everyone all the time; you'll be lucky
> > to please some of them some of the time. If we were to go to some DB
> > based functionality, we should use one DB - no abstractions, no
> > complexities, with queries tailored to that engine.
>
> I am still not convinced that having a DB on the mirrors would solve
> any problems. What do you think it could solve? [update: we have an
> sqlite db on php.net, and an option for all mirrors to set it up to
> reduce stat() calls]. What else would be sped up, or spiced up with
> features, by a db?
>
> > I think we'd need to consider the general architecture a bit more.
> > Mirrors don't have to be simple duplicates of the master site; there
> > can be download mirrors, image mirrors, primary and secondary. In our
> > case, there's room for "manual" mirrors and "DB" mirrors.
>
> Brrr. DB mirrors? Consider that if we reduce the number of current
> public all-content mirror sites, people will probably be disappointed,
> IMHO.

I can certainly see the hesitation, but again, running a DB wouldn't
be mandatory for all mirrors. Being on the mirror list and such, I see
a lot of offers for what appear to be well-connected, hefty additional
mirrors that are turned away because of country restrictions. These,
or other mirrors, I'm sure would be happy candidates for such a DB
mirroring scheme.

> > Using a round-robin DNS scheme to access a replicated set of databases
> > would be straightforward and in line with James' suggestion to do the
> > same for the website itself. While some mirrors could mirror
> > everything, that is, the website plus running a database, others could
> > do only one or the other.
>
> It is not a problem if a site is not updated for a day. It is a
> problem if user settings are not synchronized across all mirrors. If
> we had 10 mirrors with user settings synced, then the other mirrors
> wouldn't be used at all for manual browsing, as they would be much
> less useful than those 10. So we would reduce our mirror base to ten
> 'usable' mirrors. We certainly cannot coordinate all the current 100
> mirrors to have a DB installed and always be in sync.

I agree, thus my reasoning for not implementing user settings at this
time, but rather only using DBs for search functionality. After this
initial phase is operational, we could always look at extending it at
little cost, since there would be a network of DBs operational.

I agree, but perhaps we are talking about two different issues. Like I
say, the offline manual is certainly valuable, but I don't think it
has to come instead of a DB setup.

> So why reinvent the whole mirroring structure, add more work for
> mirror maintainers, and raise the bar for mirror acceptance, when we
> can create a better solution without disturbing the mirrors, while
> adding more options for users?

Well, some of these are factors. Adding a DB would require a slight
shift in mirror setup and organization, but since not all mirrors
would be required to support a DB, many of these aren't issues.

> The only negative point in my proposal is that readers need to have a
> web server and PHP installed (maybe with a database too, but I am sure
> everything can be done without one; there are quite cool binary-file
> based native search solutions). But you said your target is
> professional PHP programmers. So do they have a web server and PHP
> installed? YES ;)

Perhaps I'm confused. If we go to a distributed offline manual, wouldn't the
whole purpose be not to require a web server, DB or anything similar? I see
two distinct audiences: those who need the offline manual for the reasons
you've stated above, and those professional PHP users (or at least those who
have constant Internet access). I think an offline manual (a sort of
"compiled" manual) and the online version, with full-text searching, etc.,
would address each of these audiences.

I just wanted to throw out a couple of ideas, mainly because I find that
searching for things that I know exist in php.net's manual simply returns no
results. Maybe another look at binary/static web site searching
would suffice. Just something to kick around.

H


>
> Goba
>

Gabor Hojtsy

May 20, 2003, 5:57:52 AM5/20/03
to ha...@nyphp.org, php...@lists.php.net, PHP Webmasters, pear...@lists.php.net, php-g...@lists.php.net
> I can certainly see the hesitation, but again running a DB wouldn't be
> mandatory for all mirrors. Since being on the mirror list and such, I see a
> lot of donations for what appear to be well-connected, hefty offers for
> additional mirrors that are turned away because of country restrictions.
> These, or other mirrors, I'm sure would be happy candidates to provide such a
> DB mirroring scheme.

Yes, you are probably right. The 'problem with me' is probably that I am
unable to see the advantages of having synchronized DB mirrors, given the
extra care we would need to take with those mirrors and their maintainers.
We already have 100 mirrors, and have quite a few problems with them. Synced
DB mirrors would need more care. Is it worth the time? I am not
convinced yet. You have yet to show me the big benefit this would
bring us.

>>It is not a problem if a site is not updated for a day. It is a problem
>>if user settings are not synchronized on all mirrors. If we had 10
>>mirrors with user settings synced, the other mirrors wouldn't be used
>>at all for manual browsing, as they would be much less useful than those
>>10 mirrors. So we would reduce our mirror base to ten 'usable'
>>mirrors. We certainly cannot coordinate all the current 100 mirrors to
>>have a DB installed and be always in sync.
>
> I agree; hence my reasoning for not implementing user settings at this time,
> and for using DBs only for search functionality. Once this initial
> phase is operational, we could always look at extending it at little cost,
> since a network of DBs would already be running.

We already have local search support on mirror sites. This is not a
question of the database. Actually, it is much better to have local
search support without a synchronized DB. If we synchronized search
indexes, we would need to sync the data of mirror sites at
least twice. We already have the whole site locally on mirrors, so why
not index the pages locally instead of syncing the index, sparing network
bandwidth? We do local indexing currently. Yes, htdig is not the best
for local search; mnogosearch is better supported as a PHP
extension. But nobody has had the time to experiment with mnogosearch,
and to provide a setup howto and an integration kit for mirror sites, which
would probably boost search performance and the user experience. But there
is no need to have synced DBs for that; actually, synced DBs would be
wasting human and network resources in this case.
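To make the "index locally instead of syncing the index" point concrete, here
is a toy sketch of what per-mirror indexing amounts to. This is NOT htdig or
mnogosearch, and `build_local_index` is an invented name purely for
illustration: each mirror already has the manual's HTML pages on disk, so it
can build its own word-to-page index from those files, with no index data
crossing the network at all.

```python
import re
from collections import defaultdict
from pathlib import Path

def build_local_index(doc_root):
    """Build a crude inverted index (word -> set of page filenames) from
    the HTML files already present on a mirror's local disk."""
    index = defaultdict(set)
    for page in sorted(Path(doc_root).glob("*.html")):
        text = page.read_text(errors="ignore")
        # Strip tags crudely; a real indexer would parse the HTML properly.
        text = re.sub(r"<[^>]+>", " ", text)
        for word in re.findall(r"[a-z_]{3,}", text.lower()):
            index[word].add(page.name)
    return index
```

The design point is the one argued above: since the page content is already
mirrored, only the (cheap, local) indexing step is duplicated per mirror,
rather than shipping a synchronized index or database around.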

>>And it would also offer a standard integration interface for other [PHP]
>>projects to plug in their documentation. Many people need PEAR or ADODB or
>>Smarty docs alongside the PHP docs to write their apps, don't you
>>agree? The PHP site won't integrate with ADODB or any other third-party
>>project, I can assure you.
>
> I agree, but perhaps we are talking about two different issues. Like I say,
> the offline manual is certainly valuable, but I don't think it has to
> replace a DB setup.

Seems like you are using 'DB setup' as a magic phrase, something which
would solve all of our problems :) I would rather come at the
problem from the user's view. He would like fast manual access
with the most features he can get, and with up-to-date content. If we had
two similar solutions for the problem, I think we would waste time taking
care of both. Having an online version with such features, and having
a semi-offline version with such features, would be twice the work. While
in my eyes, the semi-offline one would be able to provide all the
features we can get with the online one, plus more. It would only need a
one-time setup, and then it would update itself as needed to stay current,
and it would be much more customizable than an online one.

Please note that the current manual pages are designed to be cacheable,
and we depend greatly on caches storing them, so we don't get that many
hits. If we had some level of customization, the pages would no longer
be cacheable, and our servers would be down on their knees in no
time, especially www.php.net. Note that we only have a few pages with
user-specific content (my.php, the download pages and the manual language
selection page). All the others have the same content for all users.

>>The only negative point in my proposal is that readers need to have a
>>web server and PHP installed (maybe with a database too, but I am sure
>>everything can be done without one; there are quite good binary-file-based
>>native search solutions). But you said your target is professional
>>PHP programmers. So do they have a web server and PHP installed? YES ;)
>
> Perhaps I'm confused. If we go to a distributed offline manual, wouldn't the
> whole purpose be not to require a web server, DB or anything similar? I see
> two distinct audiences: those who need the offline manual for the reasons
> you've stated above, and those professional PHP users (or at least those who
> have constant Internet access). I think an offline manual (a sort of
> "compiled" manual) and the online version, with full-text searching, etc.,
> would address each of these audiences.

I see one audience: PHP programmers who want quick access, up-to-date
information, great customizability, good search features, etc. As far as
I can see, we can provide the greatest flexibility with a semi-offline
solution: a manual which updates itself when needed, but works even
if there is no connection to the internet. Do you think that "those
professional PHP users" you mention wouldn't love to have all of this
after initially setting up the app? Don't you think that these features
would drive them to set up this environment locally?

As for beginners, we would still have the online manual to start
with, and to get them to a stage where they can set up this stuff locally.
Or we can put together a simple manual engine with BadBlue (a very small
Windows web server) and a stripped-down PHP to run this manual offline.

> I just wanted to throw out a couple of ideas, mainly because I find that
> searching for things that I know exist in php.net's manual simply returns no
> results. Maybe another look at binary/static web site searching
> would suffice. Just something to kick around.

You are absolutely right that the online search is not satisfactory at
all. Even PHP developers use AllTheWeb or Google to search the PHP
manual (using URL restrictions there). So the online search definitely
needs improvement, and probably not in an htdig-based future. Having a
synced DB is not a solution for this problem, though; see the notes above.

To summarize, I see that we would like to provide the best
service to PHP users around the world. This is the holy goal ;) In
my eyes, it is worthwhile to implement one system which fits [nearly] all
users, because we only have limited time to work on this. This is why
I have started this discussion: to gather ideas and the people who
can / would like to help. Maybe I am not looking too flexible regarding
your idea, but I really don't see its benefit. I think every user
wants the best, and we should provide every user the best we can.

Goba
