How to limit the number of web pages downloaded from a site?

Nad

Aug 8, 2008, 6:42:51 PM
I have a very large site with valuable information.
Is there any way to prevent downloading a large number
of articles? Some people want to download the entire site.

Any hints or pointers would be appreciated.

dorayme

Aug 8, 2008, 7:51:45 PM

Password-protect folders or pages and make users register to get the
passwords; that would slow them down a bit. But really, if you make
stuff available publicly...

--
dorayme

Adrienne Boswell

Aug 8, 2008, 11:36:16 PM
Gazing into my crystal ball I observed n...@invalid.com (Nad) writing in
news:g7ii5c$3b5$1...@aioe.org:

You could store their IP address in a session, and check to see the length
of time between requests.
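
A minimal sketch of that idea, assuming some server-side scripting is available. The class name, the 2-second threshold, and the in-memory dictionary are illustrative assumptions, not anything from this thread; a real site would need persistent storage and would have to cope with many users sharing one IP address.

```python
import time
from typing import Dict, Optional


class RequestThrottle:
    """Flags clients whose requests arrive less than min_interval seconds apart."""

    def __init__(self, min_interval: float = 2.0) -> None:
        # min_interval is a hypothetical threshold; tune it for the real site.
        self.min_interval = min_interval
        self._last_seen: Dict[str, float] = {}

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        """Record a request from ip and report whether it was slow enough."""
        now = time.monotonic() if now is None else now
        previous = self._last_seen.get(ip)
        self._last_seen[ip] = now
        if previous is None:
            return True  # first request from this address
        return (now - previous) >= self.min_interval
```

A person reading articles will rarely trip a threshold of a couple of seconds, while a site grabber fetching pages back to back will.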

--
Adrienne Boswell at Home
Arbpen Web Site Design Services
http://www.cavalcade-of-coding.info
Please respond to the group so others can share

Neredbojias

Aug 9, 2008, 12:07:31 AM

Change the articles' text to Olde Englishe.

--
Neredbojias
http://www.neredbojias.net/
Public Website

Nad

Aug 9, 2008, 1:38:24 AM
In article <Xns9AF4D1999BBA...@69.16.185.247>, Adrienne Boswell
<arb...@yahoo.com> wrote:
>Gazing into my crystal ball I observed n...@invalid.com (Nad) writing in
>news:g7ii5c$3b5$1...@aioe.org:
>
>> I have a very large site with valuable information.
>> Is there any way to prevent downloading a large number
>> of articles. Some people want to download the entire site.
>>
>> Any hints or pointers would be appreciated.
>>
>>
>
>You could store their IP address in a session, and check to see the length
>of time between requests.

Well, something along those lines.
The problem is the server side support.
Some servers do not allow cgi, php, javascript, or even ssi
executable commands, and I'd like it to work on ANY server.

Nad

Aug 9, 2008, 1:38:26 AM
In article <Xns9AF4D6E58210...@194.177.96.78>, Neredbojias
<Scrot...@gmail.com> wrote:
>On 08 Aug 2008, n...@invalid.com (Nad) wrote:
>
>> I have a very large site with valuable information.
>> Is there any way to prevent downloading a large number
>> of articles. Some people want to download the entire site.
>>
>> Any hints or pointers would be appreciated.
>
>Change the articles' text to Olde Englishe.

:--}

I like that!!!


Lars Eighner

Aug 9, 2008, 2:15:51 AM
In our last episode, <g7ii5c$3b5$1...@aioe.org>, the lovely and talented Nad
broadcast on alt.html:

> I have a very large site with valuable information. Is there any way to
> prevent downloading a large number of articles. Some people want to
> download the entire site.

It depends upon what you mean by 'articles.' If you put your HTML documents on a
web server, you are pretty much inviting the public to view/download as much
of it as they want. If it is 'valuable', why are you giving it away? And
if you are giving away valuable stuff, what did you expect? What is your
real concern here?

If you are only worried about server load, why not zip or tar and gzip it up
and put it on an FTP server? This is most practical for related documents,
such as parts of a tutorial or parts of a spec. If you are a philanthropist
who is giving away valuable stuff, you can give it away in big chunks so
the nickel and dime requests don't bug you.
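
As a rough sketch of the "tar and gzip it up" step, something like the following would pack a directory of pages into one compressed archive. The function name and paths are hypothetical, and the archive should be written outside the site directory so it does not end up containing itself.

```python
import tarfile
from pathlib import Path


def build_archive(site_dir: str, archive_path: str) -> int:
    """Pack every file under site_dir into a gzip-compressed tar file.

    Returns the number of files added. archive_path is assumed to lie
    outside site_dir.
    """
    root = Path(site_dir)
    count = 0
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Store paths relative to the site root, with forward slashes.
                tar.add(str(path), arcname=path.relative_to(root).as_posix())
                count += 1
    return count
```

Anyone who wants the lot can then fetch one file instead of requesting thousands of pages.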

Well-behaved download-the-whole-site spiders will obey robots.txt, but that
is pretty much a courtesy thing, and it won't stop anyone who is manually
downloading a page at a time, and it won't stop rogue or altered spiders.
Likewise, you can block nice spiders which send a true user-agent ID, but
not so nice spiders can spoof their ID. That's kind of pointless, because
most of the nice spiders will obey robots.txt anyway.
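
To illustrate why user-agent blocking only stops honest clients, here is a small sketch. The blocklist entries and function name are made up for the example; the point is only that the check trusts a string the client chooses.

```python
# Hypothetical blocklist of substrings seen in site-grabber user-agent strings.
BLOCKED_AGENT_SUBSTRINGS = ("teleport", "wget", "httrack")


def is_blocked(user_agent: str) -> bool:
    """Return True if the self-reported user-agent matches the blocklist."""
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_AGENT_SUBSTRINGS)


# An honest tool identifies itself and is refused.
honest = is_blocked("Wget/1.21")

# The same tool, told to report a browser-like string, passes the check.
spoofed = is_blocked("Mozilla/5.0 (compatible)")
```

Since the header is entirely under the client's control, the filter catches only those spiders that would probably have respected robots.txt anyway.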

You can make pages available through php or cgi which keeps track of the
number of documents with hidden controls. This is easily defeated by
anyone determined to do so, and like a cheap lock, will only keep the honest
people out. Beyond that, you can go to various user account schemes up to
putting your documents on a secure server.
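
A sketch of the counting scheme, and of the weakness, might look like this. It assumes the server hands each client a token (in a cookie or hidden field) and counts documents per token; the class name and the limit of 100 are illustrative assumptions.

```python
import uuid
from typing import Dict, Optional, Tuple


class DocumentQuota:
    """Counts documents served per client token and enforces a limit."""

    def __init__(self, limit: int = 100) -> None:
        self.limit = limit  # hypothetical per-client document limit
        self._counts: Dict[str, int] = {}

    def request(self, token: Optional[str]) -> Tuple[str, bool]:
        """Return (token, allowed). Unknown or missing tokens get a fresh one."""
        if token is None or token not in self._counts:
            token = uuid.uuid4().hex
            self._counts[token] = 0
        self._counts[token] += 1
        return token, self._counts[token] <= self.limit
```

The counter only works while the client keeps sending the same token back. A downloader that discards the token on every request looks like a new visitor each time, which is why this keeps out only the cooperative.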

But I think what you are asking is 'Can I keep my documents public and still
limit public access?' And the answer to that is, of course not because
there is a fundamental contradiction in what you want.

> Any hints or pointers would be appreciated.

--
Lars Eighner <http://larseighner.com/> use...@larseighner.com
War hath no fury like a noncombatant.
- Charles Edward Montague

Neredbojias

Aug 9, 2008, 6:10:58 AM
On 08 Aug 2008, n...@invalid.com (Nad) wrote:

> In article <Xns9AF4D6E58210...@194.177.96.78>, Neredbojias
> <Scrot...@gmail.com> wrote:
>>On 08 Aug 2008, n...@invalid.com (Nad) wrote:
>>
>>> I have a very large site with valuable information.
>>> Is there any way to prevent downloading a large number
>>> of articles. Some people want to download the entire site.
>>>
>>> Any hints or pointers would be appreciated.
>>
>>Change the articles' text to Olde Englishe.
>
>:--}
>
> I like that!!!

<grin>

Seriously, I don't think there's much you can do that is practical. With
server-side support, you could implement some kind of time limit and/or p/w
but you indicated you didn't want to rely on that. An off-the-wall
"non-solution" would be to use reasonably long meta page redirects, but the
user could always come back with a new time limit.

Nad

Aug 9, 2008, 6:31:26 AM
In article <doraymeRidThis-F5E...@news-vip.optusnet.com.au>,
dorayme <dorayme...@optusnet.com.au> wrote:
>In article <g7ii5c$3b5$1...@aioe.org>, n...@invalid.com (Nad) wrote:
>
>> I have a very large site with valuable information.
>> Is there any way to prevent downloading a large number
>> of articles. Some people want to download the entire site.
>>
>> Any hints or pointers would be appreciated.
>
>Password protect folders or pages, make users register to get the
>passwords, that would slow them down a bit.

It doesn't work. For example, Teleport Pro (a program for downloading
entire sites) allows you to specify login/passwd.
So, once they register, they can enter this info and boom...

>But really, if you make
>stuff available publicly...

Well, the site is 150 megs, over 20k articles.
And there are plenty of people who would LOVE to have
the entire site on their own box.
Then you have a problem. Providers usually charge for
the amount of traffic. In one month, you'd have to shell
out some bux, just to give the information to the
"gimme free Coce" zombies.
That does not make sense.
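
The arithmetic behind that worry is simple enough to sketch. The 150 MB figure is the site size given above; the number of full downloads per month is a purely hypothetical input, since the actual figure is unknown.

```python
SITE_SIZE_MB = 150  # site size as stated in this post


def monthly_transfer_mb(full_downloads: int, site_size_mb: int = SITE_SIZE_MB) -> int:
    """Traffic in MB caused by people who each fetch the entire site once."""
    return full_downloads * site_size_mb


# Hypothetical example: 100 whole-site downloads in one month.
example_mb = monthly_transfer_mb(100)  # 100 * 150 = 15000 MB
```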

Nad

Aug 9, 2008, 6:31:28 AM
In article <Xns9AF4D1999BBA...@69.16.185.247>, Adrienne Boswell
<arb...@yahoo.com> wrote:
>Gazing into my crystal ball I observed n...@invalid.com (Nad) writing in
>news:g7ii5c$3b5$1...@aioe.org:
>
>> I have a very large site with valuable information.
>> Is there any way to prevent downloading a large number
>> of articles. Some people want to download the entire site.
>>
>> Any hints or pointers would be appreciated.
>>
>>
>
>You could store their IP address in a session, and check to see the length
>of time between requests.

Something along those lines.
I was thinking of detecting automated downloads.
When people simply look at the information manually,
it is all fine and dandy.
But when they start running a program to download hundreds
if not thousands of articles, that is another issue.
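
One way to express "hundreds of articles by one client" is a sliding-window count per address. This is only a sketch and needs server-side code; the class name, the 200-request limit, and the 10-minute window are assumptions chosen for illustration.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class BulkDownloadDetector:
    """Flags an address that requests too many pages within a time window."""

    def __init__(self, max_requests: int = 200, window_seconds: float = 600.0) -> None:
        self.max_requests = max_requests      # hypothetical limit
        self.window_seconds = window_seconds  # hypothetical window length
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def record(self, ip: str, now: float) -> bool:
        """Record one request; return True if the client now looks automated."""
        hits = self._hits[ip]
        hits.append(now)
        # Drop requests that have fallen out of the window.
        while hits and now - hits[0] > self.window_seconds:
            hits.popleft()
        return len(hits) > self.max_requests
```

Manual readers stay far below such a limit, while a program walking the whole site crosses it quickly.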

But there are issues with this.
Some web hosting vendors do not allow the executables
to run as it is a security risk. They may not allow
cgi, php, even executable commands of ssi and javascripts.

Now, in order to detect a page access by the client,
you either add an SSI include statement that calls either a CGI
script or JavaScript, or assemble your pages dynamically
with PHP. But what if you cannot even run those because
the provider does not allow it?

Sure, you can shell out some serious bux to get yourself
a premier hosting facility where you can have your own
virtual domain. But if you spent years developing tools
to automatically build a 20k+ article site, and are
willing to give the information away for free, then financing
a top-notch provider on top of it is not something
that excites my imagination.

There is an excellent provider - by.ru. They are free.
They are huge, several hundred thousand sites and free
email users. But they do not allow ANY executables to
run. They even disable the ssi executable statements.

So, what do you do in that case?

Nad

Aug 9, 2008, 6:31:30 AM
In article <slrng9qd4o...@debranded.larseighner.com>, Lars Eighner
<use...@larseighner.com> wrote:
>In our last episode, <g7ii5c$3b5$1...@aioe.org>, the lovely and talented Nad
>broadcast on alt.html:
>
>> I have a very large site with valuable information. Is there any way to
>> prevent downloading a large number of articles. Some people want to
>> download the entire site.
>
>It depends upon what you mean by 'articles.' If you put your HTML documents on a
>web server, you are pretty much inviting the public to view/download as much
>of it as they want. If it is 'valuable', why are you giving it away? And
>if you are giving away valuable stuff, what did you expect? What is your
>real concern here?

Downloading the entire 150+ meg site, which translates into
all sorts of things.

>If you are only worried about server load, why not zip or tar and gzip it up
>and put it on an FTP server? This is most practical for related documents,
>such as parts of a tutorial or parts of a spec. If you are a philanthropist
>who is giving away valuable stuff, you can give it away in big chunks so
>the nickel and dime requests don't bug you.
>
>Well-behaved download-the-whole-site spiders will obey robots.txt,

That doesn't work. Some random user may come and download the
entire site. By the time you put him into robots.txt, it is too late.

>but that
>is pretty much a courtesy thing, and it won't stop anyone who is manually
>downloading a page at a time,

That is not a problem. They can manually download as much as they want.
But no automated downloads.

>and it won't stop rogue or altered spiders.
>Likewise, you can block nice spiders which send a true user-agent ID, but
>not so nice spiders can spoof their ID. That's kind of pointless, because
>most of the nice spiders will obey robots.txt anyway.

>You can make pages available through php or cgi which keeps track of the
>number of documents with hidden controls. This is easily defeated by
>anyone determined to do so,

How?

>and like a cheap lock, will only keep the honest
>people out. Beyond that, you can go to various user account schemes up to
>putting your documents on a secure server.

Well, no account schemes, no user verification, no limits beyond
trying to automatically download the entire site pretty much.

>But I think what you are asking is 'Can I keep my documents public and still
>limit public access?'

Not really. AUTOMATED download.

>And the answer to that is, of course not because
>there is a fundamental contradiction in what you want.

I do not see it at the moment.

dorayme

Aug 9, 2008, 7:09:56 AM
In article <g7js9i$gr2$1...@aioe.org>, n...@invalid.com (Nad) wrote:

> In article <doraymeRidThis-F5E...@news-vip.optusnet.com.au>,
> dorayme <dorayme...@optusnet.com.au> wrote:
> >In article <g7ii5c$3b5$1...@aioe.org>, n...@invalid.com (Nad) wrote:
> >
> >> I have a very large site with valuable information.
> >> Is there any way to prevent downloading a large number
> >> of articles. Some people want to download the entire site.
> >>
> >> Any hints or pointers would be appreciated.
> >
> >Password protect folders or pages, make users register to get the
> >passwords, that would slow them down a bit.
>
> It doesn't work. For example, Teleport Pro (a program to download
> the entire sites) allows you to specify login/passwd.
> So, once they register, they can enter this info and boom...
>

I understand your concerns and it is natural to worry a bit. But
consider again.

That there is Teleport Pro does not actually show that my suggestion
would not work. Perhaps you are looking at worst-case possibilities at
every stage. It would limit it to people who knew about this program
or were prepared to get it. That is one thing. The other thing is that
granting passwords might be conditional on them agreeing not to do what
you fear. Is your site a serious site liable to attract serious people?
You might be surprised how decent most people are if you make things
clear.


> >But really, if you make
> >stuff available publicly...
>
> Well, the site is 150 megs, over 20k articles.
> And there are plenty of people who would LOVE to have
> the entire site on their own box.
> Then you have a problem. Providers usually charge for
> the amount of traffic. In one month, you'd have to shell
> out some bux, just to give the information to the
> "gimme free Coce" zombies.
> That does not make sense.

How sure are you of the likelihood of a whole bunch of people wanting to
download the whole lot? Most people are wary of over exposing themselves
to information and will get what they are interested in. So I guess, you
need to do some guessing and some analysis. Perhaps you are worrying
excessively?

Presumably you would be hoping your site is used and is useful. If a
bunch of folk download a small bunch of articles each, this might well
be the biggest factor rather than a few who download the lot. You would
have to make some projections concerning this, you would be in the best
position to crunch some numbers as it is your field. If you are more
successful than you imagine via people doing reasonable things rather
than unreasonable things, you perhaps ought to be preparing yourself for
the possibility of serious server charges. I understand your concern to
limit things, but a huge site carves out a certain territory and you may
need to consider charging for access?

The other suggestion I might make is that you provide for the odd
possibility of some people wanting the lot by employing compressed
archives and utilising other than your own server, there might be some
free servers or cheap servers for this express purpose.

--
dorayme

Nad

Aug 9, 2008, 7:11:52 AM
>In our last episode, <g7ii5c$3b5$1...@aioe.org>, the lovely and talented Nad
>broadcast on alt.html:
>
>> I have a very large site with valuable information. Is there any way to
>> prevent downloading a large number of articles. Some people want to
>> download the entire site.
>
>It depends upon what you mean by 'articles.' If you put your HTML documents on a
>web server, you are pretty much inviting the public to view/download as much
>of it as they want. If it is 'valuable', why are you giving it away? And
>if you are giving away valuable stuff, what did you expect? What is your
>real concern here?

Downloading the entire 150+ meg site, which translates into
all sorts of things.

>If you are only worried about server load, why not zip or tar and gzip it up
>and put it on an FTP server? This is most practical for related documents,
>such as parts of a tutorial or parts of a spec. If you are a philanthropist
>who is giving away valuable stuff, you can give it away in big chunks so
>the nickel and dime requests don't bug you.
>
>Well-behaved download-the-whole-site spiders will obey robots.txt,

That doesn't work. Some random user may come and download the
entire site. By the time you put him into robots.txt, it is too late.

>but that
>is pretty much a courtesy thing, and it won't stop anyone who is manually
>downloading a page at a time,

That is not a problem. They can manually download as much as they want.
But no automated downloads.

>and it won't stop rogue or altered spiders.
>Likewise, you can block nice spiders which send a true user-agent ID, but
>not so nice spiders can spoof their ID. That's kind of pointless, because
>most of the nice spiders will obey robots.txt anyway.

>You can make pages available through php or cgi which keeps track of the
>number of documents with hidden controls. This is easily defeated by
>anyone determined to do so,

How?

>and like a cheap lock, will only keep the honest
>people out. Beyond that, you can go to various user account schemes up to
>putting your documents on a secure server.

Well, no account schemes, no user verification, no limits beyond
trying to automatically download the entire site pretty much.

>But I think what you are asking is 'Can I keep my documents public and still
>limit public access?'

Not really. AUTOMATED download.

>And the answer to that is, of course not because
>there is a fundamental contradiction in what you want.

I do not see it at the moment. Can you expand on that?

Nad

Aug 9, 2008, 7:32:03 AM
In article <Xns9AF52060C1A3...@194.177.96.78>, Neredbojias
<Scrot...@gmail.com> wrote:
>On 08 Aug 2008, n...@invalid.com (Nad) wrote:
>
>> In article <Xns9AF4D6E58210...@194.177.96.78>, Neredbojias
>> <Scrot...@gmail.com> wrote:
>>>On 08 Aug 2008, n...@invalid.com (Nad) wrote:
>>>
>>>> I have a very large site with valuable information.
>>>> Is there any way to prevent downloading a large number
>>>> of articles. Some people want to download the entire site.
>>>>
>>>> Any hints or pointers would be appreciated.
>>>
>>>Change the articles' text to Olde Englishe.
>>
>>:--}
>>
>> I like that!!!
>
><grin>
>
>Seriously, I don't think there's much you can do that is practical.

Well, Google does it. Sure, it is slightly a different setup,
but they limit the number of queries to 100.

> With
>server-side support, you could implement some kind of time limit

Time limit on high bandwidth does not work.

>and/or p/w
>but you indicated you didn't want to rely on that. An off-the-wall
>"non-solution" would be to use reasonably long meta page redirects, but the
>user could always come back with a new time limit.

Could you expand on that idea?

Nad

Aug 9, 2008, 7:32:04 AM
In article <doraymeRidThis-672...@news-vip.optusnet.com.au>,
dorayme <dorayme...@optusnet.com.au> wrote:
>In article <g7js9i$gr2$1...@aioe.org>, n...@invalid.com (Nad) wrote:
>
>> In article <doraymeRidThis-F5E...@news-vip.optusnet.com.au>,
>> dorayme <dorayme...@optusnet.com.au> wrote:
>> >In article <g7ii5c$3b5$1...@aioe.org>, n...@invalid.com (Nad) wrote:
>> >
>> >> I have a very large site with valuable information.
>> >> Is there any way to prevent downloading a large number
>> >> of articles. Some people want to download the entire site.
>> >>
>> >> Any hints or pointers would be appreciated.
>> >
>> >Password protect folders or pages, make users register to get the
>> >passwords, that would slow them down a bit.
>>
>> It doesn't work. For example, Teleport Pro (a program to download
>> the entire sites) allows you to specify login/passwd.
>> So, once they register, they can enter this info and boom...
>>
>
>I understand your concerns and it is natural to worry a bit. But
>consider again.
>
>That there is Teleport Pro does not actually show that my suggestion
>would not work. Perhaps you are looking at every stage at worst case
>possibilities. It would limit it to people who knew about this program
>or be prepared to get it.

Not really. There are quite a few programs out there.
Easiest thing in the world to find.
Do a search on Teleport Pro. A VERY nice program.

>That is one thing. The other thing is that
>granting passwords might be conditional on them agreeing not to do what
>you fear. Is your site a serious site liable to attract serious people?

Well, if you use that site, your salary could go up quite a bit
just in a few months. Is that "serious"?

>You might be surprised how decent most people are if you make things
>clear.

:--}

I wish I had your optimism. But I have seen plenty of evidence
otherwise.

>> >But really, if you make
>> >stuff available publicly...
>>
>> Well, the site is 150 megs, over 20k articles.
>> And there are plenty of people who would LOVE to have
>> the entire site on their own box.
>> Then you have a problem. Providers usually charge for
>> the amount of traffic. In one month, you'd have to shell
>> out some bux, just to give the information to the
>> "gimme free Coce" zombies.
>> That does not make sense.
>
>How sure are you of the likelihood of a whole bunch of people wanting to
>download the whole lot?

It's been done already.

>Most people are wary of over exposing themselves
>to information and will get what they are interested in. So I guess, you
>need to do some guessing and some analysis. Perhaps you are worrying
>excessively?

Who knows. But there is an issue here. I have no doubts about that much.

>Presumably you would be hoping your site is used and is useful. If a
>bunch of folk download a small bunch of articles each,

That is not a problem.

>this might well
>be the biggest factor rather than a few who download the lot.

But those few count a hundred times more than fair users.

>You would
>have to make some projections concerning this, you would be in the best
>position to crunch some numbers as it is your field.

That is what we are doing here.

>If you are more
>successful than you imagine via people doing reasonable things rather
>than unreasonable things, you perhaps ought to be preparing yourself for
>the possibility of serious server charges.

Yep. For about $100/mo. I can have my own virtual domain
on a tier 1 network. But that bites. There is no income from
this enterprise. Nobody is going to give you a dime for getting
something. That much I have seen.

>I understand your concern to
>limit things, but a huge site carves out a certain territory and you may
>need to consider charging for access?

Nope. That is not reasonable. It should be totally free.
No registration, no charges of any kind.
Just come and look at anything you want. But not be a bastard.

>The other suggestion I might make is that you provide for the odd
>possibility of some people wanting the lot by employing compressed
>archives and utilising other than your own server, there might be some
>free servers or cheap servers for this express purpose.

Thanks for the feedback.

Adrienne Boswell

Aug 9, 2008, 10:41:47 AM
Gazing into my crystal ball I observed n...@invalid.com (Nad) writing in
news:g7ju3i$qfi$6...@aioe.org:

> Now, in order to detect a page access by the client,
> you either add an ssi include statement to either cgi
> or javascript, or make your pages dynamically assemble
> with php. But what if you can not even run those because
> the provider does not allow it?
>

That is transparent to the client, and you find a provider that allows
scripting on their servers. There are plenty of free hosts that do so.
Look through this group for some recently mentioned hosts.

Neredbojias

Aug 9, 2008, 3:33:25 PM
On 09 Aug 2008, n...@invalid.com (Nad) wrote:

>>> I like that!!!
>>
>><grin>
>>
>>Seriously, I don't think there's much you can do that is practical.
>
> Well, Google does it. Sure, it is slightly a different setup,
> but they limit the number of queries to 100.

Sure, but tell me they do it without server-side techniques which you so
explicitly eschewed...

>>but you indicated you didn't want to rely on that. An off-the-wall
>>"non-solution" would be to use reasonably long meta page redirects, but
>>the user could always come back with a new time limit.
>
> Could you expand on that idea?

Don't think it would work but just 10-15 minute meta refresh in page head.
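
For reference, a sketch of what such a tag would look like, generated from Python to keep the example self-contained. The helper name and the target page are hypothetical; only the `http-equiv="refresh"` meta element itself is standard HTML.

```python
from html import escape


def meta_refresh_tag(delay_seconds: int, target_url: str) -> str:
    """Build a meta refresh element that redirects after delay_seconds."""
    safe_url = escape(target_url, quote=True)
    return f'<meta http-equiv="refresh" content="{delay_seconds}; url={safe_url}">'


# 10 minutes, redirecting to a hypothetical index page.
tag = meta_refresh_tag(600, "index.htm")
```

The refresh is carried out by the browser, so a tool that simply saves the HTML is unlikely to be affected by it.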

Nad

Aug 9, 2008, 5:00:31 PM
In article <Xns9AF54E4B62C8...@69.16.185.250>, Adrienne Boswell
<arb...@yahoo.com> wrote:
>Gazing into my crystal ball I observed n...@invalid.com (Nad) writing in
>news:g7ju3i$qfi$6...@aioe.org:
>
>> Now, in order to detect a page access by the client,
>> you either add an ssi include statement to either cgi
>> or javascript, or make your pages dynamically assemble
>> with php. But what if you can not even run those because
>> the provider does not allow it?
>>
>
>That is transparent to the client, and you find a provider that allows
>scripting on their servers. There are plenty of free hosts that do so.
>Look through this group for some recently mentioned hosts.

Well, I'd like to know about those servers. I'll review the
posts. Do you know of any off hand?

But I suspect there is a limit on the site size, isn't there?
The sites I'll be creating are in the range of 150-250 megs.
The one I am using now has no limit on size.

Btw, I was thinking if it is worth creating a site on HTML,
using this group's archive, going back a couple of years.

Unfortunately, I am not an HTML expert and the site has to
be organized into two categories of information:

1) Code examples
2) Expert opinions

Now, to generate the site automatically, you have to process
the archives with fancy, multi-stage filters to make sure
you get only articles that ARE exactly on topic for some
issue. In other words, find a needle in a haystack.

The information is categorized by different issues
and I have no idea what those are for HTML. You have to spend
some time on it and create a category list of interesting or
important issues.

Secondly, you need to know who are the "experts" around here.
I have never participated in this group. So, I'd have to review
tons of posts and decide who are those people that really know
what they are talking about.

Things like that. But who knows, you may see a very nice site
on HTML in the near future that would contain thousands of
code examples. If you have any requests, ideas
or suggestions, that would help. If you can recommend a list
of "experts", that'd help. They don't have to be the HTML gods.
They just have to know what they are talking about and be
helpful in the followups.

Also, if you can give me a list of HTML issues that are worth
creating a chapter from, that would help.

Just type one entry per line.


Chaddy2222

Aug 10, 2008, 12:15:42 PM

Nad wrote:

> In article <Xns9AF54E4B62C8...@69.16.185.250>, Adrienne Boswell
> <arb...@yahoo.com> wrote:
> >Gazing into my crystal ball I observed n...@invalid.com (Nad) writing in
> >news:g7ju3i$qfi$6...@aioe.org:
> >
> >> Now, in order to detect a page access by the client,
> >> you either add an ssi include statement to either cgi
> >> or javascript, or make your pages dynamically assemble
> >> with php. But what if you can not even run those because
> >> the provider does not allow it?
> >>
> >
> >That is transparent to the client, and you find a provider that allows
> >scripting on their servers. There are plenty of free hosts that do so.
> >Look through this group for some recently mentioned hosts.
>
> Well, I'd like to know about those servers. I'll review the
> posts. Do you know of any off hand?
>
> But I suspect there is a limit on the site size, isn't there?
> The sites I'll be creating are in the range of 150-250 megs.
> The one I am using now has no limit on size.

There is NO SUCH THING as Unlimited, especially when it comes to web
servers. The best you can get is a VPS or dedicated server.
Check out http://www.servergrade.com.au
They are not free but they do have very good deals on web hosting and
domain names. Especially if you buy your Domain and web hosting as the
one package, then the domain only costs $1 per year!


>
> Btw, I was thinking if it is worth creating a site on HTML,
> using this group's archive, going back a couple of years.

No, it is not at all. Also, if you create sites using content from
other sites, you can be sued for breach of copyright law in
many / most countries of the world. Google will also ban you from its
listings for having duplicated content.
--
Regards Chad. http://freewebdesignonline.org

Dr J R Stockton

Aug 10, 2008, 11:15:33 AM
In comp.lang.javascript message <g7ii5c$3b5$1...@aioe.org>, Fri, 8 Aug 2008
22:42:51, Nad <n...@invalid.com> posted:

If you have a well-crafted index.htm page, and a robots.txt file that
allows robot access only to that page, then it seems likely that the
proportion of access from those whose searches have found something that
might have been of interest but was not will be significantly reduced.
Certainly using such a robots.txt works for me, to reduce total
download.
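
A sketch of what such a robots.txt might contain, checked with Python's standard `urllib.robotparser`. The file contents and the bot name are assumptions, not the file actually in use here, and `Allow` lines may not be honored by every crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: robots may fetch only the index page.
ROBOTS_TXT = """\
User-agent: *
Allow: /index.htm
Disallow: /
"""


def may_fetch(path: str, agent: str = "ExampleBot", robots_txt: str = ROBOTS_TXT) -> bool:
    """Report whether a rule-abiding robot would be allowed to fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)
```

The `Allow` line is placed before `Disallow: /` so that parsers which use the first matching rule, as Python's does, still permit the index page.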

Keep page sizes down, so that a page access which turned out to be
uninteresting or only partly interesting does not cost you so many
bytes.

Omit inessential figures from the text pages, link to them instead, so
that a click is needed and will open a new tab or window. Maybe do
similar with tables.

Check how the access is counted. If a page in plain HTML requires 50 kB
but can be compressed to 25 kB, is it delivered compressed and is it
counted as 25 or 50 kB?
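
To get a feel for the difference, the two sizes can be measured for any given page. This sketch uses Python's `gzip` module; the function name is hypothetical, and whether the host actually serves and bills the compressed size is something only the provider can confirm.

```python
import gzip
from typing import Tuple


def transfer_sizes(html: str) -> Tuple[int, int]:
    """Return (uncompressed_bytes, gzip_compressed_bytes) for a page."""
    raw = html.encode("utf-8")
    return len(raw), len(gzip.compress(raw))


# Made-up sample page; real pages will compress by different ratios.
sample = "<p>Some repetitive article text.</p>\n" * 1000
raw_size, compressed_size = transfer_sizes(sample)
```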

Consider zipping material, as a possible means of deterring mere
passers-by. Consider compressing material in a manner less easy of
access - zip with password or a rarer compressing tool. Consider
encoding material by writing not in English but, say, in German. You
can always if necessary rephrase your German so that translate tools
make reasonable sense of it.

Don't expect any of these to prevent all downloading of the whole site;
they are merely ways likely to reduce downloading by those who don't
need the material.

--
(c) John Stockton, nr London UK. ?@merlyn.demon.co.uk IE7 FF2 Op9 Sf3
news:comp.lang.javascript FAQ <URL:http://www.jibbering.com/faq/index.html>.
<URL:http://www.merlyn.demon.co.uk/js-index.htm> jscr maths, dates, sources.
<URL:http://www.merlyn.demon.co.uk/> TP/BP/Delphi/jscr/&c, FAQ items, links.
