Preventing multiple invocations of script from accessing the same file


Mark Hobley

Nov 27, 2005, 4:08:03 PM
I am writing a CGI script using the Bourne shell. I want to update a file
based on input obtained.

If another person is accessing the same web page, the script may be invoked
a second time, whilst the first occurrence is still running.

How can I configure some sort of lock, so that the second copy of the script
will not continue, until the first copy of the script has done the update.

I thought about some sort of lock file, but I thought of a problem when both
scripts are invoked one after the other where the second script may not see
the lock file by the first, because it didn't exist at the time of check.

Please advise.

Mark.

--
Mark Hobley
393 Quinton Road West
QUINTON
Birmingham
B32 1QE

Telephone: (0121) 247 1596
International: 0044 121 247 1596

Email: markhobley at hotpop dot donottypethisbit com

http://markhobley.yi.org/

Geoffrey Clements

Nov 27, 2005, 4:28:49 PM
On Sun, 27 Nov 2005 21:08:03 +0000, Mark Hobley wrote:

> I am writing a CGI script using the Bourne shell. I want to update a file
> based on input obtained.
>
> If another person is accessing the same web page, the script may be invoked
> a second time, whilst the first occurrence is still running.
>
> How can I configure some sort of lock, so that the second copy of the script
> will not continue, until the first copy of the script has done the update.
>
> I thought about some sort of lock file, but I thought of a problem when both
> scripts are invoked one after the other where the second script may not see
> the lock file by the first, because it didn't exist at the time of check.
>
> Please advise.
>

How about:
1. Check if lockfile exists, if it does wait until it doesn't.
2. When there is no lockfile create a lockfile containing the current
process id.
3. Read the lockfile, if it contains the current process id carry on, if
the contained process id is different then wait until lockfile is removed.

--
Geoff
Replace bitbucket with geoff to mail me

Unruh

Nov 27, 2005, 5:07:22 PM
markh...@hotpop.deletethisbit.com (Mark Hobley) writes:

>I am writing a CGI script using the Bourne shell. I want to update a file
>based on input obtained.

>If another person is accessing the same web page, the script may be invoked
>a second time, whilst the first occurrence is still running.

>How can I configure some sort of lock, so that the second copy of the script
>will not continue, until the first copy of the script has done the update.

>I thought about some sort of lock file, but I thought of a problem when both
>scripts are invoked one after the other where the second script may not see
>the lock file by the first, because it didn't exist at the time of check.

So have the script activate the lock file the very first thing it
does.
Yes, race conditions can happen.
But they can be improbable.

Martin Gregorie

Nov 27, 2005, 5:18:58 PM
Mark Hobley wrote:
> I am writing a CGI script using the Bourne shell. I want to update a file
> based on input obtained.
>
> If another person is accessing the same web page, the script may
> be invoked a second time, whilst the first occurrence is still running.
>
> How can I configure some sort of lock, so that the second copy of the
> script will not continue, until the first copy of the script has done
> the update.
>
I'd suggest that you don't do it that way.

The easy way to avoid conflicts without using locks is to move all the
critical stuff into a single threaded server which only ever runs as a
single instance and listens on a named pipe or a port.

Your script needs to call a simple program that opens a connection to
the server, passes the request to the server, waits for the response,
closes the connection and quits.

The server should be written to receive a request, process it using
synchronous i/o and return a response. Because the server is single
threaded it will serialize requests: while it is handling one request it
won't accept another connection, let alone overlap it with another
request. The server needs to use poll() to handle the listener port and
the accepted connections but should not use it for anything else apart
from trapping signals. This assumes a single copy of a permanent server
is started at boot time.

The other way to run the server is to let xinetd start the server on
demand and configure xinetd to allow only one copy of the server to be
started. This makes the server as easy to write as a normal filter. It
gets the request via stdin and writes the response to stdout. It doesn't
use poll() at all.

HTH

--
martin@ | Martin Gregorie
gregorie. |
org | Zappa fan & glider pilot

Julian Bradfield

Nov 27, 2005, 5:45:28 PM
In article <fagp53-...@neptune.markhobley.yi.org>,

Mark Hobley <markh...@hotpop.deletethisbit.com> wrote:
>I am writing a CGI script using the Bourne shell. I want to update a file
>based on input obtained.

Why the Bourne shell?
If you were using Perl, you could use standard locking facilities.

However, if you insist on doing file locking in Bourne shell, the
best way is probably via symlinks, thus:

to lock:

locked=''
while [ ! "$locked" ] ; do
    ln -s $$ lockfile 2>/dev/null
    # quote the result: readlink prints nothing if the link is missing
    if [ "`readlink lockfile`" = "$$" ] ; then
        locked=yes
    else
        sleep 10
        # insert error checking, timeouts, etc. to taste
    fi
done

to unlock:

# do paranoia checking that the lock is still ours if you like
rm lockfile

The reason for using symlinks is that symlink creation is atomic even
over NFS.

Mark Hobley

Nov 27, 2005, 6:08:03 PM
In alt.comp.os.linux Unruh <unruh...@physics.ubc.ca> wrote:

> So have the script activate the lock file the very first thing it
> does.
> Yes, race conditions can happen.
> But they can be improbable.

Yeah, but more likely in CGI because of multiple pages being served.

Mark Hobley

Nov 27, 2005, 6:08:03 PM
In alt.comp.os.linux Geoffrey Clements <bitb...@electron.me.uk> wrote:

> 1. Check if lockfile exists, if it does wait until it doesn't.
> 2. When there is no lockfile create a lockfile containing the current
> process id.
> 3. Read the lockfile, if it contains the current process id carry on, if
> the contained process id is different then wait until lockfile is removed.

A race condition can still occur, as follows:

1. First script runs
2. Second script runs
3. First script checks for lock file, it doesn't exist
4. Second script checks for lock file, it doesn't exist
5. First script creates lock file, and writes its pid
6. First script reads lock file, and checks pid, it matches
7. Second script writes lock file, and writes its pid
8. Second script reads lock and check pid, it matches

If 6 beats 7 then both scripts continue thinking they have "GO".

Mark Hobley

Nov 27, 2005, 8:08:03 PM
In alt.comp.os.linux Martin Gregorie <mar...@see.sig.for.address> wrote:

> The easy way to avoid conflicts without using locks is to move all the
> critical stuff into a single threaded server which only ever runs as a
> single instance and listens on a named pipe or a port.

That sounds like something that I want to do. How easy is it to set this up ?

Is there a skeleton or sample code that I can work from ?

(Pipes are preferred, but I would like to know how to do this with ports also.)

> Your script needs to call a simple program that opens a connection to
> the server, passes the request to the server, waits for the response,
> closes the connection and quits.

Again, how easy is this to set up ? Is there a skeleton ?

(My background is MSDOS assembly language. I am new to Linux programming.)

> The other way to run the server is to let xinetd start the server on
> demand and configure xinetd to allow only one copy of the server to be
> started.

Will the traditional superserver daemon, inetd do this ? I am not using xinetd.

Chris F.A. Johnson

Nov 27, 2005, 8:30:21 PM
On 2005-11-27, Mark Hobley wrote:
> I am writing a CGI script using the Bourne shell. I want to update a file
> based on input obtained.
>
> If another person is accessing the same web page, the script may be invoked
> a second time, whilst the first occurrence is still running.
>
> How can I configure some sort of lock, so that the second copy of the script
> will not continue, until the first copy of the script has done the update.
>
> I thought about some sort of lock file, but I thought of a problem when both
> scripts are invoked one after the other where the second script may not see
> the lock file by the first, because it didn't exist at the time of check.

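file=/path/to/file
lockfile=$file-lock
delay=1 ## Adjust to taste
maxtries=10 ## ditto
while :
do
    ## ln -s fails if the link already exists, so success means we hold the lock
    ln -s "$file" "$lockfile" && break
    sleep "$delay"
    [ "$maxtries" -le 0 ] && exit 1
    ## count down, so the loop gives up once the tries are used up
    maxtries=$(( $maxtries - 1 ))
done
trap 'rm "$lockfile"' EXIT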


--
Chris F.A. Johnson, author | <http://cfaj.freeshell.org>
Shell Scripting Recipes: | My code in this post, if any,
A Problem-Solution Approach | is released under the
2005, Apress | GNU General Public Licence

Simon J. Rowe

Nov 28, 2005, 3:33:16 AM
Mark Hobley wrote:

> How can I configure some sort of lock, so that the second copy of the
> script will not continue, until the first copy of the script has done the
> update.

You can implement mutex locking in shell using directory creation

mutex_try_lock()
{
    # grab exclusive lock
    mkdir "$LOCKFILE"

    if [ $? -eq 0 ]; then
        # we've got the lock
        return 0
    else
        # we've failed to get the lock
        return 1
    fi
}

mutex_lock()
{
    while :; do
        mutex_try_lock

        if [ $? -eq 0 ]; then
            # we've got the lock
            return 0
        else
            # wait for the lock (usleep takes microseconds)
            usleep 50000
        fi
    done
}

mutex_unlock()
{
    rmdir "$LOCKFILE"
}

Geoffrey Clements

Nov 28, 2005, 5:26:39 AM

"Mark Hobley" <markh...@hotpop.deletethisbit.com> wrote in message
news:hrop53-...@neptune.markhobley.yi.org...

> In alt.comp.os.linux Geoffrey Clements <bitb...@electron.me.uk> wrote:
>
>> 1. Check if lockfile exists, if it does wait until it doesn't.
>> 2. When there is no lockfile create a lockfile containing the current
>> process id.
>> 3. Read the lockfile, if it contains the current process id carry on, if
>> the contained process id is different then wait until lockfile is
>> removed.
>
> A race condition can still occur, as follows:
>
> 1. First script runs
> 2. Second script runs
> 3. First script checks for lock file, it doesn't exist
> 4. Second script checks for lock file, it doesn't exist
> 5. First script creates lock file, and writes its pid
> 6. First script reads lock file, and checks pid, it matches
> 7. Second script writes lock file, and writes its pid
> 8. Second script reads lock and check pid, it matches
>
> If 6 beats 7 then both scripts continue thinking they have "GO".
>

Yup - I was just trying to reduce the likelihood of the race without
getting too involved. In fact I don't think you can eradicate the race with
just a single simple script, but I'd be happy if someone can prove me wrong.
If you wanted to keep things simple you could re-do the check before you do
anything critical.

Martin Gregorie's solution looks sane to me but is a lot more than a simple
script solution. It would be nice if bash provided something equivalent to
a mutex.

--
Geoff


Simon J. Rowe

Nov 28, 2005, 5:30:24 AM
Geoffrey Clements wrote:

> in fact I don't think you can eradicate the race with
> just a single simple script, but I'd be happy if someone can prove me
> wrong.

Mkdir gives you an atomic test-and-set primitive...
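
As a rough illustration of how that primitive might be wrapped around the kind of file update discussed in this thread, here is a minimal sketch. The LOCKDIR and COUNTER paths, the 30-try limit and the function names are assumptions made up for the example; none of them come from the posts above.

```shell
#!/bin/sh
# Sketch: serialise updates to a counter file with a mkdir-based lock.
# LOCKDIR and COUNTER are hypothetical paths chosen for this example.
LOCKDIR="/tmp/webcount.lock"
COUNTER="/tmp/webcount"

acquire_lock() {
    tries=0
    # mkdir either creates the directory or fails; there is no gap
    # between the "test" and the "set" for another process to slip into.
    until mkdir "$LOCKDIR" 2>/dev/null; do
        tries=$((tries + 1))
        [ "$tries" -ge 30 ] && return 1    # give up after about 30 seconds
        sleep 1
    done
    # Remove the lock if the script dies while holding it.
    trap 'rmdir "$LOCKDIR" 2>/dev/null' EXIT
    return 0
}

release_lock() {
    rmdir "$LOCKDIR"
    trap - EXIT
}

increment_counter() {
    acquire_lock || return 1
    # A missing counter file is treated as zero.
    n=$(cat "$COUNTER" 2>/dev/null || echo 0)
    echo $((n + 1)) > "$COUNTER"
    release_lock
}
```

A CGI script would call increment_counter once per request. If the lock cannot be obtained within the retry limit the function returns 1, so the caller can report an error instead of hanging the web server.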

Greg Hennessy

Nov 28, 2005, 5:35:32 AM
On Mon, 28 Nov 2005 08:33:16 +0000, "Simon J. Rowe" <sr...@mose.org.uk>
wrote:

>Mark Hobley wrote:
>
>> How can I configure some sort of lock, so that the second copy of the
>> script will not continue, until the first copy of the script has done the
>> update.
>
>You can implement mutex locking in shell using directory creation
>

[snip]

Kewl, one to remember for future reference TVM.
--
"Access to a waiting list is not access to health care"

Geoffrey Clements

Nov 28, 2005, 6:22:29 AM
"Simon J. Rowe" <sr...@mose.org.uk> wrote in message
news:3-CdnYR4TMj...@brightview.com...

ooo - I didn't realize that, is it filesystem dependent or is it in the
mkdir code?

--
Geoff


Roger Hamlett

Nov 28, 2005, 6:29:42 AM

"Geoffrey Clements" <geoffrey....@SPAMbaesystems.com> wrote in
message news:438ad8b2$1...@glkas0286.greenlnk.net...
Does the script have a 'create file' operation? What condition does this
generate if the file exists? In the past, in a different scripting
language, this was the solution for me in a similar position: the
create file primitive would return an error if you attempted to create a
file that already existed, but if it didn't exist it would immediately
create it, and this operation was warranted to be 'atomic'.
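
For a POSIX shell, the nearest equivalent is probably the noclobber option (set -C), which makes ">" refuse to overwrite an existing regular file. The sketch below assumes a POSIX sh rather than the original Bourne shell, and assumes the shell performs the exclusive create in one step, as the common implementations appear to do. LOCKFILE and the function names are invented for the example.

```shell
#!/bin/sh
# Sketch: "create only if absent" using the POSIX noclobber option.
# LOCKFILE is a hypothetical path chosen for this example.
LOCKFILE="/tmp/webcount.lockfile"

try_lock() {
    # The subshell keeps noclobber from leaking into the rest of the script.
    # With noclobber set, the redirection fails if the file already exists.
    ( set -C; echo "$$" > "$LOCKFILE" ) 2>/dev/null
}

unlock() {
    rm -f "$LOCKFILE"
}
```

try_lock returns 0 only for the caller that actually created the file, so a script can loop on it with a sleep in the same way as the symlink and mkdir versions elsewhere in this thread.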

Best Wishes


Alan Connor

Nov 28, 2005, 6:53:45 AM
On comp.unix.shell, in <1rup53-...@neptune.markhobley.yi.org>, "Mark Hobley" wrote:

<snip>

Here's the URL to a tutorial on the subject in question, for
webmasters, with a number of solutions presented in detail (it
came up two years ago in connection with a related challenge, and
I have a very good local cache, which, unlike GG, doesn't decide
to drop posts for no apparent reason):

OOOO://OOOO.OOOOO.OOO/~OOOOOOOO/OOOOOO/OOO-OOOO.OOOO

> --
> OOOO OOOOOO
> OOO OOOOOOO OOOO OOOO
> OOOOOOO
> OOOOOOOOOO
> OOO OOO
>
> OOOOOOOOO: (OOOO) OOO OOOO
> OOOOOOOOOOOOO: OOOO OOO OOO OOOO
>
> OOOOO: OOOOOOOOOO OO OOOOOO OOO OOOOOOOOOOOOOOOO OOO
>
> OOOO://OOOOOOOOOO.OO.OOO/
>

I thought it was only fitting to use the same vi macro on the URL
as I use on your obnoxious, over-sized, Netiquette-violating sig.

But I _will_ post it on this very thread sometime in the future.

(It's on my calendar and I've saved a copy of the tutorial in
case it vanishes from the Web for some reason.)

It isn't fair to deprive everyone of the information just because
of one rude jerk.

Thanks for the workout with find and grep, "Mark". I learned a
couple of valuable things in the process of searching my news
cache.

Alan

--
URLs of possible interest in my headers.

Linønut

Nov 28, 2005, 7:56:11 AM
After takin' a swig o' grog, Mark Hobley belched out this bit o' wisdom:

> In alt.comp.os.linux Unruh <unruh...@physics.ubc.ca> wrote:
>
>> So have the script activate the lock file the very first thing it
>> does.
>> Yes, race conditions can happen.
>> But they can be improbable.
>
> Yeah, but more likely in CGI because of multiple pages being served.

Use shared memory?

--
Treat yourself to the devices, applications, and services running on the
GNU/Linux® operating system!

Martin Gregorie

Nov 28, 2005, 8:30:57 AM
Mark Hobley wrote:
> In alt.comp.os.linux Martin Gregorie <mar...@see.sig.for.address> wrote:
>
>> The easy way to avoid conflicts without using locks is to move all the
>> critical stuff into a single threaded server which only ever runs as a
>> single instance and listens on a named pipe or a port.
>
> That sounds like something that I want to do. How easy is it to set this up ?
>
Pretty easy. The client is always easy to write. Servers are OK too once
you've written one. If you believe in having a technical library, get
the O'Reilly "Lion" book (Unix SVR4 System Programming) - it's a goldmine
for guidance on using all sorts of useful stuff and on portability
issues. In UNIX-speak "system programming" merely means writing code in C.

> Is there a skeleton or sample code that I can work from ?
>

E-mail me if you want sample code. I have skeletons for single threaded,
multi-threaded and xinetd single-session servers and the associated
client. Ditto for a Java multi-threaded server and client. These are all
mix'n match: all clients talk to all servers.

> (Pipes are preferred, but I would like to know how to do this with ports also.)
>

There's almost no difference in the code. The only practical differences
are:

- named pipes only link processes within the same machine but allow
a development team to easily run multiple development environments
and not clash with ports that are already in use.

- ports are not restricted to one machine, but the server needs to
allow the port number to be configurable if you want to run more
than one copy on a system. Ports can't be shared between servers.

- servers run under xinetd must use ports.

I've used named pipes during large scale development when each developer
needed to run his own copy of the complete multi-process system, but for
general use I prefer ports for their greater flexibility. Don't forget
that a named pipe is effectively a special file (a so-called fifo) and
exists within a directory. Unnamed pipes are not relevant to this
discussion: unless a server is listening to a named pipe or a known port
how could the client programs find it?

>> Your script needs to call a simple program that opens a connection to
>> the server, passes the request to the server, waits for the response,
>> closes the connection and quits.
>
> Again, how easy is this to set up ? Is there a skeleton ?
>

Not as easy as the xinetd server, but not hard. It's essentially linear code:

open a connection to the server
write a command to the connection
read the reply
close the connection
exit

> (My background is MSDOS assembly language. I am new to Linux programming.)
>

You'll love it. It's a way of life.

Seriously, Linux programming in C is (a) easier than for DOS and (b)
much easier than assembler.

I'm not up to speed on CGI, but IIRC data is passed to a CGI script as
command line arguments. If so, you can forget the script and just call
the C client program in place of the script. Development would also be
easy: you can test run the client from the command line and configure
inetd/xinetd to load the server from your development directory.

>> The other way to run the server is to let xinetd start the server on
>> demand and configure xinetd to allow only one copy of the server to be
>> started.
>
> Will the traditional superserver daemon, inetd do this ? I am not using xinetd.
>

I'm surprised. I thought all modern distros had changed over to xinetd.
That said, I can't comment on inetd's capabilities. It's not installed
here so I don't have the manpage.

The xinetd server logic looks like this:

while not EOF
{
read a line from stdin
process
write a reply to stdout
}
exit(0)

You set "wait=yes" in the xinetd service definition for this server,
which prevents xinetd from starting another copy of the server until the
current one dies. Provided that inetd supports the "wait" option for a
service it will also do the trick.
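
The same loop can be sketched in shell for anyone who wants to try the idea before writing it in C. This is only an illustration: the INCREMENT command, the "OK"/"ERR" reply format and the COUNTFILE path are assumptions invented for the example, and a production server would want input validation and error handling.

```shell
#!/bin/sh
# Sketch of a filter-style server: one request per line on stdin,
# one reply per line on stdout. The INCREMENT command and the
# COUNTFILE path are assumptions made for this example.
COUNTFILE="${COUNTFILE:-/tmp/webcount}"

handle_request() {
    case "$1" in
        INCREMENT)
            # A missing counter file is treated as zero.
            n=$(cat "$COUNTFILE" 2>/dev/null || echo 0)
            n=$((n + 1))
            echo "$n" > "$COUNTFILE"
            echo "OK $n"
            ;;
        *)
            echo "ERR unknown command"
            ;;
    esac
}

serve() {
    # read returns non-zero at EOF, which ends the loop and the server.
    while IFS= read -r line; do
        handle_request "$line"
    done
}
```

A real server script would end with a call to serve. It can be exercised from the command line with something like printf 'INCREMENT\n' piped into the script, with no superserver involved.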

Owen Rees

Nov 28, 2005, 12:58:29 PM
On Mon, 28 Nov 2005 13:30:57 +0000, Martin Gregorie
<mar...@see.sig.for.address> wrote in
<dmf0qj$rg0$1$8300...@news.demon.co.uk>:

>I'm not up to speed on CGI, but IIRC data is passed to a CGI script as
>command line arguments. If so, you can forget the script and just call
>the C client program in place of the script. Development would also be
>easy: you can test run the client from the command line and configure
>inetd/xinetd to load the server from your development directory.

See <http://hoohoo.ncsa.uiuc.edu/cgi/> for the Common Gateway Interface
spec - most data is passed as environment variables, but data from a
POST will arrive on stdin. There are probably libraries for most
languages that will do the necessary work to retrieve the parameters
into a more convenient form.
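
For a shell CGI script, picking up the raw parameters might look roughly like the sketch below. REQUEST_METHOD, QUERY_STRING and CONTENT_LENGTH are the standard CGI variable names; the function name is invented for the example, and decoding the URL-encoded result is left to whatever helper the script already uses.

```shell
#!/bin/sh
# Sketch: fetch the raw, still URL-encoded request parameters in a CGI script.
# read_cgi_input is a hypothetical helper name.
read_cgi_input() {
    case "${REQUEST_METHOD:-GET}" in
        POST)
            # The body arrives on stdin; CONTENT_LENGTH gives its size in bytes.
            dd bs=1 count="${CONTENT_LENGTH:-0}" 2>/dev/null
            ;;
        *)
            # For GET the parameters are in the environment.
            printf '%s' "${QUERY_STRING:-}"
            ;;
    esac
}
```

Reading one byte at a time with dd is slow but stops at exactly CONTENT_LENGTH bytes, which matters because the web server is not obliged to close stdin after the body.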

--
Owen Rees
[one of] my preferred email address[es] and more stuff can be
found at <http://www.users.waitrose.com/~owenrees/index.html>

Ian Rawlings

Nov 29, 2005, 2:47:03 PM
On 2005-11-27, Mark Hobley <markh...@hotpop.deletethisbit.com> wrote:

> If 6 beats 7 then both scripts continue thinking they have "GO".

Hmm, I have the feeling that fifos could be used to solve this, if I
wasn't so pooped I'd scheme something up..

--
For every expert, there is an equal but opposite expert

Mark Hobley

Dec 10, 2005, 10:08:03 AM
In comp.unix.shell Martin Gregorie <mar...@see.sig.for.address> wrote:

> The easy way to avoid conflicts without using locks is to move all the
> critical stuff into a single threaded server which only ever runs as a
> single instance and listens on a named pipe or a port.

I am setting up a server to receive a series of commands via a named pipe.
Does the pipe only allow one process to feed its input at a time ?

What would happen if two copies of the client feed tried to write to the pipe
at the same time ?

I was going to use this script:

PIPEFILE=/service/pipe/webcgi/count.pipe
COUNTFILE=webcount
eval "`./proccgi.exe`"

if [ -e $PIPEFILE ] ;then
if [ -w $PIPEFILE ] ; then
echo "COUNT" > $PIPEFILE
echo "REMOTEADDR $REMOTE_ADDR" > $PIPEFILE
echo "FILE $COUNTFILE" > $PIPEFILE
echo "INCREMENT" > $PIPEFILE
fi
fi

But then I thought: suppose two copies are running simultaneously.

The output from the echo lines could mix, whilst feeding the pipe:

For example:

COUNT
COUNT
REMOTE ADDRESS 192.168.0.1
REMOTE ADDRESS 192.168.0.2
FILE webcount
FILE webcount
INCREMENT
INCREMENT

or worse, could they intermingle character by character ?

COCOUNUNT
RERMEMOOTTEE A DADDRRESESS S 11992.2.116788..00.1.2
FFILIELE wwebecbocuontunt
IINNCCRREMEEMNETNT

If the pipe only allows one process to feed it, then presumably, this would
eliminate the character by character problem, due to the pipe file being
opened to only one echo command at a time, in which case, I could fix the line
by line echo problem, by using a single echo command for the entire line, as
follows:

echo "COUNT\nREMOTEADDR $REMOTE_ADDR\nFILE $COUNTFILE\nINCREMENT" > $PIPEFILE

Martin Gregorie

Dec 10, 2005, 1:04:02 PM
Mark Hobley wrote:
> In comp.unix.shell Martin Gregorie <mar...@see.sig.for.address> wrote:
>
>> The easy way to avoid conflicts without using locks is to move all the
>> critical stuff into a single threaded server which only ever runs as a
>> single instance and listens on a named pipe or a port.
>
> I am setting up a server to receive a series of commands via a named pipe.
> Does the pipe only allow one process to feed its input at a time ?
>
No. Any number of programs can write to the pipe at once.

> What would happen if two copies of the client feed tried to write to the pipe
> at the same time ?
>

Multiple programs can write to a pipe, and each write operation is
atomic provided it sends no more than PIPE_BUF bytes. In Linux PIPE_BUF =
4096. This means that a write will be blocked until any other
simultaneous writes have finished, so the data from each write cannot be
interleaved with data from other programs.
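
One way a client could respect that limit is sketched below. It asks getconf for PIPE_BUF on the pipe's directory, falls back to 512 (the POSIX minimum) if that fails, and sends the whole request with a single printf, which in practice reaches the pipe as one write. send_request and its arguments are names invented for the example, and the length check counts characters, so it is only exact for single-byte text.

```shell
#!/bin/sh
# Sketch: refuse requests too long to be written atomically, then send
# the whole request in one go. send_request is a hypothetical helper.
send_request() {
    pipe=$1
    msg=$2
    # PIPE_BUF is queried per path; 512 is the minimum POSIX guarantees.
    limit=$(getconf PIPE_BUF "$(dirname "$pipe")" 2>/dev/null) || limit=512
    # Allow one extra byte for the trailing newline.
    if [ $(( ${#msg} + 1 )) -gt "$limit" ]; then
        echo "request too long for an atomic write" >&2
        return 1
    fi
    printf '%s\n' "$msg" > "$pipe"
}
```

A call would look like send_request "$PIPEFILE" "COUNT,FILE $COUNTFILE,INCREMENT". As with any write to a named pipe, it blocks until a reader has the pipe open.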

> I was going to use this script:
>
> PIPEFILE=/service/pipe/webcgi/count.pipe
> COUNTFILE=webcount
> eval "`./proccgi.exe`"
>
> if [ -e $PIPEFILE ] ;then
> if [ -w $PIPEFILE ] ; then
> echo "COUNT" > $PIPEFILE
> echo "REMOTEADDR $REMOTE_ADDR" > $PIPEFILE
> echo "FILE $COUNTFILE" > $PIPEFILE
> echo "INCREMENT" > $PIPEFILE
> fi
> fi
>
> But then I thought: suppose two copies are running simultaneously.
>
> The output from the echo lines could mix, whilst feeding the pipe:
>
> For example:
>
> COUNT
> COUNT
> REMOTE ADDRESS 192.168.0.1
> REMOTE ADDRESS 192.168.0.2
> FILE webcount
> FILE webcount
> INCREMENT
> INCREMENT
>

You would certainly get that effect. You need to assemble a single
message to avoid that:

if [ -e $PIPEFILE ] ;then
if [ -w $PIPEFILE ] ; then

MSG="COUNT,REMOTEADDR $REMOTE_ADDR,FILE $COUNTFILE,INCREMENT"
echo "$MSG" > $PIPEFILE
fi
fi

I've just run a quick test using the single, concatenated string. It
works as expected.

> or worse, could they intermingle character by character ?
>

Doesn't happen.

In the following the cgi script is the client and the server is, err,
your server. Do remember that:

- the writer will block forever unless there is a reader with
the named pipe open
- the named pipe is bidirectional but if the server sends a reply
there's no guarantee which client will get it

Returning a reply is good practice because it lets the user know that
his task completed and it makes for good flow control. However, you'll
need to use a socket rather than a pipe in order to get the reply back
to the correct client. Besides, if you want to take advantage of inetd
to start your server you have to use sockets: inetd doesn't understand
named pipes.
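
For the named pipe case, the server end could be sketched in shell along these lines. It is a sketch under stated assumptions: the QUIT command, the LOGFILE path and handle_request are invented for the example, and opening the pipe read-write is behaviour that Linux permits but POSIX leaves unspecified. It is used here so that the server keeps the pipe open between clients and does not see end-of-file each time a client closes its end.

```shell
#!/bin/sh
# Sketch of the server end of the named pipe.
# LOGFILE, the QUIT command and handle_request are assumptions for this example.
LOGFILE="${LOGFILE:-/tmp/count-server.log}"

handle_request() {
    # Placeholder: record the request. The real file update would go here.
    printf '%s\n' "$1" >> "$LOGFILE"
}

serve_pipe() {
    pipe=$1
    [ -p "$pipe" ] || mkfifo "$pipe" || return 1
    # Hold the pipe open read-write on descriptor 3 (works on Linux).
    exec 3<> "$pipe"
    # Requests are handled strictly one at a time, in arrival order.
    while IFS= read -r request <&3; do
        [ "$request" = "QUIT" ] && break
        handle_request "$request"
    done
    exec 3<&-
}
```

Because only this one process ever touches the file, the clients need no locking at all; they just write one complete line each to the pipe.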
