'AppStateSetBased.updateUser' is extremely expensive. Running it will
serialize both the old user and the new user (including all jobs).
This means that you're serializing almost 200 times too much
information. Changing this to something more sensible reduces the
runtime from 8m to 20s on my box.
The event file is quite a lot smaller, at 8.9MB. This comes down to
230 bytes per database insert (ControllerStressTest.insertu makes 202
inserts).
--
Cheers,
Lemmih
HAppS is still quite experimental, so I wouldn't put money on it
unless you're willing to fix bugs when needed.
> * Are there certain types of web apps that are unlikely to work
> well with the HAppS web architecture?
No released version of HAppS supports sharding as of this time. This
limits the amount of data you can manipulate.
> * Are there changes I can make to my toy app's architecture -- be
> it data structures, buying new hardware, whatever -- that will enable
> me to get good performance against the stress test described below and
> in the demo?
The changes mentioned in my previous mail result in a throughput of
~2000 transactions per second (using run-of-the-mill desktop
hardware). More optimization and better hardware could probably
improve that number a bit.
> * Are there other HAppS stress tests in the public domain, and
> what are the results so far?
I'm not aware of any.
--
Cheers,
Lemmih
Using a Map is half the solution, I'd think. It makes you realise that
only a key is needed to identify a user (as opposed to the entire user
info).
The methods that take a User as an argument are what make the program
slow. We can get around this by adding specialised functions
like: 'addUserJob :: UserName -> Job -> Update YourState ()'.
Adding 'addUserJob' is all you need to make the stress test run fast.
However, it would be a good idea to get rid of 'updateUser'
altogether.
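A minimal sketch of what 'addUserJob' could look like (the state and
field names are my assumptions, not the tutorial's actual types, and I'm
assuming Update is a MonadState instance as in HAppS.State; the point is
that only the key and the new job end up serialized in the event log):

> import qualified Data.Map as Map
> import Data.Map (Map)
> import Control.Monad.State (modify)
> import HAppS.State (Update)              -- assumed module layout
>
> type UserName = String
> data Job      = Job String               -- placeholder
> data UserInfo = UserInfo { jobs :: [Job] }
> data AppState = AppState { appUsers :: Map UserName UserInfo }
>
> -- Serializes only the key and the job, not two full UserInfo records.
> addUserJob :: UserName -> Job -> Update AppState ()
> addUserJob name job = modify $ \st ->
>     st{ appUsers = Map.adjust attach name (appUsers st) }
>   where
>     attach u = u{ jobs = job : jobs u }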
> I've decided that for an interim goal, I'd like to show my toy app
> with 100,000 users created, and maybe 2000 transactions per second as
> you had. Better would be a million users but I'll worry about that
> after I am satisfied that 10^5 is doable.
Inserting 100,000 users will be slow in any database if you use 200
transactions per user: that's 20 million transactions, or close to
three hours even at ~2000 transactions per second. Try reducing the
number of transactions needed.
--
Cheers,
Lemmih
That's only ~130 transactions per second. Are you maxing out your CPU?
Btw, I tried building the new patches but Safe.hs is missing.
> This might be good enough for the real world, but I'm not sure.
>
> I'm wondering if there's a way to get better shutdown times if you
> checkpoint more often and not just at shutdown.
>
> Can checkpointing be made into a cron-like thing where it's just done
> every 5 minutes say?
Yes, but I wouldn't recommend it. Making a checkpoint writes the
entire state to disk.
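For completeness, a cron-like checkpointer is just a background thread
(a sketch; createCheckpoint and the Control handle are my assumptions
about the HAppS.State API):

> import Control.Concurrent (ThreadId, forkIO, threadDelay)
> import Control.Monad (forever)
> import HAppS.State (Control, createCheckpoint)  -- assumed API
>
> -- Write a checkpoint every n minutes; each one serializes the
> -- whole state.
> periodicCheckpoint :: Int -> Control -> IO ThreadId
> periodicCheckpoint minutes ctl = forkIO $ forever $ do
>   threadDelay (minutes * 60 * 1000 * 1000)      -- microseconds
>   createCheckpoint ctl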
--
Cheers,
Lemmih
--
aycan
Event execution continues while the checkpoint is being written to
disk. It's just a waste of resources to serialize the state every 5
minutes.
--
Cheers,
Lemmih
I am. After fixing a space leak and tuning the GC options, these are
the times I'm getting:
"Users 801 to 1000 have been inserted.
Stress test time: 8 secs"
"added user: user998
added user: user999
added user: user1000
creating checkpoint
time: Tue Oct 14 21:23:13 CEST 2008
shutting down system
time: Tue Oct 14 21:23:16 CEST 2008
shutting down system, time:
time: Tue Oct 14 21:23:16 CEST 2008"
My hardware is slightly newer:
david@desktop:~$ uname -a
Linux lemmih-desktop 2.6.24-16-generic #1 SMP Thu Apr 10 12:47:45 UTC
2008 x86_64 GNU/Linux
david@desktop:~$ cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 75
model name : AMD Athlon(tm) 64 X2 Dual Core Processor 4600+
stepping : 2
cpu MHz : 1000.000
cache size : 512 KB
[snip]
Make sure you don't get bitten by laziness, and try +RTS -c -A5m -RTS
(the compacting garbage collector with a 5MB allocation area).
--
Cheers,
Lemmih
How much memory usage are you seeing? After inserting 200k jobs,
HAppS on my box is using 245 megs. We could probably get that number
down even further by using a more compact representation of strings.
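For instance (field names hypothetical), switching from String to
strict ByteStrings turns several machine words per character into a
single byte:

> import qualified Data.ByteString.Char8 as B
>
> -- A String is a linked list of boxed Chars; a strict ByteString is
> -- a flat buffer with one byte per character.
> data UserInfo = UserInfo
>   { userName  :: B.ByteString
>   , userEmail :: B.ByteString
>   }
>
> mkUser :: String -> String -> UserInfo
> mkUser name email = UserInfo (B.pack name) (B.pack email)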
We, at HAppS, are going for option 4: Throwing more machines at the
problem. Limited memory isn't the only problem of individual machines.
They are also limited in CPU capabilities and reliability. We're
trying to solve all of these problems by making HAppS a distributed
application.
--
Cheers,
Lemmih
It seems to me that the 'jobs' entry in 'UserInfos' is never forced in
the stress test. This probably isn't a big issue, though.
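If it ever did matter, forcing the list on each update keeps thunks
from piling up in a field nobody reads (a sketch; this UserInfos is a
stand-in for the tutorial's type):

> data Job       = Job String                 -- placeholder
> data UserInfos = UserInfos { jobs :: [Job] }
>
> -- 'length' forces the spine, collapsing the chain of selector
> -- thunks that repeated updates would otherwise leave behind.
> addJob :: Job -> UserInfos -> UserInfos
> addJob j u = let js = j : jobs u
>              in length js `seq` u{ jobs = js }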
--
Cheers,
Lemmih
RAM has become quite cheap. Adding a couple of gigs can be cheaper
than using developer time.
> Anyway, there are now 3 stress tests:
> -- 1 inserts all users, all jobs with one transaction
> -- 2 one transaction per user, all jobs get inserted at once
> -- 3 one transaction per job
>
> One transaction per job is too slow for inserting a lot of dummy data,
> though it looks like lemmih got better results in his own testing.
> Inserting all users is of course the fastest, however I couldn't do
> this for more than 200 users. The best option in practice was option
> 2, which allowed me to insert 1000 users at a time, eg by
>
> time wget http://www.happstutorial.com:5002/tutorial/stresstest/atomicinsertsalljobs/1000
>
> What I'm seeing with 25000 users is, I can only view the consultants
> page. If I try to look at consultants wanted (which checks if the jobs
> list is null) or show all jobs (which does a map fold over users to
> get all the jobs) the hypervisor kills happs for using too much
> memory.
Inserting one million jobs isn't too bad on my machine (which has 4GB of RAM):
david@desktop:happs-tutorial$ time wget http://localhost:5001/tutorial/stresstest/atomicinsertsalljobs/100000 -o /dev/null

real 3m17.576s
user 0m0.004s
sys 0m0.004s
At this point, happs-tutorial is using 423 megs of RAM. It's only a
tiny bit more than Firefox (:
I'll try to rewrite the tutorial using the BerkeleyDB binding I
uploaded yesterday[1].
>> We, at HAppS, are going for option 4: Throwing more machines at the
>> problem. Limited memory isn't the only problem of individual machines.
>> They are also limited in CPU capabilities and reliability. We're
>> trying to solve all of these problems by making HAppS a distributed
>> application.
>
> How would this work in the job board case? Could I accomplish 100,000
> with the same data model I have now, but using four machines? What
> would happen when I attempt to view all jobs? That's 20 million --
> whoa -- but I did make a paginator that allows you to scroll around in
> that many records keeping the actual browser page display down to a
> reasonable size. To display the data you don't need all the jobs, just
> the first 200, and pagination info for the rest. So really it comes
> down to doing a count of jobs records.
You wouldn't use exactly the data model you have now. It should look
more like this (excuse the formatting):
> data AppState = AppState
>   { appUsers       :: Map UserId   User
>   , appUsersByName :: Map UserName UserId
>   , appJobs        :: Map JobId    Job
>   , appJobsByName  :: Map JobName  JobId
>   }
This makes it very easy to partition the users and the jobs over an
arbitrary number of nodes.
Listing the jobs can be done in many different ways. A list of job ids
could be shared on all nodes, or you could request a subset of the
jobs from each node and merge the responses, or you could do something
different depending on what you want.
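As a sketch of why the keyed model shards so easily (the newtype and
the modulo scheme are illustrations, not HAppS's actual sharding
code):

> newtype UserId = UserId Int deriving (Eq, Ord)
>
> -- The node holding a record is a pure function of its key, so any
> -- node can route a request without consulting a lookup table.
> homeNode :: Int -> UserId -> Int
> homeNode nNodes (UserId uid) = uid `mod` nNodes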
> Is there an example somewhere in the head repos for sharding, on a
> large data set like what I have for the stress test?
Unfortunately there isn't. No sharding code has been released yet.
--
Cheers,
Lemmih
[1] http://hackage.haskell.org/cgi-bin/hackage-scripts/package/berkeleydb
Here are my timings using compact-map instead of berkeleydb:
david@desktop:happs-tutorial$ time wget http://localhost:5001/tutorial/stresstest/atomicinsertsalljobs/100000 -o /dev/null

real 1m3.774s
user 0m0.000s
sys 0m0.000s
creating checkpoint
time: Tue Nov 11 15:34:38 CET 2008
shutting down system
time: Tue Nov 11 15:34:48 CET 2008
Memory usage at 269 megs. I'm using the compacting garbage collector.
I'd love to discuss the results with you in real time. Come by #happs
on irc.freenode.net some day.
--
Cheers,
Lemmih
Yes, all functions are tested against Data.Map for compliance.
> If so, are there any performance tradeoffs involved?
There are. CompactMaps are quite a bit slower than ordinary maps. Try
to avoid newtyped keys; the newtypes unfortunately keep dictionaries
around, which hurts performance.
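For illustration (the module name is my assumption about the
compact-map package; only the key types matter here):

> import Data.CompactMap (CompactMap)  -- assumed module name
>
> data User = User                     -- placeholder
>
> -- A plain key lets the class methods specialise away:
> type FastUsers = CompactMap Int User
>
> -- A newtyped key keeps its dictionaries live on every operation:
> newtype UserId = UserId Int deriving (Eq, Ord)
> type SlowUsers = CompactMap UserId User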
Here are some benchmarks:
http://darcs.haskell.org/~lemmih/compact-map/benchmarks/
--
Cheers,
Lemmih