Last night we rebuilt the RAID system on our main file server (due to
a failed hard drive) - even though we have a full backup on another
server, I wanted to have another regular backup before rebuilding the
RAID. This took a bit longer than expected because of the way rsync
calculates checksums. However, this wasn't the problem that kept me
up till 6am.
In addition to replacing a failed drive on the main file server, we
wanted to finally retire one of our oldest servers (Sackson). Sackson
runs a database called MongoDB (www.mongodb.org). It's been growing
over time, and we intended to migrate it to one of our newer machines.
The process of backing up and importing a mongo database takes a long
time due to the sheer size of the data and the indices that this
database uses.
During the setup of mongo on a new server, I started noticing the OS
would "stall" several times when writing to disk. It turns out that the
combination of a virtual server and the particular hard drive
configuration on this machine doesn't seem to work properly - it takes
far too long to write the data to disk (this is a problem to solve at
a different time). Around 4am I decided to abandon the new server and
go back to a non-virtualized server.
We're going back in today to install a different OS and move
things around.
I really apologize for the delay - nobody wants the site back up more than me.
--
Aldie
For those of us that are not as technical, what does this mean in terms of an ETA?
Sorry you've got the weight of the world on your shoulders getting things up and running! Please make sure you take your time getting things back in order - we don't need burnt out admins!
Keep up the great work - the majority of us are all happy to wait while things get back in order. Take care Aldie & team :)
BGG isn't a Mom & Pop operation anymore and hasn't been for years. Its "Aw, shucks, folks" attitude doesn't reflect the reality of its use by the users or owners. From IT to the Admins (apologies), BGG needs to grow out of the buddy system to something resembling a social network in the teens of the 21st Century. The only reason this downtime is acceptable is there's really nowhere else to go. That's part of the problem. If there was somewhere else to go, this wouldn't happen. Am I angry? You bet. I can't think of anywhere else on the internet that this would be acceptable.