Simplest way to clear out all records in a database

260 views
Skip to first unread message

Martin Holmes

unread,
May 11, 2015, 1:06:26 PM5/11/15
to ica-ato...@googlegroups.com
We've been testing and tweaking our setup with CSV imports, and I now need to remove all the test data that's been inserted. Specifically, I'd like to remove 1054 Archival Descriptions. I can't see any way to do this in the admin interface. What's the best way to achieve this?

All help appreciated,
Martin

Dan Gillean

unread,
May 11, 2015, 1:28:24 PM5/11/15
to ica-ato...@googlegroups.com
Hi Martin,

If you want to clear ALL data (including user accounts, etc) out of the database, and restore it to the state it was in shortly after installing, then I suggest you try out the following command-line tool:

Run the task from the root directory of your AtoM installation, using the command-line. After running the task, you will be prompted to enter a new site name, description, and to create an initial administrators account, so you can log into the application.

I suggest that you also re-index the site after the purge, so the default taxonomy terms are properly indexed. 

Note that there *may* still be items left in your uploads directory.

If you want to remove just the archival descriptions and any related records (e.g. linked terms, or authority records, etc), there is no easy way to do this via the user interface, or the command-line. Instead, I think you will have to use SQL to delete the descriptions. Be sure to back up your database first, if there is data you want to keep! 

If you are comfortable working with SQL, you could take a look at the database, and craft a SQL query that will delete all descriptions - be sure the query touches all related tables, such as the i18n and slug tables, etc.

If not, you might try the following:

1) We have an example SQL query that will allow you to update all draft descriptions to published. You could simply reverse the parameters of the query, to make sure all descriptions are changed to draft:

  • UPDATE status SET status_id=158 WHERE type_id=160 AND object_id <> 1;

2) Now, you should be able to use the command-line task to delete all draft descriptions:

Please note, this will NOT delete other related entity types, like authority records, archival institutions, or terms created as a result of your descriptions.


Let us know if that helps!


Cheers,



Dan Gillean, MAS, MLIS
AtoM Product Manager / Systems Analyst,
Artefactual Systems, Inc.
604-527-2056
@accesstomemory

--
You received this message because you are subscribed to the Google Groups "ICA-AtoM Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ica-atom-user...@googlegroups.com.
To post to this group, send email to ica-ato...@googlegroups.com.
Visit this group at http://groups.google.com/group/ica-atom-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/ica-atom-users/9d166044-316c-43ad-aac8-f7438f3673bc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Martin Holmes

unread,
May 11, 2015, 2:40:36 PM5/11/15
to ica-ato...@googlegroups.com
Hi Dan,

I'll ask our sysadmin to do this. It seems there's quite a lot of bulk work that can only be achieved at the command line in AtoM; I'm beginning to think I'll need to arrange command-line access with our sysadmins, otherwise I'll be bothering them frequently. Are there plans to add features to the admin web interface to handle this sort of thing, or would you expect anyone administering a database to have CLI access?

Cheers,
Martin

Dan Gillean

unread,
May 11, 2015, 4:44:09 PM5/11/15
to ica-ato...@googlegroups.com
Hi Martin,


The main challenge with doing things through the user interface is that the request needs to be sent via the browser, and browsers are subject to timeout limits on requests. For large-scale and/or long-running tasks, they need to be refactored so that they can be handled asynchronously in the background, thereby sidestepping the reliance on the browser. For this to happen, you need something on the server side that can handle the task request, and queue tasks when multiple requests are received.

In AtoM 2.2, our upcoming release, we've added just this - a job scheduler that can perform tasks asynchronously in the background, as well as a jobs page so that administrators can see the status of job requests. See the following issue ticket for more information (I've added some details and a few screenshots to the issue ticket, as well as a link to the documentation I've added about this new feature in our 2.2 documentation branch, which will become public when the release is out):

For now, the job scheduler is only being used for 2 new features - the ability to generate finding aids from hierarchical descriptions, and the ability to make PREMIS rights actionable on digital object and manage rights inheritance in lower-level descriptions.

As you may know, AtoM is a community-driven open source project, and we rely on community code contributions, and development sponsorship to be able to include new features in our releases. Basically, we at Artefactual release AtoM free for anyone to use, download, study, modify, distribute, etc, and also we offer our extensive documentation for free, as well as the support you'll find here in the user forum. To help sustain the project, Artefactual also offers paid value-added services for those who wish to engage them  - hosting, customization, data migration, training, consultation services, theming, and of course, custom development. We always aim to generalize our custom development so it can be included in a public release - this way, everyone in the community benefits from every development that any one institution sponsors, and the sponsoring institution does not need to worry about maintaining the customization - if it is included in a public release, Artefactual will maintain the feature in future releases. This model is sometimes called the bounty model in open source communities, and it allows us to offer AtoM, the docs, and the User Forum without any membership requirements or additional fees.

With each release, we try to freely tackle some of the bugs we find in the application - but feature development requires development sponsorship for Artefactual to be able to take it on. Now that we have the infrastructure in place for the job scheduler (sponsored as part of the finding aids development work that will be included in AtoM 2.2), it will become easier over time to perform further development and move more tasks to the job scheduler that currently risk timing out the system - bulk operations such as move, edit, delete, duplicate, or change publication status, for example, or bulk import and export via the user interface - but they still require someone willing to either sponsor the work with us, or hire their own developer and then hopefully share the code with the community so that we can incorporate it in a future public release.

It's likely that some tasks, intended for system administration and rarely needed when AtoM is used in production, will never be prioritized for inclusion in this way. While we've tried to make the user interface as easy to use as possible, installation and maintenance does require command-line access and comfort.

We often recommend that users have multiple versions of AtoM - it's good to have a test or development version of AtoM for internal use, so you can test things out before doing them in production. I'd recommend that if you don't have an internal test site in place, that you talk to your IT department about setting one up, and see if you can get CLI access to it, so you can familiarize yourself with our command-line tools and tasks:

Based on my experience offering support in this forum, I'd say that 90-95% of problems that users encounter can be solved with the tools I've documented on the above page, as well as clearing the cache, checking the logs, using debug mode, and re-indexing.

Cheers,



Dan Gillean, MAS, MLIS
AtoM Product Manager / Systems Analyst,
Artefactual Systems, Inc.
604-527-2056
@accesstomemory

--
You received this message because you are subscribed to the Google Groups "ICA-AtoM Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ica-atom-user...@googlegroups.com.
To post to this group, send email to ica-ato...@googlegroups.com.
Visit this group at http://groups.google.com/group/ica-atom-users.
Reply all
Reply to author
Forward
0 new messages