Please don't. Publishing involves more than changing one field in a database. When a dataset is published, the following things happen:
- Records in the database change for each file as well, marking the files as published too.
- Dataverse reaches out to a DOI or Handle server and changes the state of the DOI or Handle.
- Private URLs are deleted.
- The dataset is reindexed into Solr, and draft versions are deleted from Solr.
- And on and on.
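If the goal is to publish without clicking through the web interface, asking Dataverse itself to publish (so that all of the steps above still run) is safer than editing the database. Here is a minimal sketch in Python that only builds the request and does not send it. It assumes your installation exposes the native API publish endpoint and accepts an API token in the X-Dataverse-key header; the function name, server URL, DOI, and token are made up for illustration, so please check the API guide for your version first.

```python
import urllib.parse
import urllib.request


def build_publish_request(base_url, persistent_id, api_token, release_type="major"):
    """Build (but do not send) a request asking Dataverse to publish a dataset.

    Assumption: the installation supports
    POST /api/datasets/:persistentId/actions/:publish?persistentId=...&type=...
    Verify this against the API guide for your Dataverse version.
    """
    if release_type not in ("major", "minor"):
        raise ValueError("release_type must be 'major' or 'minor'")
    # Percent-encode the persistent identifier so it is safe in a query string.
    query = urllib.parse.urlencode(
        {"persistentId": persistent_id, "type": release_type}
    )
    url = f"{base_url.rstrip('/')}/api/datasets/:persistentId/actions/:publish?{query}"
    # The API token travels in a header (assumed name: X-Dataverse-key).
    return urllib.request.Request(
        url, method="POST", headers={"X-Dataverse-key": api_token}
    )


# Hypothetical values, for illustration only.
request = build_publish_request(
    "https://dataverse.example.edu", "doi:10.5072/FK2/EXAMPLE", "my-api-token"
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would actually send it, so try that against a test installation before a production one.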
What problem are you trying to solve? :)
Thanks!
Phil
p.s. The reason it says RELEASED rather than PUBLISHED in the database is that in DVN 3 we would say you release a study (in Dataverse 4 we call a study a dataset). This is like releasing software, I guess. We decided that publishing makes more sense than releasing. Here's something I wrote internally on 2014-04-17 in the middle of the DVN 3 to Dataverse 4 rewrite in a thread called "Public, Private, Published, Unpublished, Release!":
Hmm, we straddle the publishing world and the software world, don't we? We adopt terms from both, which leads to the confusion.
In software we say:
- We'll *release* a new *version* next month
- What *version* are you running?
- We removed the *version* with a security flaw from our download site
- Someone *forked* *version* 2.3 and *released* it under a new name
In publishing, (I think) people say:
- A second *edition* was *published* in January
- I'm reviewing a *draft* of a manuscript
- It's an *unpublished* work
- The work has been *remixed* under the same license
My questions:
1. If datasets are "published" should they have "editions" rather than "versions"?
2. Do we have the concept of *remixing* a dataset? Can you *fork* a dataset?
3. Should we have a special term for a dataset that has never been published/released? We use the word "draft" for this but a draft also applies to an upcoming version of a released/published dataset.
4. Should we continue to borrow terms from both the publishing world and the software world or should we try to standardize on the terminology from one of these worlds?
Whatever we do, let's put a glossary of the terms we use in a guide.
Phil
p.s. I don't really think we should use the term "editions"... I'm just trying to get people thinking. I think people tend to think of datasets more in terms of software than publishing. For example,
http://dataprotocols.org/data-packages/#recommended-fields says "version - a version string identifying the version of the package. It should conform to the Semantic Versioning requirements (http://semver.org)."
Semantic Versioning comes very much from the software world, not the publishing world.