I am going to attempt to describe as much of the process for option 4 as I can below, because this issue has come up before, and I'd like to try to write down as much as I have figured out for resolving it. However:
So: For option 4, we are going to have to progress one table at a time as we review your data and try to properly update it to the correct culture.
Initial steps and database schema overview
First things first: please make a backup of your database! We will be attempting to make changes directly in the MySQL database, so we want to make sure that if we get anything wrong, you have a backup you can load.
All of these steps will require access to the MySQL command prompt. See the following link for instructions. Additionally, you will need to know the database username and password set during installation - but the link below also includes ways to check this information if you don't know or remember it:
Now, to backup the database, see:
Here is where things will get more complicated. An archival description is spread across a number of different tables in AtoM's database. If you would like a visual to accompany the following explanation, we maintain Entity Relationship Diagrams that describe AtoM's database schema on our wiki, here:
Below I will try to give a summary of the various tables we will need to check and update, before providing some initial instructions on how to do so. I've also previously given a high-level overview of AtoM's data schema in this thread:
First, to handle any free-text field that can be translated, there are i18n tables (this is short for "internationalization" - bc in english this word has 18 characters, so developers regularly shorten it to i18n) for each major entity. So for example, non-translatable archival description fields (like the identifier, the internal object ID, relations to other entities and lower-level descriptions, etc) are stored in the main information_object table. Most free-text fields that can be translated (like the title, the scope and content, the extent and medium, etc) are found in the related information_object_i18n table. So, the first place we'll need to check and update the source_culture values will be the information_object_i18n table. However, there are other places we will need to consider as well.
Next, it's helpful to know that AtoM's data model was originally built around the ICA standards - meaning that fields from other standards, or AtoM-specific fields that do not map exactly to ISAD(G) for example, will go into separate tables. The main ones we use for this type of overflow are the property tables - property and property_i18n.
Additionally, there are separate notes tables in AtoM, so even some ISAD(G) note types will be there, rather than in the main information object tables, such as Archivist's Note, General Note, etc. So note_i18n will be another table to check and update.
Finally, there are the related entities to consider. For example, every time you add a subject, place, or genre access point, this is creating a (or linking to an existing) term in one of AtoM's taxonomies. Every time you add a creator or a name access point, AtoM will create or link to an authority record. The archival institution name will create or link to a repository record. There are physical storage containers, accession records, functions, rights statements and rightsholder records, and more to consider. All of these related entities work similarly to the information object tables - there will be a base table for fields that do not require translation, and then an i18n table. There's even a relation and relation_i18n table that is used for tracking the connections between some records - for example, when relating two different authority records together. It will be good long-term to check all of these entities as well, as many new stub records can be created in AtoM during the description creation process, whether via the user interface or via import.
Now, because AtoM is a multilingual application that can support multiple translations per record, all of these tables should have a column that defines the culture for any given row. I think this is generally called source_culture across the base tables, and just culture in the i18n tables so that's what we'll look for first.
Updating the main entities
In each table, we are going to want to:
- In the base entity, perform a count of how many records have the source_culture value set to en - this step is optional, but gives us a sense of how many records are affected, so we can check that the update queries are working as expected
- Select these records, and update the source_culture value from en to es in the selected records
- Check the related i18n table's culture values for en records
- Update those to es as well
Now, if you are (or have access to) a developer, it would be possible to craft a script that does everything we are going to do below, across all of AtoM's tables. I do not have such skills, so I will describe the manual process!
Let's start with the information object records - the main part of your archival descriptions affected by this issue. Fortunately, we already have an example of this query in our documentation we can use, at the start of this section. We will update the selected culture in that example query to English, like so:
- SELECT COUNT(*) FROM information_object WHERE source_culture='en';
That should tell us how many descriptions are affected.
Now we will try to update them:
UPDATE information_object io
SET io.source_culture TO "es";
You can then run the COUNT query again to see if it worked as expected. There shouldn't be any en records left! Next, we can try to do the same for the i18n table.
- SELECT COUNT(*) FROM information_object_i18n WHERE culture='en';
Then:
UPDATE information_object_i18n io18
SET io18.culture TO "es";
Essentially, from here, we are going to need to repeat this with pretty much ALL the main entity tables in AtoM! So, you will need the most recent Entity Relationship Diagram as a guide (
the most recent one is a searchable PDF), because while I will list more examples below, it will be up to you to work out how to update the remaining tables. I would say that you should do this next for
- The actor and actor_i18n tables
- The repository and repository_i18n tables
Access points
Now, this should address *most* of the translation issues in your site. Further changes are going to get more complicated, and we will need to be more careful as we proceed, and/or make some compromises.
First, let's do the easiest part of the hard part, haha - by which I mean, terms in taxonomies. AtoM does include some default (English) terms that are loaded as fixtures during installation. These terms we want to keep, because they are often the controlled values straight from the related standards templates, used in drop-down in edit pages. Consequently, we will only want to update some taxonomies.
So, let's start with the access points, because for Subjects and Places at least, these taxonomies arrive empty when AtoM is first installed - meaning the only terms in there should be Spanish ones you have added.
We will need to know the taxonomy ID for this, but fortunately, it's a stable number in our docs already:
- Subjects taxonomy ID: 35
- Places taxonomy ID: 42
You can actually see an output of all taxonomy names and their IDs with:
So, let's update the Subject terms. First, we will do the base term table, using the taxonomy_id as our limiting clause on the select:
UPDATE term t
SET t.source_culture TO 'en'
WHERE t.taxonomy_id='35';
Now we will try the same on the term_i18n table, but we will need to join it with the term table to use the same limiting WHERE clause:
UPDATE term_i18n t18
SET t18.culture TO 'en'
WHERE t.taxonomy_id='35';
We can then repeat those two steps for Places, by swapping in 42 instead of 35 in the two example queries above. You can look at the output of available taxonomies, and consider if there are others that do not have default English terms - if so, and you have used them, then you can also repeat the update for these. For example: if you've added Actor Occupation access points (taxonomy ID: 80). If you're not sure, don't update it! Use the user interface instead - navigate to Manage > Taxonomies, find the right taxonomy, and look at the terms. You can always just flip the user interface to Spanish, enter edit mode, and add a Spanish translation if you're not sure.
Notes
Now we are getting into the much more experimental areas! Proceed with caution! And note that I have not tested all of these queries, so if the previous steps worked, make another SQLdump backup of your database now, so we can roll back if something goes wrong!
Let's consider the note tables next. Here's a quick tip: you can always use the DESCRIBE option to get a sense of what fields are in a table:
mysql> DESCRIBE note_i18n;
+---------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------+-------------+------+-----+---------+-------+
| content | text | YES | | NULL | |
| id | int | NO | PRI | NULL | |
| culture | varchar(16) | NO | PRI | NULL | |
+---------+-------------+------+-----+---------+-------+
3 rows in set (0.00 sec)
mysql> DESCRIBE note;
+----------------+---------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+---------------+------+-----+---------+----------------+
| object_id | int | NO | MUL | NULL | |
| type_id | int | YES | MUL | NULL | |
| scope | varchar(1024) | YES | | NULL | |
| user_id | int | YES | MUL | NULL | |
| source_culture | varchar(16) | NO | | NULL | |
| id | int | NO | PRI | NULL | auto_increment |
| serial_number | int | NO | | 0 | |
+----------------+---------------+------+-----+---------+----------------+
7 rows in set (0.00 sec)
We can run a count of English rows in these note tables with:
- SELECT COUNT(*) FROM note WHERE source_culture='en';
- SELECT COUNT(*) FROM note_i18n WHERE culture='en';
Here is where we need to pause.
Now, note tables will potentially have some of your archival description content - as I mentioned earlier, even ISAD(g) fields like Archivists Notes, or General Notes, etc, can appear in this table. However, we don't want to bulk update ALL notes, because they are also used for other entities, such as the Scope notes on terms. You would think that in an all-Spanish site there should be no issue bulk updating all of them.... but don't forget: AtoM includes a number of (English) default taxonomy terms when installed. And many of these affect how the underlying system works, and/or control the available values in standards-based templates, including ISAD(G), so we don't want to bulk update everything!
What follows below are just some examples - you may need to work out your own ways to proceed, since I haven't tried all of this myself.
Here's one way (I think I might have found a better way below, but some of this might still be useful so I will leave it):
Method 1 - note by note
We can try looking at some results from the note_i18n table, outputting 50 records at a time, and then skipping 50 in the next query to output 50 more. Perhaps we can even join with the note table and limit this to those affecting descriptions:
SELECT content,object_id FROM note_i18n no18
WHERE no18.culture='en' AND no.scope='QubitInformationObject'
LIMIT 50;
This would output only the actual note content, and the related description's object ID number for the first 50 results. If you wanted to skip the first 50 results and output the next 50, we could do it like so:
SELECT content,object_id FROM note_i18n no18
WHERE no18.culture='en' AND no.scope='QubitInformationObject'
LIMIT 50, 50;
If we see notes that we know belong to our descriptions (and therefore need updating) we can try to use the related object as a select, to update that record:
UPDATE note_i18n no18
SET no18.culture to 'en'
WHERE no.object_id='12345';
Only, replace the 12345 value with the actual object_id value you saw in the output from the query above. Then we repeat this but just on the note table (also replacing the 122345 placeholder with the actual object ID value:
UPDATE note no
SET no.source_culture to 'en'
WHERE no.object_id='12345
As you can imagine, this is going to be time consuming, unfortunately! So, let's try looking at this another way...
The (hopefully) better way to update notes
If we look in our docs, there is an example query that will output the IDs of the different note types. That might allow us to better restrict our updates to just those note types that we know are related to our descriptions. For example, if you are using the ISAD(g) description template, then really we just need to worry about General note and Archivists note, possibly the Language note. See the example query here:
So, General note has an ID of 125. Let's try crafting a query that will limit our selection to those notes used as General notes:
UPDATE note_i18n no18
SET no18.culture to 'en'
WHERE no18.culture='en' AND no.type_id='125'
UPDATE note no
SET no.source_culture to 'en'
WHERE AND no.type_id='125'
We could repeat this for Archivist's notes by change the type_id value in the queries above to 124.
Language notes will appear in other entities, but I think these should be the main entities without any default values, so you should be able to bulk update those notes as well.
For the source notes and scope notes of taxonomy terms:
- If you have added this info to access point taxonomies (like places and subjects) then we will need to deal with those. In that case, you will need to see if you can craft a way to join the term tables to the note tables (likely via the object table, which connects pretty much everything), so we can try to selectively ONLY update scope and source notes that affect the access point taxonomies we already updated. Let me know if this is in fact an issue, and I can try my best to help crafting something
- If you haven't added scope /source notes to your access point entries, then you can probably ignore those :D
Other tables
There may still be other tables we need to consider, such as:
- Physical storage containers that were created and linked to your descriptions
- Accession and Deaccession records, and their related Donors
- The property and relation tables
- There is an other_names_i18n table that might be used by any of your authority records - fortunately, that one should be simple to update
- etc
For cases where you are not sure how to proceed, one easier alternative option will be to:
- Make sure the user interface is set to Spanish via the language menu
- Navigate to the English-culture records you want to address and enter edit mode
- Add a "translation" in the Spanish interface, and save
At least this way, they will properly appear when you are searching and browsing in Spanish! And you can potentially correct the "English" versions later.
Hopefully, you do not have too much mixed-language content in these areas, and hopefully the information included in this long message will help you understand how you can at least start investigating!
Remember, back up often, proceed carefully, and let us know how it goes
After making updates
When you are done making updates via SQL:
First, do NOT get rid of your backups just yet. Make sure the system works as expected for the next week at minimum, to confirm there not any unexpected low-level issues caused by the changes that take some time to appear. In general it is always wise to take regular backups of your production site.
You will also need to run some of AtoM's maintenance tasks before using the system again. I would recommend basically running all of the most common ones, such as:
- Rebuild the nested set
- Generate slugs
- Clear the application cache
- Restart PHP-FPM
- Re-populate the search index
You will find links and further instructions for all of these tasks from our Troubleshooting page:
Good luck!