Hi Elissa,
AtoM's SKOS support is pretty basic at the moment. Not all SKOS elements and relationships are supported (for example, hidden labels, semantic relationships, mapping properties, examples, history/editorial/change notes, integrity conditions), and if the vocabulary mixes in elements from other RDF vocabularies, AtoM may not know what to do with them. Additionally, while SKOS can support multi-hierarchies (a term with more than one broader term), AtoM taxonomies currently cannot.
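If it helps to see what is actually in your file before importing, here is a rough Python sketch that counts the elements most likely to cause trouble. It is not part of AtoM, and it makes several assumptions: that the file is RDF/XML with concepts written as skos:Concept elements, that the FLAGGED set below (my own mapping of the categories above to SKOS property names) is a reasonable stand-in for "unsupported", and that a term with more than one skos:broader is a multi-hierarchy case. The names audit_skos, split_tag and FLAGGED are made up for this example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Assumption: SKOS property names matching the categories mentioned above.
# This list is illustrative, not an official statement of what AtoM supports.
FLAGGED = {
    "hiddenLabel", "example", "historyNote", "editorialNote", "changeNote",
    "mappingRelation", "exactMatch", "closeMatch", "broadMatch",
    "narrowMatch", "relatedMatch",
}

def split_tag(tag):
    """Split an ElementTree tag of the form '{namespace}local'."""
    if tag.startswith("{"):
        ns, _, local = tag[1:].partition("}")
        return ns, local
    return "", tag

def audit_skos(xml_text):
    """Return (flagged SKOS elements, non-SKOS elements, multi-parent URIs)."""
    root = ET.fromstring(xml_text)
    flagged, foreign, multi_parent = Counter(), Counter(), []
    for concept in root.iter(f"{{{SKOS}}}Concept"):
        broader_count = 0
        for child in concept:
            ns, local = split_tag(child.tag)
            if ns == SKOS:
                if local in FLAGGED:
                    flagged[local] += 1
                if local == "broader":
                    broader_count += 1
            elif ns != RDF:
                # Element from some other vocabulary mixed into the concept
                foreign[f"{ns}{local}"] += 1
        if broader_count > 1:
            multi_parent.append(concept.get(f"{{{RDF}}}about"))
    return flagged, foreign, multi_parent

# Tiny made-up sample, only to show the shape of the result
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <skos:Concept rdf:about="http://example.org/terms/cats">
    <skos:prefLabel xml:lang="en">Cats</skos:prefLabel>
    <skos:hiddenLabel xml:lang="en">Catz</skos:hiddenLabel>
    <skos:broader rdf:resource="http://example.org/terms/pets"/>
    <skos:broader rdf:resource="http://example.org/terms/mammals"/>
    <skos:exactMatch rdf:resource="http://example.com/other/cats"/>
    <dc:creator>Someone</dc:creator>
  </skos:Concept>
</rdf:RDF>"""

if __name__ == "__main__":
    flagged, foreign, multi_parent = audit_skos(SAMPLE)
    print("Flagged SKOS elements:", dict(flagged))
    print("Non-SKOS elements:", dict(foreign))
    print("Terms with more than one broader term:", multi_parent)
```

For a real file you could pass open("your-file.rdf", "rb").read() to audit_skos (the filename is a placeholder). Files that describe concepts with rdf:Description plus rdf:type instead of skos:Concept would need a different loop.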
Without looking at your SKOS file I can't say for sure why some relationships are duplicated, but I suspect it has to do with how AtoM parses the file. AtoM expects all required information to be in the SKOS file itself - for example, it won't follow URL links mid-file to find the proper label for a related term.
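One way to test that suspicion is to look for relationships that point at terms the file never defines. The sketch below is only an illustration under a few assumptions: the file is RDF/XML, concepts are skos:Concept elements identified by rdf:about, and relationships are skos:broader / skos:narrower / skos:related elements with an rdf:resource attribute. The function name find_unresolved is made up for this example, and I can't promise that fixing everything it reports will fix the duplicates.

```python
import xml.etree.ElementTree as ET

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RELATIONS = ("broader", "narrower", "related")

def find_unresolved(xml_text):
    """List (source URI, relation, target URI) where the target has no
    labelled skos:Concept in the same file."""
    root = ET.fromstring(xml_text)
    concepts = list(root.iter(f"{{{SKOS}}}Concept"))

    # URIs of concepts that are defined in this file and carry a prefLabel
    labelled = set()
    for concept in concepts:
        uri = concept.get(f"{{{RDF}}}about")
        if uri and concept.find(f"{{{SKOS}}}prefLabel") is not None:
            labelled.add(uri)

    problems = []
    for concept in concepts:
        source = concept.get(f"{{{RDF}}}about")
        for rel in RELATIONS:
            for link in concept.findall(f"{{{SKOS}}}{rel}"):
                target = link.get(f"{{{RDF}}}resource")
                if target is not None and target not in labelled:
                    problems.append((source, rel, target))
    return problems

# Made-up sample: "pets" is defined in the file, "dogs" is only linked to
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/terms/pets">
    <skos:prefLabel xml:lang="en">Pets</skos:prefLabel>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/terms/cats">
    <skos:prefLabel xml:lang="en">Cats</skos:prefLabel>
    <skos:broader rdf:resource="http://example.org/terms/pets"/>
    <skos:related rdf:resource="http://example.org/terms/dogs"/>
  </skos:Concept>
</rdf:RDF>"""

if __name__ == "__main__":
    for source, rel, target in find_unresolved(SAMPLE):
        print(f"{source} --{rel}--> {target} (no labelled concept in file)")
```

Anything this reports is a term whose label lives somewhere else, so those are the places where you would need to add a full concept entry to the file by hand.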
In short, you may need to make some local modifications to the SKOS file to get it to import and, depending on its contents, you may also need to manually add back some information if it currently uses unsupported elements. The best way to approach this would likely be to make a simple test taxonomy of related terms in AtoM, export it, and take a look at the results. You can then use the export as a reference for what AtoM expects and supports when reviewing the SKOS file itself.
One other limitation to keep in mind: currently there's no way in AtoM to export all terms in a taxonomy unless they share a common parent (sibling top-level terms, for example, can't be exported together). So for the purposes of this experiment, make sure all terms in your test hierarchy descend from a single parent term, so you can export from that term and get all the descendants. For example, in the Subjects taxonomy, you might want to make a top-level "Subjects" term first, and then add your test terms underneath it.
Finally, it may be worth considering exactly what problem you are trying to solve by importing 20K terms into AtoM, and whether there might be better ways to solve it. We did a client project in the past where we imported all of the Library of Congress Subject Headings terms into an AtoM instance... and then a year later, we did another project to remove most of them. It turns out that having hundreds of thousands of terms in AtoM was not a good experience for staff or for end users. For example:
- Having so many terms had an impact on performance, making some pages load more slowly.
- Some terms with many relationships (both to other terms and to descriptions) could not be edited, moved, or deleted via the user interface, because the web browser would hit its timeout limit before the operation could complete and all related resources could be updated.
- Staff had trouble using the autocompletes to find the desired terms because there were so many available options, many of which were not intuitively the first thing a user would search for. As such, having so many controlled vocabulary terms available didn't necessarily help with consistency of use.
- AtoM doesn't have an option on taxonomy browse pages to sort or filter by relationships, so end users trying to browse Subjects would see pages and pages of terms with no actual links to descriptions. In the end, this made subject-based discovery essentially impossible for end users.
- etc
So again, it's worth asking what the real problem is that you're trying to solve. As an example: perhaps the problem is that staff creating terms on the fly leads to inconsistent usage and hurts end-user discovery. In that case, adding 20K terms to choose from may mean that access points are still applied inconsistently, and the proposed solution may create additional problems of its own. If you want your staff to use a small set of standardized controlled vocabulary terms for consistency, better discovery, etc., then selecting a subset of the target vocabulary and creating those terms manually in AtoM's user interface might mean more upfront work, but a better end result. In such a case, you can still use the target vocabulary as your source, and in fact use the sourceNote field to provide a link directly to the reference term if desired (meaning you're still using the controlled vocabulary, just being selective about which terms you add).
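If you go the subset route, a small script can pull the label and URI for each term you've chosen, so you have the link ready to paste into the source note when you create the term. This is just a sketch under the same assumptions as any RDF/XML SKOS file with skos:Concept elements and language-tagged skos:prefLabel values. The function name pick_terms and the sample data are made up, and it does not talk to AtoM at all.

```python
import xml.etree.ElementTree as ET

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def pick_terms(xml_text, wanted_labels, lang="en"):
    """Return sorted (label, URI) pairs for concepts whose preferred label
    in the given language matches one of wanted_labels (case-insensitive)."""
    root = ET.fromstring(xml_text)
    wanted = {label.casefold() for label in wanted_labels}
    rows = []
    for concept in root.iter(f"{{{SKOS}}}Concept"):
        uri = concept.get(f"{{{RDF}}}about")
        for label in concept.findall(f"{{{SKOS}}}prefLabel"):
            text = (label.text or "").strip()
            if label.get(XML_LANG) == lang and text.casefold() in wanted:
                rows.append((text, uri))
    return sorted(rows)

# Made-up sample vocabulary with two concepts
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/terms/cats">
    <skos:prefLabel xml:lang="en">Cats</skos:prefLabel>
    <skos:prefLabel xml:lang="fr">Chats</skos:prefLabel>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/terms/dogs">
    <skos:prefLabel xml:lang="en">Dogs</skos:prefLabel>
  </skos:Concept>
</rdf:RDF>"""

if __name__ == "__main__":
    for label, uri in pick_terms(SAMPLE, ["cats"]):
        # Label to create in AtoM, URI to paste into the source note
        print(f"{label}\t{uri}")
```

The output is one tab-separated line per selected term, which is easy to keep in a spreadsheet while staff create the terms by hand.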
That's just an example, but revisiting the actual problem and trying to think of different ways to solve it may help you approach this issue from another angle, and uncover unexpected solutions.
These are probably not the answers you were hoping for, but I do hope they help you find a workaround. Good luck!