How to manage a large taxonomy

Jon Ku

Jul 18, 2024, 7:49:24 PM
to dotCMS User Group
We're a new, licensed dotCMS managed site, migrating from a bespoke knowledge base.

So far we have successfully tested the following:
  • WebDAV to manage files/folders
  • dotCLI running
  • Webhooks to load content (for a future batch load of existing content)
  • Prototype Taxonomy and Content page content types
  • A single-page application demonstrating hierarchical menus based on the Taxonomy content type with a parent Relationship
  • Content page content type attached to the taxonomies with another Relationship

QUESTION
How can we use native tools to let authors and editors easily select a Taxonomy (used for menus and breadcrumbs) when there are 1,000 different taxonomies?

This also requires that certain branches and certain leaf pages be attached to multiple parts of the hierarchy, and that some taxonomies share the same name (Billing is a common one). This seems to rule out Categories and Folders.

We are exploring Relationships as the best approach, having created a Taxonomy content object that has both parent and child relationships for navigation, as well as relating content to be found on those menus.

ISSUE
With all 1,000 taxonomies loaded, using the Related selector panel will be unwieldy at best. Search will help, however our current users are comfortable with navigating a tree structure UI similar to Folders in dotCMS.

SOLUTION
Is there a solution in place to manage this type of taxonomy using Relationships?

We speculate that creating a parallel folder structure to the taxonomy would do the following:
  • Use Relationship taxonomy contentlets on the front end to create menus, including duplicate branches.
  • Each taxonomy has a matching folder with the same title, and the folder contains that single taxonomy.
  • Users of the Relate panel can navigate the folders with the Sites/Folder selector widget on the left-hand side, and click on the resulting taxonomy once they reach it.

BIG QUESTION
Can a vtl actionlet or other workflow action generate a new folder and set its parent when the taxonomy contentlet is saved? I don't see that referenced.

Otherwise, is it realistic to run a separate batch process that traverses the taxonomy relationship tree and creates a matching folder for each taxonomy, and could that succeed if triggered when a new or edited taxonomy contentlet is saved?
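
As a rough illustration of the traversal described above, the sketch below computes the folder path (or paths) each taxonomy would need, given its parent relationships. The record shape (`id`, `title`, `parents`) and the sample data are hypothetical and not taken from dotCMS; the code only computes paths and does not create folders.

```javascript
// Hypothetical records: each taxonomy has an identifier, a title, and zero or
// more parent identifiers (as they might come from the parent Relationship).
const taxonomies = [
  { id: "t1", title: "Accounts", parents: [] },
  { id: "t2", title: "Support", parents: [] },
  { id: "t3", title: "Billing", parents: ["t1"] },
  { id: "t4", title: "Billing", parents: ["t2"] },
  { id: "t5", title: "Refunds", parents: ["t3", "t4"] },
];

// Return every root-to-node folder path for one taxonomy. A node with
// several parents yields several paths. The `seen` set guards against cycles.
function folderPaths(id, byId, seen = new Set()) {
  const node = byId.get(id);
  if (!node || seen.has(id)) return [];
  if (node.parents.length === 0) return ["/" + node.title];
  const next = new Set(seen).add(id);
  return node.parents.flatMap((parentId) =>
    folderPaths(parentId, byId, next).map((p) => p + "/" + node.title)
  );
}

// Index by identifier, not title, because titles such as Billing repeat.
const byId = new Map(taxonomies.map((t) => [t.id, t]));
for (const t of taxonomies) {
  console.log(t.id, folderPaths(t.id, byId));
}
```

In this sample, the two Billing taxonomies resolve to different folders because they sit under different parents, and Refunds resolves to two paths because it is attached to both.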

Note: I'm using my personal Google account for this Google Group; for some reason my work email can't find it.

Mark Pitely

Jul 19, 2024, 10:18:16 AM
to dot...@googlegroups.com
It is fairly easy to write some HTML/VTL/JavaScript that interacts with the API to 'reproduce' whatever the system does. That is, build a back-end-like app/page that hides the complexity from the user. It pulls the data they need, filters the options available to them, and gives them a form to update.
Keep the real backend for the developers. I don't know if you can create folders via the API, but I would be surprised if you couldn't. You could also just create a structure that provides 90% of what a folder would offer with 5 times the flexibility. (I'm clearly not a fan of folders at this stage; they are more of a legacy concept.)

Here's an example concept:

This is a tool that allows people to create new pages and add them to the navigation tree. (I have disabled the ability to actually do anything for the time being, and normally the page is password-protected.)
They cannot create new top-level nav (that is one place where I restrict them).
If you click new, it gives you a form, and if you complete it, it extracts the data from the old site (that's the 'Verify' button, which does a lot of work) and then builds a new page in dotCMS with the old content along the correct path.
This all uses a URLMap to make the folder structure.
This lets non-admin users help build out a site without really understanding what they are doing, while I control what options they have and automate the actual cut-and-paste work. This sounds similar to what you need. It's a tool with training wheels.
The code will follow.

Hope this helps!

Mark Pitely
Albright College

<style>
.openstar::before{ content: '\2606';}
.closedstar::before{ content: '\2605';}    
   
</style>

<div style="padding-top:20px"></div>
#set ($pull=$dotcontent.pull("+contentType:Navigation",0,"Navigation.title"))
#foreach($con in $pull)
#if (!$con.parent)
<div  class="toplevel" data-title="$con.title" data-top="$!con.path" data-path="$con.path" data-identifier="$con.identifier" id="p${velocityCount}" style="display:inline-block;font-size:22px;margin-bottom:10px"><span class="star openstar" style="cursor:pointer" id="id-${con.path}" onclick="focuser('id-${con.path}','p${velocityCount}')"></span> <a style="width:500px;min-width:500px;display:inline-block" target="_blank" href="/${con.path}">$con.title</a><div style="width:40px;border:2px solid red;display:inline-block;cursor:pointer;padding:3px;margin-left:20px;font-size:14px;" onclick="adder('p${velocityCount}')">Add</div></div>


#foreach($cone in $dotcontent.pull("+contentType:Navigation ",0,"Navigation.title"))
#if ($cone.parent.get(0).title==$con.title)
<div class="first-child hidden" data-title="$cone.title" data-top="$cone.top" data-path="$cone.path" data-identifier="$cone.identifier" id="c${velocityCount}" style="padding-left:20px;display:inline-block;font-size:16px;margin-top:10px;margin-bottom:10px"><a style="width:400px;min-width:400px;display:inline-block" target="_blank" href="/${cone.top}/${cone.path}">$cone.title</a><div style="display:inline-block;width:40px;border:2px solid red;cursor:pointer;padding:3px;margin-left:20px;font-size:14px" onclick="adder('c${velocityCount}')">Add</div></div>

#end
#end



#end
##Of has parent

#end
##of pull







<form id="theform" style="margin-top:20px;margin-bottom:20px" class="hidden">
   <div style="float:left;font-size:18px;">Add Page to <span style="color:#a51e36;" id="adder"></span></div>
    <div style="height:10px;clear:both"></div>
  <div style="width:100px;float:left">New Page Title</div>
  <input type="text" id="title" name="title" style="float:left;font-size:16px;width:400px">
  <div style="height:10px;clear:both"></div>
  <div style="width:100px;float:left">New Page Name (for url)</div>
  <input style="float:left;font-size:16px;width:400px" type="text" id="path" name="path">
   <div style="height:10px;clear:both"></div>
   <div style="width:100px;float:left">Wordpress URL</div>
  <input style="float:left;font-size:16px;width:400px" type="text" id="wordpressurl" name="wordpressurl" > <div style="float:left;border:1px solid green;margin-left:5px;cursor:pointer" onclick="gethtml()">Verify</div>
   <div style="height:10px;clear:both"></div>
   <div style="width:100px;float:left">Top-level Path</div>
  <input style="float:left;font-size:16px;width:400px"  type="text" id="top" name="top">
   <div style="height:10px;clear:both"></div>
   <div style="width:100px;float:left">Notes/Comments</div>
  <input style="float:left;font-size:16px;width:400px"  type="text" id="notes" name="notes">
   <div style="height:10px;clear:both"></div>
  <input style="display:none" type="text" id="identifier" name="identifier">
 
 <input id="button" class="hidden" type="button" onclick="poster();" value="Submit" style="color:#000">
</form>




<script>



// Toggle focus on one top-level item: when focused, hide every other
// top-level item and show only the children under the focused one.
function focuser(starid, whatid) {
  var where = document.getElementById(whatid);
  var parents = document.getElementsByClassName("toplevel");
  var children = document.getElementsByClassName("first-child");
  var star = document.getElementById(starid);
  var i;

  if (!where.classList.contains("focus")) {
    star.classList.remove("openstar");
    star.classList.add("closedstar");
    where.classList.add("focus");
    for (i = 0; i < parents.length; i += 1) {
      parents[i].classList.add("hidden");
    }
    for (i = 0; i < children.length; i += 1) {
      children[i].classList.add("hidden");
      if (children[i].dataset.top == where.dataset.top) children[i].classList.remove("hidden");
    }
    where.classList.remove("hidden");
    return;
  } // of not focused

  // Already focused: restore the full top-level list and hide all children.
  star.classList.remove("closedstar");
  star.classList.add("openstar");
  where.classList.remove("focus");
  for (i = 0; i < parents.length; i += 1) {
    parents[i].classList.remove("hidden");
  }
  for (i = 0; i < children.length; i += 1) {
    children[i].classList.add("hidden");
  }
} // of function

function adder(whatid) {
  // Use local variables only: assigning to an undeclared `adder` here would
  // overwrite this function, so the second click on any Add button would fail.
  var what = document.getElementById(whatid);
  console.log("Adder", whatid, what);
  var title = what.dataset.title;
  var topt = what.dataset.top;
  var id = what.dataset.identifier;

  // Show the form and label it with the chosen parent's title.
  document.getElementById('theform').classList.remove("hidden");
  document.getElementById('adder').textContent = title;

  // Prefill the top-level path and the parent lookup query.
  document.getElementById('top').value = topt;
  document.getElementById('identifier').value = "+identifier:" + id;
}

function show_news(out){
   
    alert(out);
   
}



function gettoken(res){
 console.log(res);
 console.log(res.entity);
 console.log(res.entity.token);
document.token=res.entity.token;    
   
}


function gethtml() {
  var pather = document.getElementById('wordpressurl');
  var path = pather.value;
  var fullpath = 'https://www.albright.edu/wp-content/themes/albright2017/functions/ripper.php?url=' + path;

  fetch(fullpath)
    .then(function (response) {
      // When the page is loaded, convert it to text
      return response.text();
    })
    .then(function (text) {
      // Keep the fetched HTML for poster() and reveal the Submit button.
      document.html = text;
      console.log(text);
      pather.style.color = "green";
      document.getElementById('button').classList.remove('hidden');
    });
}

function finish(res){
    console.log(res);
    location.reload();
}


function poster(){
var formData = new FormData();


var top=document.getElementById('top').value;
var path=document.getElementById('path').value;
var title=document.getElementById('title').value;
var wordpressurl=document.getElementById('wordpressurl').value;
var identifier=document.getElementById('identifier').value;
var notes=document.getElementById('notes').value;

//wordpressurl='https://www.albright.edu/about-albright/about-reading-pa/';

//var html=document.getElementById('htmlcontent').value;

var dataObj={"contentlet":[{"title":title,"contentType":"navigation","top":top,"path":path,"wordpress":wordpressurl,
"parent":identifier,'html':document.html,'notes':notes}]};
   
   
var jsonout=JSON.stringify(dataObj);
console.log(jsonout);

let url = '/api/v1/workflow/actions/default/fire/PUBLISH';
fetch(url, {
  method: 'POST',
  headers: {
    'Accept': 'application/json, text/plain, */*',
    'Content-Type': 'application/json',
    "Authorization": "Bearer "+document.token
  },
  body: jsonout
}).then(res => res.json())
  .then(res => finish(res));
         
     

}

    let url = '/api/v1/authentication/api-token';  
fetch(url, {
  method: 'POST',
  headers: {
    'Accept': 'application/json, text/plain, */*',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({"user":"a...@albright.edu", "password":"XXXXXXX", "expirationDays": 10 })
}).then(res => res.json())
  .then(res => gettoken(res));


</script>





John Michael Thomas

Jul 19, 2024, 10:58:52 AM
to dotCMS User Group
It sounds like you've chosen to use a content type to represent the taxonomies in a tree structure, which is a time-tested approach many customers use.  It also sounds like the main challenge you have with that is how the relationships among those taxonomies are displayed - e.g. it would be easier in a tree structure than in a flat structure.

If I'm right about both of those, then I'll follow up on what Mark said, that you can use code to represent that however you want.  And if you want to keep that within the dotCMS back-end, you can use a custom field to display and select the taxonomies (you don't have to use an external app).

There are a couple of caveats to using custom fields for this that are worth knowing when deciding whether this will work for your needs.

1. The value of the custom field that gets stored in the content is a string.

So, once a user has selected the taxonomies, you'll need to represent that as a string.  And when you load a content item, the custom field will need to parse that string to show the already assigned taxonomy in whatever tree component you use.
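
A minimal sketch of that round trip, assuming the selection is stored as a comma-separated list of taxonomy identifiers. The format and the function names are illustrative assumptions, not a dotCMS convention. Identifiers are used rather than titles because titles such as Billing can repeat.

```javascript
// Turn the selected taxonomy identifiers into the string that the custom
// field would store. Blank entries and duplicates are dropped.
function serializeSelection(ids) {
  const cleaned = ids.map((id) => id.trim()).filter(Boolean);
  return [...new Set(cleaned)].join(",");
}

// Turn the stored string back into identifiers when the content is loaded,
// so the tree component can mark the already assigned taxonomies.
function parseSelection(stored) {
  if (!stored) return [];
  return stored.split(",").map((id) => id.trim()).filter(Boolean);
}

const stored = serializeSelection(["t3", "t5", "t3"]);
console.log(stored); // t3,t5
console.log(parseSelection(stored));
```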

2. When that string gets indexed, Elasticsearch tokenizes it.

By default, ES splits the value on white space and punctuation and discards those characters. So, for example, if you were storing the taxonomy "path" as something like "Billing/Department/BillingCode", it would strip out the slashes and index each of those labels separately, which would mean you couldn't search on the whole path. And that's probably not what you want.

But you can override how each individual field is tokenized using the esCustomMapping field variable.  So, you can set that field up to index the entire path as a single string, which will allow you to include slashes in your search, to pull content with taxonomies matching any part of a path.
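
To make the difference concrete, here is a simplified simulation in plain JavaScript. It does not call Elasticsearch and the two functions are only rough stand-ins for a word-splitting analyzer and for indexing the value as one untokenized string; real analyzers do more.

```javascript
// Rough stand-in for a word-splitting analyzer: lowercase the value, then
// split on anything that is not a letter or digit.
function tokenizeWords(value) {
  return value.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Rough stand-in for indexing the whole field value as a single token.
function tokenizeWhole(value) {
  return [value];
}

// True if any indexed token starts with the given prefix.
function matchesPrefix(tokens, prefix) {
  return tokens.some((token) => token.startsWith(prefix));
}

const path = "Billing/Department/BillingCode";
console.log(tokenizeWords(path)); // three separate labels, slashes gone
console.log(tokenizeWhole(path)); // one token, slashes kept

console.log(matchesPrefix(tokenizeWords(path), "Billing/Department")); // false
console.log(matchesPrefix(tokenizeWhole(path), "Billing/Department")); // true
```

With the path split into labels, no single token contains a slash, so a query on part of the path has nothing to match. Kept whole, the same query matches.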

If you think custom fields might work for you, then it's probably worth taking a look at some of our custom field examples on the demo site.  I don't think we have any examples there that use a tree component, but you can use pretty much any vanilla JS component easily.  You can also use components from specific frameworks (Next, Angular, etc.), though to do that you'll also need to import the appropriate libraries for the framework that the component relies on.
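
As a starting point for such a component, here is a small vanilla JS sketch that turns a flat list of taxonomy records into a nested tree and renders it as nested list markup. The record shape (`id`, `title`, `parents`) and the function names are hypothetical, and wiring the result into a dotCMS custom field is not shown.

```javascript
// Escape text before placing it inside HTML markup.
function escapeHtml(text) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;" };
  return text.replace(/[&<>"]/g, (ch) => map[ch]);
}

// Build nested nodes from flat records. A record with several parents
// appears under each of them. `trail` holds ancestor ids to stop cycles.
function toNode(item, childrenOf, trail) {
  const kids = trail.has(item.id) ? [] : childrenOf.get(item.id) || [];
  const nextTrail = new Set(trail).add(item.id);
  return {
    id: item.id,
    title: item.title,
    children: kids.map((kid) => toNode(kid, childrenOf, nextTrail)),
  };
}

function buildTree(items) {
  const childrenOf = new Map();
  const roots = [];
  for (const item of items) {
    if (item.parents.length === 0) roots.push(item);
    for (const parentId of item.parents) {
      if (!childrenOf.has(parentId)) childrenOf.set(parentId, []);
      childrenOf.get(parentId).push(item);
    }
  }
  return roots.map((root) => toNode(root, childrenOf, new Set()));
}

// Render the tree as nested <ul> markup that a tree widget could enhance.
function renderList(nodes) {
  if (nodes.length === 0) return "";
  const items = nodes.map(
    (n) => `<li data-id="${escapeHtml(n.id)}">${escapeHtml(n.title)}${renderList(n.children)}</li>`
  );
  return `<ul>${items.join("")}</ul>`;
}

// Hypothetical sample: two taxonomies named Billing, one shared leaf.
const records = [
  { id: "t1", title: "Accounts", parents: [] },
  { id: "t2", title: "Support", parents: [] },
  { id: "t3", title: "Billing", parents: ["t1"] },
  { id: "t4", title: "Billing", parents: ["t2"] },
  { id: "t5", title: "Refunds", parents: ["t3", "t4"] },
];
const tree = buildTree(records);
console.log(renderList(tree));
```

Because each `<li>` carries the identifier in `data-id`, two taxonomies with the same title stay distinguishable when a user clicks one.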

Hope it helps,
John