Document date and title


Raul

Mar 20, 2018, 9:13:22 AM
to Mayan EDMS
Hi guys,

I am pretty new to Mayan EDMS.
So far I have only been testing how to install it and what hardware would be a good match for it.

I am now at a state where I have installed EDMS 3.0 and am wondering about the following:

1.) How can I create an index by year and month of the document date itself?
2.) How can I set the document title so that it is built from, for example, the invoice number, date and company?
3.) How can I create a trigger that automatically assigns tags to documents and adds them to a specific cabinet, all based on keywords that match the OCR result?

Thanks for your help :)

Michael Price

Mar 23, 2018, 1:25:11 AM
to Mayan EDMS
Hi,

How are you liking version 3.0?

Here are the answers to your questions.
1. The file creation date is lost during the upload. It is an operating system field, and such fields do not persist when uploading via the web. It could be possible to retain these values if the document is uploaded via a watch folder or staging folder. Since these methods open the file directly from the operating system to load it into Mayan, the file creation date could be read.
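
To illustrate the point about watch and staging folders: while a file is still on disk, its operating system timestamps can be read before it is loaded. The following is a minimal sketch in plain Python, not Mayan code; `file_dates` is a hypothetical helper name. It assumes the modification time is an acceptable stand-in where the platform does not expose a true creation time.

```python
import os
from datetime import datetime, timezone


def file_dates(path):
    """Return the timestamps the operating system keeps for a file."""
    info = os.stat(path)
    dates = {
        # Last content modification; available on every platform.
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
    }
    # A true creation time is only exposed on some platforms (e.g. macOS, BSD).
    birth = getattr(info, "st_birthtime", None)
    if birth is not None:
        dates["created"] = datetime.fromtimestamp(birth, tz=timezone.utc)
    return dates
```

Calling `file_dates("/path/to/staging/scan.pdf")` (a made-up path) returns a dictionary that always has a `"modified"` entry and has a `"created"` entry only where the platform provides one.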

2. It is not possible using the normal installation. However, I found this online: https://gitlab.com/mayan-edms/document_renaming and it seems to do what you want. I haven't tried it, and it looks outdated. If there is enough interest we could add something like this in our fork of Mayan.

3. Not yet possible at the level you want, but it is getting there. There is a workflow feature called triggers and another called actions. These let you create a workflow that responds (trigger) to an event (OCR finished) and performs an action (tag the document, move it to a cabinet). The problem is that triggers and actions are static: you can't program any kind of intelligence into them, and there is no way to add a decision (which folder, based on which OCR content). We have been talking about solving this with what we call workflow filters. The specs are still in the design phase, as we don't want to create a whole separate programming language for this. Eric is particularly interested in this (he wants to auto-tag documents based on OCR content), so it will get done as soon as we figure out the design.
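
The kind of decision described here (which tags and which cabinet, based on OCR content) can be pictured as a small rule table evaluated after OCR finishes. This is only a sketch in plain Python under assumed names; `Rule`, `evaluate`, the keywords and the cabinet names are all hypothetical and are not part of Mayan's workflow API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: if any keyword occurs in the OCR text, apply it."""
    keywords: Tuple[str, ...]
    tags: Tuple[str, ...] = ()
    cabinet: Optional[str] = None


def evaluate(rules, ocr_text):
    """Return (tags, cabinets) collected from every rule whose keywords match."""
    text = ocr_text.casefold()
    tags, cabinets = set(), set()
    for rule in rules:
        # Plain case-insensitive substring match; OCR noise is not handled.
        if any(keyword.casefold() in text for keyword in rule.keywords):
            tags.update(rule.tags)
            if rule.cabinet:
                cabinets.add(rule.cabinet)
    return tags, cabinets


# Example rules; the names are made up for illustration.
RULES = [
    Rule(keywords=("invoice", "rechnung"), tags=("invoice",), cabinet="Accounting"),
    Rule(keywords=("insurance",), tags=("insurance",), cabinet="Contracts"),
]
```

A declarative table like this keeps the decision logic as data rather than as a separate programming language, which is the concern raised above.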

Raul

Apr 1, 2018, 1:55:44 PM
to Mayan EDMS
Hi Michael,

thanks for your response.
I finally got my server in place and am ready to rock now.

Regarding my questions:

1) I was thinking more of something like what paperless does (https://github.com/danielquinn/paperless).
There, the date is extracted from the OCR content and used as the document date, which is a very nice feature.

2) You are right, the add-on is outdated.

3) That would definitely be nice. The way it is right now, the OCR content is only used for searching. However, there is so much more you could do with it, like searching for keywords to trigger automatic assignment to a cabinet, or extracting the document date.

Michael Price

Apr 1, 2018, 4:51:12 PM
to mayan...@googlegroups.com
Hello,

1) They must be using a regular expression feature to extract the data. I must warn that it is never a good idea to rely on OCR output for data; OCR is one of those features that will never work 100%. If you do, have some fallback logic to avoid adding garbage data. We could add a post-OCR processing step to support a feature like this. The best place for that would be the workflow engine. I think there are post-OCR triggers; we would need a regular expression workflow action.
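
A regular expression step with fallback logic, as described above, could look roughly like this. It is a sketch in plain Python, not how paperless or Mayan actually do it; the function names are hypothetical, and it assumes dates appear in `YYYY-MM-DD` form, which real OCR output often will not.

```python
import re
from datetime import date

# Assumed date format: four-digit year, two-digit month, two-digit day.
DATE_PATTERN = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


def extract_document_date(ocr_text, fallback=None):
    """Return the first valid date found in the OCR text, else the fallback."""
    for match in DATE_PATTERN.finditer(ocr_text):
        year, month, day = (int(part) for part in match.groups())
        try:
            return date(year, month, day)
        except ValueError:
            # Matched the pattern but is not a real date (e.g. month 13).
            continue
    return fallback


def index_path(document_date):
    """Build a year/month index key such as '2018/03'."""
    return f"{document_date.year}/{document_date.month:02d}"
```

Passing a fallback such as the upload date means a misread never produces garbage, and `index_path` shows how the result could feed a year/month index.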

2) A pity. It looks very interesting. I have a lot on my plate, but I will take a look to see how difficult it would be to add this as a standard app.

3) We added Filters to Paperattor, which work a bit like SmartLinks. We are looking into reusing this method to add workflow trigger filters. This means you could make a workflow trigger the transition, and add tags or extract OCR data for metadata, only if a certain condition programmed in the workflow filter is met.

Keep the ideas and use cases coming; they give us a good roadmap to develop the next set of features.


Raul

Apr 9, 2018, 5:30:22 AM
to Mayan EDMS
Hi Robert,

What do you think about my question 1?
Would it be possible to implement such a feature in Mayan with a few lines?
This information could, for example, be used for sorting the documents.
Sorting them by document name doesn't make a lot of sense, if you ask me.

Looking forward to your feedback.

Matthias Löblich

Apr 10, 2018, 3:25:26 AM
to Mayan EDMS
Hello,

1) Please take a look at my Mayan extension: https://gitlab.com/startmat/document_analyzer


br
Matthias