Watch folder - automatic metadata

JSDCKR Family

Jul 15, 2017, 8:45:22 PM
to Mayan EDMS
Kia ora, dear Mayan Team

Thank you for such incredibly impressive software. We run a small charity, and Mayan EDMS (which we recently happened upon by chance) is revolutionary for us.

My question is this: it would speed up our workflow tremendously if there were a way to automatically add metadata when uploads take place from the Watch folder. The metadata could either be stored in a file associated with the document (e.g. a file with the same name but a different extension) or be transferred from an existing metadata field (e.g. 'Subject' from a PDF).
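To make the first option concrete, here is a minimal sketch of the "same name, different extension" idea. It assumes the companion file is JSON with a `.json` extension; that format, the function name `read_sidecar_metadata`, and the sample field names are all hypothetical and are not part of Mayan EDMS. It only shows how a document could be paired with its metadata file on disk, not how the result would be sent to Mayan.

```python
import json
import tempfile
from pathlib import Path


def read_sidecar_metadata(document_path, sidecar_suffix=".json"):
    """Return metadata from a companion file next to the document.

    Hypothetical helper: the companion ("sidecar") file is assumed to share
    the document's name, with its extension replaced by `sidecar_suffix`.
    """
    sidecar = Path(document_path).with_suffix(sidecar_suffix)
    if not sidecar.is_file():
        # No companion file: the document simply gets no extra metadata.
        return {}
    with sidecar.open(encoding="utf-8") as handle:
        return json.load(handle)


# Small demonstration in a throwaway directory standing in for a watch folder.
with tempfile.TemporaryDirectory() as watch_folder:
    document = Path(watch_folder) / "annual-report.pdf"
    document.write_bytes(b"%PDF-1.4 placeholder")
    document.with_suffix(".json").write_text(
        json.dumps({"subject": "Annual report", "year": "2017"}),
        encoding="utf-8",
    )
    found = read_sidecar_metadata(document)
    missing = read_sidecar_metadata(Path(watch_folder) / "no-sidecar.pdf")

print(found)
print(missing)
```

A document without a companion file yields an empty dictionary, so the upload itself never depends on the metadata file being present.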

I see there has been some discussion of this in the past. Is there an existing way of achieving this that I've missed? Or do you have any suggestions on how I might make it happen with your remarkable API?

Yours, with grateful thanks again

Jamie

Jonathon Exley

Jul 17, 2017, 2:16:43 AM
to mayan...@googlegroups.com
It sounds like the analyser plugin might do this for you: https://gitlab.com/startmat/document_analyzer

Jonathon

--

---
You received this message because you are subscribed to the Google Groups "Mayan EDMS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mayan-edms+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

JSDCKR Family

Jul 17, 2017, 4:59:45 AM
to Mayan EDMS
Thanks so much, Jonathon

Have you installed this plugin successfully? I'm not sure why (probably inexperience), but when I run mayan-edms.py migrate, it throws:

 File "/usr/share/mayan-edms/local/lib/python2.7/site-packages/django/apps/registry.py", line 124, in check_apps_ready
    raise AppRegistryNotReady("Apps aren't loaded yet.")

Are you able to help with this, by any chance?

Thank you again,

Jamie

Roberto Rosario

Jul 25, 2017, 2:15:37 AM
to Mayan EDMS
The document analyzer can extract file metadata and store it as Mayan metadata, but the plugin has not been updated for recent versions of Mayan. It is maintained by @startmat, so hopefully he is following the mailing list and might update the app.

As for a native approach, I do not favor a second metadata file with the same name. It has the potential to leave many stray files, and file extensions have no meaning in Mayan because of OS differences. The approach I would like to see implemented is the 'metadata mapping' discussed a while ago: a single file containing metadata for several documents. That way the metadata can be assigned while the documents are being uploaded, or much later, after the files have been uploaded, all with the same code base. It can also be applied to the email sources with almost no change; the metadata map file would simply be another attachment in the email.

The only thing that remains for this feature is defining the file format. JSON and CSV are the competing suggestions. JSON can carry more information, but it is harder to write by hand and is not exported by spreadsheet programs. CSV is more archaic but easier to produce. Then there is the matter of defining the 'key' column that matches a document to a row in the metadata map. Should we use a single field? Support multiple combinations? Support transformations to combine multiple fields?

Those two questions are the blockers for the feature. Discussion, along with test cases from those experienced in the topic, will help us reach a decision and add the feature. Since you already have a need and a test case for it, your input will be very helpful.
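As a starting point for that discussion, here is a rough sketch of how a CSV metadata map with a configurable key could be matched against documents. Nothing here is existing Mayan code: the function names, the `filename` key column, the sample columns, and the choice to lower-case key values (a stand-in for the "transformations" question) are all assumptions made for illustration.

```python
import csv
import io


def load_metadata_map(csv_text, key_fields=("filename",)):
    """Parse a CSV metadata map into {key tuple: metadata dict}.

    `key_fields` names the column(s) that identify a document, so both a
    single-field key and a combination of fields can be tried out.
    """
    metadata_map = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumed transformation: trim and lower-case the key values.
        key = tuple(row[field].strip().lower() for field in key_fields)
        # Every non-key column with a value is treated as metadata.
        metadata_map[key] = {
            name: value
            for name, value in row.items()
            if name is not None and name not in key_fields and value
        }
    return metadata_map


def metadata_for(document, metadata_map, key_fields=("filename",)):
    """Return the metadata row matching a document, or {} if none matches."""
    key = tuple(str(document[field]).strip().lower() for field in key_fields)
    return metadata_map.get(key, {})


SAMPLE_MAP = """filename,donor,year
receipt-001.pdf,A. Smith,2017
receipt-002.pdf,B. Jones,
"""

metadata_map = load_metadata_map(SAMPLE_MAP)
print(metadata_for({"filename": "Receipt-001.PDF"}, metadata_map))
print(metadata_for({"filename": "unknown.pdf"}, metadata_map))
```

Because the lookup is separate from loading, the same two functions would serve both cases mentioned above: matching at upload time and applying a map to documents uploaded earlier. A JSON map could feed the same lookup by building the dictionary from parsed JSON instead of CSV rows.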

Matthias Löblich

Jul 25, 2017, 3:21:43 AM
to Mayan EDMS
Hi,
I hope to find time to update the document analyzer within the next few weeks.

br
Matthias

Roberto Rosario

Jul 25, 2017, 7:26:04 AM
to Mayan EDMS
Thanks Matthias!

Robert Schöftner

Jul 25, 2017, 10:50:46 AM
to mayan...@googlegroups.com
On Tuesday, 25 July 2017, 00:21:43 CEST, Matthias Löblich wrote:
> Hi,
> I hope to find time to update the document analyzer within the next few weeks.
>

I'm by no means a Django developer, but I managed to update the document
analyzer to work with the current Mayan version (2.4 or so). I can send a pull
request or a diff file once I get around to cleaning it up a bit. It even contains
a prototype backend for "zimg"-based barcode and QR code recognition.

regards

Robert

Matthias Löblich

Jul 25, 2017, 3:03:02 PM
to mayan...@googlegroups.com
Hi Robert,
Sounds good. I would like to see the "zimg"-based barcode prototype backend :-)

Greetings to Eggendorf from Klosterneuburg!

Matthias



