Good morning all, I wanted to see if anyone has solved a problem that I am currently dealing with in Archivematica. We have a bunch of files across several folders in DOCX, XLSX, and PPTX formats. We want to normalize the access copies to PDF (for DOCX and PPTX) and CSV (for XLSX). When I run this process in Archivematica, it gives me an error because of the XML nature of the files, and it is unable to process them.
At the moment, my workaround is to write batch scripts that do the conversion outside of Archivematica. I was able to get DOCX to PDF working via a PowerShell script, but I am struggling with XLSX to CSV and PPTX to PDF. Here is the script so you can get an idea of what I am looking for.
***
# Convert every .docx under the current directory to PDF, saved next to the source file
$documents_path = $PWD.Path
$word_app = New-Object -ComObject Word.Application
Get-ChildItem -Path $documents_path -Recurse -Filter *.docx | ForEach-Object {
    $document = $word_app.Documents.Open($_.FullName)
    $pdf_filename = "$($_.DirectoryName)\$($_.BaseName).pdf"
    # 17 = wdFormatPDF
    $document.SaveAs([ref] $pdf_filename, [ref] 17)
    $document.Close()
}
# Quit Word so no hidden WINWORD.EXE process is left running
$word_app.Quit()
***
Has anyone been able to solve the other formats, or is there a solution for these issues that I am not familiar with? Thank you in advance for your help.
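
For reference, here is a rough sketch of how the same COM approach might extend to the other two formats. It is untested and assumes Excel and PowerPoint are installed on the same Windows machine as Word. The function names (Get-SafeName, Convert-XlsxToCsv, Convert-PptxToPdf) are hypothetical and chosen only for this example. Because a CSV file can hold only one sheet, the sketch writes one CSV per worksheet and appends the sheet name to the file name, which is a design assumption rather than a requirement.

```powershell
# Sketch only: assumes Excel and PowerPoint are installed on this Windows machine.
# All function names here are hypothetical, made up for this example.

# Replace characters that are not allowed in Windows file names.
function Get-SafeName([string]$Name) {
    return ($Name -replace '[<>:"/\\|?*]', '_')
}

# Write one CSV per worksheet, next to the source workbook.
function Convert-XlsxToCsv([string]$Root) {
    $excel_app = New-Object -ComObject Excel.Application
    $excel_app.DisplayAlerts = $false   # suppress overwrite and format-loss prompts
    Get-ChildItem -Path $Root -Recurse -Filter *.xlsx | ForEach-Object {
        $file = $_
        $workbook = $excel_app.Workbooks.Open($file.FullName)
        foreach ($sheet in $workbook.Worksheets) {
            $sheet_name = Get-SafeName $sheet.Name
            $csv_name = '{0}_{1}.csv' -f $file.BaseName, $sheet_name
            $csv_filename = Join-Path $file.DirectoryName $csv_name
            $sheet.SaveAs($csv_filename, 6)   # 6 = xlCSV
        }
        $workbook.Close($false)   # close without saving changes to the .xlsx
    }
    $excel_app.Quit()
}

# Save each presentation as a PDF, next to the source file.
function Convert-PptxToPdf([string]$Root) {
    $ppt_app = New-Object -ComObject PowerPoint.Application
    Get-ChildItem -Path $Root -Recurse -Filter *.pptx | ForEach-Object {
        $pdf_filename = Join-Path $_.DirectoryName "$($_.BaseName).pdf"
        # Open(FileName, ReadOnly, Untitled, WithWindow): -1 = msoTrue, 0 = msoFalse
        $presentation = $ppt_app.Presentations.Open($_.FullName, -1, 0, 0)
        $presentation.SaveAs($pdf_filename, 32)   # 32 = ppSaveAsPDF
        $presentation.Close()
    }
    $ppt_app.Quit()
}
```

With the functions loaded, running Convert-XlsxToCsv $PWD.Path and Convert-PptxToPdf $PWD.Path from the top-level folder would mirror what the DOCX script does. If Excel 2016 or later is available, format code 62 (xlCSVUTF8) in place of 6 may preserve non-ASCII characters better.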