TAASSC 1.3.8

Timothy William Lawrence

Nov 23, 2022, 11:00:07 AM
to Suite of automatic linguistic analysis tools

Good afternoon Professors and PhD students!

Firstly, thanks for admitting me into the group and for developing such a great tool. I am writing to ask for any insight you can offer on getting TAASSC 1.3.8 working.

On my first attempts to process a single file containing one text of around 100 words, I was able to get results in three CSV files. However, the data in sca.csv appeared incomplete: I only received results for two measures, nwords and MLS, although the values for those two appeared accurate.

Shortly after my first attempts, TAASSC 1.3.8 stopped working completely and now gets stuck on “Loading Database 5 of 5… (please be patient)”. I've tried using TAASSC on both macOS Catalina (version 10.15.7) and Windows, on several different computers, to no avail. I've also made sure that every file I upload is named “filename.txt”. I downloaded the Python 3.11 and Java 8 Update 351 packages, but I am not actively working with them at this point.

It might be my ineptitude with TAASSC 1.3.8, but I would like to ask whether you know what the issue might be or could give me some guidance.

I would be extremely grateful for any suggestion on how to move forward.

Thanks for your time!

Best regards,

Tim Lawrence

Kristopher Kyle

Nov 23, 2022, 5:22:51 PM
to Timothy William Lawrence, Suite of automatic linguistic analysis tools
Hi Tim,

Could you send me a sample of your data? It would appear that you have followed the correct procedures (though I remember having an issue with a sub-version of Catalina) - there may be an encoding issue with your texts.

Best,

Kris

--
You received this message because you are subscribed to the Google Groups "Suite of automatic linguistic analysis tools" group.
To unsubscribe from this group and stop receiving emails from it, send an email to linguistic-analysi...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/linguistic-analysis-tools/c70ee7e2-a478-4f16-86a5-38c745aef3c3n%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


--
Kristopher Kyle
Associate Professor
Department of Linguistics
University of Oregon

Timothy William Lawrence

Nov 24, 2022, 8:58:32 AM
to Suite of automatic linguistic analysis tools
Hi Kyle,

Attached you'll find a sample of a couple of the files I tried to upload to TAASSC 1.3.8. The texts were downloaded from the EFCAMDAT (https://philarion.mml.cam.ac.uk), if that makes any difference. After downloading them as a .zip file, I copied and pasted them into a Word document, which I then saved as a .txt file before uploading it to TAASSC 1.3.8.

I am interested to know what the issue is. Also, it was weird that it worked the first time but not later.

Thank you!

Best,

Tim

C2.Critizing a Celebtity.Written Texts.txt
EFCAMDAT.B2.Writing a Movies Review Pilot Doc Corpus Corrected.txt

Suite of automatic linguistic analysis tools

Dec 9, 2022, 7:11:50 PM
to Suite of automatic linguistic analysis tools
Hi Tim,

Sorry for the delay - the end of the term has been a bit crazy for me.

I looked at your texts and immediately found some encoding issues (i.e., some characters that are not represented in ASCII - see the red upside-down question marks). This isn't a problem if you just process the data, but it IS a problem if you use one of the output options.
Screen Shot 2022-12-09 at 4.09.19 PM.png
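
For anyone who wants to check their own files for this kind of character before running them, here is a minimal Python sketch. It is not part of TAASSC; the function names `find_non_ascii` and `scan_folder` are made up for illustration, and it assumes the files can be read as UTF-8 (bytes that cannot be decoded are replaced with U+FFFD and therefore reported as well).

```python
from pathlib import Path


def find_non_ascii(text):
    """Return (line_number, column, character) for every non-ASCII character."""
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:  # ASCII covers code points 0-127 only
                hits.append((line_no, col, ch))
    return hits


def scan_folder(folder, encoding="utf-8"):
    """Map each .txt filename in `folder` to its non-ASCII hits (hypothetical helper)."""
    report = {}
    for path in sorted(Path(folder).glob("*.txt")):
        # errors="replace" turns undecodable bytes into U+FFFD so they show up too
        text = path.read_text(encoding=encoding, errors="replace")
        hits = find_non_ascii(text)
        if hits:
            report[path.name] = hits
    return report
```

Calling `scan_folder("my_corpus")` (the folder name is a placeholder) returns only the files that contain at least one non-ASCII character, with the line and column of each, so they can be fixed or re-saved by hand.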

Frankly, this is a bug in TAASSC 1.3.8 (there should be better error handling in the code), and we will address this in the next version, which should come out in 2023.

Attached are the results for those files.

Best,

Kris
results.csv

J.M. Odóna

May 12, 2023, 11:21:58 AM
to Suite of automatic linguistic analysis tools
Greetings Kyle and Tim,

I just posted a question that seems very similar to Tim's. I am getting all 0s in the SCA output except for word number and mean length of sentence. I'm using my college freshmen's research papers. They don't obviously contain unrecognized characters like the ones you noted in Tim's sample, so I don't know whether the problem stems from the same cause. You mentioned that a 2023 version may address this. Can either of you, or others, recommend how I can successfully get SCA output across all fourteen classic SCA indices? I've also tried Dr. Lu's web-based version, but it has a number of limitations that TAASSC does not, including being unable to process papers as long as my students', so it seems better to find out how to get these standard indices from TAASSC. Any help would be much appreciated! I know it is the end of the semester and probably the worst time to be asking for assistance, so I appreciate any thoughts you have!

Jocelyn

Suite of automatic linguistic analysis tools

May 12, 2023, 2:29:31 PM
to Suite of automatic linguistic analysis tools
Hi Jocelyn,

Did you try running the tool with the "text output" and "xml output" boxes unchecked?

Have you installed the Java Development Kit (version 8) for your system?

If you have tried both of the above, feel free to send a sample of texts (or the corpus). I can take a look to see if there are issues with the data and try to run the data for you.

Best,

Kris

J.M. Odóna

May 12, 2023, 7:30:48 PM
to Suite of automatic linguistic analysis tools
Kris,
Thank you so much for your response. Per your inquiries, I uninstalled and reinstalled Java Development Kit v. 8, and then I ran tests with both "text output" and "xml output" checked, unchecked, and individually checked - sadly to no avail. I'm sending my zipped corpus! It includes 61 files. They are research papers written for me last semester. From each I removed the titles, headings, in-text citations, and references. But maybe there is something else in them that causes TAASSC to fail. I very much appreciate your help.
Jocelyn

111 Samples.zip

Kristopher Kyle

May 15, 2023, 2:37:04 PM
to J.M. Odóna, Suite of automatic linguistic analysis tools
Hi Jocelyn,

I ran a couple of tests on your data.

First, I would suggest removing the lines "------------" from your texts (this isn't the main issue).
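
If there are too many files to edit by hand, something along these lines could do the cleanup. This is only a sketch: the helper name `drop_separator_lines` is invented here, and it assumes a separator line is made up of nothing but three or more hyphens, possibly with surrounding whitespace.

```python
import re

# A line consisting only of hyphens (three or more), optionally padded with whitespace.
DASH_LINE = re.compile(r"^\s*-{3,}\s*$")


def drop_separator_lines(text):
    """Return `text` without the lines that consist only of hyphens."""
    kept = [line for line in text.split("\n") if not DASH_LINE.match(line)]
    return "\n".join(kept)
```

Hyphens inside ordinary sentences are left alone, since only whole lines of hyphens match the pattern.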

TAASSC uses two different parsers - a dependency parser for most indices and a constituency parser for the L2SCA indices. There is a lot that happens in the background, but it looks like the spaces in your filenames are causing an issue on the L2SCA end but not on the TAASSC end.

So, if you replace the spaces in the filenames with "_", the L2SCA data should run just fine.
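
For a whole corpus, the renaming can be scripted. Here is a minimal Python sketch under a few assumptions: the texts are .txt files sitting directly in one folder, and the function names `underscore_name` and `rename_txt_files` are hypothetical, not anything TAASSC provides. It skips a file rather than overwrite an existing one.

```python
from pathlib import Path


def underscore_name(name):
    """Replace every space in a filename with an underscore."""
    return name.replace(" ", "_")


def rename_txt_files(folder):
    """Rename .txt files in `folder` whose names contain spaces.

    Returns a list of (old_name, new_name) pairs for the files that were renamed.
    """
    renamed = []
    for path in sorted(Path(folder).glob("*.txt")):
        new_name = underscore_name(path.name)
        if new_name == path.name:
            continue  # no spaces in the name, nothing to do
        target = path.with_name(new_name)
        if target.exists():
            continue  # skip rather than overwrite an existing file
        path.rename(target)
        renamed.append((path.name, new_name))
    return renamed
```

It is safest to run this on a copy of the corpus folder first, since the renaming happens in place.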

Let me know if you have further issues!

Best,

Kris


J.M. Odóna

May 15, 2023, 3:21:27 PM
to Kristopher Kyle, Suite of automatic linguistic analysis tools
Oh my goodness, that's it?! Fantastic! Thank you so much, Kris. On another note, I teach writing at two community colleges and use a grammar curriculum I created over many years in response to my students' needs, but, paradoxically, I never formally studied grammar myself (since degrees in English and literature unfortunately do not require it!). Can you recommend any reading (books, studies, curricula, etc.) that presents advanced English syntax for someone like me, who is neither a beginner nor a trained linguist?
Thanks for everything!