Scaled Frequency in collocation text results


Yannis Panagis

Apr 11, 2022, 12:47:01 PM
to AntConc-Discussion
Hi,

I have recently started using AntConc 4.x, and it looks and works very smoothly!

After experimenting with collocations, I saved the results to a text file through "File>Save text results" and noticed a column "Freq(Scaled)" that doesn't appear in the collocations tab.

I might have missed something, but I couldn't find what this column represents in either the manual or the forum. Any hints are welcome!

PS: The following trick, which is probably well known, gives the same result as exporting to text: select all results in the tab, then copy and paste them, even into an Excel sheet. The catch is that you have to select an appropriate Page size in order to copy all the results...

Thanks in advance!

Best regards,
Yannis

Laurence Anthony

Apr 18, 2022, 2:33:35 AM
to AntConc-Discussion
Hi Yannis,

When you export the results, you are effectively saving all the columns that are stored in the database. It's equivalent to a "results dump". As you say, if you copy and paste the results, you'll get what you see in the interface view. I guess this might be confusing, but it can also be convenient. Otherwise, you would have to turn on all the display options to be able to get a complete set of results, which could also be inconvenient. 

Do you think I should change the way the export works?

Laurence.

Yannis Panagis

Apr 18, 2022, 12:13:41 PM
to AntConc-Discussion
Hi Laurence,

Thank you for your response. The way the exports are done is not very confusing in the end, but it would perhaps help if you provided the option to save them as CSV (they are essentially tab-separated now, which is already very good).

What I am really unsure about is how "Freq(Scaled)" is computed. For example, the collocate "with" in the file I am attaching has a value of 29120 in the Freq(Scaled) column. Is the formula for scaled frequencies given somewhere?

Best regards,
Yannis
Collocate_results.txt

Laurence Anthony

Apr 20, 2022, 12:04:26 AM
to AntConc-Discussion
Hi Yannis,

The current TSV format can be imported directly into most tools (including Excel), exactly as the CSV format can. The advantage of TSV is that you can also copy and paste the file content directly into Excel and all the columns will be maintained. You cannot do that with CSV. TSV-formatted tables also look much better in plain text, as the columns are clearly visible and there is no need to quote the values to avoid the problem of commas included in the contents. I almost always use TSV for corpus work. What I might consider is adding an option to export the results directly in Excel format. This is probably what most users actually want.
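
For anyone who wants to load the exported file in a script instead of Excel, here is a minimal Python sketch of reading tab-separated results. The `Collocate` column name and the second sample row are assumptions made for illustration; only `Freq(Scaled)` and the value 29120 for "with" come from this thread, and the real export has more columns.

```python
import csv
import io


def read_tsv(text):
    """Parse tab-separated text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


# Hypothetical sample: the "Collocate" header and the second row are made up.
# The second row shows that a comma inside a value needs no quoting in TSV.
sample = "Collocate\tFreq(Scaled)\nwith\t29120\nhello, world\t8\n"

rows = read_tsv(sample)
for row in rows:
    print(row["Collocate"], row["Freq(Scaled)"])
```

For a real export, the same reader can be pointed at an open file handle (opened with `newline=""`) instead of an in-memory string.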

Note that the fastest and easiest way to export results into Excel is just to copy and paste the results from the display. In the old AntConc 3.x versions, copying and pasting didn't always work well, but in AntConc 4, copy-paste works very well across all tools, including the KWIC display.

Freq(Scaled) is calculated by multiplying the frequency by the collocate window size. It's necessary to get the correct scaled frequency value to calculate the statistic correctly. But, as you say, it's a bit confusing because it doesn't really have any meaning outside of the statistical measure, which is why I hide it in the main display.
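
As a rough illustration of the calculation described above, here is a small Python sketch. It is not AntConc's actual code: the function name is hypothetical, and it assumes that the "collocate window size" is the left span plus the right span, which this thread does not state explicitly.

```python
def scaled_frequency(freq, span_left, span_right):
    """Multiply a frequency by the collocate window size.

    Assumption: window size = number of positions to the left of the
    node word plus number of positions to the right.
    """
    window_size = span_left + span_right
    return freq * window_size


# Hypothetical numbers: a frequency of 10 with a 5L-5R window.
example = scaled_frequency(10, 5, 5)
print(example)
```

The value only serves as an input to the association statistic, so it is not meant to be read as a count of anything in the corpus.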

Laurence.

Shin Ishikawa

Mar 17, 2023, 6:03:46 AM
to AntConc-Discussion
Hi Laurence and all,
CC Yannis

I am now checking the new functions of AntConc V4 for the coming semester; it is a really exciting update.
Regarding the main result display for the collocates analysis, you mentioned that:


>>>
Freq(Scaled) is calculated by multiplying the frequency by the collocate window size. It's necessary to get the correct scaled frequency value to calculate the statistic correctly. But, as you say, it's a bit confusing because it doesn't really have any meaning outside of the statistical measure, which is why I hide it in the main display.
>>>

Thank you for your explanation, which helped me a lot. Here are a few small questions and comments.

1) First, regarding the calculation of Freq(Scaled), I understood it as follows: when someone examines the collocates of X within a window of -4 to +4, and some collocating word (Y, for example) occurs 7 times in the whole corpus, its scaled frequency is calculated as 56 (frequency 7 × window width 8). Is this right?

2) You also say you hid Freq(Scaled) in the main window, but it is still shown in the result display, whereas "Freq (LR)" is NOT displayed there. This is a bit confusing, especially for novice users, because "Freq (LR)" is the most important information in the collocates analysis (it is even included in the Sort-by options!). Maybe Freq(Scaled) should be hidden, and Freq (LR) should be displayed in the main window. (Just FYI, I am using V4.2.0.)

Sorry if I have misunderstood your intention or missed a comment you made elsewhere.
And thank you very much for your continued efforts on this wonderful concordancer.

Regards
Shin Ishikawa @ Kobe U