
dict versus list-like thinking, and large language models


pataphor
Nov 27, 2023, 6:18:54 PM

The news about the Dunning-Kruger effect being the result of a
statistical artefact, namely autocorrelation, where, unbeknownst to
the researcher, the variables made available to them were sometimes
combined and then presented as a single original variable, hit me
kind of hard.
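
As an aside, the artefact is easy to reproduce with purely random
numbers. The following sketch (synthetic data, nothing from any real
study) draws self-assessment and actual performance as independent
percentiles and still produces the classic pattern in which the lowest
quartile seems to overestimate itself:

import random
import statistics

# Synthetic demonstration of the artefact: self-assessment and
# performance are drawn independently, so any pattern is by construction.
random.seed(0)
n = 10_000
performance = [random.uniform(0, 100) for _ in range(n)]
self_assessment = [random.uniform(0, 100) for _ in range(n)]

# Group people into performance quartiles, as the original studies did.
people = sorted(zip(performance, self_assessment))
quartiles = [people[i * n // 4:(i + 1) * n // 4] for i in range(4)]

for i, q in enumerate(quartiles, 1):
    actual = statistics.mean(p for p, s in q)
    believed = statistics.mean(s for p, s in q)
    print(f"quartile {i}: actual {actual:5.1f}, self-assessed {believed:5.1f}")

Self-assessment comes out near 50 in every quartile while actual
performance rises by construction, so the bottom quartile appears to
'overestimate' itself even though the two variables are completely
independent.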

From my personal experience I already knew that it can be exceedingly
hard to even get access to uncorrupted data, or to the mechanisms by
which they were created, for example in order to know about leading
questions or other experimental conditions that would influence what
that data would look like. Imagine an election where some people were
not allowed to vote, or were propagandized into staying silent, or
simply found none of the options they would vote for combined in
acceptable ways in any existing party's program; that would roughly
resemble what I am trying to address here, at least the spirit of it.

Then, after the data is gathered, there is still time to roughen it
up a bit afterwards: rounding the decimals, turning values into
percentages, lumping data from different locations, setups, or points
in time together into a single variable, or presenting it as graphs
where the axes are unclear or where it is hard to see where exactly
the data projects onto them. These are just a few examples; the
possibilities are endless, and making certain subsets of the data
available only to a highly controlled subset of researchers is not
the least important of them.

But why would this happen?

I think it is the result of people trying to make it harder for other
people to publish earlier than them, or to publish a rebuttal of their
research based on a detailed analysis of the properties of their data,
the circumstances under which it was gathered, and their processing
of it.

There is also the problem of not wanting to go against the prevailing
narrative, like trying to present results that would cast doubt on the
idea of educating everyone based on their own cultural background when
the zeitgeist is very much in favor of that. Research pointing toward
the idea that students coming from far away, speaking a different
language and having different cultural norms, could each be assigned a
'cultural difference' score relative to the main culture would be out
of the question in such environments, no matter how meticulously one
explains how that concept was built up out of the underlying measured
variables.

Yes, not all data manipulation is harmful; in fact it is often very
useful and beneficial. But one still needs to be able to explain what
went into it and how it was done.

As all this data roughening, protectionism, and cultural silencing
went further along, it led in the end to a great replication crisis,
and the autocorrelation debacle around the Dunning-Kruger effect was,
in my opinion, just a small symptom of a very much larger general
problem.

So psychological research just didn't work, or at least it lacked a
factual basis, even though it stayed on a high-prestige course for a
long time, duping many in the process.

It is in that context that I would like to look at large language
models taking over the role that once belonged to structured thought.

As commercial advertising and propaganda needed more data in order to
become more effective (from a commercial viewpoint, of course; I don't
mean to imply this was a need of the people so targeted), an
environment emerged where systems that don't rely on humans to guide
them at first could start to take over, with any 'corrections' to
their output applied afterwards. Thus we arrived at reinforcement
learning with human feedback.

This neatly bypassed human preferences up front and was the ideal
system for obfuscating the sources of the data, data gathered by
infringing on the privacy of the general population via the commercial
surveillance system that grew out of advertising.

So, because of human corruption, the attempts of each human to reach a
better relative position than other humans, even if that end was
reached not by improving oneself but just by keeping opportunities
from others, we had arrived at a static situation, where improvements
could only be made in relatively small details, and always under the
control of large institutions that could simply buy up the competition
and then silence it, or, in the case of governments, make laws that
would hinder progress, possibly via the proxy of people providing
others with large sums of money to make it so.

But all of this means that the 'success' of large language models was
as much an artefact, only on a large social scale, as the
Dunning-Kruger effect was. It could only flourish in a state of
illegible data, and it is far from certain (personally I would guess
it is very much not the case) that it is a good or intelligent
research method in general.

For one thing, it goes against the spirit of scientific research, in
which one formulates a hypothesis first and then gathers data to prove
or disprove it, always starting from a 'null hypothesis': assuming the
thing one wants to see isn't there unless the data and its analysis
make that very unlikely.
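
To make that concrete, here is a minimal sketch of the procedure, with
made-up measurements: we assume the null hypothesis, that the group
labels mean nothing, and only reject it if reshuffling the labels
rarely reproduces a difference as large as the observed one.

import random
import statistics

# A sketch of null-hypothesis testing via a permutation test; the
# numbers below are invented for illustration, not from any real study.
random.seed(1)
control = [5.1, 4.8, 5.3, 5.0, 4.9, 5.2]
treated = [5.6, 5.9, 5.4, 5.8, 5.7, 5.5]
observed = statistics.mean(treated) - statistics.mean(control)

# Under the null hypothesis the labels are arbitrary, so shuffled
# relabelings should often produce a difference at least this large.
pooled = control + treated
extreme = 0
trials = 10_000
for _ in range(trials):
    random.shuffle(pooled)
    diff = statistics.mean(pooled[:6]) - statistics.mean(pooled[6:])
    if abs(diff) >= abs(observed):
        extreme += 1

print(f"observed difference: {observed:.2f}, p ~ {extreme / trials:.4f}")

A small p-value means the data make the null hypothesis very unlikely;
only then does one tentatively accept that the effect exists.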

As a second reason, I would like to turn to data structures like
hash maps (also called 'dictionaries' or 'dicts' in Python, a
programming language I have engaged with a lot) and arrays (which are
sometimes instantiated as 'lists', also in Python).

Hash maps are unordered and thus do not suffer as much from the
implicit 'linearity' of lists; they are optimized for finding things
fast without having to inspect each item until one stumbles on the
right one. (Python's dicts have actually preserved insertion order
since version 3.7, but lookup still does not depend on position.) Of
course there are ways to give each data type more of the advantages of
the other, so the difference lies more in the kind of practical mental
approach than in ultimate computing efficiency.
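
A small sketch of that difference, with arbitrary made-up data: a dict
lookup hashes the key and jumps straight to the value, while a list
has to be scanned item by item.

import timeit

# Arbitrary key-value pairs; the same data stored two ways.
n = 100_000
items = [(f"key{i}", i) for i in range(n)]
as_list = items          # ordered pairs; lookup means a linear scan
as_dict = dict(items)    # hashed keys; lookup is (amortized) constant-time

def find_in_list(key):
    for k, v in as_list:
        if k == key:
            return v

def find_in_dict(key):
    return as_dict[key]

# A key near the end makes the cost of the linear scan visible.
print("list:", timeit.timeit(lambda: find_in_list("key99999"), number=10))
print("dict:", timeit.timeit(lambda: find_in_dict("key99999"), number=10))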

But still, the most efficient way of coding things, in my experience,
is to set some arbitrary values with dicts, which can span large
mental spaces, and then, after arriving at some point defined by a
dictionary value, start making short explorations defined by lists.
This way the code is more legible, enabling large jumps to other
terrains, while one can still switch to list processing when it is
easier or faster to handle ordered or linear data that way.
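
As a sketch of what I mean, with hypothetical stage names: the dict
spans the large space of named terrains, and each value is a short
list that gets walked linearly once a terrain is reached.

# The dict makes the large jumps between named terrains; the lists
# handle the short, ordered explorations within each one.
pipeline = {
    "gather":  ["load raw file", "strip headers", "parse rows"],
    "clean":   ["drop empty rows", "normalize units"],
    "analyze": ["compute means", "compare groups"],
}

def run(stage):
    for step in pipeline[stage]:   # ordered, linear processing of detail
        print(f"{stage}: {step}")

run("clean")   # one dict lookup, then a short list exploration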

Maybe this would not be the case for the large language models'
internal functioning; they might not need the human way of
understanding that enabled us to progress from the state of wilderness
to more structured and larger-scale civilizations. After all, a neural
network is not a brain.

But my bet would be that even for LLMs, especially if they communicate
with humans or other sapients, it would be beneficial to agree on
certain concepts and structures, as exemplified by dictionaries, and
to only start processing detail once the overall structure is defined
and agreed upon.
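
A toy sketch of that idea, with invented field names: both sides share
a schema, a dict of expected keys and types, and detail is only
processed once a message matches the agreed structure.

# The agreed-upon structure: expected keys and their types.
schema = {"topic": str, "claims": list, "confidence": float}

def accept(message):
    return (set(message) == set(schema)
            and all(isinstance(message[k], t) for k, t in schema.items()))

msg = {"topic": "dicts vs lists",
       "claims": ["dicts span large spaces", "lists order the details"],
       "confidence": 0.7}

if accept(msg):
    for claim in msg["claims"]:    # detail processing starts only now
        print(claim)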

What is kind of weird is that these kinds of insights, or whatever one
would call them, now seem to appear regardless of how people approach
the concept of the illegibility of neural networks, and sometimes I
wonder whether the large-scale processes I am intuiting are not just
the result of my own thoughts but more some kind of collective
awakening, where I don't know who inspired whom, even if some of those
people hold views on other subjects that I very much disagree with.





