Coincidentally, I was discussing this with some other people only a couple of days ago. It's a mess. I understand that different engines resolve the ambiguity in different ways. PCRE2 copies Perl, and its behaviour is documented. This is from the pcre2pattern man page:
The handling of a backslash followed by a digit other than 0 is complicated, and Perl has changed over time, causing PCRE2 also to change.
Outside a character class, PCRE2 reads the digit and any following digits as a decimal number. If the number is less than 10, begins with the digit 8 or 9, or if there are at least that many previous capture groups in the expression, the entire sequence is taken as a backreference. A description of how this works is given later, following the discussion of parenthesized groups. Otherwise, up to three octal digits are read to form a character code.
Inside a character class, PCRE2 handles \8 and \9 as the literal characters "8" and "9", and otherwise reads up to three octal digits following the backslash, using them to generate a data character. Any subsequent digits stand for themselves. For example, outside a character class:
  \040   is another way of writing an ASCII space
  \40    is the same, provided there are fewer than 40 previous capture groups
  \7     is always a backreference
  \11    might be a backreference, or another way of writing a tab
  \011   is always a tab
  \0113  is a tab followed by the character "3"
  \113   might be a backreference, otherwise the character with octal code 113
  \377   might be a backreference, otherwise the value 255 (decimal)
  \81    is always a backreference
Note that octal values of 100 or greater that are specified using this syntax must not be introduced by a leading zero, because no more than three octal digits are ever read.
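To make the rule quoted above easier to follow, here is a minimal Python sketch of the decision for a backslash-digit sequence outside a character class. It is only an illustration of the documented rule, not PCRE2's actual code: the function name classify_backslash_digits and the parameter groups_before (the number of capture groups opened earlier in the pattern) are made up for this example, and the handling of a leading 0 (at most two further octal digits) is my reading of the \011 and \0113 examples.

```python
def classify_backslash_digits(digits: str, groups_before: int):
    """Classify the run of digits that follows a backslash (outside a class).

    Returns (kind, value, rest):
      kind  - "backref" or "char"
      value - the group number, or the character code
      rest  - trailing digits that are left as literal characters
    Hypothetical helper that mirrors the documented rule only.
    """
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected one or more decimal digits")

    if digits[0] != "0":
        # Read the whole digit run as a decimal number first.
        number = int(digits)
        if number < 10 or digits[0] in "89" or number <= groups_before:
            return ("backref", number, "")

    # Otherwise read up to three octal digits as a character code.
    # A leading 0 counts as one of the three, so \0113 is a tab then "3".
    n = 0
    while n < 3 and n < len(digits) and digits[n] in "01234567":
        n += 1
    return ("char", int(digits[:n], 8), digits[n:])
```

For instance, classify_backslash_digits("40", 0) gives ("char", 32, ""), whereas classify_backslash_digits("40", 40) gives ("backref", 40, ""), matching the \40 line in the list of examples.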
A pattern such as (a\1) can indeed never match, but that is irrelevant to how it is interpreted. (There are plenty of ways to write patterns that can never match.) In that pattern, \1 is a backreference.
For your comments 1-3, remember that these things can be inside repetitions. Other than the first time round the loop, a self-reference or forward reference may make sense. For example, the made-up pattern /(\d|\+\1){2}/ matches either any two digits or a digit followed by + and the same digit again. The first time round, group 1 is not yet set, so only the \d branch can match; the second time round, \1 refers to the digit captured by the first. Your item 4 is covered by the rules I quoted above. You can check these things by running pcre2test with the -d option:
  re> /(y)\10001/
------------------------------------------------------------------
  0  19 Bra
  3   7 CBra 1
  8     y
 10   7 Ket
 13     @01
 19  19 Ket
 22     End
------------------------------------------------------------------
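This shows that PCRE2 has interpreted \100 as an octal escape (octal 100 is the character @) followed by the literal characters 01. With only one capture group before it, \10001 cannot be a backreference, so the octal rule applies.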