this is explained in the doc.
#1 labels a single branch, while $1 labels all branches within the clade, including the branch leading to the clade. the following two trees are the same:
(((rabbit, rat) $1, human), goat_cow, marsupial);
(((rabbit #1, rat #1) #1, human), goat_cow, marsupial);
and both are different from
(((rabbit, rat) #1, human), goat_cow, marsupial);
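to check what a $ label expands to, here is a minimal python sketch (not part of paml; the function names are made up for illustration). it rewrites every clade label $k as explicit branch labels #k. it assumes that an explicit # on a branch overrides an inherited $, and that a $ nearer the tips overrides one nearer the root, which is my reading of the documentation. it only handles plain trees like the ones above.

```python
import re

def parse_newick(newick):
    """Parse a Newick string with optional #k / $k labels into nested dicts."""
    s = "".join(newick.split()).rstrip(";")   # drop whitespace and the final ';'
    pos = 0

    def node():
        nonlocal pos
        children = []
        if pos < len(s) and s[pos] == "(":
            pos += 1
            children.append(node())
            while pos < len(s) and s[pos] == ",":
                pos += 1
                children.append(node())
            if pos >= len(s) or s[pos] != ")":
                raise ValueError("unbalanced parentheses in tree")
            pos += 1
        # optional node name, then an optional label such as #1 or $1
        m = re.match(r"([^(),#$]*)(?:([#$])(\d+))?", s[pos:])
        pos += m.end()
        return {"name": m.group(1), "mark": m.group(2),
                "num": m.group(3), "children": children}

    return node()

def write_expanded(n, inherited=None):
    """Write the tree back out with every $k replaced by per-branch #k labels."""
    if n["mark"] == "$":
        inherited = n["num"]          # clade label: this branch and all below it
    # assumption: an explicit # on this branch wins over an inherited $
    own = n["num"] if n["mark"] == "#" else inherited
    text = n["name"]
    if n["children"]:
        inner = ", ".join(write_expanded(c, inherited) for c in n["children"])
        text = "(" + inner + ")" + n["name"]
    if own is not None:
        text += " #" + own
    return text

def expand_clade_labels(newick):
    return write_expanded(parse_newick(newick)) + ";"

if __name__ == "__main__":
    print(expand_clade_labels("(((rabbit, rat) $1, human), goat_cow, marsupial);"))
    print(expand_clade_labels("(((rabbit, rat) #1, human), goat_cow, marsupial);"))
```

the first call should give the tree with #1 on rabbit, rat and their ancestral branch, and the second should come back unchanged, since #1 touches only the one branch.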
your intuition is o.k., but you would need to reconstruct ancestral sequences even if you only want to estimate dN/dS for the tip branches, because you need to know the sequences at the two ends of each branch. anyway, codeml is an ML method, so it does not use a single reconstruction but averages over all possible ancestral reconstructions. if you want to know the details, you can read a book that covers likelihood calculation on a tree, for example, chapter 4 of yang (2014).
Yang Z. 2014. Molecular Evolution: A Statistical Approach. Oxford University Press, Oxford, England.
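as a toy illustration of "averaging over all possible ancestral reconstructions", the python sketch below computes the likelihood at one site for two tips joined by a single unobserved ancestor. it uses the JC69 nucleotide model as a stand-in (codeml itself uses a codon model and a full tree), and the function names and branch lengths are made up for the example.

```python
import math

BASES = "TCAG"

def jc69_prob(i, j, t):
    """JC69 probability of base i changing to base j along a branch of
    length t (expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e

def ancestral_terms(x, y, t1, t2):
    """Contribution of each possible ancestral base a to the site likelihood:
    pi_a * P(a -> x | t1) * P(a -> y | t2), with pi_a = 1/4 under JC69."""
    return {a: 0.25 * jc69_prob(a, x, t1) * jc69_prob(a, y, t2) for a in BASES}

def site_likelihood(x, y, t1, t2):
    """Sum over all ancestral states; no single reconstruction is picked."""
    return sum(ancestral_terms(x, y, t1, t2).values())

if __name__ == "__main__":
    terms = ancestral_terms("T", "C", 0.1, 0.2)
    for base, value in terms.items():
        print(base, value)
    print("likelihood:", site_likelihood("T", "C", 0.1, 0.2))
```

every ancestral base contributes something to the sum, so the likelihood does not depend on committing to one reconstructed ancestor. on a real tree the same sum runs over all internal nodes at once.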
ziheng