Moving endnotes to inline


Cooper Cavalier

May 14, 2021, 3:05:26 PM
to BBEdit Talk
I have a text/html file which has thousands of endnotes.  The file has the inline endnote reference as [1], [2], [3], etc. which correspond to the appropriate endnote.  For the app publisher I am working for, the endnotes must be placed in line with specific coding.

I am totally stumped in finding a way to do this, either inside BBEdit or with a Python script.  I am doing this editing on a Mac.
 
Currently the inline link looks like this:
<p>It has been estimated that about one-fifth of Jesus' teachings was about money matters.<sup><a href="#_ftn19" name="_ftnref19">[19]</a></sup>

The endnote looks like this:
<h3 pb_toc=index><a href="#_ftnref18" name="_ftn18" title="">[18]</h3><sup>[18]</sup></a>See Werner G. Marx, &quot;Money Matters in Matthew,&quot; <i>Bibliotheca Sacra</i> 136:542 (April-June 1979):148-57.

It needs to instead look like this:
<p>It has been estimated that about one-fifth of Jesus' teachings was about money matters.<pb_endnote>See Werner G. Marx, &quot;Money Matters in Matthew,&quot; <i>Bibliotheca Sacra</i> 136:542 (April-June 1979):148-57.</pb_endnote>

Any help would be deeply appreciated, since there is no way I can do this manually.

Thank you!

Doug


MediaMouth

May 14, 2021, 3:30:36 PM
to bbe...@googlegroups.com
You can actually use classic browser JS plus a little jQuery to do the basic work, and then download the adjusted HTML into new files.

--
This is the BBEdit Talk public discussion group. If you have a feature request or need technical support, please email "sup...@barebones.com" rather than posting here. Follow @bbedit on Twitter: <https://twitter.com/bbedit>
---
You received this message because you are subscribed to the Google Groups "BBEdit Talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bbedit+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bbedit/5a5f865a-c7d9-4b55-8f2d-0fbdef43d406n%40googlegroups.com.

Bruce Van Allen

May 14, 2021, 3:37:59 PM
to BBEdit Talk
On 14 May 2021, at 12:04, Cooper Cavalier wrote:

> I have a text/html file which has thousands of endnotes. The file has the
> inline endnote reference as [1], [2], [3], etc. which correspond to the
> appropriate endnote. For the app publisher I am working for, the endnotes
> must be placed in line with specific coding.
>
> I am totally stumped in finding a way either inside of Bbedit or using a
> Python script. I am doing this editing on a Mac.

Looks like a script would make a pass through the doc to find all the
inline links and match them to their endnotes (confirming, too, that
every link has an endnote and that every endnote has a link).

Perhaps the result of that first pass would be a lookup hash of the
relevant info for each citation instance, with the ref numbers as the
hash’s keys.

Then a second pass: regex matching on the overall pattern of the
citation link with a () capture of the ref number. The ref number is
then looked up in the hash, and the citation info stored there is
substituted for the original citation link verbiage.
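
The two passes described above might be sketched in Python roughly as follows. This is only a sketch: the two regular expressions are assumptions taken from the sample lines in the original post (one endnote per line, no closing tag), and the function names are made up for illustration.

```python
import re

# Both patterns are assumptions based on the sample markup in the original post.
REF_RE = re.compile(r'<sup><a href="#_ftn\d+" name="_ftnref(\d+)">\[\d+\]</a></sup>')
NOTE_RE = re.compile(
    r'^<h3 pb_toc=index><a href="#_ftnref\d+" name="_ftn(\d+)" title="">'
    r'\[\d+\]</h3><sup>\[\d+\]</sup></a>(.*)$',
    re.MULTILINE,
)


def build_lookup(html):
    """First pass: map each endnote number (as a string) to its text."""
    return {num: text.strip() for num, text in NOTE_RE.findall(html)}


def check_refs(html, notes):
    """Return (reference numbers with no endnote, endnote numbers never referenced)."""
    refs = set(REF_RE.findall(html))
    known = set(notes)
    return sorted(refs - known, key=int), sorted(known - refs, key=int)


def inline_notes(html, notes):
    """Second pass: replace each inline link with its endnote text."""
    def swap(match):
        text = notes.get(match.group(1))
        # Leave the link untouched when no endnote was found for it.
        return match.group(0) if text is None else f"<pb_endnote>{text}</pb_endnote>"
    return REF_RE.sub(swap, html)
```

Running `check_refs` before `inline_notes` gives a list of numbers to inspect by hand. The sketch leaves the original endnote lines in place; deleting them would be a separate step.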

I can’t guide you for doing this in Python, but I assume it has regex
etc capabilities to do the above; Perl, my preferred programming
language, wrote the book.

If the above approach would still require more time than you have for
expanding your scripting-fu, there are several scripting adepts on this
list who might offer scripts in Perl, AppleScript, JavaScript, Python,
and of course multi-step find/replace, text factories, etc. in BBEdit.

HTH


Thanks,

- Bruce

_bruce__van_allen__santa_cruz__ca

@lbutlr

May 14, 2021, 4:08:31 PM
to BBEdit Talk
On 14 May 2021, at 13:04, Cooper Cavalier <dstrom...@gmail.com> wrote:
The endnote looks like this:
<h3 pb_toc=index><a href="#_ftnref18" name="_ftn18" title="">[18]</h3><sup>[18]</sup></a>See Werner G. Marx, &quot;Money Matters in Matthew,&quot; <i>Bibliotheca Sacra</i> 136:542 (April-June 1979):148-57.

That is going to be harder if your endnotes do not have closing tags. I am not sure that the jQuery suggestion will work at all, and a script is going to have to know where the note ends, which will take some code depending on how the footnote is formatted.

If you can add end tags or the note is on a single logical line, it becomes a lot simpler. However, with no closing tags and the endnote on a single line by itself, something like this:

Script searches File1 (the body text) for
<sup><a href="#_ftn\d+" name="_ftnref(\d+)">\[\d+\]</a></sup>

And assigns \1 to something like $footNum, then searches for

<h3 pb_toc=index><a href="#_ftnref\d+" name="_ftn${footNum}" title="">\[\d+\]</h3><sup>\[\d+\]</sup></a>(.*)

in File2 (the endnotes). In the second search, \1 is the contents of your footnote, say $footText.

You then replace the text found by the first search in File1 with:

<pb_endnote>$footText</pb_endnote>

BUT, this assumes that all the footnotes are formatted exactly the same way.
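
One way to test that assumption before converting anything is to compare a loose match against the strict one. The following Python sketch assumes every endnote anchor carries a `name="_ftnN"` attribute, as in the sample; the function name is hypothetical.

```python
import re

# Loose pattern: any anchor whose name looks like an endnote target (assumed naming).
LOOSE_RE = re.compile(r'name="_ftn(\d+)"')
# Strict pattern: the exact endnote layout from the sample in the original post.
STRICT_RE = re.compile(
    r'<h3 pb_toc=index><a href="#_ftnref\d+" name="_ftn(\d+)" title="">'
    r'\[\d+\]</h3><sup>\[\d+\]</sup></a>'
)


def irregular_endnotes(html):
    """Return the endnote numbers whose markup differs from the sample layout."""
    loose = set(LOOSE_RE.findall(html))
    strict = set(STRICT_RE.findall(html))
    return sorted(loose - strict, key=int)
```

An empty result suggests the strict search pattern is safe to use; any numbers returned point at endnotes that need a closer look or a looser pattern.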

Obviously, work on a duplicate copy of File1

You could do this all in awk, but only if you have already endured the pain of learning awk, which I'm guessing you haven't, because then you would have just used awk. Awk is very powerful, obtuse, and unforgiving.

I would probably end up writing this as a shell script with sed because awk makes my head hurt.

Basic logic:

Repeat with each line {
    If it contains a footnote reference {
        Get the number
        Search File2 for the footnote text
        Replace the <sup>...</sup> with the footnote text
    }
}
Write the new file

Multiple footnotes together will be another step, or run the script multiple times until there are no matches.
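
For what it is worth, a Python version of that loop may not need the extra step: `re.sub` replaces every non-overlapping match on a line, so references that sit side by side are handled in one pass. A minimal sketch, assuming the reference markup from the sample and a `notes` dictionary (number string to footnote text) that has already been built from File2:

```python
import re

# Reference pattern assumed from the sample in the original post.
REF_RE = re.compile(r'<sup><a href="#_ftn\d+" name="_ftnref(\d+)">\[\d+\]</a></sup>')


def inline_line(line, notes):
    """Replace every footnote reference on one line; unknown numbers stay as they are."""
    def swap(match):
        num = match.group(1)
        if num in notes:
            return f"<pb_endnote>{notes[num]}</pb_endnote>"
        return match.group(0)
    return REF_RE.sub(swap, line)


def convert(lines, notes):
    """Yield each line of the body with its references inlined."""
    for line in lines:
        yield inline_line(line, notes)
```

Since an open file iterates line by line, `convert(source_file, notes)` can be passed straight to `writelines` on the output file, which keeps the original untouched.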

-- 
DODGEBALL STOPS AT THE GYM DOOR Bart chalkboard Ep. BABF12

Christopher Stone

May 14, 2021, 8:40:37 PM
to BBEdit-Talk
On 05/14/2021, at 14:04, Cooper Cavalier <dstrom...@gmail.com> wrote:
I have a text/html file which has thousands of endnotes.  The file has the inline endnote reference as [1], [2], [3], etc. which correspond to the appropriate endnote.  For the app publisher I am working for, the endnotes must be placed in line with specific coding.

I am totally stumped in finding a way either inside of Bbedit or using a Python script.  I am doing this editing on a Mac.


Hey Doug,

If I've understood correctly, all footnote references are going to look like: _ftn18 and [18].

What I think I would do (without having seen the full scope of the problem) is:

A) Split off all of the footnotes into their own file.

B) Write a regular expression or a script to massage all of the footnotes into their new format.
     - Leaving the footnote reference as a prefix to the new reference.
     - Make sure each footnote is formatted such that they all may be easily read into an array.

C) Read the footnotes into an array (in Perl for me).

D) Iterate through the array and find the matching footnote ref in the body document.
     - Make the appropriate substitutions.
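
Steps A through C could look something like this in Python rather than Perl. The sketch assumes the endnotes form one contiguous block at the end of the file and that each starts a line with the `<h3 ...><a ... name="_ftnN"` markup from the sample; both function names are invented for illustration.

```python
import re

# Assumption: each endnote starts a line with an <h3 ...> wrapping an anchor named _ftnN.
NOTE_START_RE = re.compile(r'^<h3[^>]*><a [^>]*name="_ftn\d+"', re.MULTILINE)
# Captures the endnote number and everything after the closing </sup></a>.
NOTE_LINE_RE = re.compile(r'.*?name="_ftn(\d+)".*?</sup></a>(.*)')


def split_document(html):
    """Step A: return (body, endnotes), split at the first endnote line."""
    match = NOTE_START_RE.search(html)
    if match is None:
        return html, ""
    return html[:match.start()], html[match.start():]


def endnote_array(endnotes):
    """Steps B and C: reduce each endnote line to a (number, text) pair."""
    pairs = []
    for line in endnotes.splitlines():
        m = NOTE_LINE_RE.match(line)
        if m:
            pairs.append((int(m.group(1)), m.group(2).strip()))
    return pairs
```

Step D would then walk the resulting list and substitute each text for the matching reference in the body.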


Another alternative (for me) would be to do the job with BBEdit and AppleScript using the same general methodology, except that I'd use the page of footnotes as the array.

Pros:

  - Easier to prototype and visualize.

Cons:

  - Much slower execution time.

--
Best Regards,
Chris

Media Mouth

May 14, 2021, 9:04:55 PM
to BBEdit Talk
On May 14, 2021, at 1:08 PM, @lbutlr <kre...@kreme.com> wrote:

That is going to be harder if your endnotes do not have closing tags.  I am not sure that the jquery suggestion will work at all, and a script is going to have to know where the note ends, which will take some code depending on how the footnote is formatted.

Point well taken: if the HTML is malformed, jQuery won't work until it's fixed, but I suspect it's just a cut & paste error in the OP?
If not, some JS automation can probably fix it, and barring that, some manual fixes.

The JS/JQ code would be something like this (forgive any typos -- couldn't test, and is based on your example.)

$("a").each((index, inlineA) => { // jQuery passes (index, element) to the callback
  var href = inlineA.getAttribute("href") || ""; // raw attribute value, not the resolved URL
  if (!/^#_ftn\d+$/.test(href)) return; // skip anchors that aren't inline references (including the #_ftnref back-links)
  var inlineSup = inlineA.parentNode; // the <sup> element that contains inlineA
  var endNoteName = href.substr(1); // the endnote anchor name, taken from the inline anchor's href
  var endNoteA = $(`a[name="${endNoteName}"]`);
  if (endNoteA.length == 0) {
    console.log("Failed to find this endnote " + endNoteName);
    return;
  }
  if (endNoteA.length > 1) console.log("Note: There's more than one endnote!");
  endNoteA = endNoteA[0]; // use only the first endnote for this inline anchor
  var endNoteH3 = endNoteA.parentNode; // the element containing the endnote anchor; assumes the repaired HTML keeps the note text inside it
  var pbEndnote = $("<pb_endnote>").html(endNoteH3.innerHTML)[0]; // make the <pb_endnote> and fill it; needs a little cleanup
  $(pbEndnote).find("a").remove(); // this is the cleanup
  $(inlineSup).replaceWith(pbEndnote); // puts the <pb_endnote> into place
});





Roland Küffner

May 17, 2021, 9:22:16 AM
to BBEdit Talk
Hi,

It is not quite clear how consistent your search and replace patterns are. But if they are consistent, BBEdit's Canonize command could deliver an easy solution.

In a nutshell: you can use a simple text file as a search-replace dictionary (think of a text-based key-value array). Each line in such a text file would have the simple form:

searchterm<tab>replaceterm


So with your example a line might look something like this:

_ftnref19    <pb_endnote>See Werner G. Marx, &quot;Money Matters in Matthew,&quot; <i>Bibliotheca Sacra</i> 136:542 (April-June 1979):148-57.</pb_endnote>


In Endnote you should be able to set up an export filter (or are those called Styles? I have not used Endnote for years), but it should not be too difficult to assemble a custom text file that you can use as a lookup table with Canonize.

Canonize is a pretty powerful feature. You can even use grep patterns in those files. Be sure to check out the fine manual to learn more about them.
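
A lookup file in the `searchterm<tab>replaceterm` form described above could also be generated by a short script. The Python sketch below is only an illustration: it assumes the search term should be the full inline link markup from the sample rather than the bare `_ftnref19`, and that `notes` is a dictionary of endnote number to text built elsewhere. How Canonize treats special characters such as `[` in the search term is something to confirm in the BBEdit manual.

```python
def canonize_lines(notes):
    """Build one 'search<TAB>replace' line per endnote.

    notes maps an endnote number (int) to its text. Using the full inline
    link markup as the search term is an assumption based on the sample.
    """
    lines = []
    for num in sorted(notes):
        search = f'<sup><a href="#_ftn{num}" name="_ftnref{num}">[{num}]</a></sup>'
        # Collapse all whitespace runs to single spaces, so a stray tab or
        # newline in the note cannot break the one-tab-per-line layout.
        text = " ".join(notes[num].split())
        lines.append(f"{search}\t<pb_endnote>{text}</pb_endnote>")
    return lines
```

Writing the result out with `"\n".join(...)` gives a file with exactly one tab per line, which is the shape the dictionary format above expects.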

Regards,
Roland


@lbutlr

May 18, 2021, 4:20:21 PM
to BBEdit Talk
On 17 May 2021, at 07:22, Roland Küffner <medien...@gmail.com> wrote:
> it is not quite clear, how consistent your search and replace patterns are. But if they are, BBEdit's Canonize-command could deliver an easy solution.

Oh, that is a much better solution. And a simple grep pattern on the footnotes file will easily format it for this. Where is this in the menus? I can't find "Cannon" in the help.

--
I got fired from the zoo for braiding the snakes.

@lbutlr

May 18, 2021, 4:36:32 PM
to BBEdit Talk
On 18 May 2021, at 14:20, @lbutlr <kre...@kreme.com> wrote:
> On 17 May 2021, at 07:22, Roland Küffner <medien...@gmail.com> wrote:
>> it is not quite clear, how consistent your search and replace patterns are. But if they are, BBEdit's Canonize-command could deliver an easy solution.
>
> Oh, that is a much better solution. And a simple grep pattern on the footnotes file will easily format it for this. Where is this in the menus? I can't find "Cannon" in the help.

Oops. I R dum. "Cannon" doesn't work 😃

--
'Nothing works against magic. Except stronger magic. And then the
only thing that beats stronger magic is even stronger magic. And
the next thing you know...' 'Phooey?' --Sourcery

Cooper Cavalier

May 18, 2021, 5:31:01 PM
to BBEdit Talk
Thanks for all the input.  I am working on preparing another book (Isaiah) and then I’m going to put my mind to this. Hopefully it doesn’t explode since my real programming experience is in COBOL, many years ago with IBM mainframe computers.  This is one of those things where I know it can be done (I mean, it’s just logic).  So…fingers crossed!