Formatting Multiple Choice Questions Using Regex


anotherhoward

Sep 13, 2023, 11:32:46 AM
to BBEdit Talk
The image below (IMAGE 1) shows a sample of the original format of the multiple choice questions I need to convert into another format. As there could be from 10-99 of them, I made the third sample #10 so that one has a double-digit number.

If the solution can be presented in REGEX, that would be helpful; however, I am open to other solutions.

Howard

IMAGE 1
MC1.png
The image after this one (IMAGE 2) shows how I need the first image's contents reformatted.

Here is the input in text format:
1. Why is the "Description" component important when reflecting on sports?
   a) It helps you set future goals.
   b) It provides context and sets the stage for reflection.
   c) It summarizes the main lessons learned.
   d) It assesses the positive and negative aspects.
   
2. Why is the "Evaluation" component important?
   a) This is choice one.
   b) This is choice two.
   c) This is choice three.
   d) This is choice four.
   
10. Why is the "Analysis" component important?
   a) This is choice one.
   b) This is choice two.
   c) This is choice three.
   d) This is choice four.

Here is how I need the first image's contents reformatted.

IMAGE 2
MC-1 reformatted.png
In the reformat:
1. Each item starts with "MC."
2. Immediately after "MC" is the multiple-choice question.
3. Then, for each item, there are the four possible answers, from "Choice 1" to "Choice 4."
4. Each possible answer is followed by either "Correct" or "Incorrect": "Correct" appears once per item and "Incorrect" three times, with the position of "Correct" varying.
5. Each of an item's components except its last one needs to be followed by a TAB. The last one in each item is followed by a RETURN.
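
As an illustration of the target layout described in the list above, here is a minimal Python sketch that builds one output row. It rests on assumptions: Image 2 is not available as text, so the literal labels "Choice 1" to "Choice 4", the use of "\n" for RETURN, and the function name `format_mc_row` are all guesses made for this example, not part of the original request.

```python
def format_mc_row(question, correct_choice, n_choices=4):
    """Build one tab-separated MC row; correct_choice is 1-based (hypothetical helper)."""
    if not 1 <= correct_choice <= n_choices:
        raise ValueError("correct_choice out of range")
    fields = ["MC", question]
    for i in range(1, n_choices + 1):
        fields.append("Choice %d" % i)
        # Exactly one choice per item is marked Correct; the rest are Incorrect.
        fields.append("Correct" if i == correct_choice else "Incorrect")
    # Every field but the last is followed by a tab; the row ends with a line break.
    return "\t".join(fields) + "\n"


row = format_mc_row('Why is the "Evaluation" component important?', 2)
```

Each row then has ten fields separated by nine tabs, which is easy to check after any conversion.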

Kevin Shay

Sep 13, 2023, 12:34:59 PM
to bbe...@googlegroups.com
How do you know which answer is correct for each question?

--
This is the BBEdit Talk public discussion group. If you have a feature request or need technical support, please email "sup...@barebones.com" rather than posting here. Follow @bbedit on Twitter: <https://twitter.com/bbedit>
---
You received this message because you are subscribed to the Google Groups "BBEdit Talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bbedit+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bbedit/c26f2d21-c067-4489-8c90-0b9e7475c80dn%40googlegroups.com.
Message has been deleted

Kevin Shay

Sep 13, 2023, 12:45:39 PM
to bbe...@googlegroups.com
Right... but given the input, how do I determine which response to output as "Correct" in the output?

On Wed, Sep 13, 2023 at 12:43 PM 'anotherhoward' via BBEdit Talk <bbe...@googlegroups.com> wrote:
> Each question has one correct response associated with it. See Image 2.
> Howard

anotherhoward

Sep 13, 2023, 12:51:46 PM
to BBEdit Talk
Kevin,

The correct answers are shown in Image 2.

Here are the correct answers:
Question 1: C
Question 2: B
Question 3: C

Howard

anotherhoward

Sep 13, 2023, 12:55:52 PM
to BBEdit Talk
Kevin,
Is this of help? I’ve underlined the correct answers in the input.
Howard

Here is the input in text format:
1. Why is the "Description" component important when reflecting on sports?
   a) It helps you set future goals.
   b) It provides context and sets the stage for reflection.
   c) It summarizes the main lessons learned.
   d) It assesses the positive and negative aspects.
   
2. Why is the "Evaluation" component important?
   a) This is choice one.
   b) This is choice two.
   c) This is choice three.
   d) This is choice four.
   
10. Why is the "Analysis" component important?
   a) This is choice one.
   b) This is choice two.
   c) This is choice three.
   d) This is choice four.

Bruce Van Allen

Sep 13, 2023, 1:06:21 PM
to BBEdit Talk
Here’s what I think Kevin is getting at: if you want a generalized method that works for every/any question, not just the three in your example, then that method has to know which answer to designate as correct.

E.g., you’ve said there can be as many as 99 questions; which answer for question 3 would be the correct one? For question 4? etc.

There’s nothing in image 1 (the input) that indicates whether a), b), c), or d) is the correct answer for each question.

— Bruce

_bruce__van_allen__santa_cruz_ca_

anotherhoward

Sep 13, 2023, 1:11:11 PM
to BBEdit Talk

Bruce,

Please look at my last response, and let me know if that provides the needed information.

Thanks,
Howard

Bruce Van Allen

Sep 13, 2023, 1:44:00 PM
to bbe...@googlegroups.com
anotherhoward,

So you’re saying that the input text will have the correct answers for each question underlined?

This seems challenging:

* A regular expression will not detect text formatting such as underlining - in fact in plain text such as what we work with in BBEdit, there IS no underlining. Some word processors allow searching by text formatting; whether they’re scriptable I don’t know.

* It’s your work :), but it seems tedious to have to manually underline or otherwise mark the correct answers for each of up to 99 questions, especially if the only reason to do so is to help with this transformation. When I’ve built quizzes, the answers are in a separate index file that is programmatically accessible for grading quizzes; such a thing would also be useful for the kind of script that could transform your input into your desired output.

# Answers_Quiz_016.txt
Q A
1 a
2 c
3 a
4 d
5 c
...


* I suppose this is also a matter of what control you have over the input - the set of questions and the knowledge/designation of the correct answers. Also, how frequently you have to do this.

More description?
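
To make the separate answer-file idea concrete, here is a minimal Python sketch that reads a key in the "Q A" layout shown above into a lookup table. The function name `parse_answer_key` and the choice to skip any line that is not "number, letter" are assumptions for illustration only.

```python
def parse_answer_key(text):
    """Map question number -> answer letter from lines like '1 a' (hypothetical helper)."""
    key = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip the comment line, the 'Q A' header, blank lines, and anything malformed.
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        key[int(parts[0])] = parts[1].lower()
    return key


sample_key = """# Answers_Quiz_016.txt
Q A
1 a
2 c
3 a
4 d
5 c
"""
answers = parse_answer_key(sample_key)
```

A conversion script could then look up `answers[question_number]` to decide which choice gets "Correct".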

anotherhoward

Sep 13, 2023, 2:55:34 PM
to BBEdit Talk

Hi Bruce,

I misunderstood what was being asked and appreciate the time you put into your response. I do have a separate answer file for each of my tests. Further, in the forum, I only underlined the correct answers’ text because I thought that was the information I was being asked for. My school’s LMS will grade the exams for me once I supply it with the correct answers, which is easy for me to do.

To make things easier here, since most of my tests contain no more than 20 multiple-choice questions, for now please just use the word “Incorrect” for all four possible answers to each question (Choice 1 to Choice 4). Then, once I have the code that will enable me to reformat a specific test’s questions, it should take me only a few minutes in BBEdit to change the correct answer in each question from “Incorrect” to “Correct.”

Please let me know if you have any questions about what I wrote.

Howard

David G Wagner

Sep 13, 2023, 6:08:55 PM
to bbe...@googlegroups.com
That is still a lot of work on your end. Why not mark the correct answer with a square bracket or a }? Then it could hopefully be handled in one pass, and no one has to do extra work. Just a thought…

Wags ;)

WagsWorld
Hebrews 4:15
Ph(primary) : 408-914-1341
Ph(secondary): 408-761-7391

anotherhoward

Sep 13, 2023, 6:38:13 PM
to BBEdit Talk
David,
That sounds like a viable solution. I'm open to that idea. Do I have to do anything to my submission for it to be done the way you suggested?
Howard

David G Wagner

Sep 14, 2023, 12:29:54 AM
to bbe...@googlegroups.com
I am not the best at regex; the idea of how to indicate which answer is correct was really all I was offering. If no one comes forward with a solution, then I would give it a shot for you.


Wags ;)

WagsWorld
Hebrews 4:15
Ph(primary) : 408-914-1341
Ph(secondary): 408-761-7391

anotherhoward

Sep 14, 2023, 10:13:58 AM
to BBEdit Talk

Here is new input updated to indicate the correct answer {C} in each multiple-choice question.

1. Why is the "Description" component important when reflecting on sports?
   a) It helps you set future goals.
   b) It provides context and sets the stage for reflection.
   c) It summarizes the main lessons learned. {C}

   d) It assesses the positive and negative aspects.
   
2. Why is the "Evaluation" component important?
   a) This is choice one.
   b) This is choice two. {C}

   c) This is choice three.
   d) This is choice four.
   
10. Why is the "Analysis" component important?
   a) This is choice one.
   b) This is choice two.
   c) This is choice three. {C}

   d) This is choice four.
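
Since non-regex solutions were also welcome, here is a minimal Python sketch that reads the marked-up input above line by line. It skips blank and whitespace-only lines, so the empty lines that follow the {C} lines in this post do not matter. The names `QUESTION_RE`, `CHOICE_RE`, and `parse_questions` are invented for this example, and it assumes choices are always lettered a) to d).

```python
import re

QUESTION_RE = re.compile(r"^\s*(\d+)\.\s+(.*\S)\s*$")
CHOICE_RE = re.compile(r"^\s*([a-d])\)\s+(.*?)\s*(\{C\})?\s*$")


def parse_questions(text):
    """Return a list of dicts with number, question, choices, and correct letter."""
    items = []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank or whitespace-only lines carry no information
        m = CHOICE_RE.match(line)
        if m and items:
            letter, body, mark = m.groups()
            items[-1]["choices"].append(body)
            if mark:
                items[-1]["correct"] = letter  # the {C} marker flags this choice
            continue
        m = QUESTION_RE.match(line)
        if m:
            items.append({"number": int(m.group(1)), "question": m.group(2),
                          "choices": [], "correct": None})
    return items


sample = "\n".join([
    '1. Why is the "Description" component important when reflecting on sports?',
    "   a) It helps you set future goals.",
    "   b) It provides context and sets the stage for reflection.",
    "   c) It summarizes the main lessons learned. {C}",
    "",
    "   d) It assesses the positive and negative aspects.",
    "   ",
    '10. Why is the "Analysis" component important?',
    "   a) This is choice one.",
    "   b) This is choice two.",
    "   c) This is choice three. {C}",
    "",
    "   d) This is choice four.",
])
items = parse_questions(sample)
```

Once the questions are in this structured form, writing the tab-separated rows is a simple loop over `items`.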

Rick Gordon

Sep 14, 2023, 7:49:03 PM
to BBEdit Talk
This will require two regexes, which could be set up as a text factory.

The first will handle everything except for the Correct field. It is:

--
FIND:
(?:^\d+\.\h+)([^"\r]+)("?)([^"\r]+)("?)(.+)\r\h+(?:a\)\h)(?:[^{\r]+)(\{C\})?\r\h+(?:b\)\h)(?:[^{\r]+)(\{C\})?\r\h+(?:c\)\h)(?:[^{\r]+)(\{C\})?\r\h+(?:d\)\h)(?:[^{\r]+)(\{C\})?\r\h*

REPLACE:
MC\t"\1\2\2\3\4\4\5"\tChoice 1\tIncorrect\6\tChoice 2\tIncorrect\7\tChoice 3\tIncorrect\8\tChoice 4\tIncorrect\9
--

That would change this:

1. Why is the "Description" component important when reflecting on sports?
   a) It helps you set future goals.
   b) It provides context and sets the stage for reflection.
   c) It summarizes the main lessons learned. {C}
   d) It assesses the positive and negative aspects.

2. Why is the "Evaluation" component important?
   a) This is choice one. {C}
   b) This is choice two.
   c) This is choice three.
   d) This is choice four.

10. Why is the "Analysis" component important?
   a) This is choice one.
   b) This is choice two.
   c) This is choice three.
   d) This is choice four. {C}

…to this:

MC    "Why is the ""Description"" component important when reflecting on sports?"    Choice 1    Incorrect    Choice 2    Incorrect    Choice 3    Incorrect{C}    Choice 4    Incorrect
MC    "Why is the ""Evaluation"" component important?"    Choice 1    Incorrect{C}    Choice 2    Incorrect    Choice 3    Incorrect    Choice 4    Incorrect
MC    "Why is the ""Analysis"" component important?"    Choice 1    Incorrect    Choice 2    Incorrect    Choice 3    Incorrect    Choice 4    Incorrect{C}
--

Then, a second regex:
--
FIND:
Incorrect\{C\}

REPLACE:
Correct
--

…would change the field for the correct answer.

Rick Gordon 
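
For anyone who wants to try the same two passes outside BBEdit, here is a minimal Python sketch. It is a translation made for illustration, not the code from the post: Python's `re` module has no `\h`, so `[ \t]` stands in for it, `\n` replaces `\r` as the line break, and `re.MULTILINE` is needed so that `^` matches at the start of each line. The names `FIND`, `REPLACE`, `SAMPLE`, and `convert` are invented here.

```python
import re

# First pass: one match per question block (question line plus choices a-d).
FIND = re.compile(
    r'^\d+\.[ \t]+([^"\n]+)("?)([^"\n]+)("?)(.+)\n'
    r'[ \t]+a\)[ \t][^{\n]+(\{C\})?\n'
    r'[ \t]+b\)[ \t][^{\n]+(\{C\})?\n'
    r'[ \t]+c\)[ \t][^{\n]+(\{C\})?\n'
    r'[ \t]+d\)[ \t][^{\n]+(\{C\})?\n[ \t]*',
    re.MULTILINE,
)
# Groups 6-9 are empty unless that choice carried the {C} marker.
REPLACE = (
    r'MC\t"\1\2\2\3\4\4\5"'
    r'\tChoice 1\tIncorrect\6\tChoice 2\tIncorrect\7'
    r'\tChoice 3\tIncorrect\8\tChoice 4\tIncorrect\9'
)


def convert(text):
    first_pass = FIND.sub(REPLACE, text)
    # Second pass: turn the marked field into "Correct".
    return re.sub(r'Incorrect\{C\}', 'Correct', first_pass)


SAMPLE = '''1. Why is the "Description" component important when reflecting on sports?
   a) It helps you set future goals.
   b) It provides context and sets the stage for reflection.
   c) It summarizes the main lessons learned. {C}
   d) It assesses the positive and negative aspects.

2. Why is the "Evaluation" component important?
   a) This is choice one. {C}
   b) This is choice two.
   c) This is choice three.
   d) This is choice four.

10. Why is the "Analysis" component important?
   a) This is choice one.
   b) This is choice two.
   c) This is choice three.
   d) This is choice four. {C}
'''
result = convert(SAMPLE)
```

Note that the pattern expects each choice on the line directly after the previous one, so an input with empty lines inside a question block would not match.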

anotherhoward

Sep 15, 2023, 11:01:49 AM
to BBEdit Talk
I pasted your first REGEX code and the relevant input into Pattern Playground in both the Find and Replace sections and nothing is happening. When I click Next, I just get a beep and am getting the message "No matches found." I've checked everything and it matches what's in your post in this forum. What could I be doing wrong?
Howard

Gordon REGEX (1).png

Rick Gordon

Sep 15, 2023, 4:44:12 PM
to BBEdit Talk
Howard, you would need to have this in Replace:

MC\t"\1\2\2\3\4\4\5"\tChoice 1\tIncorrect\6\tChoice 2\tIncorrect\7\tChoice 3\tIncorrect\8\tChoice 4\tIncorrect\9

Your replace field shows the desired outcome, not the Replace element of the regex.

Rick Gordon

anotherhoward

Sep 15, 2023, 4:59:20 PM
to BBEdit Talk

After I fixed the Replace pattern, BBEdit still beeps when I click Next in the Pattern Playground. The search is still not finding anything.

Rick Gordon

Sep 15, 2023, 5:42:30 PM
to BBEdit Talk
The Find code should work based on your examples, but explaining it out:
  1. (?:^\d+\.\h+)
    Starting at the beginning of the paragraph, any number of digits followed by a period and at least one space or tab
    Process and ignore. It will be replaced by MC followed by a tab. [ MC\t ]
  2. ([^"\r]+)
    Anything but a straight quote or a return
    Will be preceded by a straight quote, followed by the captured text. [ "\1 ]
  3. ("?)
    One or no straight quotes
    Will be replaced by itself twice; so one quote will be replaced by two quotes and nothing will be replaced by nothing. [ \2\2 ]
  4. ([^"\r]+)
    Same as #2 [ \3 ]
  5. ("?)
    Same as #3 [ \4\4 ]
  6. (.+)
    Anything else to the end of the paragraph.
    The captured text, followed by a straight quote [ \5" ]
  7. \r\h+
    A return followed by one or more spaces or tabs
    Replaced by one tab [ \t ]
  8. (?:a\)\h)(?:[^{\r]+)(\{C\})?\r\h+
    a. [ (?:a\)\h) ]  Lowercase "a" followed by a right paren and a space or tab;
    b. [ (?:[^{\r]+) ]  Anything but a left brace or a return, up to the return;
    c. [ (\{C\})? ]  One or zero instances of capital "C" enclosed in curly braces;
    d. [ \r\h+ ]  A return followed by one or more spaces or tabs.
    Process and ignore. It will be replaced by "Choice 1" followed by a tab, followed by "Incorrect", followed by the result of one or zero instances of capital "C" enclosed in curly braces, followed by a tab [ Choice 1\tIncorrect\6\t ]
  9. (?:b\)\h)(?:[^{\r]+)(\{C\})?\r\h+
    For answer "b", analogous to the explanation in step 8 [ Choice 2\tIncorrect\7\t ]
  10. (?:c\)\h)(?:[^{\r]+)(\{C\})?\r\h+
    For answer "c", analogous to the explanation in step 8 [ Choice 3\tIncorrect\8\t ]
  11. (?:d\)\h)(?:[^{\r]+)(\{C\})?\r\h*
    For answer "d", analogous to the explanation in step 8, except for allowing for no spaces after the return, and not adding a final tab [ Choice 4\tIncorrect\9 ]
---------


Rick Gordon

Sep 15, 2023, 5:45:20 PM
to BBEdit Talk
Well, the numbered list formatting got screwed up from step 8, but I think the intent is clear.

Rick Gordon

Sep 15, 2023, 6:38:25 PM
to BBEdit Talk
Actually, in case there should be no quoted element in the question, step 4 and step 6 should end with * instead of +, since in that case, the whole line would be captured in step 2.

So I'm revising my FIND component to:

(?:^\d+\.\h+)([^"\r]+)("?)([^"\r]*)("?)(.*)\r\h+(?:a\)\h)(?:[^{\r]+)(\{C\})?\r\h+(?:b\)\h)(?:[^{\r]+)(\{C\})?\r\h+(?:c\)\h)(?:[^{\r]+)(\{C\})?\r\h+(?:d\)\h)(?:[^{\r]+)(\{C\})?\r\h*
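
As a quick check of the revised pattern on a question with no quoted word, here is a minimal Python sketch. It is an adaptation under stated assumptions, not the BBEdit setup itself: `[ \t]` replaces `\h`, `\n` replaces `\r`, and a replacement function (a Python convenience) writes "Correct" or "Incorrect" directly, so no second pass is needed. The names `ITEM`, `to_row`, and `convert_one_pass`, and the sample question numbered 3, are hypothetical.

```python
import re

# Revised FIND pattern, adapted for Python's re module.
ITEM = re.compile(
    r'^\d+\.[ \t]+([^"\n]+)("?)([^"\n]*)("?)(.*)\n'
    r'[ \t]+a\)[ \t][^{\n]+(\{C\})?\n'
    r'[ \t]+b\)[ \t][^{\n]+(\{C\})?\n'
    r'[ \t]+c\)[ \t][^{\n]+(\{C\})?\n'
    r'[ \t]+d\)[ \t][^{\n]+(\{C\})?\n[ \t]*',
    re.MULTILINE,
)


def to_row(m):
    """Build one tab-separated row from a match."""
    before, q1, quoted, q2, after = m.group(1, 2, 3, 4, 5)
    # Doubling each captured quote mirrors \2\2 and \4\4 in the REPLACE string.
    question = '"' + before + q1 * 2 + quoted + q2 * 2 + after + '"'
    fields = ["MC", question]
    for i in range(4):
        marked = m.group(6 + i) is not None  # groups 6..9 hold the {C} markers
        fields += ["Choice %d" % (i + 1), "Correct" if marked else "Incorrect"]
    return "\t".join(fields)


def convert_one_pass(text):
    return ITEM.sub(to_row, text)


unquoted = (
    "3. Which component comes last?\n"
    "   a) This is choice one.\n"
    "   b) This is choice two. {C}\n"
    "   c) This is choice three.\n"
    "   d) This is choice four.\n"
)
row = convert_one_pass(unquoted)
```

With no quote in the question, the first capture group takes the whole line and the remaining question groups match as empty strings, so the text passes through unchanged inside the outer quotes.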

anotherhoward

Sep 15, 2023, 7:13:15 PM
to BBEdit Talk

Thanks for the detailed explanation. I’m working through it.