Query on MOSAIC event partition: block segmentation

upre...@gmail.com

Sep 30, 2022, 6:00:44 AM

to mosaic

Dear Dr Balijepalli,

I am a PhD student working on nanopore signal processing, and I came across your very useful MOSAIC package for analyzing nanopore signals. I have installed the MOSAIC package, and it is running smoothly.

However, because of my limited ability to follow the source code, I am unable to figure out exactly how many blocks are created from the whole signal. I understand that the parameter blockSizeSec in eventSegment.py determines the size of the window for the event partition. So, if I have a 100-second signal and choose blockSizeSec to be 1 second, each block will be 1 second long. My question is: do you divide the 100-second signal into 100 blocks without any overlap? If so, is it not possible that some events are lost when the signal is segmented without overlap?

I hope I have made my question clear. Thanking you in advance.

Kind Regards,

Pratima Upretee

Balijepalli, Arvind K. (Fed)

Sep 30, 2022, 10:05:00 AM

to mosaic

Hi Pratima,

The block size is the length of data that is read in at a time. Each block may contain several events, depending on your experiment. While each block of data is used to calculate some average values, such as the baseline current, the raw data is appended to a queue so that you never lose any events. The total number of events, both the ones that were properly analyzed and the ones that were rejected, is available either by querying the database or by looking at the run log. In the MOSAIC interface, the statistics page also displays this information.
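As a rough illustration of the point above, the following sketch reads a signal in consecutive blocks, computes a per-block average, and appends every raw sample to a queue. It is not MOSAIC code: the helper `read_blocks`, the 1 kHz sampling rate, and the placeholder sample values are all assumptions made for the example, and non-overlapping blocks are assumed rather than confirmed.

```python
from collections import deque


def read_blocks(signal, block_size):
    """Yield consecutive, non-overlapping blocks of at most block_size samples.

    Hypothetical helper for illustration; not part of MOSAIC.
    """
    for start in range(0, len(signal), block_size):
        yield signal[start:start + block_size]


# Assumed example: a 100 s recording sampled at 1 kHz with placeholder values.
sampling_rate_hz = 1000
signal = [float(i % 7) for i in range(100 * sampling_rate_hz)]
block_size = 1 * sampling_rate_hz  # corresponds to a block size of 1 s

queue = deque()      # raw samples waiting to be searched for events
block_means = []     # one average value per block
n_blocks = 0

for block in read_blocks(signal, block_size):
    block_means.append(sum(block) / len(block))  # per-block statistic
    queue.extend(block)                          # raw data is kept, not dropped
    n_blocks += 1

print(n_blocks, len(queue) == len(signal))
```

In this sketch the block boundaries only affect where the averages are computed. The queue ends up holding the full, contiguous signal, so a block boundary by itself does not remove any samples.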

Hope this helps.

Arvind

--
View the MOSAIC documentation at https://pages.nist.gov/mosaic.
---
You received this message because you are subscribed to the Google Groups "mosaic" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mosaic+un...@list.nist.gov.
To view this discussion on the web visit https://groups.google.com/a/list.nist.gov/d/msgid/mosaic/a78b9c89-91fa-428d-ad52-1f29655ede5cn%40list.nist.gov.

Pratima Upretee

Oct 3, 2022, 7:15:18 AM

to mos...@list.nist.gov

Dear Dr. Balijepalli,

Thank you so much for the reply. I understand that the block size is used to determine the threshold current at a given time instant. Is that its only role in the code?

Furthermore, I still do not understand how the raw data are appended to each other. Is there any overlap between them?

Suppose that I have a 3-second signal and choose a block size of 1 second. If one of my events starts at 2.80 s and ends at 3.20 s, will this event be detected?

I hope this example makes my question clear.

Thanking you.

Have a nice day.

Warm regards,

Pratima Upretee.

Balijepalli, Arvind K. (Fed)

Oct 3, 2022, 10:01:14 AM

to mosaic

Hi Pratima,

Yes, block size is used to compute the baseline, but also to check for drift, track the open current, etc. You can see the relevant functions in mosaic/partition/metaEventPartition.py.
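A minimal sketch of what per-block statistics of this kind could look like is shown below. It is not taken from metaEventPartition.py: the function name `block_stats`, the `n_sigma` parameter, and the rule "threshold = mean minus a fixed number of standard deviations" are assumptions for illustration, and drift is estimated here with a plain least-squares slope.

```python
import statistics


def block_stats(block, sampling_rate_hz, n_sigma=5.0):
    """Summarize one block of open-channel current (hypothetical helper).

    Assumes the block has at least two samples.
    """
    mean = statistics.fmean(block)   # baseline (open-channel) current
    sd = statistics.pstdev(block)    # baseline noise

    # Least-squares slope of current versus time, used as a simple drift estimate.
    times = [i / sampling_rate_hz for i in range(len(block))]
    t_mean = statistics.fmean(times)
    num = sum((t - t_mean) * (x - mean) for t, x in zip(times, block))
    den = sum((t - t_mean) ** 2 for t in times)
    slope = num / den

    return {
        "mean": mean,
        "sd": sd,
        "slope": slope,
        # Assumed rule: events are samples that fall below this value.
        "threshold": mean - n_sigma * sd,
    }


# Placeholder data: 1 s of samples alternating between 100 and 102 at 1 kHz.
flat = block_stats([100.0, 102.0] * 500, sampling_rate_hz=1000)
print(flat["mean"], flat["sd"], flat["threshold"])
```

A large slope relative to the noise would indicate that the baseline is drifting within the block, which is one reason to recompute these values block by block instead of once for the whole recording.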

Data is requested on demand by the partitioning code. In your example, for an event starting at 2.8 s with a block size of 1 s, metaEventPartition will detect the start of the event and, when it runs out of data, request more. This request is fulfilled by mosaic/trajio/metaTrajIO.py, which will just add one more block of data to the pipeline. Once that is complete, metaEventPartition will continue analyzing data until the event ends. If there is no data available, for example, if the measurement was stopped in the middle of the event, then that partial event will be rejected and processing will stop.
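The on-demand behavior described above can be sketched as follows. This is a simplified stand-in, not the MOSAIC implementation: `BlockSource` and `find_events` are hypothetical names, the 100 Hz sampling rate and current values are placeholders, and a single fixed threshold is assumed to keep the example short (how the threshold is updated between blocks is not modeled here).

```python
class BlockSource:
    """Hands out consecutive blocks on request (hypothetical stand-in for a data reader)."""

    def __init__(self, signal, block_size):
        self._signal = signal
        self._block_size = block_size
        self._pos = 0

    def next_block(self):
        block = self._signal[self._pos:self._pos + self._block_size]
        self._pos += len(block)
        return block  # empty list once the data is exhausted


def find_events(source, threshold):
    """Return (events, partial_start).

    events is a list of (start_index, end_index) pairs; partial_start is the
    start index of an event that was still open when the data ran out, or None.
    """
    events = []
    buffer = []   # all samples received so far
    start = None  # start index of the event currently being tracked
    i = 0
    while True:
        if i >= len(buffer):
            more = source.next_block()  # out of data: request one more block
            if not more:
                break                   # nothing left to read
            buffer.extend(more)
        below = buffer[i] < threshold
        if below and start is None:
            start = i                   # event begins
        elif not below and start is not None:
            events.append((start, i))   # event ends
            start = None
        i += 1
    return events, start


fs = 100  # assumed sampling rate in Hz, so one block of 1 s is 100 samples


def make_signal(duration_s):
    """Placeholder signal: baseline 100.0 with a blockade to 50.0 from 2.80 s to 3.20 s."""
    return [50.0 if 280 <= i < 320 else 100.0 for i in range(duration_s * fs)]


# 4 s of data: the event crosses the boundary between the third and fourth block.
events, partial = find_events(BlockSource(make_signal(4), fs), threshold=75.0)

# 3 s of data: the recording stops in the middle of the event.
short_events, short_partial = find_events(BlockSource(make_signal(3), fs), threshold=75.0)

print(events, partial, short_events, short_partial)
```

With 4 s of data the event that starts in one block and ends in the next is reported as a single event. With only 3 s of data the same event has no end, so in this sketch it is returned separately as a partial event instead of being counted.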

Thanks,

Pratima Upretee

Oct 4, 2022, 3:47:27 AM

to mos...@list.nist.gov

Dear Dr. Balijepalli,

Thank you so much for the nice explanation. Previously, I thought that the time series would not be appended and that events lying between two blocks would be discarded. Now it is clear to me that events are not missed at all, unless part of the event is missing from the raw (whole) signal.

But I have one last question. It might be a silly question, but I was wondering: if my event shares samples with two blocks (half of the samples in block A and half in block B), what will be the threshold current for detecting that particular event? As per my understanding, the threshold current for block A is different from the threshold current for block B.

Thanking you.

Have a nice day.

Kind regards,

Pratima Upretee

--

Pratima Upretee

GSM: +32465962522

Ghent University - imec

IDLab

AA Tower | Technologiepark Zwijnaarde 122

9052 Ghent

Balijepalli, Arvind K. (Fed)

Oct 4, 2022, 10:15:28 AM

to mosaic

Hi Pratima,

The events are processed in two steps: i) the event partition code (mosaic/partition) identifies an event by comparing it to a threshold, which includes all the individual levels within an event, and ii) each recognized event is processed based on the selected algorithm (mosaic/process).

If you have multiple levels within an event, I would select either ADEPT or cusumPLUS to identify the sub-states. If the data predominantly has a single state then ADEPT 2-state may be more appropriate and faster. The documentation (https://pages.nist.gov/mosaic) provides additional details on the differences.
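The two steps described above can be sketched as a small pipeline. The code below is only an illustration of the separation between partitioning and processing: `partition` and `process` are hypothetical functions, the processing step is a plain average over the event samples (it is neither ADEPT nor cusumPLUS), and the output names `blockade_depth` and `residence_time_s` are chosen for this example.

```python
def partition(signal, threshold):
    """Step i: return (start, end) index pairs where the signal is below threshold."""
    events = []
    start = None
    for i, x in enumerate(signal):
        if x < threshold and start is None:
            start = i
        elif x >= threshold and start is not None:
            events.append((start, i))
            start = None
    return events  # an event still open at the end of the data is dropped


def process(signal, event, open_current, sampling_rate_hz):
    """Step ii: summarize one event with a simple mean (stand-in for a real algorithm)."""
    start, end = event
    samples = signal[start:end]
    blocked_current = sum(samples) / len(samples)
    return {
        "blockade_depth": blocked_current / open_current,
        "residence_time_s": (end - start) / sampling_rate_hz,
    }


# Placeholder data: baseline 100.0 with one 20-sample blockade to 40.0 at 1 kHz.
sampling_rate_hz = 1000
signal = [100.0] * 50 + [40.0] * 20 + [100.0] * 30

found = partition(signal, threshold=70.0)
results = [process(signal, ev, open_current=100.0, sampling_rate_hz=sampling_rate_hz)
           for ev in found]
print(found, results)
```

Keeping the two steps separate means the choice of processing algorithm can change without touching how events are cut out of the raw signal.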

Pratima Upretee

Oct 4, 2022, 10:57:21 AM

to mos...@list.nist.gov

Dear Dr. Balijepalli,

I think I did not make myself clear. I am currently only concerned with the event partition code, that is, how you detect the events before proceeding to further analysis.

For example, I have a 4-second signal, and I choose a block size of 1 second. For calculating the threshold current and other parameters, we will then have 4 blocks of 1 second each. Let us call them Block_1, Block_2, Block_3 and Block_4. I will then have 4 threshold currents: threscurr1, threscurr2, threscurr3 and threscurr4. Now, if one of my events (let us call it event_34) starts at 2.80 s and ends at 3.20 s, it lies between Block_3 and Block_4. From our previous conversation, I understood that even if an event lies between two blocks, it is not lost and is detected. But now my question is: in this case, how is event_34 detected, and which threshold will be used (threscurr3 or threscurr4)?

I hope I have made my question clear here.

Thanking you.

Warm regards.

Pratima Upretee
