

Theodora Wallingford

Aug 3, 2024, 12:45:43 AM
to cuebravlana

At the end of the day, training a machine learning model is like studying for a test. You (the model) use learning resources such as books, past exams, flash cards, etc. (the train set) to perform well on a test/exam (the test set). Knowing your learning resources perfectly doesn't mean you are overfitting. You would only be overfitting if you knew those resources perfectly yet still couldn't perform well on the exam.

The purpose of a model will always be to minimize loss, not to increase accuracy. So the parameters of any model, trained with any optimizer such as Adam (a common optimizer), will gain momentum toward the parameter values where the loss is lowest, in other words "minimum deviation".
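As a minimal sketch of that point (assuming PyTorch, with a toy model and dummy data), note that the optimizer only ever sees the loss; accuracy is just a metric computed afterwards:

```python
import torch
import torch.nn as nn

# Toy setup: a linear classifier, dummy data, and the Adam optimizer.
model = nn.Linear(10, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

x = torch.randn(64, 10)            # dummy inputs
y = torch.randint(0, 2, (64,))     # dummy labels

for step in range(100):
    optimizer.zero_grad()
    loss = criterion(model(x), y)  # the quantity actually being minimized
    loss.backward()
    optimizer.step()               # Adam moves parameters toward lower loss

# Accuracy is only computed as a reporting metric, after the fact.
accuracy = (model(x).argmax(dim=1) == y).float().mean().item()
print(f"final loss: {loss.item():.4f}, accuracy: {accuracy:.3f}")
```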

But in your case the accuracy is not extreme (e.g. above 0.99), so it's safe to say that your model is performing well and is not overfitting. Good models do not overfit; they simply converge to some value, 0.924 in your case.

In my experience, if the end devices are Meraki MR devices, the reported length is consistently accurate (I did not measure the actual length, but the reported figure looks reasonable to me). The same is true when the end devices are PCs (mainly Intel NUCs).

However, I am getting wrong figures when the end devices are Sonos Amps or Dahua NVRs. For example, I have a Dahua NVR in the same rack, connected via a 2 m patch cable, but the cable test reported it as 19 m (1Gfdx and all 4 pairs OK).

Yes, I have had "weird" and inconsistent results as well. I don't consider it much more than a baseline test and don't completely trust it. That said, I think it is still a useful starting point, but I would not take the results to a cable contractor and tell them to go fix their cabling without a closer inspection first.

Not trying to necro a post, but I came across this thread while trying to figure out an XXX result we received from a cable test on pair 2 when troubleshooting a device that would not connect to the switch, and this post was the only reference to it I could find. I contacted support, and they gave me the following answer:

"XXX means pair 2 are failed to report voltage for unknown reason."

Hope this helps if anyone else comes across this. We're pretty sure the contractor installed two bad cables (the other one shows open on pair 4, so we have to force it onto 100Mbps), so we believe the result is likely accurate.

I was testing the cable length from a PoE port on the MS225 to a Cisco Catalyst 9100ax AP, and it was wildly inaccurate. It displayed 25.25 m, or approximately 83 ft, in the Meraki dashboard. I then used a Fluke MicroScanner2, and the cable length it displayed was 32 ft. A 51 ft variance is a massive inaccuracy. Bottom line: I will never use the cable length measurement from the Meraki device for any reason. Sad face emoji.

I am now trying to understand why this is happening and how to fix it. With my previous printer, there was a calibration test model that one would print and then input the measurements back into the software, and this would basically fix dimensional issues, shrinkage, etc.

I find it strange that there is no dimensional calibration in PrusaSlicer; even for injection-moulded parts they adjust dimensions just for plastic shrinkage, so shouldn't PrusaSlicer do the same?

My printer is VERY consistent: it comes up with the same deviation on every single print. Therefore the way my previous printer worked, where a set of prints was measured and the measurements re-entered in the slicer, seems reasonable to me.

The underlying reason that X and Y are more likely to be dimensionally off than Z is the mechanics of the axes. Z distance is controlled by stepper motor angle changes rotating a screw mechanism, whereas X and Y depend on the diameter of the drive pulley. It is easier to machine a screw with an accurate twist per distance than to get the diameter of the drive pulley exactly right.

The pulleys supplied with both my Prusas have a slightly smaller diameter than would move X and Y the desired distance. It is a small error, about 0.5 to 1% too little diameter. I have found some other pulleys with more accurate diameters, but later examples obtained by other users were reported to have eccentricity issues.
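A quick back-of-the-envelope sketch (Python, with hypothetical but typical values: a 1.8° stepper at 16× microstepping and a 16-tooth GT2 pulley, not the actual Prusa figures) shows how an undersized pulley scales every X/Y move and how the error could be compensated:

```python
# Assumed drive geometry (illustrative, not measured from a real printer).
full_steps_per_rev = 200        # 1.8 degree stepper
microstepping = 16
pulley_teeth = 16
belt_pitch_mm = 2.0             # GT2 belt

mm_per_rev = pulley_teeth * belt_pitch_mm                      # 32 mm of belt per revolution
steps_per_mm = full_steps_per_rev * microstepping / mm_per_rev

# Suppose the pulley's effective diameter is 0.75% too small: every commanded
# move then comes out 0.75% short.
diameter_error = -0.0075
commanded_mm = 100.0
printed_mm = commanded_mm * (1 + diameter_error)

# One way to compensate is to rescale steps/mm (or apply an XY scale factor).
corrected_steps_per_mm = steps_per_mm / (1 + diameter_error)

print(f"nominal steps/mm: {steps_per_mm:.2f}")
print(f"100 mm commanded -> {printed_mm:.2f} mm printed")
print(f"corrected steps/mm: {corrected_steps_per_mm:.2f}")
```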

You did not use the same source dataset for the test set. You should do a proper train/test split in which both splits have the same underlying distribution. Most likely you provided a completely different (and more agreeable) dataset for testing.
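As a minimal sketch (assuming scikit-learn and placeholder data), a proper split drawn from one source dataset looks like this:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the single source dataset.
X = np.random.rand(1000, 20)
y = np.random.randint(0, 2, size=1000)

X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2,      # hold out 20% as the test set
    stratify=y,         # preserve the class balance in both splits
    random_state=42,    # reproducible split
)
```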

The other answers are correct in most cases, but I'd like to offer another perspective. There are specific training regimes that can make the training data harder for the model to learn, for instance adversarial training or adding Gaussian noise to the training examples. In these cases, the benign (clean) test accuracy can be higher than the training accuracy, because benign examples are easier to evaluate. This isn't always a problem, however!
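A small sketch of the second case (PyTorch/torchvision assumed; the noise level is arbitrary): Gaussian noise is added only to the training examples, so the clean test set is genuinely easier.

```python
import torch
from torchvision import transforms

class AddGaussianNoise:
    """Add zero-mean Gaussian noise to an image tensor (training-time only)."""
    def __init__(self, std: float = 0.1):
        self.std = std

    def __call__(self, x: torch.Tensor) -> torch.Tensor:
        return x + torch.randn_like(x) * self.std

train_tfms = transforms.Compose([
    transforms.ToTensor(),
    AddGaussianNoise(std=0.1),   # training examples are made harder
])
test_tfms = transforms.Compose([
    transforms.ToTensor(),       # benign (clean) test examples
])
```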

I have a test set with 10 subfolders, where each subfolder name = label.
My question is: how can I calculate the accuracy after prediction? (How can I compare the predicted labels with the actual labels?)
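A framework-agnostic sketch of that comparison (the predict_label stub below is hypothetical and stands in for whatever your model's prediction call is): walk the test folder, take each subfolder name as the ground-truth label, and count matches.

```python
from pathlib import Path

def predict_label(image_path: Path) -> str:
    """Placeholder for the real model call; returns a predicted label string."""
    return "unknown"

test_dir = Path("test_set")                      # assumed layout: test_set/<label>/<image files>

actual, predicted = [], []
for image_path in sorted(test_dir.glob("*/*")):
    actual.append(image_path.parent.name)        # subfolder name is the ground-truth label
    predicted.append(predict_label(image_path))  # swap in your model's prediction here

correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual) if actual else 0.0
print(f"accuracy: {accuracy:.3f} ({correct}/{len(actual)})")
```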

Thanks @marcmuc for the quick response.
I followed the method you mentioned and I got the accuracy for the test set, but when I tried to show the top losses or draw the confusion matrix, it showed the real validation set data, not the test set.

I use it like this and it works, but it is really not comfortable. I don't know if this is the proper way of applying all the train transforms, and it also assumes that I know which normalization I need (if I drop normalization, the results become invalid).

What I am looking for is something like this:
- a model evaluation pipeline for exported models, not for production/API inference
- load the model in a black-box way (as we would for inference), using load_learner
- no knowledge of transforms, normalization, or anything else, just like load_learner/inference
- load a validation set (multiple images, with labels), so batch mode is preferred
- but use it for evaluating a validation dataset and running all the interpret functions

Drop-last does not happen on the validation set, which is why we set valid to the train set. This is before the DataBunch is created and before anything is lost to shuffling and dropping. You can see this if you do a show_batch on both the train and valid sets after databunching.
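The same behaviour is easy to see with plain PyTorch DataLoaders (a generic sketch, not the fastai internals): the train loader shuffles and drops the last incomplete batch, while the valid loader keeps every sample.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

ds = TensorDataset(torch.arange(10).float().unsqueeze(1))   # 10 samples

train_dl = DataLoader(ds, batch_size=4, shuffle=True, drop_last=True)    # 2 full batches, 2 samples dropped
valid_dl = DataLoader(ds, batch_size=4, shuffle=False, drop_last=False)  # 3 batches, nothing dropped

print(len(train_dl), len(valid_dl))   # -> 2 3
```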

In medicine and statistics, sensitivity and specificity mathematically describe the accuracy of a test that reports the presence or absence of a medical condition. If individuals who have the condition are considered "positive" and those who do not are considered "negative", then sensitivity is a measure of how well a test can identify true positives and specificity is a measure of how well a test can identify true negatives:
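In terms of true/false positives and negatives (the formulas themselves appear to have been dropped from this excerpt), the standard definitions are:

\[
\text{sensitivity} = \frac{TP}{TP + FN},
\qquad
\text{specificity} = \frac{TN}{TN + FP}
\]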

If the true status of the condition cannot be known, sensitivity and specificity can be defined relative to a "gold standard test" which is assumed correct. For all testing, both diagnoses and screening, there is usually a trade-off between sensitivity and specificity, such that higher sensitivities will mean lower specificities and vice versa.

A test which reliably detects the presence of a condition, resulting in a high number of true positives and low number of false negatives, will have a high sensitivity. This is especially important when the consequence of failing to treat the condition is serious and/or the treatment is very effective and has minimal side effects.

A test which reliably excludes individuals who do not have the condition, resulting in a high number of true negatives and low number of false positives, will have a high specificity. This is especially important when people who are identified as having a condition may be subjected to more testing, expense, stigma, anxiety, etc.

There are different definitions within laboratory quality control, wherein "analytical sensitivity" is defined as the smallest amount of substance in a sample that can accurately be measured by an assay (synonymously to detection limit), and "analytical specificity" is defined as the ability of an assay to measure one particular organism or substance, rather than others.[2] However, this article deals with diagnostic sensitivity and specificity as defined at top.

Imagine a study evaluating a test that screens people for a disease. Each person taking the test either has or does not have the disease. The test outcome can be positive (classifying the person as having the disease) or negative (classifying the person as not having the disease). The test results for each subject may or may not match the subject's actual status. In that setting:
- True positive: a person with the disease is correctly classified as having it.
- False positive: a person without the disease is incorrectly classified as having it.
- True negative: a person without the disease is correctly classified as not having it.
- False negative: a person with the disease is incorrectly classified as not having it.

After getting the numbers of true positives, false positives, true negatives, and false negatives, the sensitivity and specificity for the test can be calculated. If it turns out that the sensitivity is high then any person who has the disease is likely to be classified as positive by the test. On the other hand, if the specificity is high, any person who does not have the disease is likely to be classified as negative by the test. An NIH web site has a discussion of how these ratios are calculated.[3]
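For a concrete (entirely made-up) example, suppose a screening study yields the following counts; the two ratios then fall straight out of the definitions:

```python
# Hypothetical counts from an imaginary screening study.
tp, fp, tn, fn = 90, 30, 870, 10   # true positives, false positives, true negatives, false negatives

sensitivity = tp / (tp + fn)   # 90 / 100  = 0.900
specificity = tn / (tn + fp)   # 870 / 900 ≈ 0.967

print(f"sensitivity = {sensitivity:.3f}, specificity = {specificity:.3f}")
```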

Consider the example of a medical test for diagnosing a condition. Sensitivity (sometimes also named the detection rate in a clinical setting) refers to the test's ability to correctly detect ill patients out of those who do have the condition.[4] Mathematically, this can be expressed as:
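The expression referred to here (omitted from this excerpt) is the usual one:

\[
\text{sensitivity} = \frac{\text{number of true positives}}{\text{number of true positives} + \text{number of false negatives}}
\]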

A negative result in a test with high sensitivity can be useful for "ruling out" disease,[4] since it rarely misdiagnoses those who do have the disease. A test with 100% sensitivity will recognize all patients with the disease by testing positive. In this case, a negative test result would definitively rule out the presence of the disease in a patient. However, a positive result in a test with high sensitivity is not necessarily useful for "ruling in" disease. Suppose a 'bogus' test kit is designed to always give a positive reading. When used on diseased patients, all patients test positive, giving the test 100% sensitivity. However, sensitivity does not take into account false positives. The bogus test also returns positive on all healthy patients, giving it a false positive rate of 100%, rendering it useless for detecting or "ruling in" the disease.[citation needed]
