A couple issues with ML Kit on iOS


Derek Gottfrid

Jun 10, 2018, 7:08:57 PM
to Firebase Google Group

I have been working with the ML Kit beta on iOS using Objective-C. I have noticed a couple of things and wanted to hear what others have seen or thought.

1. Image orientation - It doesn't really work. You need to pre-rotate your image to LeftTop to make it work, even though some of the documentation and examples seem to suggest otherwise. This doesn't really affect Barcode, but it does affect Face and Text. I am not sure about Labeling, but I think it mostly works regardless of orientation. I am also seeing different results depending on whether I use UIImage or CMSampleBufferRef: UIImage works if you set the orientation, but CMSampleBufferRef fails regardless of your setting, particularly for Text. The documentation on orientation is also confusing - when creating a FIRVisionImage from a UIImage, it's unclear whether the orientation is pulled off the object or you need to set it yourself. In general, orientation has caused all kinds of confusion through what I believe are bugs, unclear documentation, and a lack of full examples in both Objective-C and Swift.
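To illustrate the pre-rotation workaround, here is a minimal Swift sketch of the normalization step. The enum mirrors UIImage.Orientation so the snippet stands alone without UIKit, and the rotation values are my reading of the orientation semantics, not anything from the ML Kit API:

```swift
// Sketch of the pre-rotation step: given an image's orientation flag,
// compute the clockwise rotation (and mirror) needed to make the pixel
// data upright before handing it to a detector. The enum mirrors
// UIImage.Orientation; a real app would use UIKit's type directly.
enum ImageOrientation {
    case up, down, left, right
    case upMirrored, downMirrored, leftMirrored, rightMirrored
}

// Degrees of clockwise rotation to apply to the raw pixels, plus whether
// a horizontal flip is needed for the mirrored cases.
func uprightTransform(for orientation: ImageOrientation) -> (rotation: Int, mirrored: Bool) {
    switch orientation {
    case .up:            return (0, false)
    case .down:          return (180, false)
    case .right:         return (90, false)
    case .left:          return (270, false)
    case .upMirrored:    return (0, true)
    case .downMirrored:  return (180, true)
    case .rightMirrored: return (90, true)
    case .leftMirrored:  return (270, true)
    }
}
```

The actual rotation would be done with Core Graphics or a CIImage transform; the point is just that every input ends up effectively "up" before detection.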

2. Completion callbacks - This is an interesting design choice, and it makes it slightly hard to chain detectors together. Mobile Vision had this capability, as do Apple's native detectors. I'm not sure how I would address this, but given the beta state, I would just work around it for now.
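The kind of chaining I mean can be worked around with a small helper like the sketch below. `Image` and `Detector` are stand-ins, not ML Kit types; the real calls would be the detectors' own completion-based detect methods:

```swift
// Sketch of chaining two completion-callback detectors (e.g. find faces,
// then run text recognition on the same frame). All types here are
// stand-ins so the sketch is self-contained.
struct Image {}

typealias Detector<Result> = (Image, (Result?, Error?) -> Void) -> Void

// Run `first`; only on success run `second` on the same image, then
// deliver both results through a single combined callback.
func chain<A, B>(_ first: @escaping Detector<A>,
                 _ second: @escaping Detector<B>,
                 image: Image,
                 completion: @escaping ((A, B)?, Error?) -> Void) {
    first(image) { a, error in
        guard let a = a else { completion(nil, error); return }
        second(image) { b, error in
            guard let b = b else { completion(nil, error); return }
            completion((a, b), nil)
        }
    }
}
```

With real detectors the nesting gets deep quickly, which is exactly why first-class support for chaining would be nicer than this workaround.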

3. Memory issues - I am perhaps doing something wrong here, but I am noticing some wild swings in memory usage when trying to do Text in real time, ultimately leading to an OOM situation. There isn't much guidance on whether or not to hold on to detector instances, which would be super helpful. I also noted early on that if I didn't retain references to certain things, particularly with the completion callback, I was seeing some nils. That one was easily fixed, but more clarity on the callback function would be helpful.
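What I ended up doing - and this is an assumption about the cause, not documented guidance - is holding one long-lived detector for the session and dropping frames while a request is in flight. `FakeDetector` below is a stand-in, not a FIRVision type:

```swift
// Sketch of the workaround: one long-lived detector instead of a new one
// per frame, plus simple frame-dropping so requests don't pile up.
final class FakeDetector {
    func detect(_ frame: String, completion: @escaping (String) -> Void) {
        completion("result for \(frame)")
    }
}

final class FrameProcessor {
    // Reused across frames; creating a detector per frame is one likely
    // source of memory churn (my assumption, not a confirmed cause).
    private let detector = FakeDetector()
    private var busy = false   // back-pressure flag

    func process(_ frame: String, completion: @escaping (String) -> Void) {
        guard !busy else { return }          // drop this frame rather than queue it
        busy = true
        detector.detect(frame) { [weak self] result in
            self?.busy = false
            completion(result)
        }
    }
}
```

Dropping frames rather than queueing them keeps at most one request's worth of buffers alive at a time, which seems to keep the swings in check for me.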


Overall, I am excited to see this mature and the docs, examples, and perf/stability improve. Is there a central location for release notes, upcoming releases, and known issues? Right now, lots of information is tucked into corners of the internet - StackExchange comments, Reddit threads, this group, etc.

Thanks,

Shai Ben-Tovim

Jun 12, 2018, 9:57:37 AM
to Firebase Google Group
Hi Derek,

I reported the memory issues to the Firebase team - in real-time face detection on video output it was severe - and they've confirmed that they are working to improve internal memory management to avoid this problem.

Shai

Ibrahim Ulukaya

Jun 12, 2018, 10:34:34 AM
to Firebase Google Group
Hi Derek,

Thanks for letting us know about the issues you are facing.

1) We have a Swift sample (recently updated) supporting Face and Text, as well as Label and Barcode. For Face and Text, the sample apps support both static images (UIImage) and a live video feed (CMSampleBufferRef). The live video feed also works with the front camera, in all orientations. The developer does need to properly set the image orientation (FIRVisionImageMetadata) on the FIRVisionImage. Here is our implementation of that logic in the sample app. We'll shortly add the ObjC sample as well.
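For reference, the core of that logic looks roughly like the following sketch. The enums mirror UIDeviceOrientation, AVCaptureDevice.Position, and the FIRVisionDetectorImageOrientation cases so the snippet is self-contained; the exact SDK case names may differ, so treat this as an approximation of the sample and check the sample app for the shipped code:

```swift
// Sketch: pick the FIRVisionImageMetadata orientation from the current
// device orientation and camera position. All enums are stand-ins that
// mirror the UIKit/AVFoundation/ML Kit types.
enum DeviceOrientation { case portrait, landscapeLeft, portraitUpsideDown, landscapeRight, unknown }
enum CameraPosition { case front, back }
enum DetectorOrientation {
    case topLeft, topRight, bottomRight, bottomLeft
    case leftTop, rightTop, rightBottom, leftBottom
}

func detectorOrientation(device: DeviceOrientation,
                         camera: CameraPosition) -> DetectorOrientation {
    switch device {
    case .portrait:           return camera == .front ? .leftTop : .rightTop
    case .landscapeLeft:      return camera == .front ? .bottomLeft : .topLeft
    case .portraitUpsideDown: return camera == .front ? .rightBottom : .leftBottom
    case .landscapeRight:     return camera == .front ? .topRight : .bottomRight
    case .unknown:            return .leftTop   // arbitrary fallback in this sketch
    }
}
```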

2) We are interested to learn more details about your use cases of chaining detectors together.  Can you provide some examples and explanations of what you want to achieve?

3) This is a known issue, which is already fixed and will be shipped in the next release.

This is a good suggestion. You can use our StackOverflow tag to follow known issues and questions. We will post an update if we provide this in another location as well.

- Ibrahim  