Hi Emmanuel, thank you for these important questions. One caveat up front: I developed this software and have a decent understanding of LPA, but I'm not an expert.
1. You are using the estimate_profiles() function, not estimate_profiles_mplus(), correct? Regarding your question (and Timur's): if you're using estimate_profiles(), then you're really using the mclust package under the hood. My understanding is that it works in two steps. First, the data are clustered with a hierarchical cluster analysis. Then, that hierarchical solution provides the starting values for a maximum likelihood (expectation-maximization, or EM) algorithm. The first, hierarchical step is deterministic: it will be the same every time you run it. The second, EM step isn't, though in practice, because the hierarchical clustering provides the starting points, you will often get the same results from estimate_profiles(). But not always. So what you're seeing isn't out of the ordinary, and it's probably more likely with larger datasets and more variables: there are simply more solutions available for the algorithm to find, and some of them may be local solutions.
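If it helps to see the two steps explicitly, here's a minimal sketch using mclust directly. This is my understanding of roughly what estimate_profiles() does under the hood; the exact model specification and defaults tidyLPA passes to mclust may differ, and the iris data here are just a stand-in for your own.

```r
library(mclust)

X <- iris[, 1:4]  # any numeric data frame

# Step 1 (deterministic): model-based agglomerative hierarchical
# clustering, which supplies the starting partition for EM.
hc_init <- hc(X)

# Step 2: EM for a Gaussian mixture, initialized from step 1.
fit <- Mclust(X, G = 3, initialization = list(hcPairs = hc_init))

fit$loglik  # the maximized log-likelihood of this solution
```

Because step 1 is deterministic, re-running this exact code gives the same starting values every time; variation across runs comes in when the EM step is started from different places.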
I don't have an easy answer for how to address this. One approach is to choose the solution with the highest log-likelihood; that may be the simplest and best way to go. Run the function a number of times (I don't have a firm rule for how many; maybe 10) and record the log-likelihood each time. Especially if the highest log-likelihood is replicated across runs, there's a good chance it's a good solution. Another idea: do you have access to Mplus? It uses a different procedure (no hierarchical clustering) that I'd be happy to describe and talk/think through with you. Another option is the Rmixmod package: https://cran.r-project.org/web/packages/Rmixmod/index.html. I've wanted to build an interface to that package into tidyLPA but haven't yet.
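If you want to try the run-it-several-times idea, here's a hedged sketch, again going through mclust directly so the initialization can be varied. It uses Mclust's `subset` initialization option, which draws the hierarchical starting values from a random subsample, giving genuinely different starting points on each run; the number of runs, the subsample fraction, and the iris data are all arbitrary choices for illustration.

```r
library(mclust)

set.seed(123)
X <- iris[, 1:4]

# Refit several times, each from a different random initialization,
# and record the log-likelihood of each solution.
logliks <- sapply(1:10, function(i) {
  s <- sample(nrow(X), size = round(0.8 * nrow(X)))
  fit <- Mclust(X, G = 3, initialization = list(subset = s))
  fit$loglik
})

round(logliks, 2)
# Prefer the solution with the highest log-likelihood, especially
# if that value is replicated across several of the runs.
```

If one log-likelihood value keeps reappearing and is also the highest, that replication is reassuring evidence you've found the global rather than a local solution.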
2. My understanding is that this means none of the models of the other model types (3 and 6) converged. That's not necessarily good or bad; it happens a lot in my experience. Choose from among the (simpler) models that did converge.
3. Yes, I think you can consider those irrelevant. The key distinction is that the model did converge: the profile/class with no assignments is still identified/estimated, it just isn't the highest-probability class for any of the responses. So, as you said, that solution is probably not a good one and can be set aside.
Josh