Vertex Mls-100

0 views

Skip to first unread message

Niklas Terki

unread,

Aug 5, 2024, 11:45:16 AM8/5/24

to riterbbemi

Inmany scenarios we want to generate a dense 3D geometry, i.e., a triangle mesh. However, from a multi-view stereo method, or a depth sensor we only obtain an unstructured point cloud. To get a triangle mesh from this unstructured input we need to perform surface reconstruction. In the literature there exists a couple of methods and Open3D currently implements the following:

The alpha shape [Edelsbrunner1983] is a generalization of a convex hull. As described here one can intuitively think of an alpha shape as the following: Imagine a huge mass of ice cream containing the points \(S\) as hard chocolate pieces. Using one of these sphere-formed ice cream spoons we carve out all parts of the ice cream block we can reach withoutbumping into chocolate pieces, thereby even carving out holes in the inside (e.g., parts not reachable by simply moving the spoon from the outside). We will eventually end up with a (not necessarily convex) object bounded by caps, arcs and points. If we now straighten all round faces to triangles and line segments, we have an intuitive description of what is called the alpha shape of \(S\).

The implementation is based on the convex hull of the point cloud. If we want to compute multiple alpha shapes from a given point cloud, then we can save some computation by only computing the convex hull once and pass it to create_from_point_cloud_alpha_shape.

The ball pivoting algorithm (BPA) [Bernardini1999] is a surface reconstruction method which is related to alpha shapes. Intuitively, think of a 3D ball with a given radius that we drop on the point cloud. If it hits any 3 points (and it does not fall through those 3 points) it creates a triangles. Then, the algorithm starts pivoting from the edges of the existing triangles and every time it hits 3 points where the ball does not fall through we createanother triangle.

Open3D implements this method in create_from_point_cloud_ball_pivoting. The method accepts a list of radii as parameter that corresponds to the radii of the individual balls that are pivoted on the point cloud.

The Poisson surface reconstruction method [Kazhdan2006] solves a regularized optimization problem to obtain a smooth surface. For this reason, Poisson surface reconstruction can be preferable to the methods mentioned above, as they produce non-smooth results since the points of the PointCloud are also the vertices of the resulting triangle mesh without any modifications.

Open3D implements the method create_from_point_cloud_poisson which is basically a wrapper of the code of Kazhdan. An important parameter of the function is depth that defines the depth of the octree used for the surface reconstruction and hence implies the resolution of the resulting triangle mesh. A higher depth value means a mesh with more details.

Poisson surface reconstruction will also create triangles in areas of low point density, and even extrapolates into some areas (see bottom of the eagle output above). The create_from_point_cloud_poisson function has a second densities return value that indicates for each vertex the density. A low density value means that the vertex is only supported by a low number of points from the input point cloud.

We can further use the density values to remove vertices and triangles that have a low support. In the code below we remove all vertices (and connected triangles) that have a lower density value than the \(0.01\) quantile of all density values.

In the examples above we assumed that the point cloud has normals that point outwards. However, not all point clouds already come with associated normals. Open3D can be used to estimate point cloud normals with estimate_normals, which locally fits a plane per 3D point to derive the normal. However, the estimated normals might not be consistently oriented. orient_normals_consistent_tangent_plane propagates the normal orientation using a minimum spanning tree.

Sentence embedding is an important research topic in natural language processing (NLP) since it can transfer knowledge to downstream tasks. Meanwhile, a contextualized word representation, called BERT, achieves the state-of-the-art performance in quite a few NLP tasks. Yet, it is an open problem to generate a high quality sentence representation from BERT-based word models. It was shown in previous study that different layers of BERT capture different linguistic properties. This allows us to fusion information across layers to find better sentence representation. In this work, we study the layer-wise pattern of the word representation of deep contextualized models. Then, we propose a new sentence embedding method by dissecting BERT-based word models through geometric analysis of the space spanned by the word representation. It is called the SBERT-WK method. No further training is required in SBERT-WK. We evaluate SBERT-WK on semantic textual similarity and downstream supervised tasks. Furthermore, ten sentence-level probing tasks are presented for detailed linguistic analysis. Experiments show that SBERT-WK achieves the state-of-the-art performance. Our codes are publicly available.

We describe a novel way of representing a symbolic knowledge base (KB) called a sparse-matrix reified KB. This representation enables neural modules that are fully differentiable, faithful to the original semantics of the KB, expressive enough to model multi-hop inferences, and scalable enough to use with realistically large KBs. The sparse-matrix reified KB can be distributed across multiple GPUs, can scale to tens of millions of entities and facts, and is orders of magnitude faster than naive sparse-matrix implementations. The reified KB enables very simple end-to-end architectures to obtain competitive performance on several benchmarks representing two families of tasks: KB completion, and learning semantic parsers from denotations.

Today's research progress in the field of multi-document summarization is obstructed by the small number of available datasets. Since the acquisition of reference summaries is costly, existing datasets contain only hundreds of samples at most, resulting in heavy reliance on hand-crafted features or necessitating additional, manually annotated data. The lack of large corpora therefore hinders the development of sophisticated models. Additionally, most publicly available multi-document summarization corpora are in the news domain, and no analogous dataset exists in the video game domain. In this paper, we propose GameWikiSum, a new domain-specific dataset for multi-document summarization, which is one hundred times larger than commonly used datasets, and in another domain than news. Input documents consist of long professional video game reviews as well as references of their gameplay sections in Wikipedia pages. We analyze the proposed dataset and show that both abstractive and extractive models can be trained on it. We release GameWikiSum for further research:

Today, recommender systems are an inevitable part of everyone's daily digital routine and are present on most internet platforms. State-of-the-art deep learning-based models require a large number of data to achieve their best performance. Many datasets fulfilling this criterion have been proposed for multiple domains, such as Amazon products, restaurants, or beers. However, works and datasets in the hotel domain are limited: the largest hotel review dataset is below the million samples. Additionally, the hotel domain suffers from a higher data sparsity than traditional recommendation datasets and therefore, traditional collaborative-filtering approaches cannot be applied to such data. In this paper, we propose HotelRec, a very large-scale hotel recommendation dataset, based on TripAdvisor, containing 50 million reviews. To the best of our knowledge, HotelRec is the largest publicly available dataset in the hotel domain (50M versus 0.9M) and additionally, the largest recommendation dataset in a single domain and with textual reviews (50M versus 22M). We release HotelRec for further research:

Fine-tuning pretrained contextual word embedding models to supervised downstream tasks has become commonplace in natural language processing. This process, however, is often brittle: even with the same hyperparameter values, distinct random seeds can lead to substantially different results. To better understand this phenomenon, we experiment with four datasets from the GLUE benchmark, fine-tuning BERT hundreds of times on each while varying only the random seeds. We find substantial performance increases compared to previously reported results, and we quantify how the performance of the best-found model varies as a function of the number of fine-tuning trials. Further, we examine two factors influenced by the choice of random seed: weight initialization and training data order. We find that both contribute comparably to the variance of out-of-sample performance, and that some weight initializations perform well across all tasks explored. On small datasets, we observe that many fine-tuning trials diverge part of the way through training, and we offer best practices for practitioners to stop training less promising runs early. We publicly release all of our experimental data, including training and validation scores for 2,100 trials, to encourage further analysis of training dynamics during fine-tuning.

We present MT-DNN, an open-source natural language understanding (NLU) toolkit that makes it easy for researchers and developers to train customized deep learning models. Built upon PyTorch and Transformers, MT-DNN is designed to facilitate rapid customization for a broad spectrum of NLU tasks, using a variety of objectives (classification, regression, structured prediction) and text encoders (e.g., RNNs, BERT, RoBERTa, UniLM). A unique feature of MT-DNN is its built-in support for robust and transferable learning using the adversarial multi-task learning paradigm. To enable efficient production deployment, MT-DNN supports multi-task knowledge distillation, which can substantially compress a deep neural model without significant performance drop. We demonstrate the effectiveness of MT-DNN on a wide range of NLU applications across general and biomedical domains. The software and pre-trained models will be publicly available at -dnn.