Description
WHALE is a new resource of knowledge graph embeddings generated from the Web Data Commons dataset, the largest collection of structured data from the web. The dataset encompasses 97,689,391,384 RDF triples extracted from 21,968,201 domains, posing a significant scalability challenge for embedding algorithms. The embeddings are generated using the state-of-the-art knowledge graph embedding model DeCal. The resulting embeddings, dubbed Whale-embeddings, are publicly available at https://embeddings.cc/ and constitute the largest knowledge graph embedding resource released to date.