sindicetech-freebase

Sindice Freebase Distribution

Note: these are technical notes, intended only to get the distribution started.

For everything else (introduction, documentation, FAQ), please refer to http://sindicetech.com/freebase

The distribution can be run on Google Cloud or Amazon Web Services. If you want to use Amazon Web Services, please refer here. If you want to use Google Cloud, follow the instructions below.


To get started, you must first have a Google Cloud account and be familiar with the platform. If you are not familiar with Google Cloud, you can find out more about it here.


Once you have a Google Cloud account set up:


1. Join this Google group to be able to access the image on Google Cloud.


2. Install the Google Cloud SDK:


curl https://dl.google.com/dl/cloudsdk/release/install_google_cloud_sdk.bash | bash


3. Create your project from the Google Cloud web interface. You will have to register with Google Cloud and activate billing:


https://cloud.google.com/console/project?getstarted=https://cloud.google.com


4. Authenticate


Enter the following command in your shell


gcloud auth login --no-launch-browser


Open the link that is printed, choose Accept, and enter your project ID when prompted.


5. Copy the script below to your machine and modify it to set the necessary parameters (e.g. PROJECTID). Note: to execute it successfully, you must be a member of this Google group. Finally, execute the script.


-------Script below -----------------------

#!/bin/bash
# debug
set -x

# Don't change. The new instance will look for this disk name.
DISKNAME=freebase-data2

# A newer image may be available; check the Google group.
IMAGE_ID=sindice-freebase-061014

# Must be changed.
PROJECTID=<your-project-id>

# Change to the data center you want to use.
ZONE=us-central1-a

# Change to the name you want to use.
INSTANCE_NAME=sindice-freebase

# This is the smallest instance to use. Other possible values are:
# n1-standard-2, n1-standard-4, n1-standard-8.
# More information about sizes can be found here.
# If you use an instance type other than those mentioned above, you need to
# adjust the Virtuoso configuration file
# "/home/administrator/virtuosodb/virtuoso.ini" according to this documentation.
# Important note from the DB vendor: for best performance, the recommended RAM
# allocated to the Virtuoso DB is 10 GB per billion triples. Given the size of
# Freebase, a setting of 20+ GB of RAM is recommended.
# Smaller settings will likely cause timeouts, especially on a cold start.
# The ini file mentioned above can also be used to change the timeout values
# and allow longer queries to complete.
TYPE=n1-standard-1

function createdisk(){
  # Creates a 200 GB disk.
  gcutil adddisk --size_gb=200 --project=$PROJECTID --zone=$ZONE $DISKNAME;
}

function createbootdisk(){
  # Creates the boot disk.
  gcutil adddisk --project=$PROJECTID --size_gb=99 --zone=$ZONE \
    --source_image=projects/t-lexicon-434/global/images/$IMAGE_ID $INSTANCE_NAME;
}

function createinstance(){
  # Creates the instance.
  gcutil --service_version="v1" --project=$PROJECTID addinstance $INSTANCE_NAME \
    --machine_type=$TYPE --network="default" --tags $INSTANCE_NAME \
    --external_ip_address="ephemeral" --zone=$ZONE \
    --disk="$INSTANCE_NAME,deviceName=$INSTANCE_NAME,mode=READ_WRITE,boot" \
    --disk="$DISKNAME,deviceName=freebase-data2,mode=READ_WRITE"
}

function createfirewall(){
  # Creates the firewall rules.
  gcutil addfirewall for-sindice \
    --description="allow web access, access to virtuoso, rsync access ." \
    --allowed="tcp:80, tcp:8890, tcp:873" --target_tags $INSTANCE_NAME --project=$PROJECTID
}

function checkifready(){
  # Polls the boot disk every 10 seconds until its status is READY.
  STATUS=$(gcutil --project=$PROJECTID getdisk $INSTANCE_NAME | grep " status " | cut -d "|" -f 3 | tr -d ' ');
  echo $STATUS
  while [ "$STATUS" != "READY" ]; do
    sleep 10;
    STATUS=$(gcutil --project=$PROJECTID getdisk $INSTANCE_NAME | grep " status " | cut -d "|" -f 3 | tr -d ' ');
    echo "waiting for Image to be ready"
  done;
}

# Main part
createdisk;
createbootdisk;
checkifready;
createinstance;
createfirewall;
exit 0;

#----End of Script----------
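The checkifready function in the script polls the disk status with no upper bound, so a disk that never becomes READY would keep the script waiting forever. A minimal sketch of the same polling idea with a bounded number of attempts follows. The function names parse_status and wait_until_ready are hypothetical, not part of the distribution, and the sketch assumes the status command prints a pipe-delimited table with a " status " row, which is what the grep/cut/tr pipeline in the script implies.

```shell
#!/bin/bash
# Extract the status value from pipe-delimited table text on stdin,
# using the same grep/cut/tr pipeline as checkifready.
parse_status() {
  grep " status " | cut -d "|" -f 3 | tr -d ' '
}

# Poll a status command until it reports READY or the attempts run out.
# $1: maximum number of attempts
# $2: seconds to sleep between attempts
# remaining arguments: the command whose output is parsed
# Returns 0 once READY is seen, 1 if the attempts are exhausted.
wait_until_ready() {
  local max_attempts
  local delay
  local attempt
  local status
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    status=$("$@" | parse_status)
    if [ "$status" = "READY" ]; then
      return 0
    fi
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  return 1
}
```

In the script, this could be called as `wait_until_ready 60 10 gcutil --project=$PROJECTID getdisk $INSTANCE_NAME || exit 1`, which gives up after roughly ten minutes instead of waiting indefinitely.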


6. You can access the application by going to the IP shown after executing the script. A guide to using the applications is available on the home page.


7. It will take more than an hour to decompress the Freebase data and make it available through the Virtuoso endpoint. When the status box says 'Done', you are ready to go!


8. To remove everything created by the previous script, run the next script. (Again, fill in your project ID.)

-------Script below -----------------------

#!/bin/bash

DISKNAME=freebase-data2

# Must be changed.
PROJECTID=<your-project-id>

# Must match the ZONE used in the creation script.
ZONE=europe-west1-b

INSTANCE_NAME=sindice-freebase

# Delete the firewall rules.
gcutil deletefirewall for-sindice --project=$PROJECTID --force

# Delete the instance and its boot persistent disk.
gcutil deleteinstance $INSTANCE_NAME --project=$PROJECTID --force --delete_boot_pd --zone=$ZONE

# Delete the persistent disk containing the data.
gcutil deletedisk $DISKNAME --project=$PROJECTID --zone=$ZONE --force

#----End of Script----------
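Because the cleanup commands run with --force and delete the data disk, it can help to print them for review before running anything. The sketch below only echoes the three commands with the given values filled in. The function name print_teardown_commands is hypothetical, and passing --project to every command is an assumption made here for consistency; the gcutil subcommands and flags themselves are the ones used in the cleanup script.

```shell
#!/bin/bash
# Print the teardown commands instead of running them (dry run).
# $1: project id, $2: zone, $3: instance name, $4: data disk name
print_teardown_commands() {
  local project
  local zone
  local instance
  local disk
  project=$1
  zone=$2
  instance=$3
  disk=$4
  echo "gcutil deletefirewall for-sindice --project=$project --force"
  echo "gcutil deleteinstance $instance --project=$project --zone=$zone --force --delete_boot_pd"
  echo "gcutil deletedisk $disk --project=$project --zone=$zone --force"
}
```

For example, `print_teardown_commands my-project us-central1-a sindice-freebase freebase-data2` lists what would be deleted, which makes it easy to check that the zone matches the one used when the instance was created. Here my-project is a placeholder for your own project ID.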




