The three steps can be accomplished by running the scripts in the starchat-docker/scripts directory. The scripts assume you are using the default port 8888 and an index based on the English language. If you need to change these parameters, edit the PORT and INDEX_NAME variables found in the scripts.
Now you have to load the configuration file for the actual chat, aka the decision table. We have provided the example configuration file starchat-docker/scripts/decision_table_starchat_doc.csv, which you can load by running the corresponding script.
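If you prefer not to use the helper script, a minimal Python sketch of the upload might look like the following. The route and index name below are assumptions based on the defaults mentioned above (port 8888, English index); check your StarChat installation for the exact endpoint.

```python
# Hypothetical sketch: upload a decision-table CSV to a local StarChat instance.
# The route and the index name (index_getjenny_english_0) are assumptions based
# on the default English index; verify them against your StarChat API docs.
import requests

STARCHAT_URL = "http://localhost:8888"           # default port used by the scripts
INDEX_NAME = "index_getjenny_english_0"          # assumed default English index

with open("starchat-docker/scripts/decision_table_starchat_doc.csv", "rb") as f:
    response = requests.post(
        f"{STARCHAT_URL}/{INDEX_NAME}/decisiontable_upload_csv",  # assumed route
        files={"csv": f},
    )

response.raise_for_status()
print(response.json())
```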
Now StarChat is running and you can configure and test the installation as explained in Installation. If you get org.elasticsearch.bootstrap.StartupException: ElasticsearchException[failed to bind service]; nested: AccessDeniedException[/usr/share/elasticsearch/data/nodes, make sure that docker-starchat/elasticsearch is accessible to the docker service.
Have a look at the file docker-starchat/docker-compose.yml. For Manaus to have good performance, you need to provide decent language statistics. Update the file /manaus/statistics_data/english/word_frequency.tsv with a word-frequency file with the following format:
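As an illustration of how such a file can be produced, here is a minimal sketch that builds a tab-separated word-frequency file from a plain-text corpus. The two-column word/count layout and the corpus filename are assumptions for illustration; match the exact format documented for Manaus before using the output.

```python
# Illustrative sketch: build a word-frequency TSV from a plain-text corpus.
# The two-column word<TAB>count layout is an assumption for illustration;
# "corpus.txt" is a hypothetical input file.
import re
from collections import Counter

counts = Counter()
with open("corpus.txt", encoding="utf-8") as corpus:
    for line in corpus:
        counts.update(re.findall(r"[a-z']+", line.lower()))

# Write words in descending order of frequency, one per line.
with open("word_frequency.tsv", "w", encoding="utf-8") as out:
    for word, freq in counts.most_common():
        out.write(f"{word}\t{freq}\n")
```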
This example demonstrates how to deploy an open-source LLM from Amazon S3 to Amazon SageMaker using the new Hugging Face LLM Inference Container. We are going to deploy HuggingFaceH4/starchat-beta.
The model.tar.gz archive includes all the model artifacts needed to run inference. We will use the huggingface_hub SDK to easily download HuggingFaceH4/starchat-beta from Hugging Face and then upload it to Amazon S3 with the sagemaker SDK.
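A sketch of the download-and-upload step might look like this; the S3 prefix is a placeholder you would adapt to your account:

```python
# Sketch: download the model from the Hugging Face Hub, archive it as
# model.tar.gz, and upload it to S3. The S3 prefix is a placeholder.
import tarfile

import sagemaker
from huggingface_hub import snapshot_download
from sagemaker.s3 import S3Uploader

HF_MODEL_ID = "HuggingFaceH4/starchat-beta"

# Download all model artifacts (weights, tokenizer, config) to a local folder.
local_dir = snapshot_download(repo_id=HF_MODEL_ID)

# Pack the downloaded artifacts into a model.tar.gz archive.
with tarfile.open("model.tar.gz", "w:gz") as tar:
    tar.add(local_dir, arcname=".")

# Upload the archive to the SageMaker default bucket (placeholder prefix).
sess = sagemaker.Session()
s3_model_uri = S3Uploader.upload(
    local_path="model.tar.gz",
    desired_s3_uri=f"s3://{sess.default_bucket()}/starchat-beta",
)
print(f"model uploaded to: {s3_model_uri}")
```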
To deploy HuggingFaceH4/starchat-beta to Amazon SageMaker, we create a HuggingFaceModel and define our endpoint configuration, including the hf_model_id, instance_type, etc. We will use a g5.12xlarge instance type, which has 4 NVIDIA A10G GPUs and 96GB of GPU memory.
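A minimal deployment sketch under those assumptions follows; the container version, token limits, and startup timeout are illustrative values, and the execution role lookup assumes you are running inside a SageMaker environment:

```python
# Sketch: deploy the model artifacts from S3 with the Hugging Face LLM
# inference container. Version numbers and limits below are illustrative.
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # assumes a SageMaker execution context

# Retrieve the Hugging Face LLM inference container image (version assumed).
llm_image = get_huggingface_llm_image_uri("huggingface", version="0.8.2")

# Endpoint configuration: with model_data on S3, the container loads the
# artifacts from /opt/ml/model, so HF_MODEL_ID points there.
config = {
    "HF_MODEL_ID": "/opt/ml/model",
    "SM_NUM_GPUS": "4",          # g5.12xlarge has 4 NVIDIA A10G GPUs
    "MAX_INPUT_LENGTH": "1024",  # illustrative limit
    "MAX_TOTAL_TOKENS": "2048",  # illustrative limit
}

llm_model = HuggingFaceModel(
    model_data=s3_model_uri,     # S3 URI from the upload step above
    image_uri=llm_image,
    env=config,
    role=role,
)

llm = llm_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.12xlarge",
    container_startup_health_check_timeout=600,  # large models start slowly
)
```

Once the endpoint is up, you can smoke-test it with something like `llm.predict({"inputs": "What is StarChat?"})` before wiring it into an application.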