Self-host
To host Elasticsearch and Milvus for production use, we need a server (for example, an AWS instance) that can provide reliable and scalable service. This section lists the instructions for installing Elasticsearch and Milvus on such a server.
Install Keyword Search
We use Elasticsearch under the hood as the keyword search implementation because of its high performance and robustness. We follow the Elasticsearch install guide to install Elasticsearch version 8.13 on an AWS EC2 instance (for example, a `t2.medium` with 2 vCPUs and 4 GiB of memory). Refer to the official Elasticsearch documentation for more detail. To keep this document self-contained, we list the required installation commands here.
Download and install archive for Linux
First, we download and install the Linux archive for Elasticsearch v8.13.2 as follows:
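The download-and-unpack step can also be scripted. The following is a minimal Python sketch, assuming an x86_64 host (as on a `t2.medium`) and the standard archive naming on the Elastic artifacts site. The helper names `archive_name`, `archive_url`, and `download_and_extract` are hypothetical and are not part of Elasticsearch or denser-retriever.

```python
import tarfile
import urllib.request
from pathlib import Path

ES_VERSION = "8.13.2"  # version named in this guide


def archive_name(version: str = ES_VERSION) -> str:
    """File name of the Linux x86_64 archive (assumes an x86_64 host)."""
    return f"elasticsearch-{version}-linux-x86_64.tar.gz"


def archive_url(version: str = ES_VERSION) -> str:
    """Download URL of the archive on the Elastic artifacts site."""
    return (
        "https://artifacts.elastic.co/downloads/elasticsearch/"
        + archive_name(version)
    )


def download_and_extract(version: str = ES_VERSION, dest: str = ".") -> Path:
    """Download the archive into `dest`, unpack it, and return the install dir."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive_path = dest_dir / archive_name(version)
    # Fetch the .tar.gz archive (several hundred MB, so this can take a while).
    urllib.request.urlretrieve(archive_url(version), archive_path)
    # Unpack next to the archive; the tarball contains one top-level directory.
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest_dir)
    return dest_dir / f"elasticsearch-{version}"
```

Calling `download_and_extract()` from the directory where Elasticsearch should live leaves an `elasticsearch-8.13.2` directory there. This sketch skips the checksum verification that the official guide recommends.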
Run Elasticsearch from the command line
We run the following command to start Elasticsearch:
When Elasticsearch starts for the first time, security features are enabled and configured by default. Specifically:
- Authentication and authorization are enabled, and a password is generated for the `elastic` built-in superuser.
- Certificates and keys for TLS are generated for the transport and HTTP layers, and TLS is enabled and configured with these keys and certificates.
The password for the `elastic` user is printed to your terminal. Make a note of this password (referred to as `es_passwd`), which is needed to connect to the Elasticsearch server.
The default settings require both the `elastic` password and TLS to access the Elasticsearch service. To keep things simple, you can disable TLS by changing `enabled: true` to `enabled: false` in `config/elasticsearch.yml`. After the change, the ssl config block looks like the following:
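The same edit can be applied programmatically. The sketch below assumes that the block in question is the HTTP-layer one, `xpack.security.http.ssl`, and leaves the transport-layer block untouched. The function name `disable_http_tls` and the `SAMPLE` text are illustrative only. The sample mirrors the shape of the auto-generated config and is not a copy of any particular file.

```python
def disable_http_tls(config_text: str,
                     block: str = "xpack.security.http.ssl:") -> str:
    """Set `enabled: false` inside the given top-level block of elasticsearch.yml."""
    out = []
    in_block = False
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped == block:
            in_block = True
        elif stripped and not line[0].isspace():
            # Any other non-indented, non-empty line ends the block.
            in_block = False
        elif in_block and stripped == "enabled: true":
            indent = line[: len(line) - len(line.lstrip())]
            line = f"{indent}enabled: false"
        out.append(line)
    # Preserve the trailing newline if the input had one.
    return "\n".join(out) + ("\n" if config_text.endswith("\n") else "")


# Illustrative input (assumed shape of the relevant part of the config).
SAMPLE = """\
xpack.security.http.ssl:
  enabled: true
  keystore.path: certs/http.p12

xpack.security.transport.ssl:
  enabled: true
  verification_mode: certificate
"""

print(disable_http_tls(SAMPLE))
```

Elasticsearch reads `config/elasticsearch.yml` at startup, so restart the node after changing the file.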
Check that Elasticsearch is running
You can test that your Elasticsearch node is running by sending an HTTPS request to port 9200 on `localhost` (use plain HTTP instead if you disabled TLS as described above):
The call returns a response like this:
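For a scripted check, the following Python sketch sends the same request using only the standard library. It assumes TLS has been disabled as described earlier, so the default scheme is plain HTTP. With TLS left on, the client would also need to trust the generated CA certificate, which this sketch does not handle. The names `build_request` and `check_elasticsearch` are hypothetical.

```python
import base64
import json
import urllib.request


def build_request(password: str, host: str = "localhost", port: int = 9200,
                  use_tls: bool = False,
                  user: str = "elastic") -> urllib.request.Request:
    """Build a GET request for the node's root endpoint with HTTP basic auth."""
    scheme = "https" if use_tls else "http"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{scheme}://{host}:{port}",
        headers={"Authorization": f"Basic {token}"},
    )


def check_elasticsearch(password: str, **kwargs) -> dict:
    """Return the node's JSON info document; raises on connection or auth errors."""
    request = build_request(password, **kwargs)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

A call such as `check_elasticsearch("es_passwd")`, with the real password substituted, returns a dictionary describing the node, including fields such as `name`, `cluster_name`, `version`, and `tagline`.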
Run as a daemon
To run Elasticsearch as a daemon, specify `-d` on the command line, and record the process ID in a file using the `-p` option:
To shut down Elasticsearch, kill the process ID recorded in the pid file:
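The start and stop steps can be wrapped in a small Python helper. This is a sketch under the assumption that it is run against the extracted Elasticsearch directory and that the pid file is simply named `pid`. The function names are hypothetical.

```python
import os
import signal
import subprocess
from pathlib import Path


def daemon_command(pid_file: str = "pid") -> list[str]:
    """Command line that starts Elasticsearch as a daemon and records its PID."""
    return ["./bin/elasticsearch", "-d", "-p", pid_file]


def start_daemon(install_dir: str, pid_file: str = "pid") -> None:
    """Start the daemon from the Elasticsearch install directory."""
    subprocess.run(daemon_command(pid_file), cwd=install_dir, check=True)


def read_pid(pid_file: str) -> int:
    """Read the process ID that Elasticsearch wrote to the pid file."""
    return int(Path(pid_file).read_text().strip())


def stop_daemon(pid_file: str) -> None:
    """Ask the recorded process to shut down by sending SIGTERM."""
    os.kill(read_pid(pid_file), signal.SIGTERM)
```

Because `start_daemon` runs with the install directory as its working directory, a relative pid file name such as `pid` ends up inside that directory. Pass the full path to `stop_daemon` when calling it from elsewhere.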
Install Vector Database
We use Milvus under the hood as the vector database implementation because of its high performance and robustness. We follow the official Milvus installation instructions to install the Milvus Standalone service on a `t2.medium` instance (the same one on which we installed Elasticsearch).
To keep this document self-contained, we list the required installation commands here.
Configure Milvus
We include a `milvus` directory in the `denser-retriever` repo to support Milvus installation. Open the `milvus/standalone/docker-compose.yml` file and find the following block:
We need two modifications:
- Change the path `/home/ubuntu/denser-retriever/docker/milvus/standalone/milvus.yaml` to point to your actual `milvus.yaml` file.
- Change the path `/home/ubuntu/milvus_data_retriever` to point to the location where you wish to store the embeddings. This location should have enough storage space to hold embeddings for a large collection of datasets.
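The two path changes above can be scripted as a plain text substitution. The sketch below assumes the paths appear in `docker-compose.yml` as the host side of `host:container` volume mappings. The container-side paths in `SAMPLE` are assumptions made for illustration, and `rewrite_compose_paths` is a hypothetical helper.

```python
DEFAULT_CONFIG_PATH = (
    "/home/ubuntu/denser-retriever/docker/milvus/standalone/milvus.yaml"
)
DEFAULT_DATA_PATH = "/home/ubuntu/milvus_data_retriever"


def rewrite_compose_paths(compose_text: str, config_path: str,
                          data_path: str) -> str:
    """Replace the two default host paths with site-specific ones."""
    updated = compose_text.replace(DEFAULT_CONFIG_PATH, config_path)
    return updated.replace(DEFAULT_DATA_PATH, data_path)


# Illustrative volume entries; the container-side paths are assumed.
SAMPLE = """\
    volumes:
      - /home/ubuntu/denser-retriever/docker/milvus/standalone/milvus.yaml:/milvus/configs/milvus.yaml
      - /home/ubuntu/milvus_data_retriever:/var/lib/milvus
"""

print(rewrite_compose_paths(SAMPLE, "/opt/dr/milvus.yaml", "/data/milvus"))
```

To apply this to the real file, read `milvus/standalone/docker-compose.yml`, pass its text through the function, and write the result back.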
Start Milvus
Following the official Milvus instructions, we run the following command to start Milvus:
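Starting (and later stopping) the service can be driven from Python as well. This sketch assumes the Docker Compose v2 plugin (`docker compose`) and that Docker needs `sudo` on the host. Adjust both if your setup uses the standalone `docker-compose` binary or a rootless install. `compose_command` and `run_compose` are hypothetical helpers.

```python
import subprocess


def compose_command(action: str, use_sudo: bool = True) -> list[str]:
    """Build the Docker Compose command line for starting or stopping Milvus."""
    actions = {"start": ["up", "-d"], "stop": ["down"]}
    if action not in actions:
        raise ValueError(f"unknown action {action!r}; use 'start' or 'stop'")
    prefix = ["sudo"] if use_sudo else []
    return prefix + ["docker", "compose"] + actions[action]


def run_compose(action: str, compose_dir: str = "milvus/standalone",
                use_sudo: bool = True) -> None:
    """Run the command in the directory that holds docker-compose.yml."""
    subprocess.run(compose_command(action, use_sudo), cwd=compose_dir, check=True)
```

`run_compose("start")` brings the containers up in the background, and `run_compose("stop")` takes them down again.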
User Access Authentication
We follow the Milvus authentication instructions to enable user access authentication. Specifically, we set `common.security.authorizationEnabled` to `true` in `milvus.yaml` when configuring Milvus.
A `root` user (password: `Milvus`) is created along with each Milvus instance by default. To protect your Milvus server from public access, edit the file `reset_password.py` to replace `YOUR_PASSWORD` with your real password, and then run the following command to apply the change. Make a note of the new password, as it is required to access the vector database.
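For orientation, a password reset of this kind can be written with `pymilvus` roughly as follows. This is a sketch, not the contents of the repo's `reset_password.py`. It assumes `pymilvus` is installed and that Milvus listens on its default port 19530. The function name `reset_root_password` is hypothetical.

```python
def reset_root_password(host: str, new_password: str,
                        old_password: str = "Milvus",
                        port: str = "19530") -> None:
    """Change the password of the built-in root user."""
    # Guard against running with the placeholder left in place.
    if new_password == "YOUR_PASSWORD":
        raise ValueError("replace the YOUR_PASSWORD placeholder with a real password")

    # Imported lazily so the module loads even where pymilvus is not installed.
    from pymilvus import connections, utility

    # Authenticate with the current (default) root credentials.
    connections.connect(alias="default", host=host, port=port,
                        user="root", password=old_password)
    utility.reset_password("root", old_password, new_password, using="default")
    connections.disconnect("default")
```

After the reset, clients must connect with the new password, and the default `Milvus` password no longer works.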
Test Milvus
To test the Milvus connection and operations, we run the following command from any computer. It tests connecting to the Milvus server, creating a Milvus index, querying the index, and finally deleting the index.
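A comparable smoke test can be sketched with the `MilvusClient` interface from a recent `pymilvus` release. This is an illustration of the connect, create, query, delete sequence, not the repo's own test script. The collection name, vector dimension, and the helper names `make_rows` and `smoke_test` are all assumptions.

```python
import random


def make_rows(count: int, dim: int, seed: int = 0) -> list[dict]:
    """Generate reproducible rows with an integer id and a random vector."""
    rng = random.Random(seed)
    return [
        {"id": i, "vector": [rng.random() for _ in range(dim)]}
        for i in range(count)
    ]


def smoke_test(host: str, password: str, user: str = "root", port: int = 19530,
               collection: str = "smoke_test_collection", dim: int = 8):
    """Connect, create a collection, insert, search, and clean up."""
    # Imported lazily so make_rows stays usable without pymilvus installed.
    from pymilvus import MilvusClient

    client = MilvusClient(uri=f"http://{host}:{port}", token=f"{user}:{password}")
    rows = make_rows(10, dim)
    try:
        # Quick-setup collection with fields "id" and "vector", indexed and loaded.
        client.create_collection(collection_name=collection, dimension=dim)
        client.insert(collection_name=collection, data=rows)
        # Query with the first inserted vector.
        return client.search(collection_name=collection,
                             data=[rows[0]["vector"]], limit=3)
    finally:
        # Always remove the temporary collection, even if a step failed.
        if client.has_collection(collection):
            client.drop_collection(collection_name=collection)
```

If `smoke_test("<server-address>", "<your-password>")` returns without raising, the server accepted the credentials and every step completed. Freshly inserted rows may not be visible to the search immediately, so an empty hit list alone does not indicate a failure.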
Stop Milvus
To stop Milvus, run the following command from the `milvus/standalone` directory.