Deploying LocalAI
Starting LocalAI
You can refer to the official Getting Started guide for deployment, or quickly integrate by following the steps below (these steps are derived from the LocalAI Data query example).
1. Clone the LocalAI code repository and navigate to the specified directory.
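This step can be sketched as follows (the repository lives under the mudler GitHub organization; the example directory path is an assumption based on the LocalAI examples tree and may have moved):

```shell
# Clone the LocalAI repository
git clone https://github.com/mudler/LocalAI
# Enter the data query example directory (path assumed; adjust to the example you follow)
cd LocalAI/examples/langchain-chroma
```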
2. Download the example LLM and Embedding models.
Here, we choose two smaller models that are compatible across all platforms: `ggml-gpt4all-j` serves as the default LLM model, and `all-MiniLM-L6-v2` serves as the default Embedding model, for quick local deployment.
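As a sketch, both files go into the example's `models/` directory. The URLs here are assumptions (the gpt4all.io address is where the GPT4All-J weights have commonly been hosted; the embedding model needs a ggml build of `all-MiniLM-L6-v2`, for which the example's README gives the authoritative source):

```shell
mkdir -p models
# LLM model (URL assumed; verify against the example's README)
wget https://gpt4all.io/models/ggml-gpt4all-j.bin -O models/ggml-gpt4all-j
# Embedding model: substitute the ggml build URL from the example's README
wget <url-of-ggml-all-MiniLM-L6-v2> -O models/all-MiniLM-L6-v2
```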
3. Configure the `.env` file.
NOTE: Ensure that the `THREADS` variable value in `.env` doesn't exceed the number of CPU cores on your machine.
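A minimal `.env` fragment for this step (the `THREADS` variable is the one named above; the value 4 assumes a machine with at least 4 CPU cores):

```
# Keep this at or below the number of CPU cores on your machine
THREADS=4
```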
4. Start LocalAI.
The LocalAI request API endpoint will be available at http://127.0.0.1:8080, and it provides two models:
- LLM model: `ggml-gpt4all-j`. External access name: `gpt-3.5-turbo` (this name is customizable and can be configured in `models/gpt-3.5-turbo.yaml`).
- Embedding model: `all-MiniLM-L6-v2`. External access name: `text-embedding-ada-002` (this name is customizable and can be configured in `models/embeddings.yaml`).
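For reference, the external-name mapping lives in the model's YAML file. A minimal sketch of `models/gpt-3.5-turbo.yaml` in LocalAI's model-config format (the real file from the example may carry additional fields such as backend or template settings):

```yaml
# Name that API clients (e.g. Dify) request
name: gpt-3.5-turbo
parameters:
  # Model file under the models/ directory that serves these requests
  model: ggml-gpt4all-j
```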
If you deploy Dify with Docker, pay attention to the network configuration and make sure the Dify container can reach the LocalAI endpoint. The Dify container cannot access localhost from inside; use the host's IP address instead.
5. Integrate the models into Dify.
Go to `Settings > Model Providers > LocalAI` and fill in:

Model 1: `ggml-gpt4all-j`
- Model Type: Text Generation
- Model Name: `gpt-3.5-turbo`
- Server URL: http://127.0.0.1:8080
If Dify is deployed via Docker, fill in the host domain: `http://<your-LocalAI-endpoint-domain>:8080`, which can be a LAN IP address, like `http://192.168.1.100:8080`.
Model 2: `all-MiniLM-L6-v2`
- Model Type: Embeddings
- Model Name: `text-embedding-ada-002`
- Server URL: http://127.0.0.1:8080
If Dify is deployed via Docker, fill in the host domain: `http://<your-LocalAI-endpoint-domain>:8080`, which can be a LAN IP address, like `http://192.168.1.100:8080`.
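Before saving the settings, it can help to sanity-check the endpoint from the machine Dify runs on, using LocalAI's OpenAI-compatible API routes (replace 127.0.0.1 with the host's LAN IP when Dify runs in Docker):

```shell
# Should list gpt-3.5-turbo and text-embedding-ada-002
curl http://127.0.0.1:8080/v1/models

# Ask the LLM for a chat completion under its external access name
curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Hello"}]}'
```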