OpenAI Compatible Services
Concept
After the launch of ChatGPT by OpenAI, many intermediaries and new AI services have emerged. These services provide OpenAI-compatible calling methods for the convenience of users already using OpenAI.
Configuration Instructions
Custom API Endpoint
The base request address for your custom AI model service's API must be compatible with the OpenAI interface format, ending with the path chat/completions, for example, https://api.groq.com/openai/v1/chat/completions.
You should configure it in Salad Dictionary as: https://api.groq.com/openai/v1/
apiKey
Typically a long string used to authenticate your request identity, which can be found on the account key settings page of your AI model service platform. Different platforms use different formats, and an incorrect or expired apiKey will not test successfully.
For example: sk-O6FclmECYFJyTB664b6681D06eEb47919b8570Be1fA2B434
.
The name of this apiKey may vary across platforms. For example, on the OpenRouter platform, it is called apiKey, while on the Groq platform, it is called api_key, and it may also be secret_key. They essentially mean the same thing.
Prompt
The prompt is the hint used for single-segment translation. If you do not understand what these hints are for, it is best to keep the default. If you do need to adjust it, please note:
- Note that
{{from}}
represents the language of the paragraph,{{to}}
represents the target language, and{{text}}
represents the text content of the paragraph. These placeholders need to be retained.
Model Name
Accurately speaking, it refers to the model name string sent during the request. Different platforms have different formats for model names, and different model names represent different model choices, which can affect billing and rate control. Please strictly follow the platform documentation to select and fill in, especially for those who want precise control over model versions. For example, in Ollama, phi3:14b-medium-4k-instruct-q4_0
represents the medium version of Microsoft's open-source model Phi3 (14B parameters, Medium size) with a context window of 4K size, fine-tuned with the Q4_0 quantization method. Be careful not to mistakenly use base models or non-instruct/chat models, as these models are not specifically trained on conversational data, and the instruction-following effect may be poor, which may lead to failure in returning translations.
Steps to Apply for ChatGPT
-
Prepare a ChatGPT account Please prepare a ChatGPT account and ensure you can access https://chat.openai.com and have normal conversations.
-
Create a Secret Key
- Open https://platform.openai.com/account/api-keys
- Click the "Create new secret key" button, and a pop-up will appear showing the Secret Key.
- Copy the Secret Key.
- Fill in OpenAI's Secret Key in Salad Dictionary Enter the Secret Key into Salad Dictionary under "Settings" - "Dictionary Account" - "ChatGPT".
What Other Services Are Compatible with the OpenAI Interface?
Here are some mainstream services compatible with the OpenAI interface. You can refer to the documentation of these services to configure Salad Dictionary.
Ollama Local Deployment Open Source Model
- Install, configure, and start Ollama
- Download and install Ollama from the official website.
- Set up cross-origin permissions and start
- macOS: Execute the command
launchctl setenv OLLAMA_ORIGINS "*"
in the command line, then start the app. - Windows: Go to Control Panel - System Properties - Environment Variables - User Environment Variables, create 2 new environment variables: Variable Name
OLLAMA_HOST
Variable Value0.0.0.0
, Variable NameOLLAMA_ORIGINS
Variable Value*
, then start the app. - Linux: Execute the command
OLLAMA_ORIGINS="*" ollama serve
in the command line.
- macOS: Execute the command
-
The translation service configuration is as follows: apiKey: ollama Model: Please refer to the specific Tags of the models in the model library, for example,
qwen:14b-chat-v1.5-q4_1
Custom API Endpoint: http://localhost:11434/v1/ If you are running the Ollama service on another host within the local network, replace localhost with your host's IP address. The concurrency rate should be adjusted according to the computing power of the host and the model used. -
Reference Documentation https://github.com/ollama/ollama/blob/main/docs/api.md https://github.com/ollama/ollama/issues/2335 For those using LM-Studio for deployment, you can refer to its documentation, as the configuration method is similar, but you need to download the model and run it first.
Groq Official Platform
- apiKey: Obtain the key from this page.
- Model: As of the writing of this article, there are four models: llama3-8b-8192, llama3-70b-8192, mixtral-8x7b-32768, gemma-7b-it. Please test and select according to your translation needs. Currently, the Chinese-to-English effect of these models is acceptable, but the English-to-Chinese effect is poor, and it is not recommended for English-to-Chinese scenarios.
- Custom API Endpoint: https://api.groq.com/openai/v1/
- Rate Control: You can check your account's request rate limit on this page. If the REQUESTS PER MINUTE for the selected model is 30, it is recommended to set the maximum request number per second to 1 or 2, and not too high.
Deepseek Official Platform
- apiKey: Obtain the key from this page.
- Model: As of the writing of this article, only the deepseek-chat model from this platform is recommended for translation.
- Custom API Endpoint: https://api.deepseek.com/
OpenRouter Intermediary Platform
- apiKey: Obtain the key from this page.
- Model: Check the model list on this model page. For example, anthropic/claude-3-haiku.
- Custom API Endpoint: https://openrouter.ai/api/v1/
- Rate Limit: Please refer to here. As of the writing of this article, if your account balance is $10, you can send 10 requests per second; if it is $20, then 20 QPS, and so on. Although concurrency can be high, since the platform also rents resources from the official platform and shares a large rate limit pool, if many users are making requests simultaneously, it may cause request failures. This situation is not a limitation of the OpenRouter platform; the HTTP response code for failed requests is 200, but the returned payload indicates a rate limit error, resulting in no translation being displayed (i.e., the returned text cannot be parsed, and the translation is empty). In this case, the plugin has not yet implemented corresponding handling for empty translation exceptions, and it is inconvenient to retry. When encountering this situation, you can only switch translation services to retranslate. Of course, you can also build your own API intermediary to handle such situations.
Others
Other platforms are similar, mainly involving obtaining apiKey, model name, request address, and paying attention to rate limit information.