Enabling local function calling with the gorilla-openfunctions-v2 7B LLM using the OpenAI protocol


Image by the author


In the rapidly evolving landscape of natural language processing and conversational AI, the ability to perform function calling within language models has emerged as a powerful tool for extending their capabilities. OpenAI's API has been at the forefront of this development, providing a standardized protocol for function calling. But what if you want to leverage the power of function calling with a local language model like gorilla-llm/gorilla-openfunctions-v2? In this article, we'll explore a Python implementation that bridges the gap, allowing you to use gorilla-llm/gorilla-openfunctions-v2 with the OpenAI function calling protocol.

The challenge

While OpenAI's API offers seamless function calling capabilities, integrating this functionality with a local language model like gorilla-llm/gorilla-openfunctions-v2 presents a challenge. The local model does not natively support the OpenAI protocol, so applications built around it cannot follow the standard adopted by business solutions and the community, which hurts system and code maintainability. This is where our Python implementation comes into play.

The solution

To address this challenge, I have developed a Python wrapper that acts as an intermediary between the OpenAI API client and the gorilla-llm/gorilla-openfunctions-v2 local language model. By serving gorilla-llm/gorilla-openfunctions-v2 with an engine that mimics the OpenAI chat completion protocol (such as Hugging Face TGI, vLLM, the Aphrodite engine, or others), we can intercept the input and output of the local model and re-format the engine response with our magic sauce: the function calling inference from the great Gorilla Open Functions V2.
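To make the interception idea concrete, here is a toy sketch (not the repository's actual code): any engine exposed as a callable stands in for the OpenAI-compatible serving layer, and a post-processing hook stands in for the Gorilla-specific function-call handling.

```python
# Toy sketch of the interception pattern described above. The names
# `engine` and `post_process` are illustrative placeholders: in practice
# the engine would be an HTTP call to TGI/vLLM/Aphrodite, and the
# post-processing step would be the Gorilla-specific parsing.

def wrap_engine(engine, post_process):
    """Return a new callable that forwards the prompt to the engine
    and re-formats the raw completion on the way back out."""
    def wrapped(prompt):
        raw = engine(prompt)       # the local model generates text
        return post_process(raw)   # our chance to reshape the output
    return wrapped
```

Because the wrapper is just a function around the engine, the client-facing interface stays unchanged while the output format is adapted in between.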

When a user interacts with the wrapped client, the wrapper first processes the user's input and injects the necessary function specifications and the model's special tags. This modified input is then passed to the local engine, which generates a response based on the given input and its own knowledge. Upon receiving the generated response, the wrapper analyzes it, looking for any function calls that match the provided function specifications. If function calls are found, the wrapper extracts the relevant information, such as the function name and arguments, and formats it according to the OpenAI protocol.
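The two wrapper steps just described can be sketched as plain functions. The prompt template and tag syntax below are illustrative assumptions, not the model's documented format; the parsing step relies on the model emitting Python-style calls such as `get_weather(city='Boston')`, which can be split apart with the standard `ast` module.

```python
import ast
import json

def build_prompt(user_query, functions):
    """Inject the function specifications and special tags into the
    text sent to the local engine (assumed template, for illustration)."""
    specs = json.dumps(functions)
    return f"<<functions>>{specs}<<question>>{user_query}"

def extract_call(model_output):
    """Look for a Python-style call like get_weather(city='Boston') in
    the engine's raw output and split it into name + keyword arguments.
    Returns None when the output is plain text rather than a call."""
    try:
        tree = ast.parse(model_output.strip(), mode="eval")
    except SyntaxError:
        return None
    call = tree.body
    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        return None
    args = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    return {"name": call.func.id, "arguments": args}
```

Using `ast` rather than a regex means nested quotes and literal argument values are handled for free, and plain conversational replies simply fail to parse and fall through as regular content.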

Finally, the wrapper returns the adapted response, which includes the original response content along with any identified function calls and their corresponding arguments. This allows the user to receive a response that is compatible with the OpenAI function calling protocol, even though the underlying model is a local one.
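Assuming the function call has already been parsed out of the raw completion, the adapted response can be assembled as a plain dictionary in the `chat.completion` shape. This is a minimal sketch of the OpenAI-style object, not the repository's exact code; one detail worth noting is that the OpenAI protocol serializes the call's arguments as a JSON string, not a nested object.

```python
import json
import time
import uuid

def to_openai_response(model, content, call=None):
    """Wrap the engine output in an OpenAI-style chat.completion dict;
    when a function call was detected, attach it as a tool_calls entry."""
    message = {"role": "assistant", "content": content}
    if call is not None:
        message["content"] = None
        message["tool_calls"] = [{
            "id": f"call_{uuid.uuid4().hex[:8]}",
            "type": "function",
            "function": {
                "name": call["name"],
                # the protocol expects arguments as a JSON-encoded string
                "arguments": json.dumps(call["arguments"]),
            },
        }]
    return {
        "id": f"chatcmpl-{uuid.uuid4().hex[:8]}",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": model,
        "choices": [{
            "index": 0,
            "message": message,
            "finish_reason": "tool_calls" if call else "stop",
        }],
    }
```

An unmodified OpenAI client can then consume this object exactly as it would a response from the official API.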

GitHub repository

To dive deeper into the implementation and explore the code, visit our GitHub repository: Local gorilla-openfunctions-v2 with OpenAI Function Calling Protocol. The repository contains the Python implementation along with a single simple example, because the solution itself is deliberately simple.

While the current implementation provides a simple and efficient solution, there is room for improvement in the parser of the LLM output. The response objects already produced by the serving engine (TGI, vLLM, Aphrodite, or others) could be reused, since they already mimic the OpenAI protocol, instead of building the response object entirely from scratch. This is an area where contributions from the community are welcome.


By leveraging this simple Python implementation, developers can now seamlessly integrate function calling capabilities into the gorilla-llm/gorilla-openfunctions-v2 local language model, opening up new possibilities for extending its functionality. We invite you to explore the repository, experiment with the code, and contribute to its development.

Stay tuned for more exciting developments in the world of natural language processing and conversational AI! Don't forget to follow, and leave a clap on the little hand =)

Best regards,