Introduction

Machine Learning Serving focused on GenAI & LLMs with simplicity as the top priority.

Installation

Stable:

pip install FastServeAI

Latest:

pip install git+https://github.com/gradsflow/fastserve-ai.git@main

Usage/Examples

YouTube: How to serve your own GPT like LLM in 1 minute with FastServe.

Serve Custom Model

To serve a custom model, subclass FastServe and implement its handle method, which receives a batch of inputs and returns the responses as a list.

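from typing import List

from fastserve import FastServe

# BaseRequest, FastServe's request wrapper, must also be imported;
# its module path is not shown in this example.


class MyModelServing(FastServe):
    def __init__(self):
        super().__init__(batch_size=2, timeout=0.1)
        # create_model(...) is a placeholder for your own model loader.
        self.model = create_model(...)

    def handle(self, batch: List[BaseRequest]) -> List[float]:
        # Unwrap the payload of each request, then run the model once per batch.
        inputs = [b.request for b in batch]
        response = self.model(inputs)
        return response


app = MyModelServing()
app.run_server()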

You can run the above script in a terminal, and it will launch a FastAPI server for your custom model.
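
The contract of handle can be tried out without FastServe installed. The following is a minimal sketch, not FastServe's own code: FakeRequest is a hypothetical stand-in that models only the request attribute used above, and toy_model is a placeholder for a real model. It illustrates the one property the handler needs to preserve, namely one result per request, in the same order as the batch.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class FakeRequest:
    """Hypothetical stand-in for the request wrapper; only `.request` is modeled."""

    request: Any


def toy_model(inputs: List[str]) -> List[float]:
    """Placeholder model: scores each input by its length."""
    return [float(len(text)) for text in inputs]


def handle(batch: List[FakeRequest]) -> List[float]:
    # Unwrap the payloads, run the model once on the whole batch,
    # and return one result per request, in the same order.
    inputs = [item.request for item in batch]
    return toy_model(inputs)


batch = [FakeRequest("hello"), FakeRequest("hi")]
results = handle(batch)
print(results)  # [5.0, 2.0]
```

Testing handle as a plain function like this, before wiring it into a server, makes it easier to catch a result list whose length or order does not match the incoming batch.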

Deploy

Lightning AI Studio ⚡️

python -m fastserve.deploy.lightning --filename main.py \
    --user LIGHTNING_USERNAME \
    --teamspace LIGHTNING_TEAMSPACE \
    --machine "CPU"  # T4, A10G or A10G_X_4

Contribute

Install in editable mode:

git clone https://github.com/gradsflow/fastserve-ai.git
cd fastserve-ai
pip install -e .

Create a new branch:

git checkout -b <new-branch>

Make your changes, commit and create a PR.

Last update: May 1, 2024