Press "Enter" to skip to content

Serving Databricks Models via API Management Endpoints

Drew Furgiuele makes available a model:

When it comes to generative AI projects I’d argue that the hardest and most tedious part has moved into a new area: hosting and serving your models. Whether you’re working with CPU intensive models, or models that require GPU horsepower, sourcing the hardware, building out deployment pipelines, configuring monitoring, and then securing everything is real, serious work that requires everyone to lean in to get it right.

And then, there’s the real question of how you’re going to use those models: will you be setting up automation and doing batch processing using your models and infrastructure? Or do you want to get really serious and offer up real-time inference? If the latter, you can add one more thing to solve for: managing your front-end APIs that you will have to build to support that use case.

Click through to see how you can use an API management tool (like Azure API Management) to assist in these things.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.