Every week at Modelbit we talk to dozens of ML teams at companies across all types of industries who have been given an incredibly important mission: finding meaningful ways to incorporate machine learning into their business and its products.
Increasingly our customers are using hosted cloud notebooks, like Deepnote, to build and train their ML models before using Modelbit to deploy them to production. When we heard about Deepnote’s new AI Copilot, we were curious to put it to the test to see if it could help ML teams develop meaningful ML models faster.
Can you go from idea to inference in minutes?
That’s what we set out to answer when we tested using Deepnote AI and Modelbit together. Below, we’ll walk through the process of deciding on a model, building and training it using AI, and deploying it to production with one line of code via Modelbit.
AI for code generation
In June of 2023 Deepnote released their new AI Copilot. In addition to providing autocomplete, Deepnote AI lets developers generate full cells of code based on English-language prompts. It’s especially good at generating the boilerplate code we all write as part of our day-to-day work. And nowhere is there more boilerplate code than in Machine Learning.
There’s been true innovation in the development of new modeling technologies themselves. It seems that every day there’s a new breakthrough, a new announcement from Google or Meta, or a new open-source model on HuggingFace. And for most ML practitioners, the art of feature selection, experimentation, and zeroing in on the right model still requires some human ingenuity.
But for every model that gets built, there’s just a lot of same-y boilerplate code that builds the model, trains it, and deploys it. In this post we’ll explore using Deepnote AI to build that code for us.
Building a fraud model
One of the most popular use cases we’ve noticed for incorporating ML into core products is fraud detection at financial technology companies. So in this walkthrough we’ll build a simple fraud model. We know how we want this to work: Pull down the training data, build a pipeline, fit it, and deploy it. Let’s see how the AI does.
We started by summarizing the problem in a prompt and asking for the import statements. Here’s what we got in return from the AI:
Not gonna lie: This is pretty damn close to perfect. It’s a great example of boilerplate: Normally we would have to Google constantly for the exact names of these modules, where they live in their packages, and so forth. Deepnote AI mostly just knows. It took a little back-and-forth to get the package names right, but all in all, a really nice time-saver.
We wrote the next part of the code ourselves: Pulling down the training DataFrame full of transactions, some of them fraudulent. Next, we logged into Modelbit, which we’ll use to deploy the model to a REST endpoint:
Now let’s pick some features and build training and target DataFrames. Can Deepnote AI do this for us?
Wow. Pandas syntax is famously counterintuitive and inconsistent with vanilla Python syntax. Making its C-level operations run fast requires these tradeoffs, but they make Pandas a constant source of Googling and trial-and-error whenever you do data manipulation work.
This is so far our favorite use of Deepnote AI: Write my boilerplate Pandas code for me! This code is more succinct than what we would have written by hand, and it was faster to produce. We’ll be using Deepnote AI for all our Pandas coding needs from now on.
The generation of the target DataFrame went similarly well:
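For readers following along without the notebook, here is a minimal sketch of this kind of feature and target selection in Pandas. It is not the AI’s actual output: the `transactions` table and every column name in it are hypothetical stand-ins for our training data.

```python
import pandas as pd

# Hypothetical transactions table; the column names are illustrative only.
transactions = pd.DataFrame({
    "amount": [12.5, 830.0, 45.1, 2999.99],
    "merchant_category": ["grocery", "electronics", "grocery", "travel"],
    "hour_of_day": [9, 23, 14, 3],
    "customer_id": [101, 102, 103, 104],
    "is_fraud": [0, 1, 0, 1],
})

feature_columns = ["amount", "merchant_category", "hour_of_day"]

# Indexing with a list of names selects several columns and returns a DataFrame.
X = transactions[feature_columns]

# A one-element list also returns a DataFrame (single brackets would give a Series).
y = transactions[["is_fraud"]]
```

The double-bracket form is exactly the sort of Pandas detail that is easy to forget: `transactions["is_fraud"]` gives a Series, while `transactions[["is_fraud"]]` gives a one-column DataFrame.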
Finally, we want to split the data into training and testing data, fit the pipeline, and score it. This is also very boilerplate code, the kind that would normally require lots of poring over Scikit-Learn docs to remember the exact function call syntax. Instead, we again used Deepnote’s AI:
We see that the AI has a penchant for re-importing libraries. This does appear to be an area for improvement, but not a major one. We quickly deleted these lines before proceeding.
Here’s a moment where the AI responds well to some feedback. The code is right the first time, which is impressive. But this pipeline will be a little bit brittle in production if the model receives a value for a categorical feature that it never saw during training. Rather than correct the code ourselves, we ask the AI to correct it for us:
The AI gets "handle_unknown='ignore'" right on the first try, another piece of syntax we’d otherwise have to spend time looking up.
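To make the brittleness concrete, here is a small sketch of a pipeline that one-hot encodes a categorical column with `handle_unknown="ignore"`. The data, the column names, and the choice of `LogisticRegression` are all assumptions for illustration, not the pipeline from our notebook.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical training data; column names are illustrative only.
train = pd.DataFrame({
    "amount": [12.5, 830.0, 45.1, 2999.99, 20.0, 1500.0],
    "merchant_category": ["grocery", "electronics", "grocery",
                          "travel", "grocery", "electronics"],
    "is_fraud": [0, 1, 0, 1, 0, 1],
})

preprocess = ColumnTransformer(
    transformers=[
        # Categories not seen during fit are encoded as all zeros
        # instead of raising an error at prediction time.
        ("category", OneHotEncoder(handle_unknown="ignore"), ["merchant_category"]),
    ],
    remainder="passthrough",  # numeric columns pass through unchanged
)

pipeline = Pipeline([
    ("preprocess", preprocess),
    ("model", LogisticRegression(max_iter=1000)),  # stand-in model
])
pipeline.fit(train[["amount", "merchant_category"]], train["is_fraud"])

# "crypto_exchange" never appeared in the training data.
unseen = pd.DataFrame({"amount": [50.0], "merchant_category": ["crypto_exchange"]})
prediction = pipeline.predict(unseen)
```

With the encoder’s default setting, `handle_unknown="error"`, that last `predict` call would raise a `ValueError` instead of returning a prediction.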
Here’s the AI getting us an r2 score:
It seems likely at this point that the AI is given the previous cells as part of its prompt every time we ask it for new code. Once again there are some unnecessary imports here. But otherwise, the code is correct on the first try.
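For reference, the scoring call itself is tiny. The sketch below uses made-up held-out labels and predictions rather than our real test set, so only the shape of the call carries over.

```python
from sklearn.metrics import r2_score

# Made-up held-out labels and model scores, for illustration only.
y_test = [0, 0, 1, 1]
y_pred = [0.1, 0.2, 0.7, 0.9]

# r2_score takes the true values first, then the predictions.
score = r2_score(y_test, y_pred)
```

Argument order matters here: swapping `y_test` and `y_pred` generally gives a different number.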
Deploying the model to a REST API
To deploy the model to production, we build a quick inference function – a function that takes the inputs we’ll receive in production and returns an inference. As we want to be careful about how our production code works, we write this cell ourselves:
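As a rough illustration (not the exact cell from our notebook), an inference function of this kind might look like the sketch below. The function name `predict_fraud`, its two parameters, and the tiny stand-in pipeline are all hypothetical.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# Tiny stand-in pipeline so the example runs on its own.
train = pd.DataFrame({
    "amount": [12.5, 830.0, 45.1, 2999.99],
    "merchant_category": ["grocery", "electronics", "grocery", "travel"],
    "is_fraud": [0, 1, 0, 1],
})
pipeline = Pipeline([
    ("preprocess", ColumnTransformer(
        [("category", OneHotEncoder(handle_unknown="ignore"), ["merchant_category"])],
        remainder="passthrough",
    )),
    ("model", DecisionTreeClassifier(random_state=0)),
])
pipeline.fit(train[["amount", "merchant_category"]], train["is_fraud"])


def predict_fraud(amount: float, merchant_category: str) -> int:
    """Return 1 if the transaction is predicted to be fraudulent, else 0."""
    # Rebuild a one-row DataFrame with the same column names used in training.
    row = pd.DataFrame([{"amount": amount, "merchant_category": merchant_category}])
    # Convert the NumPy integer to a plain int so the result serializes to JSON.
    return int(pipeline.predict(row)[0])
```

Taking plain Python values in and returning a plain `int` keeps the function easy to call over a REST API, where inputs and outputs travel as JSON.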
Finally, we ask the AI to deploy it for us to Modelbit!
Similarly impressive. While Modelbit and Deepnote are premier partners for deploying ML models written in Deepnote, we’re still a startup, and it speaks well of the AI that (spurious imports aside 😉) it knew not just how to deploy to Modelbit, but why we might want to do that.
Here are the results of running the AI’s deployment code:
Now that we’re deployed in Modelbit, we get all the benefits of a production model. Modelbit automatically detects and containerizes the model’s required Python environment:
Additionally, Modelbit provides our production APIs:
Since we’re already in Deepnote, let’s test the API right from our notebook:
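If you want to try the same thing, the sketch below builds such a request using only the standard library. The URL is a placeholder, and wrapping the arguments in a `"data"` key is our assumption about the payload shape; copy the real endpoint and example payload from your own deployment’s API page.

```python
import json
import urllib.request

# Placeholder URL: substitute the endpoint shown for your own deployment.
ENDPOINT_URL = "https://example.invalid/v1/predict_fraud/latest"


def build_request(url: str, amount: float, merchant_category: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for one transaction."""
    # Assumed payload shape: the function's arguments, in order, under "data".
    body = json.dumps({"data": [amount, merchant_category]}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(ENDPOINT_URL, 20.0, "grocery")

# Sending is left commented out because the URL above is a placeholder.
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(json.loads(response.read()))
```

Building the request separately from sending it makes it easy to inspect the payload in a notebook cell before pointing it at a live endpoint.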
As we can see, the REST API that calls our model is ready to be integrated into our product or website.
Opportunities for AI-generated ML code
The state of the art in AI-generated code, as available today in Deepnote AI, is clearly ready for action. We used human judgment in a couple of key places: Selecting and engineering our features; choosing our model technology; and building the actual inference function that will run over and over again in production.
Meanwhile, Deepnote AI was more than capable of handling the boilerplate: The raft of imports, the Pandas DataFrame manipulation, all the pipeline building and training and scoring. Opportunities remain in obvious areas like not re-importing libraries over and over again, as well as deeper areas like handling much of the model experimentation. We’re excited to see what comes next.
We started Modelbit to make it easier than ever for ML teams to deploy the models they build to production and make them available to call via endpoints like REST. It’s incredibly exciting for us to work in tandem with teams like the one at Deepnote to help ML teams move quicker and make a bigger impact.
Both Modelbit and Deepnote have free trials, so try it for yourself today!