Sep 9, 2024
Code Interpreting with Groq and E2B in JavaScript
This AI data analyst can plot a linear regression chart based on CSV data. It uses LLMs powered by Groq, and the Code Interpreter SDK by E2B for the code interpreting capabilities. The SDK quickly creates a secure cloud sandbox powered by Firecracker. Inside this sandbox is a running Jupyter server that the LLM can use.
Read more about models powered by Groq here.
The AI agent performs a data analysis task on an uploaded CSV file, executes the AI-generated code in the sandboxed environment by E2B, and returns a chart, saving it as a PNG file. The generated code processes and cleans the data in the CSV file and performs the assigned analysis, which includes plotting a chart.
Full code
Key links
Outline
Prerequisites
Install the SDKs
Set up the API keys and model instructions
Add code interpreting capabilities and initialize the model
Upload the dataset
Put everything together
Run the program and see the results
1. Prerequisites
Create an index.ts file for the main program, copy the env.template file, and save it to a .gitignore file.
Get the E2B API key here and the Groq API key here.
Download the CSV file from here and place it in the same directory as your program. Rename it to data.csv.
2. Install the SDKs
3. Set up the API keys and model instructions
In this step you load your E2B and Groq API keys into the program. In the JS & TS case, the API keys are stored in the .env file. You pick the model of your choice by uncommenting it. There are some recommended models that are great at code generation, but you can add a different one from here.
The model is assigned a data scientist role and given a description of the uploaded CSV. You can choose different data, but you need to update the instructions accordingly.
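As a rough illustration of this step, the sketch below gathers the two API keys, a model name, and the model instructions into one object. It is not the original code: the `AppConfig` shape, the `loadConfig` helper, the prompt wording, the sandbox path `/home/user/data.csv`, and the model identifier are all assumptions made for the example. The variable names `E2B_API_KEY` and `GROQ_API_KEY` are assumed to match the entries in your .env file.

```typescript
// Hypothetical shape of the settings this tutorial needs.
interface AppConfig {
  e2bApiKey: string;
  groqApiKey: string;
  modelName: string;
  systemPrompt: string;
}

// Assumed model identifier; swap in any model that Groq currently serves.
const MODEL_NAME = "llama-3.1-70b-versatile";

// Illustrative instructions, not the original prompt. The file path is an
// assumption about where the uploaded CSV ends up inside the sandbox.
const SYSTEM_PROMPT = [
  "You are a Python data scientist.",
  "A CSV file is available at /home/user/data.csv.",
  "Reply with a single fenced python code block that performs the requested analysis.",
].join("\n");

// Reads both keys from an environment-like object and fails early if one is missing.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const e2bApiKey = env["E2B_API_KEY"];
  const groqApiKey = env["GROQ_API_KEY"];
  if (!e2bApiKey) throw new Error("E2B_API_KEY is not set");
  if (!groqApiKey) throw new Error("GROQ_API_KEY is not set");
  return { e2bApiKey, groqApiKey, modelName: MODEL_NAME, systemPrompt: SYSTEM_PROMPT };
}
```

In a Node.js program, a typical call would be `loadConfig(process.env)` once the .env file has been loaded, for example with the dotenv package. If you change the dataset, the description inside `SYSTEM_PROMPT` is the part to update.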
4. Add code interpreting capabilities and initialize the model
Now we define the function that will use the code interpreter by E2B. Every time the LLM assistant decides that it needs to execute code, this function will be used. Read more about the Code Interpreter SDK here.
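The shape of such a function might look like the sketch below. To keep it self-contained, the sandbox is represented by a hypothetical `CellRunner` interface rather than the real SDK object, and `CellResult` and `CellExecution` are simplified stand-ins. The actual class and method names depend on the SDK version you install, so treat every name here as an assumption.

```typescript
// Hypothetical, simplified view of one execution result (for example a chart).
interface CellResult {
  text?: string;
  png?: string; // base64-encoded image, if the cell produced one
}

// Hypothetical, simplified view of what running one notebook cell returns.
interface CellExecution {
  results: CellResult[];
  stdout: string[];
  stderr: string[];
  error?: { name: string; value: string };
}

// Stand-in for the sandbox; the real SDK object and method names may differ.
interface CellRunner {
  runCell(code: string): Promise<CellExecution>;
}

// Runs LLM-generated code in the sandbox and returns its results.
// If the cell raised an error, that error is rethrown on the host side.
async function codeInterpret(runner: CellRunner, code: string): Promise<CellResult[]> {
  const execution = await runner.runCell(code);
  if (execution.error) {
    throw new Error(`${execution.error.name}: ${execution.error.value}`);
  }
  return execution.results;
}
```

Rethrowing is one possible design choice. Returning the error text to the LLM so that it can try again would be another.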
We also initialize the Groq client. The function for matching code blocks is important because we need to pick the right part of the output that contains the code produced by the LLM. The chat function takes care of the interaction with the LLM. It calls the E2B code interpreter whenever there is code to run.
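A minimal sketch of the matching and chat logic follows. It assumes the model wraps its code in a fenced block tagged `python`. The LLM call and the sandbox execution are passed in as the hypothetical functions `complete` and `runCode`, so the example runs without either SDK. None of these names come from the original program.

```typescript
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Stand-in for the LLM request: takes the conversation, returns the reply text.
type Complete = (messages: ChatMessage[]) => Promise<string>;

// Stand-in for sandbox execution: takes Python code, returns its results.
type RunCode = (code: string) => Promise<unknown[]>;

// Three backticks, written as escapes so this example can itself sit in a fenced block.
const FENCE = "\u0060\u0060\u0060";

// Matches the first fenced block tagged "python" and captures its body.
const PYTHON_BLOCK = new RegExp(FENCE + "python[ \\t]*\\r?\\n([\\s\\S]*?)" + FENCE);

// Returns the code inside the first python block, or null if there is none.
function matchCodeBlock(llmResponse: string): string | null {
  const body = PYTHON_BLOCK.exec(llmResponse)?.[1];
  return body === undefined ? null : body.trim();
}

// Sends the task to the LLM and runs whatever code comes back.
async function chat(
  complete: Complete,
  runCode: RunCode,
  systemPrompt: string,
  userMessage: string,
): Promise<unknown[]> {
  const reply = await complete([
    { role: "system", content: systemPrompt },
    { role: "user", content: userMessage },
  ]);
  const code = matchCodeBlock(reply);
  if (code === null) {
    return []; // the model answered without code, so there is nothing to run
  }
  return runCode(code);
}
```

With the Groq SDK, `complete` would typically wrap a chat completion request and return the content of the first choice. `runCode` would forward to the function that uses the E2B code interpreter.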
5. Upload the dataset
The CSV data is uploaded programmatically, not via AI-generated code. The code interpreter by E2B runs inside the E2B sandbox. Read more about the file upload here.
6. Put everything together
Finally, we put everything together and let the AI assistant upload the data, run an analysis, and generate a PNG file with a chart. You can update the task for the assistant in this step. If you decide to change the CSV file you are using, don't forget to update the prompt too.
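The overall flow can be sketched as below: upload the dataset, ask the assistant for the analysis, pick the first result that carries a PNG, and save it. The `Steps` interface, the `run` function, and the output name `image_1.png` are hypothetical. Each step is injected so that the sketch stays independent of the SDKs, and the sandbox is closed even when a step fails.

```typescript
// Simplified view of an execution result; only the PNG payload matters here.
interface ChartResult {
  png?: string; // base64-encoded image data
}

// Hypothetical bundle of the operations built in the earlier steps.
interface Steps {
  upload(localPath: string): Promise<string>; // returns the path inside the sandbox
  analyze(task: string): Promise<ChartResult[]>; // LLM chat plus code execution
  savePng(fileName: string, base64: string): Promise<void>;
  close(): Promise<void>; // shuts the sandbox down
}

// Runs the whole pipeline and returns the saved file name, or null if no chart came back.
async function run(steps: Steps, task: string): Promise<string | null> {
  try {
    const remotePath = await steps.upload("./data.csv");
    const results = await steps.analyze(`${task}\nThe dataset is at ${remotePath}.`);
    for (const result of results) {
      if (result.png !== undefined) {
        const fileName = "image_1.png"; // assumed output name
        await steps.savePng(fileName, result.png);
        return fileName;
      }
    }
    return null;
  } finally {
    await steps.close();
  }
}
```

In Node.js, `savePng` could be implemented with `fs.writeFileSync(fileName, Buffer.from(base64, "base64"))`. The `task` string is where you describe the analysis, for example a linear regression of life expectancy against GDP per capita.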
7. Run the program and see the results
The file is generated within the notebook. The plot shows the linear regression of the relationship between GDP per capita and life expectancy from the CSV data.