If you’re an aspiring AI professional, becoming an LLM engineer offers an exciting and promising career path.
But where should you start? What should your trajectory look like? How should you learn?
In one of my previous posts, I laid out the complete roadmap to become an AI / LLM Engineer. Reading that article will give you insights into the types of skills you’ll need to acquire and how to start learning.
The Best Way to Learn is to BUILD!
As Andrej Karpathy puts it:
Karpathy's message on how to become an expert at a thing
Andrej emphasizes that you should build concrete projects, and explain everything you learn in your own words. (He also instructs us to only compare ourselves to a younger version of ourselves – never to others.)
And I agree – building projects is the best way to not just learn but truly grok these concepts. It also sharpens the skills you’re learning by pushing you to think about cutting-edge use cases.
But the main challenge with this learning philosophy is that good projects can be hard to find.
And that’s the problem I am trying to resolve. I want to help people, including myself, discover and build practical and real-world projects that help you develop skills that are worth showcasing in your portfolio.
If you’re a beginner who knows basic to intermediate programming, your initial projects should showcase that you can comfortably build applications with LLMs.
They should demonstrate that:
you know what APIs are
you know how to consume them
you know how to build products that people actually want to use
Building a chatbot is a great starting point, but at this point everyone has built one, and there are plenty of solutions for quick Streamlit-based prototypes. So you need to develop something that’s actually usable and has the potential to reach a wider audience.
I’d suggest building a chatbot for WhatsApp, Discord, or Telegram – one that solves a problem people struggle with, a problem that companies have started building solutions for.
If I had to pick a good and, arguably, the most common AI project that every company has started to work on, it would be RAG-powered chatbots.
But before you get to building RAG-powered bots, you should start building something slightly more basic but practical with LLMs.
To kick things off, let’s start by building a YouTube Summariser.
```python
## setting up the language model
from langchain_together import ChatTogether
import api_key

llm = ChatTogether(
    api_key=api_key.api,
    temperature=0.0,
    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
)
```
The next step is to process the YouTube videos as a data source. For this we’ll need to understand the concept of document loaders.
Document loaders provide a unified interface to load data from various sources into a standardized Document format.
They automatically extract and attach relevant metadata to the loaded content.
The metadata can include source information, timestamps, or other contextual data that can be valuable for downstream processing.
LangChain offers loaders for CSV, PDF, HTML, JSON, and even specialized loaders for sources like YouTube transcripts or GitHub repositories, as listed in their integrations page.
LangChain supports different types of document loaders
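The loading step itself looks something like the sketch below, using LangChain’s YouTube loader (the URL is a placeholder, and the exact import path and `add_video_info` parameter may vary across LangChain versions – treat this as an assumption-based sketch, not the canonical call):

```python
# Load a YouTube transcript into the standard Document format
# (sketch; requires the langchain-community and youtube-transcript-api packages)
from langchain_community.document_loaders import YoutubeLoader

loader = YoutubeLoader.from_youtube_url(
    "https://www.youtube.com/watch?v=VIDEO_ID",  # placeholder URL
    add_video_info=False,
)
data = loader.load()  # a list of Document objects with page_content and metadata
```

The `data` variable returned here is what the rest of the tutorial works with.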
Use the LLM to summarize and extract key points from the transcript:
```python
# show the extracted page content
data[0].page_content
```
The page_content attribute contains the complete transcript as shown in the output below:
Youtube video transcript from the youtube loader
Now that we have the transcript, we simply need to pass this to the LLM we configured above along with the prompt to summarise.
First, let’s understand a simple method:
LangChain offers the invoke() method, to which you pass the system message and the user (human) message.
The system message is essentially the instructions for the LLM on how it is supposed to process the human request.
And the human message is simply what we want the LLM to do.
```python
# Build a list of messages for the language model:
# 1. A system message with instructions on how to summarize the video transcript
# 2. A human message containing the actual video transcript
# The messages are then passed to the language model (llm) for processing,
# and the model's response is stored in 'ai_msg'
messages = [
    (
        "system",
        """Read through the entire transcript carefully. Provide a concise summary of the video's main topic and purpose.
Extract and list the five most interesting or important points from the transcript. For each point:
State the key idea in a clear and concise manner.
- Ensure your summary and key points capture the essence of the video without including unnecessary details.
- Use clear, engaging language that is accessible to a general audience.
- If the transcript includes any statistical data, expert opinions, or unique insights, prioritize including these in your summary or key points.""",
    ),
    ("human", data[0].page_content),
]
ai_msg = llm.invoke(messages)
ai_msg
```
But this method becomes unwieldy when you have more variables and want a more dynamic solution.
A PromptTemplate in LangChain is a powerful tool that helps in creating dynamic prompts for large language models (LLMs). It allows you to define a template with placeholders for variables that can be filled in with actual values at runtime.
This helps in managing and reusing prompts efficiently, ensuring consistency and reducing the likelihood of errors in prompt creation.
A PromptTemplate consists of:
Template String: The actual prompt text with placeholders for variables.
Input Variables: A list of variables that will be replaced in the template string at runtime.
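To see how those two pieces interact at runtime, here is a hypothetical miniature version of the idea built on plain `str.format` – this is not LangChain’s actual implementation, just an illustration of template substitution:

```python
# A toy stand-in for PromptTemplate: a template string plus declared
# input variables that are substituted at runtime
class MiniPromptTemplate:
    def __init__(self, input_variables, template):
        self.input_variables = input_variables
        self.template = template

    def format(self, **kwargs):
        # refuse to render if a declared variable is missing
        missing = set(self.input_variables) - set(kwargs)
        if missing:
            raise KeyError(f"missing template variables: {missing}")
        return self.template.format(**kwargs)

prompt = MiniPromptTemplate(
    input_variables=["video_transcript"],
    template="Summarise this transcript:\n{video_transcript}",
)
print(prompt.format(video_transcript="hello world"))
```

The real PromptTemplate (shown next) follows the same shape: declare the variables, write the template, fill the values in later.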
```python
# Set up a prompt template for summarizing a video transcript using LangChain
# Import necessary classes from LangChain
from langchain.prompts import PromptTemplate
from langchain import LLMChain

# Define a PromptTemplate for summarizing video transcripts
# The template includes instructions for the AI model on how to process the transcript
product_description_template = PromptTemplate(
    input_variables=["video_transcript"],
    template="""Read through the entire transcript carefully. Provide a concise summary of the video's main topic and purpose.
Extract and list the five most interesting or important points from the transcript. For each point:
State the key idea in a clear and concise manner.
- Ensure your summary and key points capture the essence of the video without including unnecessary details.
- Use clear, engaging language that is accessible to a general audience.
- If the transcript includes any statistical data, expert opinions, or unique insights, prioritize including these in your summary or key points.

Video transcript: {video_transcript}
""",
)
```
A chain is a sequence of steps that consists of a language model, PromptTemplate, and an optional output parser.
Create an LLMChain with the custom prompt template
Generate a summary of the video transcript using the chain
Here, we are using LLMChain, but you can also use the LangChain Expression Language (LCEL) to do this:
```python
## invoke the chain with the video transcript
chain = LLMChain(llm=llm, prompt=product_description_template)

# Run the chain with the video transcript
summary = chain.invoke({"video_transcript": data[0].page_content})
```
This will give you the summary object, whose text field contains the response in Markdown format.
summary['text']
The raw response will look like this:
summary response from simple LLM chain
To see the Markdown formatted response:
```python
from IPython.display import Markdown, display

display(Markdown(summary['text']))
```
And there you go:
Structure summary display using Markdown function
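Before moving on – the LangChain Expression Language mentioned earlier composes the same kind of chain with the `|` operator instead of LLMChain. Here is a toy illustration of that pipe idea, with a hypothetical `Runnable` class and a stand-in “LLM” (this is not LangChain’s real implementation):

```python
# Toy version of LCEL's pipe: each stage's invoke() output feeds the next stage
class Runnable:
    def __init__(self, fn):
        self.fn = fn

    def invoke(self, x):
        return self.fn(x)

    def __or__(self, other):
        # chain two runnables: run self, then feed the result to other
        return Runnable(lambda x: other.invoke(self.invoke(x)))

prompt = Runnable(lambda d: f"Summarise: {d['video_transcript']}")
fake_llm = Runnable(lambda p: p.upper())  # stands in for a real model call

chain = prompt | fake_llm
print(chain.invoke({"video_transcript": "hello"}))  # → SUMMARISE: HELLO
```

In real LCEL you would write `product_description_template | llm` and call `.invoke()` on the result in the same way.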
So, the core functionality of our YouTube summariser is now working.
But this only works inside your Jupyter Notebook. To make it more accessible, we need to deploy this functionality on WhatsApp.
Establishing connection between youtube and flask server using Twilio
For this, we need to serve our YT summarisation functionality as an API endpoint, for which we’ll use Flask. (You could also use FastAPI.)
Now we’ll turn the code from the Jupyter notebook into functions: add a function that checks whether a string is a valid YouTube URL, then define a summarise function that packages up what we wrote in the notebook.
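The URL check can be as simple as a regular expression. Here is one possible sketch – the pattern is an assumption and only covers the common `watch?v=` and `youtu.be` formats:

```python
import re

def is_youtube_url(url: str) -> bool:
    """Return True if the string looks like a YouTube video URL."""
    # hypothetical pattern: optional scheme and www, then one of the two
    # common URL shapes, followed by an 11-character video id
    pattern = r"^(https?://)?(www\.)?(youtube\.com/watch\?v=|youtu\.be/)[\w-]{11}"
    return re.match(pattern, url or "") is not None

print(is_youtube_url("https://www.youtube.com/watch?v=dQw4w9WgXcQ"))  # True
print(is_youtube_url("https://example.com/watch?v=abc"))  # False
```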
You can configure the endpoint in the following manner:
```python
@app.route('/summary', methods=['POST'])
def summary():
    url = request.form.get('Body')  # Get the message body from the incoming request
    print(url)
    if is_youtube_url(url):
        response = summarise(url)
    else:
        response = "please check if this is a correct youtube video url"
    print(response)
    resp = MessagingResponse()
    msg = resp.message()
    msg.body(response)
    return str(resp)
```
Once your app.py is ready with your Flask API, run the Python script, and you should have your server running locally on your system.
The next step is to make your local server connect with WhatsApp, and that’s where we’ll use Twilio.
Twilio enables this handshake by offering a WhatsApp sandbox where you can test your bot. You can follow the steps in this guide to set up the connection.
This tutorial is the first part of a 3-project course that contains two other, more complex projects. I’ll give you a brief summary of those projects here so you can try them out yourself. And if you’re interested, you can check out the course as well.
This bot acts as a customer service representative for an airline. It can answer questions related to flight status, baggage inquiries, ticket booking, and more. It uses LangChain’s router chains and LLMs to dynamically generate responses based on the user’s input.
Different prompt templates are defined for various customer queries, such as flight status, baggage inquiries, and complaints.
Based on the query, the router selects the appropriate template and generates a response.
Twilio then sends the response back to the WhatsApp chat.
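The routing step above can be sketched in miniature like this – the keywords and templates here are made up for illustration, and the actual project uses LangChain’s router chains rather than keyword matching:

```python
# Toy sketch of prompt routing: pick a prompt template by query category
TEMPLATES = {
    "flight_status": "You answer flight status questions.\nQuery: {query}",
    "baggage": "You answer baggage inquiries.\nQuery: {query}",
    "complaint": "You handle customer complaints empathetically.\nQuery: {query}",
}
KEYWORDS = {
    "flight_status": ["flight", "delayed", "departure"],
    "baggage": ["baggage", "luggage", "bag"],
    "complaint": ["complaint", "refund", "terrible"],
}

def route(query: str) -> str:
    # select the first category whose keywords appear in the query
    q = query.lower()
    for category, words in KEYWORDS.items():
        if any(w in q for w in words):
            return TEMPLATES[category].format(query=query)
    return "You are a general airline assistant.\nQuery: " + query

print(route("Where is my lost baggage?"))
```

The selected template then becomes the system prompt for the LLM call, exactly as in the single-template summariser above.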
Wiplane's project 2 - Airline customer support to handle different types of queries
This chatbot answers questions related to airline services using a document-based system. The document is converted into embeddings, which are then queried using LangChain’s RAG setup to generate responses. Companies these days want developers with exactly these skills, so this is an especially practical project.
The guidelines/rules document is embedded using FAISS and HuggingFace models.
When a user submits a question, the RAG system retrieves relevant information from the document.
The system then generates a response using a pre-trained LLM and sends it back via Twilio.
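The retrieve-then-generate loop above can be sketched in a few lines – here, simple word overlap stands in for the vector similarity that FAISS computes over real embeddings, and the document snippets are made up:

```python
# Toy sketch of retrieval-augmented generation: rank documents by word
# overlap with the question (a stand-in for embedding similarity search)
def retrieve(question, docs, k=1):
    q_words = set(question.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

# made-up snippets standing in for the embedded guidelines document
docs = [
    "checked baggage allowance is 23 kg per passenger",
    "flight changes are free within 24 hours of booking",
]
context = retrieve("what is the baggage allowance", docs)[0]
prompt = f"Answer using only this context: {context}"
print(prompt)  # this prompt, with the retrieved snippet, goes to the LLM
```

In the real project, FAISS plus a HuggingFace embedding model replaces `retrieve`, and a pre-trained LLM turns the prompt into the final answer.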
Wiplane's project 3 - RAG powered support bot
These 3 projects will get you started so you can continue experimenting and learning more about AI engineering.
Wiplane's 3 project course on building LLM powered whatsapp chatbots
Customer support is one of the most heavily funded categories in AI, because costs drop immediately when AI can handle communication with disgruntled users.
So we build bots that can handle different types of queries, including intelligent RAG-powered bots with access to proprietary documents, so they can provide up-to-date information to users.
In this tutorial, we focused on building a fun YouTube video summarizer tool that is served on WhatsApp.
The bot's core functionality includes:
Receiving a YouTube URL
Validating the URL
Retrieving the video transcript
Using an LLM to summarize the content
Returning the summary to the user
We used a number of Python packages including langchain-together, langchain-community, langchain, pytube, and youtube-transcript-api.
The project uses the Llama 3.1 model via Together AI's API.
We built the core summarisation functionality in two ways:
Using LangChain's invoke() method with system and human messages
Using PromptTemplate and LLMChain for more dynamic solutions
To make the tool accessible via WhatsApp:
The functionality is served as an API endpoint using Flask
Twilio is used to connect the local server with WhatsApp
A WhatsApp sandbox is used for testing the bot
To continue building further projects, check out the course.
It is a beginner-track course where you start by learning to build with LLMs, then apply those skills to build 3 different types of LLM applications. Not just that – you learn to serve your applications as WhatsApp chatbots.