Sunday, September 14, 2025

Building Analysis Agents for Tech Insights

If you asked ChatGPT something like, “Please scout all of tech for me and summarize trends and patterns based on what you think I'd be interested in,” you know you'd get something generic, where it searches a few websites and news sources and hands you those.

This is because ChatGPT is built for general use cases. It applies standard search methods to fetch information, often limiting itself to a few web pages.

Simple illustration of using ChatGPT search to build a report | Image by author

This article will show you how to build a niche agent that can scout all of tech, aggregate millions of texts, filter news based on a persona, and find patterns and themes you can act on.

The goal of this workflow is to avoid sitting and scrolling through forums and social media on your own. The agent should do it for you, grabbing whatever is useful.

We'll be able to pull this off using a unique data source, a controlled workflow, and some prompt chaining techniques.

The three different processes: the API, fetching/filtering data, summarizing | Image by author

By caching data, we can keep the cost down to a few cents per report.

If you want to try the bot without booting it up yourself, you can join this Discord channel. You'll find the repository here if you want to build it on your own.

This article focuses on the general architecture and how to build it, not the smaller coding details, as you can find those on GitHub.

Notes on building

If you're new to building with agents, you might feel like this one isn't groundbreaking enough.

However, if you want to build something that works, you'll need to apply quite a bit of software engineering to your AI applications. Even though LLMs can now act on their own, they still need guidance and guardrails.

For workflows like this, where there's a clear path the system should take, you should build more structured, “workflow-like” systems. If you have a human in the loop, you can work with something more dynamic.

The reason this workflow works so well is that I have a good data source behind it. Without this data moat, the workflow wouldn't be able to do better than ChatGPT.

Preparing and caching data

Before we can build an agent, we need to prepare a data source it can tap into.

Something I think a lot of people get wrong when they work with LLM systems is the assumption that AI can process and aggregate data entirely on its own.

At some point, we might be able to give models enough tools to build on their own, but we're not there yet in terms of reliability.

So when we build systems like this, we need data pipelines that are just as clean as for any other system.

The system I've built here uses a data source I already had available, which means I understand how to teach the LLM to tap into it.

It ingests thousands of texts from tech forums and websites per day and uses small NLP models to break down the main keywords, categorize them, and analyze sentiment.

This lets us see which keywords are trending within different categories over a specific time period.


To build this agent, I added another endpoint that collects “facts” for each of these keywords.

This endpoint receives a keyword and a time period, and the system sorts comments and posts by engagement. Then it processes the texts in chunks with smaller models that decide which “facts” to keep.

The “facts” extraction process for each keyword | Image by author
We apply a final LLM to summarize which facts are most important, keeping the source citations intact.
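As a minimal sketch of that per-keyword pipeline (the data shapes and function names here are my assumptions, not the repo's exact code, and a simple callback stands in for the small model):

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Post:
    text: str
    engagement: int
    source_id: str

def chunk_posts(posts: List[Post], chunk_size: int = 3) -> List[List[Post]]:
    """Sort posts by engagement (highest first) and split into model-sized chunks."""
    ranked = sorted(posts, key=lambda p: p.engagement, reverse=True)
    return [ranked[i:i + chunk_size] for i in range(0, len(ranked), chunk_size)]

def extract_facts(posts: List[Post], keep: Callable[[str], bool]) -> List[Dict[str, str]]:
    """Run a small model (stubbed here as `keep`) over each chunk,
    keeping the selected facts with their source citations intact."""
    facts = []
    for chunk in chunk_posts(posts):
        for post in chunk:
            if keep(post.text):
                facts.append({"fact": post.text, "source": post.source_id})
    return facts
```

The real chunk size would be tuned to the small model's context window; the key idea is that no single call ever sees more text than it can reliably judge.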


This is a kind of prompt chaining process, and I built it to mimic LlamaIndex's citation engine.

The first time the endpoint is called for a keyword, it can take up to half a minute to complete. But because the system caches the result, any repeat request takes only a few milliseconds.

As long as the models are small enough, the cost of running this on a few hundred keywords per day is minimal. Later, we can have the system run several keywords in parallel.
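A stripped-down version of that cache-then-parallelize pattern might look like this (an in-memory dict stands in for the real cache, a short sleep stands in for the slow first extraction run, and the names are illustrative):

```python
import asyncio
from typing import Dict, List, Tuple

_cache: Dict[Tuple[str, str], dict] = {}

async def fetch_facts(keyword: str, period: str) -> dict:
    """First call per (keyword, period) pays the extraction cost; repeats hit the cache."""
    key = (keyword, period)
    if key not in _cache:
        await asyncio.sleep(0.01)  # stand-in for the ~30-second fact-extraction run
        _cache[key] = {"keyword": keyword, "facts": [f"facts about {keyword}"]}
    return _cache[key]

async def fetch_all(keywords: List[str], period: str = "weekly") -> List[dict]:
    """Fan out one task per keyword and gather the results concurrently."""
    return await asyncio.gather(*(fetch_facts(k, period) for k in keywords))
```

`asyncio.gather` preserves input order, so results line up with the requested keywords even though the calls overlap in time.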

You can probably imagine by now that we can build a system that fetches these keywords and facts to build different reports with LLMs.

When to work with small vs. larger models

Before moving on, let's just note that choosing the right model size matters.

I think this is on everyone's mind right now.

There are quite advanced models you can use for any workflow, but as we start to apply more and more LLMs to these applications, the number of calls per run adds up quickly, and this can get expensive.

So, when you can, use smaller models.

You saw that I used smaller models to cite and group sources in chunks. Other tasks that are great for small models include routing and parsing natural language into structured data.

If you find that the model is faltering, you can break the task down into smaller problems and use prompt chaining: first do one thing, then use that result to do the next, and so on.
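The pattern is simple enough to sketch generically. Here `call_model` is just a placeholder for a real small-model API call, and each step's output becomes the next step's input:

```python
from typing import Callable, List

def call_model(prompt: str) -> str:
    """Placeholder for a real small-LLM call; wraps the prompt so the chain is visible."""
    return f"answer({prompt})"

def run_chain(steps: List[str], text: str, model: Callable[[str], str] = call_model) -> str:
    """Feed the result of each prompt into the next one."""
    result = text
    for step in steps:
        result = model(f"{step}: {result}")
    return result
```

Because each step has a narrow job, a small model that would fail at the combined task can usually handle each link of the chain.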

You still want to use larger LLMs when you need to find patterns in very large texts, or when you're talking with humans.

In this workflow, the cost is minimal because the data is cached, we use smaller models for most tasks, and the only truly large LLM calls are the final ones.

How this agent works

Let's go through how the agent works under the hood. I built the agent to run inside Discord, but that's not the focus here. We'll concentrate on the agent architecture.

I split the process into two parts: one for setup and one for news. The first process asks the user to set up their profile.

Since I already know how to work with the data source, I've built a fairly extensive system prompt that helps the LLM translate these inputs into something we can fetch data with later.

PROMPT_PROFILE_NOTES = """
You're tasked with defining a consumer persona based mostly on the consumer's profile abstract.
Your job is to:
1. Decide a brief persona description for the consumer.
2. Choose essentially the most related classes (main and minor).
3. Select key phrases the consumer ought to observe, strictly following the foundations under (max 6).
4. Resolve on time interval (based mostly solely on what the consumer asks for).
5. Resolve whether or not the consumer prefers concise or detailed summaries.
Step 1. Persona
- Write a brief description of how we should always take into consideration the consumer.
- Examples:
- CMO for non-technical product → "non-technical, skip jargon, deal with product key phrases."
- CEO → "solely embrace extremely related key phrases, no technical overload, straight to the purpose."
- Developer → "technical, eager about detailed developer dialog and technical phrases."
[...]
"""

I've also defined a schema for the outputs I need:

from pydantic import BaseModel
from typing import List

class ProfileNotesResponse(BaseModel):
    persona: str
    major_categories: List[str]
    minor_categories: List[str]
    keywords: List[str]
    time_period: str
    concise_summaries: bool

Without domain knowledge of the API and how it works, it's unlikely that an LLM would figure out how to do this on its own.

You could try building a more extensive system where the LLM first tries to learn the API or the systems it's supposed to use, but that would make the workflow more unpredictable and costly.

For tasks like this, I try to always use structured outputs in JSON format. That way we can validate the result, and if validation fails, we re-run it.

This is the easiest way to work with LLMs in a system, especially when there's no human in the loop to check what the model returns.
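A minimal version of that validate-and-re-run loop could look like this (plain-`json` type checks for brevity here; the real code would validate against the Pydantic schema above, and `ask_llm` stands in for the actual API call):

```python
import json
from typing import Callable, Optional

# Required fields and their expected JSON types (assumed to mirror the schema).
REQUIRED = {"persona": str, "keywords": list, "time_period": str, "concise_summaries": bool}

def parse_profile(raw: str) -> Optional[dict]:
    """Return the parsed profile if it validates, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if isinstance(data, dict) and all(isinstance(data.get(k), t) for k, t in REQUIRED.items()):
        return data
    return None

def get_profile(ask_llm: Callable[[], str], max_retries: int = 3) -> dict:
    """Call the model up to `max_retries` times until it returns valid JSON."""
    for _ in range(max_retries):
        profile = parse_profile(ask_llm())
        if profile is not None:
            return profile
    raise RuntimeError("model never returned a valid profile")
```

The retry cap matters: without it, a model stuck on a malformed format would loop forever and burn tokens.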

Once the LLM has translated the user profile into the properties we defined in the schema, we store the profile somewhere. I used MongoDB, but that's optional.

Storing the persona isn't strictly required, but you do need to translate what the user says into a form that lets you fetch data later.

Generating the reports

Let's look at what happens in the second step, when the user triggers a report.

When the user hits the /news command, with or without a time period set, we first fetch the user profile data we've stored.

This gives the system the context it needs to fetch relevant data, using both categories and keywords tied to the profile. The default time period is weekly.
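In code, that step is little more than assembling a query from the stored profile, with the period defaulting to weekly (the field names follow the schema above; the function name is my own):

```python
from typing import Optional

def build_news_query(profile: dict, time_period: Optional[str] = None) -> dict:
    """Turn a stored profile into the parameters used to fetch trending keywords."""
    return {
        "categories": profile["major_categories"] + profile["minor_categories"],
        "keywords": profile["keywords"],
        "time_period": time_period or "weekly",  # /news without a period falls back to weekly
    }
```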

From this, we get a list of top and trending keywords for the chosen time period that might be interesting to the user.

Example of trending keywords that can come up from the system in two different categories | Image by author
Without this data source, building something like this would have been difficult. The data needs to be prepared upfront for the LLM to work with it properly.

After fetching keywords, it might make sense to add an LLM step that filters out keywords irrelevant to the user. I didn't do that here.

The more unnecessary information an LLM is handed, the harder it becomes for it to focus on what really matters. Your job is to make sure that whatever you feed it is relevant to the user's actual question.

Next, we use the endpoint prepared earlier, which contains cached “facts” for each keyword. This gives us already vetted and sorted information for each one.

We run keyword calls in parallel to speed things up, but the first person to request a new keyword still has to wait a bit longer.

Once the results are in, we combine the facts, remove duplicates, and parse the citations so each fact links back to a specific source via a keyword number.
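That merge step can be sketched like this (the fact/source shapes are assumptions; each kept fact gets a citation number that indexes into a shared source list):

```python
from typing import Dict, List, Tuple

def merge_facts(per_keyword: List[List[Dict[str, str]]]) -> Tuple[List[dict], List[str]]:
    """Combine per-keyword fact lists, drop duplicate facts, and number the sources."""
    seen = set()
    merged, sources = [], []
    for facts in per_keyword:
        for item in facts:
            if item["fact"] in seen:
                continue  # same fact surfaced under another keyword
            seen.add(item["fact"])
            sources.append(item["source"])
            merged.append({"fact": item["fact"], "citation": len(sources)})
    return merged, sources
```

Keeping the citation as a stable index means the final report can render footnotes without re-parsing any of the intermediate LLM output.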

We then run the facts through a prompt-chaining process. The first LLM finds 5 to 7 themes and ranks them by relevance based on the user profile. It also pulls out the key points.

Short chain of prompting, breaking the task into smaller ones | Image by author
The second LLM pass uses both the themes and the original facts to generate two different summary lengths, along with a title.

We do this to reduce cognitive load on the model.

This last step to build the report takes the most time, since I chose to use a reasoning model like GPT-5.

You could swap it for something faster, but I find advanced models are better at this final stage.

The full process takes a few minutes, depending on how much has already been cached that day.

Check out the finished result below.

How the tech scouting bot works in Discord | Image by author

If you want to look at the code and build this bot yourself, you can find it here. If you just want to generate a report, you can join this channel.

I have some plans to improve it, but I'm happy to hear feedback if you find it useful.

And if you want a challenge, you can rebuild it into something else, like a content generator.

Notes on building agents

Every agent you build will be different, so this is by no means a blueprint for building with LLMs. But you can see the level of software engineering this demands.

LLMs, at least for now, don't remove the need for good software and data engineers.

For this workflow, I'm mostly using LLMs to translate natural language into JSON and then move that through the system programmatically. It's the easiest way to control the agent process, but also not what people usually imagine when they think of AI applications.

There are situations where using a more free-moving agent is ideal, especially when there's a human in the loop.

But hopefully you learned something, or got inspiration to build something on your own.

If you want to follow my writing, find me here, on my website, Substack, or LinkedIn.


