Architecting and Building Production-Grade GenAI Systems - Part 4: Running Operations & User Experience

Dec 14, 2023

If you haven't read part 3, please click here to get started.

Running Operations

10. Logging and Monitoring

Any production application needs full instrumentation in order to provide excellent operational support. We want to be able to log the operations of our LLM, of our API layer, and of all the supporting Azure infrastructure.

We also want to leverage modern tooling that lets us troubleshoot and debug issues end to end, so we can quickly move from root cause analysis to problem management and resolution.

Using Azure Monitor we can centralize logging and alerting, and enable Application Insights. This gives us all the logs in one place and lets us create metric-based and custom KQL-based alerts. Application Insights takes this a step further: we can follow a request end to end from the function call, inspect network latency and LLM response latency, and debug the function execution at a very granular level.
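At the application layer, the simplest way to make that end-to-end debugging possible is to emit structured logs for every LLM call. The sketch below is illustrative only (the decorator, field names, and log shape are my own assumptions, not an Azure SDK API): it times each call and logs latency and token counts as JSON, which a collection agent can then ship to a centralized workspace for KQL queries and alerting.

```python
import json
import logging
import time
from functools import wraps

logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger("llm-api")

def instrumented(operation):
    """Log latency and token usage for each LLM call as structured JSON,
    ready to be collected into a centralized logging workspace."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            record = {
                "operation": operation,
                "latency_ms": round((time.perf_counter() - start) * 1000, 2),
                "prompt_tokens": result.get("prompt_tokens", 0),
                "completion_tokens": result.get("completion_tokens", 0),
            }
            logger.info(json.dumps(record))
            return result
        return wrapper
    return decorator

@instrumented("chat-completion")
def call_llm(prompt):
    # Stand-in for the real model call; returns token counts
    # the way a completion API response would.
    return {
        "text": "...",
        "prompt_tokens": len(prompt.split()),
        "completion_tokens": 5,
    }
```

Because every record shares the same fields, one custom KQL alert (for example, on p95 `latency_ms`) covers every instrumented operation.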

Once again I’m zooming in on the diagram for brevity, but the idea is that all services log their operations to the same Azure Monitor workspace.


11. Cost Management

Monitor resource usage and optimize costs by scaling resources dynamically with demand and provisioning only what is needed. Implement cost analysis tools to identify areas for savings, and turn on recommendation generators and cost alerting. Overlooking cost management can lead to budget overruns and nasty surprises.

In our scenario we can take advantage of Azure’s built-in cost management capabilities to break down spend by service and monitor consumption. Another benefit of the Azure OpenAI service is that LLM usage metrics such as token consumption are logged automatically, so we can easily get to a finer level of detail.
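Since token counts are already captured, per-request cost estimation is straightforward arithmetic. The sketch below shows the idea; the prices in it are placeholders (real Azure OpenAI pricing varies by model, region, and over time), so treat the table as an assumption to be replaced with current rates.

```python
# Illustrative per-1K-token prices in USD. These are placeholders, NOT
# current Azure OpenAI rates -- look up real pricing for your model/region.
PRICES_PER_1K = {
    "gpt-35-turbo": {"prompt": 0.0015, "completion": 0.002},
}

def estimate_cost(model, prompt_tokens, completion_tokens):
    """Estimate a single request's cost in USD from its token counts."""
    p = PRICES_PER_1K[model]
    return (prompt_tokens / 1000) * p["prompt"] + \
           (completion_tokens / 1000) * p["completion"]
```

Summing these estimates per caller or per feature makes it easy to attribute spend and to trigger cost alerts before the monthly bill surprises anyone.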

A small update to add the cost management component:


User Experience

12. User Experience

In this scenario we are not building a front-end, user-facing application, so user experience is not about the GUI we put in front of people. Since we are building a back-end API service for our LLM capabilities, we instead prioritize a developer-friendly interface and experience. That audience could even include external developers if we are building a B2B or partner-enabled solution.
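One concrete piece of developer experience is returning errors in a single, predictable shape across every endpoint, with a request ID that support can correlate with the logs. The helper below is a hypothetical sketch of such an error envelope (the function name and field layout are my own, not a standard):

```python
import json

def error_response(status, code, message, request_id):
    """Build a consistent, machine-readable error envelope so client
    developers can handle failures uniformly across all endpoints."""
    body = {
        "error": {"code": code, "message": message},
        # Echoing the request ID lets support correlate a developer's
        # bug report with the exact entries in the centralized logs.
        "request_id": request_id,
    }
    return status, json.dumps(body)

status, body = error_response(
    429, "rate_limited", "Token quota exceeded; retry later.", "req-123"
)
```

With a stable envelope like this, external partners can write one error handler instead of special-casing each endpoint.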

13. Documentation and Support

Provide comprehensive documentation for developers on how to interact with the system's APIs, and offer support channels for them to report issues or seek assistance. In keeping with the spirit of the project, you could even use your own LLM to build a chatbot that supports the solution itself!

In the next post, we'll discuss data handling and recovery.
