
Logging & Monitoring

Logging and monitoring are two essential practices in LLMOps. Logging is the practice of capturing and recording relevant data and events associated with the deployment, operation, and usage of language models. Key types of LLM logs include usage logs, performance logs, and error logs.

Monitoring, on the other hand, refers to the continuous, real-time observation and tracking of language model behavior and related system components to assess performance, detect anomalies, and ensure operational efficiency and reliability. Monitoring an LLM involves collecting and analyzing data to gain insights into how the language model is functioning and how it is interacting with users and other software components.

In Orquesta, every interaction with an LLM generates a log for you, which can be found on the dashboard. These logs are available for Prompts, Endpoints, and the Playground.
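Conceptually, "a log per interaction" means wrapping each model call so that its input, output, and timing are captured as a record. The sketch below illustrates the idea in plain Python; `call_llm` is a hypothetical stand-in for a real provider call, not Orquesta's actual SDK:

```python
import time
from datetime import datetime, timezone

def call_llm(prompt: str) -> str:
    # Stand-in for a real provider call (OpenAI, Cohere, etc.);
    # returns a canned reply so the sketch is self-contained.
    return f"Echo: {prompt}"

def logged_llm_call(prompt: str) -> dict:
    """Call the model and capture a log entry for the interaction."""
    start = time.perf_counter()
    response = call_llm(prompt)
    latency_ms = (time.perf_counter() - start) * 1000

    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prompt": prompt,
        "llm_response": response,
        "latency_ms": round(latency_ms, 2),
    }

log = logged_llm_call("Summarize LLMOps in one sentence.")
print(sorted(log.keys()))
```

In a real deployment this record would also carry the provider, model, cost, and token counts described below, and would be shipped to a dashboard rather than printed.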

The log contains the following:

  1. Timestamp
This is the time at which the log is recorded in Orquesta. It contains the date (year, month, and day), the specific time, and the user's timezone.


  2. Provider/Model
The provider is an entity or organization that offers language models as a service. Providers (OpenAI, Cohere, Google, etc.) develop, manage, and make language models available to users and applications through APIs or other access methods. The model is the specific language model that has been created, trained, and deployed for natural language processing tasks.


  3. Prompt
The prompt is the instruction you used to interact with the language model. It is text that serves as a request to the model, guiding it to generate a desired response.


  4. Variant
A prompt variant is a variation of a prompt that is used to generate text from an LLM. Prompt variants are used to improve the quality and diversity of the generated text. They can be created by changing the wording of the prompt, adding or removing information, or using a different format.


  5. Version
    Each unique state of a Prompt is saved as a different version. This allows you to track changes over time and rollback to previous configurations if necessary. Versions are usually numbered sequentially and may also include metadata like who made changes and when.


  6. Response time
    This is the duration it takes for a language model to process a given input and produce a meaningful and relevant output. It is measured in milliseconds.


  7. Latency
    Latency is the time it takes for a large language model (LLM) to process a request and return a response, measured in milliseconds.


  8. Cost
Cost helps teams and organizations understand and manage their LLMOps spend more effectively. The input cost, output cost, and total cost are logged by Orquesta.


  9. Score
A numerical value or metric used to assess the quality, relevance, or appropriateness of the response produced by the language model.


  10. Metadata
Additional information that provides essential details about the language model and its associated resources.


  11. LLM response
    This is the output generated by a language model in response to a specific input.


  12. Economics
This contains the completion tokens, prompt tokens, and total tokens, which are used to generate the real-time dashboards.


  13. Tokens per second
    This is another facet of latency, focusing on the model's processing speed. It measures how many tokens (words, characters, or subword units) the LLM can generate in a given timeframe, typically expressed as tokens per second (TPS).


  14. Request
    The number of times a particular Prompt is requested by the application during a specific time period.


  15. Context
The Context is the environment (dev-test, production, etc.) and setting in which the subject requesting the Prompt or Remote Config is evaluated.
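Several of the fields above are derived from one another: total tokens are the sum of prompt and completion tokens, total cost is the sum of input and output cost, and tokens per second follows from the completion tokens and the latency. A minimal sketch of such a log record, with illustrative field names and made-up figures (not Orquesta's actual schema or pricing):

```python
from dataclasses import dataclass

@dataclass
class LLMLog:
    # Hypothetical record mirroring the fields above; names are illustrative.
    timestamp: str
    provider: str
    model: str
    prompt: str
    prompt_tokens: int
    completion_tokens: int
    input_cost: float
    output_cost: float
    latency_ms: float

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens

    @property
    def total_cost(self) -> float:
        return self.input_cost + self.output_cost

    @property
    def tokens_per_second(self) -> float:
        # Completion tokens divided by the generation time in seconds.
        return self.completion_tokens / (self.latency_ms / 1000)

log = LLMLog(
    timestamp="2023-10-01T12:00:00Z",
    provider="OpenAI", model="gpt-3.5-turbo",
    prompt="Hello",
    prompt_tokens=12, completion_tokens=48,
    input_cost=0.000018, output_cost=0.000096,  # made-up costs
    latency_ms=800.0,
)
print(log.total_tokens, round(log.tokens_per_second))  # 60 60
```

Storing the raw counts and deriving the aggregates keeps the log compact while still powering dashboards like the Economics and Tokens-per-second views.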

Wrap up

In summary, logs provide a detailed record of events and interactions, offering insights into system performance, security, and compliance. Monitoring complements logging by providing real-time assessment of system health and resource usage, allowing proactive issue resolution.

Get started today.

Check out Orquesta documentation.

Follow us on Twitter, LinkedIn and GitHub.
