1.5 AI Model Quality

As shown in Figure 1.4, the quality of an AI model depends primarily on two factors: the data used to train it and the type of AI model used. We will discuss these two factors next, followed by the model development life cycle (the bottom-half circle in Figure 1.2).

Figure 1.4 Influences on the quality of the AI model.

1.5.1 Data

The definition of an AI system given earlier in this chapter states that the system “infers” based on the input it receives. An AI system, whether during the training of the AI model or at the time of inference, will have to handle data from a myriad of sources. Possible input sources include databases of all types, enterprise systems (such as customer relationship management [CRM] and enterprise resource planning [ERP] systems), web scraping, social media, public datasets, and sensor data.

We begin by discussing the data used for model training and inference, and its preparation.

Regardless of its source, the data is likely to have problems. Common problems include missing values, outliers, inconsistent formats, duplicate records, and unstructured content. Problems in the data will affect the quality of both the resulting model and the model's output. Consequently, the data should be cleaned and organized to suit the model chosen. Each of these problems is associated with a collection of techniques to clean the data, some of which are discussed in Chapter 5.
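The cleaning steps just described can be sketched in a few lines of code. This is a minimal illustration, not a technique from the book: the records, field names, and imputation rule (replacing missing or out-of-range ages with the median) are all hypothetical.

```python
# Minimal data-cleaning sketch: drop exact duplicates, impute missing or
# outlier values, and normalize inconsistent date formats.
# All field names and values are hypothetical.
from datetime import datetime

raw_records = [
    {"id": 1, "age": 34,   "signup": "2023-01-15"},
    {"id": 1, "age": 34,   "signup": "2023-01-15"},   # exact duplicate
    {"id": 2, "age": None, "signup": "15/01/2023"},   # missing value, odd format
    {"id": 3, "age": 151,  "signup": "2023-02-01"},   # outlier
]

def clean(records):
    seen, cleaned = set(), []
    # Median of the plausible ages, used as a simple imputation value.
    ages = [r["age"] for r in records if r["age"] is not None and 0 < r["age"] < 120]
    median_age = sorted(ages)[len(ages) // 2]
    for r in records:
        key = tuple(sorted(r.items()))
        if key in seen:                                # drop exact duplicates
            continue
        seen.add(key)
        age = r["age"]
        if age is None or not (0 < age < 120):         # impute missing/outlier ages
            age = median_age
        # Normalize both observed date formats to ISO 8601.
        for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
            try:
                signup = datetime.strptime(r["signup"], fmt).date().isoformat()
                break
            except ValueError:
                continue
        cleaned.append({"id": r["id"], "age": age, "signup": signup})
    return cleaned

print(clean(raw_records))
```

Real pipelines would use a library such as pandas for these steps, but the logic is the same: each data problem maps to a small, mechanical repair.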

In addition to data cleaning, the features of the data might be improved. A feature is a property of a data point that is used as input into an AI model. The process of improving the features is called feature engineering. Techniques used in feature engineering include encoding categorical features into numerical values and scaling the features so that one feature does not dominate the model building process. Again, we discuss this topic further in Chapter 5.
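The two feature-engineering techniques named above can be sketched as follows. This is an illustrative example only; the feature names and values are made up, and the scaling shown is simple min-max scaling.

```python
# Feature-engineering sketch: one-hot encode a categorical feature and
# min-max scale a numeric one so that neither dominates model building.
# Feature names and values are hypothetical.

def one_hot(values):
    """Map each category to a 0/1 indicator vector."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]

def min_max_scale(values):
    """Rescale numeric values into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

genres = ["action", "family", "action", "drama"]
watch_minutes = [300, 45, 120, 90]

# Each row: [is_action, is_drama, is_family, scaled_minutes]
features = [onehot + [scaled]
            for onehot, scaled in zip(one_hot(genres), min_max_scale(watch_minutes))]
print(features)
```

Without the scaling step, the raw minute counts (45–300) would dwarf the 0/1 genre indicators in any distance-based model, which is exactly the dominance problem the text describes.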

Effectively managing the entire life cycle of AI and ML models requires streamlined practices and tools. This is where MLOps comes into play.

We elaborate on these processes in Chapters 3–6, where we discuss selecting an AI model, preparing it, and integrating it with the rest of the system. Next, we consider the various types of AI models.

1.5.2 AI Model Types

A designer chooses to use an AI model when there is no good explicit algorithmic solution for the problem they are trying to solve with the system. The functionality of an AI model relies on two critical components: the development of a knowledge base and the operation of a computational engine. The computational engine, often referred to as the inference engine or the deployed model, is responsible for processing inputs and generating outputs based on the knowledge base.

The knowledge base is formed by either human inputs or AI learning algorithms, which transform data into a structured form. The knowledge base in this structured form can, for example, be an ML model, a rule set, or an expert knowledge base. Once the knowledge base is established and structured, the inference engine utilizes it to make predictions, generate content, or achieve other types of results. The AI portion of a system packages the inference engine together with some representation of the knowledge base. The important point of this discussion is that an AI portion is executable: It can be invoked at runtime as a function or a service.

The AI model is based on one of two categories of AI techniques: symbolic or non-symbolic. Symbolic AI is based on symbols, such as logical statements or rules with variables; non-symbolic or sub-symbolic AI is based on ML. A subcategory of ML-based systems is formed by systems based on foundation models (FMs). These categories differ in how knowledge is encoded in the knowledge base and how the inference engine operates. Chapter 3, AI Background, goes into much more detail about these categories. Chapter 4, Foundation Models, goes into detail about FMs. We provide a short summary here to introduce the distinctions among these categories.

Symbolic AI

Symbolic systems, also called Good Old-Fashioned AI (GOFAI), symbolic AI, or expert systems, are sets of rules that are evaluated to produce the value of a response or the value of an internal variable.

Suppose you wish to create an AI model that will predict which movie you will choose in any particular context. In a symbolic AI model, the developers might encode rules such as “If the user watched more than two action movies in the past week, recommend another action movie” or “If it is a sunny Saturday afternoon, suggest a warm family movie.” With enough rules, a symbolic AI model can reason over the knowledge base to make a movie suggestion.

A rule is an if-then statement: If the specified condition is true, then perform some activity. A set of such rules provides a filter through which a result might be generated. These rules may be used for many different purposes:

  • Diagnosis—medical, student behavior, or help desk–related. Example: If the user has version X of an application, some functionality will be unavailable.

  • Real-time process control. Example: If a storm is predicted to start in 10 minutes, close the outside blinds on the building.

  • Risk assessment. Example: If the annual income is less than three times the annual mortgage payments, do not grant the home loan.

  • Executing business rules. Example: If the insurance claim exceeds $2000, perform additional manual checks.

The set of rules constitutes the knowledge base for a symbolic system. The rules are preprocessed by the computational mechanism, a rules engine, to speed up the inference portion of a symbolic system. The rules engine sorts through the preprocessed input to find matches with the “if” portion of a rule. Modern systems can perform more sophisticated processing than just filtering. The set of matches constitutes a list of possible responses. If the list of matches has multiple items, then additional input is required to determine a system response. This input can come from a human or from a set of conflict resolution rules. Large rule bases may have contradictory information, and a rules engine can identify the conflicts for further analysis. Rule-based systems are customized by modifying the list of rules.
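The matching behavior described above can be sketched in a few lines. This is a toy illustration, not a real rules engine: the rule contents borrow from the examples in the text, the facts are hypothetical, and no preprocessing or conflict resolution is implemented.

```python
# Minimal rule-based inference sketch: each rule pairs an "if" condition
# with a "then" response, and the engine collects every match.
# Rule contents and facts are illustrative only.

rules = [
    (lambda facts: facts["action_movies_this_week"] > 2,
     "recommend another action movie"),
    (lambda facts: facts["weather"] == "sunny" and facts["day"] == "Saturday",
     "suggest a warm family movie"),
    (lambda facts: facts["claim_amount"] > 2000,
     "perform additional manual checks"),
]

def infer(facts):
    """Return the responses of every rule whose condition matches the facts."""
    return [response for condition, response in rules if condition(facts)]

facts = {"action_movies_this_week": 3, "weather": "sunny",
         "day": "Saturday", "claim_amount": 500}
print(infer(facts))
```

Here two rules fire, so, as the text notes, either additional input or conflict resolution rules would be needed to narrow the matches down to a single system response.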

Other types of symbolic AI models include planning and ontology reasoning:

  • For planning, actions are described in terms of preconditions and effects. For example, consider a cloud control API, where one action would be to start a virtual machine (VM). The precondition is that the launch configuration (the image or other source from which to launch the VM) and the firewall settings have been defined; the effect is that a new VM is started with the specified firewall settings and from the respective launch configuration. AI planning takes such a description of possible actions as input, as well as a starting state and a desired goal state. The algorithm then creates a plan of actions to get from the start to the goal state.

  • In ontology reasoning, concepts (e.g., medications, headache pills, vegetables, apples, tools, screwdrivers), their relations, and rules (e.g., every food and every medication has an expiration date) are defined. Based on this information, logical reasoning can be performed: for example, inferring that headache pills have an expiration date.
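The planning bullet above can be sketched with the cloud example from the text. This is a toy planner, assuming the simplest possible representation: a state is a set of facts, each action has precondition facts and effect facts, and a breadth-first search finds an action sequence from the start state to the goal.

```python
# Toy AI-planning sketch using the cloud-control example: each action has
# preconditions (facts that must hold) and effects (facts it adds).
# Breadth-first search finds a plan from the start state to the goal.
from collections import deque

actions = {
    "define_launch_config": {"pre": set(),                         "add": {"launch_config"}},
    "define_firewall":      {"pre": set(),                         "add": {"firewall"}},
    "start_vm":             {"pre": {"launch_config", "firewall"}, "add": {"vm_running"}},
}

def plan(start, goal):
    """Search over states (sets of facts); return a list of action names."""
    queue, visited = deque([(frozenset(start), [])]), set()
    while queue:
        state, steps = queue.popleft()
        if goal <= state:                      # every goal fact holds
            return steps
        if state in visited:
            continue
        visited.add(state)
        for name, a in actions.items():
            if a["pre"] <= state:              # action is applicable
                queue.append((frozenset(state | a["add"]), steps + [name]))
    return None                                # no plan exists

print(plan(start=set(), goal={"vm_running"}))
```

The planner correctly discovers that both define actions must precede `start_vm`, even though that ordering was never stated explicitly; it falls out of the precondition/effect descriptions.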

Machine Learning

Machine learning (ML) uses statistical techniques to generate results. The training set for ML is the input to the training process, which in turn generates the knowledge base. Each data point in the training set is described by a collection of variables.

A model is trained by identifying the features that characterize the data and using those features, along with associated ML and statistical techniques, to determine the model’s parameters. A subclass of ML, called deep learning, identifies these features automatically. “Narrow” ML models are trained for a specific set of goals and capabilities, which distinguishes them from the more general-purpose FMs. Some of the main types of ML models are summarized here:

  • Classification: Assigning a category to an input (e.g., this picture contains a dog).

  • Regression: Inferring a continuous value instead of a discrete category (e.g., predicting that a particular insurance claims process will take three more days to complete).

  • Clustering: Grouping similar data points together without prior knowledge of the groups (e.g., the behavior of customers in this group seems similar).

ML models can be further customized by modifying the training set or the training hyperparameters (e.g., how many clusters are we looking for?) and regenerating the knowledge base.
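Clustering is the easiest of the three types to show end to end. Below is a tiny, pure-Python k-means sketch (not from the book) in which the hyperparameter k, how many clusters we are looking for, is chosen up front, exactly as the parenthetical above suggests; the data values are made up.

```python
# Clustering sketch: naive 1-D k-means (Lloyd's algorithm). The
# hyperparameter k is fixed up front; the data is hypothetical.

def kmeans_1d(points, k, iters=20):
    """Return k centroids fit to the points."""
    # Simple initialization: spread starting centroids across sorted data.
    centroids = sorted(points)[:: max(1, len(points) // k)][:k]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                       # assignment step
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        centroids = [sum(c) / len(c) if c else centroids[i]   # update step
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

# Two obvious groups of behavior scores; k-means should find both centers.
points = [1.0, 1.2, 0.8, 9.9, 10.1, 10.0]
print(kmeans_1d(points, k=2))
```

Changing k and rerunning the fit is precisely the kind of hyperparameter adjustment and knowledge-base regeneration the paragraph above describes.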

Foundation Model

A foundation model (FM) is a type of ML model that leverages neural networks as the core of its architecture. It differs from traditional ML models in two key aspects:

  • It is trained on an extensive and diverse dataset, often comprising billions or even trillions of data points.

  • The training data is largely unlabeled, unlike in traditional ML, where data is typically structured, labeled, and often numerical or categorical.

The term foundation reflects the model’s general-purpose nature, as it is not trained for a specific task. Instead, it serves as a base model that can be adapted to various specialized applications by incorporating additional data and fine-tuning. This customization allows for application-specific performance.

Large language models (LLMs) are a type of FM. LLMs are trained on huge amounts of text. These models are often generative, meaning they are capable of producing sequences of text. The transformer architecture is the most commonly used ML model architecture for building LLMs. OpenAI’s GPT-3 and GPT-4 models are the most well-known examples of LLMs, but open-source alternatives are also available, including Mistral and Llama. To find a wide variety of open-source LLMs and other pretrained AI models, you can visit the model hub Hugging Face.

FMs are customized or complemented through, for example, fine-tuning, prompting, retrieval-augmented generation (RAG), and “guardrails” that may preprocess input to and postprocess output from the FM and other components. A guardrail serves as a safeguard to ensure the safe and responsible use of AI technologies and prevent some attacks. It may include strategies, mechanisms, and policies designed to prevent misuse, protect user privacy, and promote transparency and fairness. RAG is also a popular method of complementing FMs, whereby specific data that is related to a request to an AI is retrieved and used to augment the input to the model. The RAG data is often specific and private for a given context (e.g., internal knowledge of a particular organization).
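The RAG pattern just described can be sketched without any model at all: retrieve the documents most relevant to a request and prepend them to the prompt. In this toy version the documents are hypothetical, relevance is naive word overlap rather than the embedding-based retrieval used in practice, and the final call to an FM is left out.

```python
# RAG sketch: retrieve context-specific documents relevant to a request
# (here by naive word overlap) and use them to augment the model input.
# The documents and the query are hypothetical.

documents = [
    "Expense claims above $2000 require a second manual review.",
    "The office cafeteria opens at 8 a.m. on weekdays.",
    "Travel bookings must be made through the internal portal.",
]

def retrieve(query, docs, k=1):
    """Rank docs by words shared with the query; return the top k."""
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query):
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("How are expense claims above $2000 handled?")
print(prompt)   # this augmented prompt would then be sent to the FM
```

The retrieval step is what injects the organization-internal knowledge the paragraph mentions; the FM itself is never retrained, only given better input.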

1.5.3 Model Development Life Cycle

Once the model is developed, the next step is model build—that is, building an executable artifact that includes the model or access to it. If the model is included, this step involves transforming the model into a deployable format that can be executed within the system. The model build stage ensures that the AI model is ready for integration.

After the model is built, it needs to be thoroughly tested to assess its accuracy and identify any potential risks or biases. The model test stage involves evaluating the model’s performance against predefined metrics and criteria. It is important to ensure that the model operates reliably and produces accurate results.

Once the model has been tested and approved, it can be released for integration into the system. The model release stage involves finalizing the model for deployment and approving it for integration with other components of the system.

1.5.4 Resource Allocation for AI Parts

As mentioned earlier in the discussion of resources, the allocation of resources for the AI portion of the system depends on the AI technique chosen:

  • ML: The training phase of an ML model is performed on either a local resource or a cloud resource—or in special cases, via edge/on-device learning. The resulting executable can be allocated to an edge resource, if small enough. Otherwise, it is allocated to either local or remote resources.

  • FM: An FM is typically hosted on cloud resources. Access can be through API calls or service message calls. Some FMs are trained for specific domains. If small enough, they can run on edge devices directly, such as phones or smart speakers, for real-time applications. Techniques for compressing or distilling FMs and reducing their resource requirements are a matter of ongoing research and will evolve over time. We discuss these techniques in Chapter 4.
