1.7 Applying Generative AI to an example problem domain

Here is the problem we will investigate:

  • The problem: The business problem is to keep a list of employees for a company that has several departments.

We’re going to keep it very simple, and we are not going to ask for persistence via some sort of database or any sort of user interface. We just want to see how the AI handles the key business concepts. We will generate Python because Python code is easy to read compared to many other languages. In the rest of this book, we will usually work at a level of abstraction that is higher than code but is precise enough to generate code. This would be an unnecessary distraction right now, so we will go straight to code. We will not define a level of abstraction, because we want to empirically identify the highest level of abstraction appropriate for code generation.

1.7.1 Generating a class

Let’s start right at the top of the abstraction tree by seeing whether Copilot knows enough about the business concept of an employee to generate some code. Copilot has the settings Creative, Balanced, and Precise. We set Copilot to Precise. The conversation is shown in Figure 1-3.

Figure 1-3 Employee 1.1 conversation

This Employee class is quite plausible, but is it consistently generated? If we clear the conversation and run the query again, we get a different answer, as shown in Figure 1-4.

Figure 1-4 Employee 2.1 conversation. We get a different answer.

If we clear the conversation and try it again (Employee 3.1, not shown), we get back to Employee 1.1. There seem to be at least two possible answers from Copilot, and it appears to be random which answer we get.

Despite this, Copilot has achieved quite a lot based on virtually zero information. It has

  1. Recognized that “employee” is a business concept that in object-oriented languages needs to be realized as a class, and it has reified this to an Employee class

  2. Come up with a plausible set of data that might be associated with an employee (name, age, and salary) and reified that data as attributes of the Employee class

  3. Asked us if there is anything else we would like to add to the Employee class or anything specific that we would like the code to do

  4. Suggested several follow-on prompts that sound wildly optimistic

The methods that Copilot has chosen to generate are strange. In Employee 1.1 we got the Python __str__(self) method. This is a Python “magic” or “dunder” (double underscore) method, a standard method that should return a string representation of an instance of the class. We also got a business method, give_raise(self, amount), that adds an amount to the salary attribute, which is wrong (as we will see later). In Employee 2.1, we got methods to get the values of the attributes (get_name(self), etc.), but we did not get the corresponding methods to set them.
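Copilot’s exact output varies from run to run, but a minimal sketch of the Employee 1.1 style of class (the names and method bodies here are our reconstruction, not verbatim Copilot output) looks something like this:

```python
class Employee:
    """Sketch of the Employee 1.1 style of generated class (our reconstruction)."""

    def __init__(self, name, age, salary):
        self.name = name
        self.age = age
        self.salary = salary

    def __str__(self):
        # The "dunder" method: returns a string representation of the instance.
        return f"Employee(name={self.name}, age={self.age}, salary={self.salary})"

    def give_raise(self, amount):
        # Plausible-looking, but as discussed, raises would not be handled
        # this way in a real payroll domain.
        self.salary += amount
```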

In both cases, the Suggested prompts are deeply concerning. In Employee 1.1, we got the following prompts.

  • Add a method to calculate the bonus.

  • Add a method to calculate the tax.

  • Add a method to calculate the net pay.

And in the second case, we got these prompts.

  • Add a method to give a raise.

  • Add a method to calculate years until retirement.

  • Add a method to display employee information.

It all sounds too good to be true, and indeed it is far, far too good to be true. Let’s return to conversation Employee 1.1 (Figure 1-3) and ask Copilot for a method to calculate tax. This is shown in conversation Employee 1.2 (Figure 1-5).

Figure 1-5 Employee 1.2 conversation. Copilot hallucinates a tax algorithm.

The calculate_tax(self) method is a dangerous fiction. In NLP terms, it is a generalization. We all know that tax rates vary and that tax can be very hard to calculate, yet here it is generalized to a hard-coded rate of 0.3. Moreover, taxes would never be calculated in an Employee class; they would be calculated in a separate accounting system that was properly audited.
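To make the problem concrete, the generated method is roughly of the following shape (our reconstruction; the hard-coded rate is the point, not the exact code):

```python
class Employee:
    def __init__(self, name, age, salary):
        self.name = name
        self.age = age
        self.salary = salary

    def calculate_tax(self):
        # A hard-coded, invented tax rate: a generalization,
        # not a real business rule from any tax jurisdiction.
        tax_rate = 0.3
        return self.salary * tax_rate
```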

Now we see that we can “Add a method to calculate pension.” Okay—we can’t resist it, let’s go down the rabbit hole (Figure 1-6).

Figure 1-6 Employee 1.3 conversation. Down the rabbit hole.

Once again, the method is a dangerous fiction for much the same reason as calculate_tax(self) is. We see that pension calculation is ridiculously naive with the pension_rate hard-coded at an arbitrary value (a generalization). Also, pensions are usually handled in their own system.

The first line of both Employee 1.1 and Employee 2.1 is interesting because in both cases it says “Sure! Here’s a simple Python class that represents an employee:”. The use of the word simple implies that we can ask for a more detailed answer. We will clear the conversation and try a new prompt, as shown in Figure 1-7.

Figure 1-7 A more detailed answer

We find that the Employee has now gained a position and a bonus as well as methods that promise to calculate tax, pension, bonus, net pay, and vacation days. We think this is a good example of what is commonly referred to as AI hallucination (a distortion). Because Copilot is operating in an information vacuum, yet we keep asking more of it, it has turned inward and is essentially guessing what an Employee class might look like based on some internal representation in the neural net that we can’t examine. This is both useful, as a source of ideas, and dangerous, as a source of truth.

Each of these new business methods has been implemented using a process of generalization. Copilot has given us what is the most probable implementation of these methods based on its huge training set. We are given general solutions to specific problems that probably satisfy no one.

We know that the term hallucination is not really accurate when applied to an AI, but it is surprisingly similar in effect to human hallucination, which is a kind of trance in which attention becomes internally directed and information is internally generated with little or no reference to external reality. We also must remember that Generative AIs such as Copilot are prediction engines based on Large Language Models. Given an input, the engine will predict the most probable response based on its enormous training data set. This response may or may not be reasonable, sensible, useful, or factually correct.

This leads us to state our first and most important principle of Generative Analysis for Generative AI:

  • If you don’t say exactly what you want, you will get what you are given.

Recognizing these problems, it is tempting to get into a dialogue with the AI to get it to fix the code to our satisfaction. This is certainly possible, but we think it is generally a bad idea because it is too easy to elicit hallucinatory results. Also, it is quite a slow and laborious technique, and it is much easier to just be very specific about what we want. This is the approach we always take in Generative Analysis. However, as Generative AIs advance, and certainly if we ever achieve AGI, then an approach based on dialogue might become feasible and even preferable.

How can we be more specific? We need to specify the attributes that we want the employee to have. We should also specify methods, but we will put this aside for now. Consider conversation Employee 4.1 (Figure 1-8).

Figure 1-8 Being more specific gets us what we want.

By being very specific, we got exactly what we wanted. Two out of three of the Suggested prompts are also quite reasonable.

  • Add a method to calculate the annual salary: This is plausible, but dangerous because it assumes that the “salary” is not already a yearly salary. Is it then weekly, monthly, quarterly, or something else? We don’t know until we ask the AI to generate the code.

  • Add a method to print the employee’s full name: This is just a simple Python print method that should work okay.

  • Add a method to change the employee’s job title: This is just a simple setter method for the job title. It should be okay.

The Generative Analysis approach to Suggested prompts is to take them as useful suggestions for things we might want to add into our model, but to avoid using them directly. However, it is always fine to try them out to see what they do and then incorporate that into your model or analysis activity in some other way if you like it. Our approach is pragmatic because Suggested prompts that promise to deliver interesting business semantics generally deliver fictions, whereas prompts that are merely about Python plumbing aren’t that useful and just suck you into a conversational rabbit hole.

1.7.2 Generating a model

Now that we can generate a class, we need to generate the whole model. Suppose we have the following fragment of a detailed analysis document:

  • “A company has many departments, and each department employs one or more employees. A company has a name, address, email address and vat number. A department has a name and a unique identifier. An employee has a first name, a last name, a unique identifier, an address, an email address, a salary and a job title.”

This is much more specific, so we should be able to generate some decent code from it, as shown in Figure 1-9.

Figure 1-9 Generating Python code from a precise narrative

That is so much better! By being specific, we have bypassed Copilot’s tendency to hallucinate, and it has given us pretty much what we want. We have a Company that has zero or more Departments, and we have a Company method to add a new Department. Each Department has zero or more Employees, and Department has a method to add Employees. The attributes for each class are just what we asked for. Furthermore, the Suggested prompts are now entirely reasonable because they relate directly to the structure of the model rather than to hallucinatory business semantics.
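Structurally, the generated code amounts to something like the following sketch (attribute and method names reconstructed from the narrative; Copilot’s actual output may differ in detail):

```python
class Company:
    def __init__(self, name, address, email_address, vat_number):
        self.name = name
        self.address = address
        self.email_address = email_address
        self.vat_number = vat_number
        self.departments = []  # zero or more Departments

    def add_department(self, department):
        self.departments.append(department)


class Department:
    def __init__(self, name, unique_identifier):
        self.name = name
        self.unique_identifier = unique_identifier
        self.employees = []  # zero or more Employees

    def add_employee(self, employee):
        self.employees.append(employee)


class Employee:
    def __init__(self, first_name, last_name, unique_identifier,
                 address, email_address, salary, job_title):
        self.first_name = first_name
        self.last_name = last_name
        self.unique_identifier = unique_identifier
        self.address = address
        self.email_address = email_address
        self.salary = salary
        self.job_title = job_title
```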

However, the Generative AI has omitted a business rule. Read that input prompt again, and then look at the code. Can you spot the missing rule?

We stated that “a department employs one or more employees.” However, this business rule is not enforced in the generated Python code. It has been ignored entirely and has not even been captured as a comment. This is a clear case of deletion.
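For contrast, here is one way the 1..* lower bound could have been enforced in Python. This is our own sketch, not generated output:

```python
class Department:
    """A department employs one or more employees (1..*), enforced here."""

    def __init__(self, name, unique_identifier, first_employee):
        self.name = name
        self.unique_identifier = unique_identifier
        # Requiring an employee at construction time makes the 1..* lower
        # bound impossible to violate on creation.
        self.employees = [first_employee]

    def add_employee(self, employee):
        self.employees.append(employee)

    def remove_employee(self, employee):
        # The lower bound also constrains removal: never drop below one.
        if len(self.employees) <= 1:
            raise ValueError("A department must employ at least one employee")
        self.employees.remove(employee)
```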

Let’s now add some requirements related to finding employees expressed as “shall” statements. We will look at how to formulate these in a later chapter.

  1. A company shall be able to return a list of its employees.

  2. A company shall be able to find an employee by name.

  3. A company shall be able to find an employee by unique identifier.

We can just append these requirements to our existing prompt (Figure 1-10).

Figure 1-10 Adding some requirements

Given how little effort it required to input the necessary information, this is not a bad result. It captures the gist of the problem in Python. We have Company, Department, and Employee classes with exactly the attributes we specified. A Company has zero or more Departments, and each Department has zero or more Employees, so the correct relationships are in place. We have also generated two business methods on Company to find an Employee by full name or identifier.
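The finder methods presumably iterate over the departments. A sketch consistent with the three “shall” requirements (our reconstruction, not Copilot’s exact code; the supporting classes are pared down to what the finders need):

```python
class Employee:
    def __init__(self, first_name, last_name, unique_identifier):
        self.first_name = first_name
        self.last_name = last_name
        self.unique_identifier = unique_identifier


class Department:
    def __init__(self, name):
        self.name = name
        self.employees = []


class Company:
    def __init__(self, name):
        self.name = name
        self.departments = []

    def get_employees(self):
        # Requirement 1: return a list of all employees across all departments.
        return [e for d in self.departments for e in d.employees]

    def find_employee_by_name(self, first_name, last_name):
        # Requirement 2: find an employee by full name (first match, or None).
        for e in self.get_employees():
            if e.first_name == first_name and e.last_name == last_name:
                return e
        return None

    def find_employee_by_id(self, unique_identifier):
        # Requirement 3: find an employee by unique identifier (or None).
        for e in self.get_employees():
            if e.unique_identifier == unique_identifier:
                return e
        return None
```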

Before we close this example, the Suggested prompts at the end of Company 2.1 are very intriguing, so let’s see what happens by continuing the conversation (Figure 1-11).

Figure 1-11 Generating example code

The generated code shows how to use the Company, Department, and Employee classes in a short program. The Suggested prompts offer to generate even more example code. Note that this is part of the same conversation. If we were to start a new conversation, Copilot would forget all about our Company example.

1.7.3 Generating UML

In the example above, we specifically looked at generating Python code, and we expect that code generation will be a primary use case in most software engineering projects. However, as much as we love Python, we would very much like to work at a higher level of abstraction. Although the level is higher, it will still be precise enough to generate code when needed. We really want to work at the level of UML models.

Here is our first attempt. We just ask Copilot for a UML model and see what we get (Figure 1-12).

Figure 1-12 An ASCII graphics UML class diagram generated by Copilot.

Just asking for a UML model sort of works, but the result is not useful. First, the diagram is in ASCII graphics! This is quite fun, but it is not fit for purpose. Even worse, the diagram is wrong. According to the diagram, a “Department has many (*) Employees, and each Employee works for one (1) Department.” However, the specification clearly states that a department has one or more (1..*) employees, which we can break down into the following atomic business rules.

  • Business rule: Each department shall have at least one employee.

  • Business rule: Each department may have more than one employee.

If we use the “Can you explain the diagram?” prompt, Copilot doubles down on this error (the error appears in dark gray shading in Figure 1-13).

Figure 1-13 Insisting on the error

We will show how to fix this multiplicity error shortly.

1.7.3.1 What about XMI?

UML has a standard XML textual representation called XMI (XML Metadata Interchange) format, and we can generate XMI by simply replacing “Generate UML” in Company 3.1 with “Generate XMI.” Can this solve our problem? No. The result is an abject failure, and we will not bother to show the details here. The generated XMI has syntax errors and will not load into any of the UML tools we have access to. XMI is, in principle, human readable, but in practice this is only with great difficulty, and different vendors have slightly different flavors of XMI, so fixing the syntax errors just isn’t worth it. None of this is surprising. While we can expect there to be a lot of Python code in the Generative AI training set, few developers use XMI, so there must be hardly anything there to work with.

In our opinion, XMI is one of the more problematical aspects of UML. It was designed as an import/export format for UML models so that there could be interoperability between UML modeling tools. As anyone who has ever tried to use it will tell you, this is a great idea in principle, but in practice it just doesn’t work. Each vendor seems to have their own flavor of XMI that is subtly (or sometimes not so subtly) incompatible with everyone else’s. And no matter whom you ask, the incompatibility is always the fault of the other party. The truth of the matter lies buried somewhere in the pages of the XMI standard, but good luck finding it.

Part of the problem is that XMI is a very complex and heavyweight import/export format. Even a simple UML class diagram generates pages of XMI because the whole underlying UML metamodel is exported. We think that UML urgently needs a lightweight import/export format that is human and Generative AI readable, and that XMI should be abandoned as unfit for purpose as we move forward into an AI-assisted future.

1.7.3.2 PlantUML

The solution to our UML generation problems is PlantUML.

PlantUML generates UML diagrams (not models!) from a simple textual representation. We explain the difference between the diagrams and models in considerable detail in UML 2 and the Unified Process [Arlow 1]. This immediately makes it much simpler than XMI. Also, PlantUML is used in the Microsoft GitHub code repository, so there is a decent amount of PlantUML code available.

Overall, Copilot generates PlantUML code very well, but it requires a small amount of prompt engineering, as we will explain. Let’s go back to our Company example and update it to generate PlantUML (Figure 1-14).

Figure 1-14 Generating a class diagram in PlantUML

You can see that the PlantUML code is quite readable, and there is excellent documentation on the PlantUML website should you want to create it or edit it yourself. To view the diagram, we need a PlantUML viewer. There are many options available, but we like the web-based viewer PlantText.

The generated class diagram is shown in Figure 1-15.

Figure 1-15 UML class diagram for our Company model

This is just what we want, but we had to do a bit of prompt engineering to get it.

If you look at the prompt in Company PlantUML 4.1 (Figure 1-14), we have broken the prompt down into propositions (PN) and requirements (RN), as follows.

  • P1: 1 company has 0..* departments.

  • P2: 1 department employs 1..* employees.

  • P3: A company has a name, address, email address and vat number.

  • P4: A department has a name and a unique identifier.

  • P5: An employee has a first name, last name, unique identifier, address, email address, salary, and a job title.

  • R1: A company shall be able to return a list of its employees.

  • R2: A company shall be able to find an employee by name.

  • R3: A company shall be able to find an employee by unique identifier.

This is a form that we have found always works very well with Generative AI. Propositions P1 and P2 are about relationships between things. P3, P4, and P5 are ontological statements about what things exist, and R1, R2, and R3 are requirements for the behavior of those things. You can generally put these things in any order. We will discuss propositions and requirements in much greater detail in Chapter 3.

The relationship propositions, P1 and P2, are stated in a very particular way. We have found that the only way to get PlantUML to get the multiplicities right on the relationships is to embed them in the prompt in UML syntax as shown. Although the wording “1 company has 0..* departments” is a bit clumsy, it is still clear enough that anyone can understand what it means, and it generates the correct PlantUML code. Unfortunately, statements such as “A company has zero or more departments” typically give the wrong multiplicities. However, if you want slightly better readability, then we find statements such as “One (1) company has many (0..*) departments” will also work.
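For reference, a PlantUML fragment that expresses the two relationship propositions with the correct multiplicities (our sketch of the kind of output we are after, not Copilot’s verbatim code) reads:

```plantuml
@startuml
class Company
class Department
class Employee

' P1: 1 company has 0..* departments
Company "1" o-- "0..*" Department

' P2: 1 department employs 1..* employees
Department "1" o-- "1..*" Employee
@enduml
```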

If we put the engineered prompt Company PlantUML 4.1 back into Copilot and ask for Python instead of PlantUML, the generated code still does not enforce the 1 to 1..* business rule between Department and Employee. This rule is not even noted as a comment in the code. Sometimes these multiplicities represent very important business rules (as we will see in Chapter 7), and it is disturbing that they can be lost so easily.

Once we have some satisfactory Generative AI output, such as PlantUML or Python code, we can feed it back into the AI to generate a narrative, as shown in Figure 1-16.

Figure 1-16 Generating a narrative

Notice that the 1 to 1..* business rule between Department and Employee has been stated correctly in the narrative.

When we ask Copilot to extract a list of propositions from the PlantUML code, it doesn’t know what we mean. However, we can get it to extract a list of requirements from the PlantUML, as shown in Figure 1-17.

Figure 1-17 Extracting requirements

This list is useful, but notice that once again the 1 to 1..* business rule between Department and Employee has not been captured, even though it was explicit in the PlantUML code and also appeared in the generated narrative. This is a serious issue that we need to monitor. We also notice that the terminology is not exact—we have both “employees” and “Employees” in the above requirements.

We have seen in this section that we can generate accurate class diagrams from a precise narrative using a little bit of prompt engineering. We have also seen that we can generate narratives and requirements. However, Copilot is prone to deletions, and a key business rule, the 1 to 1..* business rule between Department and Employee, seems to come and go. The lesson from this is that we need to check the outputs of Generative AI very carefully indeed. This leads us to another Generative Analysis principle:

  • Generative Analysis Principle

  • Never trust Generative AI. Check everything!

In fact, this is a specialization of a more general Generative Analysis principle that we call our first X Files principle:

  • X Files Principle

  • Trust no one.

Generative Analysis takes it as axiomatic that all information is to be distrusted until it has been analyzed. We discuss this in much more detail later. The good news is that our Second X Files principle is as follows:

  • Second X Files Principle

  • The truth is out there.

We also take it as axiomatic that through analysis and research, we can always get to the truth—at least in the restricted world of software engineering.

1.7.3.3 UML models and Generative AI

We have seen above that we can take a precise enough narrative and use it as a prompt to generate Python code, UML class diagrams, and UML requirements. Later in the book we will demonstrate that such a narrative can be used to create many kinds of UML artifacts, different types of code, databases, documentation, and even simulations. Thus, in a Generative AI–assisted analysis approach, the UML model loses a lot of its attraction. As it stands now, we can only get AI-generated artifacts into a UML model manually via transcription. Similarly, once the artifacts are in the UML model, we can only get information out to create prompts to use with Generative AI manually via reverse transcription. This is entirely unsatisfactory, and we hope that UML tool vendors will address this issue sooner rather than later.

Because sufficiently precise narratives can be used as prompts to generate code and UML artifacts, the narrative begins to take center stage as the main “source of truth” in the software development project. The implications for the UML model are that it will be incomplete and possibly inconsistent because some of the UML artifacts will only exist as generated diagrams outside of the model itself. We now have a complicated picture where the “source of truth” in the project is a combination of the UML model, the diagrams, and the narratives. Presently, these things are not well integrated, but we are sure that over time they will be. Figure 1-18 is a mind map that shows some of the pros and cons of UML models versus precise narratives as sources of truth.

Figure 1-18 Sources of truth

Fortunately, Generative Analysis already handles this complex situation well because it is predicated on Literate Modeling (which we discuss in detail in Chapter 7 and [Arlow 3]).

A Literate Model comprises a narrative written about, and directly referencing, a UML model. The aim is to make the information encoded in the UML model available to the largest possible number of stakeholders, even those who do not know UML, via a precise human-readable narrative. Let’s ask Copilot to give us a quick summary of Literate Modeling (Figure 1-19).

Figure 1-19 Copilot explains Literate Modeling.

It has always been the case that a precise Literate Model narrative is virtually interchangeable with the UML model itself. It is only a small step to refine these narratives to a sufficient degree so that Generative AI can generate UML and other artifacts directly from them. We will demonstrate this many times throughout the rest of the book. This new, AI-assisted object modeling process now works as follows.

  • Express the object model as precise narratives that can either be used directly as Generative AI prompts or be turned into Generative AI prompts with small modifications. We will find that the Literate Modeling style of narrative, which always reduces to a string of propositions and requirements, is ideal for this.

  • Generate UML diagrams, code, and other artifacts directly from the narratives as needed.

  • Use a UML modeling tool to capture, by transcription, the most important UML artifacts if that is deemed necessary.

  • Use a UML modeling tool to create UML artifacts that can’t be generated.

In this new world, rather than seeking some level of completeness in the UML model, we take the pragmatic approach that this is no longer necessary. In fact, this has already been the case in many software engineering projects for quite some time. Creation of a UML model has, in many cases erroneously, been seen as an unnecessary overhead. Now, with Generative AI, provided we have precise narratives supported by UML models where necessary, we have an adequate source of truth and we can generate many other artifacts as needed.

In this new world of Generative AI–assisted analysis, UML becomes more a matter of visualization than modeling because the “model” is now distributed between UML and precise narratives. The new “source of truth” is therefore the Literate Model because it naturally combines these two things in a precise manner.

1.7.4 What have we learned from the example?

To get good code and UML generation from Generative AI, we have learned that we need to be very precise.

  1. First, we need to specify an ontology, the things that exist.

    1. We need to specify, for each thing, the attributes we want it to have.

    2. We need to specify, for each thing, the business methods we want it to have.

  2. Then, we need to specify the relationships between the things.

We also need to be very critical.

  • We need to examine the generated output very carefully because it is likely to contain errors.

  • Suggested features are likely to be wrong or inappropriate. However, they can provide useful input into the modeling process.

  • Generative AI is not good at enforcing business rules expressed as multiplicities.

The simple Company example illustrates the point we have made several times since the start of this book: The level of abstraction for our model (in this case, some text) is significantly lower than the average business analysis document because we need to be very precise and detailed about the ontology and relationships, right down to the attribute and method levels. Similarly, the level of abstraction is very much higher than for a Python program because Copilot quite successfully fills in many of the Python coding details, leaving us to concentrate on the big picture. Remember:

  • If you don’t say exactly what you want, you will get what you are given.

It is gratifying to us that the prompts that gave decent code generation results look like fragments of the Literate Models we introduced in Enterprise Patterns and MDA that were created using our initial ideas on Generative Analysis. It appears that we managed to nail the level of abstraction pretty well. This isn’t surprising, because the level of abstraction was designed to be precise enough for code generation.
