Refactor ORM 2: The Query Object and Dynamic Query Language

This article explores how to use query object instances to automatically generate dynamic query conditions. By abstracting query conditions into four types of fields—predicate suffix fields, logical suffix fields, subquery fields, and entity relationship fields—it achieves automatic construction of different kind of query conditions.

1. Introduction

After refactoring the code that uses if statements to build dynamic queries, we implemented a method to directly construct dynamic queries using instances of query objects. This approach involves the following steps:

Iterating over all fields of the query object.
Using reflection to obtain field values, deriving corresponding query conditions based on annotations on the fields, and adding these values to a parameter list.
Combining multiple query conditions into query clauses using AND.

With this approach, adding a new query condition only requires adding a field to the query object and annotating it with the corresponding SQL condition. Additionally, due to its generality, the implementation of dynamic query construction can be maintained and released in a separate repository.

However, as the number and type of query conditions increase, the drawbacks of declaring query conditions using annotations gradually become apparent. The main issues are:

There is redundancy between column names and field names in query conditions;
Variable-length placeholders for IN operators require special handling;
Multiple query conditions need to be combined using the OR operator;
Subqueries need to be supported.

Upon analysis, we found that query conditions involving the equality operator can be directly derived from field names, thus avoiding redundant SQL conditions in annotations. This led us to an optimization direction—eliminating SQL annotations.

2. Categorization of Query Conditions

Based on the characteristics of query conditions, we categorize them into the following four types, each derived from a specific type of field mapping:

Basic query conditions, including column names, operators, and parameters
Query conditions connected using AND/OR
Query conditions with subqueries
Query conditions involving many-to-many relationships

2.1. Predicate Suffix Fields

Basic query conditions consist of column names, comparison operators, and parameters, and can be automatically mapped using predicate suffix fields.

The naming convention for predicate suffix fields typically includes the column name followed by a predicate suffix, which directly represents common comparison operators. For example, Eq represents =, Gt represents >, etc.

This approch is very common in DSLs (Domain-Specific Languages). For example, in Spring Data JPA, the findBy method generates queries by concatenating attribute names and conditions in the method name (e.g., findByAgeGreaterThanAndNameContaining).

Another advantage of naming fields with predicate suffixes is that when multiple query conditions for a column require different comparison operators, different suffixes can avoid naming conflicts between fields.

For instance, in the UserQuery class, the code that originally declared query conditions using the @QueryField annotation can be simplified as follows:

public class UserQuery {
    private String name;     // name = ?
    private String nameLike; // name LIKE ?
    private Integer ageGt;   // age > ?
}

When constructing query conditions based on predicate suffixes, placeholders and parameters are dynamically generated based on the field values. For example:

For IN query conditions, the number of placeholders is automatically generated according to the number of parameters.
For LIKE conditions, the % is automatically added to the field value based on the predicate suffix. See Appendix A for a corresponding table of predicate suffixes.

This design also makes the construction of query objects more intuitive. Developers can directly map field names and their suffixes to corresponding SQL conditions when assigning values to fields.

The predicate suffix table in Appendix A lists common predicate suffixes and their corresponding SQL query conditions, providing a reference for defining these fields.

2.2. Logical Suffix Fields

Multiple query conditions connected by logical operators AND or OR are mapped using logical suffix fields.

The logical suffix field is of type Collection or a query object, used to map multiple query conditions. The name of the logical suffix field includes the logical suffix (AND/OR), specifying the logical operator connecting the query conditions.

The type of the logical suffix field can be a query object or a collection, where each field or element maps to a query condition. All mapped conditions are combined using the logical operator specified by the logical suffix.

The table below shows examples of mapping three types of logical suffix fields with the suffix Or:

Field	Value (JSON format)	Query Condition
`List<String> nameOr`	`["test1", "test2"]`	`name = ? OR name = ?`
`UserQuery userOr`	`{"name": "John", "age": 30}`	`name = ? OR age = ?`
`List<UserQuery> usersOr`	`[{"name": "test", "ageGe": 30, "ageLt": 40}, {"nameLike": "admin", "age": 40}]`	`(name = ? AND age >= ? AND age < ?) OR (name LIKE ? AND age = ?)`

For example, the query condition id = ? OR name = ? AND age > ? can be represented in the UserQuery class by adding fields userAnd and userOr, which map query conditions connected by logical operators:

public class UserQuery {
    Integer id; 
    String name; 
    String nameLike;
    Integer ageGt; 
    UserQuery userAnd; 
    UserQuery userOr; 
}

Then, using the following code, the instance of UserQuery is created, and query conditions are generated from the userOr field:

UserQuery userAnd = UserQuery.builder().name("John").ageGt(30).build();
UserQuery userOr = UserQuery.builder().id(5).userAnd(userAnd).build();
UserQuery userQuery = UserQuery.builder().userOr(userOr).build();

List<Object> argList = new ArrayList<>();
String where = buildWhere(query, argList); 
// SQL: WHERE id = ? OR name = ? AND age > ?

In this example, both AND and OR operators are used, where AND has higher precedence than OR. Since the precedence of logical operators differs, we view the combination of conditions in SQL as hierarchical, not flat. Conditions connected by AND form one level, while those connected by OR form another level. Conditions from different levels are combined using the same logical operator. By exploiting the hierarchical structure of objects, we can express the hierarchy of query clauses.

The following diagram shows the relationship between the assigned fields in the UserQuery instance and the hierarchical structure of query conditions:
Logical Operator Hierarchy Comparison

2.3. Subquery Fields

For general subquery conditions, such as age > (SELECT avg(age) FROM t_user [WHERE]), we can divide them into three parts for mapping:

The condition part age >, which can reuse the format of predicate suffix fields. To avoid conflicts with the existing ageGt predicate suffix, we can append additional characters, such as ageGtAvg, to the suffix. When mapping, we ignore these additional characters.
The main part of the subquery SELECT avg(age) FROM t_user, which can be declared using annotations for the column and table names, such as @Subquery(select = "avg(age)", from = "t_user").
The WHERE clause of the subquery, which can be constructed by reusing the query object mapping method.

Subquery fields should be of type UserQuery, which allows the subquery to be dynamically constructed.

For example, in the UserQuery class, we can add an ageGtAvg field to build the query condition with a subquery:

public class UserQuery {
  // age > (SELECT avg(age) FROM t_user [WHERE])
  @Subquery(select = "avg(age)", from = "t_user")
  UserQuery ageGtAvg;
}

Thus, a single field can be used to construct a query condition that includes a subquery, dynamically building the query statement.

2.4. Entity Relationship Query Fields

To conform to the third normal form (3NF), for two entities with a many-to-many relationship, we usually introduce an intermediate table, transforming the many-to-many relationship into two one-to-many relationships. The table name and field names in such intermediate tables have a specific format: the table name includes both entity names, and the foreign keys are named entity_id. This allows us to derive query conditions from the names of the entities.

For example, in the RBAC model, there are three entities: user, role, and permission. Users and roles have a many-to-many relationship, and the intermediate table's name can be derived from user and role, such as a_user_and_role, with foreign keys named user_id and role_id. The query condition to find users by role would be:

SELECT * FROM t_user
WHERE id IN (SELECT user_id FROM a_user_and_role WHERE role_id IN (
  SELECT id FROM t_role [WHERE]))

By deriving the table and foreign key names from the entities user and role, we can represent this query condition using the expression <user,role>.

Similarly, if role and permission have a many-to-many relationship, we use the expression <user,role,perm> to represent the relationship between user and permission through role. This expression would produce a query to find users with specific permissions:

SELECT * FROM t_user
WHERE id IN (SELECT user_id FROM a_user_and_role WHERE role_id IN (


  SELECT role_id FROM a_role_and_perm WHERE perm_id IN (
    SELECT id FROM t_perm [WHERE])))

In these cases, the abstract entity path is used to represent the relationship between two entities. This abstract path can contain any number of nodes to express the relationships between entities.

For example, when permissions and menus also have a many-to-many relationship, the expression <user,role,perm,menu> can represent the relationship between users and menus.

We define the fields for querying users by their roles and permissions as follows:

public class UserQuery {
  //...
  @DomainPath({"user", "role"})
  RoleQuery role;
  @DomainPath({"user", "role", "perm"})
  PermQuery perm;
}

To query users assigned to a role, we can build a RoleQuery object using the role name and assign it to the role field of the UserQuery instance:

RoleQuery roleQuery = RoleQuery.builder().roleName("vip").build();
UserQuery userQuery = UserQuery.builder().role(roleQuery).build();
String where = buildWhere(userQuery, argList);

The corresponding query clause is:

WHERE id IN (SELECT user_id FROM a_user_and_role WHERE role_id IN (
  SELECT id FROM t_role WHERE role_name = ?))

Moreover, when the abstract entity path contains only one node, it can represent one-to-many or many-to-one relationships by specifying a foreign key name.

For example, in the Menu entity, which uses the foreign key parent_id to link the parent menu's ID, the query fields are defined as follows:

public class MenuQuery extends PageQuery {

    // many-to-one: query submenus by parent's conditions
    // parent_id IN (SELECT id FROM t_menu [WHERE])
    @DomainPath(value = "menu", localField = "parentId")
    private MenuQuery parent;

    // one-to-many: query parent menus by children's conditions
    // id IN (SELECT parent_id FROM t_menu [WHERE])
    @DomainPath(value = "menu", foreignField = "parentId")
    private MenuQuery children;
}

The relationships between parent and child menus are represented as <menu:menu|parentId,id> and <menu:menu|id,parentId>.

Entity relationship query fields provide an out-of-the-box query pattern for entity relationships.

3. Application Scope

Since the definition of query objects only relies on basic features of object-oriented languages, this approach can be applied to various object-oriented programming languages. Different database types share commonalities in query condition syntax, so the object-based mapping approach can generate SQL statements and also construct query statements for NoSQL databases like MongoDB.

The diagram below illustrates how the unified query object model can be used to construct query conditions for different databases. This approach's versatility enables us to share the same query construction logic across multiple database systems, improving code reuse and system flexibility.

4. Conclusion

This article introduces an improved method for automatically generating dynamic query conditions based on query object instances, which maps query conditions to field features. This approach abstracts query conditions into four field types to enable automatic construction. It reduces the need for manual SQL writing, significantly enhancing code generality and extensibility, and provides a clear and efficient solution for complex queries.

Appendix A: Predicate Suffix Table

Suffix Name	Field Name	Field Value	SQL Query Condition	MongoDB Query Condition
-	id	5	id = 5	{"id":5}
Eq	idEq	5	id = 5	{"idEq":5}
Ne	idNe	5	id != 5	{"idNe":{"$ne":5}}
Gt	idGt	5	id > 5	{"idGt":{"$gt":5}}
Ge	idGe	5	id >= 5	{"idGe":{"$gte":5}}
Lt	idLt	5	id < 5	{"idLt":{"$lt":5}}
Le	idLe	5	id <= 5	{"idLe":{"$lte":5}}
In	idIn	[1,2,3]	id IN (1,2,3)	{"id":{"$in":[1, 2, 3]}}
NotIn	idNotIn	[1,2,3]	id NOT IN (1,2,3)	{"id":{"$nin":[1, 2, 3]}}
Null	memoNull	false	memo IS NOT NULL	{"memo":{"$not":{"$type", 10}}}
Null	memoNull	true	memo IS NULL	{"memo":{"$type", 10}}
NotLike	nameNotLike	"arg"	name NOT LIKE '%arg%'	{"name":{"$not":{"$regex":"arg"}}}
Like	nameLike	"arg"	name LIKE '%arg%'	{"name":{"$regex":"arg"}}
NotStart	nameNotStart	"arg"	name NOT LIKE 'arg%'	{"name":{"$not":{"$regex":"^arg"}}}
Start	nameStart	"arg"	name LIKE 'arg%'	{"name":{"$regex":"^arg"}}
NotEnd	nameNotEnd	"arg"	name NOT LIKE '%arg'	{"name":{"$not":{"$regex":"arg$"}}}
End	nameEnd	"arg"	name LIKE '%arg'	{"name":{"$regex":"arg$"}}
NotContain	nameNotContain	"arg"	name NOT LIKE '%arg%'	{"name":{"$not":{"$regex":"arg"}}}
Contain	nameContain	"arg"	name LIKE '%arg%'	{"name":{"$regex":"arg"}}
Rx	nameRx	"arg\d"	name REGEXP 'arg\d'	{"name":{"$regex":"arg\d"}}

Appendix B: GitHub Repos

Java version: http://github.com/doytowin/doyto-query

Golang version: http://github.com/doytowin/goooqo

DOYToWin