Hints

A hint is an instruction to the optimizer. When writing SQL, you may know information about the data that is unknown to the optimizer. Sometimes the optmizer might not generate the most optimal query plan. Hints enable to change aspects of the query plan generated by the optimizer.

There are two ways to define a hint:

Inline hints
Context hints

1. Inline Hints

Inline hints are defined as special comments. Depending on the hint type, it can be written in two locations:

Query Hint: right after the SELECT keyword.
Table Hint: right after the referenced table name.

Here you can see the kinds of hint:

select /*+ fixJoinOrder */ sum(a.i) from t1 /*+ forceAccess(idx1) */ a, t2 b where a.v = b.v;

The syntax is as follows:

hintComment:
	'/*+' hint [, hint ]* '*/'

hint:
	hintName
	| hintName '(' optionKey '=' optionVal [, optionKey '=' optionVal ]* ')'
	| hintName '(' hintOption [, hintOption ]* ')'

optionKey:
	simpleIdentifier
	| stringLiteral

optionVal:
	stringLiteral

hintOption:
	simpleIdentifier
 	| numericLiteral
	| stringLiteral

2. Query Hints

2.1. Force the Use of an Index

This hint forces the access to a table through a defined index. The hint name is forceAccess. It takes two parameters:

Table name.
Index name.

This is an example forcing to use index idx1 to access table t1:

select /*+ forceAccess(t1, idx1) */ sum(i) from t1 a
	where a.i between 1 and 5 and a.i2 between 11 and 20;

2.2. Disable Pushdown

Queries in LeanXcale can be executed either on the query engine, in the storage engines, or both, depending on the query. The part of the query executed in the storage engines is said to be pushed down to the storage engines. Sometimes one might want to prevent pushing down operations to storage engines for a given table in a query. In general, this will negatively impact performance, so only use it, if you actually know what you are doing. This means that all the work will be done by the query engine except for the scans.

This is an example that disables the pushdown for table t1:

select /*+ disablePushdown(t1) */ sum(i) from t1 a where a.i
	between 1 and 5 and a.i2 between 11 and 20;

2.3. Disable Join Pushdown

This hint disables the use of relops query engine. In general, this will negatively impact performance. This means that all join actions will be executed by the Java query engine instead of relops (the C query engine).

This is an example:

select /*+ skip_relops */ sum(a.i) from t1 a, t2 b where a.v = b.v

2.4. Skip Online Aggregate

This hint disables translating table’s query into a query over any of its online aggregates. In general, this will negatively impact performance.

select /*+ skip_oa(item) */ max(i_price), min(i_price), count(*) from item;

2.5. Fix Join Order

This hint sets the order of the joins as they are written in the query. This hint works at query level, so it will affect to all the joins present in the query.

This is an example:

select /*+ fixJoinOrder */ sum(a.i) from t1 a, t2 b where a.v = b.v

2.6. Force Join Type

This hint allows to force the kind of join which will be used between two given tables. This is an example:

select /*+ forceJoinType(NESTED_LOOP, t1, t2) */ sum(a.i) from t1 a, t2 b where a.v = b.v;

Where JOIN_TYPE is one of the available types:

CORRELATE
MERGE_JOIN
NESTED_LOOP

2.7. Or To Union

This hint transforms OR conditions into UNION clauses. This is an example:

select /*+ orToUnion(t1) */ sum(i) from t1
	where i between 1 and 3 and v is not null or i between 18 and 20;

2.8. Or To Union All

This hints transforms OR conditions into UNION ALL clauses. This is an example:

select /*+ orToUnionAll(t1) */ sum(i) from t1
	where i between 1 and 3 and v is not null or i between 18 and 20;

2.9. Define Parallel Table

This hint is used to define a table over which the scans can be parallelized. This is an example:

select /*+ enableParallelTable(t1) */ sum(i) from t1
	where i between 1 and 3 and v is not null or i between 18 and 20;

2.10. Enable Parallel Pushed Down Aggregates

This hint enables the planner rules which allows to transform a scan with an aggregation program into a parallel scan (with the same aggregation program). To be chosen, this parallel scan will have to beat the rest of the alternatives in terms of cost. This is an example:

select /*+ enableParallelAgg */ sum(i) from t1
	where i between 1 and 3 and v is not null or i between 18 and 20;

3. Table Hints

3.1. Force the Use of an Index

This hint forces the access to a table through a defined index. To use it invoke the function as follows: An example is:

select sum(i) from t1 /*+ forceAccess(t1) */ a
	where a.i between 1 and 5 and a.i2 between 11 and 20;

3.2. Disable Pushdown

This hint disables the pushdown of operations to the storage engine for a given table. In general, this will negatively impact performance. This means that all the work will be done by the query engine except the scans. This is an example:

select sum(i) from t1  /*+ disablePushdown */ a
	where a.i between 1 and 5 and a.i2 between 11 and 20;

3.3. Skip Online Aggregate

This hint disables translating table’s query into a query over any of its online aggregates. This is an example:

select max(i_price), min(i_price), count(*) from item /*+ skip_oa */;

3.4. Or to Union

This hints transforms OR conditions into UNION clauses. This is an example:

select sum(i) from t1 /*+ orToUnion */
	where i between 1 and 3 and v is not null or i between 18 and 20;

3.5. Or to Union All

This is an example: This hints transforms OR conditions into UNION ALL clauses.

select sum(i) from t1 /*+ orToUnionAll */
	where i between 1 and 3 and v is not null or i between 18 and 20;

3.6. Define Parallel Table

This hint is used to define a table over which the scans can be parallelized. This is an example:

select sum(i) from t1 /*+ enableParallelTable */
	where i between 1 and 3 and v is not null or i between 18 and 20;

3.7. Batch Size

This hint applies only to DMLs and is equivalent to BATCH statements. The following statements are equivalent:

upsert into t2 /*+ batchSize(5000) */ (pk, v, fk)
	select i2, v, i from t1 where v is not null;
batch 5000 upsert into t2 (pk, v, fk)
	select i2, v, i from t1 where v is not null;

4. Context Hints

Context Hints are established within a session for a given connection and affect all queries issued in that session after being set. The actions that can be performed for context hints are:

Enable context hints for the connection
Disable context hints for the connection
List a context hint
Refresh plan cache
Set a context hint

In the following subsections we explain the different steps.

4.1. Enable Context Hints

In order to allow to use context hints, they have to be enabled. This can be done using enableHints table function as follows:

EXEC enableHints()

The output is:

Hints enabled for connection ...

4.2. Disable Context Hints

Disable the hints deletes all hints set in the session. However, the execution plans that have been created using queries with the context hints activated will remain in the plan cache.

This can be done using disableHints function as follows:

EXEC disableHints()

The output is:

Hints disabled for connection ...

4.3. List Context Hints

Lists hints is useful to know the defined context hints for the present connection, and to retrieve their ID (which can be used to remove them).

This can be done using "listHints" function as follows:

EXEC listHints()

The output will be a table with a row for each defined hint and three columns: ID that identifies the hint within the connection, the hint type, and the description of the hint.

4.4. Set Context Hints

In order to set a context hint you have to invoke the appropriate function for the hint you want to define. The following subsections explain each of the available hints.

4.4.1. Force the Use of an Index

This hint forces the access to a table through a defined index. It is used as follows:

EXEC forceAccess('TABLE_NAME','INDEX_NAME')

Where TABLE_NAME is the qualified name of the table (i.e., db-APP-CUSTOMER) and INDEX_NAME is the qualified name of the index (i.e., db-APP-CUSTOMERIDX). This function does not check the existence of the index nor the table so, if you force the access through an inexistent index, the queries that use this table will fail.

4.4.2. Disable Pushdown

This hint disables the pushdown of operations to the storage engine. In general, this will negatively impact performance. This means that all the work will be done by query engine except the scans that are always performed on the storage engine side.

It is used as follows:

EXEC disablePushdown('TABLE_NAME')

Where TABLE_NAME is the qualified name of the table (i.e. db-APP-CUSTOMER).

4.4.3. Fixed Join Order

This hint sets the order of the joins as they are written in the query. It is defined as follows:

EXEC fixJoinOrder()

4.4.4. Force Join Type

This hint allows to force the kind of join which will be used between two given tables. It is set by:

EXEC forceJoinType('JOIN_TYPE', 'TABLE_1', 'TABLE_2')

Where JOIN_TYPE is one of the available types:

CORRELATE,
MERGE_JOIN,
NESTED_LOOP

TABLE_1 and TABLE_2 are the unqualified names of the affected tables (i.e. CUSTOMER or WAREHOUSE). This function does not check that the chosen join type is available for a given pair of tables so if not the query will fail.

4.4.5. Or To Union

This hint transforms OR conditions into UNION clauses. It accepts a Boolean parameter which indicates whether the UNION will be a UNION ALL or not.

It is activated by:

EXEC orToUnion(all)

Where all is true for a UNION ALL or false otherwise.

4.4.6. Parallelization Hints

The hints provided in the next subsections are used to manage the parallel mode. Parallel mode will force parallelism in scans at the query engine. Scans are always parallelized at the storage engine level.

Enable Parallel Mode

This hint is used to enable the planner rules that transform a scan into a parallel scan when possible, and propose it as an alternative way to traverse a table (which will compete with the rest of alternatives in terms of cost).

It is defined as follows:

EXEC enableParallelMode()

Define Parallel Tables

This hint is used to define the set of tables over which the scans can be parallelized. This hint accepts a Java regular expression as a table name, so you can use a regexp to cover several tables or invoke this hint for each single one.

It is set as follows:

EXEC enableParallelTable(REGEXP)

Where REGEXP is a Java regular expression which applies over the fully qualified table name (i.e., 'db-APP-.' or '.' for all the tables).

Enable Parallel Pushed Down Aggregates

This hint enables the planner rules which allows to transform a storage engine scan with an aggregation program into a storage engine parallel scan (with the same aggregation program). To be chosen by the query planner, this parallel scan will have to beat the rest of the alternatives in terms of cost.

It can be defined as follows:

EXEC enableParallelAgg()

4.5. Remove Context Hints

A hint can be removed for the present connection. But, if the hint has been used to execute a query and an execution plan has been created and stored in the query plan cache, remove the hint won’t remove the query plan and the queries will continue to use it for their execution.

Removing a hint is performed as follows:

EXEC removeHint(HINT_ID)

Where HINT_ID (an integer, without quotes) is the identifier of the hint that wants to be removed, and can be obtained from the listHints table function.

5. Clean Query Plan Cache

This table function removes all the query plans that have been cached for a given table (not only for this connection but for every query in the server). It can be useful to ensure that a new query plan will be calculated when a hint is added or removed. However, it has an associated overhead, it will force to generate the plan for all subsequent queries that is an expensive operation.

It is invoked as follows:

EXEC cleanPlanCache('TABLE_NAME')

Where TABLE_NAME is the qualified name of the table (i.e. db-APP-CUSTOMER) from which the query plans will be removed.

6. Enable/Disable Planner Logger

These two functions let the user to enable or disable the optimizer logger. To be able to use it, you have to define a logger in your log4j configuration for the class org.apache.calcite.plan.RelOptPlanner with the level you want.

When enabling it, the log level for the RelOptPlanner logger will be set to ALL, and the planner will log all the query optimization steps in the appender configured for the RelOptPlanner logger.

Use it with caution because the amount of generated data can affect negatively the performance of the query preparation.

They can be invoked as follows:

EXEC enablePlannerDebug()
EXEC disablePlannerDebug()