The Architecture of Presto Optimizer, part 1

Written by

Vladimir Ozerov

Jan 4, 2021

Abstract

Presto is an open-source distributed SQL query engine for big data. Presto provides a connector API to interact with different data sources, including RDBMSs, NoSQL products, Hadoop, and stream processing systems. Created by Facebook, Presto received wide adoption by the open-source world (Presto, Trino) commercial companies (e.g., Ahana, Qubole).

Presto comes with a sophisticated query optimizer that applies various rewrites to the query plan. In this blog post series, we investigate internals of Presto optimizer. In the first part, we discuss the organization of relational tree, the optimizer interface, and the design of the rule-based optimizer. In the second part, we will discuss concrete optimization rules and transformations.

Please refer to the original paper by Facebook to get a better understanding of Presto's capabilities and design.

We will use the Presto Foundation fork version 0.245 for this blog post.

Relational Tree

Presto optimizer works with relational operators. Similarly to other SQL optimizers, such as Apache Calcite, Presto performs syntax and semantic analysis of the original SQL string and then produces the logical relational tree:

  1. The ANTLR-based parser converts the original query string into an abstract syntax tree (AST)
  2. The analyzer performs the semantic validation of the AST.
  3. The converter creates the logical relational tree from the AST.

Every node in the tree represents a relational operation and implements a common PlanNode interface, which exposes a unique node's ID, node's inputs, and node's output. The interface also allows traversing the tree with a visitor pattern, used extensively during the optimization. Examples of relational operations:

  1. TableScanNode - scans a table.
  2. ProjectNode, FilterNode, AggregationNode, JoinNode - common relational operations.

Consider the following query:


SELECT 
    orderstatus, 
    SUM(totalprice) 
FROM orders 
GROUP BY orderstatus

The associated query plan might look like this:


OutputNode
  ProjectNode[group, sum]
    AggregationNode[group=orderstatus, sum=SUM(totalprice)]
      ProjectNode[orderstatus=orders.orderstatus, totalprice=orders.totalprice]
        TableScanNode[orders]

Optimizer Inteface

When the logical plan is ready, we can start applying optimizations to it. In Presto, there is the general PlanOptimizer interface that every optimization phase implements. The interface accepts one relational tree and produces another.


public interface PlanOptimizer
{
    PlanNode optimize(
        PlanNode plan,
        Session session,
        TypeProvider types,
        PlanVariableAllocator variableAllocator,
        PlanNodeIdAllocator idAllocator,
        WarningCollector warningCollector
    );
}

The optimization program builder PlanOptimizers creates a sequence of optimizers that are invoked sequentially to optimize the relational tree. Optimization problems often split into several phases to keep logical and computational complexity under control. In Presto, there are more than 70 optimization phases that every relational tree will pass through.

The majority of optimization phases use the rule-based optimizer that we will discuss further. Other phases rely on custom optimizers that make no use rules but apply a custom transformation logic. For example, the PredicatePushDown optimizer moves filters down in the relational tree, and PruneUnreferencedOutputs removes unused fields that could be generated during the AST conversion or the previous optimization phases. We will discuss the most important custom optimizers in the second part of this blog post series.

Presto may also reoptimize the query plan in runtime. The details of this process are out of the scope of this blog post.

Rule-Based Optimizer

Presto uses the rule-based IterativeOptimizer for the majority of optimization phases.

In rule-based optimization, you provide the relational tree and a set of pluggable optimization rules. A rule is a self-contained code that defines the relational tree pattern it should be applied to and the transformation logic. The optimizer then applies the rules to the relational tree using some algorithm.

The main advantage of rule-based optimizers is extensibility. Instead of having a monolithic optimization algorithm, you split the optimizer into smaller self-contained rules. To extend the optimizer, you create a new rule that doesn't affect the rest of the optimizer code.

Rule-based optimizers could be either cost-based or heuristic. In cost-based optimizers, a particular transformation is chosen based on the estimated cost assigned to operators. Heuristic optimizers don't use costs and could produce arbitrary bad plans in the worst case.

Presto rule-based optimizer is a heuristic optimizer, although some specific rules may use costs to pick a single transformation from multiple alternatives. An example is the ReorderJoins rule that selects a single join order with the least cost from multiple alternatives.

We now describe the most important parts of the Presto rule-based optimizer: the `Memo` class, rule matching, and the search algorithm.

MEMO

MEMO is a data structure used primarily in cost-based optimizers to encode multiple alternative plans efficiently. From a bird's eyes view, the MEMO data structure works as follows: 

  1. Every operator in the initial relational tree is put into an equivalence group, which is also an operator. Inputs of operators are replaced with relevant groups.
  2. When a new transformation is available, we do not copy the whole plan but add the newly created operator to the appropriate equivalence group.
  3. When the optimization is finished and the cheapest plan is found, selected groups are replaced with normal operators. We call this process copy-out.

The main advantage of MEMO is that multiple alternative plans could be encoded in a very compact form. Consider that we want to explore different join orders for the query and to transform `a JOIN b` into `b JOIN a`. The original query tree:


Join1
  TableScan(a)
  TableScan(b)

When we initialize the MEMO, the plan is converted to the following form:


G1: [Join1(G2 x G3)]
G2: [TableScan(a)]
G3: [TableScan(b)]

To add an alternative join order, we add the new operator to the join's equivalence group. Now we have two equivalent operators in the group `G1`: `a JOIN b` and `b JOIN a`.


G1: [Join1(G2 x G3), Join2(G3 x G2)]
G2: [TableScan(a)]
G3: [TableScan(b)]

Finally, if `b JOIN a` happened to be cheaper than `a JOIN b`, we copy-out it from MEMO to produce the resulting tree:


Join2
  TableScan(b)
  TableScan(a)  

Presto also uses a MEMO-like data structure but in a slightly different way. There is the Memo class that stores groups. The optimizer initializes the `Memo`, which populates groups via a recursive traversal of the relational tree. However, every group in `Memo` may have only one operator. That is, Presto doesn't store multiple equivalent operators in a group. Instead, as we will see below, Presto unconditionally replaces the current operator with the transformed operator. Therefore, the `Memo` class in Presto is not a MEMO data structure in a classical sense because it doesn't track equivalent operators. In Presto, you may think of the group as a convenient wrapper over an operator, used mostly to track operators' reachability during the optimization process.

Rule Matching

To optimize the relational tree, you should provide the optimizer with one or more rules. Every rule in Presto implements the Rule interface.

First, the interface defines the pattern, which may target an arbitrary part of the tree. It could be a single operator (filter in the PruneFilterColumns rule), multiple operators (filter on top of the filter in the MergeFilters rule), an operator with a predicate (join pattern in the ReorderJoins rule), or anything else.

Second, the interface defines the transformation logic. The result of the transformation could be either a new operator that replaces the previous one or no-op if the rule failed to apply the transformation for whatever reason.

Search Algorithm

Now, as we understand the Presto rule-based optimizer's core concepts, let's take a look at the search algorithm.

  1. The `Memo` class is initialized with the original relational tree, as we discussed above.
  2. For every `Memo` group, starting with the root, the method exploreGroup is invoked. We look for rules that match the current operator and fire them. If a rule produces an alternative operator, it replaces the original operator unconditionally. The process continues until there are no more available transformations for the current operator. Then we optimize operators' inputs. If an alternative input is found, it may open up more optimizations for the parent operator, so we re-optimize the parent. Presto relies on timeouts to terminate the optimization process if some rules continuously replace each other's results. Think of `b JOIN a`, that replaces `a JOIN b`, that replaces `b JOIN a`, etc. You may run the TestIterativeOptimizer test to see this behavior in action.
  3. In the end, we copy-out the final plan from `Memo`.

This is it. The search algorithm is very simple and straightforward.

The main drawback is that the optimizer is heuristic and cannot consider multiple alternative plans concurrently. That is, at every point in time, Presto has only one plan that it may transform further. In the original paper from 2019, Facebook engineers mentioned that they explore an option to add a cost-based optimizer:

We are in the process of enhancing the optimizer to perform a more comprehensive exploration of the search space using a cost-based evaluation of plans based on the techniques introduced by the Cascades framework.

There is also a document dated back to 2017 with some design ideas around cost-based optimization.

Summary

In this blog post, we explored the design of the Presto optimizer. The optimization process is split into multiple sequential phases. Every phase accepts a relational tree and produces another relational tree. Most phases use a rule-based heuristic optimizer, while some rules rely on custom logic without rules. There were some thoughts to add the cost-based optimizer to Presto, but it hasn't happened yet.

In the second part of this series, we will explore the concrete optimization rules and custom phases of Presto's query optimization. Stay tuned!

We are always ready to help you with your SQL query optimizer design. Just let us know.