When a user submits a query to a database, the optimizer translates the query string to an intermediate representation (IR) and applies various transformations to find the optimal execution plan. Apache Calcite uses relational operators as the intermediate representation. In this blog post, we discuss the design of the relational operators in Apache Calcite.
A typical database may execute an SQL query in multiple ways, depending on the order of the operators and the algorithms chosen to implement them. One crucial decision is the order in which the optimizer should join relations. In this blog post, we define the join ordering problem and estimate the complexity of join planning.
In this blog post, we discuss what the cost of a query plan is and how it drives optimizer decisions.
Query optimization is an expensive process that must explore many alternative ways to execute a query. The query optimization problem is NP-hard, with the number of possible plans growing exponentially with the query's complexity. In this blog post, we discuss memoization - an important technique that allows rule-based optimizers to consider billions of alternative plans in a reasonable time.
In this blog post, we discuss rule-based optimization - a common pattern that modern optimizers use to explore equivalent plans. We then analyze how rule-based optimization is implemented in Apache Calcite, Presto, and CockroachDB.
Presto is an open-source distributed SQL query engine for big data. In this blog post series, we explore the internals of the Presto query optimizer. In the first part, we discuss the relational tree organization, the optimizer interface, and the design of the rule-based planner.
In this blog post, we will explore how to define and enforce custom physical properties (traits) in Apache Calcite.