When a user submits a query to a database, the optimizer translates the query string to an intermediate representation (IR) and applies various transformations to find the optimal execution plan. Apache Calcite uses relational operators as the intermediate representation. In this blog post, we discuss the design of the relational operators in Apache Calcite.
Query optimization is an expensive process that needs to explore multiple alternative ways to execute the query. The query optimization problem is NP-hard, with the number of possible plans growing exponentially with the query's complexity. This blog post will discuss memoization - an important technique that allows rule-based optimizers to consider billions of alternative plans in a reasonable time.
In this blog post, we discuss rule-based optimization - a common pattern to explore equivalent plans used by modern optimizers. Then we analyze the rule-based optimization in Apache Calcite, Presto, and CockroachDB.
Presto is an open-source distributed SQL query engine for big data. In this blog post series, we explore the internals of the Presto query optimizer. In the first part, we discuss the relational tree organization, the optimizer interface, and the design of the rule-based planner.
In this blog post, we will explore how to define and enforce custom physical properties (traits) in Apache Calcite.