Overview of Materialize view in Rails
Understand performance improvement by using Materialize view in Rails
The most common question that a backend developer can hear is how one can improve application performance if the application is performing very slow. The answer to it could vary as there could be many answers to it. One of the answers to the question could be using Materialize view. So let's explore what exactly Materialize view is and how it can improve the performance of poorly performing Rails applications.
What is a SQL View?
Before jumping onto the Materialize view, let’s understand quickly what a SQL view is. So let’s assume a scenario where we have a blog post application in which users can add posts so that other users can like and comment on them. Now we have a frequent requirement to load recent posts with their associated comments and users as below,
class Post
...def self.recent_activity
Post
.joins(comments: [:user])
.where("posts.created_at < ?", DateTime.now)
.order('posts.updated_at ASC, comments.created_at ASC')
.select("posts.id as post_id, posts.description as
post_description, comments.id as comment_id,
comments.content as comment_content, users.id as user_id,
users.username as user_username")
.group("posts.id, comments.id, users.id")
endend# On Rails console
=> Post.recent_activity
Post Load (6.7ms) SELECT posts.id as post_id, posts.description as post_description, comments.id as comment_id, comments.content as comment_content, users.id as user_id, users.username as user_username FROM "posts" INNER JOIN "post_comments" ON "post_comments"."post_id" = "posts"."id" INNER JOIN "comments" ON "comments"."id" = "post_comments"."comment_id" INNER JOIN "users" ON "users"."id" = "comments"."user_id" WHERE (posts.created_at < '2022-09-11 18:30:00') GROUP BY posts.id, comments.id, users.id ORDER BY posts.updated_at ASC, comments.created_at
In this case, we can implement SQL view to encapsulate the big complex SQL query as explained in this blog post on understanding SQL view in Postgresql. But as mentioned in the blog, the query time remains the same even using a SQL view. For enhancing the overall query time we’ll need to use materialized view which we will be going to understand in this section.
Materialized View
The process of creating a materialized view is quite similar to creating a SQL view. So first of all, we will be using a Scenic gem to create the materialized view. And then to create a view we can use the command below,
rails generate scenic:view latest_activity_mat --materialized
This will create two files as below,
- db/views/latest_activity_mats_v01.sql
- db/migrate/20220919150638_create_latest_activity_mats.rb
Similar to a SQL view, the next thing is to add the SQL query for our operation in the newly generated .sql
extension file as below,
# db/views/latest_activities_v01.sqlSELECT
posts.id as post_id,
posts.description as post_description,
comments.id as comment_id,
comments.content as comment_content,
users.id as user_id,
users.username as user_username
FROM posts
INNER JOIN post_comments ON post_comments.post_id = posts.id
INNER JOIN comments
ON comments.id = post_comments.comment_id
INNER JOIN users
ON users.id = comments.user_id
WHERE (posts.created_at < NOW())
GROUP BY posts.id, comments.id, users.id
ORDER BY posts.updated_at ASC, comments.created_at
Along with this, the migration file will look something like below,
class CreateLatestActivityMats < ActiveRecord::Migration[7.0]
def change
create_view :latest_activity_mats, materialized: true
end
end
Cool! so now we can migrate with rails db:migrate
followed by the creation of latest_activity_mat.rb
file as below,
# app/models/latest_activity_mat.rbclass LatestActivityMat < ApplicationRecord
belongs_to :post
def self.refresh
Scenic.database.refresh_materialized_view(table_name, concurrently: false, cascade: false)
end def readonly?
true
end
end
Here two main thing to notice is that,
readonly?
: So as to make the model in read-only mode by the method which can be a private method as well and to restrict any writing of data (so no full CRUD operation, only retrieve data)self.refresh
: To refresh the Materialized view data when required, let’s discuss it in a while
One more optional thing that we can do is add a unique key for each record as self.primary_key = :comment_id
. Now for the association part, we can set has_many :latest_activities
in Post model, so we can have records for the latest posts with all required attributes as below,
# On Rails console=> Post.last.latest_activity_mats
Post Load (0.6ms) SELECT "posts".* FROM "posts" ORDER BY "posts"."id" DESC LIMIT $1 [["LIMIT", 1]]
LatestActivityMat Load (0.4ms) SELECT "latest_activity_mats".* FROM "latest_activity_mats" WHERE "latest_activity_mats"."post_id" = $1 [["post_id", 1]]
=>
[#<LatestActivityMat:0x000055bca717a0c0 post_id: 1, post_description: "AAAAAAAaaaa", comment_id: 1, comment_content: "coment test", user_id: 1, user_username: "root">,
#<LatestActivityMat:0x000055bca7122988 post_id: 1, post_description: "AAAAAAAaaaa", comment_id: 2, comment_content: "content 2", user_id: 1, user_username: "root">,
...]
Now we have a more maintainable structure since we have a separate model to access required records along with performance enhancement. So when we compare the overall response with some test data we can observe the response time between a normal query method and materialized view we get results as below,
user system total real
normal-method (0.5ms) 0.053656 0.000269 0.053925 ( 0.061923)
mat-view (0.2ms) 0.005111 0.000182 0.005293 ( 0.005339)
Since I have fewer data to load, the result is quite quick, yet we can see a big difference and the materialized view response time is awe-inspiring. It could show even 100 times faster response as well.
So basically, materialized view cache the data in a temporary table. But the table data does not get updated automatically for newly added/updated records. To update the materialized view data, the materialized view data needs a trigger to the refresh method manually (which may be automated with the scheduler). In our case, data can be refreshed by calling the refresh method as LatestActivityMat.refresh
which may take significant time and depends on the amount of data, system speed, etc.
So that’s how we can significantly improve the performance of the Rails application using Materialized view.
Stay tuned for more such interesting topics!