User Preference Recommendation Systems
Scenarios
Recommendation systems are widely used on Internet platforms and in traditional industries. They analyze user behavior, preferences, and context information to provide personalized content recommendations. Common scenarios include:
- E-commerce and retail: Products are recommended based on users' browsing and purchase history (for example, "You May Like").
- Streaming media and content platforms: Content (such as movies and short videos) is recommended based on watching records and ratings.
- Travel services: Destinations are recommended based on user preferences.
- App recommendation: Apps are recommended based on users' app download and usage habits.
This section uses movies as an example to describe how to design a recommendation system database.
Prerequisites
You have installed the pg_trgm extension.
Design and Implementation
The pg_trgm extension of PostgreSQL provides string similarity calculation based on trigrams. You can use this extension to build a simple user recommendation system.
- Connect to your RDS for PostgreSQL instance.
- Create a test table on the instance.
CREATE TABLE movies ( id INT, -- Movie ID title VARCHAR(255), -- Movie name description TEXT, -- Movie description genres VARCHAR(255)[], -- Movie genre year INTEGER -- Movie year );
- Search for movies whose titles are similar to a specified movie title.
-- Create a GIN index to accelerate similarity queries. CREATE INDEX movies_title_gin_idx ON movies USING gin(title gin_trgm_ops); SELECT m2.id, m2.title, similarity(m1.title, m2.title) AS similarity_score FROM movies m1, movies m2 WHERE m1.id = 123 AND -- Target movie ID m1.id != m2.id AND similarity(m1.title, m2.title) > 0.3 -- Similarity threshold ORDER BY similarity_score DESC LIMIT 10;
- Search for movies based on the similarity of movie descriptions.
-- Create a GIN index for descriptions. CREATE INDEX movies_description_gin_idx ON movies USING gin(description gin_trgm_ops); SELECT m2.id, m2.title, similarity(m1.description, m2.description) AS similarity_score FROM movies m1, movies m2 WHERE m1.id = 123 AND m1.id != m2.id AND similarity(m1.description, m2.description) > 0.1 -- The description similarity threshold can be set to a small value. ORDER BY similarity_score DESC LIMIT 10;
- Search for movies based on multiple features such as the title, description, and genre.
SELECT m2.id, m2.title, ( 0.5 * similarity(m1.title, m2.title) + 0.3 * similarity(m1.description, m2.description) + 0.2 * (SELECT COUNT(*) FROM unnest(m1.genres) AS g1 JOIN unnest(m2.genres) AS g2 ON g1 = g2)::FLOAT / GREATEST(array_length(m1.genres, 1), array_length(m2.genres, 1)) ) AS combined_similarity FROM movies m1, movies m2 WHERE m1.id = 123 AND m1.id != m2.id ORDER BY combined_similarity DESC LIMIT 10;
- Limit the search scope based on the user configuration. For example, run the following SQL statements to search for only movies of the same type:
SELECT m2.id, m2.title, similarity(m1.title, m2.title) AS similarity_score FROM movies m1, movies m2 WHERE m1.id = 123 AND m1.id != m2.id AND m1.genres && m2.genres AND -- Must have at least one same genre. similarity(m1.title, m2.title) > 0.3 ORDER BY similarity_score DESC LIMIT 10;
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot