Querying the IMDB database with MySQL

File

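I wrote a SQL query to answer the following question:

Find all actors in the IMDB database who have made more movies with Yash Chopra than with any other director.

Sample schema: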

person
(pid *
,name
);

m_cast
(mid *
,pid *
);

m_director
(mid*
,pid*
);

* = (component of) PRIMARY KEY

Here is my query:

WITH common_actors AS 
        (SELECT A.actor_id as actors, B.director_id as director_id, B.movies as movies_with_director,
        B.director_id as yash_chops_id, B.movies as movies_with_yash_chops FROM
        (SELECT M_Cast.PID as actor_id, M_Director.PID as director_id, COUNT(*) as movies from M_Cast
        left join M_Director  
        ON M_Cast.MID = M_Director.MID
        GROUP BY actor_id, director_id) A
        JOIN
        (SELECT M_Cast.PID as actor_id, M_Director.PID as director_id, COUNT(*) as movies from M_Cast
        left join M_Director  
        ON M_Cast.MID = M_Director.MID
        GROUP BY actor_id, director_id
        )B
        ON A.actor_id = B.actor_id
        WHERE B.director_id in (SELECT PID FROM Person WHERE Name LIKE 
        '%Yash%Chopra%'))

SELECT distinct actors as actor_id, movies_with_yash_chops as total_movies FROM common_actors
    WHERE actors NOT IN (SELECT actors FROM common_actors WHERE movies_with_director > movies_with_yash_chops)

This query returns 430 rows, but the expected result has 243 rows. Can anyone suggest where my query goes wrong? Is my approach right?

Sample result:

    Actor name
  0 Sharib Hashmi
  1 Kulbir Badesron
  2 Gurdas Maan
  3 Parikshat Sahni
...
242 Ramlal Shyamlal

Thanks in advance!

Strawberry

Consider the following:

DROP TABLE IF EXISTS person;

CREATE TABLE person
(person_id SERIAL PRIMARY KEY
,name VARCHAR(20) NOT NULL UNIQUE
);

DROP TABLE IF EXISTS movie;

CREATE TABLE movie
(movie_id SERIAL PRIMARY KEY
,title VARCHAR(50) NOT NULL UNIQUE
);

DROP TABLE IF EXISTS m_cast;

CREATE TABLE m_cast
(movie_id INT NOT NULL
,person_id INT NOT NULL
,PRIMARY KEY(movie_id,person_id)
);

DROP TABLE IF EXISTS m_director;

CREATE TABLE m_director
(movie_id INT NOT NULL
,person_id INT NOT NULL
,PRIMARY KEY(movie_id,person_id)
);

INSERT INTO person (name) VALUES
('Steven Feelberg'),
('Manly Kubrick'),
('Alfred Spatchcock'),
('Fred Pitt'),
('Raphael DiMaggio'),
('Bill Smith');

INSERT INTO movie VALUES
(1,'Feelberg\'s Movie with Fred & Raph'),
(2,'Feelberg and Fred Ride Again'),
(3,'Kubrick shoots DiMaggio'),
(4,'Kubrick\'s Movie with Bill Smith'),
(5,'Spatchcock Presents Bill Smith');

INSERT INTO m_director VALUES
(1,1),
(2,1),
(3,2),
(4,2),
(5,3);

INSERT INTO m_cast VALUES
(1,4),
(1,5),
(2,4),
(3,5),
(4,6),
(5,6);

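I've included the movie table only for ease of reference. It is not relevant to the actual problem. Also note that this model assumes an actor is listed only once per movie, regardless of whether they play multiple roles in it.

The following query asks "how often has each actor worked with each director?"...

An actor is anyone who has ever appeared in the cast of any movie. A director is anyone who has ever directed any movie.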

SELECT a.name actor
     , d.name director
     , COUNT(DISTINCT ma.movie_id) total
  FROM person d
  JOIN m_director md 
    ON md.person_id = d.person_id
  JOIN person a
  LEFT
  JOIN m_cast ma 
    ON ma.person_id = a.person_id
   AND ma.movie_id = md.movie_id
  JOIN m_cast x
    ON x.person_id = a.person_id
 GROUP
    BY actor
     , director;
     
+-------------------+-------------------+-------+
| actor             | director          | total |
+-------------------+-------------------+-------+
| Fred Pitt         | Alfred Spatchcock |     0 |
| Fred Pitt         | Manly Kubrick     |     0 |
| Fred Pitt         | Steven Feelberg   |     2 |
| Raphael DiMaggio  | Alfred Spatchcock |     0 |
| Raphael DiMaggio  | Manly Kubrick     |     1 |
| Raphael DiMaggio  | Steven Feelberg   |     1 |
| Bill Smith        | Alfred Spatchcock |     1 |
| Bill Smith        | Manly Kubrick     |     1 |
| Bill Smith        | Steven Feelberg   |     0 |
+-------------------+-------------------+-------+

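By inspection, we can see that:

  • Fred Pitt is the only actor who has worked with Feelberg more often than with any other director
  • Raphael DiMaggio and Bill Smith have each worked equally often with two directors (albeit different directors)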
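
The same observations can be checked without reading the table by eye. What follows is only a sketch, assuming MySQL 8.0 or later (for WITH) and the sample tables above; the names pair_totals, best, best_total and directors_at_best are made up for illustration. It counts movies per actor/director pair, finds each actor's highest count, and reports how many directors reach that count. An actor has a single most frequent director only when directors_at_best is 1.

```sql
-- Sketch only: requires MySQL 8.0+ for common table expressions.
-- pair_totals, best, best_total, directors_at_best are illustrative names.
WITH pair_totals AS (
    -- One row per actor/director pair that shares at least one movie.
    SELECT ma.person_id AS actor_id
         , md.person_id AS director_id
         , COUNT(DISTINCT ma.movie_id) AS total
      FROM m_cast ma
      JOIN m_director md
        ON md.movie_id = ma.movie_id
     GROUP BY ma.person_id, md.person_id
),
best AS (
    -- Highest movie count each actor has with any one director.
    SELECT actor_id
         , MAX(total) AS best_total
      FROM pair_totals
     GROUP BY actor_id
)
SELECT a.name       AS actor
     , b.best_total
     , COUNT(*)     AS directors_at_best  -- 1 means a unique favourite
  FROM best b
  JOIN pair_totals p
    ON p.actor_id = b.actor_id
   AND p.total    = b.best_total
  JOIN person a
    ON a.person_id = b.actor_id
 GROUP BY a.name, b.best_total
 ORDER BY a.name;
```

Against the sample data this should report directors_at_best = 1 for Fred Pitt (best_total 2) and 2 for both Raphael DiMaggio and Bill Smith (best_total 1). Because it uses an inner join, pairs that never worked together are simply absent rather than listed with a total of 0.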

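Edit: While I'm not seriously advocating this as a solution, the following is just to demonstrate that the kernel provided above really is all that's required to solve the problem...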

SELECT x.*
  FROM 
     ( SELECT a.* 
 FROM 
    ( SELECT a.name actor
           , d.name director
           , COUNT(DISTINCT ma.movie_id) total
        FROM person d
        JOIN m_director md 
          ON md.person_id = d.person_id
        JOIN person a
        LEFT
        JOIN m_cast ma 
          ON ma.person_id = a.person_id
         AND ma.movie_id = md.movie_id
        JOIN m_cast x
          ON x.person_id = a.person_id
       GROUP
          BY actor
           , director
    ) a
 LEFT
 JOIN
    ( SELECT a.name actor
           , d.name director
           , COUNT(DISTINCT ma.movie_id) total
        FROM person d
        JOIN m_director md 
          ON md.person_id = d.person_id
        JOIN person a
        LEFT
        JOIN m_cast ma 
          ON ma.person_id = a.person_id
         AND ma.movie_id = md.movie_id
        JOIN m_cast x
          ON x.person_id = a.person_id
       GROUP
          BY actor
           , director
    ) b
   ON b.actor = a.actor
  AND b.director <> a.director 
  AND b.total > a.total
WHERE b.actor IS NULL
) x
LEFT JOIN
     ( SELECT a.* 
 FROM 
    ( SELECT a.name actor
           , d.name director
           , COUNT(DISTINCT ma.movie_id) total
        FROM person d
        JOIN m_director md 
          ON md.person_id = d.person_id
        JOIN person a
        LEFT
        JOIN m_cast ma 
          ON ma.person_id = a.person_id
         AND ma.movie_id = md.movie_id
        JOIN m_cast x
          ON x.person_id = a.person_id
       GROUP
          BY actor
           , director
    ) a
 LEFT
 JOIN
    ( SELECT a.name actor
           , d.name director
           , COUNT(DISTINCT ma.movie_id) total
        FROM person d
        JOIN m_director md 
          ON md.person_id = d.person_id
        JOIN person a
        LEFT
        JOIN m_cast ma 
          ON ma.person_id = a.person_id
         AND ma.movie_id = md.movie_id
        JOIN m_cast x
          ON x.person_id = a.person_id
       GROUP
          BY actor
           , director
    ) b
   ON b.actor = a.actor
  AND b.director <> a.director 
  AND b.total > a.total
WHERE b.actor IS NULL
) y
ON y.actor = x.actor AND y.director <> x.director
WHERE y.actor IS NULL;

+-----------+-----------------+-------+
| actor     | director        | total |
+-----------+-----------------+-------+
| Fred Pitt | Steven Feelberg |     2 |
+-----------+-----------------+-------+

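This returns each actor together with the director they have worked with most often, provided that director is unique. In this case, because Bill Smith and Raphael DiMaggio have each worked equally often with two directors, they are excluded from the result.

The solution to your problem is then simply to select from this list all rows where Yash Chopra is listed as the director.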
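
As a rough sketch of that last step (assuming MySQL 8.0 or later and the sample tables above, with 'Steven Feelberg' standing in for Yash Chopra and pair_totals being an illustrative name), the whole thing can be condensed by keeping only the rows for the chosen director and rejecting any actor who has an equal or higher count with someone else:

```sql
-- Sketch only: requires MySQL 8.0+ for common table expressions.
-- 'Steven Feelberg' is a stand-in from the sample data, not the real target.
WITH pair_totals AS (
    -- One row per actor/director pair that shares at least one movie.
    SELECT ma.person_id AS actor_id
         , md.person_id AS director_id
         , COUNT(DISTINCT ma.movie_id) AS total
      FROM m_cast ma
      JOIN m_director md
        ON md.movie_id = ma.movie_id
     GROUP BY ma.person_id, md.person_id
)
SELECT a.name  AS actor
     , p.total AS total
  FROM pair_totals p
  JOIN person d
    ON d.person_id = p.director_id
  JOIN person a
    ON a.person_id = p.actor_id
 WHERE d.name = 'Steven Feelberg'
   AND NOT EXISTS
       ( -- Any other director with an equal or higher count disqualifies the actor.
         SELECT 1
           FROM pair_totals o
          WHERE o.actor_id     = p.actor_id
            AND o.director_id <> p.director_id
            AND o.total       >= p.total
       );
```

On the sample data this should return only Fred Pitt with a total of 2. Pairs with a total of 0 can be left out here, because an actor must have at least one movie with the chosen director to qualify, and a director they never worked with cannot beat that. Using >= excludes ties, which matches the result above; changing it to > would keep actors who are tied between the chosen director and another. Against the real schema the table and column names (MID, PID, Name) and the name match would need adjusting.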
