李林超博客
首页
归档
留言
友链
动态
关于
归档
留言
友链
动态
关于
首页
大数据
正文
12.Hive经典练习题
Leefs
2021-12-06 PM
1524℃
0条
[TOC] ### 前言 对于Hive的基础查询操作,因为和其它SQL语句都大同小异,将不在进行详细介绍,小编在网上找了一些Hive查询的练习题,大家可以通过这些练习来更加熟练的掌握Hive的基础查询操作。 原文地址:[hive 50道经典练习](https://blog.csdn.net/qq_35831052/article/details/96424832) 本篇将通过**beeline 客户端**来运行Hive命令,如果没安装的可参考如下文章:[06.使用JDBC方式访问Hive](https://lilinchao.com/archives/1702.html) ### 一、数据准备 + **学生表(student)** | 学号(sid) | 学生姓名(sname) | 学生生日(sbirth) | 学生性别(ssex) | | ----------- | --------------- | ---------------- | ---------------- | | 01 | 赵雷 | 1990-01-01 | 男 | | 02 | 钱电 | 1990-12-21 | 男 | | 03 | 孙风 | 1990-05-20 | 男 | | 04 | 李云 | 1990-08-06 | 男 | | 05 | 周梅 | 1991-12-01 | 女 | | 06 | 吴兰 | 1992-03-01 | 女 | | 07 | 郑竹 | 1989-07-01 | 女 | | 08 | 王菊 | 1990-01-20 | 女 | + **课程表(course)** | 课程编号(cid) | 课程名称(cname) | 教师编号(tid) | | --------------- | ----------------- | --------------- | | 01 | 语文 | 02 | | 02 | 数学 | 01 | | 03 | 英语 | 03 | + **教师表(teacher)** | tid(教师编号) | tname(教师姓名) | | --------------- | ----------------- | | 01 | 张三 | | 02 | 李四 | | 03 | 王五 | + **成绩表(sc)** | 学号(sid) | 课程编号(cid) | 成绩(score) | | ----------- | --------------- | ------------- | | 01 | 01 | 80 | | 01 | 02 | 90 | | 01 | 03 | 99 | | 02 | 01 | 70 | | 02 | 02 | 60 | | 02 | 03 | 80 | | 03 | 01 | 80 | | 03 | 02 | 80 | | 03 | 03 | 80 | | 04 | 01 | 50 | | 04 | 02 | 30 | | 04 | 03 | 20 | | 05 | 01 | 76 | | 05 | 02 | 87 | | 06 | 01 | 31 | | 06 | 03 | 34 | | 07 | 02 | 89 | | 07 | 03 | 98 | + 建表语句 ```sql -- student.txt 01 赵雷 1990-01-01 男 02 钱电 1990-12-21 男 03 孙风 1990-05-20 男 04 李云 1990-08-06 男 05 周梅 1991-12-01 女 06 吴兰 1992-03-01 女 07 郑竹 1989-07-01 女 08 王菊 1990-01-20 女 -- course.txt 01 语文 02 02 数学 01 03 英语 03 -- teacher.txt 01 张三 02 李四 03 王五 -- sc.txt 01 01 80 01 02 90 01 03 99 02 01 70 02 02 60 02 03 80 03 01 80 03 02 80 03 03 80 04 01 50 04 02 30 04 03 20 05 01 76 05 02 87 06 01 31 06 03 34 07 02 89 07 03 98 -- 建表语句 -- 学生表 create table student(`sid` int,sname string,sbirth date,ssex char(2)) row format delimited fields terminated by '\t'; -- 导入 load data local inpath '/home/hadoop/datas/student.txt' into table student; -- 课程表 create table course(cid int,cname string,tid int) row format delimited fields terminated by '\t'; -- 导入 load data local inpath '/home/hadoop/datas/course.txt' into table course; -- 教师表 create table teacher(tid int,tname string) row format delimited fields terminated by '\t'; -- 导入 load data local inpath '/home/hadoop/datas/teacher.txt' into table teacher; -- 成绩表 create table sc(`sid` int,cid int,score int) row format delimited fields terminated by '\t'; -- 导入 load data local inpath '/home/hadoop/datas/sc.txt' into table sc; ``` ### 二、题目 #### 普通聚合函数练习 > 2.1 查询01课程比02课程成绩高的所有学生的学号 ```sql select sc1.sid,sc1.score score1,sc2.score score2 from sc sc1 join sc sc2 on sc1.sid=sc2.sid where sc1.cid = 1 and sc2.cid = 2 and sc1.score>sc2.score; -- 运行结果 +----------+---------+---------+ | sc1.sid | score1 | score2 | +----------+---------+---------+ | 2 | 70 | 60 | | 4 | 50 | 30 | +----------+---------+---------+ ``` > 2.2 查询平均成绩大于60分的同学的学号和平均成绩 ```sql select sid,avg(score) avgscore from sc group by sid having avgscore>60; -- 运行结果 +------+--------------------+ | sid | avgscore | +------+--------------------+ | 1 | 89.66666666666667 | | 2 | 70.0 | | 3 | 80.0 | | 5 | 81.5 | | 7 | 93.5 | +------+--------------------+ ``` **having 与 where 不同点** (1)where后面不能写分组函数,而having后面可以使用分组函数。 (2)having 只用于group by分组统计语句。 > 2.3 查询所有同学的学号、姓名、选课数、总成绩 ```sql select stu.sid,stu.sname, count(sc.cid) countcourse, case when sum(sc.score) is null then 0 else sum(sc.score) end sumscore from student stu left join sc on sc.sid=stu.sid group by stu.sid,stu.sname; -- 查询结果 +----------+------------+--------------+-----------+ | stu.sid | stu.sname | countcourse | sumscore | +----------+------------+--------------+-----------+ | 1 | 赵雷 | 3 | 269 | | 2 | 钱电 | 3 | 210 | | 3 | 孙风 | 3 | 240 | | 4 | 李云 | 3 | 100 | | 5 | 周梅 | 2 | 163 | | 6 | 吴兰 | 2 | 65 | | 7 | 郑竹 | 2 | 187 | | 8 | 王菊 | 0 | 0 | +----------+------------+--------------+-----------+ ``` > 2. 4 查询姓'李'的老师的个数 ```sql select count(*) from teacher where tname like '李%'; -- 查询结果 +------+ | _c0 | +------+ | 1 | +------+ ``` > 2.5 查询没有学过“张三”老师课程的同学的学号、姓名 ```sql select stu.sid,sname from student stu join course cs join teacher t left join sc on stu.sid=sc.sid and cs.cid=sc.cid and t.tid=cs.tid where tname='张三' group by stu.sid,sname having sum(case when sc.sid is null then 0 else 1 end) =0; -- 运行结果 +----------+--------+ | stu.sid | sname | +----------+--------+ | 6 | 吴兰 | | 8 | 王菊 | +----------+--------+ ``` > 2.6 查询学过“张三”老师所教的所有课的同学的学号、姓名 ```sql select stu.sid,sname from student stu join course cs join teacher t left join sc on stu.sid=sc.sid and cs.cid=sc.cid and t.tid=cs.tid where tname='张三' group by stu.sid,sname having sum(case when sc.sid is null then 0 else 1 end) !=0; -- 查询结果 +----------+--------+ | stu.sid | sname | +----------+--------+ | 1 | 赵雷 | | 2 | 钱电 | | 3 | 孙风 | | 4 | 李云 | | 5 | 周梅 | | 7 | 郑竹 | +----------+--------+ ``` > 2.7 查询学过01并且也学过编号02课程的同学的学号、姓名 ```sql select stu.sid,stu.sname from student stu join sc sc1 on sc1.sid=stu.sid join sc sc2 on sc2.sid=sc1.sid where sc1.cid=01 and sc2.cid=02; -- 查询结果 +----------+------------+ | stu.sid | stu.sname | +----------+------------+ | 1 | 赵雷 | | 2 | 钱电 | | 3 | 孙风 | | 4 | 李云 | | 5 | 周梅 | +----------+----------- ``` > 2.8 查询课程编号02的成绩比课程编号01课程成绩低的所有同学的学号、姓名 ```sql select t1.sid,sname from (select sid,score score02 from SC where cid=02)t1 join (select sid,score score01 from SC where cid=01)t2 on t1.sid=t2.sid join Student on Student.sid = t1.sid where t1.score02
2.9 查询所有课程成绩小于60的同学的学号、姓名 ```sql select stu.sid,stu.sname from student stu left join sc on sc.sid=stu.sid and sc.score>=60 group by stu.sid,stu.sname having sum(case when sc.sid is null then 0 else 1 end)=0; -- 查询结果 +----------+------------+ | stu.sid | stu.sname | +----------+------------+ | 4 | 李云 | | 6 | 吴兰 | | 8 | 王菊 | +----------+------------+ ``` > 2.10 查询没有学全所有课的同学的学号、姓名 ```sql select stu.sid,stu.sname from student stu left join course cs left join sc on sc.sid=stu.sid and cs.cid=sc.cid group by stu.sid,stu.sname having sum(case when sc.cid is null then 1 else 0 end)>0; -- 查询结果 +----------+------------+ | stu.sid | stu.sname | +----------+------------+ | 5 | 周梅 | | 6 | 吴兰 | | 7 | 郑竹 | | 8 | 王菊 | +----------+------------+ ``` > 2.11 查询至少有一门课与学号为01同学所学相同的同学的学号和姓名 ```sql select distinct st.sid,st.sname from student st join sc sc1 on st.sid=sc1.sid join sc sc2 on sc1.cid=sc2.cid where sc2.sid=1; -- 查询结果 +---------+-----------+ | st.sid | st.sname | +---------+-----------+ | 1 | 赵雷 | | 2 | 钱电 | | 3 | 孙风 | | 4 | 李云 | | 5 | 周梅 | | 6 | 吴兰 | | 7 | 郑竹 | +---------+-----------+ ``` > 2.12 查询张三老师教的课的平均成绩 ```sql select avg(score) avgscore from sc join course co on sc.cid=co.cid join teacher t on t.tid=co.tid where tname='张三'; -- 查询结果 +--------------------+ | avgscore | +--------------------+ | 72.66666666666667 | +--------------------+ ``` > 2.13 查询学习“张三”老师课的成绩表记录 ```sql select sc.* from sc join course c on sc.cid=c.cid join teacher t on c.tid=t.tid where t.tname='张三'; -- 查询结果 +---------+---------+-----------+ | sc.sid | sc.cid | sc.score | +---------+---------+-----------+ | 1 | 2 | 90 | | 2 | 2 | 60 | | 3 | 2 | 80 | | 4 | 2 | 30 | | 5 | 2 | 87 | | 7 | 2 | 89 | +---------+---------+-----------+ ``` > 2.14 查询没有上过编号03课程的同学学号的02号课的成绩 ```sql select sc.* from sc left join (select * from sc where sc.cid = '3') sc2 on sc.sid =sc2.sid where sc2.cid is null and sc.cid=2; -- 查询结果 +---------+---------+-----------+ | sc.sid | sc.cid | sc.score | +---------+---------+-----------+ | 5 | 2 | 87 | +---------+---------+-----------+ ``` > 2.15 按平均成绩从高到低显示所有学生的“语文”、“数学”、“英语”三门的课程成绩 ```sql select sc.sid, max(case course.cname when '语文' then sc.score else 0 end) yuwen, max(case course.cname when '数学' then sc.score else 0 end) shuxue, max(case course.cname when '英语' then sc.score else 0 end) yingyu, count(sc.cid) kechengshu, avg(sc.score) pingjunfen from sc join course on sc.cid=course.cid group by sc.sid order by pingjunfen; -- 查询结果 +---------+--------+---------+---------+-------------+---------------------+ | sc.sid | yuwen | shuxue | yingyu | kechengshu | pingjunfen | +---------+--------+---------+---------+-------------+---------------------+ | 6 | 31 | 0 | 34 | 2 | 32.5 | | 4 | 50 | 30 | 20 | 3 | 33.333333333333336 | | 2 | 70 | 60 | 80 | 3 | 70.0 | | 3 | 80 | 80 | 80 | 3 | 80.0 | | 5 | 76 | 87 | 0 | 2 | 81.5 | | 1 | 80 | 90 | 99 | 3 | 89.66666666666667 | | 7 | 0 | 89 | 98 | 2 | 93.5 | +---------+--------+---------+---------+-------------+---------------------+ ``` > 2.16 查询各科成绩最高和最低的分:以如下的形式显示:课程ID,最高分,最低分 ```sql select cid, max(score) maxscore, min(case when score is null then 0 else score end) minscore from sc group by cid; -- 查询结果 +------+-----------+-----------+ | cid | maxscore | minscore | +------+-----------+-----------+ | 1 | 80 | 31 | | 2 | 90 | 30 | | 3 | 99 | 20 | +------+-----------+-----------+ ``` > 2.17 按各科平均成绩从低到高和及格率的百分数从高到低顺序 ```sql select avg(score) avgscore, concat(cast(sum(case when score >= 60 then 1 else 0 end)/count(sc.sid) as string),'%') jigelv from sc group by cid order by avgscore asc,jigelv desc; -- 查询结果 +--------------------+----------------------+ | avgscore | jigelv | +--------------------+----------------------+ | 64.5 | 0.6666666666666666% | | 68.5 | 0.6666666666666666% | | 72.66666666666667 | 0.8333333333333334% | +--------------------+----------------------+ ``` > 2.18 查询如下课程平均成绩和及格率的百分数(用”1行”显示): 语文(01),数学(02),英语(03) ```sql select max(case t1.cid when 1 then concat(t1.avgscore,':',jigelv) else 0 end) as yuwen, max(case t1.cid when 2 then concat(t1.avgscore,':',jigelv) else 0 end) as shuxue, max(case t1.cid when 3 then concat(t1.avgscore,':',jigelv) else 0 end) as yingyu from (select sc.cid,avg(score) avgscore, concat(cast(sum(case when score >= 60 then 1 else 0 end)*100/count(sc.sid) as string),'%') jigelv from sc join course cs on sc.cid=cs.cid group by sc.cid,cs.cname having cs.cname='语文' or cs.cname='数学' or cs.cname='英语') t1; -- 查询结果 yuwen | shuxue | yingyu 64.5:66.66666666666667% | 72.66666666666667:83.33333333333333% | 68.5:66.66666666666667% ``` > 2.19 查询如下课程成绩第3名到第6名的学生成绩单:语文(01),数学(02),英语(03) ```sql select a.* from ( select sc.*, rank() over(distribute by sc.cid sort by sc.score desc) rk from sc) a where a.rk between 3 and 6; -- 查询结果 +--------+--------+----------+-------+ | a.sid | a.cid | a.score | a.rk | +--------+--------+----------+-------+ | 5 | 1 | 76 | 3 | | 2 | 1 | 70 | 4 | | 4 | 1 | 50 | 5 | | 6 | 1 | 31 | 6 | | 5 | 2 | 87 | 3 | | 3 | 2 | 80 | 4 | | 2 | 2 | 60 | 5 | | 4 | 2 | 30 | 6 | | 2 | 3 | 80 | 3 | | 3 | 3 | 80 | 3 | | 6 | 3 | 34 | 5 | | 4 | 3 | 20 | 6 | +--------+--------+----------+-------+ ``` > 2.20 统计下列各科成绩,各分数段人数:课程ID,课程名称,[100-85],[85-70],[70-60],[ 小于60] ```sql select a.cid,a.cname,a.px, count(a.px) from ( select cs.cid,cs.cname, (case when score<60 then '[小于60]' when score<70 then '[70-60]' when score<85 then '[85-70]' else '[100-85]' end) as px from sc join course cs on sc.cid=cs.cid ) a group by a.cid,a.cname,a.px; -- 查询结果 +--------+----------+-----------+------+ | a.cid | a.cname | a.px | _c3 | +--------+----------+-----------+------+ | 1 | 语文 | [85-70] | 4 | | 1 | 语文 | [小于60] | 2 | | 2 | 数学 | [100-85] | 3 | | 2 | 数学 | [70-60] | 1 | | 2 | 数学 | [85-70] | 1 | | 2 | 数学 | [小于60] | 1 | | 3 | 英语 | [100-85] | 2 | | 3 | 英语 | [85-70] | 2 | | 3 | 英语 | [小于60] | 2 | +--------+----------+-----------+------+ ``` #### 开窗函数练习 > 2.21 查询学生平均成绩及其名次 ```sql select a.*, rank() over(distribute by 1 sort by a.avgscore) rk from ( select sc.sid, avg(sc.score) avgscore from sc group by sc.sid ) a; -- 查询结果 +--------+---------------------+-----+ | a.sid | a.avgscore | rk | +--------+---------------------+-----+ | 6 | 32.5 | 1 | | 4 | 33.333333333333336 | 2 | | 2 | 70.0 | 3 | | 3 | 80.0 | 4 | | 5 | 81.5 | 5 | | 1 | 89.66666666666667 | 6 | | 7 | 93.5 | 7 | +--------+---------------------+-----+ ``` > 2.22 查询各科成绩前三名的记录(不考虑成绩并列情况) ```sql select a.*,cs.cname from ( select sc.*, row_number() over(distribute by sc.cid sort by sc.score desc) rk from sc ) a join course cs on cs.cid=a.cid where a.rk<4; -- 查询结果 +--------+--------+----------+-------+-----------+ | a.sid | a.cid | a.score | a.rk | cs.cname | +--------+--------+----------+-------+-----------+ | 3 | 1 | 80 | 1 | 语文 | | 1 | 1 | 80 | 2 | 语文 | | 5 | 1 | 76 | 3 | 语文 | | 1 | 2 | 90 | 1 | 数学 | | 7 | 2 | 89 | 2 | 数学 | | 5 | 2 | 87 | 3 | 数学 | | 1 | 3 | 99 | 1 | 英语 | | 7 | 3 | 98 | 2 | 英语 | | 2 | 3 | 80 | 3 | 英语 | +--------+--------+----------+-------+-----------+ ``` > 2.23 查询每门课程成绩最好的前两名 ```sql select sid,cid,score from ( select sc.*, rank() over(distribute by cid sort by score) rk from sc ) aa where aa.rk<3; -- 查询结果 +------+------+--------+ | sid | cid | score | +------+------+--------+ | 6 | 1 | 31 | | 4 | 1 | 50 | | 4 | 2 | 30 | | 2 | 2 | 60 | | 4 | 3 | 20 | | 6 | 3 | 34 | +------+------+--------+ ```
标签:
Hadoop
,
Hive
非特殊说明,本博所有文章均为博主原创。
如若转载,请注明出处:
https://lilinchao.com/archives/1718.html
上一篇
11.Hive DML数据操作
下一篇
13.Hive分区表详细介绍
评论已关闭
栏目分类
随笔
2
Java
326
大数据
229
工具
31
其它
25
GO
47
NLP
4
标签云
Map
HDFS
gorm
Nacos
Hbase
Shiro
Livy
数据结构
JavaSE
SQL练习题
JavaWEB项目搭建
随笔
Zookeeper
SpringCloud
Kibana
MyBatisX
国产数据库改造
链表
算法
Typora
MySQL
MyBatis
ajax
Spark
CentOS
JavaWeb
栈
Linux
FastDFS
Java编程思想
友情链接
申请
范明明
庄严博客
Mx
陶小桃Blog
虫洞
评论已关闭