
Sample 1

Last Updated: Mar 27, 2017

Section 1: Create a temporary table

    %spark
    import org.apache.commons.io.IOUtils
    import java.net.URL
    import java.nio.charset.Charset

    // Zeppelin creates and injects sc (SparkContext) and sqlContext (HiveContext or SQLContext),
    // so you don't need to create them manually.

    // Load the bank data: download the CSV as one string and split it into lines.
    val bankText = sc.parallelize(
      IOUtils.toString(
        new URL("http://emr-sample-projects.oss-cn-hangzhou.aliyuncs.com/bank.csv"),
        Charset.forName("utf8")).split("\n"))

    case class Bank(age: Integer, job: String, marital: String, education: String, balance: Integer)

    // Split each line on ";", drop the header row, and strip the double quotes from the fields.
    val bank = bankText.map(s => s.split(";")).filter(s => s(0) != "\"age\"").map(
      s => Bank(s(0).toInt,
        s(1).replaceAll("\"", ""),
        s(2).replaceAll("\"", ""),
        s(3).replaceAll("\"", ""),
        s(5).replaceAll("\"", "").toInt
      )
    ).toDF()

    // Register the DataFrame as the temporary table "bank" so that %sql paragraphs can query it.
    // On Spark 2.0 and later, createOrReplaceTempView("bank") is the non-deprecated equivalent.
    bank.registerTempTable("bank")
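The parsing step in Section 1 can be tried without Spark. The following is a minimal sketch in plain Scala that applies the same split, filter, and quote-stripping logic to a few in-memory lines. The sample lines, the header names after "age", and the names `BankSample`, `sampleLines`, and `parsed` are hypothetical and are not taken from bank.csv; the only thing assumed from Section 1 is that fields are separated by ";" and that the balance sits at index 5.

```scala
// Hypothetical stand-in for the Bank case class from Section 1
// (renamed so that it does not shadow Bank in the same notebook).
case class BankSample(age: Integer, job: String, marital: String, education: String, balance: Integer)

// Made-up lines in the assumed layout; these are NOT rows from bank.csv.
val sampleLines = Seq(
  "\"age\";\"job\";\"marital\";\"education\";\"default\";\"balance\"",
  "25;\"services\";\"single\";\"secondary\";\"no\";1200",
  "41;\"management\";\"married\";\"tertiary\";\"no\";-50"
)

val parsed = sampleLines
  .map(line => line.split(";"))
  .filter(fields => fields(0) != "\"age\"")   // drop the header row
  .map(fields => BankSample(
    fields(0).toInt,
    fields(1).replaceAll("\"", ""),           // strip the surrounding double quotes
    fields(2).replaceAll("\"", ""),
    fields(3).replaceAll("\"", ""),
    fields(5).replaceAll("\"", "").toInt      // index 4 is skipped, as in Section 1
  ))

parsed.foreach(println)
```

In the notebook itself the same chain runs on an RDD instead of a `Seq`, and `toDF()` then turns the case-class instances into a DataFrame whose column names come from the case-class fields.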

Section 2: Query the table structure

    %sql
    desc bank

Section 3: Count the employees at each age below 30

    %sql select age, count(1) value from bank where age < 30 group by age order by age

Section 4: Query the records of employees aged 20 or younger

    %sql select * from bank where age <= 20
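The queries in Sections 2 to 4 can also be written with the DataFrame API instead of %sql. The following is a sketch only: it assumes that it runs in a %spark paragraph after Section 1, so that the `bank` DataFrame is already in scope. The names `underThirty` and `twentyOrYounger` are introduced here for illustration.

```scala
// Assumption: `bank` is the DataFrame created in Section 1 and is in scope.
import org.apache.spark.sql.functions.col

// Section 2 counterpart: print the column names and types.
bank.printSchema()

// Section 3 counterpart: number of rows per age, for ages below 30.
val underThirty = bank
  .filter(col("age") < 30)               // where age < 30
  .groupBy("age")                        // group by age
  .count()                               // count(1)
  .withColumnRenamed("count", "value")   // alias the count column as "value"
  .orderBy("age")                        // order by age
underThirty.show()

// Section 4 counterpart: all rows with age <= 20.
val twentyOrYounger = bank.filter(col("age") <= 20)
twentyOrYounger.show()
```

The %sql form keeps Zeppelin's built-in table and chart display, while the DataFrame form returns values that can be reused in later Scala code.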